Lioncash
bd82513199
frontend/ir_emitter: Add half-precision opcode for FPMulAdd
2020-04-22 21:01:44 +01:00
Lioncash
7bb5440507
general: Mark hash functions as noexcept
...
Generally hash functions shouldn't throw exceptions. It's also a
requirement for the standard library-provided hash functions to not
throw exceptions.
An exception to this rule is made for user-defined specializations,
however we can just be consistent with the standard library on this to
allow it to play nicer with it.
While we're at it, we can also make the std::less specializations
noexcpet as well, since they also can't throw.
2020-04-22 21:01:43 +01:00
Lioncash
fe95575b95
general: Replace unreachable-imitating assertions with UNREACHABLE()
...
We can just use the self-documenting assertion for indicating
unreachable paths, instead of manually passing false and providing a
message.
2020-04-22 21:01:43 +01:00
Merry
01bb1cdd88
Merge pull request #458 from lioncash/float-op
...
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f
frontend/ir_emitter: Add half-precision variant of FPAbs
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f
frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes
2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978
frontend/ir_emitter: Add half-precision variant of FPNeg
2020-04-22 20:58:12 +01:00
Lioncash
bd892ec4ef
frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677
frontend/ir/value: Add U16U32U64 type to represent floating point types
2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2
Merge pull request #447 from lioncash/flag
...
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
fb039e232c
Merge pull request #442 from lioncash/fcvtxn
...
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5
ir: Add A64-specific opcodes for getting and setting raw NZCV values
...
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Lioncash
5cf1478620
frontend/ir: Add opcodes for vector square roots
2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3
frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
...
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Lioncash
36027ebef5
frontend/ir/microinstruction: Add missing cases for FPRecipExponent{32,64} for ReadsFromAndWritesToFPSRCumulativeExceptionBits()
...
This was intended to be added within #437 , but was missed
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811
frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents
2020-04-22 20:58:10 +01:00
MerryMage
fa8925c4df
IR: Implement FPVectorMulX
2020-04-22 20:57:37 +01:00
V.Kalyuzhny
764a93bf5a
Switch boost::optional to std::optional
2020-04-22 20:57:37 +01:00
Lioncash
0583d401e3
ir/value: Add IsSignedImmediate() and IsUnsignedImmediate() functions to Value's interface
...
This allows testing against arbitrary values while also simultaneously
eliminating the need to check IsImmediate() all the time in expressions.
2020-04-22 20:57:37 +01:00
Lioncash
e3258e8525
ir/value: Add a GetImmediateAsS64() function
...
Provides a signed analogue to GetImmediateAsU64() for consistency with
both integral classes when it comes to signed/unsigned..
2020-04-22 20:57:37 +01:00
Lioncash
4a3c064b15
ir/value: Add an IsZero() member function to Value's interface
...
By far, one of the most common things to check for is whether or not a
value is zero, as it typically allows folding away unnecesary
operations (other close contenders that can help with eliding operations are 1 and -1).
So instead of requiring a check for an immediate and then actually
retrieving the integral value and checking it, we can wrap it within a
function to make it more convenient.
2020-04-22 20:57:37 +01:00
Lioncash
f40fcda1f6
ir/value: Add member function to check whether or not all bits of a contained value are set
...
This is useful when we wish to know if a contained value is something
like 0xFFFFFFFF, as this helps perform constant folding. For example the
operation: x & 0xFFFFFFFF can be folded to just x in the 32-bit case.
2020-04-22 20:55:50 +01:00
Lioncash
d69fceec55
value: Move ImmediateToU64() to be a part of Value's interface
...
This'll make it slightly nicer to do basic constant folding for 32-bit
and 64-bit variants of the same IR opcode type. By that, I mean it's
possible to inspect immediate values without a bunch of conditional
checks beforehand to verify that it's possible to call GetU32() or
GetU64, etc.
2020-04-22 20:55:50 +01:00
MerryMage
f0920c0ded
Fix VShift terminology
...
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.
- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
Lioncash
d426dfe942
ir: Add opcodes for unsigned saturating left shifts
2020-04-22 20:55:06 +01:00
MerryMage
02150bc0b7
IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed
2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0
opcodes.inc: Align columns to a tabstop of 4
2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d
IR: Add fbits argument to FixedToFP-related opcodes
2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46
ir: Add opcodes for left signed saturated shifts
2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6
IR: Add VectorSignedSaturatedDoublingMultiplyLong
2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa
emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
...
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5
IR: Implement Vector{Signed,Unsigned}Multiply{16,32}
2020-04-22 20:55:06 +01:00
Lioncash
e739624296
ir: Add opcodes for vector CLZ operations
...
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.
If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00
Lioncash
d4a76aaa04
ir: Add opcodes form unsigned saturated accumulations of signed values
2020-04-22 20:55:05 +01:00
Lioncash
6f911a26da
ir: Add opcodes for signed saturated accumulations of unsigned values
2020-04-22 20:55:05 +01:00
Lioncash
134bb02e19
ir/value: Replace includes with forward declarations
...
enum classes are still considered complete types when forward declared
(as the compiler knows the exact size of the type from the declaration
alone). The only difference in this case being that the members of the
enum class aren't visible. Given we don't use the members within this
header in any way, we can simply forward declare them here and remove
the inclusions.
2020-04-22 20:55:05 +01:00
Lioncash
2c8e07e7d0
ir/cond: Migrate to C++17 nested namespace specifiers
2020-04-22 20:55:05 +01:00
Lioncash
b6e74fd17d
ir: Add opcodes for performing unsigned reciprocal square root estimates
2020-04-22 20:55:05 +01:00
Lioncash
af83360f89
ir: Add opcodes for unsigned reciprocal estimate
2020-04-22 20:55:05 +01:00
Lioncash
fca7eddb9e
A64: Add opcodes for signed saturating negations
2020-04-22 20:53:46 +01:00
MerryMage
aa8d826c13
ir/terminal: Add FastDispatchHint
2020-04-22 20:53:46 +01:00
Lioncash
7ebfd0f31c
ir: Add opcodes for scalar signed saturated doubling multiplies
2020-04-22 20:53:46 +01:00
Lioncash
a0231e5546
ir: Add opcodes for signed saturated doubling multiplies
2020-04-22 20:53:46 +01:00
Lioncash
0507e47420
ir: Add opcodes for signed saturated absolute values
2020-04-22 20:53:46 +01:00
MerryMage
3415828fb4
IR: Simplify FP{Single,Double}ToFixed{U,S}{32,64}
2020-04-22 20:53:46 +01:00
Lioncash
053175f69b
ir_emitter: Rename fpscr_controlled parameters to fpcr_controlled
...
Part of addressing #333
2020-04-22 20:53:46 +01:00
MerryMage
ccbf6c7f63
microinstruction: A32ExceptionRaised causes CPU exception
2020-04-22 20:53:46 +01:00
MerryMage
89d08c7d61
IR: Add VectorTable and VectorTableLookup IR instructions
2020-04-22 20:53:45 +01:00
MerryMage
0288974512
opcodes: Cleanup opcodes table
...
* Remove T:: prefix from types.
* Add another column for a 4th argument.
2020-04-22 20:53:45 +01:00
Lioncash
0efa2ce3b0
ir: Add opcodes for performing rounding left shifts
2020-04-22 20:53:45 +01:00
Lioncash
f3f60cd179
A64: Implement ISB
...
Given we want to ensure that all instructions are fetched again, we can
treat an ISB instruction as a code cache flush.
2020-04-22 20:53:45 +01:00
MerryMage
d1d6f4feb5
system: Implement MRS CNTFRQ_EL0
2020-04-22 20:53:45 +01:00
Lioncash
acbaf04fef
ir: Add opcodes for unsigned saturating add and subtract
2020-04-22 20:46:23 +01:00
Lioncash
2188765e28
ir/value: Use type alias CoprocessorInfo for std::array<u8, 8>
...
Provides a more descriptive label for the interface, and avoids the need
to hardcode the array size in multiple places.
2020-04-22 20:46:23 +01:00
MerryMage
3f4d118d73
microinstruction: Improve assert messages
2020-04-22 20:46:23 +01:00
MerryMage
17f73974f2
IR: Implement FPMulX IR instruction
2020-04-22 20:46:23 +01:00
MerryMage
f976c47008
IR: Initial implementation of FPVectorRoundInt
2020-04-22 20:46:23 +01:00
MerryMage
10e196480f
IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths
2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02
ir: Add opcodes for vector paired maximum and minimums
...
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
2501bfbfae
ir: Add opcodes for performing scalar integral min/max
2020-04-22 20:46:23 +01:00
Lioncash
7fdd8b0197
A64: Implement PMULL{2}
2020-04-22 20:46:23 +01:00
Lioncash
affa312d1d
ir: Add opcode for performing polynomial multiplication
2020-04-22 20:46:22 +01:00
MerryMage
507bcd8b8b
IR: Implement FPVectorTo{Signed,Unsigned}Fixed
2020-04-22 20:46:22 +01:00
MerryMage
7b03da86c2
IR: Implement FPVector{Max,Min}
2020-04-22 20:46:22 +01:00
MerryMage
ddcff86f9c
microinstruction: Update ReadsFromAndWritesToFPSRCumulativeExceptionBits
2020-04-22 20:46:22 +01:00
MerryMage
901bd9b4e2
IR: Implement FPRecipStepFused, FPVectorRecipStepFused
2020-04-22 20:46:22 +01:00
MerryMage
939f5f5c7a
IR: Implement FPVectorRecipEstimate
2020-04-22 20:46:22 +01:00
MerryMage
c1dcfe29f7
IR: Implement FPRecipEstimate
2020-04-22 20:46:22 +01:00
MerryMage
04f325a05e
IR: Implement FPVectorNeg
2020-04-22 20:46:22 +01:00
MerryMage
771a4fc20b
IR: Implement FPVectorMulAdd
2020-04-22 20:46:22 +01:00
MerryMage
ecbf9dbae5
IR: Implement A64OrQC
2020-04-22 20:46:22 +01:00
MerryMage
b455b566e7
A64: Implement UQXTN (vector)
2020-04-22 20:46:22 +01:00
MerryMage
3874cb37e3
A64: Implement SQXTN (vector)
2020-04-22 20:46:22 +01:00
MerryMage
f020dbe4ed
A64: Implement SQXTUN
2020-04-22 20:46:22 +01:00
MerryMage
6918ef7360
microinstruction: Reorganize FPSCR related instruction queries
2020-04-22 20:46:22 +01:00
Lioncash
a639fa5534
microinstruction: Add missing FP scalar opcodes to ReadsFromFPSCR() and WritesToFPSCR()
...
These were forgotten when the opcodes were added.
2020-04-22 20:46:22 +01:00
MerryMage
b2e4c16ef8
A64: Implement FRSQRTS (vector), single/double variant
2020-04-22 20:46:22 +01:00
MerryMage
45dc5f74f3
A64: Implement FRSQRTE (vector), single/double variant
2020-04-22 20:46:22 +01:00
MerryMage
506e544bfe
IR: Implement FPRSqrtStepFused
2020-04-22 20:46:22 +01:00
MerryMage
bde58b04d4
IR: Implement FPRSqrtEstimate
2020-04-22 20:46:21 +01:00
Subv
4606a081c9
A64: The A64SetTPIDR IR instruction writes to a system register and should not be eliminated by the dead code elimination pass.
...
Previously this instruction was alway eliminated, resulting in incorrect values for TPIDR_EL0.
2020-04-22 20:46:21 +01:00
MerryMage
e18fca17dc
A64: Implement FABD in terms of existing IR instructions
...
Fixes NaN issue. Closes #306 .
2020-04-22 20:46:21 +01:00
MerryMage
b228694012
IR: Implement FPRoundInt
2020-04-22 20:46:20 +01:00
MerryMage
33fa65de23
A64: Implement FADDP (vector)
2020-04-22 20:46:19 +01:00
MerryMage
9dba273a8c
A64: Implement SADDLP
2020-04-22 20:46:19 +01:00
MerryMage
70ff2d73b5
A64: Implement UADDLP
2020-04-22 20:46:19 +01:00
MerryMage
caaf36dfd6
IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64}
...
This implementation just falls-back to the software floating point implementation.
2020-04-22 20:46:19 +01:00
Lioncash
4aa4885ba7
ir: Add opcodes for vector conversion of u32/u64 to floating-point
2020-04-22 20:46:19 +01:00
Lioncash
7a84b6e8d8
ir: Add opcodes for converting S64 and U64 to single-precision floating-point values
2020-04-22 20:46:19 +01:00
Lioncash
3a41465eaf
ir: Add opcodes for converting S64 and U64 to double-precision values
2020-04-22 20:46:18 +01:00
Lioncash
81e572c78c
ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16
2020-04-22 20:46:18 +01:00
Lioncash
fc731dddae
ir: Add opcodes for performing vector absolute floating-point values
...
This will be usable for implementing FACGE and FACGT
2020-04-22 20:46:18 +01:00
MerryMage
be354dbfd0
ir/basic_block: Add missing U16 immediate type to DumpBlock
2020-04-22 20:46:18 +01:00
Lioncash
8a4f8aed06
ir: Add opcode for performing FP vector absolute differences
2020-04-22 20:46:18 +01:00
MerryMage
8c90fcf58e
IR: Implement FPMulAdd
2020-04-22 20:46:18 +01:00
Lioncash
c695da1cf3
ir: Add opcode for floating-point GE and GT comparisons
...
The rest of the comparisons can be implemented in terms of these two
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e
ir: Add opcodes for floating-point vector equalities
2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28
ir: Add opcodes for performing rounding halving adds
2020-04-22 20:46:18 +01:00
Lioncash
1e10017f4b
ir: Add opcodes for signed absolute differences
2020-04-22 20:46:17 +01:00
Lioncash
3f6c529da2
ir: Add opcode to perform the vector conversion S64->F64
...
Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us
2020-04-22 20:46:17 +01:00