Commit graph

228 commits

Author SHA1 Message Date
Lioncash
2184d24e8f frontend/ir_emitter: Add half-precision opcode for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
5d5c9f149f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
6da0411111 frontend/ir_emitter: Add half-precision opcode for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
5b4673da4b frontend/ir_emitter: Add half-precision variant of FPVectorRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
ad0c698f89 frontend/ir_emitter: Add half-precision variant of FPRoundInt 2020-04-22 21:01:44 +01:00
Merry
cb9a1b18b6 Merge pull request #475 from lioncash/muladd
A64: Enable half-precision variants of floating-point multiply-add instructions
2020-04-22 21:01:44 +01:00
Lioncash
a4cadf1cd9 frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation 2020-04-22 21:01:44 +01:00
Lioncash
ec6b3ae084 ir/frontend: Add half-precision opcode for FPVectorMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
bd82513199 frontend/ir_emitter: Add half-precision opcode for FPMulAdd 2020-04-22 21:01:44 +01:00
Merry
01bb1cdd88 Merge pull request #458 from lioncash/float-op
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs 2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes 2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg 2020-04-22 20:58:12 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point 2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2 Merge pull request #447 from lioncash/flag
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
fb039e232c Merge pull request #442 from lioncash/fcvtxn
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5 ir: Add A64-specific opcodes for getting and setting raw NZCV values
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Lioncash
5cf1478620 frontend/ir: Add opcodes for vector square roots 2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3 frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811 frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents 2020-04-22 20:58:10 +01:00
MerryMage
fa8925c4df IR: Implement FPVectorMulX 2020-04-22 20:57:37 +01:00
MerryMage
f0920c0ded Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
Lioncash
d426dfe942 ir: Add opcodes for unsigned saturating left shifts 2020-04-22 20:55:06 +01:00
MerryMage
02150bc0b7 IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0 opcodes.inc: Align columns to a tabstop of 4 2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d IR: Add fbits argument to FixedToFP-related opcodes 2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2020-04-22 20:55:06 +01:00
Lioncash
e739624296 ir: Add opcodes for vector CLZ operations
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.

If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00
Lioncash
d4a76aaa04 ir: Add opcodes form unsigned saturated accumulations of signed values 2020-04-22 20:55:05 +01:00
Lioncash
6f911a26da ir: Add opcodes for signed saturated accumulations of unsigned values 2020-04-22 20:55:05 +01:00
Lioncash
b6e74fd17d ir: Add opcodes for performing unsigned reciprocal square root estimates 2020-04-22 20:55:05 +01:00
Lioncash
af83360f89 ir: Add opcodes for unsigned reciprocal estimate 2020-04-22 20:55:05 +01:00
Lioncash
fca7eddb9e A64: Add opcodes for signed saturating negations 2020-04-22 20:53:46 +01:00
Lioncash
7ebfd0f31c ir: Add opcodes for scalar signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
a0231e5546 ir: Add opcodes for signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
0507e47420 ir: Add opcodes for signed saturated absolute values 2020-04-22 20:53:46 +01:00
MerryMage
89d08c7d61 IR: Add VectorTable and VectorTableLookup IR instructions 2020-04-22 20:53:45 +01:00
MerryMage
0288974512 opcodes: Cleanup opcodes table
* Remove T:: prefix from types.
* Add another column for a 4th argument.
2020-04-22 20:53:45 +01:00
Lioncash
0efa2ce3b0 ir: Add opcodes for performing rounding left shifts 2020-04-22 20:53:45 +01:00
Lioncash
f3f60cd179 A64: Implement ISB
Given we want to ensure that all instructions are fetched again, we can
treat an ISB instruction as a code cache flush.
2020-04-22 20:53:45 +01:00
MerryMage
d1d6f4feb5 system: Implement MRS CNTFRQ_EL0 2020-04-22 20:53:45 +01:00
Lioncash
acbaf04fef ir: Add opcodes for unsigned saturating add and subtract 2020-04-22 20:46:23 +01:00
MerryMage
17f73974f2 IR: Implement FPMulX IR instruction 2020-04-22 20:46:23 +01:00
MerryMage
f976c47008 IR: Initial implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
10e196480f IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths 2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02 ir: Add opcodes for vector paired maximum and minimums
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
2501bfbfae ir: Add opcodes for performing scalar integral min/max 2020-04-22 20:46:23 +01:00