Commit graph

317 commits

Author SHA1 Message Date
MerryMage
02150bc0b7 IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0 opcodes.inc: Align columns to a tabstop of 4 2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d IR: Add fbits argument to FixedToFP-related opcodes 2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2020-04-22 20:55:06 +01:00
Lioncash
e739624296 ir: Add opcodes for vector CLZ operations
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.

If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00
Lioncash
d4a76aaa04 ir: Add opcodes form unsigned saturated accumulations of signed values 2020-04-22 20:55:05 +01:00
Lioncash
6f911a26da ir: Add opcodes for signed saturated accumulations of unsigned values 2020-04-22 20:55:05 +01:00
Lioncash
134bb02e19 ir/value: Replace includes with forward declarations
enum classes are still considered complete types when forward declared
(as the compiler knows the exact size of the type from the declaration
alone). The only difference in this case being that the members of the
enum class aren't visible. Given we don't use the members within this
header in any way, we can simply forward declare them here and remove
the inclusions.
2020-04-22 20:55:05 +01:00
Lioncash
2c8e07e7d0 ir/cond: Migrate to C++17 nested namespace specifiers 2020-04-22 20:55:05 +01:00
Lioncash
b6e74fd17d ir: Add opcodes for performing unsigned reciprocal square root estimates 2020-04-22 20:55:05 +01:00
Lioncash
af83360f89 ir: Add opcodes for unsigned reciprocal estimate 2020-04-22 20:55:05 +01:00
Lioncash
fca7eddb9e A64: Add opcodes for signed saturating negations 2020-04-22 20:53:46 +01:00
MerryMage
aa8d826c13 ir/terminal: Add FastDispatchHint 2020-04-22 20:53:46 +01:00
Lioncash
7ebfd0f31c ir: Add opcodes for scalar signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
a0231e5546 ir: Add opcodes for signed saturated doubling multiplies 2020-04-22 20:53:46 +01:00
Lioncash
0507e47420 ir: Add opcodes for signed saturated absolute values 2020-04-22 20:53:46 +01:00
MerryMage
3415828fb4 IR: Simplify FP{Single,Double}ToFixed{U,S}{32,64} 2020-04-22 20:53:46 +01:00
Lioncash
053175f69b ir_emitter: Rename fpscr_controlled parameters to fpcr_controlled
Part of addressing #333
2020-04-22 20:53:46 +01:00
MerryMage
ccbf6c7f63 microinstruction: A32ExceptionRaised causes CPU exception 2020-04-22 20:53:46 +01:00
MerryMage
89d08c7d61 IR: Add VectorTable and VectorTableLookup IR instructions 2020-04-22 20:53:45 +01:00
MerryMage
0288974512 opcodes: Cleanup opcodes table
* Remove T:: prefix from types.
* Add another column for a 4th argument.
2020-04-22 20:53:45 +01:00
Lioncash
0efa2ce3b0 ir: Add opcodes for performing rounding left shifts 2020-04-22 20:53:45 +01:00
Lioncash
f3f60cd179 A64: Implement ISB
Given we want to ensure that all instructions are fetched again, we can
treat an ISB instruction as a code cache flush.
2020-04-22 20:53:45 +01:00
MerryMage
d1d6f4feb5 system: Implement MRS CNTFRQ_EL0 2020-04-22 20:53:45 +01:00
Lioncash
acbaf04fef ir: Add opcodes for unsigned saturating add and subtract 2020-04-22 20:46:23 +01:00
Lioncash
2188765e28 ir/value: Use type alias CoprocessorInfo for std::array<u8, 8>
Provides a more descriptive label for the interface, and avoids the need
to hardcode the array size in multiple places.
2020-04-22 20:46:23 +01:00
MerryMage
3f4d118d73 microinstruction: Improve assert messages 2020-04-22 20:46:23 +01:00
MerryMage
17f73974f2 IR: Implement FPMulX IR instruction 2020-04-22 20:46:23 +01:00
MerryMage
f976c47008 IR: Initial implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
10e196480f IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths 2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02 ir: Add opcodes for vector paired maximum and minimums
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
2501bfbfae ir: Add opcodes for performing scalar integral min/max 2020-04-22 20:46:23 +01:00
Lioncash
7fdd8b0197 A64: Implement PMULL{2} 2020-04-22 20:46:23 +01:00
Lioncash
affa312d1d ir: Add opcode for performing polynomial multiplication 2020-04-22 20:46:22 +01:00
MerryMage
507bcd8b8b IR: Implement FPVectorTo{Signed,Unsigned}Fixed 2020-04-22 20:46:22 +01:00
MerryMage
7b03da86c2 IR: Implement FPVector{Max,Min} 2020-04-22 20:46:22 +01:00
MerryMage
ddcff86f9c microinstruction: Update ReadsFromAndWritesToFPSRCumulativeExceptionBits 2020-04-22 20:46:22 +01:00
MerryMage
901bd9b4e2 IR: Implement FPRecipStepFused, FPVectorRecipStepFused 2020-04-22 20:46:22 +01:00
MerryMage
939f5f5c7a IR: Implement FPVectorRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
c1dcfe29f7 IR: Implement FPRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
04f325a05e IR: Implement FPVectorNeg 2020-04-22 20:46:22 +01:00
MerryMage
771a4fc20b IR: Implement FPVectorMulAdd 2020-04-22 20:46:22 +01:00
MerryMage
ecbf9dbae5 IR: Implement A64OrQC 2020-04-22 20:46:22 +01:00
MerryMage
b455b566e7 A64: Implement UQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
3874cb37e3 A64: Implement SQXTN (vector) 2020-04-22 20:46:22 +01:00
MerryMage
f020dbe4ed A64: Implement SQXTUN 2020-04-22 20:46:22 +01:00
MerryMage
6918ef7360 microinstruction: Reorganize FPSCR related instruction queries 2020-04-22 20:46:22 +01:00