Commit graph

609 commits

Author SHA1 Message Date
MerryMage
b098c650df emit_x64_floating_point: Use EmitPostProcessNaNs in EmitFPMulX 2020-04-22 20:46:23 +01:00
MerryMage
c1babf41b2 emit_x64_floating_point: Remove unnecessary DenormalsAreZero from EmitFPSingleToDouble and EmitFPDoubleToSingle 2020-04-22 20:46:23 +01:00
MerryMage
700088408d emit_x64_floating_point: Simplify EmitFP{Min,Max}{,Numeric}{32,64} 2020-04-22 20:46:23 +01:00
MerryMage
07e0585994 emit_x64_floating_point: Reduce NaN processing overhead 2020-04-22 20:46:23 +01:00
MerryMage
17f73974f2 IR: Implement FPMulX IR instruction 2020-04-22 20:46:23 +01:00
Lioncash
391e16be64 emit_x64_vector: Vectorize 32-bit variants of paired min/max
Gets rid of the fallbacks for these cases.
2020-04-22 20:46:23 +01:00
MerryMage
5ae045d67e emit_x64_vector: Improve code emission of VectorGetElement* for index == 0 2020-04-22 20:46:23 +01:00
MerryMage
e9ab7f7664 reg_alloc: Do a UseScratch if a Use destination is too small 2020-04-22 20:46:23 +01:00
MerryMage
90f8dda966 emit_x64_floating_point: AVX implementation of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
dfb660cd16 emit_x64_vector_floating_point: Prefer blendvp{s,d} to vblendvp{s,d} where possible
It's a cheaper instruction.
2020-04-22 20:46:23 +01:00
MerryMage
476c0f15da backend_x64: Remove all use of xmm0 2020-04-22 20:46:23 +01:00
MerryMage
8252efd7b1 emit_x64_vector_floating_point: AVX implementation of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
746dc521b9 emit_x64_vector_floating_point: Reduce codesize of ForceToDefaultNaN 2020-04-22 20:46:23 +01:00
MerryMage
7731dcdca9 emit_x64_vector_floating_point: Reduce codesize of EmitTwoOpVectorOperation 2020-04-22 20:46:23 +01:00
MerryMage
bb93353f94 emit_x64_vector_floating_point: Correct FMA in FTZ mode
x64 rounds before flushing to zero
AArch64 rounds after flushing to zero

This difference of behaviour is noticable if something would round to a smallest normalized number
2020-04-22 20:46:23 +01:00
MerryMage
8ef195db3c emit_x64_floating_point: DenormalsAreZero is redundant as hardware already does DAZ
Exceptions: F{MIN,MAX}{,NM}
2020-04-22 20:46:23 +01:00
MerryMage
de9d8c461c emit_x64_floating_point: FlushToZero is redundant as hardware already does FTZ 2020-04-22 20:46:23 +01:00
MerryMage
822fd4a875 backend_x64: Fix FPVectorMulAdd and FPMulAdd NaN handling with denormals
Denormals should be treated as zero in NaN handler
2020-04-22 20:46:23 +01:00
MerryMage
b393e15ab6 backend_x64: Fix bugs when FPCR.FZ=1
Bugs:
* DenormalsAreZero flushed to positive zero instead of preserving sign.
* FMAXNM/FMINNM (scalar) should perform DAZ *before* special zero handling.
* FMAX/FMIN/FMAXNM/FMINNM (vector) did not DAZ.
2020-04-22 20:46:23 +01:00
MerryMage
2019d32743 emit_x64_floating_point: Deduplicate EmitFPMulAdd implementation 2020-04-22 20:46:23 +01:00
MerryMage
e038fe72df emit_x64_floating_point: Deduplicate code 2020-04-22 20:46:23 +01:00
MerryMage
ec82a845b7 emit_x64_vector_floating_point: Fix FPVector{Max,Min} when FPCR.DN = 1 2020-04-22 20:46:23 +01:00
MerryMage
7f27945411 emit_x64_floating_point: Fix FP{Max,Min} when FPCR.DN = 1 2020-04-22 20:46:23 +01:00
MerryMage
21a28c2545 IR: SSE4.1 implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
f976c47008 IR: Initial implementation of FPVectorRoundInt 2020-04-22 20:46:23 +01:00
MerryMage
10e196480f IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths 2020-04-22 20:46:23 +01:00
MerryMage
71db0e67ae a64_emit_x64: Bugfix EmitA64OrQC - Incorrect argument 2020-04-22 20:46:23 +01:00
Lioncash
463b9a3d02 ir: Add opcodes for vector paired maximum and minimums
For the time being, we can just do a naive implementation which avoids
falling back to the interpreter a bit. Horizontal operations aren't
necessarily x86 SIMD's forte anyways.
2020-04-22 20:46:23 +01:00
Lioncash
2501bfbfae ir: Add opcodes for performing scalar integral min/max 2020-04-22 20:46:23 +01:00
Lioncash
7fdd8b0197 A64: Implement PMULL{2} 2020-04-22 20:46:23 +01:00
MerryMage
f9c6d5e1a0 common: Move all cryptographic function to common/crypto 2020-04-22 20:46:22 +01:00
MerryMage
5dc23e49d7 a32_emit_x64: BMI2 implementation of A32SetCpsr 2020-04-22 20:46:22 +01:00
MerryMage
0f85305933 a32_emit_x64: Shorten EmitA32GetCpsr 2020-04-22 20:46:22 +01:00
MerryMage
9fe2bf8733 a32_emit_x64: Assert that memory layout assumption in EmitA32GetCpsr is valid 2020-04-22 20:46:22 +01:00
Lioncash
affa312d1d ir: Add opcode for performing polynomial multiplication 2020-04-22 20:46:22 +01:00
MerryMage
507bcd8b8b IR: Implement FPVectorTo{Signed,Unsigned}Fixed 2020-04-22 20:46:22 +01:00
MerryMage
da261772ea emit_x64_vector_floating_point: AVX implementation of FPVector{Max,Min} 2020-04-22 20:46:22 +01:00
MerryMage
a0d6f0de57 emit_x64_vector_floating_point: Remove unnecessary double jump in HandleNaNs 2020-04-22 20:46:22 +01:00
MerryMage
7b03da86c2 IR: Implement FPVector{Max,Min} 2020-04-22 20:46:22 +01:00
MerryMage
901bd9b4e2 IR: Implement FPRecipStepFused, FPVectorRecipStepFused 2020-04-22 20:46:22 +01:00
MerryMage
939f5f5c7a IR: Implement FPVectorRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
c1dcfe29f7 IR: Implement FPRecipEstimate 2020-04-22 20:46:22 +01:00
MerryMage
3fe45c6d8e block_of_code: Add ABI_PARAMS array 2020-04-22 20:46:22 +01:00
MerryMage
64c2f698a2 emit_x64_vector_floating_point: Specify NanHandler::function_type explicitly
MSVC doesn't like dealing with auto return types
2020-04-22 20:46:22 +01:00
MerryMage
2ef59b4f03 emit_x64_vector_floating_point: ChooseOnFsize arguments maybe_unused 2020-04-22 20:46:22 +01:00
MerryMage
04f325a05e IR: Implement FPVectorNeg 2020-04-22 20:46:22 +01:00
MerryMage
771a4fc20b IR: Implement FPVectorMulAdd 2020-04-22 20:46:22 +01:00
MerryMage
3218bb9890 emit_x64_vector_floating_point: Standardize naming scheme 2020-04-22 20:46:22 +01:00
MerryMage
8f72be0a02 emit_x64_floating_point: Simplify indexers 2020-04-22 20:46:22 +01:00
MerryMage
25b28bb234 emit_x64_vector_floating_point: Simplify EmitVectorOperation* 2020-04-22 20:46:22 +01:00