Commit graph

547 commits

Author SHA1 Message Date
MerryMage
506e544bfe IR: Implement FPRSqrtStepFused 2020-04-22 20:46:22 +01:00
MerryMage
66bb05fc0a emit_x64_floating_point: Fixup special NaN case in FMA FPMulAdd implementation 2020-04-22 20:46:21 +01:00
MerryMage
0ce11b7b15 emit_x64_floating_point: Implement accurate fallback for FPMulAdd{32,64} 2020-04-22 20:46:21 +01:00
MerryMage
1c8e93e74d block_of_code: Add SysV ABI fifth and sixth parameters 2020-04-22 20:46:21 +01:00
Lioncash
d65b056eba Simplify fallback case for EmitVectorSetElement64() 2020-04-22 20:46:21 +01:00
MerryMage
6087c2af6f emit_x64_floating_point: s/Esimate/Estimate/ 2020-04-22 20:46:21 +01:00
MerryMage
bde58b04d4 IR: Implement FPRSqrtEstimate 2020-04-22 20:46:21 +01:00
MerryMage
55eaa16615 a64_emit_x64: Ensure host has updated ticks in EmitA64GetCNTPCT
Discovered by @Subv.
Fixes incomplete fix begun in 5a91c94dca47c9702dee20fbd5ae1f4c07eef9df.
That fix fails to take into account that LinkBlock doesn't update ticks until there
are no remaining ticks to be executed.

Test added to confirm fix.
2020-04-22 20:46:21 +01:00
MerryMage
edd795e991 a64_emit_x64: Fix stack misalignment on Windows for 128-bit exclusive writes
Discovered by @Subv.
Includes a test to ensure this codepath is exercised on Windows.
2020-04-22 20:46:21 +01:00
Lioncash
04b4c8b0cf emit_x64_aes: Eliminate extraneous usage of a scratch register in EmitAESInverseMixColumns()
We can just use the same register the data is in as the result register,
eliminating the need to use a completely separate register to store the
result.
2020-04-22 20:46:21 +01:00
Lioncash
1dc1e3dcd8 fp: Use forward declarations where applicable
Minimizes the amount of files that need to be rebuilt if the headers
ever change.
2020-04-22 20:46:21 +01:00
Lioncash
46cb0d813b emit_x64_vector: Append 'v' prefix onto movq in AVX path
This is something I missed when adding in the AVX broadcast code.
2020-04-22 20:46:21 +01:00
MerryMage
b53127600b fp: A64::FPCR -> FP::FPCR 2020-04-22 20:46:21 +01:00
Lioncash
0ec8dac660 emit_x64: Remove FPSCR_RoundTowardsZero() virtual function from EmitContext struct
This code was bugged in that we were comparing if the rounding mode was
not equal to rounding towards zero. Fortunately, however, nothing uses
this function anymore, and there's already the more general
FPSCR_RMode() available, so this can be removed entirely.
2020-04-22 20:46:21 +01:00
Lioncash
fd92e2f186 emit_x64: Add missing <array> include
Commit 755adef62e504a8d616de9dda8937d2428a9471b introduced a helper
alias for std::array, eliminating the need to manually type out sizes
for them, however I forgot to add the include for <array>
2020-04-22 20:46:21 +01:00
Lioncash
f939bd0228 emit_x64_vector{_floating_point}: Add helper alias for sizing arrays relative to vector width
Avoids needing to remember to specify the proper size of the arrays, all
that's needed is to specify the type of the array and the size will
automatically be deduced from it. This helps prevent potential oversized
or undersized arrays from being specified.
2020-04-22 20:46:21 +01:00
MerryMage
58f3399032 A64/PopRSBHint: Prevent RETing to a guest PC of ~0ull from crashing the jit 2020-04-22 20:46:21 +01:00
MerryMage
e18fca17dc A64: Implement FABD in terms of existing IR instructions
Fixes NaN issue. Closes #306.
2020-04-22 20:46:21 +01:00
MerryMage
83be491875 emit_x64_floating_point: SSE4.1 implementation of EmitFPRound 2020-04-22 20:46:20 +01:00
MerryMage
b228694012 IR: Implement FPRoundInt 2020-04-22 20:46:20 +01:00
MerryMage
295deb4035 a64_jit_state: Add FPSR.QC flag 2020-04-22 20:46:20 +01:00
Lioncash
7797bc2fb2 emit_x64_vector: Use non-scratch Use* variants of registers within EmitVectorUnsignedAbsoluteDifference()
In some cases, a register isn't modified, depending on the branch taken,
so we can signify this by using the non-scratch variants in certain
cases.
2020-04-22 20:46:20 +01:00
MerryMage
33fa65de23 A64: Implement FADDP (vector) 2020-04-22 20:46:19 +01:00
MerryMage
9dba273a8c A64: Implement SADDLP 2020-04-22 20:46:19 +01:00
MerryMage
70ff2d73b5 A64: Implement UADDLP 2020-04-22 20:46:19 +01:00
MerryMage
5563bbbd79 A64: Implement EXT 2020-04-22 20:46:19 +01:00
MerryMage
304cc7f61e emit_x64_floating_point: SSE4.1 implementation for FP{Double,Single}ToFixed{S,U}{32,64} 2020-04-22 20:46:19 +01:00
MerryMage
caaf36dfd6 IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64}
This implementation just falls-back to the software floating point implementation.
2020-04-22 20:46:19 +01:00
MerryMage
760cc3ca89 EmitContext: Expose FPCR 2020-04-22 20:46:19 +01:00
MerryMage
3cb98e1560 fp: Move fp_util to fp/util 2020-04-22 20:46:19 +01:00
Lioncash
4aa4885ba7 ir: Add opcodes for vector conversion of u32/u64 to floating-point 2020-04-22 20:46:19 +01:00
Lioncash
7a84b6e8d8 ir: Add opcodes for converting S64 and U64 to single-precision floating-point values 2020-04-22 20:46:19 +01:00
Lioncash
066061fa50 constant_pool: Remove unnecessary std::memset from constructor
AllocateFromCodeSpace() already zeroes out the allocated memory.
2020-04-22 20:46:19 +01:00
Lioncash
35026a6ce3 emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32() 2020-04-22 20:46:19 +01:00
MerryMage
0b97e9bd8d emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode 2020-04-22 20:46:18 +01:00
MerryMage
a2eb9a02e0 backend_x86: Add FPSCR_RMode to EmitContext 2020-04-22 20:46:18 +01:00
MerryMage
d875c08ebf fp: Extract common RoundingMode enum 2020-04-22 20:46:18 +01:00
Lioncash
7252293184 emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle()
In the non-AVX512 path, the following code is present:

code.mov(from.cvt32(), from.cvt32());

since this potentially modifies 'from', we should be using
UseScratchGpr() instead.
2020-04-22 20:46:18 +01:00
Lioncash
fbd7623fe5 emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble()
AVX-512F provides convenient instructions for these kinds of conversions
directly
2020-04-22 20:46:18 +01:00
Lioncash
3a41465eaf ir: Add opcodes for converting S64 and U64 to double-precision values 2020-04-22 20:46:18 +01:00
MerryMage
436ca80bcd Merge branch 'global_monitor' 2020-04-22 20:46:18 +01:00
MerryMage
821cff1227 A64: Add ClearExclusiveState method 2020-04-22 20:46:18 +01:00
Lioncash
81e572c78c ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16 2020-04-22 20:46:18 +01:00
MerryMage
2a8de5f733 a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor
The kernel would have to execute an ERET instruction to return to
userland; this clears exclusive state.
2020-04-22 20:46:18 +01:00
MerryMage
57f7c7e1b0 Implement global exclusive monitor 2020-04-22 20:46:18 +01:00
MerryMage
85234338d3 a64_emit_x64: Simplify EmitExclusiveWrite 2020-04-22 20:46:18 +01:00
Lioncash
fc731dddae ir: Add opcodes for performing vector absolute floating-point values
This will be usable for implementing FACGE and FACGT
2020-04-22 20:46:18 +01:00
Lioncash
0bee648b4f emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions
Given both branches are the same, we can hoist out the common code.
2020-04-22 20:46:18 +01:00
Lioncash
b6e223fc58 emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8()
Given both branches use the same destination register size, we can hoist
the common code out.
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e ir: Add opcodes for floating-point vector equalities 2020-04-22 20:46:18 +01:00