MerryMage
b2e4c16ef8
A64: Implement FRSQRTS (vector), single/double variant
2020-04-22 20:46:22 +01:00
MerryMage
45dc5f74f3
A64: Implement FRSQRTE (vector), single/double variant
2020-04-22 20:46:22 +01:00
MerryMage
b74d5520f9
A64: Implement FRSQRTS (scalar), single/double variant
2020-04-22 20:46:22 +01:00
MerryMage
506e544bfe
IR: Implement FPRSqrtStepFused
2020-04-22 20:46:22 +01:00
MerryMage
6eb069e80d
fp: Implement FPRSqrtStepFused
2020-04-22 20:46:22 +01:00
MerryMage
b0ff35fcd1
fp: Implement FPNeg
2020-04-22 20:46:22 +01:00
MerryMage
ca6774ccce
process_nan: Add two operand variant
2020-04-22 20:46:22 +01:00
Lioncash
ace7d2ba50
A64: Implement FMAXP, FMINP, FMAXNMP and FMINNMP's scalar double/single-precision variant
2020-04-22 20:46:21 +01:00
MerryMage
66bb05fc0a
emit_x64_floating_point: Fixup special NaN case in FMA FPMulAdd implementation
2020-04-22 20:46:21 +01:00
Lioncash
070637e0f6
fp: Use a forward declaration in fused.h
...
It's permissible to forward declare here, so we can do so and eliminate
a direct header dependency
2020-04-22 20:46:21 +01:00
Lioncash
030820f649
u128: Implement comparison operators in terms of one another
...
We can just implement the comparisons in terms of operator< and
implement inequality with the negation of operator==.
2020-04-22 20:46:21 +01:00
MerryMage
a04553eb91
tests: Print cpu info
2020-04-22 20:46:21 +01:00
MerryMage
76b07d6646
u128: StickyLogicalShiftRight requires special-casing for amount == 64
...
In this case (128 - amount) == 64, and this invokes undefined behaviour
2020-04-22 20:46:21 +01:00
Lioncash
49c7edf7c6
A64: Implement FMLA and FMLS (by element)'s double/single-precision scalar variant
2020-04-22 20:46:21 +01:00
Lioncash
c704acafe4
A64: Implement FMUL (by element)'s scalar double/single-precision variant
2020-04-22 20:46:21 +01:00
MerryMage
0ce11b7b15
emit_x64_floating_point: Implement accurate fallback for FPMulAdd{32,64}
2020-04-22 20:46:21 +01:00
MerryMage
e199887fbc
fp: Implement FPMulAdd
2020-04-22 20:46:21 +01:00
MerryMage
53a8c15d12
process_nan: Add FPProcessNaNs3
2020-04-22 20:46:21 +01:00
MerryMage
1c8e93e74d
block_of_code: Add SysV ABI fifth and sixth parameters
2020-04-22 20:46:21 +01:00
MerryMage
1fe8f51c54
u128: Add StickyLogicalShiftRight
2020-04-22 20:46:21 +01:00
MerryMage
b0afd53ea7
u128: Add Multiply64To128
2020-04-22 20:46:21 +01:00
MerryMage
5566fab29a
u128: Add u128::Bit
2020-04-22 20:46:21 +01:00
MerryMage
3e62fea003
u128: Add comparison operators
2020-04-22 20:46:21 +01:00
MerryMage
f17cd6f2c5
unpacked: Use ResidualErrorOnRightShift in FPRoundBase
...
Fixes a bug relating to exponents that are severely out of range.
2020-04-22 20:46:21 +01:00
MerryMage
805428e35e
fp: Remove MantissaT
2020-04-22 20:46:21 +01:00
MerryMage
bda86fd167
FPRSqrtEstimate: Improve documentation of RecipSqrtEstimate
2020-04-22 20:46:21 +01:00
Lioncash
0a64a66b26
FPRSqrtEstimate: Deduplicate array bounds
...
Dehardcodes a few constants in the loops.
2020-04-22 20:46:21 +01:00
Lioncash
b7bd70fd19
A64: Implement FMAXV, FMINV, FMAXNMV, and FMINNMV
2020-04-22 20:46:21 +01:00
Lioncash
664fb12e21
FPRSqrtEstimate: Use forward declarations where applicable
2020-04-22 20:46:21 +01:00
Lioncash
3447c82656
translate: Return by bool in helpers where applicable
...
Gets rid of a bit of duplication regarding the early-out cases and makes
all helpers functions consistent (previously some had a return type of
bool, while others had a return type of void).
2020-04-22 20:46:21 +01:00
Lioncash
d65b056eba
Simplify fallback case for EmitVectorSetElement64()
2020-04-22 20:46:21 +01:00
MerryMage
6087c2af6f
emit_x64_floating_point: s/Esimate/Estimate/
2020-04-22 20:46:21 +01:00
MerryMage
f837ce8e78
simd_scalar_two_register_misc: Implement FRSQRTE, scalar variant
2020-04-22 20:46:21 +01:00
MerryMage
bde58b04d4
IR: Implement FPRSqrtEstimate
2020-04-22 20:46:21 +01:00
MerryMage
16061c28f3
simd_vector_x_indexed_element: Implement FMUL (by element), vector variant
2020-04-22 20:46:21 +01:00
MerryMage
55eaa16615
a64_emit_x64: Ensure host has updated ticks in EmitA64GetCNTPCT
...
Discovered by @Subv.
Fixes incomplete fix begun in 5a91c94dca47c9702dee20fbd5ae1f4c07eef9df.
That fix fails to take into account that LinkBlock doesn't update ticks until there
are no remaining ticks to be executed.
Test added to confirm fix.
2020-04-22 20:46:21 +01:00
MerryMage
edd795e991
a64_emit_x64: Fix stack misalignment on Windows for 128-bit exclusive writes
...
Discovered by @Subv.
Includes a test to ensure this codepath is exercised on Windows.
2020-04-22 20:46:21 +01:00
Lioncash
04b4c8b0cf
emit_x64_aes: Eliminate extraneous usage of a scratch register in EmitAESInverseMixColumns()
...
We can just use the same register the data is in as the result register,
eliminating the need to use a completely separate register to store the
result.
2020-04-22 20:46:21 +01:00
Lioncash
e5d80e998e
A64: Implement SADDLV
2020-04-22 20:46:21 +01:00
Lioncash
a1bc8ddb53
A64: Implement UADDLV
2020-04-22 20:46:21 +01:00
Lioncash
1dc1e3dcd8
fp: Use forward declarations where applicable
...
Minimizes the amount of files that need to be rebuilt if the headers
ever change.
2020-04-22 20:46:21 +01:00
Lioncash
46cb0d813b
emit_x64_vector: Append 'v' prefix onto movq in AVX path
...
This is something I missed when adding in the AVX broadcast code.
2020-04-22 20:46:21 +01:00
Subv
4606a081c9
A64: The A64SetTPIDR IR instruction writes to a system register and should not be eliminated by the dead code elimination pass.
...
Previously this instruction was alway eliminated, resulting in incorrect values for TPIDR_EL0.
2020-04-22 20:46:21 +01:00
MerryMage
b53127600b
fp: A64::FPCR -> FP::FPCR
2020-04-22 20:46:21 +01:00
MerryMage
084bf63a10
bit_util: Implement ClearBits and ModifyBits
2020-04-22 20:46:21 +01:00
MerryMage
699c5f36d5
system: Simplify static_cast
2020-04-22 20:46:21 +01:00
Lioncash
d4688b7f2d
externals: Update Xbyak to 5.65
2020-04-22 20:46:21 +01:00
MerryMage
3f602129f4
system: Ensure value of CNTPCT_EL0 is accurate
...
Since we currently only update the host's tick count at the end of a
block, we force an end-of-block before executing a MRS %, CNTPCT_ELO
instruction.
2020-04-22 20:46:21 +01:00
Lioncash
84affdb260
safe_ops: Avoid cases where shift bases are invalid with signed values
...
For example, say the converted signed type is s64, shifting left by 63
bits would be undefined behavior.
However, given an ASL is essentially the same behavior as an LSL
we can just use an unsigned type instead of converting to a signed type.
2020-04-22 20:46:21 +01:00
Lioncash
d0274f412a
safe_ops: Avoid signed overflow in Negate()
...
Negation of values such as -9223372036854775808 can't be represented in
signed equivalents (such as long long), leading to signed overflow.
Therefore, we can just invert bits and add 1 to perform this behavior
with unsigned arithmetic.
2020-04-22 20:46:21 +01:00