Commit graph

1043 commits

Author SHA1 Message Date
Lioncash
3714bc0ed4 floating_point_conversion_integer: Use FPS64ToDouble and FPU64ToDouble in SCVTF_float_int and UCVTF_float_int
The opcodes introduced in 979b6f39f1621b80bd463645ec5b08661cb6b1bf can
also be used here, avoiding more falling back to the interpreter.
2020-04-22 20:46:18 +01:00
Lioncash
b97358075e simd_scalar_two_register_misc: Handle 64-bit case in SCVTF and UCVTF's scalar double/single-precision variant
Avoids falling back to the interpreter in the 64-bit case.
2020-04-22 20:46:18 +01:00
Lioncash
7252293184 emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle()
In the non-AVX512 path, the following code is present:

code.mov(from.cvt32(), from.cvt32());

since this potentially modifies 'from', we should be using
UseScratchGpr() instead.
2020-04-22 20:46:18 +01:00
Lioncash
fbd7623fe5 emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble()
AVX-512F provides convenient instructions for these kinds of conversions
directly
2020-04-22 20:46:18 +01:00
Lioncash
3a41465eaf ir: Add opcodes for converting S64 and U64 to double-precision values 2020-04-22 20:46:18 +01:00
MerryMage
436ca80bcd Merge branch 'global_monitor' 2020-04-22 20:46:18 +01:00
Lioncash
0f4bf26e05 simd_two_register_misc: Utilize FPVectorAbs in FABS implementations
Since we already have opcodes introduced to implement FACGE and FACGT,
we can reutilize it for the FABS implementations.
2020-04-22 20:46:18 +01:00
MerryMage
821cff1227 A64: Add ClearExclusiveState method 2020-04-22 20:46:18 +01:00
Lioncash
81e572c78c ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16 2020-04-22 20:46:18 +01:00
MerryMage
2a8de5f733 a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor
The kernel would have to execute an ERET instruction to return to
userland; this clears exclusive state.
2020-04-22 20:46:18 +01:00
Lioncash
53dbb6a92a A64: Implement FACGE's vector single/double precision variants 2020-04-22 20:46:18 +01:00
MerryMage
57f7c7e1b0 Implement global exclusive monitor 2020-04-22 20:46:18 +01:00
Lioncash
6912a02d9b A64: Implement FACGT's vector single/double precision variants 2020-04-22 20:46:18 +01:00
MerryMage
85234338d3 a64_emit_x64: Simplify EmitExclusiveWrite 2020-04-22 20:46:18 +01:00
Lioncash
fc731dddae ir: Add opcodes for performing vector absolute floating-point values
This will be usable for implementing FACGE and FACGT
2020-04-22 20:46:18 +01:00
MerryMage
2fc6b33829 CMakeLists: Add missing files 2020-04-22 20:46:18 +01:00
Lioncash
0bee648b4f emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions
Given both branches are the same, we can hoist out the common code.
2020-04-22 20:46:18 +01:00
Lioncash
d86fea0d28 A64: Implement FCMEQ (zero)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
593eca7fb1 A64: Implement load/store single structure instructions
Implements LD{1, 2, 3, 4}, LD{1, 2, 3, 4}R, and ST{1, 2, 3, 4} single
structure variants.
2020-04-22 20:46:18 +01:00
Lioncash
9bec354791 A64: Implement FCMEQ (register)'s vector single and double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
b6e223fc58 emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8()
Given both branches use the same destination register size, we can hoist
the common code out.
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e ir: Add opcodes for floating-point vector equalities 2020-04-22 20:46:18 +01:00
MerryMage
be354dbfd0 ir/basic_block: Add missing U16 immediate type to DumpBlock 2020-04-22 20:46:18 +01:00
Lioncash
cf188448d4 emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
MerryMage
5503ff28c3 llvm_disassemble: Allow disassembly of invalid AArch64 instructions 2020-04-22 20:46:18 +01:00
Lioncash
954deff2d4 emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash
11a92eaaef A64: Implement SRHADD and URHADD 2020-04-22 20:46:18 +01:00
Lioncash
9e75d08860 A64: Implement FABD's scalar single/double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28 ir: Add opcodes for performing rounding halving adds 2020-04-22 20:46:18 +01:00
Lioncash
d898d1779d A64: Implement FABD's vector single/double precision variant 2020-04-22 20:46:18 +01:00
Lioncash
054549da35 emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash
8a4f8aed06 ir: Add opcode for performing FP vector absolute differences 2020-04-22 20:46:18 +01:00
Lioncash
cb456f914b A64: Implement UMLAL{2}, UMLSL{2}, and UMULL{2}
Now that we have the helper function set up for the signed variants, we
can also modify it to be used with the unigned ones by performing a zero
extension instead of a sign extension.
2020-04-22 20:46:18 +01:00
MerryMage
ba84e7a8de A64: Implement FNMSUB 2020-04-22 20:46:18 +01:00
Lioncash
3576c02d91 A64: Implement SMLSL{2} 2020-04-22 20:46:18 +01:00
MerryMage
a1042cfcd8 A64: Implement FNMADD 2020-04-22 20:46:18 +01:00
Lioncash
ada5c0b2fa A64: Implement SMLAL{2} 2020-04-22 20:46:18 +01:00
MerryMage
0d83032a6f A64: Implement FMSUB 2020-04-22 20:46:18 +01:00
Lioncash
2d1aca25e6 A64: Implement SMULL{2} 2020-04-22 20:46:18 +01:00
MerryMage
69e00d225c A64: Implement FMADD 2020-04-22 20:46:18 +01:00
MerryMage
8c90fcf58e IR: Implement FPMulAdd 2020-04-22 20:46:18 +01:00
Lioncash
c5ae9107a9 A64: Implement SABAL/SABAL2 and SABDL/SABDL2
Now that we have a helper function for the unsigned variants, we can
modify it to also be usable with the signed variants.
2020-04-22 20:46:18 +01:00
Lioncash
24e3299276 A64: Implement FCMGT, FCMGE (register) vector double and single precision variants 2020-04-22 20:46:18 +01:00
Lioncash
26d4473851 A64: Implement UABAL/UABAL2 2020-04-22 20:46:18 +01:00
Lioncash
350bc70be8 A64: Implement FCMGT, FCMGE, FCMLE, FCMLT (zero) vector double and single precision variants. 2020-04-22 20:46:18 +01:00
Lioncash
3397742c74 A64: Implement UABDL/UABDL2 2020-04-22 20:46:18 +01:00
Lioncash
c695da1cf3 ir: Add opcode for floating-point GE and GT comparisons
The rest of the comparisons can be implemented in terms of these two
2020-04-22 20:46:18 +01:00
Lioncash
6de5ed96e5 emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash
9054d1c20b A64: Implement LDR (literal, SIMD&FP) 2020-04-22 20:46:18 +01:00
Lioncash
0da5e949a8 Correct typo in DataCacheOperation enum
Fixes a typo for the InvalidateByVAToPoC enum entry. Given yuzu is the
only known user of 64-bit mode and it doesn't use this value, we can get
away with changing this.
2020-04-22 20:46:18 +01:00