Commit graph

333 commits

Author SHA1 Message Date
Wunkolo
490160ef43 emit_x64_vector: GNFI implementation of ArithmeticShiftRightByte
The bit-matrix is generated up-front and added to the constant-pool.
I'm using an embedded 64-bit broadcast here(m64bcst) which is the particular
EVEX encoded version of the instruction with AVX512VL+GNFI.

If it ever really matters, then we would ideally detect specific host
features like bare-GFNI and specific subsets of AVX512 and emit
the assembly based on that rather than by the entire Icelake uarch.
2020-11-07 15:29:12 +00:00
Wunkolo
7df235aefb emit_x64_vector: GNFI implementation of EmitVectorLogicalShiftLeft8
Same principle as EmitVectorLogicalShiftRight8. An 8x8 galois identity
matrix is bit-shfited to allow for arbitrary 8-bit-lane shifts.
2020-11-07 15:29:12 +00:00
Wunkolo
5cc646ffed emit_x64_vector: GNFI implementation of EmitVectorLogicalShiftRight8
Bitshifts of the GFNI identity matrix generates a new matrix that
applies lane-wise bitshifts as well. This allows for a fast
single-instruction implementation of a byte-lane bitshift.
2020-11-07 15:29:12 +00:00
Wunkolo
6bb49726f4 emit_x64_vector: GNFI+SSSE3 implementation of EmitVectorReverseBits
Performs a full 128-bit bit-reversal using only two instructions.

First by reversing all the bits of each byte using a galois matrix
multiplication(vgf2p8affineqb, Icelake), and then by reversing the bytes
themselves(pshufb, ssse3).
2020-10-02 05:56:59 +01:00
ReinUsesLisp
eb00bea1ff backend/x64/exception_handler_posix: Fix signal stack memory leak in SigHandler
std::malloc was being called inside SigHandler's constructor without a
std::free. This doesn't really matter as SigHandler is used as a
singleton and the OS will reclaim that memory. That said, properly
freeing memory keeps -fsanitize=address quiet.
2020-10-02 05:56:07 +01:00
Wunkolo
c2d5f6da90 block_of_code: Add HasAVX512_Icelake
Detect AVX512 feature support up to the [Icelake-level featureset](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512)
2020-09-19 15:20:40 +01:00
Lioncash
889635d17d general: Resolve -Wmissing-prototypes warnings 2020-08-14 14:50:09 -04:00
Lioncash
34f4d99454 block_of_code: Remove unused variables in GenRunCode()
These aren't used, so they can be removed.
2020-08-14 14:35:17 -04:00
MerryMage
6bbc53839f Unsafe Optimization: Extend Unsafe_UnfuseFMA to all FMA-related instructions 2020-07-12 12:45:12 +01:00
MerryMage
d05d95c132 Improve documentation of unsafe optimizations 2020-07-12 12:41:11 +01:00
MerryMage
82417da780 emit_x64{_vector}_floating_point: Add unsafe optimizations for RSqrtEstimate and RecipEstimate 2020-07-11 14:05:57 +01:00
MerryMage
761e95eec0 A64: Add unsafe_optimizations option
* Strength reduce FMA unsafely
2020-07-06 21:02:30 +01:00
MerryMage
f35aaa017c IR: Add VectorDeinterleave{Even,Odd}Lower 2020-07-04 11:04:10 +01:00
MerryMage
4ba1f8b9e7 Add optimization flags to disable specific optimizations 2020-07-04 11:04:10 +01:00
MerryMage
4b8a781c04 emit_x64_floating_point: Introduce ICODE 2020-07-04 11:04:10 +01:00
MerryMage
7022281a0b emit_x64_vector_floating_point: Introduce ICODE 2020-07-04 11:04:10 +01:00
Merry
b1ff971a92 backend/x64: Temporarily avoid use of DefineValue(Argument&)
Issues with inappropriate values in upper bits of values
2020-06-27 10:52:59 +01:00
MerryMage
46445d0866 A64: Remove NaN accuracy setting
Always do accuracte NaN handling.
2020-06-24 22:26:10 +01:00
MerryMage
2008fda88b emit_x64_floating_point: Correct error in s16 rounding in EmitFPToFixed 2020-06-22 22:54:38 +01:00
MerryMage
3ea49fc6d6 A32: Implement VFPv3 VCT (between floating-point and fixed-point) 2020-06-22 22:08:58 +01:00
Fernando Sahmkow
2fa1c1d13c A32: Allow cleaning up exclusive state from the interface.
This function is normally required for emulating certain OS mechanisms.
2020-06-21 18:18:33 +01:00
MerryMage
d7197745ac emit_x64_vector_floating_point: fpcr_controlled is unused when fsize == 16 in EmitFPVectorToFixed 2020-06-21 14:46:06 +01:00
MerryMage
b32fc5ab0f a64_emit_x64: EmitVAddrLookup: Use bzhi instruction when silently_mirror_page_table is active and BMI2 is available 2020-06-21 14:46:06 +01:00
MerryMage
c836b389c8 emit_x64_vector_floating_point: Add fpcr_controlled argument to all IR instructions 2020-06-21 14:28:25 +01:00
MerryMage
7d1e103ff5 IR: Implement VectorTranspose 2020-06-21 12:14:13 +01:00
MerryMage
8bbc9fdbb6 A32: Implement ASIMD VTBX 2020-06-20 22:35:31 +01:00
MerryMage
92cb4a5a34 A32: Implement ASIMD VRSQRTE 2020-06-20 15:13:22 +01:00
MerryMage
6f59c2cd8e A32: Implement ASIMD VRECPE 2020-06-20 15:07:06 +01:00
MerryMage
d3dc50d718 A32: Implement ASIMD VRSQRTS 2020-06-20 15:06:06 +01:00
MerryMage
8f506c80c3 A32: Implement ASIMD VRECPS 2020-06-20 14:39:05 +01:00
MerryMage
f58e247ef3 A32: Implement ASIMD VPADD (floating-point) 2020-06-20 14:25:04 +01:00
MerryMage
e006f0a205 A32: Implement ASIMD VSUB (floating-point) 2020-06-20 14:20:28 +01:00
MerryMage
4c939b9d0a A32: Implement ASIMD VADD (floating-point) 2020-06-20 14:20:28 +01:00
MerryMage
5ec8e48593 A32: Implement ASIMD VMUL (floating-point)
* Also add fpcr_controlled arguments to FPVectorMul IR instruction
* Merge ASIMD floating-point instruction implementations
2020-06-20 14:20:28 +01:00
MerryMage
bb4f3aa407 A32: Implement ASIMD VMAX, VMIN (floating-point) 2020-06-20 03:21:07 +01:00
MerryMage
656419286c ir: Add fpcr_controlled argument to FPVector{Equal,Greater,GreaterEqual} 2020-06-20 00:50:40 +01:00
MerryMage
1b3a70a83c backend/x64: Implement separate MSXCSR for ASIMDStandardValue 2020-06-20 00:00:36 +01:00
MerryMage
55c021fe82 emit_x64_aes: AESNI implementations of all opcodes 2020-06-19 12:11:45 +01:00
MerryMage
b759773b3b a32_emit_x64: EmitVAddrLookup: Use 64-bit registers where required 2020-06-19 00:44:52 +01:00
MerryMage
7dd9901de2 a32_emit_x64: Incorrect type in ExclusiveWriteMemory 2020-06-19 00:19:46 +01:00
MerryMage
87f6e412d0 emit_x64_vector: SSE4.1 implementation of EmitVectorPolynomialMultiply{Long}8 2020-06-18 18:44:00 +01:00
MerryMage
f5b41aabc6 emit_x64_vector: Implement EmitVectorPolynomialMultiplyLong64 in terms of pclmulqdq 2020-06-18 18:04:23 +01:00
MerryMage
13367a7efd A64: Match A32 page_table code
Here we increase the similarity between the A64 and A32 front-ends in terms of their
page_table handling code. In this commit, we:

* Reserve and use r14 as a register to store the page_table pointer.
* Align the code to be more similar in structure.
* Add a conf member to A32EmitContext.
* Remove scratch argument from EmitVAddrLookup.
2020-06-18 12:22:59 +01:00
MerryMage
b88c291f81 A32: Detect misaligned memory accesses
This avoids issues with misaligned memory accesses writing into the next page.
2020-06-17 17:51:37 +01:00
MerryMage
9f3277540a Merge A32 and A64 exclusive monitors 2020-06-17 10:33:09 +01:00
MerryMage
53422bec46 a64_emit_x64: Reduce code duplication in exclusive memory code 2020-06-16 18:16:33 +01:00
MerryMage
a1c9bb94a8 A32: Add yuzu-specific hacks 2020-06-16 17:54:21 +01:00
MerryMage
2c1a4843ad A32 global exlcusive monitor 2020-06-16 17:54:21 +01:00
MerryMage
58abdcce5b backend/x64/a32_*: Rename config to conf
Standardize on one name for both a32_* and a64_*.
2020-06-16 14:56:44 +01:00
MerryMage
7ea521b8bf a32_emit_x64: Change ExclusiveWriteMemory64 to require a single U64 argument 2020-06-16 13:32:50 +01:00
MerryMage
aa341b7eea a32_emit_x64: Make ExclusiveWrite a member function of A32EmitX64 2020-06-16 13:03:17 +01:00
MerryMage
34ef5142e3 a32_emit_x64: Specify callback as template argument
Removes unnecessary switch statement.
2020-06-16 10:23:51 +01:00
MerryMage
58b2c83944 a32_emit_x64: Reduce mov code duplication in {Read,Write}Memory 2020-06-16 10:14:06 +01:00
MerryMage
2796a85096 interface/a32: Remove descriptor argument from Disassemble 2020-06-12 15:27:42 +01:00
MerryMage
3ccc415c52 emit_x64_saturation: Improve codegen for saturated result in EmitSignedSaturation 2020-06-12 15:24:37 +01:00
MerryMage
e953f67201 emit_x64_packed: PackedAbsDiffSumS8: Fix case when bits above the lower 32 bits are not zero 2020-06-12 15:24:09 +01:00
MerryMage
c4cf0b3e47 exception_handler_posix: Just disable fastmem if initialization fails 2020-06-10 22:52:27 +01:00
MerryMage
55bddc767f backend/x64: Touch PEXT/PDEP code
* Use pext/pdep where not previously used
* Limit pext/pdep to non-AMD platforms due to slowness on AMD
* Use imul/and as alternatives for AMD and non-BMI2 platforms
2020-06-10 22:30:22 +01:00
MerryMage
f495018f53 block_of_code: Encapsulate CPU feature detection code 2020-06-09 21:25:57 +01:00
MerryMage
feddf69cb4 emit_x64_crc32: Use same constants 2020-06-06 20:46:09 +01:00
MerryMage
66a356e6cb emit_x64_crc32: Further improvements to codegen 2020-06-06 19:04:20 +01:00
MerryMage
bcde135c23 emit_x64_crc32: Improve 64-bit PCLMULQDQ implementation of EmitCRC32ISO
Reduce number of PCLMULQDQs to 3
2020-06-04 19:23:51 +01:00
MerryMage
0f9c70ff42 emit_x64_crc32: Improve PCLMULQDQ implementation of EmitCRC32ISO
Remove use of pshufd
2020-06-03 18:55:58 +01:00
MerryMage
fa6aee434e emit_x64_crc32: PCLMULQDQ implementation of EmitCRC32ISO 2020-06-03 11:16:53 +01:00
MerryMage
b47adaee1d emit_x64_vector: SSSE3 implementation of EmitVectorExtract 2020-06-01 15:41:36 +01:00
MerryMage
f3845cea9a A32: Implement ASIMD VQSUB instruction 2020-05-30 18:19:17 +01:00
MerryMage
4e90754873 IR: Implement VectorSaturated{Signed,Unsigned}{Add,Sub} 2020-05-30 15:55:32 +01:00
MerryMage
07108246cf A32/IR: Add SetVector and GetVector 2020-05-28 20:39:19 +01:00
MerryMage
93c289b54f Use tsl::robin_map and tsl::robin_set
Replace std::unordered_map and std::unordered_set with the above.
Better performance profile.
2020-05-26 20:51:48 +01:00
Lioncash
9b93a9de46 a32_jitstate: Remove obsoleted debug assert 2020-05-16 20:22:12 +01:00
MerryMage
2169653c50 a64_emit_x64: Invalid regalloc code for EmitA64ExclusiveReadMemory128
Attempted to allocate args[0] after end of allocation scope
2020-05-16 14:11:23 +01:00
MerryMage
8b3bc92bce backend/x64: Reduce conversions required for cpsr_nzcv
The guest program often accesses the NZCV flags directly much less
often than we need to use them for jumps and other such uses.

Therefore, we store our flags in cpsr_nzcv in a x64-friendly format.

This allows for a reduction in conditional jump related code.
2020-05-06 22:38:06 +01:00
Fernando Sahmkow
d7abae1e31 A64: Implement Exceptional Exit. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
41521ed856 User Config: Add option to specify wall clock CNTPCT. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
97b9d3e058 Exclusive Monitor: Rework exclusive monitor interface. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
b5d8b24a3c Exclusive Monitor: Allow clearing a single processor. 2020-05-03 01:40:36 +01:00
Fernando Sahmkow
2068658a82 A64 Interface: Allow changing processor id.
This commit allows the JIT to be used per guest thread and change it's
core when the thread is migrated.
2020-05-03 01:40:36 +01:00
MerryMage
24229ab899 constant_propagation_pass: Don't fold add if we nee flags
Results in incorrect flags
2020-04-29 15:33:12 +01:00
MerryMage
94d0d33e02 Fix single stepping for certain instructions
Several issues:
1. Several terminal instructions did not stop at the end of a single-step block
2. x64 backend for the A32 frontend sometimes polluted upper_location_descriptor with the single-stepping flag

We also introduce the enable_optimizations parameter to the A32 frontend.
2020-04-24 11:44:38 +01:00
MerryMage
69061d87fa exception_handler_windows: Ignore irrelevant exceptions 2020-04-23 20:58:24 +01:00
MerryMage
5c0bb5cc63 Remove unreachable code (MSVC warnings) 2020-04-23 16:36:34 +01:00
MerryMage
a8a712c801 Relicense to 0BSD 2020-04-23 15:45:57 +01:00
MerryMage
f59b9fb020 IR: Add ReplicateBit microinstruction 2020-04-22 21:07:09 +01:00
MerryMage
0c51313479 A64: Add enable_optimizations configuration option
Allow library users to disable optimizations for debugging reasons.
2020-04-22 21:06:18 +01:00
MerryMage
8bef1afb9a emit_x64_floating_point: SSE2 implementation for DenormalsAreZero 2020-04-22 21:06:18 +01:00
MerryMage
cd1560c664 emit_x64: Do not clear fast_dispatch_table unnecessarily
Reduces invalidation overhead
2020-04-22 21:06:18 +01:00
MerryMage
35402a9a17 a64_emit_x64: Fix location descriptor generation in GenTerminalHandlers 2020-04-22 21:06:18 +01:00
MerryMage
2770115757 emit_x64_data_processing: EmitMaskedShift: Use appropriately sized immediates 2020-04-22 21:06:18 +01:00
MerryMage
cc012a830c exception_handler_windows: Do not attempt to call cb when cb isn't callable 2020-04-22 21:06:18 +01:00
MerryMage
4e83e81e58 backend/x64: Add fastmem support to Windows exception handler 2020-04-22 21:06:18 +01:00
MerryMage
b7b71d65c2 backend/x64: Add POSIX exception handler with fastmem support 2020-04-22 21:06:18 +01:00
MerryMage
2d348d2d68 backend/x64: Add macOS exception handler with fastmem support 2020-04-22 21:06:18 +01:00
MerryMage
4636055646 a32_emit_x64: Implement fastmem 2020-04-22 21:06:17 +01:00
MerryMage
f9b9081d4c a32_emit_x64: Fully wrapped memory fallbacks
In the same style as the A64 backend
2020-04-22 21:06:17 +01:00
MerryMage
ad52c997f4 a32_emit_x64: Use r14 for page_table pointer 2020-04-22 21:06:17 +01:00
MerryMage
49fcfe040c reg_alloc: Explicitly specify GPR and XMM order
This allows each backend to modify what registers they want to use and their preferred orderings
2020-04-22 21:06:17 +01:00
MerryMage
c232ad7971 a32_emit_x64: Make {Read,Write}Memory member functions of A32EmitX64 2020-04-22 21:06:17 +01:00
MerryMage
5267dbb8cf emit_x64_saturation: Prefer changeBit to setBit 2020-04-22 21:06:17 +01:00
MerryMage
9d60d92692 backend/x64: Make ExceptionHandler its own class 2020-04-22 21:06:17 +01:00
MerryMage
325808949f backend/x64: Rename namespace BackendX64 -> Backend::X64 2020-04-22 21:06:17 +01:00