MerryMage
92cb4a5a34
A32: Implement ASIMD VRSQRTE
2020-06-20 15:13:22 +01:00
MerryMage
6f59c2cd8e
A32: Implement ASIMD VRECPE
2020-06-20 15:07:06 +01:00
MerryMage
d3dc50d718
A32: Implement ASIMD VRSQRTS
2020-06-20 15:06:06 +01:00
MerryMage
8f506c80c3
A32: Implement ASIMD VRECPS
2020-06-20 14:39:05 +01:00
MerryMage
f58e247ef3
A32: Implement ASIMD VPADD (floating-point)
2020-06-20 14:25:04 +01:00
MerryMage
e006f0a205
A32: Implement ASIMD VSUB (floating-point)
2020-06-20 14:20:28 +01:00
MerryMage
4c939b9d0a
A32: Implement ASIMD VADD (floating-point)
2020-06-20 14:20:28 +01:00
MerryMage
5ec8e48593
A32: Implement ASIMD VMUL (floating-point)
...
* Also add fpcr_controlled arguments to FPVectorMul IR instruction
* Merge ASIMD floating-point instruction implementations
2020-06-20 14:20:28 +01:00
MerryMage
bb4f3aa407
A32: Implement ASIMD VMAX, VMIN (floating-point)
2020-06-20 03:21:07 +01:00
MerryMage
656419286c
ir: Add fpcr_controlled argument to FPVector{Equal,Greater,GreaterEqual}
2020-06-20 00:50:40 +01:00
MerryMage
1b3a70a83c
backend/x64: Implement separate MSXCSR for ASIMDStandardValue
2020-06-20 00:00:36 +01:00
MerryMage
55c021fe82
emit_x64_aes: AESNI implementations of all opcodes
2020-06-19 12:11:45 +01:00
MerryMage
b759773b3b
a32_emit_x64: EmitVAddrLookup: Use 64-bit registers where required
2020-06-19 00:44:52 +01:00
MerryMage
7dd9901de2
a32_emit_x64: Incorrect type in ExclusiveWriteMemory
2020-06-19 00:19:46 +01:00
MerryMage
87f6e412d0
emit_x64_vector: SSE4.1 implementation of EmitVectorPolynomialMultiply{Long}8
2020-06-18 18:44:00 +01:00
MerryMage
f5b41aabc6
emit_x64_vector: Implement EmitVectorPolynomialMultiplyLong64 in terms of pclmulqdq
2020-06-18 18:04:23 +01:00
MerryMage
13367a7efd
A64: Match A32 page_table code
...
Here we increase the similarity between the A64 and A32 front-ends in terms of their
page_table handling code. In this commit, we:
* Reserve and use r14 as a register to store the page_table pointer.
* Align the code to be more similar in structure.
* Add a conf member to A32EmitContext.
* Remove scratch argument from EmitVAddrLookup.
2020-06-18 12:22:59 +01:00
MerryMage
b88c291f81
A32: Detect misaligned memory accesses
...
This avoids issues with misaligned memory accesses writing into the next page.
2020-06-17 17:51:37 +01:00
MerryMage
9f3277540a
Merge A32 and A64 exclusive monitors
2020-06-17 10:33:09 +01:00
MerryMage
53422bec46
a64_emit_x64: Reduce code duplication in exclusive memory code
2020-06-16 18:16:33 +01:00
MerryMage
a1c9bb94a8
A32: Add yuzu-specific hacks
2020-06-16 17:54:21 +01:00
MerryMage
2c1a4843ad
A32 global exlcusive monitor
2020-06-16 17:54:21 +01:00
MerryMage
58abdcce5b
backend/x64/a32_*: Rename config to conf
...
Standardize on one name for both a32_* and a64_*.
2020-06-16 14:56:44 +01:00
MerryMage
7ea521b8bf
a32_emit_x64: Change ExclusiveWriteMemory64 to require a single U64 argument
2020-06-16 13:32:50 +01:00
MerryMage
aa341b7eea
a32_emit_x64: Make ExclusiveWrite a member function of A32EmitX64
2020-06-16 13:03:17 +01:00
MerryMage
34ef5142e3
a32_emit_x64: Specify callback as template argument
...
Removes unnecessary switch statement.
2020-06-16 10:23:51 +01:00
MerryMage
58b2c83944
a32_emit_x64: Reduce mov code duplication in {Read,Write}Memory
2020-06-16 10:14:06 +01:00
MerryMage
2796a85096
interface/a32: Remove descriptor argument from Disassemble
2020-06-12 15:27:42 +01:00
MerryMage
3ccc415c52
emit_x64_saturation: Improve codegen for saturated result in EmitSignedSaturation
2020-06-12 15:24:37 +01:00
MerryMage
e953f67201
emit_x64_packed: PackedAbsDiffSumS8: Fix case when bits above the lower 32 bits are not zero
2020-06-12 15:24:09 +01:00
MerryMage
c4cf0b3e47
exception_handler_posix: Just disable fastmem if initialization fails
2020-06-10 22:52:27 +01:00
MerryMage
55bddc767f
backend/x64: Touch PEXT/PDEP code
...
* Use pext/pdep where not previously used
* Limit pext/pdep to non-AMD platforms due to slowness on AMD
* Use imul/and as alternatives for AMD and non-BMI2 platforms
2020-06-10 22:30:22 +01:00
MerryMage
f495018f53
block_of_code: Encapsulate CPU feature detection code
2020-06-09 21:25:57 +01:00
MerryMage
feddf69cb4
emit_x64_crc32: Use same constants
2020-06-06 20:46:09 +01:00
MerryMage
66a356e6cb
emit_x64_crc32: Further improvements to codegen
2020-06-06 19:04:20 +01:00
MerryMage
bcde135c23
emit_x64_crc32: Improve 64-bit PCLMULQDQ implementation of EmitCRC32ISO
...
Reduce number of PCLMULQDQs to 3
2020-06-04 19:23:51 +01:00
MerryMage
0f9c70ff42
emit_x64_crc32: Improve PCLMULQDQ implementation of EmitCRC32ISO
...
Remove use of pshufd
2020-06-03 18:55:58 +01:00
MerryMage
fa6aee434e
emit_x64_crc32: PCLMULQDQ implementation of EmitCRC32ISO
2020-06-03 11:16:53 +01:00
MerryMage
b47adaee1d
emit_x64_vector: SSSE3 implementation of EmitVectorExtract
2020-06-01 15:41:36 +01:00
MerryMage
f3845cea9a
A32: Implement ASIMD VQSUB instruction
2020-05-30 18:19:17 +01:00
MerryMage
4e90754873
IR: Implement VectorSaturated{Signed,Unsigned}{Add,Sub}
2020-05-30 15:55:32 +01:00
MerryMage
07108246cf
A32/IR: Add SetVector and GetVector
2020-05-28 20:39:19 +01:00
MerryMage
93c289b54f
Use tsl::robin_map and tsl::robin_set
...
Replace std::unordered_map and std::unordered_set with the above.
Better performance profile.
2020-05-26 20:51:48 +01:00
Lioncash
9b93a9de46
a32_jitstate: Remove obsoleted debug assert
2020-05-16 20:22:12 +01:00
MerryMage
2169653c50
a64_emit_x64: Invalid regalloc code for EmitA64ExclusiveReadMemory128
...
Attempted to allocate args[0] after end of allocation scope
2020-05-16 14:11:23 +01:00
MerryMage
8b3bc92bce
backend/x64: Reduce conversions required for cpsr_nzcv
...
The guest program often accesses the NZCV flags directly much less
often than we need to use them for jumps and other such uses.
Therefore, we store our flags in cpsr_nzcv in a x64-friendly format.
This allows for a reduction in conditional jump related code.
2020-05-06 22:38:06 +01:00
Fernando Sahmkow
d7abae1e31
A64: Implement Exceptional Exit.
2020-05-03 01:40:37 +01:00
Fernando Sahmkow
41521ed856
User Config: Add option to specify wall clock CNTPCT.
2020-05-03 01:40:37 +01:00
Fernando Sahmkow
97b9d3e058
Exclusive Monitor: Rework exclusive monitor interface.
2020-05-03 01:40:37 +01:00
Fernando Sahmkow
b5d8b24a3c
Exclusive Monitor: Allow clearing a single processor.
2020-05-03 01:40:36 +01:00
Fernando Sahmkow
2068658a82
A64 Interface: Allow changing processor id.
...
This commit allows the JIT to be used per guest thread and change it's
core when the thread is migrated.
2020-05-03 01:40:36 +01:00
MerryMage
24229ab899
constant_propagation_pass: Don't fold add if we nee flags
...
Results in incorrect flags
2020-04-29 15:33:12 +01:00
MerryMage
94d0d33e02
Fix single stepping for certain instructions
...
Several issues:
1. Several terminal instructions did not stop at the end of a single-step block
2. x64 backend for the A32 frontend sometimes polluted upper_location_descriptor with the single-stepping flag
We also introduce the enable_optimizations parameter to the A32 frontend.
2020-04-24 11:44:38 +01:00
MerryMage
69061d87fa
exception_handler_windows: Ignore irrelevant exceptions
2020-04-23 20:58:24 +01:00
MerryMage
5c0bb5cc63
Remove unreachable code (MSVC warnings)
2020-04-23 16:36:34 +01:00
MerryMage
a8a712c801
Relicense to 0BSD
2020-04-23 15:45:57 +01:00
MerryMage
f59b9fb020
IR: Add ReplicateBit microinstruction
2020-04-22 21:07:09 +01:00
MerryMage
0c51313479
A64: Add enable_optimizations configuration option
...
Allow library users to disable optimizations for debugging reasons.
2020-04-22 21:06:18 +01:00
MerryMage
8bef1afb9a
emit_x64_floating_point: SSE2 implementation for DenormalsAreZero
2020-04-22 21:06:18 +01:00
MerryMage
cd1560c664
emit_x64: Do not clear fast_dispatch_table unnecessarily
...
Reduces invalidation overhead
2020-04-22 21:06:18 +01:00
MerryMage
35402a9a17
a64_emit_x64: Fix location descriptor generation in GenTerminalHandlers
2020-04-22 21:06:18 +01:00
MerryMage
2770115757
emit_x64_data_processing: EmitMaskedShift: Use appropriately sized immediates
2020-04-22 21:06:18 +01:00
MerryMage
cc012a830c
exception_handler_windows: Do not attempt to call cb when cb isn't callable
2020-04-22 21:06:18 +01:00
MerryMage
4e83e81e58
backend/x64: Add fastmem support to Windows exception handler
2020-04-22 21:06:18 +01:00
MerryMage
b7b71d65c2
backend/x64: Add POSIX exception handler with fastmem support
2020-04-22 21:06:18 +01:00
MerryMage
2d348d2d68
backend/x64: Add macOS exception handler with fastmem support
2020-04-22 21:06:18 +01:00
MerryMage
4636055646
a32_emit_x64: Implement fastmem
2020-04-22 21:06:17 +01:00
MerryMage
f9b9081d4c
a32_emit_x64: Fully wrapped memory fallbacks
...
In the same style as the A64 backend
2020-04-22 21:06:17 +01:00
MerryMage
ad52c997f4
a32_emit_x64: Use r14 for page_table pointer
2020-04-22 21:06:17 +01:00
MerryMage
49fcfe040c
reg_alloc: Explicitly specify GPR and XMM order
...
This allows each backend to modify what registers they want to use and their preferred orderings
2020-04-22 21:06:17 +01:00
MerryMage
c232ad7971
a32_emit_x64: Make {Read,Write}Memory member functions of A32EmitX64
2020-04-22 21:06:17 +01:00
MerryMage
5267dbb8cf
emit_x64_saturation: Prefer changeBit to setBit
2020-04-22 21:06:17 +01:00
MerryMage
9d60d92692
backend/x64: Make ExceptionHandler its own class
2020-04-22 21:06:17 +01:00
MerryMage
325808949f
backend/x64: Rename namespace BackendX64 -> Backend::X64
2020-04-22 21:06:17 +01:00
MerryMage
f569d7913c
block_of_code: Reduce jmps in dispatcher loop
2020-04-22 21:06:17 +01:00
MerryMage
7e0c415473
block_of_code: Always specify codeptr to run from
2020-04-22 21:06:17 +01:00
MerryMage
b6536115ef
A32: Add Step
2020-04-22 21:06:17 +01:00
MerryMage
f69c77391e
A64: Add Step
...
Allow for stepping instruction-by-instruction
2020-04-22 21:06:17 +01:00
MerryMage
09d3c77d74
IR: Add masked shift IR instructions
...
Also use these in the A64 frontend to avoid the need to mask the shift amount.
2020-04-22 21:06:17 +01:00
MerryMage
bd88286b21
cast_util: Add FptrCast
...
Reduce unnecessary type duplication when casting a lambda to a function pointer.
2020-04-22 21:06:17 +01:00
MerryMage
fe583aa076
lut_from_list: Reduce number of required template arguments
2020-04-22 21:06:17 +01:00
MerryMage
81fcb4e537
mp: Migrate to shared version of mp library
2020-04-22 21:06:17 +01:00
MerryMage
25e27282e3
a64_emit_x64: Reduce patchpoint sizes
2020-04-22 21:04:23 +01:00
MerryMage
a59c335b05
A64: Add options for detecting misaligned loads and stores
2020-04-22 21:04:23 +01:00
Marshall Mohror
1ebc1895ee
A32/x64: Create a global_offset optimization for the page table ( #507 )
...
Instead of looking up the page table like:
table[addr >> 12][addr & 0xFFF]
We can use a global offset on the table to query the memory like:
table[addr >> 12][addr]
This saves two instructions on *every* memory access within the recompiler.
Original change by degasus in A64 emitter
2020-04-22 21:04:23 +01:00
Markus Wick
93668c24be
A64/x64: Create a global_offset optimization for the page table.
...
Instead of looking up the page table like:
table[addr >> 12][addr & 0xFFF]
We can use a global offset on the table to query the memory like:
table[addr >> 12][addr]
This saves two instructions on *every* memory access within the recompiler.
Thanks at skmp for the idea.
2020-04-22 21:04:23 +01:00
MerryMage
6325ac23eb
a32_emit_x64: Use std::get_if in EmitA32Coproc*
2020-04-22 21:04:23 +01:00
MerryMage
ada66d7092
a32_interface: Remove unused TransferJitState function
2020-04-22 21:04:23 +01:00
MerryMage
b4884a51e0
a32_jitstate: Only transfer required state
...
Importantly, reset exclusive state upon transfer.
2020-04-22 21:04:23 +01:00
MerryMage
c7d20f3f2f
fuzz_arm: Test MSR and MRS instructions against unicorn
...
* Add always_little_endian option to mach unicorn behavior.
* Correct CPSR.Mode = Usermode
2020-04-22 21:04:23 +01:00
MerryMage
2f06ef5d4e
a32_emit_x64: EmitA32SetCpsr: BUGFIX: Actually set CPSR.GE
...
Was unintentionally masking the writing of CPSR.GE due to 32-bit immediate sign extension.
2020-04-22 21:04:23 +01:00
MerryMage
0a6f822d76
a32_emit_x64: GenTerminalHandlers: Remove unnecessary mov
2020-04-22 21:04:23 +01:00
MerryMage
396116ee61
A32: Add hook_hint_instructions option
2020-04-22 21:04:23 +01:00
MerryMage
2f2a859615
a32_jitstate: Consolidate upper bits of location descriptor into upper_location_descriptor
...
Also solves a performance regression initially introduced by b6e8297e369f2dc4758bafe944e51efb8d1a2552,
primarily due to excessively mismatched load/store sizes causing less than optimal load-to-store forwarding.
2020-04-22 21:04:23 +01:00
Merry
1c97edac77
Merge pull request #503 from lioncash/cmp
...
A64: Implement half-precision variants of FCMEQ
2020-04-22 21:04:22 +01:00
Merry
f252a62c1b
Merge pull request #502 from lioncash/header
...
General: Remove unnecessary includes
2020-04-22 21:04:22 +01:00
Lioncash
22bd95902d
backend/x64/reg_alloc: Apply const where applicable
...
Also tidies up bracing where applicable along the way.
2020-04-22 21:04:22 +01:00
Lioncash
349d4b577a
General: Remove unnecessary includes
...
Removes unnecessary header dependencies that have accumulated over time
as changes have been made. Lessens the amount of files that need to be
rebuilt when the headers change.
2020-04-22 21:04:22 +01:00
Lioncash
43fd2b400a
frontend/ir_emitter: Add half-precision opcode for FPVectorEquals
2020-04-22 21:04:22 +01:00
Lioncash
cba9351b82
backend/x64/emit_*: Apply const where applicable
2020-04-22 21:04:22 +01:00