Commit graph

2264 commits

Author SHA1 Message Date
Lioncash
4d6f68525d A32: Implement ASIMD VADD (integer) 2020-06-19 11:31:38 +01:00
Lioncash
fbdae61c13 A32: Implement ASIMD VMVN (register)
Fairly straightforward
2020-06-19 11:31:14 +01:00
MerryMage
b759773b3b a32_emit_x64: EmitVAddrLookup: Use 64-bit registers where required 2020-06-19 00:44:52 +01:00
merry
687c604197
Merge pull request #532 from lioncash/shift
A32: Implement several ASIMD shift instructions
2020-06-19 00:22:18 +01:00
MerryMage
7dd9901de2 a32_emit_x64: Incorrect type in ExclusiveWriteMemory 2020-06-19 00:19:46 +01:00
Lioncash
00b2f9b319 asimd: Prevent misdecodes from occurring
Pointed out by Mary when reviewing the shift code.
2020-06-18 15:04:48 -04:00
MerryMage
87f6e412d0 emit_x64_vector: SSE4.1 implementation of EmitVectorPolynomialMultiply{Long}8 2020-06-18 18:44:00 +01:00
MerryMage
f5b41aabc6 emit_x64_vector: Implement EmitVectorPolynomialMultiplyLong64 in terms of pclmulqdq 2020-06-18 18:04:23 +01:00
MerryMage
d34763242c Revert "A32: Implement ASIMD VCEQ, VCGE, VCGT, VCLE, VCLT with zero"
This reverts commit 179951b10f.

These instructions require StandardFPSCRValue.
2020-06-18 17:38:40 +01:00
Lioncash
179951b10f A32: Implement ASIMD VCEQ, VCGE, VCGT, VCLE, VCLT with zero
Fairly self-explanatory, we can leverage the existing IR functions for
the purpose of these instructions.

In the integer case, we can just insert function pointers
into an array and index it, given all comparison primitives exist
already for the integer side of things.
2020-06-18 17:01:57 +01:00
Lioncash
6ca20c2fe3 A32: Implement ASIMD VSLI 2020-06-18 11:51:08 -04:00
Lioncash
887732d8a8 A32: Implement ASIMD VSRI 2020-06-18 11:28:12 -04:00
Lioncash
8b98c91ecc A32: Implement ASIMD VSHL 2020-06-18 11:18:33 -04:00
Lioncash
69c999bc66 A32: Implement ASIMD VRSRA
Now that we have the accumulation and rounding code in place, VRSRA is
extremely trivial to implement.
2020-06-18 11:03:39 -04:00
Lioncash
14fdd15199 A32: Implement ASIMD VRSHR 2020-06-18 11:00:45 -04:00
Lioncash
276e0b71dc A32: Implement ASIMD VSRA 2020-06-18 11:00:27 -04:00
Lioncash
054dff7cd5 A32: Implement ASIMD VTST 2020-06-18 15:34:05 +01:00
Lioncash
6c142bc5cc A32: Implement ASIMD VSHR 2020-06-18 10:30:20 -04:00
MerryMage
13367a7efd A64: Match A32 page_table code
Here we increase the similarity between the A64 and A32 front-ends in terms of their
page_table handling code. In this commit, we:

* Reserve and use r14 as a register to store the page_table pointer.
* Align the code to be more similar in structure.
* Add a conf member to A32EmitContext.
* Remove scratch argument from EmitVAddrLookup.
2020-06-18 12:22:59 +01:00
Lioncash
08350d06f1 A32: Implement ASIMD VQNEG 2020-06-18 09:49:29 +01:00
Lioncash
f6b665f5a4 A32: Implement ASIMD VQABS 2020-06-18 09:49:29 +01:00
MerryMage
b88c291f81 A32: Detect misaligned memory accesses
This avoids issues with misaligned memory accesses writing into the next page.
2020-06-17 17:51:37 +01:00
MerryMage
9f3277540a Merge A32 and A64 exclusive monitors 2020-06-17 10:33:09 +01:00
Lioncash
4b371c0445 A32: Implement ASIMD VREV{16, 32, 64} 2020-06-17 10:21:59 +01:00
Lioncash
6dd2c94095 A32: Implement ASIMD VABS
Very similar to VNEG in that the only thing that differs is the function
called.
2020-06-16 22:42:18 +01:00
MerryMage
53422bec46 a64_emit_x64: Reduce code duplication in exclusive memory code 2020-06-16 18:16:33 +01:00
MerryMage
a1c9bb94a8 A32: Add yuzu-specific hacks 2020-06-16 17:54:21 +01:00
MerryMage
2c1a4843ad A32 global exlcusive monitor 2020-06-16 17:54:21 +01:00
MerryMage
58abdcce5b backend/x64/a32_*: Rename config to conf
Standardize on one name for both a32_* and a64_*.
2020-06-16 14:56:44 +01:00
MerryMage
7ea521b8bf a32_emit_x64: Change ExclusiveWriteMemory64 to require a single U64 argument 2020-06-16 13:32:50 +01:00
MerryMage
aa341b7eea a32_emit_x64: Make ExclusiveWrite a member function of A32EmitX64 2020-06-16 13:03:17 +01:00
MerryMage
34ef5142e3 a32_emit_x64: Specify callback as template argument
Removes unnecessary switch statement.
2020-06-16 10:23:51 +01:00
MerryMage
58b2c83944 a32_emit_x64: Reduce mov code duplication in {Read,Write}Memory 2020-06-16 10:14:06 +01:00
Lioncash
aabd0d824d A32: Add immediate creation helper
Provides the same helper function that exists within the A64 frontend
for creating immediate values.
2020-06-16 09:54:28 +01:00
Lioncash
93ed3441b7 A32: Implement ASIMD VCLS/VCLZ/VCNT 2020-06-16 09:54:28 +01:00
Lioncash
15b3de95e4 A32: Implement VNEG 2020-06-16 01:53:21 +01:00
MerryMage
2796a85096 interface/a32: Remove descriptor argument from Disassemble 2020-06-12 15:27:42 +01:00
MerryMage
3ccc415c52 emit_x64_saturation: Improve codegen for saturated result in EmitSignedSaturation 2020-06-12 15:24:37 +01:00
MerryMage
e953f67201 emit_x64_packed: PackedAbsDiffSumS8: Fix case when bits above the lower 32 bits are not zero 2020-06-12 15:24:09 +01:00
MerryMage
c4cf0b3e47 exception_handler_posix: Just disable fastmem if initialization fails 2020-06-10 22:52:27 +01:00
MerryMage
55bddc767f backend/x64: Touch PEXT/PDEP code
* Use pext/pdep where not previously used
* Limit pext/pdep to non-AMD platforms due to slowness on AMD
* Use imul/and as alternatives for AMD and non-BMI2 platforms
2020-06-10 22:30:22 +01:00
MerryMage
f495018f53 block_of_code: Encapsulate CPU feature detection code 2020-06-09 21:25:57 +01:00
MerryMage
feddf69cb4 emit_x64_crc32: Use same constants 2020-06-06 20:46:09 +01:00
MerryMage
66a356e6cb emit_x64_crc32: Further improvements to codegen 2020-06-06 19:04:20 +01:00
MerryMage
bb203429c6 crc32: Remove unnecessary masking 2020-06-04 20:33:46 +01:00
MerryMage
bcde135c23 emit_x64_crc32: Improve 64-bit PCLMULQDQ implementation of EmitCRC32ISO
Reduce number of PCLMULQDQs to 3
2020-06-04 19:23:51 +01:00
MerryMage
0f9c70ff42 emit_x64_crc32: Improve PCLMULQDQ implementation of EmitCRC32ISO
Remove use of pshufd
2020-06-03 18:55:58 +01:00
MerryMage
fa6aee434e emit_x64_crc32: PCLMULQDQ implementation of EmitCRC32ISO 2020-06-03 11:16:53 +01:00
MerryMage
b47adaee1d emit_x64_vector: SSSE3 implementation of EmitVectorExtract 2020-06-01 15:41:36 +01:00
MerryMage
f3845cea9a A32: Implement ASIMD VQSUB instruction 2020-05-30 18:19:17 +01:00
MerryMage
16ff880f8f A32: Implement ASIMD VQADD 2020-05-30 16:09:37 +01:00
MerryMage
174fbb74c5 simd_three_same: Use VectorSaturated{Signed,Unsigned}{Add,Sub} in SaturatingArithmeticOperation 2020-05-30 15:55:32 +01:00
MerryMage
4e90754873 IR: Implement VectorSaturated{Signed,Unsigned}{Add,Sub} 2020-05-30 15:55:32 +01:00
MerryMage
3a50d444dc A32: Implement ASIMD VHSUB 2020-05-28 22:29:00 +01:00
MerryMage
205e6c5a56 A32: Implement ASIMD VRHADD 2020-05-28 22:29:00 +01:00
MerryMage
946eb03a3b A32: Implement ASIMD VHADD 2020-05-28 22:29:00 +01:00
MerryMage
f8062345bb asimd_two_regs_misc: Use {Get,Set}Vector 2020-05-28 21:05:30 +01:00
MerryMage
11cec1e3b6 asimd_three_same: Use {Get,Set}Vector 2020-05-28 21:05:16 +01:00
MerryMage
7d0b16de32 asimd_one_reg_modified_immediate: Use {Get,Set}Vector 2020-05-28 20:40:26 +01:00
MerryMage
cae857b8c8 verification_pass: Have an appropriate assertion message 2020-05-28 20:40:11 +01:00
MerryMage
ebddf6cca0 basic_block: Allow printing of invalid instruction pointers 2020-05-28 20:39:50 +01:00
MerryMage
07108246cf A32/IR: Add SetVector and GetVector 2020-05-28 20:39:19 +01:00
MerryMage
93c289b54f Use tsl::robin_map and tsl::robin_set
Replace std::unordered_map and std::unordered_set with the above.
Better performance profile.
2020-05-26 20:51:48 +01:00
Lioncash
c4a4bdd7de frontend: Relocate ExtReg handling to types.h
Same behavior, but deduplicates the code being placed across several
files
2020-05-24 23:55:47 +01:00
Lioncash
1900df5340 frontend: Relocate advanced SIMD expansion to a common file
Deduplicates code a little bit.
2020-05-24 23:55:47 +01:00
Lioncash
fc112e61f2 A32: Implement ASIMD modified immediate functions
Implements VBIC, VMOV, VMVN, and VORR modified immediate instructions.
2020-05-24 23:55:47 +01:00
Lioncash
659d78c9c4 A32: Implement ASIMD VSWP
A trivial one to implement, this just swaps the contents of two
registers in place.
2020-05-22 19:43:24 +01:00
MerryMage
d0075f4ea6 print_info: Use LLVM to disassemble A32 2020-05-17 22:30:46 +01:00
MerryMage
c59a127e86 opcodes: Switch from std::map to std::array
Optimization.
2020-05-17 17:01:39 +01:00
MerryMage
d0b45f6150 A32: Implement ARMv8 VST{1-4} (multiple) 2020-05-17 17:01:39 +01:00
Lioncash
eb332b3836 asimd_three_same: Unify BitwiseInstructionWithDst with BitwiseInstruction
Now that all bitwise instructions are implemented, we can unify all of
them together using if constexpr.
2020-05-16 20:22:12 +01:00
Lioncash
f42b3ad4a0 A32: Implement ASIMD VBIF (register) 2020-05-16 20:22:12 +01:00
Lioncash
ee9a81dcba A32: Implement ASIMD VBIT (register) 2020-05-16 20:22:12 +01:00
Lioncash
d624059ead A32: Implement ASIMD VBSL (register) 2020-05-16 20:22:12 +01:00
Lioncash
66663cf8e7 asimd_three_same: Collapse all bitwise implementations into a single code path
Less code and results in only writing the parts that matter once.
2020-05-16 20:22:12 +01:00
Lioncash
4b5e3437cf A32: Implement ASIMD VEOR (register) 2020-05-16 20:22:12 +01:00
Lioncash
67b284f6fa A32: Implement ASIMD VORN (register) 2020-05-16 20:22:12 +01:00
Lioncash
1fdd90ca2a A32: Implement ASIMD VORR (register) 2020-05-16 20:22:12 +01:00
Lioncash
9b93a9de46 a32_jitstate: Remove obsoleted debug assert 2020-05-16 20:22:12 +01:00
Lioncash
64fa804dd4 A32: Implement ASIMD VBIC (register) 2020-05-16 20:22:12 +01:00
Lioncash
0441ab81a1 A32: Implement ASIMD VAND (register) 2020-05-16 20:22:12 +01:00
Lioncash
1b25e867ae asimd_load_store_structures: Simplify ToExtRegD()
ExtReg has a supplied operator+, so we can make use of that instead.
2020-05-16 11:27:22 -04:00
MerryMage
2169653c50 a64_emit_x64: Invalid regalloc code for EmitA64ExclusiveReadMemory128
Attempted to allocate args[0] after end of allocation scope
2020-05-16 14:11:23 +01:00
MerryMage
1a0bc5ba91 A32/ASIMD: ARMv8: Implement VLD{1-4} (multiple) 2020-05-16 14:11:23 +01:00
MerryMage
e7f1a0d408 A32: ARMv8: Implement LDA{,EX}{,B,D,H} and STL{,EX}{,B,D,H} 2020-05-15 21:07:36 +01:00
Lioncash
af3b65b135 decoder_detail: Mark GetMaskAndExpect() as constexpr
Elides quite a bit of code at runtime when constructing the decoding
tables.
2020-05-11 08:29:06 +01:00
MerryMage
59db2c191a VFPv3: Implement VMOV (immediate) 2020-05-10 15:09:37 +01:00
MerryMage
3c86d58064 VFPv4: Implement VCVTB, VCVTT 2020-05-10 14:45:18 +01:00
MerryMage
010fab9a0e VFPv4: Implement VFMA, VFMS 2020-05-10 14:20:11 +01:00
MerryMage
8e97b10acb VFPv4: Implement VFNMS, VFNMA 2020-05-10 14:14:03 +01:00
MerryMage
6df660c889 fuzz_arm: Ensure all instructions are fuzzed
* VFP instructions were not getting fuzzed due to matching coprocessor instructions (as invalid instructions)
* Fix VPOP writeback for doubles when (imm8 & 1) == 1
* Do not accidentally fuzz unimplemented unconditional instructions
2020-05-10 13:57:39 +01:00
MerryMage
9a38c7324f A32: Add decoders for remaining v7 instructions 2020-05-10 10:50:34 +01:00
MerryMage
8b3bc92bce backend/x64: Reduce conversions required for cpsr_nzcv
The guest program often accesses the NZCV flags directly much less
often than we need to use them for jumps and other such uses.

Therefore, we store our flags in cpsr_nzcv in a x64-friendly format.

This allows for a reduction in conditional jump related code.
2020-05-06 22:38:06 +01:00
Fernando Sahmkow
d7abae1e31 A64: Implement Exceptional Exit. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
41521ed856 User Config: Add option to specify wall clock CNTPCT. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
97b9d3e058 Exclusive Monitor: Rework exclusive monitor interface. 2020-05-03 01:40:37 +01:00
Fernando Sahmkow
b5d8b24a3c Exclusive Monitor: Allow clearing a single processor. 2020-05-03 01:40:36 +01:00
Fernando Sahmkow
2068658a82 A64 Interface: Allow changing processor id.
This commit allows the JIT to be used per guest thread and change it's
core when the thread is migrated.
2020-05-03 01:40:36 +01:00
MerryMage
24229ab899 constant_propagation_pass: Don't fold add if we nee flags
Results in incorrect flags
2020-04-29 15:33:12 +01:00
MerryMage
e7166e8ba7 constant_propagation_pass: Fold add and sub 2020-04-29 14:16:17 +01:00
MerryMage
dca983803a translate_arm: ConditionPassed: Some instructions emit no microinstructions 2020-04-24 13:12:13 +01:00
MerryMage
94d0d33e02 Fix single stepping for certain instructions
Several issues:
1. Several terminal instructions did not stop at the end of a single-step block
2. x64 backend for the A32 frontend sometimes polluted upper_location_descriptor with the single-stepping flag

We also introduce the enable_optimizations parameter to the A32 frontend.
2020-04-24 11:44:38 +01:00
MerryMage
69061d87fa exception_handler_windows: Ignore irrelevant exceptions 2020-04-23 20:58:24 +01:00
MerryMage
5c0bb5cc63 Remove unreachable code (MSVC warnings) 2020-04-23 16:36:34 +01:00
MerryMage
a8a712c801 Relicense to 0BSD 2020-04-23 15:45:57 +01:00
MerryMage
d51a83d265 constant_propagation_pass: Fold IsZero 2020-04-22 21:07:09 +01:00
MerryMage
df1a0eecaf constant_propagation_pass: Fold shifts 2020-04-22 21:07:09 +01:00
MerryMage
7242388577 A64: Specialize arithmetic shift SBFM aliases 2020-04-22 21:07:09 +01:00
MerryMage
a13392e432 A64: Specialize sign-extension SBFM aliases 2020-04-22 21:07:09 +01:00
MerryMage
4573511fe3 constant_propagation_pass: Prepare for IR matchers 2020-04-22 21:07:09 +01:00
MerryMage
0d7476d3ec constant_propagation_pass: Propagate constants across commutative operations
e.g. (a & b) & c == a & (b & c) where b and c are constants
2020-04-22 21:07:09 +01:00
MerryMage
f59b9fb020 IR: Add ReplicateBit microinstruction 2020-04-22 21:07:09 +01:00
MerryMage
93adcfa5c6 value: Add GetInstRecursive 2020-04-22 21:06:18 +01:00
MerryMage
996d5cb841 ir_opt: Add IdentityRemovalPass 2020-04-22 21:06:18 +01:00
MerryMage
2ae68b13ed value: Add IsIdentity function 2020-04-22 21:06:18 +01:00
MerryMage
8db4d65587 A64/decoder: Use a lookup table instead of doing a linear scan 2020-04-22 21:06:18 +01:00
MerryMage
0c51313479 A64: Add enable_optimizations configuration option
Allow library users to disable optimizations for debugging reasons.
2020-04-22 21:06:18 +01:00
MerryMage
8bef1afb9a emit_x64_floating_point: SSE2 implementation for DenormalsAreZero 2020-04-22 21:06:18 +01:00
MerryMage
7c917f1c12 CMakeLists: Add DYNARMIC_FRONTENDS option
Allows library user to select which frontends to enable
2020-04-22 21:06:18 +01:00
MerryMage
668a43f815 A32: Detect unpredictable LDM/STM instructions 2020-04-22 21:06:18 +01:00
MerryMage
cd1560c664 emit_x64: Do not clear fast_dispatch_table unnecessarily
Reduces invalidation overhead
2020-04-22 21:06:18 +01:00
MerryMage
35402a9a17 a64_emit_x64: Fix location descriptor generation in GenTerminalHandlers 2020-04-22 21:06:18 +01:00
MerryMage
2770115757 emit_x64_data_processing: EmitMaskedShift: Use appropriately sized immediates 2020-04-22 21:06:18 +01:00
MerryMage
cc012a830c exception_handler_windows: Do not attempt to call cb when cb isn't callable 2020-04-22 21:06:18 +01:00
MerryMage
4e83e81e58 backend/x64: Add fastmem support to Windows exception handler 2020-04-22 21:06:18 +01:00
MerryMage
b7b71d65c2 backend/x64: Add POSIX exception handler with fastmem support 2020-04-22 21:06:18 +01:00
MerryMage
2d348d2d68 backend/x64: Add macOS exception handler with fastmem support 2020-04-22 21:06:18 +01:00
MerryMage
4636055646 a32_emit_x64: Implement fastmem 2020-04-22 21:06:17 +01:00
MerryMage
f9b9081d4c a32_emit_x64: Fully wrapped memory fallbacks
In the same style as the A64 backend
2020-04-22 21:06:17 +01:00
MerryMage
ad52c997f4 a32_emit_x64: Use r14 for page_table pointer 2020-04-22 21:06:17 +01:00
MerryMage
49fcfe040c reg_alloc: Explicitly specify GPR and XMM order
This allows each backend to modify what registers they want to use and their preferred orderings
2020-04-22 21:06:17 +01:00
MerryMage
c232ad7971 a32_emit_x64: Make {Read,Write}Memory member functions of A32EmitX64 2020-04-22 21:06:17 +01:00
MerryMage
5267dbb8cf emit_x64_saturation: Prefer changeBit to setBit 2020-04-22 21:06:17 +01:00
MerryMage
9d60d92692 backend/x64: Make ExceptionHandler its own class 2020-04-22 21:06:17 +01:00
MerryMage
325808949f backend/x64: Rename namespace BackendX64 -> Backend::X64 2020-04-22 21:06:17 +01:00
MerryMage
f569d7913c block_of_code: Reduce jmps in dispatcher loop 2020-04-22 21:06:17 +01:00
MerryMage
7e0c415473 block_of_code: Always specify codeptr to run from 2020-04-22 21:06:17 +01:00
MerryMage
b6536115ef A32: Add Step 2020-04-22 21:06:17 +01:00
MerryMage
f69c77391e A64: Add Step
Allow for stepping instruction-by-instruction
2020-04-22 21:06:17 +01:00
MerryMage
09d3c77d74 IR: Add masked shift IR instructions
Also use these in the A64 frontend to avoid the need to mask the shift amount.
2020-04-22 21:06:17 +01:00
MerryMage
bd88286b21 cast_util: Add FptrCast
Reduce unnecessary type duplication when casting a lambda to a function pointer.
2020-04-22 21:06:17 +01:00
MerryMage
fe583aa076 lut_from_list: Reduce number of required template arguments 2020-04-22 21:06:17 +01:00
MerryMage
81fcb4e537 mp: Migrate to shared version of mp library 2020-04-22 21:06:17 +01:00
MerryMage
28e5af20b5 mp/function_info: Add parameter_count_v 2020-04-22 21:04:24 +01:00
MerryMage
aa225a7dc4 bit_util: Add CountLeadingZeros 2020-04-22 21:04:24 +01:00
MerryMage
25e27282e3 a64_emit_x64: Reduce patchpoint sizes 2020-04-22 21:04:23 +01:00
MerryMage
a59c335b05 A64: Add options for detecting misaligned loads and stores 2020-04-22 21:04:23 +01:00
Marshall Mohror
1ebc1895ee A32/x64: Create a global_offset optimization for the page table (#507)
Instead of looking up the page table like:
  table[addr >> 12][addr & 0xFFF]
We can use a global offset on the table to query the memory like:
  table[addr >> 12][addr]

This saves two instructions on *every* memory access within the recompiler.

Original change by degasus in A64 emitter
2020-04-22 21:04:23 +01:00
MerryMage
e10985d179 ir/basic_block: Add FastDispatchHint to TerminalToString
Use a boost::static_visitor to ensure this is caught at compile-time in the future.
2020-04-22 21:04:23 +01:00
Lioncash
af3614553b A64/impl: Move AccType and MemOp enum into general IR emitter header
These will be used by both frontends in the future, so this performs the
migratory changes separate from the changes that will make use of them.
2020-04-22 21:04:23 +01:00
Markus Wick
93668c24be A64/x64: Create a global_offset optimization for the page table.
Instead of looking up the page table like:
  table[addr >> 12][addr & 0xFFF]
We can use a global offset on the table to query the memory like:
  table[addr >> 12][addr]

This saves two instructions on *every* memory access within the recompiler.

Thanks at skmp for the idea.
2020-04-22 21:04:23 +01:00
MerryMage
6325ac23eb a32_emit_x64: Use std::get_if in EmitA32Coproc* 2020-04-22 21:04:23 +01:00
MerryMage
39bd2c034d constant_propagation_pass: Handle GetCarryFromOp for MostSignificantWord 2020-04-22 21:04:23 +01:00
MerryMage
ada66d7092 a32_interface: Remove unused TransferJitState function 2020-04-22 21:04:23 +01:00
MerryMage
b4884a51e0 a32_jitstate: Only transfer required state
Importantly, reset exclusive state upon transfer.
2020-04-22 21:04:23 +01:00
MerryMage
1aa7b62e92 A32/Thumb: Correct behaviour for UDF and Unpredictable instructions
Raise an exception instead of calling the interpreter and ASSERT-ing respectively.
2020-04-22 21:04:23 +01:00
MerryMage
c7d20f3f2f fuzz_arm: Test MSR and MRS instructions against unicorn
* Add always_little_endian option to mach unicorn behavior.
* Correct CPSR.Mode = Usermode
2020-04-22 21:04:23 +01:00
MerryMage
2f06ef5d4e a32_emit_x64: EmitA32SetCpsr: BUGFIX: Actually set CPSR.GE
Was unintentionally masking the writing of CPSR.GE due to 32-bit immediate sign extension.
2020-04-22 21:04:23 +01:00
MerryMage
0a6f822d76 a32_emit_x64: GenTerminalHandlers: Remove unnecessary mov 2020-04-22 21:04:23 +01:00
MerryMage
717bd2fbb2 A64: Add hook_hint_instructions option 2020-04-22 21:04:23 +01:00
MerryMage
396116ee61 A32: Add hook_hint_instructions option 2020-04-22 21:04:23 +01:00
MerryMage
2f2a859615 a32_jitstate: Consolidate upper bits of location descriptor into upper_location_descriptor
Also solves a performance regression initially introduced by b6e8297e369f2dc4758bafe944e51efb8d1a2552,
primarily due to excessively mismatched load/store sizes causing less than optimal load-to-store forwarding.
2020-04-22 21:04:23 +01:00
MerryMage
e41a7dc678 CMakeLists: Temporarily remove export
Unable to export fmt in projects that have DYNARMIC_NO_BUNDLED_FMT enabled
2020-04-22 21:04:22 +01:00
Merry
1c97edac77 Merge pull request #503 from lioncash/cmp
A64: Implement half-precision variants of FCMEQ
2020-04-22 21:04:22 +01:00
Merry
f252a62c1b Merge pull request #502 from lioncash/header
General: Remove unnecessary includes
2020-04-22 21:04:22 +01:00
Lioncash
11d1114a17 A64: Implement all half-precision variants of FCMEQ 2020-04-22 21:04:22 +01:00
Lioncash
22bd95902d backend/x64/reg_alloc: Apply const where applicable
Also tidies up bracing where applicable along the way.
2020-04-22 21:04:22 +01:00
Lioncash
349d4b577a General: Remove unnecessary includes
Removes unnecessary header dependencies that have accumulated over time
as changes have been made. Lessens the amount of files that need to be
rebuilt when the headers change.
2020-04-22 21:04:22 +01:00
Lioncash
43fd2b400a frontend/ir_emitter: Add half-precision opcode for FPVectorEquals 2020-04-22 21:04:22 +01:00
Lioncash
cba9351b82 backend/x64/emit_*: Apply const where applicable 2020-04-22 21:04:22 +01:00
Lioncash
557a01a787 common/fp/op: Add soft-float implementation of FPCompareEQ
This will be used to implement the half-precision floating-point
variants of FCMEQ in following changes.
2020-04-22 21:04:22 +01:00
Lioncash
dd315e89eb A64/translate/*: Apply const where applicable
Just some tidying up for consistency
2020-04-22 21:04:22 +01:00
Lioncash
d9d59bc1f4 common/cast_util: Declare BitCast and BitCastPointee with the noexcept specifier
std::bit_cast is also defined with the noexcept specifier, so we can do
the same here to match up with it and stay similar with the standard
library.
2020-04-22 21:04:22 +01:00
Lioncash
4f47861669 A64/translate/impl: Mark DecodeBitMasks and AdvSIMDExpandImm as static
These don't rely on instance state to perform their behavior. They're
just helper functions.
2020-04-22 21:04:22 +01:00
Lioncash
dddba94c17 disassembler_arm: Apply const where applicable 2020-04-22 21:04:22 +01:00
Lioncash
9365487797 frontend/A32/ir_emitter: Remove unnecessary includes
std::initializer_list isn't used anywhere in here, and we can just
forward declare the CoprocReg enum to avoid needing to include the
header.
2020-04-22 21:04:22 +01:00
Lioncash
23f56bdb67 x64/exception_handler_windows: Join namespace declaration
Uses a nested namespace declaration like the rest of the codebase.
2020-04-22 21:04:22 +01:00
Lioncash
3bbb06c34a a64_emit_x64: Apply [[maybe_unused]] to unused lambda parameter
This can result in an unused variable warning on Windows otherwise.
2020-04-22 21:04:22 +01:00
Lioncash
bfa8035414 A32/A64: Make public header inclusions consistent
For all public header inclusions, we use the <> form of including them
as opposed to "", which we typically use for internal headers.
2020-04-22 21:04:22 +01:00
Lioncash
ccf923305c a32_interface: Remove duplicated documentation comments
These already exist in the header, so these ones can be removed.
2020-04-22 21:04:22 +01:00
Lioncash
fb7d33830c A32: Make includes consistent
Normalizes includes to be relative to the project root, like the rest of
the includes in the project.
2020-04-22 21:04:22 +01:00
Lioncash
b57ed8917a frontend/A32/types: Remove redundant std::string initializer
std::string initializes to empty by default. While we're at it, brace a
lone unbraced if statement.
2020-04-22 21:04:22 +01:00
Lioncash
25b4e463d3 ir_opt/a64_get_set_elimination_pass: Remove redundant return
This lambda function has a void return type, so we don't need to
explicitly return at the end of it.
2020-04-22 21:04:22 +01:00
Lioncash
182ceb2807 General: Make parameter names from declarations and implementations consistent
Most of the time when this occurs, it's a bug. Thankfully this isn't the
case. However, we can resolve these cases to make the codebase more
consistent.
2020-04-22 21:04:22 +01:00
MerryMage
3513ed1c60 CMakeLists: Define FMT_USE_USER_DEFINED_LITERALS=0
This disable a fmtlib feature that depends on a non-standard feature
for its implementation.
2020-04-22 21:04:22 +01:00
Lioncash
b301fcd520 A32/translate/translate: Add missing doxygen parameter string 2020-04-22 21:04:22 +01:00
Lioncash
44b61212e5 Revert "CMakeLists: Handle DYNARMIC_NO_BUNDLED_FMT in relation to export()"
I was being silly. This isn't required.

This reverts commit 00b79cbb72c61744470e0aa1a96b673702b33931.
2020-04-22 21:04:22 +01:00
Lioncash
6b9bf7868a General: Correct typos is code comments 2020-04-22 21:04:22 +01:00
Lioncash
acd7ac5ed3 CMakeLists: Handle DYNARMIC_NO_BUNDLED_FMT in relation to export()
This is pretty gross, but until DYNARMIC_NO_BUNDLED_FMT is eliminated,
this fixes the use of it in existing libraries or applications making
use of dynarmic.
2020-04-22 21:04:22 +01:00
Lioncash
6187de7ca7 a32_interface: std::move UserConfig where applicable
UserConfig instances contain up to 16 std::shared_ptr<Coprocessor>
instances. We can std::move here to avoid performing 16 redundant atomic
reference increment and decrement operations.

Mostly inconsequential on x64, but we may as well signify intent.
2020-04-22 21:04:22 +01:00
MerryMage
7d20f3b861 A32/translate_thumb: Split off implementation into thumb16 and thumb32 2020-04-22 21:04:22 +01:00
Lioncash
b79ce71b0f ir/basic_block: std::move Terminal within SetTerminal and ReplaceTerminal
A terminal isn't a trivial type (and boost::variant is allowed to heap
allocate), so we can std::move it here to avoid a redundant copy.
2020-04-22 21:04:22 +01:00
MerryMage
e639aa1583 A32/translate: Rename translate_arm directory to impl
Mirror what the A64 frontend does.
2020-04-22 21:04:22 +01:00
Lioncash
63eff4e7cc ir/terminal: std::move constructor parameters where applicable
Allows the compiler to choose the most suitable code in this scenario,
given a Terminal isn't a trivial type.
2020-04-22 21:04:22 +01:00
Lioncash
b13b6610b5 a32_interface: Default destructor in the cpp file
Makes it more consistent with code throughout the codebase.
2020-04-22 21:04:22 +01:00
MerryMage
5f8eb7c51c A32/location_descriptor: Add CPSR.IT to A32::LocationDescriptor 2020-04-22 21:04:22 +01:00
MerryMage
13f65f55eb PSR: Use Common::ModifyBit{,s} 2020-04-22 21:04:22 +01:00
MerryMage
74633301c1 A32: Add ITState 2020-04-22 21:04:22 +01:00
MerryMage
6e2cd35e4f a32_jitstate: Optimize runtime location descriptor calculation
Calculation is now one unaligned 64-bit load.
2020-04-22 21:04:22 +01:00
MerryMage
0de3993373 a32_jitstate: Remove fpsr_idc
We do not really have accurate FPSR state in any case.
2020-04-22 21:04:22 +01:00
MerryMage
6f49c0ef8e {a32,a64}_jitstate: Rename CPSR_* to cpsr_* 2020-04-22 21:04:22 +01:00
MerryMage
8cd7837839 a32_jitstate: Remove old_FPSCR 2020-04-22 21:04:22 +01:00
MerryMage
b3bb544bca a32_jitstate: Rename FPSCR_nzcv to fpsr_nzcv 2020-04-22 21:04:22 +01:00
MerryMage
76f986979d a32_jitstate: Rename FPSCR_mode to fpcr_mode 2020-04-22 21:04:22 +01:00
MerryMage
49fca15f90 {a32,a64}_jitstate: Rename FPSCR_IDC to fpsr_idc 2020-04-22 21:04:21 +01:00
MerryMage
622c02f537 {a32,a64}_jitstate: Remove FPSCR_UFC 2020-04-22 21:04:21 +01:00
MerryMage
366d63f4b4 a32_jitstate: Enable SSE FTZ and DAZ 2020-04-22 21:04:21 +01:00
MerryMage
f178562ee7 a32_jitstate: Remove exception trap enables from FPSCR_MODE_MASK
We don't currently use this for anything (we do not currently trap
floating point exceptions).

This frees these bits up for other purposes.
2020-04-22 21:04:21 +01:00
Merry
fd6222f0a1 Merge pull request #500 from lioncash/cbz
A32: Implement Thumb-1's CBZ/CBNZ instructions
2020-04-22 21:04:21 +01:00
Merry
bab4e29075 Merge pull request #498 from lioncash/ahp
A32/location_descriptor: Add AHP bit to the FPSCR mask
2020-04-22 21:04:21 +01:00
Lioncash
2d695e3c7c common/fp/info: Make formatting of FPInfo struct member functions consistent
Orgranizes the functions to all be consistent with the half-precision
specialization.
2020-04-22 21:04:21 +01:00
Lioncash
05b330906e common/fp/util: Make ProcessNaN utility functions constexpr
Nothing in particular prevents these from being constexpr. Do so to make
them consistent with the bulk of other functions in this header that are
constexpr.
2020-04-22 21:04:21 +01:00
Lioncash
6b9a40bdc4 common/fp/op/FPNeg: Make FPNeg constexpr
Negation in (standard IEEE) floating-point is simply flipping the sign-bit, so this
operation will never be more complex than what is presented here, making
constexpr a reasonable allowance.
2020-04-22 21:04:21 +01:00
Lioncash
ef95e0fa7d CMakeLists: Add FPNeg.h to the library target sources
Ensures that the header shows up in IDE generated projects.
2020-04-22 21:04:21 +01:00
Lioncash
87083af733 general: Remove trailing spaces
General code-related cleanup. Gets rid of trailing spaces in the
codebase.
2020-04-22 21:04:21 +01:00
Lioncash
fdbafbc1ae x64/reg_alloc: Remove reference qualifier to variable in GetArgumentInfo()
The result of GetArg() is returned by value, so this is essentially
still a copy. While the previous code *is* valid, this communicates what
is actually happening a little more explicitly.
2020-04-22 21:04:20 +01:00
Lioncash
1f6878fb46 ir_opt/verification_pass: Add include for std::puts
Ensures that the header dependency is always satisfied directly, and not
through other project headers. While we're at it, we can qualify the
call with the std:: namespace.
2020-04-22 21:03:38 +01:00
Lioncash
fc9c59d056 ir_opt/verification_pass: Eliminate redundant GetArg()
Given the same argument is used inside the condition's body if it's
true, we can just utilize the local to cut out a GetArg() operation.
Avoids redundant internal assertion checking.
2020-04-22 21:03:35 +01:00
Lioncash
03e6899fd7 A32: Implement Thumb-1's CBZ/CBNZ instructions
Introduced in ARMv6T2, this allows for short forward branches.
2020-04-22 21:02:47 +01:00
Lioncash
d02a4e6fc9 A32/location_descriptor: Add AHP bit to the FPSCR mask
Ensures the alternate half-precision state is preserved within the
location descriptors, which will be necessary when implementing the
half-precision extensions for VFP and NEON.
2020-04-22 21:02:47 +01:00
Merry
f4990a5f6b Merge pull request #499 from lioncash/movw
A32: Implement ARM-mode MOVW
2020-04-22 21:02:47 +01:00
Lioncash
bd755ae494 frontend/ir/ir_emitter: Add A32 equivalent to A64's SetCheckBit
This will be used in a subsequent change to implement ARMv6T2's CBZ/CBNZ
Thumb-1 instructions.
2020-04-22 21:02:47 +01:00
Lioncash
c6e1fd1416 a64_emit_x64: Use const on locals where applicable
Normalizes the use of const in the source file.
2020-04-22 21:02:47 +01:00
Merry
f6f0b6da65 Merge pull request #497 from lioncash/boost
A32/coprocessor: Remove boost from public interface
2020-04-22 21:02:47 +01:00
Lioncash
106c8c2473 A32: Implement ARM-mode MOVW
Introduced to the ISA in ARMv6T2
2020-04-22 21:02:47 +01:00
Lioncash
fb437080be a32_emit_x64: Use const on locals where applicable
Normalizes the use of const in the source file.
2020-04-22 21:02:47 +01:00
Lioncash
9935f3aa28 A32: Implement Thumb-1 variant of SEVL
While we're at it, also add the Thumb-2 encoding to the encoding table
to make sure it isn't forgotten about in the future.
2020-04-22 21:02:47 +01:00
Lioncash
5f9ba970b9 emit_x64: Use const on locals where applicable 2020-04-22 21:02:47 +01:00
Lioncash
92daae9513 A32/coprocessor: Remove boost from public interface
Removes a boost header from the public includes in favor of using the
standard-provided std::variant.

The use of boost in public interfaces is often a dealbreaker for some
people. Given we use std::optional in the header already, we can
transition over to std::variant from boost::variant.

With this removal, this makes all of our dependencies internal to the
library itself.
2020-04-22 21:02:47 +01:00
Lioncash
9a097e307f A32: Implement the ARM-mode variant of SEVL 2020-04-22 21:02:47 +01:00
Lioncash
a40b921cb5 emit_x64: Remove unnecessary typename in GetBasicBlock()
This can be deduced from the name alone.
2020-04-22 21:02:47 +01:00
Lioncash
e89ca42048 A32: Implement Thumb-1 variant of YIELD 2020-04-22 21:02:47 +01:00
Lioncash
675f67e41d emit_x64_vector: Use const on locals where applicable
Normalizes the use of const in the source file.
2020-04-22 21:02:47 +01:00
Lioncash
ebab7ede55 A32: Implement Thumb-1 variant of WFI 2020-04-22 21:02:47 +01:00
Lioncash
cccbc7fd0e emit_x64_saturation: Use const on locals where applicable
Normalizes the use of const in the source file.
2020-04-22 21:02:47 +01:00
Lioncash
b4110af22a A32: Implement Thumb-1 variant of WFE 2020-04-22 21:02:47 +01:00
Lioncash
7316fa47b3 emit_x64_packed: Use const on locals where applicable
Normalizes the use of const across the source file.
2020-04-22 21:02:47 +01:00
Lioncash
3a9c2f81d0 block_of_code: Use variable template variants of type traits
Now all type traits are using the variable template variants where
applicable.
2020-04-22 21:02:47 +01:00
Lioncash
57675fe592 A32: Implement Thumb-1 variant of SEV 2020-04-22 21:02:47 +01:00
Lioncash
9b783a5527 emit_x64_data_processing: Use const on locals where applicable
Normalizes the use of const across the source file.
2020-04-22 21:02:47 +01:00
Lioncash
07699b47ba A32/translate_thumb: Add helper function for raising exceptions
Similar to the variant within the ARM-mode translator visitor. This will
be used in subsequent changes to implement the hint instructions
introduced in ARMv7.
2020-04-22 21:02:47 +01:00
Lioncash
f74762ae4e frontend/decoder/decoder_detail: Replace std::is_same, with std::is_same_v
Same thing, same readability, less characters.
2020-04-22 21:02:47 +01:00
Lioncash
64879396f6 A32: Implement Thumb-1 variant of NOP 2020-04-22 21:02:47 +01:00
Merry
6a67da1225 Merge pull request #493 from lioncash/ir
frontend/ir/ir_emitter: Remove unnecessary logical shift overloads
2020-04-22 21:02:47 +01:00
Merry
81b908b077 Merge pull request #495 from lioncash/bkpt
A32: Implement Thumb-16's variant of BKPT
2020-04-22 21:02:47 +01:00
Merry
30d28029a8 Merge pull request #492 from lioncash/vfp
A32: Rename vfp2-related files to vfp
2020-04-22 21:02:47 +01:00
Merry
c4fb7cf540 Merge pull request #494 from lioncash/pldw
A32: Handle PLDW
2020-04-22 21:02:47 +01:00
Lioncash
b17a5d3365 A32: Implement Thumb-16's variant of BKPT 2020-04-22 21:02:47 +01:00
Lioncash
b902f72001 A32/disassembler_arm: Remove <unimplemented> from hint instruction output
Given we now support hooking these hint instructions, we can consider
them implemented.
2020-04-22 21:02:47 +01:00
Lioncash
0fa0bca22a A32: Handle different variants of PLD 2020-04-22 21:02:47 +01:00
Lioncash
c6f99235e1 frontend/ir/ir_emitter: Remove unnecessary logical shift overloads
These aren't necessary anymore, now that the U32U64 overload already
exists.
2020-04-22 21:02:46 +01:00
Merry
9ba503e394 Merge pull request #491 from lioncash/hint
A32: Allow hooking of hint instructions in ARM mode.
2020-04-22 21:02:46 +01:00
Lioncash
97277c598b A32: Rename vfp2-related files to vfp
Now that we fuzz against Unicorn, we aren't just restricted to VFPv2.
VFPv3 and VFPv4 facilities can now be implemented. This renames
constructs mentioning VFPv2 to just refer to VFP.
2020-04-22 21:02:46 +01:00
Lioncash
8c3122ff46 A64/translate/impl/impl: Mark locals const where applicable in DecodeBitMasks()
Follows the convention of making immutable state explicit.
2020-04-22 21:02:46 +01:00
Merry
a132b56d57 Merge pull request #490 from lioncash/crc32
A32: Implement ARM-mode CRC32 instructions
2020-04-22 21:02:46 +01:00
Lioncash
966e04d03d A32: Allow hooking of hint instructions in ARM mode.
Mirrors the hooking functionality from the AArch64 frontend to make the
behavior of both consistent.
2020-04-22 21:02:46 +01:00
Lioncash
134b586c5c frontend/ir/ir_emitter: Amend arguments to conversion opcodes
Accidentally caused within 967d1fcc8d6f60749a162a96b997439450fed687.
That one's on me. My bad.
2020-04-22 21:02:46 +01:00
MerryMage
5debb411cc block_of_code: Explicitly delete copy constructor 2020-04-22 21:02:46 +01:00
Lioncash
e37689315d A32: Implement ARM-mode CRC32 instructions
Implements the ARM-mode variants of the CRC32 instructions introduced
within ARMv8. This is also one of the instruction cases where there is
UNPREDICTABLE behavior that is constrained (we must do one of the
options indicated by the reference manual).

In both documented cases of constrained unpredictable behavior, we treat
the instructions as unpredictable in order to allow library users to
hook the unpredictable exception to provide the intended behavior they
desire.
2020-04-22 21:02:46 +01:00
Lioncash
95d9baea67 {A32, A64}/types: Use std::array deduction guides where applicable
We also make the arrays static here, as MSVC tends to load the whole
array every time the function is called, instead of storing the data
within rodata.

This also line breaks the elements a little earlier for readability.
2020-04-22 21:02:46 +01:00
MerryMage
605a43d23e Suppress MSVC warning C4702: unreachable code 2020-04-22 21:02:46 +01:00
Lioncash
bac945f2d8 A32: Resolve parameter discrepancies discovered via use of the Imm template 2020-04-22 21:02:46 +01:00
Lioncash
e4c65721fe frontend/ir/type: Generify std::array declaration
With deduction guides, we can eliminate the need to explicitly size the
array. Also newlines the elements based off their relation, making it
slightly nicer to read.
2020-04-22 21:02:46 +01:00
Lioncash
4ba2318b2e A32: Replace immediate type aliases with the Imm template
Replaces type aliases of raw integral types with the more type-safe Imm
template, like how the AArch64 frontend has been using it.

This makes the two frontends more consistent with one another.
2020-04-22 21:02:46 +01:00
Lioncash
f01dc9192a CMakeLists: Add a namespace to the export
Avoids potentially dumping boost, fmt, and xbyak targets into a
top-level namespace without any qualification, which can lead to build
errors in projects that already make use of them.
2020-04-22 21:02:46 +01:00
Lioncash
f96036b3f1 A32/barrier: Correct PC assignment within ISB
The SetRegister() IR function doesn't allow specifying the PC as a
register. This is a discrepancy that slipped through (my bad). Instead,
we can use BranchWritePC(), like how the other similar PC modifying
locations do it.
2020-04-22 21:02:46 +01:00
Lioncash
196e7b5e35 frontend/A32/ir_emitter: Mark locals as const where applicable
Makes const usage consistent within the source file.
2020-04-22 21:02:46 +01:00
Lioncash
511613c736 frontend/A32/types: Use helper function in operator+ overload
Allows deduplicating an assert and a cast.
2020-04-22 21:02:46 +01:00
Lioncash
8103652a91 frontend: Move imm.h to the top-level directory of the frontends
Preparation to utilize the immediate type within the A32 backend as
well, which will allow eliminating numerous type aliases like Imm4,
Imm5, etc.
2020-04-22 21:02:46 +01:00
Lioncash
796bb8a7f7 frontend/A64/types: Make RegNumber() and VecNumber() constexpr
Given they simply perform casting, they can be safely made constexpr.
2020-04-22 21:02:46 +01:00
Lioncash
64e51a6d4d A32/disassembler_arm: Mark utility functions as static where applicable
These don't depend on class state and can be marked static to make that
explicit.
2020-04-22 21:02:46 +01:00
Lioncash
0c43228ad5 frontend/A64/types: Use helper functions in operator+ overloads
Allows us to get rid of another explicit cast.
2020-04-22 21:02:46 +01:00
Lioncash
a1cace21a9 frontend/ir/ir_emitter: Apply const to locals where applicable
Makes const usage consistent with all other functions in the source
file.
2020-04-22 21:02:46 +01:00
Lioncash
0a35836998 frontend/ir/ir_emitter: Use switch constructs in floating point opcodes where applicable
This'll reduce the amount of noise necessary in changes implementing
half-precision instructions, as the type can just be prepended to the
switch cases, instead of rewriting the whole if/else branch.
2020-04-22 21:02:46 +01:00
Lioncash
8316d231e9 A32: Implement barrier instructions introduced in ARMv7
Provides basic implementations of the barrier instruction introduced
within ARMv7. Currently these simply mirror the behavior of the AArch64
equivalents.
2020-04-22 21:02:46 +01:00
Lioncash
7fc3bd689d A32: Implement ARM-mode MLS 2020-04-22 21:02:46 +01:00
Lioncash
8b338b7def A32: Implement ARM-mode MOVT 2020-04-22 21:02:46 +01:00
Lioncash
877fa0f8c3 A32: Implement ARM-mode SBFX 2020-04-22 21:02:46 +01:00
Lioncash
47218ee65d A32: Implement ARM-mode UBFX 2020-04-22 21:02:46 +01:00
Lioncash
2970b34e3c A32: Implement ARM-mode BFI 2020-04-22 21:02:46 +01:00
Lioncash
fab3a59e05 A32: Implement ARM-mode BFC 2020-04-22 21:02:46 +01:00
Lioncash
7305d13221 A32: Implement ARM-mode RBIT 2020-04-22 21:02:46 +01:00
Lioncash
b2f7a0e7ba A32: Implement ARM-mode SDIV/UDIV
Now that we have Unicorn in place, we can freely implement instructions
introduced in newer versions of the ARM architecture.
2020-04-22 21:02:46 +01:00
Lioncash
c0ae23bbb7 A32/translate_thumb: Clean up formatting
Performs a similar tidying up of the Thumb translator, like what was
done with the regular ARM translator to make it consistent with the rest
of the codebase.

The A32 backend (both Thumb and ARM), will likely see more changes to it
in the near future, so this just acts as a "dusting off".
2020-04-22 21:02:46 +01:00
Merry
837c23a8ec Merge pull request #483 from lioncash/invert
frontend/ir/cond: Remove unused invert() function
2020-04-22 21:02:46 +01:00
Lioncash
d12e375481 common/fp/op/FPConvert: Remove unnecessary casts in FPConvert()
These were made unnecessary in 2c2fdb435cf8e358a0c5b907ce8131e434df3f22,
but were missed during the initial removal.
2020-04-22 21:02:46 +01:00
Merry
09ee64ea98 Merge pull request #482 from lioncash/fixedfp
A64: Handle half-precision variants of FP->Fixed instructions
2020-04-22 21:02:45 +01:00
MerryMage
1e1e9c17c7 emit_x64_data_processing: Remove INVALID_REG
INVALID_REG.cvt8() now throws
2020-04-22 21:02:45 +01:00
Lioncash
06ec6ab0da frontend/ir/cond: Remove unused invert() function
This is no longer used by anything in the codebase, so it can be
removed.
2020-04-22 21:01:46 +01:00
Merry
d71f51b0da Merge pull request #481 from lioncash/alloc
ir/basic_block: Forward declare headers where applicable
2020-04-22 21:01:46 +01:00
Lioncash
64e3d233f4 A64: Handle half-precision variants of FP->Fixed-point instructions 2020-04-22 21:01:45 +01:00
Lioncash
4fc531f71b ir/basic_block: Forward declare headers where applicable
Now that the constructor and destructors have been placed within the cpp
file, we can forward declare the memory pool data structures. Now, a
change to the memory pool code won't ripple across the entirety of the
IR emitter.
2020-04-22 21:01:45 +01:00
Lioncash
427b7afd66 frontend/ir/microinstruction: Add missing fixed-point opcodes to ReadsFromAndWritesToFPSRCumulativeExceptionBits() 2020-04-22 21:01:45 +01:00
Lioncash
c9777ef997 common/fp/info: Make half-precision info struct functions return correctly sized types
While initially done to potentially prevent creating bugs due to C++
having a silly type-promotion mechanism involving types < sizeof(int)
and unsignedness, given that the bulk of these functions' usages
are on exit paths, these can return the correct type to avoid the need
to cast at every usage point.
2020-04-22 21:01:45 +01:00
Lioncash
9309d95b17 ir/block: Default ctor and dtor in the cpp file
Prevents potentially inlining allocation code everywhere. While we're at
it, also explicitly delete/default the copy/move constructor/assignment
operators to be explicit about them.
2020-04-22 21:01:45 +01:00
Lioncash
604f39f00a frontend/ir_emitter: Add half-precision->fixed-point opcodes 2020-04-22 21:01:45 +01:00
Lioncash
4ecfbc14de common/fp/op/FPToFixed: Add half-precision specialization of FPToFixed 2020-04-22 21:01:45 +01:00
Lioncash
471eb77bc9 A64: Implement FRSQRTS' half-precision vector variant 2020-04-22 21:01:45 +01:00
Lioncash
f9b2862217 A64: Implement FRSQRTS' half-precision scalar variant
With the necessary machinery in place, we can now handle the
half-precision variant.
2020-04-22 21:01:45 +01:00
Lioncash
96356fac93 frontend/ir_emitter: Add half-precision opcode variant of FPVectorRSqrtStepFused 2020-04-22 21:01:45 +01:00
Merry
45864133f5 Merge pull request #478 from lioncash/stepfused
A64: Handle half-precision variants of FRECPE and FRECPS
2020-04-22 21:01:44 +01:00
Lioncash
824c551ba2 frontend/ir_emitter: Add half-precision opcode variant of FPRSqrtStepFused 2020-04-22 21:01:44 +01:00
Lioncash
3739d92097 A64: Implement half-precision vector variant of FRECPE 2020-04-22 21:01:44 +01:00
Lioncash
e3b2eb57b5 common/fp/op/FPRSqrtStepFused: Add half-precision specialization for FPRSqrtStepFused 2020-04-22 21:01:44 +01:00
Lioncash
7b212ec8ae A64: Implement half-precision variant of FRSQRTE's vector variant 2020-04-22 21:01:44 +01:00
Lioncash
0945a491bd A64: Implement half-precision scalar variant of FRECPE 2020-04-22 21:01:44 +01:00
Lioncash
77c84bcf9b A64: Implement half-precision variant of FRSQRTE's scalar variant 2020-04-22 21:01:44 +01:00
Lioncash
86b7626a2f A64: Implement half-precision vector variant of FRECPS 2020-04-22 21:01:44 +01:00
Lioncash
037acb17b9 frontend/ir_emitter: Add half-precision opcode variant for FPVectorRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
de43f011a7 A64: Implement half-precision scalar variant of FRECPS 2020-04-22 21:01:44 +01:00
Lioncash
5dba99b4f4 frontend/ir_emitter: Add half-precision opcode variant for FPRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
825a3ea16f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
726b9914c5 common/fp/op/FPRSqrtEstimate: Add half-precision specialization for FPRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
2184d24e8f frontend/ir_emitter: Add half-precision opcode for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
af2e5afed6 common/fp/op: Add half-precision specialization for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
d7f394fc1a A64: Enable half-precision vector FRINT* variants 2020-04-22 21:01:44 +01:00
Lioncash
5d5c9f149f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
24f583c498 A64: Enable half-precision variants of floating-point FRINT* variants
With all the backing machinery in place, we can remove the fallback
check for half-precision.
2020-04-22 21:01:44 +01:00
Lioncash
6da0411111 frontend/ir_emitter: Add half-precision opcode for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
fb829b9525 frontend/microinstruction: Add FPVectorRoundInt types to ReadsFromAndWritesToFPSRCumulativeExceptionBits()
All variants were previously missing from this.
2020-04-22 21:01:44 +01:00
Lioncash
68d8cd2b13 common/fp/op: Add half-precision specialization for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
5b4673da4b frontend/ir_emitter: Add half-precision variant of FPVectorRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
ad0c698f89 frontend/ir_emitter: Add half-precision variant of FPRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
61cec94a19 fp/op/FPRoundInt: Add half-precision specialization of FPRoundInt 2020-04-22 21:01:44 +01:00
Merry
cb9a1b18b6 Merge pull request #475 from lioncash/muladd
A64: Enable half-precision variants of floating-point multiply-add instructions
2020-04-22 21:01:44 +01:00
Merry
d6db7ad46c Merge pull request #474 from lioncash/bracing
load_store_*: Make bracing consistent and variables const where applicable
2020-04-22 21:01:44 +01:00
Merry
1b6520f5dd A64/location_descriptor: Ensure FZ16 is included in the FPCR mask 2020-04-22 21:01:44 +01:00
Merry
13f421c27d Merge pull request #473 from lioncash/sqshlu
A64: Implement SQSHLU
2020-04-22 21:01:44 +01:00
Lioncash
b5bf890584 load_store_*: Make bracing consistent and variables const where applicable
Makes bracing consistent, and variables const where applicable to be
consistent with the rest of the codebase.

In most bracing cases, they'd need to be added to conditionals that
would involve checking stack pointer alignment in the future anyways.
2020-04-22 21:01:44 +01:00
Lioncash
9a58c3f1c7 A64: Implement FMLA/FMLS' half-precision vector indexed variants 2020-04-22 21:01:44 +01:00
Merry
d7da53a74b Merge pull request #472 from lioncash/exception
general: Mark hash functions as noexcept
2020-04-22 21:01:44 +01:00
Lioncash
9dcc04e106 A64: Implement SQSHLU's scalar variant 2020-04-22 21:01:44 +01:00
Merry
b91c6c8bae Merge pull request #471 from lioncash/sqrdmulh
A64: Implement SQRDMULH's scalar vector variant
2020-04-22 21:01:44 +01:00
Lioncash
1fdd3ef8a0 A64: Implement FMLA/FMLS' half-precision scalar indexed variants 2020-04-22 21:01:44 +01:00
Lioncash
2d59d10ac8 A64: Implement SQSHLU's vector variant
The vector shift by immediate category is now fully implemented.
2020-04-22 21:01:44 +01:00
Merry
b5e25959d9 Merge pull request #470 from lioncash/assert
general: Replace unreachable-imitating assertions with UNREACHABLE()
2020-04-22 21:01:44 +01:00
Lioncash
d6606deda2 A64: Implement half-precision vector variants of FMLA/FMLS 2020-04-22 21:01:44 +01:00
Lioncash
a4cadf1cd9 frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation 2020-04-22 21:01:44 +01:00
Lioncash
ec6b3ae084 ir/frontend: Add half-precision opcode for FPVectorMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
5f74d25bf7 A64: Enable half-precision floating point variants of FP data-processing three register instructions
This handles half-precision floating point for:

- FMADD
- FMSUB
- FNMADD
- FNMSUB
2020-04-22 21:01:44 +01:00
Lioncash
bd82513199 frontend/ir_emitter: Add half-precision opcode for FPMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
79a892d23c fp/op/FPMulAdd: Add half-precision floating-point specialization 2020-04-22 21:01:44 +01:00
Lioncash
7bb5440507 general: Mark hash functions as noexcept
Generally hash functions shouldn't throw exceptions. It's also a
requirement for the standard library-provided hash functions to not
throw exceptions.

An exception to this rule is made for user-defined specializations,
however we can just be consistent with the standard library on this to
allow it to play nicer with it.

While we're at it, we can also make the std::less specializations
noexcpet as well, since they also can't throw.
2020-04-22 21:01:43 +01:00
Lioncash
3b46b4a37d A64: Implement SQRDMULH's scalar vector variant
Implements the scalar variant in terms of the vector variant for the
time being.
2020-04-22 21:01:43 +01:00
Lioncash
fe95575b95 general: Replace unreachable-imitating assertions with UNREACHABLE()
We can just use the self-documenting assertion for indicating
unreachable paths, instead of manually passing false and providing a
message.
2020-04-22 21:01:43 +01:00
Merry
4a3d808354 Merge pull request #468 from lioncash/const
ir_opt: Mark locals as const where applicable
2020-04-22 21:01:43 +01:00
Lioncash
64de80839e A64/impl: Reorganize peculiar void use in V_scalar
To a reader this might look particularly strange, given the function
itself has a void return value, but this is actually valid, given the
function in the return statement also has a void return value.

This instead alters it to be a little easier to parse and potentially be
a little less confusing at a glance.
2020-04-22 21:01:43 +01:00
Merry
9a4e3b24e4 Merge pull request #467 from lioncash/reserved
A64: Handle reserved instruction cases more specifically where applicable
2020-04-22 21:01:43 +01:00
Merry
0b794cbcea Merge pull request #466 from lioncash/fcmla
A64: Implement FCMLA's indexed element variant
2020-04-22 21:01:43 +01:00
Merry
994349d154 Merge pull request #465 from neobrain/master
CMakeLists: Allow importing dynarmic build trees into other CMake projects
2020-04-22 21:01:43 +01:00
Lioncash
cfd7513a7d ir_opt/verification_pass: Mark locals as const where applicable
Makes our immutable state a little more explicit.
2020-04-22 21:01:40 +01:00
Lioncash
8309d49588 A64: Handle reserved instruction cases more specifically where applicable
These are cases that are defined as reserved within the ARMv8 reference
manual, so we can handle them as such instead of as unallocated
encodings.

While this doesn't actually change emulated behavior, it does at least
allow the JIT to generate the more appropriate exception.
2020-04-22 21:00:47 +01:00
Lioncash
6c2c68bce6 A64: Implement FCMLA's indexed element variant
With this, all of the instructions introduced with ARMv8.3-CompNum have
an implementation.
2020-04-22 21:00:47 +01:00
Tony Wasserka
7d99a6c00f CMakeLists: Allow importing dynarmic build trees into other CMake projects 2020-04-22 21:00:47 +01:00
Lioncash
1a45f35b9c ir_opt/a64_callback_config_pass: Mark locals as const where applicable
Makes our immutable state a little more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
7bc7042104 simd_scalar_shift_by_immediate: Change UnallocatedEncoding() path in SaturatingShiftLeft to ReservedValue()
Strictly speaking, immh being zero is defined as reserved in the ARMv8
reference manual. This was just an error on my part when introducing the
SQSHL immediate scalar variant.
2020-04-22 21:00:47 +01:00
Lioncash
dc97977576 ir_opt/a32_get_set_elimination_pass: Mark local variables as const where applicable
Makes our intended immutable state slightly more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
b1b4487e4d A64: Implement UQSHL (immediate)'s scalar variant
Like SQSHL's immediate scalar variant, we can also implement UQSHL's
immediate scalar variant in terms of the vector variant for the time
being.
2020-04-22 21:00:47 +01:00
Lioncash
3649dc6d9a A64: Implement scalar variant of SQSHL (immediate)
This can be handled in terms of the vector variant for the time being.
2020-04-22 21:00:47 +01:00
Lioncash
7d535eaba6 ir_opt/a32_constant_memory_reads_pass: Apply const where applicable to locals
Makes immutable state just slightly more explicit.
2020-04-22 21:00:47 +01:00
Lioncash
e1b4ff1068 simd_scalar_shift_by_immediate: Migrate SQSHL implementation to file-scope function
This will allow it to be reused for the implementation of UQSHL.
2020-04-22 21:00:47 +01:00
Lioncash
b37279f65c backend/x64/emit_x64_vector: Prevent undefined behavior within VectorSignedSaturatedShiftLeft
Avoids undefined behavior by potentially left-shifting a signed negative
value.
2020-04-22 21:00:47 +01:00
Lioncash
46eae8cf2f common/fp/op/FPRecipExponent: Prevent undefined behavior from shifting a negative value
Due to promotion rules (types < int, even if unsigned, get promoted to
int when arithmetic is performed on them), this is a potential spot for
undefined behavior.
2020-04-22 21:00:47 +01:00
MerryMage
13e8b7b516 emit_x64_floating_point: F16C implementation of FPSingleToHalf 2020-04-22 20:58:17 +01:00
MerryMage
d32d6fe598 emit_x64_floating_point: F16C implementation of FPHalfToSingle and FPHalfToDouble 2020-04-22 20:58:12 +01:00
MerryMage
a53ba12be2 emit_x64_floating_point: Factor out ConvertRoundingModeToX64Immediate 2020-04-22 20:58:12 +01:00
MerryMage
5a2adc6629 backend/x64: Expose FPCR in EmitContext instead of its subcomponents 2020-04-22 20:58:12 +01:00
Merry
01bb1cdd88 Merge pull request #458 from lioncash/float-op
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Lioncash
28a8b4d210 A64: Handle half-precision floating point in scalar FMOV
This is simply performing a scalar value transfer between registers
without conversions, so this is trivial to handle as-is.
2020-04-22 20:58:12 +01:00
Lioncash
d7ac5a664f A64: Handle half-precision floating point in FCVTL
Like FCVTN, now that we have half-precision floating point conversion
functions available, we can go ahead and use those to eliminate the
interpreter fallback.
2020-04-22 20:58:12 +01:00
Lioncash
fe84ecb780 A64: Handle half-precision floating point in scalar FABS
Now that we have the half-precision variant of the opcode added, we can
simply handle the instruction instead of treating it as undefined.
2020-04-22 20:58:12 +01:00
Lioncash
fac9224d5e A64: Handle half-precision floating point in FCVTN
Now that we have IR instructions for performing conversions with
half-precision floating point, we can also handle half-precision values
within FCVTN.
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs 2020-04-22 20:58:12 +01:00
Lioncash
16de99d3e3 A64: Enable FCVT floating-point conversions for half-precision
With this, we no longer have to fall back to the interpreter in any of
the FCVT floating-point conversion instructions.
2020-04-22 20:58:12 +01:00
Lioncash
10abc77fad A64: Handle half-precision floating point in scalar FNEG
With the half-precision variant of the FPNeg opcode added, we can
utilize it here to emulate the half-precision variant of FNEG.
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes 2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg 2020-04-22 20:58:12 +01:00
Lioncash
dff5da1063 common/fp/unpacked: Amend behavior of FPUnpackCV
This is supposed to call FPUnpackBase instead of FPUnpack. This would
result in alternate half-precision representations being misinterpreted
when it comes to dealing with NaNs.
2020-04-22 20:58:12 +01:00
Merry
f01afc5ae6 Merge pull request #456 from lioncash/mov
A64: Enable FMOV (general) for half-precision floating point
2020-04-22 20:58:12 +01:00
Lioncash
03bc2334fe common/fp/op/FPConvert: Amend off-by one in double NaN case in FPConvertNaN
Avoids potentially clobbering the intended sign bit value during
conversions to double-precision values. The other conversion types are
already properly handled, so those don't need to be addressed.
2020-04-22 20:58:12 +01:00
Lioncash
c57b146fb2 common/fp/op/FPConvert: Add half-precision instantiations to FPConvert 2020-04-22 20:58:12 +01:00
Merry
c1ce94872d Merge pull request #455 from lioncash/sqrdmulh-scalar
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2020-04-22 20:58:11 +01:00
Lioncash
25a7256ee1 A64: Enable FMOV (general) for half-precision floating point
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2020-04-22 20:58:11 +01:00
Lioncash
97dd3d0596 A64: Implement SQRDMULH's scalar indexed element variant 2020-04-22 20:58:11 +01:00
Lioncash
49b51e34f1 simd_vector_x_indexed_element: Deduplicate index and Vm operand construction 2020-04-22 20:58:11 +01:00
Lioncash
692aba91b6 A64: Implement SQDMULL{2}'s scalar indexed element variant 2020-04-22 20:58:11 +01:00
Lioncash
c043b831d5 A64: Implement SQDMULL{2}'s by-element variant 2020-04-22 20:58:11 +01:00
Lioncash
72af5a3dff simd_scalar_x_indexed_element: Factor out index and Vm argument construction
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2020-04-22 20:58:11 +01:00
Lioncash
224ff0afaa A64: Implement SQRDMULH's by-index vector variant 2020-04-22 20:58:11 +01:00
Lioncash
3a3542414b A64: Implement FRECPX's half-precision floating point variant 2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point 2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677 frontend/ir/value: Add U16U32U64 type to represent floating point types 2020-04-22 20:58:11 +01:00
Lioncash
eb3e0d5908 common/fp/op/FPRecipExponent: Add half-precision floating point specialization 2020-04-22 20:58:11 +01:00
Lioncash
a829c93406 common/fp/unpacked: Correct edge-cases within FPUnpack for half-precision floating point
This corrects one case where floating-point exceptions could be set when
they're not supposed to be.

This also corrects a case where values were being treated as NaNs when
they weren't supposed to be.
2020-04-22 20:58:11 +01:00
Lioncash
7030b9af95 common/fp/process_nan: Add half-precision instantiations for NaN processing functions 2020-04-22 20:58:11 +01:00
Lioncash
14f55d7476 common/fp/unpacked: Add half-precision instantiation of FPRoundBase 2020-04-22 20:58:11 +01:00
Lioncash
7e814de445 common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase 2020-04-22 20:58:11 +01:00
Lioncash
8f9fe8690a common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.

At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2020-04-22 20:58:11 +01:00
Merry
37c4c39d62 Merge pull request #448 from lioncash/saturate
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2020-04-22 20:58:11 +01:00
Merry
f5d774bdbd Merge pull request #449 from lioncash/hp
common/fp/info: Add specialization of FPInfo for half-precision floating point
2020-04-22 20:58:11 +01:00