Commit graph

582 commits

Author SHA1 Message Date
Lioncash
b97b71b8aa emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
033e400df0 emit_x64_vector_floating_point: Deduplicate accurate NaN handling code
Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.
2020-04-22 20:46:17 +01:00
Lioncash
0f067b7330 emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
d4ee878cbd emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available 2020-04-22 20:46:17 +01:00
Lioncash
51e4f1d9db emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32() 2020-04-22 20:46:17 +01:00
Lioncash
c692ccdd6d emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8() 2020-04-22 20:46:17 +01:00
Lioncash
b194313d8c emit_x64_vector: Vectorize fallback path in EmitVectorMinU32() 2020-04-22 20:46:17 +01:00
Lioncash
7ceda6d919 emit_x64_vector: Vectorize fallback path in EmitVectorMinU16() 2020-04-22 20:46:17 +01:00
Lioncash
cda85a1da0 emit_x64_vector: Vectorize fallback path in EmitVectorMinS32() 2020-04-22 20:46:17 +01:00
Lioncash
6e08eed210 emit_x64_vector: Vectorize fallback path in EmitVectorMinS8() 2020-04-22 20:46:17 +01:00
Lioncash
0fb6dce689 emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
This can simply be merged with the previous one.
2020-04-22 20:46:17 +01:00
Lioncash
5b71b1337b emit_x64_vector: Avoid left shift of negative value in LogicalVShift
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2020-04-22 20:46:17 +01:00
Lioncash
9954d28868 a64_jitstate: Zero SP and PC on construction of A64JitState
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent
2020-04-22 20:46:17 +01:00
Lioncash
4efbd40ea4 backend_x64/callback: Default virtual destructor in the cpp file
Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)
2020-04-22 20:46:17 +01:00
Lioncash
edd0b5c8c7 a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
It's well-defined to static_cast a void* to its proper type.
2020-04-22 20:46:17 +01:00
Lioncash
21974ee57e backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash
9fc89f0a0e emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash
af28e89a13 emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16() 2020-04-22 20:46:17 +01:00
Lioncash
0d20423ad5 emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32() 2020-04-22 20:46:17 +01:00
Lioncash
d70ee7c0d1 emit_x64_vector: Use VBPROADCAST where applicable and available
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash
26d77c6f09 ir: Add opcodes for performing vector deinterleaving 2020-04-22 20:46:17 +01:00
Lioncash
87ca63699f emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash
f17702f608 emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs 2020-04-22 20:46:17 +01:00
Lioncash
596a8dd1dd emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash
75fd4eaaaa emit_x64_vector: Get rid of some magic numbers in loop bounds 2020-04-22 20:46:17 +01:00
Lioncash
7b80ac25eb emit_x64_vector: Generify variable shift functions 2020-04-22 20:46:17 +01:00
Lioncash
38fa984b53 IR: Add opcode for packed word->f32 conversions 2020-04-22 20:46:16 +01:00
Lioncash
64b1f2d468 ir: Add opcode for reversing bits in a vector 2020-04-22 20:46:15 +01:00
Lioncash
e33dcce14a ir: Add opcodes for performing vector absolute values 2020-04-22 20:46:15 +01:00
MerryMage
3472f371df IR: Implement VectorExtract, VectorExtractLower IR instructions 2020-04-22 20:46:15 +01:00
MerryMage
5c47f03888 A64: Implement FMUL (vector) 2020-04-22 20:46:15 +01:00
Lioncash
ad5cf584ce ir: Add opcodes for performing vector unsigned absolute differences 2020-04-22 20:46:15 +01:00
Lioncash
701f43d61e IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords
I should have added this when I introduced the functions for interleaving
low-order equivalents for consistency in the interface.
2020-04-22 20:46:15 +01:00
Lioncash
3985f7bf84 emit_x64_data_processing: Deduplicate some code in zero-extension functions
EmitZeroExtendByteToLong() can be implemented in terms of EmitZeroExtendByteToWord() and
EmitZeroExtendHalfToLong() can be implemented in terms of EmitZeroExtendHalfToWord().
2020-04-22 20:46:15 +01:00
MerryMage
e7b60189b3 abi: Missing includes' 2020-04-22 20:46:15 +01:00
MerryMage
cdc5c3ad95 emit_x64_floating_point: Near jump instead of short jump in FPMinNumberic{32,64} 2020-04-22 20:46:15 +01:00
MerryMage
df4ee0f51e emit_X64_floating_point: Near jmp to end instead of short jmp
Jump destination can be further than what can be reached in a short
jump under some FPCR options.
2020-04-22 20:46:15 +01:00
Lioncash
b8d5765f9b emit_x64_vector: Fix typo in VectorShuffleImpl
This is supposed to be pshufd, not pshufw (which only allows a 64-bit operand)
2020-04-22 20:46:15 +01:00
Lioncash
6b0010c940 ir: Add IR opcodes for emitting vector shuffles
This uses the ARM terminology for sizes (Halfword -> 2 bytes, Word -> 4 bytes)
as opposed to the x86 terminology of (Word -> 2 bytes, Double word -> 4 bytes)
2020-04-22 20:46:15 +01:00
Lioncash
eb2d28d2b1 emit_x64_vector_floating_point: Fix out of bounds array access in EmitVectorOperation64 2020-04-22 20:46:15 +01:00
MerryMage
49cc6d7fad A64: Implement FDIV (vector) 2020-04-22 20:46:15 +01:00
MerryMage
c832cec96d Correct FPSR and FPCR 2020-04-22 20:46:15 +01:00
MerryMage
147284427b A64: Implement USHL 2020-04-22 20:46:15 +01:00
MerryMage
e4697b1676 A64: Implement system register TPIDR_EL0 2020-04-22 20:46:15 +01:00
MerryMage
e3da92024e A64: Implement system registers FPCR and FPSR 2020-04-22 20:46:15 +01:00
MerryMage
9e4e4e9c1d A64: Implement system register CNTPCT_EL0 2020-04-22 20:46:15 +01:00
MerryMage
1e15283d00 A64: Implement system register CTR_EL0 2020-04-22 20:46:15 +01:00
MerryMage
710d09471b IR: Add IR instruction ZeroVector 2020-04-22 20:46:15 +01:00
MerryMage
2721bb5ace emit_x64_floating_point: Add maybe_unused to preprocess parameter 2020-04-22 20:46:15 +01:00
MerryMage
0575e7421b A64: Implement FMINNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1c9804ea07 A64: Implement FMAXNM (scalar) 2020-04-22 20:46:15 +01:00
MerryMage
1dfce0894d constant_pool: Add frame parameter 2020-04-22 20:46:14 +01:00
MerryMage
84f1c9b7f4 reg_alloc: Only exchange GPRs 2020-04-22 20:46:14 +01:00
MerryMage
6541ec064d emit_x64_floating_point: Correct FP{Max,Min}{32,64} implementations for -0/+0 2020-04-22 20:46:14 +01:00
MerryMage
7c193485e1 a64/config: Allow NaN emulation accuracy to be set 2020-04-22 20:46:14 +01:00
MerryMage
a3df46a75a a64_emit_x64: Add conf to A64EmitContext 2020-04-22 20:46:14 +01:00
MerryMage
07520f32c3 backend_x64: Accurately handle NaNs 2020-04-22 20:46:14 +01:00
MerryMage
e97581d063 fuzz_with_unicorn: Print AArch64 disassembly 2020-04-22 20:46:14 +01:00
MerryMage
47c0ad0fc8 IR: Implement Vector{Max,Min}{Signed,Unsigned} 2020-04-22 20:46:14 +01:00
MerryMage
f4775910f5 IR: Implement VectorGreaterSigned 2020-04-22 20:46:14 +01:00
MerryMage
1f5b3bca43 Exclusive fixups
* Incorrect size of exclusive_address
* Disable tests on exclusive memory instructions for now
2020-04-22 20:46:14 +01:00
MerryMage
f3fa4a042f a64_emit_x64: EmitExclusiveWrite: Make MSVC happy (narrowing conversion warning) 2020-04-22 20:46:14 +01:00
MerryMage
8698f057d0 A64: Implement STXP, STLXP, LDXP, LDAXP 2020-04-22 20:46:14 +01:00
MerryMage
b7a2c1a7df A64: Implement STXRB, STXRH, STXR, STLXRB, STLXRH, STLXR, LDXRB, LDXRH, LDXR, LDAXRB, LDAXRH, LDAXR 2020-04-22 20:46:14 +01:00
MerryMage
a6cc667509 Direct Page Table Access: Handle address spaces less than the full 64-bit in size 2020-04-22 20:46:14 +01:00
MerryMage
f45a5e17c6 Implement direct page table access 2020-04-22 20:46:14 +01:00
MerryMage
bfd3e30c75 callbacks: Member functions should be const 2020-04-22 20:46:14 +01:00
MerryMage
9f2f08db8d a64_emit_x64: Implement {Read,Write}Memory128 in terms of a function call 2020-04-22 20:46:14 +01:00
MerryMage
6c4773e85b abi: Add RAX to ABI_ALL_CALLER_SAVE 2020-04-22 20:46:14 +01:00
MerryMage
8756487554 A64: Partially implement MRS 2020-04-22 20:46:14 +01:00
MerryMage
bfd65bedfe A64: Implement DSB, DMB 2020-04-22 20:46:14 +01:00
MerryMage
5edd623b9d Implement DC instructions 2020-04-22 20:46:14 +01:00
MerryMage
2cb0a699ba IR: Implement FPMax, FPMin 2020-04-22 20:46:14 +01:00
MerryMage
98c8e7d1af IR: Implement FPVectorAdd 2020-04-22 20:46:14 +01:00
MerryMage
eae518a338 IR: Implement VectorSignExtend 2020-04-22 20:46:14 +01:00
MerryMage
b9cd345ddc IR: Implement FPVectorSub 2020-04-22 20:46:14 +01:00
MerryMage
851fc83445 emit_x64_vector: EmitOneArgumentFallback 2020-04-22 20:46:14 +01:00
MerryMage
303088a51e IR: Implement VectorPopulationCount 2020-04-22 20:46:14 +01:00
MerryMage
bf2cd92da9 emit_x64_vector: Add SSE4.1 implementation for EmitVectorMultiply64 2020-04-22 20:46:14 +01:00
MerryMage
b062266b8e emit_x64_vector: More explicit lambda decay 2020-04-22 20:46:14 +01:00
MerryMage
b6de612e01 IR: Implement VectorMultiply 2020-04-22 20:46:14 +01:00
MerryMage
90a053a5e4 emit_x64_vector: Order alphabetically 2020-04-22 20:46:14 +01:00
MerryMage
715ae1c229 IR: Implement VectorArithmeticShiftRight 2020-04-22 20:46:14 +01:00
MerryMage
132c783320 IR: Implement VectorNarrow 2020-04-22 20:46:13 +01:00
MerryMage
1423584f9f constant_pool: Allow for 128-bit constants 2020-04-22 20:46:13 +01:00
MerryMage
69de50a878 emit_x64_vector: Add SSE4.1 implementations for VectorZeroExtend 2020-04-22 20:46:13 +01:00
MerryMage
cbc9f361b0 IR: Implement VectorSub 2020-04-22 20:46:13 +01:00
MerryMage
b22c5961f9 IR: Implement VectorLogicalShiftRight 2020-04-22 20:46:13 +01:00
MerryMage
59ace60b03 IR: Implement VectorZeroExtend 2020-04-22 20:46:13 +01:00
MerryMage
f6247125c0 IR: Implement VectorLogicalShiftLeft{8,16,32,64} 2020-04-22 20:46:13 +01:00
MerryMage
15e8231f24 opcodes: Sort vector IR opcodes alphabetically 2020-04-22 20:46:13 +01:00
MerryMage
d74f4e35f6 block_of_code: Increase constant pool size 2020-04-22 20:46:13 +01:00
MerryMage
e69288f803 devirtualize: MinGW uses Intanium MFP ABI 2020-04-22 20:46:13 +01:00
MerryMage
ad428cbd7a callback: Properly handle calls with return pointers and simplify interface 2020-04-22 20:46:13 +01:00
MerryMage
7a87e3fc55 devirtualize: Handle Windows ABI 2020-04-22 20:46:13 +01:00
MerryMage
f808a0fbde devirtualize: Devirtualize Itanium ABI MFPs at runtime 2020-04-22 20:46:13 +01:00
Lioncash
35a29a9665 A64: Implement ZIP1 2020-04-22 20:46:13 +01:00
FernandoS27
586854117b Implemented UMULH and SMULH instructions 2020-04-22 20:46:13 +01:00
MerryMage
44c3c2312a a64_jitstate: Remove unnecessary FPSCR_nzcv member 2020-04-22 20:46:13 +01:00
MerryMage
aac5af50e2 IR: FPCompare{32,64} now return NZCV flags instead of implicitly setting them 2020-04-22 20:46:13 +01:00
MerryMage
dd2a6684fe IR: Add ConditionalSelectNZCV instruction 2020-04-22 20:46:13 +01:00
MerryMage
b173fcf34e backend_x64: Simplify FPDoubleToU32 and FPSingleToU32
They're inaccurate in terms of FPSR at the moment anyway.
2020-04-22 20:46:13 +01:00
Lioncash
d040920727 Common: Put AES code within its own nested namespace
Prevents the functions from potentially clashing with other stuff in Common in the future
2020-04-22 20:46:13 +01:00
Lioncash
40614202e7 A64: Implement AESD 2020-04-22 20:46:13 +01:00
Lioncash
ccef85dbb7 A64: Implement AESE 2020-04-22 20:46:13 +01:00
MerryMage
68f46c8334 backend_x64: Use a reference to BlockOfCode instead of a pointer 2020-04-22 20:46:13 +01:00
MerryMage
8931ee346b IR: Add IR instruction NZCVFromPackedFlags
This instruction expects NZCV to be in the high bits.
i.e.: The positions they were in PSTATE.
2020-04-22 20:46:13 +01:00
MerryMage
75b8a76630 a64_jitstate: A64 does not have a seperate FPSCR.NZCV 2020-04-22 20:46:13 +01:00
MerryMage
6414736a8d emit_x64_vector: bug: VectorGetElement8 returning incorrect values for non-SSE4.1
This bug wasn't discovered earlier because we previously only used index == 0.
2020-04-22 20:46:13 +01:00
MerryMage
ebfc51c609 IR: Implement VectorSetElement{8,16,32,64} 2020-04-22 20:46:13 +01:00
Lioncash
a5c4fbc783 A64: Implement AESIMC and AESMC 2020-04-22 20:46:13 +01:00
Lioncash
ab9b5fb8aa Common: Relocate common bits of CRC32
Allows the algorithm to be used in any other potential backend.
2020-04-22 20:46:12 +01:00
Lioncash
af1384d700 A64: Implement CRC32 2020-04-22 20:46:12 +01:00
MerryMage
64761dbc72 scope_exit: Add SCOPE_SUCCESS and SCOPE_EXIT 2020-04-22 20:46:12 +01:00
MerryMage
bafb39ebc5 A64: Add Disassemble method 2020-04-22 20:46:12 +01:00
MerryMage
f023bbb893 A32: Add ExceptionRaised IR instruction and use it 2020-04-22 20:46:12 +01:00
Lioncash
7ffbebf290 A64: Implement CRC32C 2020-04-22 20:46:12 +01:00
MerryMage
d7044bc751 assert: Use fmt in ASSERT_MSG 2020-04-22 20:46:12 +01:00
MerryMage
52268298a8 a64_emit_x64: Perform RSB predictions 2020-04-22 20:46:12 +01:00
MerryMage
98ec9c5f90 A32: Change UserCallbacks to be similar to A64's interface 2020-04-22 20:46:12 +01:00
Lioncash
b9ce660113 reg_alloc: std::move RegAlloc's function argument 2020-04-22 20:46:12 +01:00
Lioncash
ed561d6653 General: Add missing override specifiers 2020-04-22 20:46:12 +01:00
MerryMage
b2d99eddc6 EmitZeroExtendLongToQuad: Do not rely on register allocator to zero extend 64->128 2020-04-22 20:46:12 +01:00
MerryMage
54de64f5bf a64_emit_x64: bug: x64 sign-extends 32-bit immediates 2020-04-22 20:46:12 +01:00
MerryMage
6fc228f7fd ir_opt: Add A64 Get/Set Elimination Pass 2020-04-22 20:46:12 +01:00
MerryMage
af793c2527 {a32,a64}_interface: Predict entrypoint 2020-04-22 20:46:12 +01:00
Lioncash
7734cf1050 A64: Implement EXTR 2020-04-22 20:46:12 +01:00
MerryMage
d497464c9f a64_jitstate: Have 128-bit wide spills 2020-04-22 20:44:38 +01:00
MerryMage
b513b2ef05 IR: Implement IR instructions A64{Get,Set}S 2020-04-22 20:44:38 +01:00
MerryMage
16fa2cd8f6 a64_emit_x64: Use xword from Xbyak::util 2020-04-22 20:44:38 +01:00
Lioncash
67443efb62 General: Convert multiple namespace specifiers to nested namespace specifiers where applicable
Makes namespacing a little less noisy
2020-04-22 20:44:38 +01:00
MerryMage
d5283e46e8 IR: Implement IR instructions VectorEqual{8,16,32,64,128} 2020-04-22 20:44:38 +01:00
MerryMage
4ce9c65cfb reg_alloc: Use std::exchange 2020-04-22 20:44:38 +01:00
Fernando Sahmkow
e0c12ec2ad A64: Implemented EOR (vector), ORR (vector, register) and ORN (vector) Instructions (#142) 2020-04-22 20:44:38 +01:00
MerryMage
d124a1d761 emit_x64_packed: EmitPackedSubU16 modified xmm_b wasn't writeable
For CPUs that didn't support SSE4.1, this was a bug.
2020-04-22 20:44:38 +01:00
MerryMage
01a26fa644 fixup: travis: Test with disabled CPU feature detection 2020-04-22 20:44:37 +01:00
MerryMage
30936f5e94 travis: Test with disabled CPU feature detection
Ensure that fallbacks are working correctly.
2020-04-22 20:44:37 +01:00
MerryMage
285fd22c30 IR: Add IR instruction VectorZeroUpper 2020-04-22 20:44:37 +01:00
MerryMage
da3e9a5704 a64_emit_x64: bug: EmitA64WriteMemory128 should write not read 2020-04-22 20:44:37 +01:00
FernandoS27
ab84524806 Implemented SDIV and UDIV instructions 2020-04-22 20:44:37 +01:00
MerryMage
f698848e26 IR: Add IR instructions A64Memory{Read,Write}128
Add the Windows ABI implementation
2020-04-22 20:44:37 +01:00
MerryMage
e1df7ae621 IR: Add IR instructions A64Memory{Read,Write}128
This implementation only works on macOS and Linux.
2020-04-22 20:44:37 +01:00
MerryMage
e00a522cba IR: Add IR instruction VectorGetElement{8,16,32,64} 2020-04-22 20:44:37 +01:00
MerryMage
28ccd85e5c IR: Add IR instruction ZeroExtendToQuad 2020-04-22 20:44:37 +01:00
MerryMage
af848c627d block_of_code: Add ABI_RETURN2 2020-04-22 20:44:37 +01:00
MerryMage
1749780929 interface: Move Vector typedef to config.h 2020-04-22 20:44:37 +01:00
MerryMage
793753bf63 IR: Implement Vector{Lower,}Broadcast{8,16,32,64} 2020-04-22 20:44:37 +01:00
Lioncash
8ee854232c General: Default constructors and destructors where applicable 2020-04-22 20:44:37 +01:00
MerryMage
db30e02ac8 emit_x64: Extract BlockRangeInformation, remove template parameter 2020-04-22 20:44:36 +01:00
MerryMage
58c4a25527 emit_x64: Use JitStateInfo 2020-04-22 20:42:46 +01:00