MerryMage
760cc3ca89
EmitContext: Expose FPCR
2020-04-22 20:46:19 +01:00
MerryMage
3cb98e1560
fp: Move fp_util to fp/util
2020-04-22 20:46:19 +01:00
Lioncash
4aa4885ba7
ir: Add opcodes for vector conversion of u32/u64 to floating-point
2020-04-22 20:46:19 +01:00
Lioncash
7a84b6e8d8
ir: Add opcodes for converting S64 and U64 to single-precision floating-point values
2020-04-22 20:46:19 +01:00
Lioncash
066061fa50
constant_pool: Remove unnecessary std::memset from constructor
...
AllocateFromCodeSpace() already zeroes out the allocated memory.
2020-04-22 20:46:19 +01:00
Lioncash
35026a6ce3
emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32()
2020-04-22 20:46:19 +01:00
MerryMage
0b97e9bd8d
emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode
2020-04-22 20:46:18 +01:00
MerryMage
a2eb9a02e0
backend_x86: Add FPSCR_RMode to EmitContext
2020-04-22 20:46:18 +01:00
MerryMage
d875c08ebf
fp: Extract common RoundingMode enum
2020-04-22 20:46:18 +01:00
Lioncash
7252293184
emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle()
...
In the non-AVX512 path, the following code is present:
code.mov(from.cvt32(), from.cvt32());
since this potentially modifies 'from', we should be using
UseScratchGpr() instead.
2020-04-22 20:46:18 +01:00
Lioncash
fbd7623fe5
emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble()
...
AVX-512F provides convenient instructions for these kinds of conversions
directly
2020-04-22 20:46:18 +01:00
Lioncash
3a41465eaf
ir: Add opcodes for converting S64 and U64 to double-precision values
2020-04-22 20:46:18 +01:00
MerryMage
436ca80bcd
Merge branch 'global_monitor'
2020-04-22 20:46:18 +01:00
MerryMage
821cff1227
A64: Add ClearExclusiveState method
2020-04-22 20:46:18 +01:00
Lioncash
81e572c78c
ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16
2020-04-22 20:46:18 +01:00
MerryMage
2a8de5f733
a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor
...
The kernel would have to execute an ERET instruction to return to
userland; this clears exclusive state.
2020-04-22 20:46:18 +01:00
MerryMage
57f7c7e1b0
Implement global exclusive monitor
2020-04-22 20:46:18 +01:00
MerryMage
85234338d3
a64_emit_x64: Simplify EmitExclusiveWrite
2020-04-22 20:46:18 +01:00
Lioncash
fc731dddae
ir: Add opcodes for performing vector absolute floating-point values
...
This will be usable for implementing FACGE and FACGT
2020-04-22 20:46:18 +01:00
Lioncash
0bee648b4f
emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions
...
Given both branches are the same, we can hoist out the common code.
2020-04-22 20:46:18 +01:00
Lioncash
b6e223fc58
emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8()
...
Given both branches use the same destination register size, we can hoist
the common code out.
2020-04-22 20:46:18 +01:00
Lioncash
5ce187a54e
ir: Add opcodes for floating-point vector equalities
2020-04-22 20:46:18 +01:00
Lioncash
cf188448d4
emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64()
...
Gets rid of the need to perform a fallback.
2020-04-22 20:46:18 +01:00
Lioncash
954deff2d4
emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned()
...
This doesn't alter behavior but does make the code better if anything
else is ever added to this function in the future.
2020-04-22 20:46:18 +01:00
Lioncash
bc718c5b28
ir: Add opcodes for performing rounding halving adds
2020-04-22 20:46:18 +01:00
Lioncash
054549da35
emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64
...
I realized I introduced a helper for simple AVX operation emitting, so
use that instead of writing it all out long-form.
2020-04-22 20:46:18 +01:00
Lioncash
8a4f8aed06
ir: Add opcode for performing FP vector absolute differences
2020-04-22 20:46:18 +01:00
MerryMage
8c90fcf58e
IR: Implement FPMulAdd
2020-04-22 20:46:18 +01:00
Lioncash
c695da1cf3
ir: Add opcode for floating-point GE and GT comparisons
...
The rest of the comparisons can be implemented in terms of these two
2020-04-22 20:46:18 +01:00
Lioncash
6de5ed96e5
emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs
...
Shortens code-gen down to a single instruction in the 64-bit path.
2020-04-22 20:46:18 +01:00
Lioncash
1e10017f4b
ir: Add opcodes for signed absolute differences
2020-04-22 20:46:17 +01:00
Lioncash
3f6c529da2
ir: Add opcode to perform the vector conversion S64->F64
...
Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us
2020-04-22 20:46:17 +01:00
Lioncash
44a5f8095a
ir: Add opcodes for performing vector halving subtracts
2020-04-22 20:46:17 +01:00
Lioncash
b312d28295
ir: Add an opcode for doing an SM4 lookup table query
2020-04-22 20:46:17 +01:00
Lioncash
27a6d5f6ce
emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available
2020-04-22 20:46:17 +01:00
Lioncash
089096948a
ir: Add opcodes for performing halving adds
2020-04-22 20:46:17 +01:00
Lioncash
3d00dd63b4
emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
b97b71b8aa
emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
033e400df0
emit_x64_vector_floating_point: Deduplicate accurate NaN handling code
...
Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.
2020-04-22 20:46:17 +01:00
Lioncash
0f067b7330
emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
d4ee878cbd
emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available
2020-04-22 20:46:17 +01:00
Lioncash
51e4f1d9db
emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32()
2020-04-22 20:46:17 +01:00
Lioncash
c692ccdd6d
emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8()
2020-04-22 20:46:17 +01:00
Lioncash
b194313d8c
emit_x64_vector: Vectorize fallback path in EmitVectorMinU32()
2020-04-22 20:46:17 +01:00
Lioncash
7ceda6d919
emit_x64_vector: Vectorize fallback path in EmitVectorMinU16()
2020-04-22 20:46:17 +01:00
Lioncash
cda85a1da0
emit_x64_vector: Vectorize fallback path in EmitVectorMinS32()
2020-04-22 20:46:17 +01:00
Lioncash
6e08eed210
emit_x64_vector: Vectorize fallback path in EmitVectorMinS8()
2020-04-22 20:46:17 +01:00
Lioncash
0fb6dce689
emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift
...
This can simply be merged with the previous one.
2020-04-22 20:46:17 +01:00
Lioncash
5b71b1337b
emit_x64_vector: Avoid left shift of negative value in LogicalVShift
...
Now that we handle the signed variants, we also have to be careful about left shifts with negative values,
as this is considered undefined behavior.
2020-04-22 20:46:17 +01:00
Lioncash
9954d28868
a64_jitstate: Zero SP and PC on construction of A64JitState
...
Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent
2020-04-22 20:46:17 +01:00
Lioncash
4efbd40ea4
backend_x64/callback: Default virtual destructor in the cpp file
...
Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)
2020-04-22 20:46:17 +01:00
Lioncash
edd0b5c8c7
a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks
...
It's well-defined to static_cast a void* to its proper type.
2020-04-22 20:46:17 +01:00
Lioncash
21974ee57e
backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants
...
Also adds IR opcodes to dispatch said variants
2020-04-22 20:46:17 +01:00
Lioncash
9fc89f0a0e
emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size
...
Similar changes were done in emit_x64_vector, but these were missed.
2020-04-22 20:46:17 +01:00
Lioncash
af28e89a13
emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()
2020-04-22 20:46:17 +01:00
Lioncash
0d20423ad5
emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()
2020-04-22 20:46:17 +01:00
Lioncash
d70ee7c0d1
emit_x64_vector: Use VBPROADCAST where applicable and available
...
Uses the instruction that does what it says in its name if available. Allows avoiding the use
of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.
2020-04-22 20:46:17 +01:00
Lioncash
26d77c6f09
ir: Add opcodes for performing vector deinterleaving
2020-04-22 20:46:17 +01:00
Lioncash
87ca63699f
emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
f17702f608
emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs
2020-04-22 20:46:17 +01:00
Lioncash
596a8dd1dd
emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs
...
Provides a better alternative to a fallback operation.
2020-04-22 20:46:17 +01:00
Lioncash
75fd4eaaaa
emit_x64_vector: Get rid of some magic numbers in loop bounds
2020-04-22 20:46:17 +01:00
Lioncash
7b80ac25eb
emit_x64_vector: Generify variable shift functions
2020-04-22 20:46:17 +01:00
Lioncash
38fa984b53
IR: Add opcode for packed word->f32 conversions
2020-04-22 20:46:16 +01:00
Lioncash
64b1f2d468
ir: Add opcode for reversing bits in a vector
2020-04-22 20:46:15 +01:00
Lioncash
e33dcce14a
ir: Add opcodes for performing vector absolute values
2020-04-22 20:46:15 +01:00
MerryMage
3472f371df
IR: Implement VectorExtract, VectorExtractLower IR instructions
2020-04-22 20:46:15 +01:00
MerryMage
5c47f03888
A64: Implement FMUL (vector)
2020-04-22 20:46:15 +01:00
Lioncash
ad5cf584ce
ir: Add opcodes for performing vector unsigned absolute differences
2020-04-22 20:46:15 +01:00
Lioncash
701f43d61e
IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords
...
I should have added this when I introduced the functions for interleaving
low-order equivalents for consistency in the interface.
2020-04-22 20:46:15 +01:00
Lioncash
3985f7bf84
emit_x64_data_processing: Deduplicate some code in zero-extension functions
...
EmitZeroExtendByteToLong() can be implemented in terms of EmitZeroExtendByteToWord() and
EmitZeroExtendHalfToLong() can be implemented in terms of EmitZeroExtendHalfToWord().
2020-04-22 20:46:15 +01:00
MerryMage
e7b60189b3
abi: Missing includes'
2020-04-22 20:46:15 +01:00
MerryMage
cdc5c3ad95
emit_x64_floating_point: Near jump instead of short jump in FPMinNumberic{32,64}
2020-04-22 20:46:15 +01:00
MerryMage
df4ee0f51e
emit_X64_floating_point: Near jmp to end instead of short jmp
...
Jump destination can be further than what can be reached in a short
jump under some FPCR options.
2020-04-22 20:46:15 +01:00
Lioncash
b8d5765f9b
emit_x64_vector: Fix typo in VectorShuffleImpl
...
This is supposed to be pshufd, not pshufw (which only allows a 64-bit operand)
2020-04-22 20:46:15 +01:00
Lioncash
6b0010c940
ir: Add IR opcodes for emitting vector shuffles
...
This uses the ARM terminology for sizes (Halfword -> 2 bytes, Word -> 4 bytes)
as opposed to the x86 terminology of (Word -> 2 bytes, Double word -> 4 bytes)
2020-04-22 20:46:15 +01:00
Lioncash
eb2d28d2b1
emit_x64_vector_floating_point: Fix out of bounds array access in EmitVectorOperation64
2020-04-22 20:46:15 +01:00
MerryMage
49cc6d7fad
A64: Implement FDIV (vector)
2020-04-22 20:46:15 +01:00
MerryMage
c832cec96d
Correct FPSR and FPCR
2020-04-22 20:46:15 +01:00
MerryMage
147284427b
A64: Implement USHL
2020-04-22 20:46:15 +01:00
MerryMage
e4697b1676
A64: Implement system register TPIDR_EL0
2020-04-22 20:46:15 +01:00
MerryMage
e3da92024e
A64: Implement system registers FPCR and FPSR
2020-04-22 20:46:15 +01:00
MerryMage
9e4e4e9c1d
A64: Implement system register CNTPCT_EL0
2020-04-22 20:46:15 +01:00
MerryMage
1e15283d00
A64: Implement system register CTR_EL0
2020-04-22 20:46:15 +01:00
MerryMage
710d09471b
IR: Add IR instruction ZeroVector
2020-04-22 20:46:15 +01:00
MerryMage
2721bb5ace
emit_x64_floating_point: Add maybe_unused to preprocess parameter
2020-04-22 20:46:15 +01:00
MerryMage
0575e7421b
A64: Implement FMINNM (scalar)
2020-04-22 20:46:15 +01:00
MerryMage
1c9804ea07
A64: Implement FMAXNM (scalar)
2020-04-22 20:46:15 +01:00
MerryMage
1dfce0894d
constant_pool: Add frame parameter
2020-04-22 20:46:14 +01:00
MerryMage
84f1c9b7f4
reg_alloc: Only exchange GPRs
2020-04-22 20:46:14 +01:00
MerryMage
6541ec064d
emit_x64_floating_point: Correct FP{Max,Min}{32,64} implementations for -0/+0
2020-04-22 20:46:14 +01:00
MerryMage
7c193485e1
a64/config: Allow NaN emulation accuracy to be set
2020-04-22 20:46:14 +01:00
MerryMage
a3df46a75a
a64_emit_x64: Add conf to A64EmitContext
2020-04-22 20:46:14 +01:00
MerryMage
07520f32c3
backend_x64: Accurately handle NaNs
2020-04-22 20:46:14 +01:00
MerryMage
e97581d063
fuzz_with_unicorn: Print AArch64 disassembly
2020-04-22 20:46:14 +01:00
MerryMage
47c0ad0fc8
IR: Implement Vector{Max,Min}{Signed,Unsigned}
2020-04-22 20:46:14 +01:00
MerryMage
f4775910f5
IR: Implement VectorGreaterSigned
2020-04-22 20:46:14 +01:00
MerryMage
1f5b3bca43
Exclusive fixups
...
* Incorrect size of exclusive_address
* Disable tests on exclusive memory instructions for now
2020-04-22 20:46:14 +01:00
MerryMage
f3fa4a042f
a64_emit_x64: EmitExclusiveWrite: Make MSVC happy (narrowing conversion warning)
2020-04-22 20:46:14 +01:00
MerryMage
8698f057d0
A64: Implement STXP, STLXP, LDXP, LDAXP
2020-04-22 20:46:14 +01:00
MerryMage
b7a2c1a7df
A64: Implement STXRB, STXRH, STXR, STLXRB, STLXRH, STLXR, LDXRB, LDXRH, LDXR, LDAXRB, LDAXRH, LDAXR
2020-04-22 20:46:14 +01:00
MerryMage
a6cc667509
Direct Page Table Access: Handle address spaces less than the full 64-bit in size
2020-04-22 20:46:14 +01:00
MerryMage
f45a5e17c6
Implement direct page table access
2020-04-22 20:46:14 +01:00
MerryMage
bfd3e30c75
callbacks: Member functions should be const
2020-04-22 20:46:14 +01:00
MerryMage
9f2f08db8d
a64_emit_x64: Implement {Read,Write}Memory128 in terms of a function call
2020-04-22 20:46:14 +01:00
MerryMage
6c4773e85b
abi: Add RAX to ABI_ALL_CALLER_SAVE
2020-04-22 20:46:14 +01:00
MerryMage
8756487554
A64: Partially implement MRS
2020-04-22 20:46:14 +01:00
MerryMage
bfd65bedfe
A64: Implement DSB, DMB
2020-04-22 20:46:14 +01:00
MerryMage
5edd623b9d
Implement DC instructions
2020-04-22 20:46:14 +01:00
MerryMage
2cb0a699ba
IR: Implement FPMax, FPMin
2020-04-22 20:46:14 +01:00
MerryMage
98c8e7d1af
IR: Implement FPVectorAdd
2020-04-22 20:46:14 +01:00
MerryMage
eae518a338
IR: Implement VectorSignExtend
2020-04-22 20:46:14 +01:00
MerryMage
b9cd345ddc
IR: Implement FPVectorSub
2020-04-22 20:46:14 +01:00
MerryMage
851fc83445
emit_x64_vector: EmitOneArgumentFallback
2020-04-22 20:46:14 +01:00
MerryMage
303088a51e
IR: Implement VectorPopulationCount
2020-04-22 20:46:14 +01:00
MerryMage
bf2cd92da9
emit_x64_vector: Add SSE4.1 implementation for EmitVectorMultiply64
2020-04-22 20:46:14 +01:00
MerryMage
b062266b8e
emit_x64_vector: More explicit lambda decay
2020-04-22 20:46:14 +01:00
MerryMage
b6de612e01
IR: Implement VectorMultiply
2020-04-22 20:46:14 +01:00
MerryMage
90a053a5e4
emit_x64_vector: Order alphabetically
2020-04-22 20:46:14 +01:00
MerryMage
715ae1c229
IR: Implement VectorArithmeticShiftRight
2020-04-22 20:46:14 +01:00
MerryMage
132c783320
IR: Implement VectorNarrow
2020-04-22 20:46:13 +01:00
MerryMage
1423584f9f
constant_pool: Allow for 128-bit constants
2020-04-22 20:46:13 +01:00
MerryMage
69de50a878
emit_x64_vector: Add SSE4.1 implementations for VectorZeroExtend
2020-04-22 20:46:13 +01:00
MerryMage
cbc9f361b0
IR: Implement VectorSub
2020-04-22 20:46:13 +01:00
MerryMage
b22c5961f9
IR: Implement VectorLogicalShiftRight
2020-04-22 20:46:13 +01:00
MerryMage
59ace60b03
IR: Implement VectorZeroExtend
2020-04-22 20:46:13 +01:00
MerryMage
f6247125c0
IR: Implement VectorLogicalShiftLeft{8,16,32,64}
2020-04-22 20:46:13 +01:00
MerryMage
15e8231f24
opcodes: Sort vector IR opcodes alphabetically
2020-04-22 20:46:13 +01:00
MerryMage
d74f4e35f6
block_of_code: Increase constant pool size
2020-04-22 20:46:13 +01:00
MerryMage
e69288f803
devirtualize: MinGW uses Intanium MFP ABI
2020-04-22 20:46:13 +01:00
MerryMage
ad428cbd7a
callback: Properly handle calls with return pointers and simplify interface
2020-04-22 20:46:13 +01:00
MerryMage
7a87e3fc55
devirtualize: Handle Windows ABI
2020-04-22 20:46:13 +01:00
MerryMage
f808a0fbde
devirtualize: Devirtualize Itanium ABI MFPs at runtime
2020-04-22 20:46:13 +01:00
Lioncash
35a29a9665
A64: Implement ZIP1
2020-04-22 20:46:13 +01:00
FernandoS27
586854117b
Implemented UMULH and SMULH instructions
2020-04-22 20:46:13 +01:00
MerryMage
44c3c2312a
a64_jitstate: Remove unnecessary FPSCR_nzcv member
2020-04-22 20:46:13 +01:00
MerryMage
aac5af50e2
IR: FPCompare{32,64} now return NZCV flags instead of implicitly setting them
2020-04-22 20:46:13 +01:00
MerryMage
dd2a6684fe
IR: Add ConditionalSelectNZCV instruction
2020-04-22 20:46:13 +01:00
MerryMage
b173fcf34e
backend_x64: Simplify FPDoubleToU32 and FPSingleToU32
...
They're inaccurate in terms of FPSR at the moment anyway.
2020-04-22 20:46:13 +01:00
Lioncash
d040920727
Common: Put AES code within its own nested namespace
...
Prevents the functions from potentially clashing with other stuff in Common in the future
2020-04-22 20:46:13 +01:00
Lioncash
40614202e7
A64: Implement AESD
2020-04-22 20:46:13 +01:00
Lioncash
ccef85dbb7
A64: Implement AESE
2020-04-22 20:46:13 +01:00
MerryMage
68f46c8334
backend_x64: Use a reference to BlockOfCode instead of a pointer
2020-04-22 20:46:13 +01:00
MerryMage
8931ee346b
IR: Add IR instruction NZCVFromPackedFlags
...
This instruction expects NZCV to be in the high bits.
i.e.: The positions they were in PSTATE.
2020-04-22 20:46:13 +01:00
MerryMage
75b8a76630
a64_jitstate: A64 does not have a seperate FPSCR.NZCV
2020-04-22 20:46:13 +01:00
MerryMage
6414736a8d
emit_x64_vector: bug: VectorGetElement8 returning incorrect values for non-SSE4.1
...
This bug wasn't discovered earlier because we previously only used index == 0.
2020-04-22 20:46:13 +01:00
MerryMage
ebfc51c609
IR: Implement VectorSetElement{8,16,32,64}
2020-04-22 20:46:13 +01:00
Lioncash
a5c4fbc783
A64: Implement AESIMC and AESMC
2020-04-22 20:46:13 +01:00
Lioncash
ab9b5fb8aa
Common: Relocate common bits of CRC32
...
Allows the algorithm to be used in any other potential backend.
2020-04-22 20:46:12 +01:00
Lioncash
af1384d700
A64: Implement CRC32
2020-04-22 20:46:12 +01:00