Commit graph

2144 commits

Author SHA1 Message Date
Wunkolo
490160ef43 emit_x64_vector: GFNI implementation of ArithmeticShiftRightByte
The bit-matrix is generated up-front and added to the constant-pool.
I'm using an embedded 64-bit broadcast here (m64bcst), which is the particular
EVEX-encoded version of the instruction with AVX512VL+GFNI.

If it ever really matters, then we would ideally detect specific host
features like bare-GFNI and specific subsets of AVX512 and emit
the assembly based on that rather than by the entire Icelake uarch.
2020-11-07 15:29:12 +00:00
Wunkolo
7df235aefb emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftLeft8
Same principle as EmitVectorLogicalShiftRight8. An 8x8 Galois identity
matrix is bit-shifted to allow for arbitrary 8-bit-lane shifts.
2020-11-07 15:29:12 +00:00
Wunkolo
5cc646ffed emit_x64_vector: GFNI implementation of EmitVectorLogicalShiftRight8
Bit-shifting the GFNI identity matrix generates a new matrix that
applies lane-wise bitshifts as well. This allows for a fast
single-instruction implementation of a byte-lane bitshift.
2020-11-07 15:29:12 +00:00
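To make the matrix trick concrete, here is a scalar C++ model of one byte lane of gf2p8affineqb with an immediate of zero. It is only a sketch for illustration: `Gf2p8AffineByte`, `ShiftRightMatrix` and `ShiftLeftMatrix` are hypothetical names rather than functions from the emitter, which instead emits a single vector instruction with the matrix as a constant operand.

```cpp
#include <cstdint>

// 8x8 identity matrix in the row layout gf2p8affineqb expects.
constexpr std::uint64_t kIdentityMatrix = 0x0102040810204080ULL;

// Scalar model of one byte lane with imm8 = 0: result bit i is the parity of
// (matrix byte [7 - i] AND x).
inline std::uint8_t Gf2p8AffineByte(std::uint64_t matrix, std::uint8_t x) {
    std::uint8_t result = 0;
    for (int i = 0; i < 8; ++i) {
        const unsigned row = static_cast<unsigned>(matrix >> (8 * (7 - i))) & 0xFFu;
        unsigned bits = row & x;
        // Fold the masked bits down to a single parity bit.
        bits ^= bits >> 4;
        bits ^= bits >> 2;
        bits ^= bits >> 1;
        result = static_cast<std::uint8_t>(result | ((bits & 1u) << i));
    }
    return result;
}

// Moving the identity rows by whole bytes shifts every lane by n bits.
// Amounts of 8 or more clear the lane entirely.
constexpr std::uint64_t ShiftRightMatrix(unsigned n) {
    return n < 8 ? kIdentityMatrix << (8 * n) : 0;
}

constexpr std::uint64_t ShiftLeftMatrix(unsigned n) {
    return n < 8 ? kIdentityMatrix >> (8 * n) : 0;
}
```

In this model, shifting the matrix left by whole bytes yields a logical right shift of the lane, and shifting it right yields a logical left shift.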
MerryMage
46f96904db decoder_detail: Add check for N==0 to GetArgInfo 2020-10-11 22:12:21 +01:00
Wunkolo
6bb49726f4 emit_x64_vector: GFNI+SSSE3 implementation of EmitVectorReverseBits
Performs a full 128-bit bit-reversal using only two instructions.

First by reversing all the bits of each byte using a Galois matrix
multiplication (vgf2p8affineqb, Icelake), and then by reversing the bytes
themselves (pshufb, SSSE3).
2020-10-02 05:56:59 +01:00
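The two steps can be modelled in scalar C++ as follows. This is a sketch under stated assumptions, not the emitted code: `Vector128`, `AffineByte` and `ReverseBits128` are hypothetical names, the vector is treated as 16 bytes with index 0 as the least significant, and the per-byte matrix multiply is emulated with a loop instead of the actual instruction.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vector128 = std::array<std::uint8_t, 16>;

// Matrix whose row for result bit i selects source bit 7 - i.
constexpr std::uint64_t kReverseBitsMatrix = 0x8040201008040201ULL;

// Scalar model of one gf2p8affineqb byte lane with imm8 = 0:
// result bit i is the parity of (matrix byte [7 - i] AND x).
inline std::uint8_t AffineByte(std::uint64_t matrix, std::uint8_t x) {
    std::uint8_t result = 0;
    for (int i = 0; i < 8; ++i) {
        const unsigned row = static_cast<unsigned>(matrix >> (8 * (7 - i))) & 0xFFu;
        unsigned bits = row & x;
        bits ^= bits >> 4;
        bits ^= bits >> 2;
        bits ^= bits >> 1;
        result = static_cast<std::uint8_t>(result | ((bits & 1u) << i));
    }
    return result;
}

inline Vector128 ReverseBits128(const Vector128& value) {
    Vector128 result{};
    for (std::size_t i = 0; i < value.size(); ++i) {
        // Step 1 (the matrix multiply): reverse the bits inside the byte.
        // Step 2 (the byte shuffle): store it at the mirrored byte position.
        result[value.size() - 1 - i] = AffineByte(kReverseBitsMatrix, value[i]);
    }
    return result;
}
```

Applying `ReverseBits128` twice returns the original value, which is a convenient sanity check for the matrix constant.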
ReinUsesLisp
eb00bea1ff backend/x64/exception_handler_posix: Fix signal stack memory leak in SigHandler
std::malloc was being called inside SigHandler's constructor without a
std::free. This doesn't really matter as SigHandler is used as a
singleton and the OS will reclaim that memory. That said, properly
freeing memory keeps -fsanitize=address quiet.
2020-10-02 05:56:07 +01:00
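The shape of such a fix can be sketched as a small owning type that pairs the allocation with a release in its destructor. `AltStackOwner` is a hypothetical name used only for illustration; the actual SigHandler code is not reproduced here, and registering the memory with the OS as a signal stack is left out.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical owner for a signal-stack buffer: allocated once, freed on destruction.
class AltStackOwner {
public:
    explicit AltStackOwner(std::size_t size)
            : memory_(std::malloc(size)), size_(size) {}

    // Releasing the buffer here is what keeps leak checkers quiet at exit.
    ~AltStackOwner() { std::free(memory_); }

    // Copying would lead to a double free, so it is disabled.
    AltStackOwner(const AltStackOwner&) = delete;
    AltStackOwner& operator=(const AltStackOwner&) = delete;

    void* Data() const { return memory_; }
    std::size_t Size() const { return size_; }

private:
    void* memory_;
    std::size_t size_;
};
```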
Wunkolo
c2d5f6da90 block_of_code: Add HasAVX512_Icelake
Detect AVX512 feature support up to the [Icelake-level featureset](https://en.wikipedia.org/wiki/AVX-512#CPUs_with_AVX-512)
2020-09-19 15:20:40 +01:00
Lioncash
0e1112b7df Revert "basic_block: Mark move constructor and assignment as noexcept"
This reverts commit 4f12e86ebb.

Big fan of MSVC preventing standard behavior.
2020-08-14 16:49:40 -04:00
Lioncash
889635d17d general: Resolve -Wmissing-prototypes warnings 2020-08-14 14:50:09 -04:00
Lioncash
68fea20020 common/assert: Resolve several -Wextra-semi warnings
Resolves 200+ warnings.
2020-08-14 14:45:53 -04:00
Lioncash
4f12e86ebb basic_block: Mark move constructor and assignment as noexcept
Allows the type to play better with standard library facilities
(also we shouldn't be throwing in move operations to begin with).
2020-08-14 14:38:28 -04:00
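One standard library facility where this matters is std::vector growth: when it reallocates, it moves existing elements only if their move constructor is noexcept and otherwise falls back to copying. The sketch below demonstrates this with a hypothetical `Tracked` type, which is not the project's block type, and a hypothetical helper `CopiesDuringGrowth`.

```cpp
#include <vector>

// Hypothetical element type; NoexceptMove toggles the move constructor's guarantee.
template <bool NoexceptMove>
struct Tracked {
    static int copies;
    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept(NoexceptMove) {}
};

template <bool NoexceptMove>
int Tracked<NoexceptMove>::copies = 0;

// Grows a vector through several reallocations and reports how many
// elements were copied (rather than moved) along the way.
template <bool NoexceptMove>
int CopiesDuringGrowth() {
    Tracked<NoexceptMove>::copies = 0;
    std::vector<Tracked<NoexceptMove>> blocks;
    for (int i = 0; i < 100; ++i) {
        blocks.emplace_back();
    }
    return Tracked<NoexceptMove>::copies;
}
```

With the noexcept move constructor the count stays at zero. Without it, every reallocation copies the existing elements.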
Lioncash
34f4d99454 block_of_code: Remove unused variables in GenRunCode()
These aren't used, so they can be removed.
2020-08-14 14:35:17 -04:00
Lioncash
29d1758923 ir_matcher: Add missing header guard 2020-08-14 14:32:34 -04:00
MerryMage
6bbc53839f Unsafe Optimization: Extend Unsafe_UnfuseFMA to all FMA-related instructions 2020-07-12 12:45:12 +01:00
MerryMage
d05d95c132 Improve documentation of unsafe optimizations 2020-07-12 12:41:11 +01:00
MerryMage
82417da780 emit_x64{_vector}_floating_point: Add unsafe optimizations for RSqrtEstimate and RecipEstimate 2020-07-11 14:05:57 +01:00
MerryMage
761e95eec0 A64: Add unsafe_optimizations option
* Strength reduce FMA unsafely
2020-07-06 21:02:30 +01:00
MerryMage
82868034d3 A32/ASIMD: Ensure decoder table is correct
* Raise a DecoderError instead of ASSERT-ing on a decode error
* Correct ASIMD decode table
* Write a test which verifies every possible ASIMD instruction
2020-07-05 18:45:42 +01:00
MerryMage
3c742960a9 simd_three_same: Ensure zero in upper for PairedMinMaxOperation 2020-07-04 11:25:36 +01:00
MerryMage
735738c7b6 A32: Implement ASIMD VPMAX, VPMIN (floating-point) 2020-07-04 11:04:10 +01:00
MerryMage
88e74cb2ba A32: Implement ASIMD VPMAX, VPMIN (integer) 2020-07-04 11:04:10 +01:00
MerryMage
d9914b1d51 simd_permute: Implement VectorUnzip with deinterleave lower 2020-07-04 11:04:10 +01:00
MerryMage
f35aaa017c IR: Add VectorDeinterleave{Even,Odd}Lower 2020-07-04 11:04:10 +01:00
MerryMage
df477c46c2 asimd_load_store_structures: VST1 undef correction 2020-07-04 11:04:10 +01:00
MerryMage
4ba1f8b9e7 Add optimization flags to disable specific optimizations 2020-07-04 11:04:10 +01:00
MerryMage
3eed024caf asimd_three_same: Ignore Q=1 for VPADD (floating-point) 2020-07-04 11:04:10 +01:00
MerryMage
896cb46c89 asimd_*: Standardize order of n and m to reduce confusion 2020-07-04 11:04:10 +01:00
MerryMage
4b8a781c04 emit_x64_floating_point: Introduce ICODE 2020-07-04 11:04:10 +01:00
MerryMage
7022281a0b emit_x64_vector_floating_point: Introduce ICODE 2020-07-04 11:04:10 +01:00
Merry
4f967387c0 asimd_three_regs: Reimplement asimd_VMLAL in terms of WideInstruction 2020-06-27 13:06:46 +01:00
Merry
7997404ee7 A32: Implement ASIMD V{ADD,SUB}{W,L} 2020-06-27 12:58:47 +01:00
Merry
868bd00ab5 A32: Rearrange translators for ASIMD Three Registers
* Separate Three Registers with Different Lengths from Same Lengths decoders
2020-06-27 11:15:07 +01:00
Merry
b1ff971a92 backend/x64: Temporarily avoid use of DefineValue(Argument&)
Issues with inappropriate values in upper bits of values
2020-06-27 10:52:59 +01:00
MerryMage
8a1f106dba decoder/asimd: Correct names of scalar exceptions 2020-06-25 17:40:11 +01:00
MerryMage
495f58eed8 A32: Implement ASIMD VSHLL 2020-06-24 23:47:13 +01:00
MerryMage
ed48a9d7d5 A32: Implement VFPv5 VRINTX 2020-06-24 22:31:58 +01:00
MerryMage
46445d0866 A64: Remove NaN accuracy setting
Always do accurate NaN handling.
2020-06-24 22:26:10 +01:00
Lioncash
b5df8d1ef8 A32: Implement ASIMD VQDMULL (scalar) 2020-06-23 18:19:42 +01:00
Lioncash
20a2bf29fc A32: Implement ASIMD VQRDMULH (scalar) 2020-06-23 18:19:42 +01:00
Lioncash
ab5efe8632 A32: Implement ASIMD VQDMULH (scalar) 2020-06-23 18:19:42 +01:00
MerryMage
2008fda88b emit_x64_floating_point: Correct error in s16 rounding in EmitFPToFixed 2020-06-22 22:54:38 +01:00
MerryMage
3ea49fc6d6 A32: Implement VFPv3 VCVT (between floating-point and fixed-point) 2020-06-22 22:08:58 +01:00
MerryMage
48b2ffdde9 A32: Implement ASIMD VQMOVUN, VQMOVN 2020-06-22 20:02:52 +01:00
MerryMage
52b8039367 A32: Implement VFPv5 VRINT{R,Z} 2020-06-22 19:35:32 +01:00
MerryMage
47bc99ad9f asimd_load_store_structures: Fix 2-byte aligned vld1.16
Previously incorrectly undefined
2020-06-22 18:46:22 +01:00
Lioncash
dd8d5497da A32: Implement ASIMD VQRDMULH 2020-06-22 17:31:57 +01:00
Lioncash
0b7a111b54 A32: Implement ASIMD VQDMULH 2020-06-22 17:31:57 +01:00
Lioncash
39488e4aad A32: Implement ASIMD VRSHRN 2020-06-21 23:15:43 +01:00
Lioncash
86b0e5c1c5 A32: Implement ASIMD VQSHRN 2020-06-21 23:15:43 +01:00
Lioncash
85222e3e65 A32: Implement ASIMD VQSHRUN
We can leverage ShiftRightNarrowing() to implement this.
2020-06-21 23:15:43 +01:00
MerryMage
562a98bcf9 A32: Implement ASIMD VCVT (between floating-point and fixed-point) 2020-06-21 20:23:40 +01:00
MerryMage
6f56043a73 A32: Implement ASIMD VFMA, VFMS 2020-06-21 20:21:53 +01:00
Lioncash
aa0358d324 A32: Implement ASIMD VMLAL/VMLSL (integer) 2020-06-21 20:03:19 +01:00
Lioncash
eab26b404a A32: Implement ASIMD VABAL 2020-06-21 20:01:08 +01:00
Lioncash
98581839ca A32: Implement ASIMD VABDL 2020-06-21 19:55:00 +01:00
MerryMage
db85e7ced5 asimd: Add missing three registers of different lengths instructions 2020-06-21 19:54:32 +01:00
Lioncash
95919594d1 A32: Implement ASIMD VQSHL/VQSHLU (immediate) 2020-06-21 19:26:30 +01:00
MerryMage
3557576ece A32: Implement ASIMD AESD, AESE, AESIMC, AESMC 2020-06-21 18:39:57 +01:00
Fernando Sahmkow
2fa1c1d13c A32: Allow cleaning up exclusive state from the interface.
This function is normally required for emulating certain OS mechanisms.
2020-06-21 18:18:33 +01:00
MerryMage
df58a429ee A32: Implement ASIMD VQRSHRN 2020-06-21 17:41:18 +01:00
MerryMage
589d717af5 A32: Implement ASIMD VQRSHRUN 2020-06-21 17:41:18 +01:00
MerryMage
e009d99924 A32: Implement ASIMD VSHRN 2020-06-21 17:41:18 +01:00
MerryMage
473949d486 asimd_load_store_structures: Suppress MSVC shift warning 2020-06-21 17:41:18 +01:00
MerryMage
8f0f1cfd66 A32: Implement ASIMD VST{1,2,3,4} (single n-element structure from one lane) 2020-06-21 16:27:33 +01:00
MerryMage
5a597f415c A32: Implement A32 VLD{1,2,3,4} (single n-element structure to one lane) 2020-06-21 16:22:43 +01:00
MerryMage
f221912409 bit_util: Bits without template arguments 2020-06-21 16:07:59 +01:00
MerryMage
3202e4c539 A32: Implement ASIMD VLD{1,2,3,4} (single n-element structure to all lanes) 2020-06-21 15:25:26 +01:00
MerryMage
d7197745ac emit_x64_vector_floating_point: fpcr_controlled is unused when fsize == 16 in EmitFPVectorToFixed 2020-06-21 14:46:06 +01:00
MerryMage
b32fc5ab0f a64_emit_x64: EmitVAddrLookup: Use bzhi instruction when silently_mirror_page_table is active and BMI2 is available 2020-06-21 14:46:06 +01:00
MerryMage
809dfe9c54 A32: Implement ASIMD VCVT (between floating-point and integer) 2020-06-21 14:28:25 +01:00
MerryMage
43a4b2a0b8 ir_emitter: Remove dummy fpcr_controlled arguments from scalar FP instructions 2020-06-21 14:28:25 +01:00
MerryMage
c836b389c8 emit_x64_vector_floating_point: Add fpcr_controlled argument to all IR instructions 2020-06-21 14:28:25 +01:00
MerryMage
33a81dae68 asimd: VEXT was being shadowed 2020-06-21 13:12:19 +01:00
MerryMage
bf093395d8 A32: Implement ASIMD VMOVN 2020-06-21 12:35:39 +01:00
MerryMage
c7785cd982 A32: Implement ASIMD VUZP and VZIP 2020-06-21 12:34:55 +01:00
MerryMage
603cd09c8f A32: Implement ASIMD VTRN 2020-06-21 12:14:13 +01:00
MerryMage
a8b481ab63 simd_permute: Implement TRN{1,2} in terms of VectorTranspose 2020-06-21 12:14:13 +01:00
MerryMage
7d1e103ff5 IR: Implement VectorTranspose 2020-06-21 12:14:13 +01:00
MerryMage
9cc11681dc A32: Implement ASIMD VMLAL, VMLSL, VMULL (scalar) 2020-06-21 10:31:30 +01:00
MerryMage
69a1d58a2b A32: Implement ASIMD VMULL 2020-06-21 10:00:24 +01:00
Lioncash
8c23f02330 A32: Implement ASIMD VABD 2020-06-21 07:54:21 +01:00
Lioncash
fc1633a2ea A32: Implement ASIMD VABA 2020-06-21 07:54:21 +01:00
Lioncash
bdb92f7055 asimd: Split out VABA/VABD decoders
These differ in bit encodings anyway
2020-06-21 07:54:21 +01:00
Lioncash
230fa02648 A32: Implement ASIMD VMLA/VMLS (scalar)
While we're at it, we can join the implementation of VMUL into a common
function.
2020-06-21 07:51:17 +01:00
MerryMage
239ee289cf A32: Implement VDUP (scalar) 2020-06-21 00:22:42 +01:00
Lioncash
a8efe3f0f5 A32: Implement ASIMD VACGE/VACGT 2020-06-21 00:02:48 +01:00
Lioncash
e319257ec0 A32: Implement VCEQ/VCGE/VCGT (floating point) 2020-06-21 00:02:48 +01:00
Lioncash
faefb264a6 A32: Implement ASIMD VCEQ (integer) 2020-06-21 00:02:48 +01:00
Lioncash
7276993352 A32: Implement ASIMD VCGE (integer) 2020-06-21 00:02:48 +01:00
Lioncash
7292320445 A32: Implement ASIMD VCGT (integer) 2020-06-21 00:02:48 +01:00
MerryMage
fda4e11887 A32: Implement ASIMD VMOV (general-purpose register to scalar) 2020-06-20 23:40:48 +01:00
MerryMage
7ec22b4e1d A32: Implement ASIMD VMOV (scalar to general-purpose register) 2020-06-20 23:30:56 +01:00
MerryMage
8bbc9fdbb6 A32: Implement ASIMD VTBX 2020-06-20 22:35:31 +01:00
Lioncash
06f7229c57 A32: Implement ASIMD VPADAL (integer) 2020-06-20 22:28:47 +01:00
Lioncash
266c6a2000 A32: Implement ASIMD VPADDL (integer) 2020-06-20 22:28:47 +01:00
Lioncash
4bb286ac23 A32: Implement ASIMD VPADD (integer) 2020-06-20 21:22:14 +01:00
Lioncash
1ffeeeb6a2 A32: Implement ASIMD VMAX/VMIN (integer) 2020-06-20 21:20:47 +01:00
Lioncash
945b757b6c A32: Implement ASIMD VMLA/VMLS (integer) 2020-06-20 21:20:21 +01:00
MerryMage
715db8381f A32: Implement ASIMD VMUL (scalar) 2020-06-20 20:34:08 +01:00
MerryMage
b0beecdd41 A32: Implement ASIMD VTBL 2020-06-20 19:25:14 +01:00
MerryMage
28f27bc19d A32: Implement ASIMD VEXT 2020-06-20 19:05:14 +01:00
MerryMage
e8c460c167 A32: Implement ASIMD VDUP (ARM core register) 2020-06-20 16:02:43 +01:00
MerryMage
15ee562dd0 decoder/asimd: Add misc data-processing instructions 2020-06-20 15:39:00 +01:00
MerryMage
92cb4a5a34 A32: Implement ASIMD VRSQRTE 2020-06-20 15:13:22 +01:00
MerryMage
6f59c2cd8e A32: Implement ASIMD VRECPE 2020-06-20 15:07:06 +01:00
MerryMage
d3dc50d718 A32: Implement ASIMD VRSQRTS 2020-06-20 15:06:06 +01:00
MerryMage
8f506c80c3 A32: Implement ASIMD VRECPS 2020-06-20 14:39:05 +01:00
MerryMage
9eef4f7471 A32: Implement ASIMD VMLA, VMLS (floating-point) 2020-06-20 14:31:06 +01:00
MerryMage
60f6e729ac A32: Implement ASIMD VABD (floating-point) 2020-06-20 14:25:04 +01:00
MerryMage
f58e247ef3 A32: Implement ASIMD VPADD (floating-point) 2020-06-20 14:25:04 +01:00
MerryMage
e006f0a205 A32: Implement ASIMD VSUB (floating-point) 2020-06-20 14:20:28 +01:00
MerryMage
4c939b9d0a A32: Implement ASIMD VADD (floating-point) 2020-06-20 14:20:28 +01:00
MerryMage
5ec8e48593 A32: Implement ASIMD VMUL (floating-point)
* Also add fpcr_controlled arguments to FPVectorMul IR instruction
* Merge ASIMD floating-point instruction implementations
2020-06-20 14:20:28 +01:00
MerryMage
bb4f3aa407 A32: Implement ASIMD VMAX, VMIN (floating-point) 2020-06-20 03:21:07 +01:00
Lioncash
8d067d5d60 A32: Implement ASIMD VMUL (integer and polynomial) 2020-06-20 00:53:56 +01:00
Lioncash
ed6ca58058 A32: Implement ASIMD VCEQ, VCGE, VCGT, VCLE, VCLT with zero
Fairly self-explanatory, we can leverage the existing IR functions for
the purpose of these instructions.

In the integer case, we can just insert function pointers
into an array and index it, given all comparison primitives exist
already for the integer side of things.
2020-06-20 00:50:40 +01:00
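The table-indexing idea for the integer case can be sketched like this. All names (`ZeroComparison`, `CompareFn`, `CompareWithZero` and the `Is*` helpers) are hypothetical, and plain functions on a 32-bit integer stand in for the existing comparison primitives that the translator would call.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical comparison kinds, in the order used to index the table.
enum class ZeroComparison : std::size_t { EQ, GE, GT, LE, LT };

using CompareFn = bool (*)(std::int32_t);

inline bool IsEQ(std::int32_t v) { return v == 0; }
inline bool IsGE(std::int32_t v) { return v >= 0; }
inline bool IsGT(std::int32_t v) { return v > 0; }
inline bool IsLE(std::int32_t v) { return v <= 0; }
inline bool IsLT(std::int32_t v) { return v < 0; }

// Select the comparison by indexing an array of function pointers
// instead of branching on the comparison kind.
inline bool CompareWithZero(ZeroComparison type, std::int32_t value) {
    static const std::array<CompareFn, 5> table{{IsEQ, IsGE, IsGT, IsLE, IsLT}};
    return table[static_cast<std::size_t>(type)](value);
}
```

The enumerator order must match the table order, which is the main thing to keep in sync with this approach.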
MerryMage
656419286c ir: Add fpcr_controlled argument to FPVector{Equal,Greater,GreaterEqual} 2020-06-20 00:50:40 +01:00
MerryMage
1b3a70a83c backend/x64: Implement separate MXCSR for ASIMDStandardValue 2020-06-20 00:00:36 +01:00
MerryMage
d3664b03fe ir_emitter: Default fpcr_controlled arguments to true 2020-06-19 22:51:23 +01:00
Lioncash
794440cf8d A32: Implement ASIMD VRSHL 2020-06-19 21:27:48 +01:00
Lioncash
682621ef1a A32: Implement ASIMD VQSHL (register) 2020-06-19 21:27:48 +01:00
Lioncash
e46fb98cc5 A32: Implement ASIMD VSHL (register) 2020-06-19 21:27:48 +01:00
MerryMage
ad96b2b18d VFPv5: Implement VCVT{A,N,P,M} 2020-06-19 20:31:43 +01:00
MerryMage
6a965b80d6 VFPv5: Implement VRINT{A,N,P,M} 2020-06-19 20:24:13 +01:00
MerryMage
3e252cdbfc VFPv5: Implement VSEL 2020-06-19 19:44:45 +01:00
MerryMage
669d05caca VFPv5: Implement VMINNM 2020-06-19 19:44:45 +01:00
MerryMage
6e7ea151a3 VFPv5: Implement VMAXNM 2020-06-19 19:39:01 +01:00
MerryMage
4df3b2f97f vfp: Add decoders for VFPv5
These instructions were introduced in the Cortex-M7
2020-06-19 19:24:32 +01:00
MerryMage
55c021fe82 emit_x64_aes: AESNI implementations of all opcodes 2020-06-19 12:11:45 +01:00
Lioncash
551e207661 A32: Implement ASIMD VSUB (integer) 2020-06-19 11:31:38 +01:00
Lioncash
4d6f68525d A32: Implement ASIMD VADD (integer) 2020-06-19 11:31:38 +01:00
Lioncash
fbdae61c13 A32: Implement ASIMD VMVN (register)
Fairly straightforward
2020-06-19 11:31:14 +01:00
MerryMage
b759773b3b a32_emit_x64: EmitVAddrLookup: Use 64-bit registers where required 2020-06-19 00:44:52 +01:00
merry
687c604197 Merge pull request #532 from lioncash/shift
A32: Implement several ASIMD shift instructions
2020-06-19 00:22:18 +01:00
MerryMage
7dd9901de2 a32_emit_x64: Incorrect type in ExclusiveWriteMemory 2020-06-19 00:19:46 +01:00
Lioncash
00b2f9b319 asimd: Prevent misdecodes from occurring
Pointed out by Mary when reviewing the shift code.
2020-06-18 15:04:48 -04:00
MerryMage
87f6e412d0 emit_x64_vector: SSE4.1 implementation of EmitVectorPolynomialMultiply{Long}8 2020-06-18 18:44:00 +01:00
MerryMage
f5b41aabc6 emit_x64_vector: Implement EmitVectorPolynomialMultiplyLong64 in terms of pclmulqdq 2020-06-18 18:04:23 +01:00
MerryMage
d34763242c Revert "A32: Implement ASIMD VCEQ, VCGE, VCGT, VCLE, VCLT with zero"
This reverts commit 179951b10f.

These instructions require StandardFPSCRValue.
2020-06-18 17:38:40 +01:00
Lioncash
179951b10f A32: Implement ASIMD VCEQ, VCGE, VCGT, VCLE, VCLT with zero
Fairly self-explanatory, we can leverage the existing IR functions for
the purpose of these instructions.

In the integer case, we can just insert function pointers
into an array and index it, given all comparison primitives exist
already for the integer side of things.
2020-06-18 17:01:57 +01:00
Lioncash
6ca20c2fe3 A32: Implement ASIMD VSLI 2020-06-18 11:51:08 -04:00
Lioncash
887732d8a8 A32: Implement ASIMD VSRI 2020-06-18 11:28:12 -04:00
Lioncash
8b98c91ecc A32: Implement ASIMD VSHL 2020-06-18 11:18:33 -04:00
Lioncash
69c999bc66 A32: Implement ASIMD VRSRA
Now that we have the accumulation and rounding code in place, VRSRA is
extremely trivial to implement.
2020-06-18 11:03:39 -04:00
Lioncash
14fdd15199 A32: Implement ASIMD VRSHR 2020-06-18 11:00:45 -04:00
Lioncash
276e0b71dc A32: Implement ASIMD VSRA 2020-06-18 11:00:27 -04:00
Lioncash
054dff7cd5 A32: Implement ASIMD VTST 2020-06-18 15:34:05 +01:00
Lioncash
6c142bc5cc A32: Implement ASIMD VSHR 2020-06-18 10:30:20 -04:00
MerryMage
13367a7efd A64: Match A32 page_table code
Here we increase the similarity between the A64 and A32 front-ends in terms of their
page_table handling code. In this commit, we:

* Reserve and use r14 as a register to store the page_table pointer.
* Align the code to be more similar in structure.
* Add a conf member to A32EmitContext.
* Remove scratch argument from EmitVAddrLookup.
2020-06-18 12:22:59 +01:00
Lioncash
08350d06f1 A32: Implement ASIMD VQNEG 2020-06-18 09:49:29 +01:00