dynarmic

Author	SHA1	Message	Date
MerryMage	fa8925c4df	IR: Implement FPVectorMulX	2020-04-22 20:57:37 +01:00
MerryMage	f0920c0ded	Fix VShift terminology An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift. - Rename VectorLogicalVShiftS* -> VectorArithmeticVShift* - Rename VectorLogicalVShiftU* -> VectorLogicalVShift*	2020-04-22 20:55:50 +01:00
Lioncash	d426dfe942	ir: Add opcodes for unsigned saturating left shifts	2020-04-22 20:55:06 +01:00
MerryMage	02150bc0b7	IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed	2020-04-22 20:55:06 +01:00
MerryMage	8051f60db0	opcodes.inc: Align columns to a tabstop of 4	2020-04-22 20:55:06 +01:00
MerryMage	90193b0e3d	IR: Add fbits argument to FixedToFP-related opcodes	2020-04-22 20:55:06 +01:00
Lioncash	b14eaaec46	ir: Add opcodes for left signed saturated shifts	2020-04-22 20:55:06 +01:00
MerryMage	3e447614c6	IR: Add VectorSignedSaturatedDoublingMultiplyLong	2020-04-22 20:55:06 +01:00
MerryMage	06b31448aa	emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply * Return both the upper and lower parts of the multiply if required * SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead * Improve port utilisation where possible (punpck instructions were a bottleneck)	2020-04-22 20:55:06 +01:00
MerryMage	08c0e017a5	IR: Implement Vector{Signed,Unsigned}Multiply{16,32}	2020-04-22 20:55:06 +01:00
Lioncash	e739624296	ir: Add opcodes for vector CLZ operations We can optimize these cases further with the use of a fair bit of shuffling via pshufb and the use of masks, but given the uncommon use of this instruction, I wouldn't consider it to be beneficial in terms of amount of code to be worth it over a simple manageable naive solution like this. If we ever do hit a case where vectorized CLZ happens to be a bottleneck, then we can revisit this. At least with AVX-512CD, this can be done with a single instruction for the 32-bit word case.	2020-04-22 20:55:05 +01:00
Lioncash	d4a76aaa04	ir: Add opcodes for unsigned saturated accumulations of signed values	2020-04-22 20:55:05 +01:00
Lioncash	6f911a26da	ir: Add opcodes for signed saturated accumulations of unsigned values	2020-04-22 20:55:05 +01:00
Lioncash	b6e74fd17d	ir: Add opcodes for performing unsigned reciprocal square root estimates	2020-04-22 20:55:05 +01:00
Lioncash	af83360f89	ir: Add opcodes for unsigned reciprocal estimate	2020-04-22 20:55:05 +01:00
Lioncash	fca7eddb9e	A64: Add opcodes for signed saturating negations	2020-04-22 20:53:46 +01:00
Lioncash	7ebfd0f31c	ir: Add opcodes for scalar signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	a0231e5546	ir: Add opcodes for signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	0507e47420	ir: Add opcodes for signed saturated absolute values	2020-04-22 20:53:46 +01:00
MerryMage	89d08c7d61	IR: Add VectorTable and VectorTableLookup IR instructions	2020-04-22 20:53:45 +01:00
MerryMage	0288974512	opcodes: Cleanup opcodes table * Remove T:: prefix from types. * Add another column for a 4th argument.	2020-04-22 20:53:45 +01:00
Lioncash	0efa2ce3b0	ir: Add opcodes for performing rounding left shifts	2020-04-22 20:53:45 +01:00
Lioncash	f3f60cd179	A64: Implement ISB Given we want to ensure that all instructions are fetched again, we can treat an ISB instruction as a code cache flush.	2020-04-22 20:53:45 +01:00
MerryMage	d1d6f4feb5	system: Implement MRS CNTFRQ_EL0	2020-04-22 20:53:45 +01:00
Lioncash	acbaf04fef	ir: Add opcodes for unsigned saturating add and subtract	2020-04-22 20:46:23 +01:00
MerryMage	17f73974f2	IR: Implement FPMulX IR instruction	2020-04-22 20:46:23 +01:00
MerryMage	f976c47008	IR: Initial implementation of FPVectorRoundInt	2020-04-22 20:46:23 +01:00
MerryMage	10e196480f	IR: Generalise SignedSaturated{Add,Sub} to support more bitwidths	2020-04-22 20:46:23 +01:00
Lioncash	463b9a3d02	ir: Add opcodes for vector paired maximum and minimums For the time being, we can just do a naive implementation which avoids falling back to the interpreter a bit. Horizontal operations aren't necessarily x86 SIMD's forte anyways.	2020-04-22 20:46:23 +01:00
Lioncash	2501bfbfae	ir: Add opcodes for performing scalar integral min/max	2020-04-22 20:46:23 +01:00
Lioncash	7fdd8b0197	A64: Implement PMULL{2}	2020-04-22 20:46:23 +01:00
Lioncash	affa312d1d	ir: Add opcode for performing polynomial multiplication	2020-04-22 20:46:22 +01:00
MerryMage	507bcd8b8b	IR: Implement FPVectorTo{Signed,Unsigned}Fixed	2020-04-22 20:46:22 +01:00
MerryMage	7b03da86c2	IR: Implement FPVector{Max,Min}	2020-04-22 20:46:22 +01:00
MerryMage	901bd9b4e2	IR: Implement FPRecipStepFused, FPVectorRecipStepFused	2020-04-22 20:46:22 +01:00
MerryMage	939f5f5c7a	IR: Implement FPVectorRecipEstimate	2020-04-22 20:46:22 +01:00
MerryMage	c1dcfe29f7	IR: Implement FPRecipEstimate	2020-04-22 20:46:22 +01:00
MerryMage	04f325a05e	IR: Implement FPVectorNeg	2020-04-22 20:46:22 +01:00
MerryMage	771a4fc20b	IR: Implement FPVectorMulAdd	2020-04-22 20:46:22 +01:00
MerryMage	ecbf9dbae5	IR: Implement A64OrQC	2020-04-22 20:46:22 +01:00
MerryMage	b455b566e7	A64: Implement UQXTN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	3874cb37e3	A64: Implement SQXTN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	f020dbe4ed	A64: Implement SQXTUN	2020-04-22 20:46:22 +01:00
MerryMage	b2e4c16ef8	A64: Implement FRSQRTS (vector), single/double variant	2020-04-22 20:46:22 +01:00
MerryMage	45dc5f74f3	A64: Implement FRSQRTE (vector), single/double variant	2020-04-22 20:46:22 +01:00
MerryMage	506e544bfe	IR: Implement FPRSqrtStepFused	2020-04-22 20:46:22 +01:00
MerryMage	bde58b04d4	IR: Implement FPRSqrtEstimate	2020-04-22 20:46:21 +01:00
MerryMage	e18fca17dc	A64: Implement FABD in terms of existing IR instructions Fixes NaN issue. Closes #306.	2020-04-22 20:46:21 +01:00
MerryMage	b228694012	IR: Implement FPRoundInt	2020-04-22 20:46:20 +01:00
MerryMage	33fa65de23	A64: Implement FADDP (vector)	2020-04-22 20:46:19 +01:00
MerryMage	9dba273a8c	A64: Implement SADDLP	2020-04-22 20:46:19 +01:00
MerryMage	70ff2d73b5	A64: Implement UADDLP	2020-04-22 20:46:19 +01:00
MerryMage	caaf36dfd6	IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64} This implementation just falls back to the software floating point implementation.	2020-04-22 20:46:19 +01:00
Lioncash	4aa4885ba7	ir: Add opcodes for vector conversion of u32/u64 to floating-point	2020-04-22 20:46:19 +01:00
Lioncash	7a84b6e8d8	ir: Add opcodes for converting S64 and U64 to single-precision floating-point values	2020-04-22 20:46:19 +01:00
Lioncash	3a41465eaf	ir: Add opcodes for converting S64 and U64 to double-precision values	2020-04-22 20:46:18 +01:00
Lioncash	81e572c78c	ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16	2020-04-22 20:46:18 +01:00
Lioncash	fc731dddae	ir: Add opcodes for performing vector absolute floating-point values This will be usable for implementing FACGE and FACGT	2020-04-22 20:46:18 +01:00
Lioncash	8a4f8aed06	ir: Add opcode for performing FP vector absolute differences	2020-04-22 20:46:18 +01:00
MerryMage	8c90fcf58e	IR: Implement FPMulAdd	2020-04-22 20:46:18 +01:00
Lioncash	c695da1cf3	ir: Add opcode for floating-point GE and GT comparisons The rest of the comparisons can be implemented in terms of these two	2020-04-22 20:46:18 +01:00
Lioncash	5ce187a54e	ir: Add opcodes for floating-point vector equalities	2020-04-22 20:46:18 +01:00
Lioncash	bc718c5b28	ir: Add opcodes for performing rounding halving adds	2020-04-22 20:46:18 +01:00
Lioncash	1e10017f4b	ir: Add opcodes for signed absolute differences	2020-04-22 20:46:17 +01:00
Lioncash	3f6c529da2	ir: Add opcode to perform the vector conversion S64->F64 Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us	2020-04-22 20:46:17 +01:00
Lioncash	44a5f8095a	ir: Add opcodes for performing vector halving subtracts	2020-04-22 20:46:17 +01:00
Lioncash	b312d28295	ir: Add an opcode for doing an SM4 lookup table query	2020-04-22 20:46:17 +01:00
Lioncash	089096948a	ir: Add opcodes for performing halving adds	2020-04-22 20:46:17 +01:00
Lioncash	21974ee57e	backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants Also adds IR opcodes to dispatch said variants	2020-04-22 20:46:17 +01:00
Lioncash	26d77c6f09	ir: Add opcodes for performing vector deinterleaving	2020-04-22 20:46:17 +01:00
Lioncash	38fa984b53	IR: Add opcode for packed word->f32 conversions	2020-04-22 20:46:16 +01:00
Lioncash	64b1f2d468	ir: Add opcode for reversing bits in a vector	2020-04-22 20:46:15 +01:00
Lioncash	e33dcce14a	ir: Add opcodes for performing vector absolute values	2020-04-22 20:46:15 +01:00
MerryMage	3472f371df	IR: Implement VectorExtract, VectorExtractLower IR instructions	2020-04-22 20:46:15 +01:00
MerryMage	5c47f03888	A64: Implement FMUL (vector)	2020-04-22 20:46:15 +01:00
Lioncash	ad5cf584ce	ir: Add opcodes for performing vector unsigned absolute differences	2020-04-22 20:46:15 +01:00
Lioncash	701f43d61e	IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords I should have added this when I introduced the functions for interleaving low-order equivalents for consistency in the interface.	2020-04-22 20:46:15 +01:00
Lioncash	6b0010c940	ir: Add IR opcodes for emitting vector shuffles This uses the ARM terminology for sizes (Halfword -> 2 bytes, Word -> 4 bytes) as opposed to the x86 terminology of (Word -> 2 bytes, Double word -> 4 bytes)	2020-04-22 20:46:15 +01:00
MerryMage	49cc6d7fad	A64: Implement FDIV (vector)	2020-04-22 20:46:15 +01:00
MerryMage	147284427b	A64: Implement USHL	2020-04-22 20:46:15 +01:00
MerryMage	e4697b1676	A64: Implement system register TPIDR_EL0	2020-04-22 20:46:15 +01:00
MerryMage	e3da92024e	A64: Implement system registers FPCR and FPSR	2020-04-22 20:46:15 +01:00
MerryMage	9e4e4e9c1d	A64: Implement system register CNTPCT_EL0	2020-04-22 20:46:15 +01:00
MerryMage	1e15283d00	A64: Implement system register CTR_EL0	2020-04-22 20:46:15 +01:00
MerryMage	710d09471b	IR: Add IR instruction ZeroVector	2020-04-22 20:46:15 +01:00
MerryMage	0575e7421b	A64: Implement FMINNM (scalar)	2020-04-22 20:46:15 +01:00
MerryMage	1c9804ea07	A64: Implement FMAXNM (scalar)	2020-04-22 20:46:15 +01:00
MerryMage	47c0ad0fc8	IR: Implement Vector{Max,Min}{Signed,Unsigned}	2020-04-22 20:46:14 +01:00
MerryMage	f4775910f5	IR: Implement VectorGreaterSigned	2020-04-22 20:46:14 +01:00
MerryMage	8698f057d0	A64: Implement STXP, STLXP, LDXP, LDAXP	2020-04-22 20:46:14 +01:00
MerryMage	b7a2c1a7df	A64: Implement STXRB, STXRH, STXR, STLXRB, STLXRH, STLXR, LDXRB, LDXRH, LDXR, LDAXRB, LDAXRH, LDAXR	2020-04-22 20:46:14 +01:00
MerryMage	8756487554	A64: Partially implement MRS	2020-04-22 20:46:14 +01:00
MerryMage	bfd65bedfe	A64: Implement DSB, DMB	2020-04-22 20:46:14 +01:00
MerryMage	5edd623b9d	Implement DC instructions	2020-04-22 20:46:14 +01:00
MerryMage	2cb0a699ba	IR: Implement FPMax, FPMin	2020-04-22 20:46:14 +01:00
MerryMage	98c8e7d1af	IR: Implement FPVectorAdd	2020-04-22 20:46:14 +01:00
MerryMage	eae518a338	IR: Implement VectorSignExtend	2020-04-22 20:46:14 +01:00
MerryMage	b9cd345ddc	IR: Implement FPVectorSub	2020-04-22 20:46:14 +01:00
MerryMage	303088a51e	IR: Implement VectorPopulationCount	2020-04-22 20:46:14 +01:00
MerryMage	b6de612e01	IR: Implement VectorMultiply	2020-04-22 20:46:14 +01:00
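
Commit 06b31448aa notes that SSE2 lacks pmuldq, so the signed result is obtained by applying a sign correction to an unsigned multiply. The scalar sketch below illustrates that idea for the upper half of a 32x32 multiply. It is not the emitter code from the commit: the function names are hypothetical, and it assumes two's complement conversions and an arithmetic right shift for negative values (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Upper 32 bits of an unsigned 32x32 -> 64-bit multiply
// (what an unsigned-only multiply instruction provides).
constexpr std::uint32_t UnsignedMulHigh(std::uint32_t a, std::uint32_t b) {
    return static_cast<std::uint32_t>((static_cast<std::uint64_t>(a) * b) >> 32);
}

// Upper 32 bits of the signed product, derived from the unsigned one.
// Reinterpreting a negative a as unsigned adds 2^32 to it, which adds
// b * 2^32 to the product, so b is subtracted from the high half.
// The same correction applies symmetrically when b is negative.
constexpr std::int32_t SignedMulHighViaUnsigned(std::int32_t a, std::int32_t b) {
    const std::uint32_t ua = static_cast<std::uint32_t>(a);
    const std::uint32_t ub = static_cast<std::uint32_t>(b);
    std::uint32_t high = UnsignedMulHigh(ua, ub);
    if (a < 0) high -= ub;  // wraps modulo 2^32 by design
    if (b < 0) high -= ua;
    return static_cast<std::int32_t>(high);
}

// Reference result using a native signed 64-bit multiply.
constexpr std::int32_t SignedMulHighReference(std::int32_t a, std::int32_t b) {
    return static_cast<std::int32_t>((static_cast<std::int64_t>(a) * b) >> 32);
}

// Compile-time checks that the corrected result matches the reference.
static_assert(SignedMulHighViaUnsigned(-1, 1) == SignedMulHighReference(-1, 1));
static_assert(SignedMulHighViaUnsigned(-1, -1) == SignedMulHighReference(-1, -1));
static_assert(SignedMulHighViaUnsigned(INT32_MAX, -2) == SignedMulHighReference(INT32_MAX, -2));
static_assert(SignedMulHighViaUnsigned(INT32_MIN, INT32_MIN) == SignedMulHighReference(INT32_MIN, INT32_MIN));
```

The lower half of the product needs no correction, since it is identical for the signed and unsigned interpretations.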


258 commits
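
The terminology fix in commit f0920c0ded rests on the distinction between an arithmetic (signed) shift and a logical (unsigned) shift. A minimal scalar sketch of that distinction for a right shift on 8-bit elements follows. The helper names are hypothetical and are not dynarmic's IR opcodes, the shift amount is assumed to be in the range 0 to 7, and the signed case assumes an arithmetic right shift for negative values (guaranteed from C++20, and the behaviour of mainstream compilers before that).

```cpp
#include <cstdint>

// Logical shift right: the vacated high bits are filled with zeros.
constexpr std::uint8_t LogicalShiftRight(std::uint8_t value, unsigned amount) {
    return static_cast<std::uint8_t>(value >> amount);
}

// Arithmetic shift right: the vacated high bits are copies of the sign bit.
constexpr std::int8_t ArithmeticShiftRight(std::int8_t value, unsigned amount) {
    return static_cast<std::int8_t>(value >> amount);
}

// The same bit pattern (0x80) gives different results under each shift.
static_assert(LogicalShiftRight(0x80, 1) == 0x40);    // 128 -> 64
static_assert(ArithmeticShiftRight(-128, 1) == -64);  // bit pattern 0xC0
```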