dynarmic

Author	SHA1	Message	Date
MerryMage	ecbf9dbae5	IR: Implement A64OrQC	2020-04-22 20:46:22 +01:00
MerryMage	f0fecf2615	A64: Implement UQSHRN, UQRSHRN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	8f4c1a8558	emit_x64_vector: -0x80000000 isn't -0x80000000	2020-04-22 20:46:22 +01:00
MerryMage	b455b566e7	A64: Implement UQXTN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	e686a81612	emit_x64_vector: Fix non-SSE4.1 saturated narrowing reconstruction comparison Allows non-SSE4.1 to produce the correct FPSR.QC flag	2020-04-22 20:46:22 +01:00
MerryMage	3874cb37e3	A64: Implement SQXTN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	8ef114d48f	emit_x64_vector: packusdw requires SSE4.1 In EmitVectorSignedSaturatedNarrowToUnsigned32.	2020-04-22 20:46:22 +01:00
MerryMage	712c6c1d7e	A64: Implement SQSHRUN, SQRSHRUN (vector)	2020-04-22 20:46:22 +01:00
MerryMage	c5722ec963	simd_shift_by_immediate: Simplify ShiftRight	2020-04-22 20:46:22 +01:00
MerryMage	f020dbe4ed	A64: Implement SQXTUN	2020-04-22 20:46:22 +01:00
MerryMage	6918ef7360	microinstruction: Reorganize FPSCR related instruction queries	2020-04-22 20:46:22 +01:00
Lioncash	a639fa5534	microinstruction: Add missing FP scalar opcodes to ReadsFromFPSCR() and WritesToFPSCR() These were forgotten when the opcodes were added.	2020-04-22 20:46:22 +01:00
Lioncash	3ca18d8a6d	u128: Make Bit() a const-qualified member function This function doesn't modify the struct members, so it can be made const.	2020-04-22 20:46:22 +01:00
MerryMage	b2e4c16ef8	A64: Implement FRSQRTS (vector), single/double variant	2020-04-22 20:46:22 +01:00
MerryMage	45dc5f74f3	A64: Implement FRSQRTE (vector), single/double variant	2020-04-22 20:46:22 +01:00
MerryMage	b74d5520f9	A64: Implement FRSQRTS (scalar), single/double variant	2020-04-22 20:46:22 +01:00
MerryMage	506e544bfe	IR: Implement FPRSqrtStepFused	2020-04-22 20:46:22 +01:00
MerryMage	6eb069e80d	fp: Implement FPRSqrtStepFused	2020-04-22 20:46:22 +01:00
MerryMage	b0ff35fcd1	fp: Implement FPNeg	2020-04-22 20:46:22 +01:00
MerryMage	ca6774ccce	process_nan: Add two operand variant	2020-04-22 20:46:22 +01:00
Lioncash	ace7d2ba50	A64: Implement FMAXP, FMINP, FMAXNMP and FMINNMP's scalar double/single-precision variant	2020-04-22 20:46:21 +01:00
MerryMage	66bb05fc0a	emit_x64_floating_point: Fixup special NaN case in FMA FPMulAdd implementation	2020-04-22 20:46:21 +01:00
Lioncash	070637e0f6	fp: Use a forward declaration in fused.h It's permissible to forward declare here, so we can do so and eliminate a direct header dependency	2020-04-22 20:46:21 +01:00
Lioncash	030820f649	u128: Implement comparison operators in terms of one another We can just implement the comparisons in terms of operator< and implement inequality with the negation of operator==.	2020-04-22 20:46:21 +01:00
MerryMage	76b07d6646	u128: StickyLogicalShiftRight requires special-casing for amount == 64 In this case (128 - amount) == 64, and this invokes undefined behaviour	2020-04-22 20:46:21 +01:00
Lioncash	49c7edf7c6	A64: Implement FMLA and FMLS (by element)'s double/single-precision scalar variant	2020-04-22 20:46:21 +01:00
Lioncash	c704acafe4	A64: Implement FMUL (by element)'s scalar double/single-precision variant	2020-04-22 20:46:21 +01:00
MerryMage	0ce11b7b15	emit_x64_floating_point: Implement accurate fallback for FPMulAdd{32,64}	2020-04-22 20:46:21 +01:00
MerryMage	e199887fbc	fp: Implement FPMulAdd	2020-04-22 20:46:21 +01:00
MerryMage	53a8c15d12	process_nan: Add FPProcessNaNs3	2020-04-22 20:46:21 +01:00
MerryMage	1c8e93e74d	block_of_code: Add SysV ABI fifth and sixth parameters	2020-04-22 20:46:21 +01:00
MerryMage	1fe8f51c54	u128: Add StickyLogicalShiftRight	2020-04-22 20:46:21 +01:00
MerryMage	b0afd53ea7	u128: Add Multiply64To128	2020-04-22 20:46:21 +01:00
MerryMage	5566fab29a	u128: Add u128::Bit	2020-04-22 20:46:21 +01:00
MerryMage	3e62fea003	u128: Add comparison operators	2020-04-22 20:46:21 +01:00
MerryMage	f17cd6f2c5	unpacked: Use ResidualErrorOnRightShift in FPRoundBase Fixes a bug relating to exponents that are severely out of range.	2020-04-22 20:46:21 +01:00
MerryMage	805428e35e	fp: Remove MantissaT	2020-04-22 20:46:21 +01:00
MerryMage	bda86fd167	FPRSqrtEstimate: Improve documentation of RecipSqrtEstimate	2020-04-22 20:46:21 +01:00
Lioncash	0a64a66b26	FPRSqrtEstimate: Deduplicate array bounds Dehardcodes a few constants in the loops.	2020-04-22 20:46:21 +01:00
Lioncash	b7bd70fd19	A64: Implement FMAXV, FMINV, FMAXNMV, and FMINNMV	2020-04-22 20:46:21 +01:00
Lioncash	664fb12e21	FPRSqrtEstimate: Use forward declarations where applicable	2020-04-22 20:46:21 +01:00
Lioncash	3447c82656	translate: Return by bool in helpers where applicable Gets rid of a bit of duplication regarding the early-out cases and makes all helper functions consistent (previously some had a return type of bool, while others had a return type of void).	2020-04-22 20:46:21 +01:00
Lioncash	d65b056eba	Simplify fallback case for EmitVectorSetElement64()	2020-04-22 20:46:21 +01:00
MerryMage	6087c2af6f	emit_x64_floating_point: s/Esimate/Estimate/	2020-04-22 20:46:21 +01:00
MerryMage	f837ce8e78	simd_scalar_two_register_misc: Implement FRSQRTE, scalar variant	2020-04-22 20:46:21 +01:00
MerryMage	bde58b04d4	IR: Implement FPRSqrtEstimate	2020-04-22 20:46:21 +01:00
MerryMage	16061c28f3	simd_vector_x_indexed_element: Implement FMUL (by element), vector variant	2020-04-22 20:46:21 +01:00
MerryMage	55eaa16615	a64_emit_x64: Ensure host has updated ticks in EmitA64GetCNTPCT Discovered by @Subv. Fixes incomplete fix begun in 5a91c94dca47c9702dee20fbd5ae1f4c07eef9df. That fix fails to take into account that LinkBlock doesn't update ticks until there are no remaining ticks to be executed. Test added to confirm fix.	2020-04-22 20:46:21 +01:00
MerryMage	edd795e991	a64_emit_x64: Fix stack misalignment on Windows for 128-bit exclusive writes Discovered by @Subv. Includes a test to ensure this codepath is exercised on Windows.	2020-04-22 20:46:21 +01:00
Lioncash	04b4c8b0cf	emit_x64_aes: Eliminate extraneous usage of a scratch register in EmitAESInverseMixColumns() We can just use the same register the data is in as the result register, eliminating the need to use a completely separate register to store the result.	2020-04-22 20:46:21 +01:00
Lioncash	e5d80e998e	A64: Implement SADDLV	2020-04-22 20:46:21 +01:00
Lioncash	a1bc8ddb53	A64: Implement UADDLV	2020-04-22 20:46:21 +01:00
Lioncash	1dc1e3dcd8	fp: Use forward declarations where applicable Minimizes the amount of files that need to be rebuilt if the headers ever change.	2020-04-22 20:46:21 +01:00
Lioncash	46cb0d813b	emit_x64_vector: Append 'v' prefix onto movq in AVX path This is something I missed when adding in the AVX broadcast code.	2020-04-22 20:46:21 +01:00
Subv	4606a081c9	A64: The A64SetTPIDR IR instruction writes to a system register and should not be eliminated by the dead code elimination pass. Previously this instruction was always eliminated, resulting in incorrect values for TPIDR_EL0.	2020-04-22 20:46:21 +01:00
MerryMage	b53127600b	fp: A64::FPCR -> FP::FPCR	2020-04-22 20:46:21 +01:00
MerryMage	084bf63a10	bit_util: Implement ClearBits and ModifyBits	2020-04-22 20:46:21 +01:00
MerryMage	699c5f36d5	system: Simplify static_cast	2020-04-22 20:46:21 +01:00
MerryMage	3f602129f4	system: Ensure value of CNTPCT_EL0 is accurate Since we currently only update the host's tick count at the end of a block, we force an end-of-block before executing a MRS %, CNTPCT_EL0 instruction.	2020-04-22 20:46:21 +01:00
Lioncash	84affdb260	safe_ops: Avoid cases where shift bases are invalid with signed values For example, say the converted signed type is s64, shifting left by 63 bits would be undefined behavior. However, given an ASL is essentially the same behavior as an LSL we can just use an unsigned type instead of converting to a signed type.	2020-04-22 20:46:21 +01:00
Lioncash	d0274f412a	safe_ops: Avoid signed overflow in Negate() Negation of values such as -9223372036854775808 can't be represented in signed equivalents (such as long long), leading to signed overflow. Therefore, we can just invert bits and add 1 to perform this behavior with unsigned arithmetic.	2020-04-22 20:46:21 +01:00
Lioncash	af3e23b224	simd_scalar_shift_by_immediate: Implement FCVT{ZS, ZU} (vector, fixed-point)'s scalar double/single-precision variant	2020-04-22 20:46:21 +01:00
Lioncash	91abf87169	simd_scalar_two_register_misc: Implement FCVT{AS, AU, MS, MU, NS, NU, PS, PU, ZS, ZU} (vector)'s scalar double/single-precision variants We can simply implement this in terms of the fixed-point IR opcodes.	2020-04-22 20:46:21 +01:00
Lioncash	0ec8dac660	emit_x64: Remove FPSCR_RoundTowardsZero() virtual function from EmitContext struct This code was bugged in that we were comparing if the rounding mode was not equal to rounding towards zero. Fortunately, however, nothing uses this function anymore, and there's already the more general FPSCR_RMode() available, so this can be removed entirely.	2020-04-22 20:46:21 +01:00
Lioncash	fd92e2f186	emit_x64: Add missing <array> include Commit 755adef62e504a8d616de9dda8937d2428a9471b introduced a helper alias for std::array, eliminating the need to manually type out sizes for them, however I forgot to add the include for <array>	2020-04-22 20:46:21 +01:00
Lioncash	f939bd0228	emit_x64_vector{_floating_point}: Add helper alias for sizing arrays relative to vector width Avoids needing to remember to specify the proper size of the arrays, all that's needed is to specify the type of the array and the size will automatically be deduced from it. This helps prevent potential oversized or undersized arrays from being specified.	2020-04-22 20:46:21 +01:00
MerryMage	58f3399032	A64/PopRSBHint: Prevent RETing to a guest PC of ~0ull from crashing the jit	2020-04-22 20:46:21 +01:00
MerryMage	e18fca17dc	A64: Implement FABD in terms of existing IR instructions Fixes NaN issue. Closes #306.	2020-04-22 20:46:21 +01:00
MerryMage	1dbe9d95e6	FPRoundInt: Final FPRound based on new sign While this shouldn't change any of the results in theory, it's just logically more consistent	2020-04-22 20:46:21 +01:00
MerryMage	83be491875	emit_x64_floating_point: SSE4.1 implementation of EmitFPRound	2020-04-22 20:46:20 +01:00
MerryMage	a40127a054	A64: Implement FRINTX, FRINTI (scalar)	2020-04-22 20:46:20 +01:00
MerryMage	962fa3b65e	A64: Implement FRINTP, FRINTM, FRINTZ (scalar)	2020-04-22 20:46:20 +01:00
MerryMage	5200bf41cf	A64: Implement FRINTN (scalar)	2020-04-22 20:46:20 +01:00
MerryMage	8718dc1692	A64: Implement FRINTA (scalar)	2020-04-22 20:46:20 +01:00
MerryMage	b228694012	IR: Implement FPRoundInt	2020-04-22 20:46:20 +01:00
MerryMage	e24054f4d7	fp: Implement FPRoundInt	2020-04-22 20:46:20 +01:00
MerryMage	f876e4afa2	fp: Implement FPProcessNaN	2020-04-22 20:46:20 +01:00
MerryMage	591adee443	fp/info: Add DefaultNaN	2020-04-22 20:46:20 +01:00
MerryMage	797e18cd97	fp: Move FPToFixed to its own file	2020-04-22 20:46:20 +01:00
MerryMage	295deb4035	a64_jit_state: Add FPSR.QC flag	2020-04-22 20:46:20 +01:00
Lioncash	7797bc2fb2	emit_x64_vector: Use non-scratch Use* variants of registers within EmitVectorUnsignedAbsoluteDifference() In some cases, a register isn't modified, depending on the branch taken, so we can signify this by using the non-scratch variants in certain cases.	2020-04-22 20:46:20 +01:00
Lioncash	f7f83b76b7	simd_scalar_two_register_misc: Implement scalar double/single-precision variants of FCM{EQ, GE, GT, LE, LT} (zero)	2020-04-22 20:46:20 +01:00
Lioncash	9db6d1e98b	translate_arm: Remove unnecessary rotr() function We already have RotateRight() in our common code, so we can remove this function and replace it with it. We can also implement ArmExpandImm_C() in terms of ArmExpandImm().	2020-04-22 20:46:20 +01:00
Lioncash	9f8a44c982	cast_util: Remove unnecessary typename Given we use std::aligned_storage_t, we don't need to specify typename here. If we used std::aligned_storage, then we would need to.	2020-04-22 20:46:19 +01:00
MerryMage	89e43867c1	A64: Implement FADDP (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	33fa65de23	A64: Implement FADDP (vector)	2020-04-22 20:46:19 +01:00
MerryMage	9dba273a8c	A64: Implement SADDLP	2020-04-22 20:46:19 +01:00
MerryMage	70ff2d73b5	A64: Implement UADDLP	2020-04-22 20:46:19 +01:00
MerryMage	5563bbbd79	A64: Implement EXT	2020-04-22 20:46:19 +01:00
MerryMage	304cc7f61e	emit_x64_floating_point: SSE4.1 implementation for FP{Double,Single}ToFixed{S,U}{32,64}	2020-04-22 20:46:19 +01:00
MerryMage	3d9677d094	A64: Implement FCVTMU (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	79c9018d60	A64: Implement FCVTMS (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	49c4499a87	A64: Implement FCVTPU (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	af661ef5a6	A64: Implement FCVTPS (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	27319822bb	A64: Implement FCVTAU (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	c0c7a26314	A64: Implement FCVTAS (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	a1965a74a0	A64: Implement FCVTNU (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	7d36dbcdfd	A64: Implement FCVTNS (scalar)	2020-04-22 20:46:19 +01:00
MerryMage	617ca0adf0	floating_point_conversion_integer: Refactor implementation of FCVTZS_float_int and FCVTZU_float_int	2020-04-22 20:46:19 +01:00
MerryMage	caaf36dfd6	IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64} This implementation just falls-back to the software floating point implementation.	2020-04-22 20:46:19 +01:00
MerryMage	760cc3ca89	EmitContext: Expose FPCR	2020-04-22 20:46:19 +01:00
MerryMage	9571269552	fp/op: Implement FPToFixed	2020-04-22 20:46:19 +01:00
MerryMage	8087e8df05	mantissa_util: Implement ResidualErrorOnRightShift Accurately calculate residual error that is shifted out	2020-04-22 20:46:19 +01:00
MerryMage	8668d61881	fp/unpacked: Implement FPRound	2020-04-22 20:46:19 +01:00
MerryMage	55d590c01f	FPCR: Add AHP setter and FZ16 getter	2020-04-22 20:46:19 +01:00
MerryMage	7360a2579b	mp: Implement metaprogramming library	2020-04-22 20:46:19 +01:00
MerryMage	4ab029c114	fp: Implement FPUnpack	2020-04-22 20:46:19 +01:00
MerryMage	4875658917	fp: Implement FPProcessException	2020-04-22 20:46:19 +01:00
MerryMage	3cb98e1560	fp: Move fp_util to fp/util	2020-04-22 20:46:19 +01:00
MerryMage	c41a38b13e	fp: Add FPSR	2020-04-22 20:46:19 +01:00
MerryMage	66381352f3	fp: Add FPInfo Provides information about floating-point format for various bit sizes	2020-04-22 20:46:19 +01:00
MerryMage	d21659152c	safe_ops: Implement safe shifting operations Implement shifting operations that perform consistently across architectures without running into undefined or implementation-defined behaviour.	2020-04-22 20:46:19 +01:00
MerryMage	b00fe23b91	bit_util: Implement MostSignificantBit	2020-04-22 20:46:19 +01:00
MerryMage	95ad0d0a66	bit_util: Use Ones to implement Bits	2020-04-22 20:46:19 +01:00
MerryMage	62b640b2fa	bit_util: Add ClearBit and ModifyBit	2020-04-22 20:46:19 +01:00
MerryMage	8651c2d10e	u128: Implement u128 For when we need a 128-bit integer	2020-04-22 20:46:19 +01:00
Lioncash	e7409fdfe4	A64: Implement UCVTF (vector, integer)'s double/single-precision variant	2020-04-22 20:46:19 +01:00
Lioncash	4aa4885ba7	ir: Add opcodes for vector conversion of u32/u64 to floating-point	2020-04-22 20:46:19 +01:00
Lioncash	fcae4e2418	simd_three_different: Deduplicate common implementations Generally, the only difference between the signed variants and the unsigned variants is whether or not we use a sign-extension or zero-extension, so we can simply use common functions to implement both cases without totally duplicating code twice here.	2020-04-22 20:46:19 +01:00
Lioncash	9c0d5cf15c	floating_point_conversion_integer: Handle S64/U64 -> F32 conversions in SCVTF_float_int and UCVTF_float_int	2020-04-22 20:46:19 +01:00
Lioncash	7a84b6e8d8	ir: Add opcodes for converting S64 and U64 to single-precision floating-point values	2020-04-22 20:46:19 +01:00
Lioncash	066061fa50	constant_pool: Remove unnecessary std::memset from constructor AllocateFromCodeSpace() already zeroes out the allocated memory.	2020-04-22 20:46:19 +01:00
Lioncash	a1d6a86e8c	A64: Implement ADDV	2020-04-22 20:46:19 +01:00
Lioncash	35026a6ce3	emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32()	2020-04-22 20:46:19 +01:00
Lioncash	245c903129	simd_three_same: Join FPAbsoluteComparison() into FPCompareRegister() These are part of the same comparison family, so there's no real point in keeping them separate.	2020-04-22 20:46:19 +01:00
Lioncash	9912836b59	A64: Implement scalar double/single-precision variants of FACGE, FACGT, FCMEQ, FCMGE, FCMGT	2020-04-22 20:46:18 +01:00
MerryMage	0b97e9bd8d	emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode	2020-04-22 20:46:18 +01:00
MerryMage	a2eb9a02e0	backend_x86: Add FPSCR_RMode to EmitContext	2020-04-22 20:46:18 +01:00
MerryMage	d875c08ebf	fp: Extract common RoundingMode enum	2020-04-22 20:46:18 +01:00
Lioncash	3714bc0ed4	floating_point_conversion_integer: Use FPS64ToDouble and FPU64ToDouble in SCVTF_float_int and UCVTF_float_int The opcodes introduced in 979b6f39f1621b80bd463645ec5b08661cb6b1bf can also be used here, avoiding more falling back to the interpreter.	2020-04-22 20:46:18 +01:00
Lioncash	b97358075e	simd_scalar_two_register_misc: Handle 64-bit case in SCVTF and UCVTF's scalar double/single-precision variant Avoids falling back to the interpreter in the 64-bit case.	2020-04-22 20:46:18 +01:00
Lioncash	7252293184	emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle() In the non-AVX512 path, the following code is present: code.mov(from.cvt32(), from.cvt32()); since this potentially modifies 'from', we should be using UseScratchGpr() instead.	2020-04-22 20:46:18 +01:00
Lioncash	fbd7623fe5	emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble() AVX-512F provides convenient instructions for these kinds of conversions directly	2020-04-22 20:46:18 +01:00
Lioncash	3a41465eaf	ir: Add opcodes for converting S64 and U64 to double-precision values	2020-04-22 20:46:18 +01:00
MerryMage	436ca80bcd	Merge branch 'global_monitor'	2020-04-22 20:46:18 +01:00
Lioncash	0f4bf26e05	simd_two_register_misc: Utilize FPVectorAbs in FABS implementations Since we already have opcodes introduced to implement FACGE and FACGT, we can reutilize it for the FABS implementations.	2020-04-22 20:46:18 +01:00
MerryMage	821cff1227	A64: Add ClearExclusiveState method	2020-04-22 20:46:18 +01:00
Lioncash	81e572c78c	ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16	2020-04-22 20:46:18 +01:00
MerryMage	2a8de5f733	a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor The kernel would have to execute an ERET instruction to return to userland; this clears exclusive state.	2020-04-22 20:46:18 +01:00
Lioncash	53dbb6a92a	A64: Implement FACGE's vector single/double precision variants	2020-04-22 20:46:18 +01:00
MerryMage	57f7c7e1b0	Implement global exclusive monitor	2020-04-22 20:46:18 +01:00
Lioncash	6912a02d9b	A64: Implement FACGT's vector single/double precision variants	2020-04-22 20:46:18 +01:00
MerryMage	85234338d3	a64_emit_x64: Simplify EmitExclusiveWrite	2020-04-22 20:46:18 +01:00
Lioncash	fc731dddae	ir: Add opcodes for performing vector absolute floating-point values This will be usable for implementing FACGE and FACGT	2020-04-22 20:46:18 +01:00
MerryMage	2fc6b33829	CMakeLists: Add missing files	2020-04-22 20:46:18 +01:00
Lioncash	0bee648b4f	emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions Given both branches are the same, we can hoist out the common code.	2020-04-22 20:46:18 +01:00
Lioncash	d86fea0d28	A64: Implement FCMEQ (zero)'s vector single and double precision variant	2020-04-22 20:46:18 +01:00
Lioncash	593eca7fb1	A64: Implement load/store single structure instructions Implements LD{1, 2, 3, 4}, LD{1, 2, 3, 4}R, and ST{1, 2, 3, 4} single structure variants.	2020-04-22 20:46:18 +01:00
Lioncash	9bec354791	A64: Implement FCMEQ (register)'s vector single and double precision variant	2020-04-22 20:46:18 +01:00
Lioncash	b6e223fc58	emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8() Given both branches use the same destination register size, we can hoist the common code out.	2020-04-22 20:46:18 +01:00
Lioncash	5ce187a54e	ir: Add opcodes for floating-point vector equalities	2020-04-22 20:46:18 +01:00
MerryMage	be354dbfd0	ir/basic_block: Add missing U16 immediate type to DumpBlock	2020-04-22 20:46:18 +01:00
Lioncash	cf188448d4	emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64() Gets rid of the need to perform a fallback.	2020-04-22 20:46:18 +01:00
MerryMage	5503ff28c3	llvm_disassemble: Allow disassembly of invalid AArch64 instructions	2020-04-22 20:46:18 +01:00
Lioncash	954deff2d4	emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned() This doesn't alter behavior but does make the code better if anything else is ever added to this function in the future.	2020-04-22 20:46:18 +01:00
Lioncash	11a92eaaef	A64: Implement SRHADD and URHADD	2020-04-22 20:46:18 +01:00
Lioncash	9e75d08860	A64: Implement FABD's scalar single/double precision variant	2020-04-22 20:46:18 +01:00
Lioncash	bc718c5b28	ir: Add opcodes for performing rounding halving adds	2020-04-22 20:46:18 +01:00
Lioncash	d898d1779d	A64: Implement FABD's vector single/double precision variant	2020-04-22 20:46:18 +01:00
Lioncash	054549da35	emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64 I realized I introduced a helper for simple AVX operation emitting, so use that instead of writing it all out long-form.	2020-04-22 20:46:18 +01:00
Lioncash	8a4f8aed06	ir: Add opcode for performing FP vector absolute differences	2020-04-22 20:46:18 +01:00
Lioncash	cb456f914b	A64: Implement UMLAL{2}, UMLSL{2}, and UMULL{2} Now that we have the helper function set up for the signed variants, we can also modify it to be used with the unsigned ones by performing a zero extension instead of a sign extension.	2020-04-22 20:46:18 +01:00
MerryMage	ba84e7a8de	A64: Implement FNMSUB	2020-04-22 20:46:18 +01:00
Lioncash	3576c02d91	A64: Implement SMLSL{2}	2020-04-22 20:46:18 +01:00
MerryMage	a1042cfcd8	A64: Implement FNMADD	2020-04-22 20:46:18 +01:00
Lioncash	ada5c0b2fa	A64: Implement SMLAL{2}	2020-04-22 20:46:18 +01:00
MerryMage	0d83032a6f	A64: Implement FMSUB	2020-04-22 20:46:18 +01:00
Lioncash	2d1aca25e6	A64: Implement SMULL{2}	2020-04-22 20:46:18 +01:00
MerryMage	69e00d225c	A64: Implement FMADD	2020-04-22 20:46:18 +01:00
MerryMage	8c90fcf58e	IR: Implement FPMulAdd	2020-04-22 20:46:18 +01:00
Lioncash	c5ae9107a9	A64: Implement SABAL/SABAL2 and SABDL/SABDL2 Now that we have a helper function for the unsigned variants, we can modify it to also be usable with the signed variants.	2020-04-22 20:46:18 +01:00
Lioncash	24e3299276	A64: Implement FCMGT, FCMGE (register) vector double and single precision variants	2020-04-22 20:46:18 +01:00
Lioncash	26d4473851	A64: Implement UABAL/UABAL2	2020-04-22 20:46:18 +01:00
Lioncash	350bc70be8	A64: Implement FCMGT, FCMGE, FCMLE, FCMLT (zero) vector double and single precision variants.	2020-04-22 20:46:18 +01:00
Lioncash	3397742c74	A64: Implement UABDL/UABDL2	2020-04-22 20:46:18 +01:00
Lioncash	c695da1cf3	ir: Add opcode for floating-point GE and GT comparisons The rest of the comparisons can be implemented in terms of these two	2020-04-22 20:46:18 +01:00
Lioncash	6de5ed96e5	emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs Shortens code-gen down to a single instruction in the 64-bit path.	2020-04-22 20:46:18 +01:00
Lioncash	9054d1c20b	A64: Implement LDR (literal, SIMD&FP)	2020-04-22 20:46:18 +01:00
Lioncash	0da5e949a8	Correct typo in DataCacheOperation enum Fixes a typo for the InvalidateByVAToPoC enum entry. Given yuzu is the only known user of 64-bit mode and it doesn't use this value, we can get away with changing this.	2020-04-22 20:46:18 +01:00
Lioncash	9736e2cce2	A64: Implement FABS' half-precision variant	2020-04-22 20:46:18 +01:00
Lioncash	6e5750e4ec	A64: Implement FABS' single and double precision variant	2020-04-22 20:46:18 +01:00
Lioncash	7bce8d8757	A64: Implement URSHR (scalar) and URSRA (scalar) Now that the utility function is all set up from implementing SRSRA, the unsigned variants can now be trivially implemented by modifying the utility function to perform a logical shift right instead of an arithmetical shift right for the unsigned case.	2020-04-22 20:46:18 +01:00
Lioncash	1e70a589b0	A64: Implement SRSRA (scalar)	2020-04-22 20:46:18 +01:00
Lioncash	998aef07f6	A64: Implement SRSHR (scalar)	2020-04-22 20:46:17 +01:00
Lioncash	7c0250e9f8	A64: Implement SABA	2020-04-22 20:46:17 +01:00
Lioncash	f00789e6f7	A64: Implement SABD	2020-04-22 20:46:17 +01:00
Lioncash	1e10017f4b	ir: Add opcodes for signed absolute differences	2020-04-22 20:46:17 +01:00
Tillmann Karras	d3b44c1b5a	decoder_detail: use structured bindings	2020-04-22 20:46:17 +01:00
Lioncash	f745eb28bf	simd_two_register_misc: Handle 64-bit case for SCVTF_int_4	2020-04-22 20:46:17 +01:00
Lioncash	3f6c529da2	ir: Add opcode to perform the vector conversion S64->F64 Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us	2020-04-22 20:46:17 +01:00
Lioncash	0e61ee6bf6	A64: Implement SHLL/SHLL2	2020-04-22 20:46:17 +01:00
Lioncash	43e6e98c3b	A64: Add missing decoding for PRFM (unscaled offset)	2020-04-22 20:46:17 +01:00
Lioncash	f2a85d5601	A64: Implement UHSUB	2020-04-22 20:46:17 +01:00
Lioncash	b33360a324	A64: Implement SHSUB	2020-04-22 20:46:17 +01:00
Lioncash	44a5f8095a	ir: Add opcodes for performing vector halving subtracts	2020-04-22 20:46:17 +01:00
Lioncash	4f37c0ec5a	A64: Implement SM4EKEY	2020-04-22 20:46:17 +01:00
Lioncash	3bde3347a5	A64: Implement SM4E	2020-04-22 20:46:17 +01:00
Lioncash	b312d28295	ir: Add an opcode for doing an SM4 lookup table query	2020-04-22 20:46:17 +01:00
Lioncash	27a6d5f6ce	emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available	2020-04-22 20:46:17 +01:00
Lioncash	4dcc7724e0	A64: Implement UHADD	2020-04-22 20:46:17 +01:00
Lioncash	f8714f7250	A64: Implement SHADD	2020-04-22 20:46:17 +01:00
Lioncash	089096948a	ir: Add opcodes for performing halving adds	2020-04-22 20:46:17 +01:00
Lioncash	3d00dd63b4	emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	b97b71b8aa	emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	033e400df0	emit_x64_vector_floating_point: Deduplicate accurate NaN handling code Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.	2020-04-22 20:46:17 +01:00
Lioncash	0f067b7330	emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	d4ee878cbd	emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	b38dd191bd	disassembler_arm: Remove rotation helper function in favor of Common::RotateRight Mildly reduces the amount of duplicated behavior	2020-04-22 20:46:17 +01:00
Lioncash	51e4f1d9db	emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32()	2020-04-22 20:46:17 +01:00
Lioncash	c692ccdd6d	emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8()	2020-04-22 20:46:17 +01:00
Lioncash	b194313d8c	emit_x64_vector: Vectorize fallback path in EmitVectorMinU32()	2020-04-22 20:46:17 +01:00
Lioncash	7ceda6d919	emit_x64_vector: Vectorize fallback path in EmitVectorMinU16()	2020-04-22 20:46:17 +01:00
Lioncash	cda85a1da0	emit_x64_vector: Vectorize fallback path in EmitVectorMinS32()	2020-04-22 20:46:17 +01:00
Lioncash	6e08eed210	emit_x64_vector: Vectorize fallback path in EmitVectorMinS8()	2020-04-22 20:46:17 +01:00
Lioncash	0fb6dce689	emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift This can simply be merged with the previous one.	2020-04-22 20:46:17 +01:00
Lioncash	5b71b1337b	emit_x64_vector: Avoid left shift of negative value in LogicalVShift Now that we handle the signed variants, we also have to be careful about left shifts with negative values, as this is considered undefined behavior.	2020-04-22 20:46:17 +01:00
Lioncash	9954d28868	a64_jitstate: Zero SP and PC on construction of A64JitState Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent	2020-04-22 20:46:17 +01:00
Lioncash	4efbd40ea4	backend_x64/callback: Default virtual destructor in the cpp file Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)	2020-04-22 20:46:17 +01:00
Lioncash	edd0b5c8c7	a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks It's well-defined to static_cast a void* to its proper type.	2020-04-22 20:46:17 +01:00
Lioncash	e71612d394	A64: Implement SSHL (scalar)	2020-04-22 20:46:17 +01:00
Lioncash	ef1e69a1e3	A64: Implement SSHL (vector)	2020-04-22 20:46:17 +01:00
Lioncash	21974ee57e	backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants Also adds IR opcodes to dispatch said variants	2020-04-22 20:46:17 +01:00
Lioncash	9fc89f0a0e	emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size Similar changes were done in emit_x64_vector, but these were missed.	2020-04-22 20:46:17 +01:00
Lioncash	af28e89a13	emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()	2020-04-22 20:46:17 +01:00
Lioncash	cda75e2079	A64: Implement CMTST's scalar variant	2020-04-22 20:46:17 +01:00
Lioncash	0d20423ad5	emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()	2020-04-22 20:46:17 +01:00
Lioncash	d70ee7c0d1	emit_x64_vector: Use VPBROADCAST where applicable and available Uses the instruction that does what it says in its name if available. Allows avoiding the use of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.	2020-04-22 20:46:17 +01:00
Lioncash	bebe7235ae	A64: Implement UZP1 and UZP2	2020-04-22 20:46:17 +01:00
Lioncash	26d77c6f09	ir: Add opcodes for performing vector deinterleaving	2020-04-22 20:46:17 +01:00
Lioncash	d6f9ed47d9	A64: Implement FNEG (half-precision)	2020-04-22 20:46:17 +01:00
Lioncash	7efbd73bac	A64: Implement USHL (scalar)	2020-04-22 20:46:17 +01:00
Lioncash	41f4717f2b	A64: Implement FNEG (vector)	2020-04-22 20:46:17 +01:00
Lioncash	ba1cc6366d	A64: Implement RSUBHN/RSUBHN2	2020-04-22 20:46:17 +01:00
Lioncash	e41640fe33	A64: Implement RADDHN/RADDHN2	2020-04-22 20:46:17 +01:00
Lioncash	b719a6b3f7	A64: Implement XAR	2020-04-22 20:46:17 +01:00
Lioncash	0b1b131ec2	simd_two_register_misc: Factor out common comparison code Gets rid of a tiny bit of duplicated code.	2020-04-22 20:46:17 +01:00
Lioncash	ed0b84da70	A64: Implement CMLE (zero)'s vector variant	2020-04-22 20:46:17 +01:00
Lioncash	b595a68ffa	A64: Implement CMTST (vector)	2020-04-22 20:46:17 +01:00
Lioncash	48c7f8630c	A64: Implement ADDHN{2} and SUBHN{2}	2020-04-22 20:46:17 +01:00
Lioncash	3acd9c9200	translate: zero extend result in Vpart when storing to lower part of vector	2020-04-22 20:46:17 +01:00
Lioncash	87ca63699f	emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs	2020-04-22 20:46:17 +01:00
Lioncash	f17702f608	emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs	2020-04-22 20:46:17 +01:00
Lioncash	596a8dd1dd	emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs Provides a better alternative to a fallback operation.	2020-04-22 20:46:17 +01:00
Lioncash	75fd4eaaaa	emit_x64_vector: Get rid of some magic numbers in loop bounds	2020-04-22 20:46:17 +01:00
Lioncash	7b80ac25eb	emit_x64_vector: Generify variable shift functions	2020-04-22 20:46:17 +01:00
Lioncash	4ec735f707	A64: Implement CMLE (zero)'s scalar variant	2020-04-22 20:46:17 +01:00
Lioncash	6534184df2	A64: Implement CMLT (zero)'s scalar single/double-precision variant	2020-04-22 20:46:17 +01:00
Lioncash	8863c9bb4b	A64: Implement SHA512H2	2020-04-22 20:46:17 +01:00
Lioncash	033b890e25	A64: Implement SHA512H	2020-04-22 20:46:17 +01:00
Lioncash	d1f5b084b4	A64: Handle S32->F32 case for SCVTF (vector)	2020-04-22 20:46:17 +01:00
Lioncash	38fa984b53	IR: Add opcode for packed word->f32 conversions	2020-04-22 20:46:16 +01:00
Lioncash	b8587d8e34	A64: Implement SHA512SU1	2020-04-22 20:46:16 +01:00
Lioncash	44d846045a	A64: Implement SHA512SU0	2020-04-22 20:46:16 +01:00
Lioncash	ca903c1585	A64: Implement SHA256H and SHA256H2	2020-04-22 20:46:16 +01:00
MerryMage	e4237c44eb	A64: Implement SCVTF (vector, integer), scalar variant	2020-04-22 20:46:16 +01:00
MerryMage	bfba38d0b6	impl: Reorganize scalar two-register misc instructions	2020-04-22 20:46:16 +01:00
Lioncash	ea582b17cc	A64: Implement SHA256SU1	2020-04-22 20:46:16 +01:00
Lioncash	06c5dcaf5e	simd_two_register_misc: Add missing zeroing of the vector for CMGT and CMLT	2020-04-22 20:46:16 +01:00
Lioncash	0d50d7314b	A64: Implement CMGE (zero)'s vector variant	2020-04-22 20:46:16 +01:00
Lioncash	ab35dc0e78	A64: Implement MLS (by element)	2020-04-22 20:46:16 +01:00
Lioncash	1651e60462	A64: Implement MUL (by element)	2020-04-22 20:46:16 +01:00
MerryMage	a86d4093cd	A64: Implement MLA (by element)	2020-04-22 20:46:16 +01:00
Lioncash	7f47402609	A64: Implement ABS (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	c8eb4528be	A64: Implement SHA256SU0	2020-04-22 20:46:16 +01:00
Lioncash	181c3b0790	A64: Implement SHA1M	2020-04-22 20:46:16 +01:00
Lioncash	47bc97a71b	A64: Implement SHA1P	2020-04-22 20:46:16 +01:00
Lioncash	718f3e9bb4	A64: Implement scalar variants of CMEQ, CMGT, and CMGE zero comparison instructions These can trivially use the ScalarCompare helper function.	2020-04-22 20:46:16 +01:00
Lioncash	3ad4e547e4	A64: Implement scalar variant of NEG	2020-04-22 20:46:16 +01:00
Lioncash	b4f3051e4b	simd: Relocate REV16, REV32 and REV64 vector variants to the proper file These aren't scalar instruction variants.	2020-04-22 20:46:16 +01:00
Lioncash	19e276d10f	A64: Implement CMEQ (register, scalar)	2020-04-22 20:46:16 +01:00
Lioncash	5b8c9e5146	A64: Implement CMHS (register, scalar)	2020-04-22 20:46:16 +01:00
Lioncash	78bb12276a	A64: Implement CMHI (register, scalar)	2020-04-22 20:46:16 +01:00
Lioncash	c18b20b8d1	A64: Implement CMGE (register, scalar)	2020-04-22 20:46:16 +01:00
Lioncash	755981d0da	A64: Implement CMGT (register, scalar)	2020-04-22 20:46:16 +01:00
Lioncash	da6627124b	A64: Implement SHA1C	2020-04-22 20:46:16 +01:00
Lioncash	3c013bd9f8	A64: Implement SLI (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	154cac594a	A64: Implement SRI (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	6bcfdba1ad	general: Remove unused lambda captures Resolves warnings that occur in Xcode 9.3	2020-04-22 20:46:16 +01:00
Lioncash	205ca6b4cb	A64: Implement SHA1SU1	2020-04-22 20:46:16 +01:00
Lioncash	16a001b9ff	A64: Implement SHA1SU0	2020-04-22 20:46:16 +01:00
Lioncash	3b6db59850	A64: Implement TRN2	2020-04-22 20:46:16 +01:00
Lioncash	30e158f8d0	A64: Implement TRN1	2020-04-22 20:46:16 +01:00
Lioncash	52cad2d9d0	A64: Implement SSRA (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	255a33936d	A64: Implement SSHR (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	6723b00497	A64: Implement USRA (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	d56fa8f735	A64: Implement USHR (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	870e418b0b	A64: Implement SHL (scalar)	2020-04-22 20:46:16 +01:00
Lioncash	97f2bea4f2	A64: Implement SM3PARTW1	2020-04-22 20:46:16 +01:00
Lioncash	e268b110f0	simd_sha512: Simplify RAX1 Now that the vector rotation helpers are in, replace the explicit shifting with the relevant helper function that does the same thing. Simply tidies up code; no behavioral changes are made.	2020-04-22 20:46:16 +01:00
Lioncash	20d2491267	A64: Implement SM3PARTW2	2020-04-22 20:46:16 +01:00
Lioncash	e1b662e90c	ir: Add helper functions for vector rotation	2020-04-22 20:46:16 +01:00
Lioncash	8a60a63a8b	A64: Implement SM3TT2B	2020-04-22 20:46:16 +01:00
Lioncash	b3d4c02098	A64: Implement SM3TT2A	2020-04-22 20:46:16 +01:00
Lioncash	7fbccabd81	A64: Implement SM3TT1B	2020-04-22 20:46:16 +01:00
Lioncash	769373b3ed	A64: Implement SM3TT1A	2020-04-22 20:46:16 +01:00
Lioncash	2d269fdcc7	simd_shift_by_immediate: Merge signed/unsigned helper functions Gets rid of a little more code duplication.	2020-04-22 20:46:16 +01:00
Lioncash	d5461be6b4	A64: Implement SM3SS1	2020-04-22 20:46:16 +01:00
Lioncash	2db032ac83	A64: Implement SRI (vector)	2020-04-22 20:46:16 +01:00
Lioncash	11005cfe26	A64: Implement SLI (vector)	2020-04-22 20:46:16 +01:00
Lioncash	e3d9bf55e7	A64: Implement SRSRA (vector)	2020-04-22 20:46:16 +01:00

1422 commits