dynarmic

Author	SHA1	Message	Date
Lioncash	5d5c9f149f	frontend/ir_emitter: Add half-precision opcode for FPVectorRecipStepFused	2020-04-22 21:01:44 +01:00
Lioncash	6da0411111	frontend/ir_emitter: Add half-precision opcode for FPRecipStepFused	2020-04-22 21:01:44 +01:00
Lioncash	5b4673da4b	frontend/ir_emitter: Add half-precision variant of FPVectorRoundInt	2020-04-22 21:01:44 +01:00
Lioncash	ad0c698f89	frontend/ir_emitter: Add half-precision variant of FPRoundInt	2020-04-22 21:01:44 +01:00
Merry	cb9a1b18b6	Merge pull request #475 from lioncash/muladd A64: Enable half-precision variants of floating-point multiply-add instructions	2020-04-22 21:01:44 +01:00
Merry	13f421c27d	Merge pull request #473 from lioncash/sqshlu A64: Implement SQSHLU	2020-04-22 21:01:44 +01:00
Merry	d7da53a74b	Merge pull request #472 from lioncash/exception general: Mark hash functions as noexcept	2020-04-22 21:01:44 +01:00
Lioncash	a4cadf1cd9	frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation	2020-04-22 21:01:44 +01:00
Lioncash	ec6b3ae084	ir/frontend: Add half-precision opcode for FPVectorMulAdd	2020-04-22 21:01:44 +01:00
Lioncash	bd82513199	frontend/ir_emitter: Add half-precision opcode for FPMulAdd	2020-04-22 21:01:44 +01:00
Lioncash	7bb5440507	general: Mark hash functions as noexcept Generally hash functions shouldn't throw exceptions. It's also a requirement for the standard library-provided hash functions to not throw exceptions. An exception to this rule is made for user-defined specializations, however we can just be consistent with the standard library on this to allow it to play nicer with it. While we're at it, we can also make the std::less specializations noexcept as well, since they also can't throw.	2020-04-22 21:01:43 +01:00
Lioncash	fe95575b95	general: Replace unreachable-imitating assertions with UNREACHABLE() We can just use the self-documenting assertion for indicating unreachable paths, instead of manually passing false and providing a message.	2020-04-22 21:01:43 +01:00
Lioncash	b37279f65c	backend/x64/emit_x64_vector: Prevent undefined behavior within VectorSignedSaturatedShiftLeft Avoids the undefined behavior of potentially left-shifting a negative signed value.	2020-04-22 21:00:47 +01:00
MerryMage	13e8b7b516	emit_x64_floating_point: F16C implementation of FPSingleToHalf	2020-04-22 20:58:17 +01:00
MerryMage	d32d6fe598	emit_x64_floating_point: F16C implementation of FPHalfToSingle and FPHalfToDouble	2020-04-22 20:58:12 +01:00
MerryMage	a53ba12be2	emit_x64_floating_point: Factor out ConvertRoundingModeToX64Immediate	2020-04-22 20:58:12 +01:00
MerryMage	5a2adc6629	backend/x64: Expose FPCR in EmitContext instead of its subcomponents	2020-04-22 20:58:12 +01:00
Merry	01bb1cdd88	Merge pull request #458 from lioncash/float-op A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV	2020-04-22 20:58:12 +01:00
Lioncash	8309ec7a9f	frontend/ir_emitter: Add half-precision variant of FPAbs	2020-04-22 20:58:12 +01:00
Lioncash	e4c259d69f	frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes	2020-04-22 20:58:12 +01:00
Lioncash	c97efcb978	frontend/ir_emitter: Add half-precision variant of FPNeg	2020-04-22 20:58:12 +01:00
Lioncash	bd892ec4ef	frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point	2020-04-22 20:58:11 +01:00
Merry	bbd5330ad2	Merge pull request #447 from lioncash/flag A64: Implement CFINV, RMIF, AXFlag and XAFlag	2020-04-22 20:58:11 +01:00
Merry	fb039e232c	Merge pull request #442 from lioncash/fcvtxn A64: Implement scalar and vector variants of FCVTXN	2020-04-22 20:58:11 +01:00
Lioncash	597a8be5d5	ir: Add A64-specific opcodes for getting and setting raw NZCV values This will be necessary to implement the flag manipulation and flag format instructions.	2020-04-22 20:58:11 +01:00
Lioncash	5cf1478620	frontend/ir: Add opcodes for vector square roots	2020-04-22 20:58:10 +01:00
Lioncash	7c81a58ed3	frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode This will be necessary to special-case the non-IEEE Von Neumann rounding to odd rounding mode.	2020-04-22 20:58:10 +01:00
Merry	9f11720a69	Merge pull request #437 from lioncash/frecpx A64: Implement FRECPX (single, double precision)	2020-04-22 20:58:10 +01:00
Lioncash	9cf3c25811	frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents	2020-04-22 20:58:10 +01:00
Lioncash	2e180a7f14	backend/x64/a32_interface: Mark Context move constructor and move assignment as noexcept Provides a more "correct" move constructor/assignment operator, since these relevant functions shouldn't throw exceptions. Has the benefit of playing nicely with std::move_if_noexcept and other noexcept library facilities.	2020-04-22 20:58:09 +01:00
Lioncash	deb9dd4acc	block_of_code: Replace cast with [[maybe_unused]] in DoesCpuSupport()	2020-04-22 20:58:09 +01:00
Lioncash	3290a9fdc2	common: Remove address_range.h The AddressRange structure isn't used anywhere within the codebase, so this can be removed. Particularly because there's no real appeal/heavy potential use of it in the future that isn't trivial to add back if needed.	2020-04-22 20:57:38 +01:00
Lioncash	93351c7efb	a64_emit_x64: Make constness of loop elements explicit within GenFastmemFallbacks()	2020-04-22 20:57:37 +01:00
Lioncash	7752ffc50c	a64_emit_x64: Convert std::vector instances in GenFastmemFallbacks() to std::array Given these are quite small, we can avoid the need to heap allocate here.	2020-04-22 20:57:37 +01:00
MerryMage	7c8fcaef26	emit_x64_vector_floating_point: AVX && DN implementation of EmitFPVectorMulX	2020-04-22 20:57:37 +01:00
MerryMage	fa8925c4df	IR: Implement FPVectorMulX	2020-04-22 20:57:37 +01:00
V.Kalyuzhny	764a93bf5a	Switch boost::optional to std::optional	2020-04-22 20:57:37 +01:00
Lioncash	d69fceec55	value: Move ImmediateToU64() to be a part of Value's interface This'll make it slightly nicer to do basic constant folding for 32-bit and 64-bit variants of the same IR opcode type. By that, I mean it's possible to inspect immediate values without a bunch of conditional checks beforehand to verify that it's possible to call GetU32() or GetU64, etc.	2020-04-22 20:55:50 +01:00
MerryMage	ca603c1215	reg_alloc: Emit AVX instructions where able Smaller codesize.	2020-04-22 20:55:50 +01:00
MerryMage	e2358af5ef	abi: Emit AVX instructions where able Smaller codesize.	2020-04-22 20:55:50 +01:00
MerryMage	7c0378f56d	a64_exclusive_monitor: Loosen memory ordering requirements It is not necessary to be as strict as it was.	2020-04-22 20:55:50 +01:00
MerryMage	f0920c0ded	Fix VShift terminology An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift. - Rename VectorLogicalVShiftS* -> VectorArithmeticVShift* - Rename VectorLogicalVShiftU* -> VectorLogicalVShift*	2020-04-22 20:55:50 +01:00
MerryMage	b51dae790d	emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS16	2020-04-22 20:55:50 +01:00
MerryMage	bd47f2ca8f	emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftS64	2020-04-22 20:55:50 +01:00
MerryMage	3bf183d7e8	emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftS32	2020-04-22 20:55:50 +01:00
MerryMage	94f9d402eb	emit_x64_vector: AVX512 implementation of EmitVectorLogicalVShiftU16()	2020-04-22 20:55:50 +01:00
MerryMage	6d9639e3b0	emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU64()	2020-04-22 20:55:50 +01:00
MerryMage	bbc066a266	emit_x64_vector: AVX2 implementation of EmitVectorLogicalVShiftU32()	2020-04-22 20:55:50 +01:00
Lioncash	da2e7fad87	emit_x64_vector: SSSE3 variant of EmitVectorCountLeadingZeros8() pshufb lyfe	2020-04-22 20:55:50 +01:00
MerryMage	238f2f2cd0	a64_emit_x64: Lowercase PAGE_SIZE PAGE_SIZE is defined as a macro by musl.	2020-04-22 20:55:50 +01:00
MerryMage	7162f6f254	emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed	2020-04-22 20:55:50 +01:00
MerryMage	e7a5592699	emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE	2020-04-22 20:55:50 +01:00
MerryMage	b8fde48732	emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8	2020-04-22 20:55:50 +01:00
MerryMage	fd37b637aa	emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16	2020-04-22 20:55:50 +01:00
MerryMage	03ad2072a7	emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed	2020-04-22 20:55:06 +01:00
MerryMage	f9129db6fd	A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant	2020-04-22 20:55:06 +01:00
Lioncash	d426dfe942	ir: Add opcodes for unsigned saturating left shifts	2020-04-22 20:55:06 +01:00
MerryMage	02150bc0b7	IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed	2020-04-22 20:55:06 +01:00
MerryMage	90193b0e3d	IR: Add fbits argument to FixedToFP-related opcodes	2020-04-22 20:55:06 +01:00
Lioncash	b14eaaec46	ir: Add opcodes for left signed saturated shifts	2020-04-22 20:55:06 +01:00
Lioncash	a2cd643525	emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked Given this is just an internal helper function, it can be marked static.	2020-04-22 20:55:06 +01:00
Lioncash	c39ea2e3c9	perf_map: Use std::string_view instead of std::string for PerfMapRegister() We can just use a non-owning view into a string in this case instead of potentially allocating a std::string instance.	2020-04-22 20:55:06 +01:00
MerryMage	12243692f5	A64: Implement SQRDMULH (vector), vector variant	2020-04-22 20:55:06 +01:00
MerryMage	3e447614c6	IR: Add VectorSignedSaturatedDoublingMultiplyLong	2020-04-22 20:55:06 +01:00
MerryMage	06b31448aa	emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply * Return both the upper and lower parts of the multiply if required * SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead * Improve port utilisation where possible (punpck instructions were a bottleneck)	2020-04-22 20:55:06 +01:00
MerryMage	08c0e017a5	IR: Implement Vector{Signed,Unsigned}Multiply{16,32}	2020-04-22 20:55:06 +01:00
Lioncash	b6df34cdde	backend_x64/a64_interface: Re-enable the constant folding pass This was disabled for debugging, but never re-enabled. Just to be sure, testing was done downstream in yuzu to make sure this didn't happen to break anything (which seems to be the case).	2020-04-22 20:55:06 +01:00
MerryMage	06ba397af2	emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	e553c4fe8d	emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused	2020-04-22 20:55:06 +01:00
MerryMage	3caeb62ef1	emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	344ee76aba	emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64}	2020-04-22 20:55:06 +01:00
MerryMage	1492573267	emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32}	2020-04-22 20:55:06 +01:00
Lioncash	26df6e5e7b	emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks I had initially meant to use BitSize() here, not sizeof()	2020-04-22 20:55:06 +01:00
MerryMage	a4a26ac226	emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation	2020-04-22 20:55:06 +01:00
MerryMage	a7c66d2d28	emit_x64_vector: Simplify fpsr_qc related code Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously pay the cost of conversion for every saturation instruction.	2020-04-22 20:55:06 +01:00
Lioncash	e739624296	ir: Add opcodes for vector CLZ operations We can optimize these cases further with the use of a fair bit of shuffling via pshufb and the use of masks, but given the uncommon use of this instruction, I wouldn't consider it to be beneficial in terms of amount of code to be worth it over a simple manageable naive solution like this. If we ever do hit a case where vectorized CLZ happens to be a bottleneck, then we can revisit this. At least with AVX-512CD, this can be done with a single instruction for the 32-bit word case.	2020-04-22 20:55:05 +01:00
Lioncash	5653e7637e	emit_x64_vector: Remove unnecessary [[maybe_unused]] attributes These were unintentionally left in when introducing SUQADD and USQADD	2020-04-22 20:55:05 +01:00
Lioncash	d4a76aaa04	ir: Add opcodes for unsigned saturated accumulations of signed values	2020-04-22 20:55:05 +01:00
Lioncash	6f911a26da	ir: Add opcodes for signed saturated accumulations of unsigned values	2020-04-22 20:55:05 +01:00
Lioncash	b6e74fd17d	ir: Add opcodes for performing unsigned reciprocal square root estimates	2020-04-22 20:55:05 +01:00
Lioncash	af83360f89	ir: Add opcodes for unsigned reciprocal estimate	2020-04-22 20:55:05 +01:00
Lioncash	fca7eddb9e	A64: Add opcodes for signed saturating negations	2020-04-22 20:53:46 +01:00
Lioncash	f1ebbcd7bc	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtract() In the event position is zero, we can just treat it as a NOP, given there's no need to move the data.	2020-04-22 20:53:46 +01:00
Lioncash	87372917f9	emit_x64_vector: Simplify "position == 0" case for EmitVectorExtractLower() In the event position == 0, we can just treat it as a simple movq, clearing the upper half of the XMM register. This also makes that case use only one register.	2020-04-22 20:53:46 +01:00
MerryMage	8f9206901d	backend/x64: Do not clear fast_dispatch_table if not enabled There is no need to pay for the cost of setting a large block of memory if we're not using it.	2020-04-22 20:53:46 +01:00
MerryMage	9b65100660	A64: Implement FastDispatchHint	2020-04-22 20:53:46 +01:00
MerryMage	f96c43d422	A32: Implement FastDispatchHint	2020-04-22 20:53:46 +01:00
MerryMage	aa8d826c13	ir/terminal: Add FastDispatchHint	2020-04-22 20:53:46 +01:00
Lioncash	7ebfd0f31c	ir: Add opcodes for scalar signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	a0231e5546	ir: Add opcodes for signed saturated doubling multiplies	2020-04-22 20:53:46 +01:00
Lioncash	0507e47420	ir: Add opcodes for signed saturated absolute values	2020-04-22 20:53:46 +01:00
MerryMage	27427595b7	emit_x64_floating_point: EmitFPToFixed: maxsd optimization maxsd is not required when doing a signed conversion, because x64 produces a 0x80...00 value for out of range values.	2020-04-22 20:53:46 +01:00
MerryMage	1abf82ac4a	emit_x64_floating_point: ZeroIfNaN: pxor -> xorps xorps is shorter and more appropriate here.	2020-04-22 20:53:46 +01:00
Lioncash	4507627905	emit_x64_vector: Provide AVX path for EmitVectorMinU64()	2020-04-22 20:53:46 +01:00
Lioncash	fd49a62b06	emit_x64_vector: Provide AVX path for EmitVectorMinS64()	2020-04-22 20:53:46 +01:00
Lioncash	770723f449	emit_x64_vector: Provide AVX path for EmitVectorMaxU64()	2020-04-22 20:53:46 +01:00
Lioncash	8fb90c0cf1	emit_x64_vector: Provide AVX path for EmitVectorMaxS64()	2020-04-22 20:53:46 +01:00
Lioncash	2cac6ad129	emit_x64_vector: Simplify EmitVectorLogicalLeftShift8() Similar to EmitVectorLogicalRightShift8(), we can determine a mask ahead of time and just and the results of a halfword left shift.	2020-04-22 20:53:46 +01:00
Lioncash	135107279d	emit_x64_vector: Simplify EmitVectorLogicalShiftRight8() We can generate the mask and AND it against the result of a halfword shift instead of looping.	2020-04-22 20:53:46 +01:00
Lioncash	2952b46b16	emit_x64_vector: Amend value definition in SSE 4.1 path for EmitVectorSignExtend16() We should be defining the value after the results have been calculated to be consistent with the rest of the code.	2020-04-22 20:53:46 +01:00
Lioncash	fda19095ea	emit_x64_vector: Remove fallback in EmitVectorSignExtend64() This is fairly trivial to do manually.	2020-04-22 20:53:46 +01:00
Lioncash	39593fcd26	emit_x64_vector: Remove fallback for EmitVectorSignExtend32() We can just do the extension manually, which gets rid of the need to fall back here.	2020-04-22 20:53:46 +01:00
MerryMage	a12854857b	A32: Add define_unpredictable_behaviour option	2020-04-22 20:53:46 +01:00
MerryMage	d5b9c4a4bb	block_of_code: Hide NX support behind compiler flag Systems that require W^X can use the DYNARMIC_ENABLE_NO_EXECUTE_SUPPORT cmake option.	2020-04-22 20:53:46 +01:00
MerryMage	de4494ffa5	Implement perfmap	2020-04-22 20:53:46 +01:00
MerryMage	f73104633b	a32_emit_x64: Fix incorrect BMI2 implementation for SetCpsr * The MSB for each byte in cpsr_ge were not being appropriately set. * We also expand test coverage to test this case. * We fix the disassembly of the MSR (imm) and MSR (reg) instructions as well.	2020-04-22 20:53:46 +01:00
MerryMage	3432a08e0a	backend/x64: Support W^X systems Closes #176.	2020-04-22 20:53:46 +01:00
BreadFish64	2a65442933	Backend: Create "backend" folder similar to the "frontend" folder	2020-04-22 20:53:46 +01:00
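
Commits 2cac6ad129 and 135107279d in the list above describe replacing a per-byte loop with a halfword shift followed by an AND against a mask determined ahead of time. What follows is a minimal scalar sketch of that idea for the left-shift case in portable C++, not the emitter code from the repository. The names Bytes, ShiftLeft8Reference and ShiftLeft8ViaHalfwords are hypothetical, and the 16-byte array only stands in for an XMM register.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 128-bit vector register holding 16 bytes.
using Bytes = std::array<std::uint8_t, 16>;

// Reference behaviour: shift every byte on its own.
Bytes ShiftLeft8Reference(const Bytes& v, unsigned shift) {
    assert(shift < 8);
    Bytes out{};
    for (std::size_t i = 0; i < v.size(); ++i) {
        out[i] = static_cast<std::uint8_t>(v[i] << shift);
    }
    return out;
}

// Mask approach: shift 16-bit lanes, then clear the bits that leaked
// from the low byte of each lane into its high byte.
Bytes ShiftLeft8ViaHalfwords(const Bytes& v, unsigned shift) {
    assert(shift < 8);
    // Per-byte mask, e.g. 0xF8 for shift == 3, replicated into both lane bytes.
    const std::uint8_t byte_mask = static_cast<std::uint8_t>(0xFFu << shift);
    const std::uint16_t lane_mask = static_cast<std::uint16_t>(byte_mask * 0x0101u);

    Bytes out{};
    for (std::size_t i = 0; i < v.size(); i += 2) {
        // Assemble a little-endian 16-bit lane from two adjacent bytes.
        const std::uint16_t lane = static_cast<std::uint16_t>(v[i] | (v[i + 1] << 8));
        const std::uint16_t shifted = static_cast<std::uint16_t>((lane << shift) & lane_mask);
        out[i] = static_cast<std::uint8_t>(shifted & 0xFF);
        out[i + 1] = static_cast<std::uint8_t>(shifted >> 8);
    }
    return out;
}
```

The mask depends only on the shift amount, so it can be computed once before any data is touched. This presumably is what lets an emitter use one halfword shift and one AND instead of a loop.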

258 commits
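
Commit b37279f65c above concerns undefined behavior from left-shifting a negative signed value. One common way to sidestep that in C++ before C++20 is to do the shift on the unsigned counterpart of the type and convert back. The sketch below illustrates that general technique only; it is not taken from the commit, and the helper name ShiftLeftSigned is hypothetical. The conversion back to the signed type is implementation-defined before C++20, although mainstream compilers wrap modulo 2^N.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical helper: left-shift a signed integer without invoking the
// pre-C++20 undefined behavior of shifting a negative value.
template <typename T>
T ShiftLeftSigned(T value, unsigned amount) {
    static_assert(std::is_signed<T>::value, "T must be a signed integer type");
    assert(amount < sizeof(T) * 8);

    using U = typename std::make_unsigned<T>::type;
    // The shift happens on a non-negative value, so bits shifted out are
    // simply discarded once the result is truncated back to U.
    const U shifted = static_cast<U>(static_cast<U>(value) << amount);
    // Assumption: the unsigned-to-signed conversion wraps (guaranteed from
    // C++20 onwards, implementation-defined before that).
    return static_cast<T>(shifted);
}
```

For example, ShiftLeftSigned<std::int32_t>(-1, 4) is expected to give -16 on a compiler where the final conversion wraps.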