dynarmic

Author	SHA1	Message	Date
VelocityRa	c30b8dbe99	decoders: Cast to correctly-sized type before shifting. Fixes decoding for 64-bit instructions. Does not help/apply to any currently supported ARM versions (since all are 32-bit length or below); it's for future-proofing should such an arch be supported.	2020-04-22 20:55:50 +01:00
MerryMage	7162f6f254	emit_x64_vector_floating_point: SSE4.1 implementation of EmitFPVectorToFixed	2020-04-22 20:55:50 +01:00
MerryMage	e7a5592699	emit_x64_vector_floating_point: EmitFPVectorRoundInt: Use FCODE	2020-04-22 20:55:50 +01:00
MerryMage	b8fde48732	emit_x64_vector: AVX implementation for EmitVectorCountLeadingZeros8	2020-04-22 20:55:50 +01:00
MerryMage	fd37b637aa	emit_x64_vector: SSE implementation of EmitVectorCountLeadingZeros16	2020-04-22 20:55:50 +01:00
MerryMage	09bf273bc8	A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant	2020-04-22 20:55:06 +01:00
MerryMage	03ad2072a7	emit_x64_floating_point: Reduce fallback LUT code in EmitFPToFixed	2020-04-22 20:55:06 +01:00
MerryMage	f9129db6fd	A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant	2020-04-22 20:55:06 +01:00
Lioncash	48df9b9a7d	A64: Implement UQSHL's vector immediate and register variants	2020-04-22 20:55:06 +01:00
Lioncash	d426dfe942	ir: Add opcodes for unsigned saturating left shifts	2020-04-22 20:55:06 +01:00
Lioncash	ab60720418	A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants. Makes them all consistent, so it isn't necessary to change the prototypes over when implementing them.	2020-04-22 20:55:06 +01:00
Lioncash	6b5ea6ee66	A64: Implement BRK. Currently, we can just implement this as part of the exception interface, similar to how it's done for the A32 interface with BKPT.	2020-04-22 20:55:06 +01:00
Lioncash	b915364c16	A64/imm: Add full range of comparison operators to Imm template. Makes the comparison interface consistent by providing all of the relevant members. This also modifies the comparison operators to take the Imm instance by value, as it's really only a u32 under the covers, and it's cheaper to shuffle around a u32 than a 64-bit pointer address.	2020-04-22 20:55:06 +01:00
MerryMage	02150bc0b7	IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed	2020-04-22 20:55:06 +01:00
MerryMage	027b0ef725	A64: Implement SCVTF, UCVTF (scalar, fixed-point)	2020-04-22 20:55:06 +01:00
MerryMage	8051f60db0	opcodes.inc: Align columns to a tabstop of 4	2020-04-22 20:55:06 +01:00
MerryMage	90193b0e3d	IR: Add fbits argument to FixedToFP-related opcodes	2020-04-22 20:55:06 +01:00
Lioncash	616a153c16	A64: Implement SQSHL's vector immediate variant	2020-04-22 20:55:06 +01:00
Lioncash	e8b0f25dff	A64: Implement SQSHL's vector register variant	2020-04-22 20:55:06 +01:00
Lioncash	b14eaaec46	ir: Add opcodes for left signed saturated shifts	2020-04-22 20:55:06 +01:00
Lioncash	da55ed7b31	branch: Make variables const where applicable	2020-04-22 20:55:06 +01:00
Lioncash	867b666285	move_wide: Make variables const where applicable	2020-04-22 20:55:06 +01:00
Lioncash	78024a9dc4	load_store_register_unprivileged: Make variables const where applicable	2020-04-22 20:55:06 +01:00
Lioncash	e45e5da610	load_store_register_immediate: Place conditional bodies on their own line. Makes the conditionals visually consistent with the rest of the codebase.	2020-04-22 20:55:06 +01:00
Lioncash	b586cf3f56	load_store_load_literal: Make variables const where applicable	2020-04-22 20:55:06 +01:00
Lioncash	c3a3b9687e	data_processing_logical: Move datasize declarations after early-exit conditionals. While we're at it, make variables const where applicable.	2020-04-22 20:55:06 +01:00
Lioncash	ed797e6540	data_processing_conditional_select: Make variables const where applicable. Makes CSEL's function consistent with all of the others.	2020-04-22 20:55:06 +01:00
Lioncash	c82fa5ec5a	data_processing_addsub: Move datasize declarations after early-exit conditionals. While we're at it, also make relevant variables const where applicable.	2020-04-22 20:55:06 +01:00
Lioncash	f4a66d2477	data_processing_bitfield: Move datasize variables after early-exit conditionals. Moves the declaration of datasize to the scope that it's used within. This also takes the opportunity to apply const where applicable, and make early-exits all vertically consistent with one another.	2020-04-22 20:55:06 +01:00
Lioncash	2e0fcd6161	A64: Implement CLS's vector variant. Leverages CLZ like the integral variant does.	2020-04-22 20:55:06 +01:00
Lioncash	a2cd643525	emit_x64_vector: Make EmitVectorUnsignedSaturatedAccumulateSigned() internally linked. Given this is just an internal helper function, it can be marked static.	2020-04-22 20:55:06 +01:00
Lioncash	c39ea2e3c9	perf_map: Use std::string_view instead of std::string for PerfMapRegister(). We can just use a non-owning view into a string in this case instead of potentially allocating a std::string instance.	2020-04-22 20:55:06 +01:00
MerryMage	12243692f5	A64: Implement SQRDMULH (vector), vector variant	2020-04-22 20:55:06 +01:00
MerryMage	a9ffcf08b1	A64: Implement SQDMULL (vector), vector variant	2020-04-22 20:55:06 +01:00
MerryMage	3e447614c6	IR: Add VectorSignedSaturatedDoublingMultiplyLong	2020-04-22 20:55:06 +01:00
MerryMage	06b31448aa	emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply: (1) return both the upper and lower parts of the multiply if required; (2) SSE2 does not support the pmuldq instruction, so do sign correction to an unsigned result instead; (3) improve port utilisation where possible (punpck instructions were a bottleneck).	2020-04-22 20:55:06 +01:00
MerryMage	08c0e017a5	IR: Implement Vector{Signed,Unsigned}Multiply{16,32}	2020-04-22 20:55:06 +01:00
Lioncash	b6df34cdde	backend_x64/a64_interface: Re-enable the constant folding pass. This was disabled for debugging, but never re-enabled. Just to be sure, testing was done downstream in yuzu to confirm this doesn't break anything, and it does not appear to.	2020-04-22 20:55:06 +01:00
MerryMage	06ba397af2	emit_x64_vector_floating_point: Hardware FMA implementation for RSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	e553c4fe8d	emit_x64_vector_floating_point: Hardware FMA implementation of FPVectorRecipStepFused	2020-04-22 20:55:06 +01:00
MerryMage	3caeb62ef1	emit_x64_floating_point: Hardware FMA implementation of FPRSqrtStepFused	2020-04-22 20:55:06 +01:00
MerryMage	344ee76aba	emit_x64_floating_point: Hardware FMA implementation of FPRecipStepFused{32,64}	2020-04-22 20:55:06 +01:00
MerryMage	1492573267	emit_x64_vector: SSE implementation of VectorSignedSaturatedAccumulateUnsigned{8,16,32}	2020-04-22 20:55:06 +01:00
Lioncash	26df6e5e7b	emit_x64_vector: Correct static asserts for < 64-bit type checks in saturated accumulate fallbacks. I had initially meant to use BitSize() here, not sizeof().	2020-04-22 20:55:06 +01:00
MerryMage	a4a26ac226	emit_x64_vector: EmitVectorSignedSaturatedAccumulateUnsigned64: SSE implementation	2020-04-22 20:55:06 +01:00
MerryMage	a7c66d2d28	emit_x64_vector: Simplify fpsr_qc related code. Move the bool conversion into A64JitState::GetFpsr so we don't have to continuously pay the cost of conversion for every saturation instruction.	2020-04-22 20:55:06 +01:00
Lioncash	112cff9ab9	A64: Implement CLZ's vector variant	2020-04-22 20:55:06 +01:00
Lioncash	e739624296	ir: Add opcodes for vector CLZ operations. We can optimize these cases further with a fair bit of shuffling via pshufb and the use of masks, but given the uncommon use of this instruction, I don't consider the extra code worth it over a simple, manageable naive solution like this. If we ever do hit a case where vectorized CLZ happens to be a bottleneck, then we can revisit this. At least with AVX-512CD, this can be done with a single instruction for the 32-bit word case.	2020-04-22 20:55:05 +01:00
MerryMage	d4c37a68a8	A64/translate: VectorZeroUpper for V(64) stores. Ensures correctness.	2020-04-22 20:55:05 +01:00
MerryMage	b8daa4feac	simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper	2020-04-22 20:55:05 +01:00

3403 commits