dynarmic

Author	SHA1	Message	Date
MerryMage	70ff2d73b5	A64: Implement UADDLP	2020-04-22 20:46:19 +01:00
MerryMage	5563bbbd79	A64: Implement EXT	2020-04-22 20:46:19 +01:00
MerryMage	304cc7f61e	emit_x64_floating_point: SSE4.1 implementation for FP{Double,Single}ToFixed{S,U}{32,64}	2020-04-22 20:46:19 +01:00
MerryMage	caaf36dfd6	IR: Initial implementation of FP{Double,Single}ToFixed{S,U}{32,64} This implementation just falls-back to the software floating point implementation.	2020-04-22 20:46:19 +01:00
MerryMage	760cc3ca89	EmitContext: Expose FPCR	2020-04-22 20:46:19 +01:00
MerryMage	3cb98e1560	fp: Move fp_util to fp/util	2020-04-22 20:46:19 +01:00
Lioncash	4aa4885ba7	ir: Add opcodes for vector conversion of u32/u64 to floating-point	2020-04-22 20:46:19 +01:00
Lioncash	7a84b6e8d8	ir: Add opcodes for converting S64 and U64 to single-precision floating-point values	2020-04-22 20:46:19 +01:00
Lioncash	066061fa50	constant_pool: Remove unnecessary std::memset from constructor AllocateFromCodeSpace() already zeroes out the allocated memory.	2020-04-22 20:46:19 +01:00
Lioncash	35026a6ce3	emit_x64_vector: Vectorize fallback path for EmitVectorMaxU32()	2020-04-22 20:46:19 +01:00
MerryMage	0b97e9bd8d	emit_x64_floating_point: Fix EmitFPU64ToDouble for TowardsMinusInfinity rounding mode	2020-04-22 20:46:18 +01:00
MerryMage	a2eb9a02e0	backend_x86: Add FPSCR_RMode to EmitContext	2020-04-22 20:46:18 +01:00
MerryMage	d875c08ebf	fp: Extract common RoundingMode enum	2020-04-22 20:46:18 +01:00
Lioncash	7252293184	emit_x64_floating_point: Correct use of UseGpr() in EmitFPU32ToDouble() and EmitFPU32ToSingle() In the non-AVX512 path, the following code is present: code.mov(from.cvt32(), from.cvt32()); since this potentially modifies 'from', we should be using UseScratchGpr() instead.	2020-04-22 20:46:18 +01:00
Lioncash	fbd7623fe5	emit_x64_floating_point: Add AVX512F conversion operations to EmitFPU32ToSingle() and EmitFPU32ToDouble() AVX-512F provides convenient instructions for these kinds of conversions directly	2020-04-22 20:46:18 +01:00
Lioncash	3a41465eaf	ir: Add opcodes for converting S64 and U64 to double-precision values	2020-04-22 20:46:18 +01:00
MerryMage	436ca80bcd	Merge branch 'global_monitor'	2020-04-22 20:46:18 +01:00
MerryMage	821cff1227	A64: Add ClearExclusiveState method	2020-04-22 20:46:18 +01:00
Lioncash	81e572c78c	ir: Extend FPVectorAbs opcode to also handle 16-bit elements for FP16	2020-04-22 20:46:18 +01:00
MerryMage	2a8de5f733	a64_emit_x64: Clear exclusive state in EmitA64CallSupervisor The kernel would have to execute an ERET instruction to return to userland; this clears exclusive state.	2020-04-22 20:46:18 +01:00
MerryMage	57f7c7e1b0	Implement global exclusive monitor	2020-04-22 20:46:18 +01:00
MerryMage	85234338d3	a64_emit_x64: Simplify EmitExclusiveWrite	2020-04-22 20:46:18 +01:00
Lioncash	fc731dddae	ir: Add opcodes for performing vector absolute floating-point values This will be usable for implementing FACGE and FACGT	2020-04-22 20:46:18 +01:00
Lioncash	0bee648b4f	emit_x64_vector: Deduplicate a bit of code in EmitVectorSetElement{8, 32, 64} functions Given both branches are the same, we can hoist out the common code.	2020-04-22 20:46:18 +01:00
Lioncash	b6e223fc58	emit_x64_vector: Deduplicate a bit of code within EmitVectorGetElement8() Given both branches use the same destination register size, we can hoist the common code out.	2020-04-22 20:46:18 +01:00
Lioncash	5ce187a54e	ir: Add opcodes for floating-point vector equalities	2020-04-22 20:46:18 +01:00
Lioncash	cf188448d4	emit_x64_vector: Vectorize fallback case in EmitVectorMultiply64() Gets rid of the need to perform a fallback.	2020-04-22 20:46:18 +01:00
Lioncash	954deff2d4	emit_x64_vector: Add break to final case in EmitVectorRoundingHalvingAddUnsigned() This doesn't alter behavior but does make the code better if anything else is ever added to this function in the future.	2020-04-22 20:46:18 +01:00
Lioncash	bc718c5b28	ir: Add opcodes for performing rounding halving adds	2020-04-22 20:46:18 +01:00
Lioncash	054549da35	emit_x64_vector: Simplify AVX-512 codepath in EmitVectorMultiply64 I realized I introduced a helper for simple AVX operation emitting, so use that instead of writing it all out long-form.	2020-04-22 20:46:18 +01:00
Lioncash	8a4f8aed06	ir: Add opcode for performing FP vector absolute differences	2020-04-22 20:46:18 +01:00
MerryMage	8c90fcf58e	IR: Implement FPMulAdd	2020-04-22 20:46:18 +01:00
Lioncash	c695da1cf3	ir: Add opcode for floating-point GE and GT comparisons The rest of the comparisons can be implemented in terms of these two	2020-04-22 20:46:18 +01:00
Lioncash	6de5ed96e5	emit_x64_vector: Emit VPMULLQ in EmitVectorMultiply64 on AVX-512{DQ, VL} capable CPUs Shortens code-gen down to a single instruction in the 64-bit path.	2020-04-22 20:46:18 +01:00
Lioncash	1e10017f4b	ir: Add opcodes for signed absolute differences	2020-04-22 20:46:17 +01:00
Lioncash	3f6c529da2	ir: Add opcode to perform the vector conversion S64->F64 Unfortunately x86 prior to AVX-512 doesn't really give us any convenient instruction to do the work for us	2020-04-22 20:46:17 +01:00
Lioncash	44a5f8095a	ir: Add opcodes for performing vector halving subtracts	2020-04-22 20:46:17 +01:00
Lioncash	b312d28295	ir: Add an opcode for doing an SM4 lookup table query	2020-04-22 20:46:17 +01:00
Lioncash	27a6d5f6ce	emit_x64_vector: Use VPOPCNTB in EmitVectorPopulationCount() if AVX-512 BITALG is available	2020-04-22 20:46:17 +01:00
Lioncash	089096948a	ir: Add opcodes for performing halving adds	2020-04-22 20:46:17 +01:00
Lioncash	3d00dd63b4	emit_x64_vector: Emit VPMINSQ and VPMINUQ for 64-bit vector min operations if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	b97b71b8aa	emit_x64_vector: Emit VPMAXSQ and VPMAXUQ for 64-bit vector max operations if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	033e400df0	emit_x64_vector_floating_point: Deduplicate accurate NaN handling code Allows the code to both be used from the 32 bit and 64 bit operations without duplicating code.	2020-04-22 20:46:17 +01:00
Lioncash	0f067b7330	emit_x64_vector: Emit VPABSQ in EmitVectorAbs() for the 64-bit case if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	d4ee878cbd	emit_x64_vector: Use VPSRAQ in EmitVectorArithmeticShiftRight64() if AVX-512VL is available	2020-04-22 20:46:17 +01:00
Lioncash	51e4f1d9db	emit_x64_vector: Vectorize fallback path of EmitVectorMaxS32()	2020-04-22 20:46:17 +01:00
Lioncash	c692ccdd6d	emit_x64_vector: Vectorize fallback path of EmitVectorMaxS8()	2020-04-22 20:46:17 +01:00
Lioncash	b194313d8c	emit_x64_vector: Vectorize fallback path in EmitVectorMinU32()	2020-04-22 20:46:17 +01:00
Lioncash	7ceda6d919	emit_x64_vector: Vectorize fallback path in EmitVectorMinU16()	2020-04-22 20:46:17 +01:00
Lioncash	cda85a1da0	emit_x64_vector: Vectorize fallback path in EmitVectorMinS32()	2020-04-22 20:46:17 +01:00
Lioncash	6e08eed210	emit_x64_vector: Vectorize fallback path in EmitVectorMinS8()	2020-04-22 20:46:17 +01:00
Lioncash	0fb6dce689	emit_x64_vector: Remove unnecessary if constexpr expression in LogicalVShift This can simply be merged with the previous one.	2020-04-22 20:46:17 +01:00
Lioncash	5b71b1337b	emit_x64_vector: Avoid left shift of negative value in LogicalVShift Now that we handle the signed variants, we also have to be careful about left shifts with negative values, as this is considered undefined behavior.	2020-04-22 20:46:17 +01:00
Lioncash	9954d28868	a64_jitstate: Zero SP and PC on construction of A64JitState Given we zero out/reset everything else in the struct, do the same for these members to keep initialization consistent	2020-04-22 20:46:17 +01:00
Lioncash	4efbd40ea4	backend_x64/callback: Default virtual destructor in the cpp file Prevents the vtable being generated in each translation unit that includes the header (and silences -Wweak-vtables warnings)	2020-04-22 20:46:17 +01:00
Lioncash	edd0b5c8c7	a32_interface/a64_interface: Change reinterpret_casts to static_casts in GetCurrentBlock thunks It's well-defined to static_cast a void* to its proper type.	2020-04-22 20:46:17 +01:00
Lioncash	21974ee57e	backend_x64/ir: Amend generic LogicalVShift() template to also handle signed variants Also adds IR opcodes to dispatch said variants	2020-04-22 20:46:17 +01:00
Lioncash	9fc89f0a0e	emit_x64_vector_floating_point: Use arrays for retrieving size instead of hardcoding the size Similar changes were done in emit_x64_vector, but these were missed.	2020-04-22 20:46:17 +01:00
Lioncash	af28e89a13	emit_x64_vector: Vectorize fallback path in EmitVectorMaxU16()	2020-04-22 20:46:17 +01:00
Lioncash	0d20423ad5	emit_x64_vector: Vectorize non-SSE4.1 fallback path for VectorMultiply32()	2020-04-22 20:46:17 +01:00
Lioncash	d70ee7c0d1	emit_x64_vector: Use VBPROADCAST where applicable and available Uses the instruction that does what it says in its name if available. Allows avoiding the use of a scratch register in EmitVectorBroadcast8() and EmitVectorBroadcastLower8()'s SSSE3 path.	2020-04-22 20:46:17 +01:00
Lioncash	26d77c6f09	ir: Add opcodes for performing vector deinterleaving	2020-04-22 20:46:17 +01:00
Lioncash	87ca63699f	emit_x64_vector: Emit PMAXUD in EmitVectorMaxU32 on SSE4.1-capable CPUs	2020-04-22 20:46:17 +01:00
Lioncash	f17702f608	emit_x64_vector: Emit PMINUD in EmitVectorMinU32 on SSE4.1-capable CPUs	2020-04-22 20:46:17 +01:00
Lioncash	596a8dd1dd	emit_x64_vector: Emit PMINSD in EmitVectorMinS32 on SSE4.1-capable CPUs Provides a better alternative to a fallback operation.	2020-04-22 20:46:17 +01:00
Lioncash	75fd4eaaaa	emit_x64_vector: Get rid of some magic numbers in loop bounds	2020-04-22 20:46:17 +01:00
Lioncash	7b80ac25eb	emit_x64_vector: Generify variable shift functions	2020-04-22 20:46:17 +01:00
Lioncash	38fa984b53	IR: Add opcode for packed word->f32 conversions	2020-04-22 20:46:16 +01:00
Lioncash	64b1f2d468	ir: Add opcode for reversing bits in a vector	2020-04-22 20:46:15 +01:00
Lioncash	e33dcce14a	ir: Add opcodes for performing vector absolute values	2020-04-22 20:46:15 +01:00
MerryMage	3472f371df	IR: Implement VectorExtract, VectorExtractLower IR instructions	2020-04-22 20:46:15 +01:00
MerryMage	5c47f03888	A64: Implement FMUL (vector)	2020-04-22 20:46:15 +01:00
Lioncash	ad5cf584ce	ir: Add opcodes for performing vector unsigned absolute differences	2020-04-22 20:46:15 +01:00
Lioncash	701f43d61e	IR: Add opcodes for interleaving upper-order bytes/halfwords/words/doublewords I should have added this when I introduced the functions for interleaving low-order equivalents for consistency in the interface.	2020-04-22 20:46:15 +01:00
Lioncash	3985f7bf84	emit_x64_data_processing: Deduplicate some code in zero-extension functions EmitZeroExtendByteToLong() can be implemented in terms of EmitZeroExtendByteToWord() and EmitZeroExtendHalfToLong() can be implemented in terms of EmitZeroExtendHalfToWord().	2020-04-22 20:46:15 +01:00
MerryMage	e7b60189b3	abi: Missing includes'	2020-04-22 20:46:15 +01:00
MerryMage	cdc5c3ad95	emit_x64_floating_point: Near jump instead of short jump in FPMinNumberic{32,64}	2020-04-22 20:46:15 +01:00
MerryMage	df4ee0f51e	emit_X64_floating_point: Near jmp to end instead of short jmp Jump destination can be further than what can be reached in a short jump under some FPCR options.	2020-04-22 20:46:15 +01:00
Lioncash	b8d5765f9b	emit_x64_vector: Fix typo in VectorShuffleImpl This is supposed to be pshufd, not pshufw (which only allows a 64-bit operand)	2020-04-22 20:46:15 +01:00
Lioncash	6b0010c940	ir: Add IR opcodes for emitting vector shuffles This uses the ARM terminology for sizes (Halfword -> 2 bytes, Word -> 4 bytes) as opposed to the x86 terminology of (Word -> 2 bytes, Double word -> 4 bytes)	2020-04-22 20:46:15 +01:00
Lioncash	eb2d28d2b1	emit_x64_vector_floating_point: Fix out of bounds array access in EmitVectorOperation64	2020-04-22 20:46:15 +01:00
MerryMage	49cc6d7fad	A64: Implement FDIV (vector)	2020-04-22 20:46:15 +01:00
MerryMage	c832cec96d	Correct FPSR and FPCR	2020-04-22 20:46:15 +01:00
MerryMage	147284427b	A64: Implement USHL	2020-04-22 20:46:15 +01:00
MerryMage	e4697b1676	A64: Implement system register TPIDR_EL0	2020-04-22 20:46:15 +01:00
MerryMage	e3da92024e	A64: Implement system registers FPCR and FPSR	2020-04-22 20:46:15 +01:00
MerryMage	9e4e4e9c1d	A64: Implement system register CNTPCT_EL0	2020-04-22 20:46:15 +01:00
MerryMage	1e15283d00	A64: Implement system register CTR_EL0	2020-04-22 20:46:15 +01:00
MerryMage	710d09471b	IR: Add IR instruction ZeroVector	2020-04-22 20:46:15 +01:00
MerryMage	2721bb5ace	emit_x64_floating_point: Add maybe_unused to preprocess parameter	2020-04-22 20:46:15 +01:00
MerryMage	0575e7421b	A64: Implement FMINNM (scalar)	2020-04-22 20:46:15 +01:00
MerryMage	1c9804ea07	A64: Implement FMAXNM (scalar)	2020-04-22 20:46:15 +01:00
MerryMage	1dfce0894d	constant_pool: Add frame parameter	2020-04-22 20:46:14 +01:00
MerryMage	84f1c9b7f4	reg_alloc: Only exchange GPRs	2020-04-22 20:46:14 +01:00
MerryMage	6541ec064d	emit_x64_floating_point: Correct FP{Max,Min}{32,64} implementations for -0/+0	2020-04-22 20:46:14 +01:00
MerryMage	7c193485e1	a64/config: Allow NaN emulation accuracy to be set	2020-04-22 20:46:14 +01:00
MerryMage	a3df46a75a	a64_emit_x64: Add conf to A64EmitContext	2020-04-22 20:46:14 +01:00
MerryMage	07520f32c3	backend_x64: Accurately handle NaNs	2020-04-22 20:46:14 +01:00
MerryMage	e97581d063	fuzz_with_unicorn: Print AArch64 disassembly	2020-04-22 20:46:14 +01:00
MerryMage	47c0ad0fc8	IR: Implement Vector{Max,Min}{Signed,Unsigned}	2020-04-22 20:46:14 +01:00

1 2 3 4 5 ...

573 commits