Lioncash
3a3542414b
A64: Implement FRECPX's half-precision floating point variant
2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef
frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677
frontend/ir/value: Add U16U32U64 type to represent floating point types
2020-04-22 20:58:11 +01:00
Lioncash
126c29a9e9
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
...
These can just be implemented in terms of the vector variants for the
time being.
2020-04-22 20:58:11 +01:00
Lioncash
dd7433f9d3
A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
...
These take a vector for a destination.
2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2
Merge pull request #447 from lioncash/flag
...
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
fb039e232c
Merge pull request #442 from lioncash/fcvtxn
...
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Merry
4f937c1ee1
Merge pull request #446 from lioncash/sqshl
...
A64: Implement scalar variants of SQSHL (register) and UQSHL (register)
2020-04-22 20:58:11 +01:00
Lioncash
aa22db534b
A64: Implement AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
d74cccbc84
Merge pull request #445 from lioncash/sqrt
...
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:11 +01:00
Lioncash
20ffe568d0
A64: Implement RMIF
2020-04-22 20:58:11 +01:00
Merry
6d7e7c3269
Merge pull request #443 from lioncash/flag
...
A64: Rearrange flag format/manipulation instructions
2020-04-22 20:58:11 +01:00
Lioncash
51b526e453
A64: Implement CFINV
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5
ir: Add A64-specific opcodes for getting and setting raw NZCV values
...
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Lioncash
d3515279df
A64: Implement the vector version of FCVTXN
2020-04-22 20:58:10 +01:00
Lioncash
17aea0b997
A64: Implement UQSHL (register)'s scalar variant
...
This can be implemented in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
c99d4b762e
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:10 +01:00
Lioncash
54e0b487f3
A64: Rearrange flag format/manipulation instructions
...
Gives these instructions better categorical labeling.
2020-04-22 20:58:10 +01:00
Lioncash
302f56b36a
A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
...
Rather than straight-up treating them as undefined, we can fall back to an
interpreter in this case.
2020-04-22 20:58:10 +01:00
Lioncash
4339a8fff6
A64: Implement the scalar version of FCVTXN
2020-04-22 20:58:10 +01:00
Lioncash
35ddf68ad5
A64: Implement SQSHL (register)'s scalar variant
...
We can implement this in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
5cf1478620
frontend/ir: Add opcodes for vector square roots
2020-04-22 20:58:10 +01:00
Lioncash
36027ebef5
frontend/ir/microinstruction: Add missing cases for FPRecipExponent{32,64} for ReadsFromAndWritesToFPSRCumulativeExceptionBits()
...
This was intended to be added within #437 , but was missed
2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3
frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
...
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Merry
40b081438a
Merge pull request #439 from lioncash/fcmla
...
A64: Implement FCADD and FCMLA
2020-04-22 20:58:10 +01:00
Merry
d91192681a
Merge pull request #438 from lioncash/fmulx
...
A64: Implement scalar double/single precision FMULX (by element)
2020-04-22 20:58:10 +01:00
Lioncash
ed29ef8cca
A64: Implement FCMLA
2020-04-22 20:58:10 +01:00
Merry
9f11720a69
Merge pull request #437 from lioncash/frecpx
...
A64: Implement FRECPX (single, double precision)
2020-04-22 20:58:10 +01:00
Lioncash
bdcea0b0dc
A64: Implement scalar double/single precision FMULX (by element)
2020-04-22 20:58:10 +01:00
Lioncash
5ce17574f9
A64: Implement FCADD
2020-04-22 20:58:10 +01:00
Merry
34d917f34e
Merge pull request #436 from lioncash/no-alloc
...
A64: Implement LDNP/STNP
2020-04-22 20:58:10 +01:00
Lioncash
e44730ba6d
A64: Implement FRECPX (single, double precision)
2020-04-22 20:58:10 +01:00
Lioncash
bfaeb08d3c
A64: Implement LDNP/STNP
...
LDNP and STNP indicate that a memory access is non-temporal/streaming
(i.e. unlikely to be repeated), allowing data caching to not be
performed. However, given this is only a hint, we can treat these two
instructions as regular LDP and STP instructions for the time being.
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811
frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents
2020-04-22 20:58:10 +01:00
Lioncash
05a6ab691d
translate_arm/coprocessor: Minor tidying up
2020-04-22 20:58:10 +01:00
Lioncash
1e32a09c03
translate_arm/vfp2: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
e209b31073
translate_arm/synchronization: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
9514e3602e
translate_arm/status_register_access: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c6aa1a708a
translate_arm/saturated: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
a72813599a
translate_arm/reversal: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
7be56e6b67
translate_arm/parallel: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
3c00a616d6
translate_arm/packing: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c711188f46
translate_arm/multiply: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c8dad40d81
translate_arm/misc: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
a7bf5ff77d
translate_arm/load_store: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
f4b19a7393
translate_arm/extension: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
c2de6ecfd0
translate_arm/exception_generating: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
d8a8d3b073
translate_arm/data_processing: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
df5c51ff47
translate_arm/branch: Invert conditionals where applicable
...
Allows unindenting code a bit.
2020-04-22 20:58:09 +01:00
Lioncash
ee973f13c7
frontend/A32/ir_emitter: Mark PC() and AlignPC() as const-qualified member functions
...
These don't modify instance state, so they can be const-qualified member
functions.
2020-04-22 20:57:38 +01:00
Lioncash
3a2dd09122
frontend/A64/ir_emitter: Mark PC() and AlignPC() as const qualified member functions
...
These don't actually alter any instance state.
2020-04-22 20:57:38 +01:00
MerryMage
e3898e628e
A64: Implement FMULX (by element), single and double precision variants
2020-04-22 20:57:37 +01:00
MerryMage
c106d8cedf
A64: Implement FMULX, vector single-precision and double-precision variant
2020-04-22 20:57:37 +01:00
MerryMage
fa8925c4df
IR: Implement FPVectorMulX
2020-04-22 20:57:37 +01:00
Michał Janiszewski
bbd8abaa25
Provide justification for always-true condition ( #412 )
2020-04-22 20:57:37 +01:00
V.Kalyuzhny
764a93bf5a
Switch boost::optional to std::optional
2020-04-22 20:57:37 +01:00
Lioncash
f1a66c37ba
a64: Add ARMv8.4+ instructions encodings to the encoding table
...
Keeps the table up to date with the ARM specification.
2020-04-22 20:57:37 +01:00
Lioncash
0583d401e3
ir/value: Add IsSignedImmediate() and IsUnsignedImmediate() functions to Value's interface
...
This allows testing against arbitrary values while also simultaneously
eliminating the need to check IsImmediate() all the time in expressions.
2020-04-22 20:57:37 +01:00
Lioncash
e3258e8525
ir/value: Add a GetImmediateAsS64() function
...
Provides a signed analogue to GetImmediateAsU64() for consistency with
both integral classes when it comes to signed/unsigned..
2020-04-22 20:57:37 +01:00
Lioncash
4a3c064b15
ir/value: Add an IsZero() member function to Value's interface
...
By far, one of the most common things to check for is whether or not a
value is zero, as it typically allows folding away unnecesary
operations (other close contenders that can help with eliding operations are 1 and -1).
So instead of requiring a check for an immediate and then actually
retrieving the integral value and checking it, we can wrap it within a
function to make it more convenient.
2020-04-22 20:57:37 +01:00
Merry
c649f11c0a
Merge pull request #401 from lioncash/folding
...
constant_propagation_pass: Fold &, |, ^, and ~ operations where applicable
2020-04-22 20:56:01 +01:00
MerryMage
2524d536b0
A32/ir_emitter: Bugfix: ExceptionRaised was producing incorrect PC
...
Use actual PC and not pipelined PC.
2020-04-22 20:56:01 +01:00
Lioncash
d69fceec55
value: Move ImmediateToU64() to be a part of Value's interface
...
This'll make it slightly nicer to do basic constant folding for 32-bit
and 64-bit variants of the same IR opcode type. By that, I mean it's
possible to inspect immediate values without a bunch of conditional
checks beforehand to verify that it's possible to call GetU32() or
GetU64, etc.
2020-04-22 20:55:50 +01:00
Lioncash
f40fcda1f6
ir/value: Add member function to check whether or not all bits of a contained value are set
...
This is useful when we wish to know if a contained value is something
like 0xFFFFFFFF, as this helps perform constant folding. For example the
operation: x & 0xFFFFFFFF can be folded to just x in the 32-bit case.
2020-04-22 20:55:50 +01:00
MerryMage
f0920c0ded
Fix VShift terminology
...
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.
- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
VelocityRa
c30b8dbe99
decoders: Cast to correctly-sized type before shifting
...
Fixes decoding for 64-bit instructions
Does not help/apply to any currently supported ARM versions (since
all are 32-bit length or below), it's for future-proofing should
such an arch be supported.
2020-04-22 20:55:50 +01:00
MerryMage
09bf273bc8
A64: Implement SCVTF, UCVTF (vector, fixed-point), scalar variant
2020-04-22 20:55:06 +01:00
MerryMage
f9129db6fd
A64: Implement FCVTZS, FCVTZU, UCVTF, SCVTF (vector, fixed-point), vector variant
2020-04-22 20:55:06 +01:00
Lioncash
48df9b9a7d
A64: Implement UQSHL's vector immediate and register variants
2020-04-22 20:55:06 +01:00
Lioncash
d426dfe942
ir: Add opcodes for unsigned saturating left shifts
2020-04-22 20:55:06 +01:00
Lioncash
ab60720418
A64/translate/impl: Make signatures consistent for unimplemented by-element SIMD variants
...
Makes them all consistent, so it isn't necessary to change the
prototypes over when implementing them.
2020-04-22 20:55:06 +01:00
Lioncash
6b5ea6ee66
A64: Implement BRK
...
Currently, we can just implement this as part of the exception
interface, similar to how it's done for the A32 interface with BKPT.
2020-04-22 20:55:06 +01:00
Lioncash
b915364c16
A64/imm: Add full range of comparison operators to Imm template
...
Makes the comparison interface consistent by providing all of the
relevant members. This also modifies the comparison operators to take
the Imm instance by value, as it's really only a u32 under the covers,
and it's cheaper to shuffle around a u32 than a 64-bit pointer address.
2020-04-22 20:55:06 +01:00
MerryMage
02150bc0b7
IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed
2020-04-22 20:55:06 +01:00
MerryMage
027b0ef725
A64: Implement SCVTF, UCVTF (scalar, fixed-point)
2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0
opcodes.inc: Align columns to a tabstop of 4
2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d
IR: Add fbits argument to FixedToFP-related opcodes
2020-04-22 20:55:06 +01:00
Lioncash
616a153c16
A64: Implement SQSHL's vector immediate variant
2020-04-22 20:55:06 +01:00
Lioncash
e8b0f25dff
A64: Implement SQSHL's vector register variant
2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46
ir: Add opcodes for left signed saturated shifts
2020-04-22 20:55:06 +01:00
Lioncash
da55ed7b31
branch: Make variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
867b666285
move_wide: Make variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
78024a9dc4
load_store_register_unprivileged: Make variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
e45e5da610
load_store_register_immediate: Place conditional bodies on their own line
...
Makes the conditionals visually consistent with the rest of the
codebase.
2020-04-22 20:55:06 +01:00
Lioncash
b586cf3f56
load_store_load_literal: Make variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
c3a3b9687e
data_processing_logical: Move datasize declarations after early-exit conditionals
...
While we're at it, make variables const where applicable.
2020-04-22 20:55:06 +01:00
Lioncash
ed797e6540
data_processing_conditional_select: Make variables const where applicable
...
Makes CSEL's function consistent with all of the others.
2020-04-22 20:55:06 +01:00
Lioncash
c82fa5ec5a
data_processing_addsub: Move datasize declarations after early-exit conditionals
...
While we're at it, also make relevant variables const where applicable
2020-04-22 20:55:06 +01:00
Lioncash
f4a66d2477
data_processing_bitfield: Move datasize variables after early-exit conditionals
...
Moves the declaration of datasize to the scope that it's used within.
This also takes the opportunity to apply const where applicable, and
make early-exits all vertically consistent with one another.
2020-04-22 20:55:06 +01:00
Lioncash
2e0fcd6161
A64: Implement CLS's vector variant
...
Leverages CLZ like the integral variant does.
2020-04-22 20:55:06 +01:00
MerryMage
12243692f5
A64: Implement SQRDMULH (vector), vector variant
2020-04-22 20:55:06 +01:00
MerryMage
a9ffcf08b1
A64: Implement SQDMULL (vector), vector variant
2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6
IR: Add VectorSignedSaturatedDoublingMultiplyLong
2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa
emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
...
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5
IR: Implement Vector{Signed,Unsigned}Multiply{16,32}
2020-04-22 20:55:06 +01:00
Lioncash
112cff9ab9
A64: Implement CLZ's vector variant
2020-04-22 20:55:06 +01:00
Lioncash
e739624296
ir: Add opcodes for vector CLZ operations
...
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.
If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00
MerryMage
d4c37a68a8
A64/translate: VectorZeroUpper for V(64) stores
...
Ensures correctness.
2020-04-22 20:55:05 +01:00
MerryMage
b8daa4feac
simd_two_register_misc: FNEG (vector) with Q == 0 had dirty upper
2020-04-22 20:55:05 +01:00
Lioncash
14e026a7f0
A64: Implement USQADD's scalar and vector variants
2020-04-22 20:55:05 +01:00