Commit graph

359 commits

Author SHA1 Message Date
Lioncash
427b7afd66 frontend/ir/microinstruction: Add missing fixed-point opcodes to ReadsFromAndWritesToFPSRCumulativeExceptionBits() 2020-04-22 21:01:45 +01:00
Lioncash
604f39f00a frontend/ir_emitter: Add half-precision->fixed-point opcodes 2020-04-22 21:01:45 +01:00
Merry
45864133f5 Merge pull request #478 from lioncash/stepfused
A64: Handle half-precision variants of FRECPE and FRECPS
2020-04-22 21:01:44 +01:00
Lioncash
037acb17b9 frontend/ir_emitter: Add half-precision opcode variant for FPVectorRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
5dba99b4f4 frontend/ir_emitter: Add half-precision opcode variant for FPRSqrtEstimate 2020-04-22 21:01:44 +01:00
Lioncash
825a3ea16f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
2184d24e8f frontend/ir_emitter: Add half-precision opcode for FPRecipEstimate 2020-04-22 21:01:44 +01:00
Lioncash
5d5c9f149f frontend/ir_emitter: Add half-precision opcode for FPVectorRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
6da0411111 frontend/ir_emitter: Add half-precision opcode for FPRecipStepFused 2020-04-22 21:01:44 +01:00
Lioncash
fb829b9525 frontend/microinstruction: Add FPVectorRoundInt types to ReadsFromAndWritesToFPSRCumulativeExceptionBits()
All variants were previously missing from this.
2020-04-22 21:01:44 +01:00
Lioncash
5b4673da4b frontend/ir_emitter: Add half-precision variant of FPVectorRoundInt 2020-04-22 21:01:44 +01:00
Lioncash
ad0c698f89 frontend/ir_emitter: Add half-precision variant of FPRoundInt 2020-04-22 21:01:44 +01:00
Merry
cb9a1b18b6 Merge pull request #475 from lioncash/muladd
A64: Enable half-precision variants of floating-point multiply-add instructions
2020-04-22 21:01:44 +01:00
Merry
13f421c27d Merge pull request #473 from lioncash/sqshlu
A64: Implement SQSHLU
2020-04-22 21:01:44 +01:00
Merry
d7da53a74b Merge pull request #472 from lioncash/exception
general: Mark hash functions as noexcept
2020-04-22 21:01:44 +01:00
Lioncash
a4cadf1cd9 frontend/ir_emitter: Add opcodes for signed saturated left shifts with unsigned saturation 2020-04-22 21:01:44 +01:00
Lioncash
ec6b3ae084 ir/frontend: Add half-precision opcode for FPVectorMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
bd82513199 frontend/ir_emitter: Add half-precision opcode for FPMulAdd 2020-04-22 21:01:44 +01:00
Lioncash
7bb5440507 general: Mark hash functions as noexcept
Generally hash functions shouldn't throw exceptions. It's also a
requirement for the standard library-provided hash functions to not
throw exceptions.

An exception to this rule is made for user-defined specializations,
however we can just be consistent with the standard library on this to
allow it to play nicer with it.

While we're at it, we can also make the std::less specializations
noexcpet as well, since they also can't throw.
2020-04-22 21:01:43 +01:00
Lioncash
fe95575b95 general: Replace unreachable-imitating assertions with UNREACHABLE()
We can just use the self-documenting assertion for indicating
unreachable paths, instead of manually passing false and providing a
message.
2020-04-22 21:01:43 +01:00
Merry
01bb1cdd88 Merge pull request #458 from lioncash/float-op
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs 2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes 2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg 2020-04-22 20:58:12 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point 2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677 frontend/ir/value: Add U16U32U64 type to represent floating point types 2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2 Merge pull request #447 from lioncash/flag
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
fb039e232c Merge pull request #442 from lioncash/fcvtxn
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5 ir: Add A64-specific opcodes for getting and setting raw NZCV values
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Lioncash
5cf1478620 frontend/ir: Add opcodes for vector square roots 2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3 frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Lioncash
36027ebef5 frontend/ir/microinstruction: Add missing cases for FPRecipExponent{32,64} for ReadsFromAndWritesToFPSRCumulativeExceptionBits()
This was intended to be added within #437, but was missed
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811 frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents 2020-04-22 20:58:10 +01:00
MerryMage
fa8925c4df IR: Implement FPVectorMulX 2020-04-22 20:57:37 +01:00
V.Kalyuzhny
764a93bf5a Switch boost::optional to std::optional 2020-04-22 20:57:37 +01:00
Lioncash
0583d401e3 ir/value: Add IsSignedImmediate() and IsUnsignedImmediate() functions to Value's interface
This allows testing against arbitrary values while also simultaneously
eliminating the need to check IsImmediate() all the time in expressions.
2020-04-22 20:57:37 +01:00
Lioncash
e3258e8525 ir/value: Add a GetImmediateAsS64() function
Provides a signed analogue to GetImmediateAsU64() for consistency with
both integral classes when it comes to signed/unsigned..
2020-04-22 20:57:37 +01:00
Lioncash
4a3c064b15 ir/value: Add an IsZero() member function to Value's interface
By far, one of the most common things to check for is whether or not a
value is zero, as it typically allows folding away unnecesary
operations (other close contenders that can help with eliding operations  are 1 and -1).

So instead of requiring a check for an immediate and then actually
retrieving the integral value and checking it, we can wrap it within a
function to make it more convenient.
2020-04-22 20:57:37 +01:00
Lioncash
f40fcda1f6 ir/value: Add member function to check whether or not all bits of a contained value are set
This is useful when we wish to know if a contained value is something
like 0xFFFFFFFF, as this helps perform constant folding. For example the
operation: x & 0xFFFFFFFF can be folded to just x in the 32-bit case.
2020-04-22 20:55:50 +01:00
Lioncash
d69fceec55 value: Move ImmediateToU64() to be a part of Value's interface
This'll make it slightly nicer to do basic constant folding for 32-bit
and 64-bit variants of the same IR opcode type. By that, I mean it's
possible to inspect immediate values without a bunch of conditional
checks beforehand to verify that it's possible to call GetU32() or
GetU64, etc.
2020-04-22 20:55:50 +01:00
MerryMage
f0920c0ded Fix VShift terminology
An arithmetic shift is by definition a signed shift, and a logical shift is by definition an unsigned shift.

- Rename VectorLogicalVShiftS* -> VectorArithmeticVShift*
- Rename VectorLogicalVShiftU* -> VectorLogicalVShift*
2020-04-22 20:55:50 +01:00
Lioncash
d426dfe942 ir: Add opcodes for unsigned saturating left shifts 2020-04-22 20:55:06 +01:00
MerryMage
02150bc0b7 IR: Add fbits argument to FPVectorFrom{Signed,Unsigned}Fixed 2020-04-22 20:55:06 +01:00
MerryMage
8051f60db0 opcodes.inc: Align columns to a tabstop of 4 2020-04-22 20:55:06 +01:00
MerryMage
90193b0e3d IR: Add fbits argument to FixedToFP-related opcodes 2020-04-22 20:55:06 +01:00
Lioncash
b14eaaec46 ir: Add opcodes for left signed saturated shifts 2020-04-22 20:55:06 +01:00
MerryMage
3e447614c6 IR: Add VectorSignedSaturatedDoublingMultiplyLong 2020-04-22 20:55:06 +01:00
MerryMage
06b31448aa emit_x64_vector: Changes to VectorSignedSaturatedDoublingMultiply
* Return both the upper and lower parts of the multiply if required
* SSE2 does not support the pmuldq instruction, do sign correction to an unsigned result instead
* Improve port utilisation where possible (punpck instructions were a bottleneck)
2020-04-22 20:55:06 +01:00
MerryMage
08c0e017a5 IR: Implement Vector{Signed,Unsigned}Multiply{16,32} 2020-04-22 20:55:06 +01:00
Lioncash
e739624296 ir: Add opcodes for vector CLZ operations
We can optimize these cases further for with the use of a fair bit of
shuffling via pshufb and the use of masks, but given the uncommon use of
this instruction, I wouldn't consider it to be beneficial in terms of
amount of code to be worth it over a simple manageable naive solution
like this.

If we ever do hit a case where vectorized CLZ happens to be a
bottleneck, then we can revisit this. At least with AVX-512CD, this can
be done with a single instruction for the 32-bit word case.
2020-04-22 20:55:05 +01:00