Lioncash
28a8b4d210
A64: Handle half-precision floating point in scalar FMOV
...
This is simply performing a scalar value transfer between registers
without conversions, so this is trivial to handle as-is.
2020-04-22 20:58:12 +01:00
Lioncash
d7ac5a664f
A64: Handle half-precision floating point in FCVTL
...
Like FCVTN, now that we have half-precision floating point conversion
functions available, we can go ahead and use those to eliminate the
interpreter fallback.
2020-04-22 20:58:12 +01:00
Lioncash
fe84ecb780
A64: Handle half-precision floating point in scalar FABS
...
Now that we have the half-precision variant of the opcode added, we can
simply handle the instruction instead of treating it as undefined.
2020-04-22 20:58:12 +01:00
Lioncash
fac9224d5e
A64: Handle half-precision floating point in FCVTN
...
Now that we have IR instructions for performing conversions with
half-precision floating point, we can also handle half-precision values
within FCVTN.
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f
frontend/ir_emitter: Add half-precision variant of FPAbs
2020-04-22 20:58:12 +01:00
Lioncash
16de99d3e3
A64: Enable FCVT floating-point conversions for half-precision
...
With this, we no longer have to fall back to the interpreter in any of
the FCVT floating-point conversion instructions.
2020-04-22 20:58:12 +01:00
Lioncash
10abc77fad
A64: Handle half-precision floating point in scalar FNEG
...
With the half-precision variant of the FPNeg opcode added, we can
utilize it here to emulate the half-precision variant of FNEG.
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f
frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes
2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978
frontend/ir_emitter: Add half-precision variant of FPNeg
2020-04-22 20:58:12 +01:00
Lioncash
dff5da1063
common/fp/unpacked: Amend behavior of FPUnpackCV
...
This is supposed to call FPUnpackBase instead of FPUnpack. This would
result in alternate half-precision representations being misinterpreted
when it comes to dealing with NaNs.
2020-04-22 20:58:12 +01:00
Merry
f01afc5ae6
Merge pull request #456 from lioncash/mov
...
A64: Enable FMOV (general) for half-precision floating point
2020-04-22 20:58:12 +01:00
Lioncash
03bc2334fe
common/fp/op/FPConvert: Amend off-by one in double NaN case in FPConvertNaN
...
Avoids potentially clobbering the intended sign bit value during
conversions to double-precision values. The other conversion types are
already properly handled, so those don't need to be addressed.
2020-04-22 20:58:12 +01:00
Lioncash
c57b146fb2
common/fp/op/FPConvert: Add half-precision instantiations to FPConvert
2020-04-22 20:58:12 +01:00
Merry
c1ce94872d
Merge pull request #455 from lioncash/sqrdmulh-scalar
...
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2020-04-22 20:58:11 +01:00
Lioncash
25a7256ee1
A64: Enable FMOV (general) for half-precision floating point
...
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2020-04-22 20:58:11 +01:00
Lioncash
97dd3d0596
A64: Implement SQRDMULH's scalar indexed element variant
2020-04-22 20:58:11 +01:00
Lioncash
49b51e34f1
simd_vector_x_indexed_element: Deduplicate index and Vm operand construction
2020-04-22 20:58:11 +01:00
Lioncash
692aba91b6
A64: Implement SQDMULL{2}'s scalar indexed element variant
2020-04-22 20:58:11 +01:00
Lioncash
c043b831d5
A64: Implement SQDMULL{2}'s by-element variant
2020-04-22 20:58:11 +01:00
Lioncash
72af5a3dff
simd_scalar_x_indexed_element: Factor out index and Vm argument construction
...
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2020-04-22 20:58:11 +01:00
Lioncash
224ff0afaa
A64: Implement SQRDMULH's by-index vector variant
2020-04-22 20:58:11 +01:00
Lioncash
3a3542414b
A64: Implement FRECPX's half-precision floating point variant
2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef
frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677
frontend/ir/value: Add U16U32U64 type to represent floating point types
2020-04-22 20:58:11 +01:00
Lioncash
eb3e0d5908
common/fp/op/FPRecipExponent: Add half-precision floating point specialization
2020-04-22 20:58:11 +01:00
Lioncash
a829c93406
common/fp/unpacked: Correct edge-cases within FPUnpack for half-precision floating point
...
This corrects one case where floating-point exceptions could be set when
they're not supposed to be.
This also corrects a case where values were being treated as NaNs when
they weren't supposed to be.
2020-04-22 20:58:11 +01:00
Lioncash
7030b9af95
common/fp/process_nan: Add half-precision instantiations for NaN processing functions
2020-04-22 20:58:11 +01:00
Lioncash
14f55d7476
common/fp/unpacked: Add half-precision instantiation of FPRoundBase
2020-04-22 20:58:11 +01:00
Lioncash
7e814de445
common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase
2020-04-22 20:58:11 +01:00
Lioncash
8f9fe8690a
common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
...
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.
At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2020-04-22 20:58:11 +01:00
Merry
37c4c39d62
Merge pull request #448 from lioncash/saturate
...
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2020-04-22 20:58:11 +01:00
Merry
f5d774bdbd
Merge pull request #449 from lioncash/hp
...
common/fp/info: Add specialization of FPInfo for half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
126c29a9e9
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
...
These can just be implemented in terms of the vector variants for the
time being.
2020-04-22 20:58:11 +01:00
Lioncash
0b67b94b6c
common/fp/info: Add specialization of FPInfo for half-precision floating point
...
Puts the necessary info struct in place for further use.
2020-04-22 20:58:11 +01:00
Lioncash
dd7433f9d3
A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
...
These take a vector for a destination.
2020-04-22 20:58:11 +01:00
Lioncash
99c494bae9
common/fp/unpacked: Add FPRoundCV
...
Corresponds to the equivalent pseudocode within the ARMv8 reference
manual. This will be necessary for supporting half-precision
floating-point.
This also makes use of it within FPConvert
2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2
Merge pull request #447 from lioncash/flag
...
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Lioncash
490bebbd9a
common/fp/unpacked: Add FPUnpackCV
...
Adds a template function that performs the same behavior as in the ARM
pseudocode, and utilizes it in FPConvert, which will be necessary for
half-float support.
2020-04-22 20:58:11 +01:00
Merry
fb039e232c
Merge pull request #442 from lioncash/fcvtxn
...
A64: Implement scalar and vector variants of FCVTXN
2020-04-22 20:58:11 +01:00
Lioncash
6aed4036ef
ir_opt/a64_get_set_elimination_pass: Add handling for NZCV raw get and set operations
2020-04-22 20:58:11 +01:00
Merry
4f937c1ee1
Merge pull request #446 from lioncash/sqshl
...
A64: Implement scalar variants of SQSHL (register) and UQSHL (register)
2020-04-22 20:58:11 +01:00
Lioncash
aa22db534b
A64: Implement AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Merry
d74cccbc84
Merge pull request #445 from lioncash/sqrt
...
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:11 +01:00
Lioncash
20ffe568d0
A64: Implement RMIF
2020-04-22 20:58:11 +01:00
Merry
6d7e7c3269
Merge pull request #443 from lioncash/flag
...
A64: Rearrange flag format/manipulation instructions
2020-04-22 20:58:11 +01:00
Lioncash
51b526e453
A64: Implement CFINV
2020-04-22 20:58:11 +01:00
Merry
5d01f1b462
Merge pull request #441 from lioncash/constexpr
...
common/bit_util: Mark a few functions as constexpr
2020-04-22 20:58:11 +01:00
Lioncash
597a8be5d5
ir: Add A64-specific opcodes for getting and setting raw NZCV values
...
This will be necessary to implement the flag manipulation and flag
format instructions.
2020-04-22 20:58:11 +01:00
Merry
743c52fdc5
Merge pull request #440 from lioncash/include
...
common/fp: Remove unnecessary includes
2020-04-22 20:58:11 +01:00
Lioncash
d3515279df
A64: Implement the vector version of FCVTXN
2020-04-22 20:58:10 +01:00
Lioncash
17aea0b997
A64: Implement UQSHL (register)'s scalar variant
...
This can be implemented in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
c99d4b762e
A64: Implement single and double-precision vector variant of FSQRT
2020-04-22 20:58:10 +01:00
Lioncash
54e0b487f3
A64: Rearrange flag format/manipulation instructions
...
Gives these instructions better categorical labeling.
2020-04-22 20:58:10 +01:00
Lioncash
88d1977cb9
common/bit_util: Make a few functions as constexpr
...
These four functions can be made constexpr with no issue.
2020-04-22 20:58:10 +01:00
Lioncash
f33e5939b7
common/fp: Remove unnecessary includes
2020-04-22 20:58:10 +01:00
Lioncash
302f56b36a
A64: Fall back to interpreting for FCADD and FCMLA half-precision variants
...
Rather than straight-up treating them as undefined, we can fall back to an
interpreter in this case.
2020-04-22 20:58:10 +01:00
Lioncash
4339a8fff6
A64: Implement the scalar version of FCVTXN
2020-04-22 20:58:10 +01:00
Lioncash
35ddf68ad5
A64: Implement SQSHL (register)'s scalar variant
...
We can implement this in terms of the vector variant.
2020-04-22 20:58:10 +01:00
Lioncash
5cf1478620
frontend/ir: Add opcodes for vector square roots
2020-04-22 20:58:10 +01:00
Lioncash
36027ebef5
frontend/ir/microinstruction: Add missing cases for FPRecipExponent{32,64} for ReadsFromAndWritesToFPSRCumulativeExceptionBits()
...
This was intended to be added within #437 , but was missed
2020-04-22 20:58:10 +01:00
Merry
40b081438a
Merge pull request #439 from lioncash/fcmla
...
A64: Implement FCADD and FCMLA
2020-04-22 20:58:10 +01:00
Lioncash
7c81a58ed3
frontend/ir/ir_emitter: Alter parameters of FPDoubleToSingle() and FPSingleToDouble() to pass along desired rounding mode
...
This will be necessary to special-case the non-IEEE Von Neumann rounding
to odd rounding mode.
2020-04-22 20:58:10 +01:00
Merry
d91192681a
Merge pull request #438 from lioncash/fmulx
...
A64: Implement scalar double/single precision FMULX (by element)
2020-04-22 20:58:10 +01:00
Lioncash
ed29ef8cca
A64: Implement FCMLA
2020-04-22 20:58:10 +01:00
Lioncash
95af9dafbe
common/fp/op: Add FP conversion functions
2020-04-22 20:58:10 +01:00
Merry
9f11720a69
Merge pull request #437 from lioncash/frecpx
...
A64: Implement FRECPX (single, double precision)
2020-04-22 20:58:10 +01:00
Lioncash
bdcea0b0dc
A64: Implement scalar double/single precision FMULX (by element)
2020-04-22 20:58:10 +01:00
Lioncash
5ce17574f9
A64: Implement FCADD
2020-04-22 20:58:10 +01:00
Merry
34d917f34e
Merge pull request #436 from lioncash/no-alloc
...
A64: Implement LDNP/STNP
2020-04-22 20:58:10 +01:00
Lioncash
e44730ba6d
A64: Implement FRECPX (single, double precision)
2020-04-22 20:58:10 +01:00
Lioncash
bfaeb08d3c
A64: Implement LDNP/STNP
...
LDNP and STNP indicate that a memory access is non-temporal/streaming
(i.e. unlikely to be repeated), allowing data caching to not be
performed. However, given this is only a hint, we can treat these two
instructions as regular LDP and STP instructions for the time being.
2020-04-22 20:58:10 +01:00
Lioncash
9cf3c25811
frontend/ir/ir_emitter: Add opcodes for floating point reciprocal exponents
2020-04-22 20:58:10 +01:00
Merry
dbf47db713
Merge pull request #434 from lioncash/format
...
A32/translate_arm: Formatting/tidying up
2020-04-22 20:58:10 +01:00
Lioncash
b168c2a9f9
common/fp/op: Add operations for floating-point reciprocal exponents
2020-04-22 20:58:10 +01:00
Lioncash
05a6ab691d
translate_arm/coprocessor: Minor tidying up
2020-04-22 20:58:10 +01:00
Lioncash
1e32a09c03
translate_arm/vfp2: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
e209b31073
translate_arm/synchronization: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
9514e3602e
translate_arm/status_register_access: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c6aa1a708a
translate_arm/saturated: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
a72813599a
translate_arm/reversal: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
7be56e6b67
translate_arm/parallel: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
3c00a616d6
translate_arm/packing: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c711188f46
translate_arm/multiply: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
c8dad40d81
translate_arm/misc: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
a7bf5ff77d
translate_arm/load_store: Invert conditionals where applicable
2020-04-22 20:58:10 +01:00
Lioncash
2e180a7f14
backend/x64/a32_interface: Mark Context move constructor and move assignment as noexcept
...
Provides a more "correct" move constructor/assignment operator, since
these relevant functions shouldn't throw exceptions.
Has the benefit of playing nicely with std::move_if_noexcept and other
noexcept library facilities.
2020-04-22 20:58:09 +01:00
Lioncash
f4b19a7393
translate_arm/extension: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
deb9dd4acc
block_of_code: Replace cast with [[maybe_unused]] in DoesCpuSupport()
2020-04-22 20:58:09 +01:00
Lioncash
c2de6ecfd0
translate_arm/exception_generating: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
d8a8d3b073
translate_arm/data_processing: Invert conditionals where applicable
2020-04-22 20:58:09 +01:00
Lioncash
df5c51ff47
translate_arm/branch: Invert conditionals where applicable
...
Allows unindenting code a bit.
2020-04-22 20:58:09 +01:00
Lioncash
3290a9fdc2
common: Remove address_range.h
...
The AddressRange structure isn't used anywhere within the codebase, so
this can be removed. Particularly because there's no real appeal/heavy
potential use of it in the future that isn't trivial to add back if
needed.
2020-04-22 20:57:38 +01:00
Lioncash
ee973f13c7
frontend/A32/ir_emitter: Mark PC() and AlignPC() as const-qualified member functions
...
These don't modify instance state, so they can be const-qualified member
functions.
2020-04-22 20:57:38 +01:00
Lioncash
3a2dd09122
frontend/A64/ir_emitter: Mark PC() and AlignPC() as const qualified member functions
...
These don't actually alter any instance state.
2020-04-22 20:57:38 +01:00
Lioncash
575ae852a9
constant_propagation_pass: Fold byte reversal opcodes where applicable
...
These are reasonably trivial to fold away when applicable. We just
perform the swap and replace the instruction with the constant value.
2020-04-22 20:57:37 +01:00
Merry
2c53f354ab
Merge pull request #418 from lioncash/fold-op
...
constant_propagation_pass: Handle folding for Least/MostSignificant{Bit, Byte, Half, Word} opcodes
2020-04-22 20:57:37 +01:00
Merry
ad14a33672
Merge pull request #417 from lioncash/swap
...
common: Move byte swapping functions to bit_utils.h
2020-04-22 20:57:37 +01:00
Lioncash
d302d9bd0c
constant_propagation_pass: Handle folding for Least/MostSignificant{Bit, Byte, Half, Word} opcodes
...
These are quite trivial to fold.
2020-04-22 20:57:37 +01:00
Lioncash
7139942976
common: Move byte swapping functions to bit_utils.h
...
These are quite general functions, so they can just be moved into common
instead of recreating a namespace here.
2020-04-22 20:57:37 +01:00
MerryMage
7c8fcaef26
emit_x64_vector_floating_point: AVX && DN implementation of EmitFPVectorMulX
2020-04-22 20:57:37 +01:00