Commit graph

2699 commits

Author SHA1 Message Date
MerryMage
080b4b3aff Squashed 'externals/xbyak/' changes from 671fc805..4a6fac8a
4a6fac8a update version to 5.77
801cf3fd cosmetic change of getNumCores
d397e824 fix number of cores that share LLC cache
a669e092 support non-intel-cpu visual studio
af5f422e Merge branch 'fenghaitao-guard_x86' into develop
9b98dc17 Guard x86 specific codes with "#if defined(__i386__) || defined(__x86_64__)"
dd4173e1 move some member variables into private
f72646a7 update version
4612528f format change
4b95e862 Merge branch 'shelleygoel-master'
4c262fa6 add functionality to get num of cores using x2APIC ID
bc70e7e1 recover Xbyak::CastTo
d09a230f unlink Label when LabelManager is destroyed
973e8597 update version
afdb9fe9 Xbyak::CastTo is removed
b011aca4 add RegRip +/- int
acae93cd increase max temp regs for StackFrame
ea4e3562 util::StackFrame uses push/pop instead of mov
42462ef9 use evex encoding for vpslld/vpslldq/vpsraw/...(reg, mem, imm);
da9117a9 update version of readme.md
d35f4fb7 fix the encoding of vinsertps for disp8N
1de435ed bf uses Label class
613922bd add Label L() for convenience
43e15583 fix typo
93579ee6 add protect-re.cpp
60004b5c fix url of protect-re.cpp
348b2709 fix typo of doc
f34f6ed5 update manual
232110be update test
82b78bf0 add setProtectMode
dd8b290f put warning message if pageSize != 4096
64775ca2 a little refactoring
7c3e7b85 fix wrong VSIB encoding with idx >= 16

git-subtree-dir: externals/xbyak
git-subtree-split: 4a6fac8ade404f667b94170f713367fe7da2a852
2020-04-22 20:59:14 +01:00
MerryMage
b941cbbcfb externals/xbyak: Update xbyak to 5.77
Merge commit '080b4b3affbdc1d56f2f8230663725413ab03d21' into HEAD
2020-04-22 20:59:14 +01:00
Merry
3a0b9e8883 Merge pull request #459 from lioncash/catch
externals: Update catch to 2.7.0
2020-04-22 20:58:17 +01:00
MerryMage
13e8b7b516 emit_x64_floating_point: F16C implementation of FPSingleToHalf
2020-04-22 20:58:17 +01:00
Lioncash
e1aca18944 externals: Update catch to 2.7.0
Keeps the unit testing library up to date.
2020-04-22 20:58:12 +01:00
MerryMage
d32d6fe598 emit_x64_floating_point: F16C implementation of FPHalfToSingle and FPHalfToDouble
2020-04-22 20:58:12 +01:00
MerryMage
a53ba12be2 emit_x64_floating_point: Factor out ConvertRoundingModeToX64Immediate
2020-04-22 20:58:12 +01:00
MerryMage
5a2adc6629 backend/x64: Expose FPCR in EmitContext instead of its subcomponents
2020-04-22 20:58:12 +01:00
Merry
01bb1cdd88 Merge pull request #458 from lioncash/float-op
A64: Handle half-precision floating point in FABS, FNEG, and scalar FMOV
2020-04-22 20:58:12 +01:00
Merry
74be34d93c Merge pull request #457 from lioncash/fpconv
A64: Handle half-precision floating point in floating-point FCVT, FCVTL, and FCVTN
2020-04-22 20:58:12 +01:00
Lioncash
28a8b4d210 A64: Handle half-precision floating point in scalar FMOV
This is simply performing a scalar value transfer between registers
without conversions, so this is trivial to handle as-is.
2020-04-22 20:58:12 +01:00
Merry
f01afc5ae6 Merge pull request #456 from lioncash/mov
A64: Enable FMOV (general) for half-precision floating point
2020-04-22 20:58:12 +01:00
Lioncash
d7ac5a664f A64: Handle half-precision floating point in FCVTL
Like FCVTN, now that we have half-precision floating point conversion
functions available, we can go ahead and use those to eliminate the
interpreter fallback.
2020-04-22 20:58:12 +01:00
Lioncash
fe84ecb780 A64: Handle half-precision floating point in scalar FABS
Now that we have the half-precision variant of the opcode added, we can
simply handle the instruction instead of treating it as undefined.
2020-04-22 20:58:12 +01:00
Lioncash
fac9224d5e A64: Handle half-precision floating point in FCVTN
Now that we have IR instructions for performing conversions with
half-precision floating point, we can also handle half-precision values
within FCVTN.
2020-04-22 20:58:12 +01:00
Lioncash
8309ec7a9f frontend/ir_emitter: Add half-precision variant of FPAbs
2020-04-22 20:58:12 +01:00
Lioncash
16de99d3e3 A64: Enable FCVT floating-point conversions for half-precision
With this, we no longer have to fall back to the interpreter in any of
the FCVT floating-point conversion instructions.
2020-04-22 20:58:12 +01:00
Lioncash
10abc77fad A64: Handle half-precision floating point in scalar FNEG
With the half-precision variant of the FPNeg opcode added, we can
utilize it here to emulate the half-precision variant of FNEG.
2020-04-22 20:58:12 +01:00
Lioncash
e4c259d69f frontend/ir_emitter: Add half->{single, double} and {double, single}->half conversion opcodes
2020-04-22 20:58:12 +01:00
Lioncash
c97efcb978 frontend/ir_emitter: Add half-precision variant of FPNeg
2020-04-22 20:58:12 +01:00
Lioncash
dff5da1063 common/fp/unpacked: Amend behavior of FPUnpackCV
This is supposed to call FPUnpackBase instead of FPUnpack. This would
result in alternate half-precision representations being misinterpreted
when it comes to dealing with NaNs.
2020-04-22 20:58:12 +01:00
Lioncash
03bc2334fe common/fp/op/FPConvert: Amend off-by-one in double NaN case in FPConvertNaN
Avoids potentially clobbering the intended sign bit value during
conversions to double-precision values. The other conversion types are
already properly handled, so those don't need to be addressed.
2020-04-22 20:58:12 +01:00
Lioncash
c57b146fb2 common/fp/op/FPConvert: Add half-precision instantiations to FPConvert
2020-04-22 20:58:12 +01:00
Merry
c1ce94872d Merge pull request #455 from lioncash/sqrdmulh-scalar
A64: Implement SQRDMULH and SQDMULL's scalar indexed variants
2020-04-22 20:58:11 +01:00
Lioncash
25a7256ee1 A64: Enable FMOV (general) for half-precision floating point
This just transfers values between vector registers and general-purpose
registers with no conversions performed, so this is trivial to add
support for half-precision to.
2020-04-22 20:58:11 +01:00
Merry
98d8f81d7c Merge pull request #454 from lioncash/sqrdmulh
A64: Implement SQRDMULH and SQDMULL{2}'s vector indexed element variants
2020-04-22 20:58:11 +01:00
Lioncash
97dd3d0596 A64: Implement SQRDMULH's scalar indexed element variant
2020-04-22 20:58:11 +01:00
Merry
42b090d234 Merge pull request #452 from lioncash/frecpx
A64: Implement FRECPX's half-precision floating-point variant
2020-04-22 20:58:11 +01:00
Lioncash
49b51e34f1 simd_vector_x_indexed_element: Deduplicate index and Vm operand construction
2020-04-22 20:58:11 +01:00
Lioncash
692aba91b6 A64: Implement SQDMULL{2}'s scalar indexed element variant
2020-04-22 20:58:11 +01:00
Merry
32364fb62c Merge pull request #451 from lioncash/unpck
common/fp: Minor adjustments for half-precision floating point support
2020-04-22 20:58:11 +01:00
Lioncash
3a3542414b A64: Implement FRECPX's half-precision floating point variant
2020-04-22 20:58:11 +01:00
Lioncash
c043b831d5 A64: Implement SQDMULL{2}'s by-element variant
2020-04-22 20:58:11 +01:00
Lioncash
72af5a3dff simd_scalar_x_indexed_element: Factor out index and Vm argument construction
This will be useful in the implementations of SQRDMULH and SQDMULL{2} as
well.
2020-04-22 20:58:11 +01:00
Merry
37c4c39d62 Merge pull request #448 from lioncash/saturate
A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
2020-04-22 20:58:11 +01:00
Lioncash
7030b9af95 common/fp/process_nan: Add half-precision instantiations for NaN processing functions
2020-04-22 20:58:11 +01:00
Lioncash
bd892ec4ef frontend/ir/ir_emitter: Amend FPRecipExponent to handle half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
224ff0afaa A64: Implement SQRDMULH's by-index vector variant
2020-04-22 20:58:11 +01:00
Merry
f5d774bdbd Merge pull request #449 from lioncash/hp
common/fp/info: Add specialization of FPInfo for half-precision floating point
2020-04-22 20:58:11 +01:00
Lioncash
126c29a9e9 A64: Implement SQSHRN, SQSHRUN, and UQSHRN's scalar variants
These can just be implemented in terms of the vector variants for the
time being.
2020-04-22 20:58:11 +01:00
Lioncash
14f55d7476 common/fp/unpacked: Add half-precision instantiation of FPRoundBase
2020-04-22 20:58:11 +01:00
Lioncash
974fbf0677 frontend/ir/value: Add U16U32U64 type to represent floating point types
2020-04-22 20:58:11 +01:00
Merry
4b86151a0c Merge pull request #450 from lioncash/cv
common/fp/unpacked: Add FPRoundCV and FPUnpackCV
2020-04-22 20:58:11 +01:00
Lioncash
0b67b94b6c common/fp/info: Add specialization of FPInfo for half-precision floating point
Puts the necessary info struct in place for further use.
2020-04-22 20:58:11 +01:00
Lioncash
dd7433f9d3 A64: Amend prototypes of some SIMD scalar shift by immediate opcodes
These take a vector for a destination.
2020-04-22 20:58:11 +01:00
Lioncash
7e814de445 common/fp/unpacked: Handle half-precision unpacking in FPUnpackBase
2020-04-22 20:58:11 +01:00
Lioncash
eb3e0d5908 common/fp/op/FPRecipExponent: Add half-precision floating point specialization
2020-04-22 20:58:11 +01:00
Merry
bbd5330ad2 Merge pull request #447 from lioncash/flag
A64: Implement CFINV, RMIF, AXFlag and XAFlag
2020-04-22 20:58:11 +01:00
Lioncash
99c494bae9 common/fp/unpacked: Add FPRoundCV
Corresponds to the equivalent pseudocode within the ARMv8 reference
manual. This will be necessary for supporting half-precision
floating-point.

This also makes use of it within FPConvert.
2020-04-22 20:58:11 +01:00
Lioncash
8f9fe8690a common/fp/unpacked: Adjust FPUnpack to operate like ARM pseudocode
This function is defined as always disabling the AHP bit in the fpcr
before performing any operations.

At the same time, rename the original FPUnpack function to FPUnpackBase
to match the pseudocode in the ARM reference manual.
2020-04-22 20:58:11 +01:00