Commit graph

3063 commits

Author SHA1 Message Date
bunnei
1819c2183f
backend: x64: block_of_code: Double the total code size. (#595)
- The current limits are being hit in yuzu with some games (e.g. newer updates of BotW and SSBU).
- Increasing this fixes slow-downs in these games due to code being recompiled.
2021-04-01 20:53:49 +01:00
MerryMage
c4cff773b9 emit_x64_vector_floating_point: Avoid checking inputs for NaNs for three-ops where able 2021-03-28 21:54:36 +01:00
Wunk
e06933f123
block_of_code: Allow Fast BMI2 paths on Zen 3 (#593)
BMI2 instructions such as `pdep` and `pext` have been
known to be incredibly slow on AMD. But on Zen3
and newer, the performance of these instructions
are now much greater, but previous versions of AMD
architectures should still avoid BMI2.

On Zen 2, pdep/pext were 300 cycles. Now on Zen 3 it is 3 cycles.
This is a big enough improvement to allow BMI2 code to
be dispatched if available. The Zen 3 architecture is checked for
by detecting the family of the processor.
2021-03-27 21:36:51 +00:00
Merry
c28f13af97 emit_x64_vector: Bugfix for EmitVectorReverseBits on AVX-512: Do not reverse bytes without vector 2021-03-27 21:32:43 +00:00
Merry
4d33feb1fa emit_x64_vector: Bugfix for EmitVectorLogicalShiftRight8: shift_amount can be >= 8 2021-03-27 21:32:07 +00:00
Merry
91337788ee emit_x64_vector: Bugfix for EmitVectorLogicalShiftLeft8: shift_amount can be >= 8 2021-03-27 21:31:51 +00:00
Merry
dc37fe6e28 emit_x64_vector: Bugfix for ArithmeticShiftRightByte: shift_amount can be >= 8 2021-03-27 21:31:22 +00:00
MerryMage
2f9dea5cc3 Squashed 'externals/xbyak/' changes from 0140eeff1..590c10e37
590c10e37 fix typo
396715a89 use github action
69a25fd92 change build status to travis-ci.com
a6a9dac91 meanings of the name
8d1e41b65 test_util.cpp supports OpenBSD
77ffe7173 Merge pull request #115 from Ryan-rsm-McKenzie/master
a1da3403a fix build interface include directory
e0136d4ef fix add_library call with INTERFACE
3071eee0c support slightly more modern cmake
e626d6209 v5.991
a49c4bc11 disable XBYAK_CONSTEXPR for g++-5 -std=c++-14
70777a699 Merge branch 'atafra-old_mac_fix' into dev
6b81678d0 fixed compile error on some older macOS versions
2c3b43f15 refactor util
91784e2b8 test_util checks AMX and AVX_VNNI
70b70c557 update to v5.99
284cc5bed refactor
6b3eb9c1e default encoding is always evex
f85b1100b refactor vnni
276d09bae Merge branch 'akharito-akharito/adl_support' into dev
50df86ce3 v5.98
97ce92d58 Merge branch 'akharito/adl_support' of https://github.com/akharito/xbyak into akharito-akharito/adl_support
1f119a04a support [scale * reg]
9ee1bef9a cpuid - check that GRT CPUID leaf 7 subleaf 0 should return EAX=1
be93adb2c add AVX VNNI instruction support
0c277240a add ADL CPUID
a9a5cc2e2 Add option to choose VEX or EVEX encoding
29bfd25ba fix indent
a0c49fa2e Merge branch 'atafra-mac_avx512_fix' into dev
ea388b3c6 fixed incorrect detection of AVX-512 on macOS
ed1b8186f Merge branch 'FEX-Emu-extended_features' into dev
3dacddfec Merge branch 'extended_features' of https://github.com/FEX-Emu/xbyak into FEX-Emu-extended_features
898f78ca3 Merge branch 'kariya-mitsuru-use-sh'
0b7f1411c Merge branch 'use-sh' of https://github.com/kariya-mitsuru/xbyak into kariya-mitsuru-use-sh
99e2b13b2 Fixes extended feature support checking
b0a43c7e5 Use sh instead of tcsh for test scripts
87e8f41ae remove warning of _MSC_VER

git-subtree-dir: externals/xbyak
git-subtree-split: 590c10e3746978dbfcf102d6da933ac2659e4544
2021-03-27 21:08:22 +00:00
MerryMage
ad9b33164e externals: Update xbyak
- Fix on-demand AVX512 on macOS

Merge commit '2f9dea5cc355c266ad46d2f6397b141b99f78480'
2021-03-27 21:08:22 +00:00
Lioncash
5873e6b955 thumb32: Implement LDRD (immediate) 2021-03-13 15:29:56 -05:00
Lioncash
9757e2353f thumb32: Implement LDRD (literal) 2021-03-13 15:29:56 -05:00
Lioncash
a74843ca17 thumb32: Implement STRD 2021-03-13 15:29:56 -05:00
Lioncash
258ca93c53 thumb32: Implement TBB/TBH 2021-03-13 15:29:49 -05:00
merry
03575dea84
Merge pull request #591 from lioncash/multiple
thumb32: Implement LDM/STM variants
2021-03-13 07:58:40 +00:00
Lioncash
1d0b705996 thumb32: Implement PUSH
This can be handled as an alias for STMDB.
2021-03-12 19:54:35 -05:00
Lioncash
9cb4790428 thumb32: Implement POP
This can just be treated as an alias to LDMIA
2021-03-12 19:43:47 -05:00
Lioncash
39edee70ff thumb32: Implement LDMDB/LDMEA 2021-03-12 19:35:28 -05:00
Lioncash
ae83713f4f thumb32: Simplify existing store functions into helper function
We can also make a STM helper.
2021-03-12 19:30:29 -05:00
Lioncash
0d887d9ecd thumb32: Implement LDMIA/LDMFD 2021-03-12 19:26:03 -05:00
Lioncash
714ccf13dd thumb32: Implement STMDB/STMFD 2021-03-12 19:05:39 -05:00
Lioncash
91c4d59da9 thumb32: Implement STMIA/STMEA 2021-03-12 19:05:15 -05:00
merry
543ba4e61f
Merge pull request #589 from lioncash/adr
thumb32: Implement plain binary immediate ADR variants
2021-03-12 23:10:23 +00:00
Lioncash
85b8adeb32 thumb32: Implement plain binary immediate ADR variants
Now all the plain binary immediate instructions are implemented.
2021-03-12 18:05:41 -05:00
merry
54c481c8bd
Merge pull request #588 from lioncash/store
thumb32: Implement STRB{T}/STRH{T}/STR{T} immediate variants
2021-03-12 21:30:00 +00:00
Lioncash
bd02d9e27f thumb32: Implement STR immediate variants 2021-03-12 14:03:40 -05:00
Lioncash
2521314384 thumb32: Implement STRH immediate variants 2021-03-12 13:55:39 -05:00
Lioncash
cbf9027278 thumb32: Implement STRB immediate variants 2021-03-12 13:33:11 -05:00
merry
2093d2b775
Merge pull request #587 from lioncash/8dot7
a64: Add v8.7 instruction additions to the decoder
2021-03-10 21:19:03 +00:00
merry
41bee2db3c
Merge pull request #586 from lioncash/halfword
thumb32: Implement halfword load instructions
2021-03-10 21:18:48 +00:00
Lioncash
035580abd2 a64: Add v8.7 instruction additions to the decoder
Adds the instructions introduced in FEAT_WFxT and FEAT_LS64/FEAT_LS64_V
in ARMv8.7
2021-03-09 18:41:20 -05:00
Lioncash
fb30922cd1 thumb32: Add supporting decoder entry for PLD (literal)
LDRH (literal)'s pseduocode indicates that cases where Rt specifies the
PC, that the instruction should be execured as if it were a PLD
instruction.

Curiously, however, within the ARM reference manual, the encodings in the case
that happens doesn't match up.

The bit pattern for LDRH (literal) has bit 21 set to 1, but the encoding
of PLD (literal) has bit 21 set to zero for it's only thumb encoding.
2021-03-09 18:16:08 -05:00
Lioncash
921998f6e9 thumb32: Implement LDRSH variants 2021-03-09 18:11:33 -05:00
Lioncash
7a9bdc8f21 thumb32: Implement LDRH variants 2021-03-09 17:12:46 -05:00
merry
1a5f4930b7
Merge pull request #585 from lioncash/word
thumb32: Implement Thumb-2 LDR variants
2021-03-09 19:16:13 +00:00
Lioncash
3d7e81e7d1 thumb32: Implement LDR variants 2021-03-09 13:12:15 -05:00
MerryMage
646fd05920 thumb32: Implement RSB (reg) 2021-03-06 19:49:44 +00:00
MerryMage
3f97cb1f9b thumb32: Implement SUB (reg) 2021-03-06 19:49:44 +00:00
MerryMage
17bdb54d30 thumb32: Implement CMP (reg) 2021-03-06 19:49:44 +00:00
MerryMage
a63271fd3b thumb32: Implement SBC (reg) 2021-03-06 19:49:44 +00:00
MerryMage
95189b78ef thumb32: Implement ADC (reg) 2021-03-06 19:49:44 +00:00
MerryMage
af33155ef8 thumb32: Implement ADD (reg) 2021-03-06 19:49:44 +00:00
MerryMage
41ac9971f4 thumb32: Implement CMN (reg) 2021-03-06 19:49:44 +00:00
MerryMage
e7ecd3a7ee thumb32: Implement PKHBT, PKHTB 2021-03-06 19:49:44 +00:00
MerryMage
d2d996e6ba thumb32: Implement EOR (reg) 2021-03-06 19:49:44 +00:00
MerryMage
158a13173c thumb32: Implement AND (reg) 2021-03-06 19:49:44 +00:00
MerryMage
c253b8fc51 thumb32: Implement TST (reg) 2021-03-06 19:49:44 +00:00
merry
ea5d8a3047
Merge pull request #584 from lioncash/loads
thumb32: Implement Thumb-2 Load Byte and Memory Hints instructions
2021-03-06 17:31:45 +00:00
MerryMage
531bb42ab5 thumb32: Implement B (T3) 2021-03-06 17:29:55 +00:00
MerryMage
86aa3f0701 thumb32: Implement B (T4) 2021-03-06 17:27:54 +00:00
Lioncash
52fdf801d0 thumb32: Implement LDRSB variants 2021-03-06 11:33:33 -05:00