Commit graph

3430 commits

Author SHA1 Message Date
MerryMage
510862e50c backend/x64: Change V flag testing to cmp instead of add
Prefer a non-destructive read to a destructive read.
2021-04-26 00:26:28 +01:00
MerryMage
f35d98c923 fuzz_with_unicorn: Widen scope of floating point fuzzing 2021-04-26 00:26:28 +01:00
MerryMage
3f74a839b9 emit_x64_floating_point: Optimize 64-bit EmitFPRSqrtEstimate 2021-04-26 00:26:28 +01:00
MerryMage
7bc9e36ed7 emit_x64_floating_point: Optimize 32-bit EmitFPRSqrtEstimate 2021-04-26 00:26:28 +01:00
MerryMage
e19f898aa2 ir: Reorganize to new top level folder 2021-04-21 22:22:07 +01:00
MerryMage
5bec200c36 block_of_code: Add santiy check that far_code_offset < total_code_size 2021-04-21 18:26:26 +01:00
MerryMage
08ed8b4a11 abi: Consolodate ABI information into one place 2021-04-21 18:25:04 +01:00
Lioncash
f5263cc196 thumb32: Implement exclusive loads
Implements the remaining loads for ARMv7
2021-04-19 19:46:19 +01:00
MerryMage
9c6332fcbd thumb32_load_store_dual: imm8 in STREX should be shifted left by 2 2021-04-19 18:57:28 +01:00
MerryMage
b2a4da5e65 block_of_code: Correct SpaceRemaining 2021-04-11 15:37:25 +01:00
Lioncash
6241ff6be2 thumb32: Implement STREX variants
Implements the exclusive store instructions. Now all that remains for
ARMv7 load/stores to be done is the exclusive loads.
2021-04-10 17:15:19 +01:00
MerryMage
d8066b091b decoder/arm: Complete instruction version information 2021-04-10 17:11:24 +01:00
merry
71491c0a4a
Merge pull request #596 from degasus/fix_perf_register
backend/x64: Fix PerfMapRegister usages.
2021-04-05 21:43:10 +01:00
MerryMage
9ab83180db {a32,a64}_interface: Clear exclusive state during an exceptional exit
This is normally done by the ERET instruction during a service call.
2021-04-02 19:33:28 +01:00
MerryMage
c788bcdf17 block_of_code: Enable configuration of code cache sizes 2021-04-02 11:17:46 +01:00
Markus Wick
b2acdec8cb backend/x64: Fix PerfMapRegister usages.
Both the far code and fast_dispatch_table_lookup were missing.
2021-04-02 00:17:07 +02:00
merry
d0372aebaf
Merge pull request #592 from lioncash/dual
thumb32: Implement LDRD/STRD/TBB/TBH
2021-04-01 20:54:10 +01:00
bunnei
1819c2183f
backend: x64: block_of_code: Double the total code size. (#595)
- The current limits are being hit in yuzu with some games (e.g. newer updates of BotW and SSBU).
- Increasing this fixes slow-downs in these games due to code being recompiled.
2021-04-01 20:53:49 +01:00
MerryMage
c4cff773b9 emit_x64_vector_floating_point: Avoid checking inputs for NaNs for three-ops where able 2021-03-28 21:54:36 +01:00
Wunk
e06933f123
block_of_code: Allow Fast BMI2 paths on Zen 3 (#593)
BMI2 instructions such as `pdep` and `pext` have been
known to be incredibly slow on AMD. But on Zen3
and newer, the performance of these instructions
are now much greater, but previous versions of AMD
architectures should still avoid BMI2.

On Zen 2, pdep/pext were 300 cycles. Now on Zen 3 it is 3 cycles.
This is a big enough improvement to allow BMI2 code to
be dispatched if available. The Zen 3 architecture is checked for
by detecting the family of the processor.
2021-03-27 21:36:51 +00:00
Merry
c28f13af97 emit_x64_vector: Bugfix for EmitVectorReverseBits on AVX-512: Do not reverse bytes without vector 2021-03-27 21:32:43 +00:00
Merry
4d33feb1fa emit_x64_vector: Bugfix for EmitVectorLogicalShiftRight8: shift_amount can be >= 8 2021-03-27 21:32:07 +00:00
Merry
91337788ee emit_x64_vector: Bugfix for EmitVectorLogicalShiftLeft8: shift_amount can be >= 8 2021-03-27 21:31:51 +00:00
Merry
dc37fe6e28 emit_x64_vector: Bugfix for ArithmeticShiftRightByte: shift_amount can be >= 8 2021-03-27 21:31:22 +00:00
MerryMage
2f9dea5cc3 Squashed 'externals/xbyak/' changes from 0140eeff1..590c10e37
590c10e37 fix typo
396715a89 use github action
69a25fd92 change build status to travis-ci.com
a6a9dac91 meanings of the name
8d1e41b65 test_util.cpp supports OpenBSD
77ffe7173 Merge pull request #115 from Ryan-rsm-McKenzie/master
a1da3403a fix build interface include directory
e0136d4ef fix add_library call with INTERFACE
3071eee0c support slightly more modern cmake
e626d6209 v5.991
a49c4bc11 disable XBYAK_CONSTEXPR for g++-5 -std=c++-14
70777a699 Merge branch 'atafra-old_mac_fix' into dev
6b81678d0 fixed compile error on some older macOS versions
2c3b43f15 refactor util
91784e2b8 test_util checks AMX and AVX_VNNI
70b70c557 update to v5.99
284cc5bed refactor
6b3eb9c1e default encoding is always evex
f85b1100b refactor vnni
276d09bae Merge branch 'akharito-akharito/adl_support' into dev
50df86ce3 v5.98
97ce92d58 Merge branch 'akharito/adl_support' of https://github.com/akharito/xbyak into akharito-akharito/adl_support
1f119a04a support [scale * reg]
9ee1bef9a cpuid - check that GRT CPUID leaf 7 subleaf 0 should return EAX=1
be93adb2c add AVX VNNI instruction support
0c277240a add ADL CPUID
a9a5cc2e2 Add option to choose VEX or EVEX encoding
29bfd25ba fix indent
a0c49fa2e Merge branch 'atafra-mac_avx512_fix' into dev
ea388b3c6 fixed incorrect detection of AVX-512 on macOS
ed1b8186f Merge branch 'FEX-Emu-extended_features' into dev
3dacddfec Merge branch 'extended_features' of https://github.com/FEX-Emu/xbyak into FEX-Emu-extended_features
898f78ca3 Merge branch 'kariya-mitsuru-use-sh'
0b7f1411c Merge branch 'use-sh' of https://github.com/kariya-mitsuru/xbyak into kariya-mitsuru-use-sh
99e2b13b2 Fixes extended feature support checking
b0a43c7e5 Use sh instead of tcsh for test scripts
87e8f41ae remove warning of _MSC_VER

git-subtree-dir: externals/xbyak
git-subtree-split: 590c10e3746978dbfcf102d6da933ac2659e4544
2021-03-27 21:08:22 +00:00
MerryMage
ad9b33164e externals: Update xbyak
- Fix on-demand AVX512 on macOS

Merge commit '2f9dea5cc355c266ad46d2f6397b141b99f78480'
2021-03-27 21:08:22 +00:00
Lioncash
5873e6b955 thumb32: Implement LDRD (immediate) 2021-03-13 15:29:56 -05:00
Lioncash
9757e2353f thumb32: Implement LDRD (literal) 2021-03-13 15:29:56 -05:00
Lioncash
a74843ca17 thumb32: Implement STRD 2021-03-13 15:29:56 -05:00
Lioncash
258ca93c53 thumb32: Implement TBB/TBH 2021-03-13 15:29:49 -05:00
merry
03575dea84
Merge pull request #591 from lioncash/multiple
thumb32: Implement LDM/STM variants
2021-03-13 07:58:40 +00:00
Lioncash
1d0b705996 thumb32: Implement PUSH
This can be handled as an alias for STMDB.
2021-03-12 19:54:35 -05:00
Lioncash
9cb4790428 thumb32: Implement POP
This can just be treated as an alias to LDMIA
2021-03-12 19:43:47 -05:00
Lioncash
39edee70ff thumb32: Implement LDMDB/LDMEA 2021-03-12 19:35:28 -05:00
Lioncash
ae83713f4f thumb32: Simplify existing store functions into helper function
We can also make a STM helper.
2021-03-12 19:30:29 -05:00
Lioncash
0d887d9ecd thumb32: Implement LDMIA/LDMFD 2021-03-12 19:26:03 -05:00
Lioncash
714ccf13dd thumb32: Implement STMDB/STMFD 2021-03-12 19:05:39 -05:00
Lioncash
91c4d59da9 thumb32: Implement STMIA/STMEA 2021-03-12 19:05:15 -05:00
merry
543ba4e61f
Merge pull request #589 from lioncash/adr
thumb32: Implement plain binary immediate ADR variants
2021-03-12 23:10:23 +00:00
Lioncash
85b8adeb32 thumb32: Implement plain binary immediate ADR variants
Now all the plain binary immediate instructions are implemented.
2021-03-12 18:05:41 -05:00
merry
54c481c8bd
Merge pull request #588 from lioncash/store
thumb32: Implement STRB{T}/STRH{T}/STR{T} immediate variants
2021-03-12 21:30:00 +00:00
Lioncash
bd02d9e27f thumb32: Implement STR immediate variants 2021-03-12 14:03:40 -05:00
Lioncash
2521314384 thumb32: Implement STRH immediate variants 2021-03-12 13:55:39 -05:00
Lioncash
cbf9027278 thumb32: Implement STRB immediate variants 2021-03-12 13:33:11 -05:00
merry
2093d2b775
Merge pull request #587 from lioncash/8dot7
a64: Add v8.7 instruction additions to the decoder
2021-03-10 21:19:03 +00:00
merry
41bee2db3c
Merge pull request #586 from lioncash/halfword
thumb32: Implement halfword load instructions
2021-03-10 21:18:48 +00:00
Lioncash
035580abd2 a64: Add v8.7 instruction additions to the decoder
Adds the instructions introduced in FEAT_WFxT and FEAT_LS64/FEAT_LS64_V
in ARMv8.7
2021-03-09 18:41:20 -05:00
Lioncash
fb30922cd1 thumb32: Add supporting decoder entry for PLD (literal)
LDRH (literal)'s pseduocode indicates that cases where Rt specifies the
PC, that the instruction should be execured as if it were a PLD
instruction.

Curiously, however, within the ARM reference manual, the encodings in the case
that happens doesn't match up.

The bit pattern for LDRH (literal) has bit 21 set to 1, but the encoding
of PLD (literal) has bit 21 set to zero for it's only thumb encoding.
2021-03-09 18:16:08 -05:00
Lioncash
921998f6e9 thumb32: Implement LDRSH variants 2021-03-09 18:11:33 -05:00
Lioncash
7a9bdc8f21 thumb32: Implement LDRH variants 2021-03-09 17:12:46 -05:00