BMI2 instructions such as `pdep` and `pext` have been
known to be incredibly slow on AMD. But on Zen3
and newer, the performance of these instructions
are now much greater, but previous versions of AMD
architectures should still avoid BMI2.
On Zen 2, pdep/pext were 300 cycles. Now on Zen 3 it is 3 cycles.
This is a big enough improvement to allow BMI2 code to
be dispatched if available. The Zen 3 architecture is checked for
by detecting the family of the processor.
590c10e37 fix typo
396715a89 use github action
69a25fd92 change build status to travis-ci.com
a6a9dac91 meanings of the name
8d1e41b65 test_util.cpp supports OpenBSD
77ffe7173 Merge pull request #115 from Ryan-rsm-McKenzie/master
a1da3403a fix build interface include directory
e0136d4ef fix add_library call with INTERFACE
3071eee0c support slightly more modern cmake
e626d6209 v5.991
a49c4bc11 disable XBYAK_CONSTEXPR for g++-5 -std=c++-14
70777a699 Merge branch 'atafra-old_mac_fix' into dev
6b81678d0 fixed compile error on some older macOS versions
2c3b43f15 refactor util
91784e2b8 test_util checks AMX and AVX_VNNI
70b70c557 update to v5.99
284cc5bed refactor
6b3eb9c1e default encoding is always evex
f85b1100b refactor vnni
276d09bae Merge branch 'akharito-akharito/adl_support' into dev
50df86ce3 v5.98
97ce92d58 Merge branch 'akharito/adl_support' of https://github.com/akharito/xbyak into akharito-akharito/adl_support
1f119a04a support [scale * reg]
9ee1bef9a cpuid - check that GRT CPUID leaf 7 subleaf 0 should return EAX=1
be93adb2c add AVX VNNI instruction support
0c277240a add ADL CPUID
a9a5cc2e2 Add option to choose VEX or EVEX encoding
29bfd25ba fix indent
a0c49fa2e Merge branch 'atafra-mac_avx512_fix' into dev
ea388b3c6 fixed incorrect detection of AVX-512 on macOS
ed1b8186f Merge branch 'FEX-Emu-extended_features' into dev
3dacddfec Merge branch 'extended_features' of https://github.com/FEX-Emu/xbyak into FEX-Emu-extended_features
898f78ca3 Merge branch 'kariya-mitsuru-use-sh'
0b7f1411c Merge branch 'use-sh' of https://github.com/kariya-mitsuru/xbyak into kariya-mitsuru-use-sh
99e2b13b2 Fixes extended feature support checking
b0a43c7e5 Use sh instead of tcsh for test scripts
87e8f41ae remove warning of _MSC_VER
git-subtree-dir: externals/xbyak
git-subtree-split: 590c10e3746978dbfcf102d6da933ac2659e4544
LDRH (literal)'s pseduocode indicates that cases where Rt specifies the
PC, that the instruction should be execured as if it were a PLD
instruction.
Curiously, however, within the ARM reference manual, the encodings in the case
that happens doesn't match up.
The bit pattern for LDRH (literal) has bit 21 set to 1, but the encoding
of PLD (literal) has bit 21 set to zero for it's only thumb encoding.