* Use pext/pdep where not previously used
* Limit pext/pdep to non-AMD platforms due to slowness on AMD
* Use imul/and as alternatives for AMD and non-BMI2 platforms
The guest program often accesses the NZCV flags directly much less
often than we need to use them for jumps and other such uses.
Therefore, we store our flags in cpsr_nzcv in a x64-friendly format.
This allows for a reduction in conditional jump related code.
Several issues:
1. Several terminal instructions did not stop at the end of a single-step block
2. x64 backend for the A32 frontend sometimes polluted upper_location_descriptor with the single-stepping flag
We also introduce the enable_optimizations parameter to the A32 frontend.
Most of the time when this occurs, it's a bug. Thankfully this isn't the
case. However, we can resolve these cases to make the codebase more
consistent.