Math operations such as Matrix multiplication utilize these particular
instructions enough that there should be some unit tests for thesein particular.
The lane-splatting form of FMUL and FMLA instructions are of particular
interest and I've found them to be very common in retail game binaries
such as Pokemon Sword.
https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/coding-for-neon---part-3-matrix-multiplication
I'm primarily adding this unit test so that I can ensure compatibility
while I tune and optimize them.