85d37c9994
Some implementations can use the std::nullopt_t constructor of std::optional to avoid zeroing out the optional's entire internal buffer, and instead set only the validity byte within it.

For example, consider the following function:

    std::optional<std::vector<ShaderDiskCacheRaw>> fn() {
        return {};
    }

With libc++, this results in the following code generation on x86-64:

    fn():
        mov     rax, rdi
        vxorps  xmm0, xmm0, xmm0
        vmovups ymmword ptr [rdi], ymm0
        vzeroupper
        ret

With libstdc++, we get a similar result:

    fn():
        vpxor   xmm0, xmm0, xmm0
        mov     rax, rdi
        vmovdqu XMMWORD PTR [rdi], xmm0
        vmovdqu XMMWORD PTR [rdi+16], xmm0
        ret

If we change this function to return std::nullopt instead, the code generated with both libc++ and libstdc++ simplifies to:

    fn():
        mov     BYTE PTR [rdi+24], 0
        mov     rax, rdi
        ret

Given how small a change is needed to get better code generation, this is essentially a "free", very minor optimization.
||
---|---|---|
debug_utils | ||
renderer_opengl | ||
shader | ||
swrasterizer | ||
texture | ||
CMakeLists.txt | ||
command_processor.cpp | ||
command_processor.h | ||
generate_shaders.cmake | ||
geometry_pipeline.cpp | ||
geometry_pipeline.h | ||
gpu_debugger.h | ||
pica.cpp | ||
pica.h | ||
pica_state.h | ||
pica_types.h | ||
primitive_assembly.cpp | ||
primitive_assembly.h | ||
rasterizer_interface.h | ||
regs.cpp | ||
regs.h | ||
regs_framebuffer.h | ||
regs_lighting.h | ||
regs_pipeline.h | ||
regs_rasterizer.h | ||
regs_shader.h | ||
regs_texturing.h | ||
renderer_base.cpp | ||
renderer_base.h | ||
utils.h | ||
vertex_loader.cpp | ||
vertex_loader.h | ||
video_core.cpp | ||
video_core.h |