Deferred gbuffer rendering via compute shader ray marching output


I've got deferred gbuffer rendering working: a compute shader ray marching pass now writes its output to a gbuffer, which a fragment shader then renders into the frame buffer.  (This is a prerequisite to implementing global illumination and dynamic lighting.)

GPU buffer usage went from 4.4GiB down to 2.2GiB for the large scene "Church_Of_St_Sophia.vox", which halves GPU RAM usage.

Unfortunately, the frame rate is about half what it was.  My best guess is that when ray marching was done in a fragment shader, there were fewer cache misses because the color data went straight into the frame buffer.  With a compute shader pass followed by a fragment shader that renders the gbuffer into the frame buffer, there are more cache misses, both from reading the gbuffer and from the additional data I'm storing in it: the voxel's world space location, the voxel itself (for its property bits [alpha, specular, emissive, portal]), and scene depth.
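To put rough numbers on that guess, here is a minimal back-of-the-envelope sketch in Python.  Everything in it is an assumption for illustration: the field sizes, the RGBA8 frame buffer, and the 1920x1080 resolution are hypothetical and are not the renderer's real layout.  It counts only output-side memory traffic and ignores the voxel reads done while marching, so it is not a prediction of frame time.

```python
# Hypothetical gbuffer texel layout. Every size here is an assumption
# made for illustration, not the layout the renderer actually uses.
GBUFFER_FIELDS_BYTES = {
    "color_rgba8": 4,          # shaded color
    "world_position_xyz": 12,  # three 32-bit floats
    "voxel_value": 4,          # carries the alpha/specular/emissive/portal bits
    "scene_depth": 4,          # one 32-bit float
}
FRAMEBUFFER_BYTES_PER_PIXEL = 4  # assumed RGBA8 frame buffer


def gbuffer_texel_bytes(fields=GBUFFER_FIELDS_BYTES):
    """Total bytes stored per pixel in the assumed gbuffer layout."""
    return sum(fields.values())


def direct_traffic_bytes(width, height):
    """Fragment shader ray marching: color is written once to the frame buffer."""
    return width * height * FRAMEBUFFER_BYTES_PER_PIXEL


def deferred_traffic_bytes(width, height):
    """Compute pass writes a texel; the fragment pass reads it and writes color."""
    texel = gbuffer_texel_bytes()
    per_pixel = texel + texel + FRAMEBUFFER_BYTES_PER_PIXEL
    return width * height * per_pixel


if __name__ == "__main__":
    w, h = 1920, 1080  # assumed resolution
    direct = direct_traffic_bytes(w, h)
    deferred = deferred_traffic_bytes(w, h)
    print(f"gbuffer texel: {gbuffer_texel_bytes()} bytes")
    print(f"direct:   {direct / 2**20:.1f} MiB per frame")
    print(f"deferred: {deferred / 2**20:.1f} MiB per frame")
    print(f"ratio:    {deferred / direct:.1f}x")
```

Under these assumed sizes, the deferred path moves several times more output data per frame than writing color directly, which is at least consistent with more cache pressure.  Shrinking or packing the gbuffer fields would change the ratio accordingly.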

I also managed to break the mipmaps and the voxel level ambient occlusion optimization bits; neither works at all right now.  The code changes to fix this should be small, but tracking down what is wrong with the code has been a challenge.

Comments


P.S.  Decreasing the ray march compute shader's workgroup size (local_size_x) has improved frame rates.
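For anyone tuning the same knob, this small Python sketch shows how local_size_x relates to the number of workgroups dispatched and to the invocations left idle in the last group.  It assumes a one-dimensional dispatch with one invocation per pixel, which may not match how the ray march pass is actually dispatched; the function names are hypothetical.  It does not explain why a smaller size is faster, only what changes when the size changes.

```python
def workgroup_count(total_invocations, local_size_x):
    """Workgroups needed to cover total_invocations (ceiling division)."""
    if local_size_x <= 0:
        raise ValueError("local_size_x must be positive")
    return (total_invocations + local_size_x - 1) // local_size_x


def idle_invocations(total_invocations, local_size_x):
    """Invocations in the final workgroup that fall outside the work range."""
    groups = workgroup_count(total_invocations, local_size_x)
    return groups * local_size_x - total_invocations


if __name__ == "__main__":
    pixels = 1920 * 1080  # assumed resolution, one invocation per pixel
    for size in (1024, 256, 64, 32):
        groups = workgroup_count(pixels, size)
        idle = idle_invocations(pixels, size)
        print(f"local_size_x={size:4d}  groups={groups:6d}  idle={idle}")
```

Smaller workgroups mean more, finer-grained groups for the GPU to schedule, so the best value depends on the hardware and on how much the rays within one group diverge.  Measuring a few sizes is the only reliable way to pick one.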