More ambient occlusion optimizations
I got to thinking about the GLSL voxel-level ambient occlusion code and realized that its biggest performance bottleneck is determining adjacent voxels in order to calculate how much ambient occlusion darkening to apply to the current fragment. The old approach has always been to look at neighboring voxels in the shader. That's fairly fast when the neighbors are in the same voxel volume, though it still has some overhead, and it costs more when they are in a different voxel volume. I had previously improved performance in both cases with a caching mechanism for volume indices, which avoids unnecessary repeated queries of the tetrahexacontree, and by adding voxel bits that indicate whether a voxel side requires ambient occlusion calculations at all.
Better rendering performance is usually about making the GPU do less (without sacrificing visual quality). With that in mind, I realized that if each voxel carried neighboring-voxel adjacency bits, and those bits were calculated once and recalculated only when one or more of the respective voxels change, rendering voxel-level ambient occlusion could be significantly faster. As before, voxel sides that don't need voxel-level ambient occlusion are trivially rejected and rendered without it.
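To make the idea concrete, here is a minimal CPU-side sketch of precomputing adjacency bits for one voxel. It is not the engine's code. The dense `Grid` type, the function name `adjacencyBits` and the bit order are all hypothetical, and it assumes the adjacency bits cover the 12 edge neighbors and 8 corner neighbors of a voxel, which is one way to arrive at 20 bits.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dense grid used only for illustration; the real engine stores
// voxels in volumes indexed by a tetrahexacontree instead.
struct Grid {
    int n;                           // grid is n x n x n
    std::vector<std::uint8_t> solid; // 1 = solid voxel, 0 = empty

    explicit Grid(int size)
        : n(size), solid(static_cast<std::size_t>(size) * size * size, 0) {
        assert(size > 0);
    }
    bool inside(int x, int y, int z) const {
        return x >= 0 && y >= 0 && z >= 0 && x < n && y < n && z < n;
    }
    bool isSolid(int x, int y, int z) const {
        return inside(x, y, z) && solid[(static_cast<std::size_t>(z) * n + y) * n + x] != 0;
    }
    void set(int x, int y, int z, bool s) {
        if (inside(x, y, z)) solid[(static_cast<std::size_t>(z) * n + y) * n + x] = s ? 1 : 0;
    }
};

// Assumed bit order: walk the 3x3x3 neighborhood in z, y, x order and skip the
// center and the 6 face neighbors, leaving 12 edge + 8 corner offsets.
std::uint32_t adjacencyBits(const Grid& g, int x, int y, int z) {
    std::uint32_t bits = 0;
    int bit = 0;
    for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                int nonZero = (dx != 0) + (dy != 0) + (dz != 0);
                if (nonZero < 2) continue; // center or face neighbor
                if (g.isSolid(x + dx, y + dy, z + dz)) bits |= (1u << bit);
                ++bit;
            }
    return bits; // only the low 20 bits can be set
}
```

With bits like these stored per voxel, the shader would only need to test bits of the current voxel instead of fetching neighbors. When a voxel changes, only the voxels in its own 3x3x3 neighborhood need their bits recomputed.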
In this test scene, I now get better frame rates with voxel level ambient occlusion on rather than off because my video card goes into a higher performance mode (GPU clock speed stepping) with the little extra "push".
Voxel level ambient occlusion off:
Voxel level ambient occlusion on:
P.S. Previously I was seeing slightly better frames per second overall, but there was some screen tearing caused by a synchronization issue, which I have since fixed.
P.P.S. Voxels are now 64-bit integers (GLSL uint64_t) to hold the extra 20 bits required for this optimization; before, voxels were 32-bit integers. I'm using 16 bits for color and 1 bit for the interior (or hidden) property. This still leaves 27 bits free in each voxel for opacity, reflectivity, face normals (which can be calculated in the shader from the voxel face and its adjacency bits if need be) and/or other voxel properties I may come up with.
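The bit budget above (16 + 1 + 20 used, 27 free) can be sketched as a packing scheme. The field widths come from the post, but the field positions and every name below (`packVoxel`, `voxelColor` and so on) are assumptions made for illustration, not the engine's actual layout.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout (the post gives only the field widths, not their positions):
//   bits  0..15  color (16 bits)
//   bit   16     interior/hidden flag (1 bit)
//   bits 17..36  neighbor adjacency (20 bits)
//   bits 37..63  free (27 bits)
constexpr int kColorShift = 0;
constexpr int kColorBits = 16;
constexpr int kInteriorShift = 16;
constexpr int kInteriorBits = 1;
constexpr int kAdjacencyShift = 17;
constexpr int kAdjacencyBits = 20;
constexpr int kFreeBits = 64 - (kColorBits + kInteriorBits + kAdjacencyBits);
static_assert(kFreeBits == 27, "field widths should leave 27 free bits");

// Mask with the low `bits` bits set.
constexpr std::uint64_t mask(int bits) {
    return (std::uint64_t{1} << bits) - 1;
}

// Pack the three fields into one 64-bit voxel; the free bits stay zero.
inline std::uint64_t packVoxel(std::uint16_t color, bool interior, std::uint32_t adjacency) {
    assert(adjacency <= mask(kAdjacencyBits)); // adjacency must fit in 20 bits
    return (static_cast<std::uint64_t>(color) << kColorShift)
         | (static_cast<std::uint64_t>(interior) << kInteriorShift)
         | (static_cast<std::uint64_t>(adjacency) << kAdjacencyShift);
}

inline std::uint16_t voxelColor(std::uint64_t v) {
    return static_cast<std::uint16_t>((v >> kColorShift) & mask(kColorBits));
}
inline bool voxelInterior(std::uint64_t v) {
    return ((v >> kInteriorShift) & mask(kInteriorBits)) != 0;
}
inline std::uint32_t voxelAdjacency(std::uint64_t v) {
    return static_cast<std::uint32_t>((v >> kAdjacencyShift) & mask(kAdjacencyBits));
}
```

The same shifts and masks would carry over to GLSL on a `uint64_t` value, provided the driver exposes 64-bit integer support.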
Hantverk'n
Status | In development |
Author | Teknologicus |
Genre | Adventure |
Tags | Exploration, Open World, Voxel |
Comments
I'm curious, how are the actual voxel volume modification speeds going? (block placing and breaking)
For placing a block made of 32x32x32 voxels, I'm seeing an update time of about 1/10 of a second. For removing a block made of 32x32x32 voxels, I'm seeing an update time of about 1/100 of a second.
Placing a procedurally generated block takes longer than removing a block because procedural generation is more computationally expensive depending on the block type.
I still haven't gotten the voxel volume mipmap and voxel adjacency bits updating working in compute shaders, so these numbers are measured with code compiled to do said updates on the CPU and push the updated voxel data to the GPU. (I would like to move procedural block generation to a compute shader too.)
For simple game play, these numbers are acceptable, but for procedural terrain and/or procedural structure (villages, castles, etc.) generation, world editing or anything that would update a lot of voxels at once, these numbers are way too slow in my opinion.
Wow, destroying voxels is pretty fast in comparison. Do you think pasting the 32-voxel blocks would be faster?
I tested it with larger block sizes and it is definitely slower. Once I get the more computationally expensive parts of block placement (procedural generation, voxel adjacency bits and mipmap updates) moved to compute shaders (computationally parallel), it will be fast for any reasonable block size or number of blocks.
I misread your question. Yes, pasting would be faster than procedural generation.
Cool.
Cool. This project keeps advancing!
Thank you!
looking good
Thank you!