More ambient occlusion optimizations

I got to thinking about the GLSL voxel level ambient occlusion code and realized that its biggest performance bottleneck is determining adjacent voxels in order to calculate how much ambient occlusion darkening to apply to the current fragment. The old approach has always been to look at neighboring voxels. That's fairly fast for neighboring voxels in the same voxel volume, but it still has some overhead, and even more so when neighboring voxels are not in the same voxel volume. I previously improved performance on both counts with a caching mechanism for volume indices, which avoids unnecessary repeated queries of the tetrahexacontree, and I also added voxel bits which indicate whether a voxel side requires ambient occlusion calculations at all.

Better rendering performance is usually about making the GPU do less (without sacrificing visual quality). With that thought in mind, I realized that if each voxel stores adjacency bits for its neighboring voxels, and these bits are calculated once and recalculated only when one or more of the respective voxels change, a significant performance gain could be had when rendering voxel level ambient occlusion. As before, voxel sides which don't need voxel level ambient occlusion are trivially rejected and rendered without it.
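A minimal CPU-side sketch of the precomputation idea, written in C++ rather than GLSL. Everything here is an assumption for illustration: the dense `Grid` type stands in for the real volume storage, `adjacencyBits` is a hypothetical helper, and the guess that the 20 bits cover the 12 edge neighbors plus the 8 corner neighbors (the 26 surrounding voxels minus the 6 face neighbors) is an inference, not something confirmed by the engine. The bit order is arbitrary.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical dense occupancy grid. The real engine keeps voxels in
// volumes indexed by a tree, which is not modeled here.
struct Grid {
    int nx, ny, nz;
    std::vector<std::uint8_t> solid;  // 1 = occupied, 0 = empty

    Grid(int x, int y, int z)
        : nx(x), ny(y), nz(z),
          solid(static_cast<std::size_t>(x) * y * z, 0) {}

    bool inBounds(int x, int y, int z) const {
        return x >= 0 && y >= 0 && z >= 0 && x < nx && y < ny && z < nz;
    }
    std::size_t index(int x, int y, int z) const {
        return (static_cast<std::size_t>(z) * ny + y) * nx + x;
    }
    // Out-of-bounds cells count as empty in this sketch.
    bool isSolid(int x, int y, int z) const {
        return inBounds(x, y, z) && solid[index(x, y, z)] != 0;
    }
    void set(int x, int y, int z, bool s) {
        solid[index(x, y, z)] = s ? 1 : 0;
    }
};

// Assumed meaning of the 20 bits: one bit per edge or corner neighbor.
// Run once per voxel, and again only when a nearby voxel changes.
std::uint32_t adjacencyBits(const Grid& g, int x, int y, int z) {
    std::uint32_t bits = 0;
    int bit = 0;
    for (int dz = -1; dz <= 1; ++dz)
        for (int dy = -1; dy <= 1; ++dy)
            for (int dx = -1; dx <= 1; ++dx) {
                int nonZero = (dx != 0) + (dy != 0) + (dz != 0);
                if (nonZero < 2) continue;  // skip self and the 6 face neighbors
                if (g.isSolid(x + dx, y + dy, z + dz)) bits |= (1u << bit);
                ++bit;
            }
    return bits;  // only the low 20 bits can be set
}
```

For the center voxel of an empty 3x3x3 grid this returns 0, and with every cell filled it returns 0xFFFFF. A shader could then test these stored bits instead of fetching neighboring voxels per fragment.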

In this test scene, I now get better frame rates with voxel level ambient occlusion on rather than off because my video card goes into a higher performance mode (GPU clock speed stepping) with the little extra "push".

Voxel level ambient occlusion off:

Voxel level ambient occlusion on:

P.S. Previously I was seeing slightly better frames per second overall, but there was some screen tearing caused by a synchronization issue, which I have since fixed.

P.S. P.S. Voxels are now 64-bit integers (GLSL uint64_t), which makes room for the extra 20 bits required for this optimization. I'm using 16 bits for color and 1 bit for the interior (or hidden) property. This still leaves 27 bits free in each voxel for opacity, reflectivity, ~~face normals~~ (face normals can be calculated in the shader from the voxel face and its adjacency bits if need be) and/or other voxel properties I may come up with. Previously, voxels were 32-bit integers.
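To make the bit budget concrete, here is a small C++ sketch of one way such a 64-bit voxel could be packed. Only the field widths (16, 1, 20 and 27 bits) come from the numbers above; the field order, the shift values, and every name in the `voxel` namespace are hypothetical.

```cpp
#include <cstdint>

namespace voxel {

// Hypothetical layout, low bits first: color, interior flag, adjacency, free.
constexpr unsigned kColorShift    = 0,  kColorBits    = 16;
constexpr unsigned kInteriorShift = 16, kInteriorBits = 1;
constexpr unsigned kAdjShift      = 17, kAdjBits      = 20;
constexpr unsigned kFreeShift     = 37, kFreeBits     = 27;
static_assert(kColorBits + kInteriorBits + kAdjBits + kFreeBits == 64,
              "field widths must add up to 64 bits");
static_assert(kFreeShift + kFreeBits == 64, "free bits must end at bit 63");

// Mask with the low `bits` bits set (valid for bits < 64).
constexpr std::uint64_t mask(unsigned bits) {
    return (std::uint64_t{1} << bits) - 1;
}

constexpr std::uint64_t pack(std::uint16_t color, bool interior,
                             std::uint32_t adjacency) {
    return (static_cast<std::uint64_t>(color) << kColorShift)
         | (static_cast<std::uint64_t>(interior ? 1 : 0) << kInteriorShift)
         | ((static_cast<std::uint64_t>(adjacency) & mask(kAdjBits)) << kAdjShift);
}

constexpr std::uint16_t getColor(std::uint64_t v) {
    return static_cast<std::uint16_t>((v >> kColorShift) & mask(kColorBits));
}

constexpr bool isInterior(std::uint64_t v) {
    return ((v >> kInteriorShift) & 1u) != 0;
}

constexpr std::uint32_t getAdjacency(std::uint64_t v) {
    return static_cast<std::uint32_t>((v >> kAdjShift) & mask(kAdjBits));
}

// Replace only the adjacency field, e.g. after a neighboring voxel changed.
constexpr std::uint64_t withAdjacency(std::uint64_t v, std::uint32_t adjacency) {
    return (v & ~(mask(kAdjBits) << kAdjShift))
         | ((static_cast<std::uint64_t>(adjacency) & mask(kAdjBits)) << kAdjShift);
}

}  // namespace voxel
```

With this layout, `voxel::withAdjacency` rewrites only the adjacency field, so color, the interior flag and the 27 free bits are left untouched.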

Vorxel

Status	In development
Author	Teknologicus
Genre	Adventure
Tags	Exploration, Open World, Voxel

Comments

MountainLabs 308 days ago

I'm curious, how are the actual voxel volume modification speeds going? (block placing and breaking)

Teknologicus 307 days ago (2 edits)

For placing a block made of 32x32x32 voxels, I'm seeing an update time of about 1/10 of a second. For removing a block made of 32x32x32 voxels, I'm seeing an update time of about 1/100 of a second.

Placing a procedurally generated block takes longer than removing a block because procedural generation is more computationally expensive depending on the block type.

I still haven't gotten the voxel volume mipmap and voxel adjacency bit updates working in compute shaders, so these numbers were measured with code compiled to do those updates on the CPU and push the updated voxel data to the GPU. (I would like to move procedural block generation to a compute shader too.)

For simple game play, these numbers are acceptable, but for procedural terrain and/or procedural structure (villages, castles, etc.) generation, world editing or anything that would update a lot of voxels at once, these numbers are way too slow in my opinion.
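A rough C++ sketch of why an edit touches more voxels than the block itself: neighbor bits can change for every voxel within one voxel of the edited region, so the region to recompute is the edited box grown by one in each direction. The `Box` type and both functions are hypothetical, and clamping to a single volume is a simplification, since neighbors can also live in adjacent volumes.

```cpp
#include <algorithm>

// Hypothetical axis-aligned box of voxel coordinates, inclusive bounds.
struct Box {
    int minX, minY, minZ;
    int maxX, maxY, maxZ;
};

// Region whose adjacency bits may need recomputing after `edited` changes:
// the edited box grown by one voxel, clamped to a volume of size nx*ny*nz.
Box dirtyRegion(const Box& edited, int nx, int ny, int nz) {
    return Box{
        std::max(edited.minX - 1, 0),
        std::max(edited.minY - 1, 0),
        std::max(edited.minZ - 1, 0),
        std::min(edited.maxX + 1, nx - 1),
        std::min(edited.maxY + 1, ny - 1),
        std::min(edited.maxZ + 1, nz - 1),
    };
}

// Number of voxels contained in a box.
long long voxelCount(const Box& b) {
    return static_cast<long long>(b.maxX - b.minX + 1)
         * (b.maxY - b.minY + 1)
         * (b.maxZ - b.minZ + 1);
}
```

Under these assumptions, a 32x32x32 block placed away from the volume borders dirties a 34x34x34 region, which is 39,304 voxels rather than 32,768.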

MountainLabs 307 days ago

Wow, destroying voxels is pretty fast in comparison. Do you think pasting the 32 voxel blocks would be faster?

Teknologicus 306 days ago (1 edit)

I tested it with larger voxel block sizes and it is definitely slower. Once I get the more computationally expensive parts of block placement (procedural generation, voxel adjacency bits and mipmap updates) moved to compute shaders (computationally parallel), it will be fast for any reasonable block size or number of blocks.

Teknologicus 306 days ago (1 edit)

I misread your question. Yes, pasting would be faster than procedural generation.

MountainLabs 305 days ago

Cool.

MountainLabs 308 days ago

Cool. This project keeps advancing!

Teknologicus 307 days ago

Thank you!

MR.Rose01 309 days ago

looking good

Teknologicus 308 days ago

Thank you!