Wow, destroying voxels is pretty fast in comparison. Do you think pasting the 32 voxel blocks would be faster?
I tested it with larger voxel block sizes and it is definitely slower. Once I move the more computationally expensive parts of block placement (procedural generation, voxel adjacency bits, and mipmap updates) to compute shaders, where they can run in parallel, it will be fast for any reasonable block size or number of blocks.
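
To illustrate why the adjacency-bit step suits a compute shader, here is a rough CPU-side sketch in Python. It is not the engine's actual code: the dense 0/1 grid, the 6-bit face-neighbor mask, the bit order, and the names `adjacency_bits` and `place_block` are all assumptions made for the example. The point is only that each voxel's mask is computed from reads of the finished grid, so the second loop could in principle become one GPU thread per voxel.

```python
# Hypothetical layout: dense 3D grid of 0/1 occupancy, one 6-bit mask per voxel.
# Assumed bit order: +x, -x, +y, -y, +z, -z.
NEIGHBOR_OFFSETS = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def adjacency_bits(grid, x, y, z):
    """Return a 6-bit mask marking which face neighbors of (x, y, z) are solid."""
    size_x, size_y, size_z = len(grid), len(grid[0]), len(grid[0][0])
    mask = 0
    for bit, (dx, dy, dz) in enumerate(NEIGHBOR_OFFSETS):
        nx, ny, nz = x + dx, y + dy, z + dz
        inside = 0 <= nx < size_x and 0 <= ny < size_y and 0 <= nz < size_z
        if inside and grid[nx][ny][nz]:
            mask |= 1 << bit
    return mask


def place_block(grid, origin, size):
    """Fill a cubic block, then recompute masks for the block and a one-voxel border.

    Assumes the block lies entirely inside the grid.
    """
    ox, oy, oz = origin
    for x in range(ox, ox + size):
        for y in range(oy, oy + size):
            for z in range(oz, oz + size):
                grid[x][y][z] = 1

    # Each mask only reads the finished grid, so every iteration is independent.
    # That independence is what would let this loop run as a parallel dispatch.
    masks = {}
    for x in range(max(ox - 1, 0), min(ox + size + 1, len(grid))):
        for y in range(max(oy - 1, 0), min(oy + size + 1, len(grid[0]))):
            for z in range(max(oz - 1, 0), min(oz + size + 1, len(grid[0][0]))):
                masks[(x, y, z)] = adjacency_bits(grid, x, y, z)
    return masks
```

In this sketch the cost grows with the block volume plus its border, which matches the observation that larger blocks are slower when the work runs serially.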