As you likely know, modern GPUs shade triangles in blocks of 2x2 pixels, or quads. Consequently, redundant processing can happen along the edges where there’s partial coverage, since only some of the pixels will end up contributing to the final image. Normally this isn’t a problem, but – depending on the complexity of the pixel shader – it can significantly increase, or even dominate, the cost of rendering meshes with lots of very small or thin triangles.
Figure 1: Quad overshading, the silent performance killer
It’s hardly surprising, then, that IHVs have been advising for years to avoid triangles smaller than a certain size, but that’s somewhat at odds with game developers – artists in particular – wanting to increase visual fidelity and believability, through greater surface detail, smoother silhouettes, more complex shading, etc. (As a 3D programmer, part of my job involves the thrill of being stuck in the middle of these kinds of arguments!)
Traditionally, mesh LODs have helped to keep triangle density in check. More recently, deferred rendering methods have sidestepped a large chunk of the redundant shading work, by writing out surface attributes and then processing lighting more coherently via volumes or tiles. However, these are by no means definitive solutions, and nascent techniques such as DX11 tessellation and tile-based forward shading not only challenge the status quo, but also bring new relevancy to the problem of quad shading overhead.
Knowing about this issue is one thing, but, as they say: seeing is believing. In a previous article, I showed how to display hi-z and quad overshading on Xbox 360, via some plaform-specific tricks. That’s all well and good, but it would be great to have the same sort of visualisation on PC, built into the game editor. It would also be helpful to have some overall stats on shading efficiency, without having to link against a library (GPUPerfAPI, PerfKit) or run a separate tool.
There are several ways of reaching these modest goals, which I’ll cover next. What I’ve settled on so far is admittedly a hack: a compromise between efficiency, memory usage, correctness and simplicity. Still, it fulfils my needs so far and I hope you find it useful as well.