As stated in Part 1 of this series, the main goal of Microsoft Direct3D* (D3D) 12 is to reduce CPU overhead, that is, to let the CPU process and submit render commands more efficiently. This leads us to the first new D3D 12 feature we will discuss: the Pipeline State Object, or PSO. To discuss the PSO we must first review the D3D 11 render context and then cover the changes in D3D 12. Below you will find the D3D 11 render context as shown by Max McMullen, the D3D Development Lead, during BUILD 2014, held in April.
The arrows show the individual pipeline states, each of which can be retrieved or set based on the needs of the game. The other states at the bottom consist of fixed-function states such as the viewport or scissor rect. We will review the rest of the diagram as we cover the relevant features in the following parts of this overview; for the purposes of this discussion we need only review the left side. D3D 11’s small state objects did reduce CPU overhead compared with D3D 9, but the driver still had additional work to do at render time, taking these small state objects and combining them into GPU code. Let’s call it the hardware mismatch overhead. Take a look at another diagram from BUILD 2014.
The left side shows a D3D 9 style pipeline with simpler states; it is what the application uses to do its work. On the right is the hardware (HW) that needs to be programmed. State 1 represents the shader code. State 2 is a combination of the rasterizer and the control flow linking the rasterizer to the shaders. State 3 is the linkage between the blend and the pixel shader. We see that the D3D vertex shader affects HW states 1 and 2, the rasterizer affects state 2, the pixel shader affects states 1 through 3, and so on. Most drivers do not want to submit calls to the hardware at the moment the application makes them. They prefer to record the changes and defer the work until they can see what the application actually wants. This means additional CPU overhead as things are marked ‘dirty’. Then, at draw time, the driver runs control flow to check the state of each object and programs the hardware to match the state the game has set. That is a lot of additional work, and as we all know, the more complicated something is, the more things can go wrong. Ideally, once the game sets the pipeline state, the driver knows what the game intends and programs the hardware just once. In the diagram below we see the D3D 12 pipeline, which does just that in what is called the Pipeline State Object.
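Independent of the diagrams, the dirty-flag bookkeeping described above can be sketched as a toy model. This is not driver code and not the D3D API: `DeferredDriver`, `PsoDriver`, `ToyPipelineState`, and the counters are hypothetical names invented for illustration, and the dependency table only encodes the mapping stated in the text (vertex shader to HW states 1 and 2, rasterizer to state 2, pixel shader to states 1 through 3, blend to state 3).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy model only: every type here is hypothetical, not part of D3D.
enum ApiState : std::size_t { VertexShader = 0, Rasterizer, PixelShader, Blend, ApiStateCount };

constexpr unsigned kHwStateCount = 3;

// Bit i set means HW state (i + 1) depends on that API state.
constexpr std::array<std::uint8_t, ApiStateCount> kHwDependencies = {
    0x3,  // vertex shader -> HW states 1 and 2
    0x2,  // rasterizer    -> HW state 2
    0x7,  // pixel shader  -> HW states 1, 2 and 3
    0x4,  // blend         -> HW state 3
};

// D3D 11 style: record each small state change, mark HW states dirty,
// and reconcile everything at draw time.
struct DeferredDriver {
    std::array<int, ApiStateCount> values{};
    std::uint8_t dirtyHw = 0;
    int hwProgramCount = 0;  // how often a HW state was reprogrammed
    int drawTimeChecks = 0;  // how many dirty checks ran inside Draw()

    void Set(ApiState state, int value) {
        assert(state < ApiStateCount);
        values[state] = value;
        dirtyHw = static_cast<std::uint8_t>(dirtyHw | kHwDependencies[state]);
    }

    void Draw() {
        for (unsigned hw = 0; hw < kHwStateCount; ++hw) {
            ++drawTimeChecks;
            if (dirtyHw & (1u << hw)) {
                ++hwProgramCount;  // stand-in for rebuilding that HW state
            }
        }
        dirtyHw = 0;
    }
};

// PSO style: all API states arrive together, so the mapping to HW state
// can be resolved once at creation instead of on every draw.
struct ToyPipelineState {
    std::array<int, ApiStateCount> values;
};

struct PsoDriver {
    const ToyPipelineState* bound = nullptr;
    int drawTimeChecks = 0;

    void SetPipelineState(const ToyPipelineState& pso) { bound = &pso; }

    void Draw() {
        assert(bound != nullptr);  // nothing to reconcile, only a bound PSO is needed
    }
};
```

In this model the deferred driver pays for three dirty checks on every draw, even when nothing changed, while the PSO-style driver does no per-draw reconciliation at all. Real drivers are far more involved; the sketch only shows where the extra control flow comes from.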
Referring back to the D3D 11 render context, remember that some states were marked ‘other’. The D3D 12 team realized the importance of keeping the PSO size under control and of allowing the game to change the render target without affecting a compiled PSO. So things like the viewport and scissor rect are left separate and programmed orthogonally to the rest of the pipeline, which most hardware can do. The resulting PSO looks like this:
Instead of the ability to set and read each individual state, we now have a single point at which the pipeline state is set, which reduces or entirely removes the hardware mismatch overhead. The application sets the PSO as it needs, while the driver simply takes the API commands and translates them to GPU code without the additional flow control overhead. The application is ‘closer to the metal’, so draw commands take fewer cycles and performance increases.
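To make the "single point" idea concrete, here is a minimal sketch of how an application-facing interface might look under that design. It is a simplified stand-in, not the actual D3D 12 API: `PipelineState`, `CommandList`, `Viewport`, and the counters are hypothetical names, and the integer arguments merely stand in for compiled shaders and state descriptions. The sketch assumes the split described above: shader, rasterizer, and blend state are baked into one immutable object, while the viewport stays separate.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration; not the D3D 12 API.
struct Viewport {
    int x, y, width, height;
};

// Immutable after creation: there are no per-field setters or getters,
// so the driver can translate the whole bundle to GPU state exactly once.
class PipelineState {
public:
    PipelineState(std::uint32_t vertexShader, std::uint32_t pixelShader,
                  std::uint32_t rasterizer, std::uint32_t blend)
        : hwImage_{vertexShader, pixelShader, rasterizer, blend} {}

    const std::vector<std::uint32_t>& HwImage() const { return hwImage_; }

private:
    std::vector<std::uint32_t> hwImage_;  // stands in for pre-translated GPU state
};

struct CommandList {
    const PipelineState* pso = nullptr;
    Viewport viewport{0, 0, 0, 0};
    int psoBinds = 0;
    int draws = 0;

    // The single point through which shader, rasterizer, and blend state change.
    void SetPipelineState(const PipelineState& state) {
        pso = &state;
        ++psoBinds;
    }

    // Kept outside the PSO, so changing it never touches the compiled object.
    void SetViewport(const Viewport& vp) {
        assert(vp.width >= 0 && vp.height >= 0);
        viewport = vp;
    }

    // A draw needs only a bound PSO; there is no per-state reconciliation.
    bool Draw() {
        if (pso == nullptr) {
            return false;
        }
        ++draws;
        return true;
    }
};
```

With this shape, an application could bind one `PipelineState`, draw, change only the viewport, and draw again without creating or rebinding a pipeline object, which is the orthogonality the text describes.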
Next up in Part 3: Resource Binding
Diagrams from BUILD 2014 presentation created by Max McMullen, D3D Development Lead at Microsoft.