The VS Invocations metric reports the number of vertex shader invocations. Without any caching, the vertex shader is invoked once per vertex, so the value depends on the vertex and primitive counts and on how effectively the post-transform vertex cache (VCache) is used. In the optimal case, the GPU fetches already-processed vertices from the cache instead of recalculating them, which lowers the value of this metric.
Therefore, when VS Invocations and Vertex Count have similar values, the geometry is not optimized to take advantage of the VCache.
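The relationship between index order and VS Invocations can be sketched with a small model. The following is a minimal illustration, not a description of any particular GPU: it assumes a FIFO cache keyed by vertex index, and the function name `count_vs_invocations` and the default cache size are hypothetical.

```python
from collections import deque


def count_vs_invocations(indices, cache_size=16):
    """Count vertex shader runs for an index buffer.

    Assumption: the post-transform cache is modeled as a simple FIFO of
    vertex indices. Real hardware may use a different policy and size.
    """
    cache = deque(maxlen=cache_size)  # a full deque drops its oldest entry
    invocations = 0
    for index in indices:
        if index not in cache:
            invocations += 1      # cache miss: the vertex is shaded again
            cache.append(index)
    return invocations


# Two triangles sharing an edge: 6 indices, but only 4 distinct vertices.
quad = [0, 1, 2, 2, 1, 3]
print(len(quad), count_vs_invocations(quad))
```

In this model, Vertex Count corresponds to `len(quad)` and VS Invocations to the returned count, so the shared edge saves two shader runs.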
The OptimizedMesh sample from the Microsoft* DirectX* SDK is a good example to illustrate the Vertex Count and VCache optimizations:
- When rendering one unoptimized mesh as a triangle list, Vertex Count is 141K and VS Invocations is 112K.
- When rendering the same mesh as a triangle list that has been reordered for optimum VCache usage, Vertex Count stays the same but VS Invocations drops to 27K, roughly a quarter of the unoptimized value.
- When rendering the same mesh as a VCache-optimized triangle strip, Vertex Count drops to 52K and VS Invocations drops to 25K.
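One way to compare these three runs is to divide VS Invocations by Vertex Count. The short sketch below only does arithmetic on the figures listed above; the dictionary keys and the helper name `invocations_per_vertex` are made up for illustration.

```python
# (Vertex Count, VS Invocations) in thousands, taken from the list above.
runs = {
    "unoptimized list": (141, 112),
    "optimized list": (141, 27),
    "optimized strip": (52, 25),
}


def invocations_per_vertex(vertex_count, vs_invocations):
    """Fraction of submitted vertices that actually ran the vertex shader."""
    return vs_invocations / vertex_count


ratios = {name: round(invocations_per_vertex(*counts), 2)
          for name, counts in runs.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio}")
```

A ratio close to 1 suggests little cache reuse. The ratio is most useful for comparing runs with the same topology: the strip has a higher ratio than the optimized list only because its Vertex Count is much smaller, while its absolute VS Invocations value is the lowest of the three.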
To improve vertex processing performance and reduce the number of vertex shader invocations, try to reorder the geometry for optimum VCache usage. The D3DX utility library contains functions that reorder the geometry to improve VCache utilization (ID3DXMesh::Optimize, ID3DXMesh::OptimizeInplace, D3DXOptimizeFaces, D3DXOptimizeVertices).
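The effect of reordering can be illustrated without the D3DX library. The sketch below is not the algorithm that the D3DX functions use: it assumes a FIFO cache model with a hypothetical size of 16, builds a small grid mesh, shuffles its triangles to mimic poorly ordered geometry, and then applies a deliberately naive reordering (sorting triangles by their lowest vertex index). All function names are hypothetical.

```python
import random
from collections import deque


def count_vs_invocations(indices, cache_size=16):
    """Count shader runs, assuming a FIFO post-transform cache (a model only)."""
    cache = deque(maxlen=cache_size)
    invocations = 0
    for index in indices:
        if index not in cache:
            invocations += 1
            cache.append(index)
    return invocations


def grid_triangles(quads_per_side):
    """Triangle list for a square grid, two triangles per quad."""
    width = quads_per_side + 1  # vertices per row
    triangles = []
    for row in range(quads_per_side):
        for col in range(quads_per_side):
            v = row * width + col
            triangles.append((v, v + 1, v + width))
            triangles.append((v + 1, v + width + 1, v + width))
    return triangles


def flatten(triangles):
    return [index for triangle in triangles for index in triangle]


shuffled = grid_triangles(16)
random.Random(0).shuffle(shuffled)      # mimic badly ordered geometry
reordered = sorted(shuffled, key=min)   # naive locality heuristic, not D3DX

before = count_vs_invocations(flatten(shuffled))
after = count_vs_invocations(flatten(reordered))
print("vertex count:", len(flatten(shuffled)))
print("VS invocations before/after:", before, after)
```

Both orderings submit the same number of vertices, so only the invocation count changes. A production optimizer takes the cache behavior into account much more carefully than this sort does.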
If you render point sprites, VS Invocations is always equal to both Vertex Count and Primitive Count, so no VCache optimization is necessary.
The size of the VCache varies for different GPU models, so you may see different metric values when using the same geometry on different hardware.
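The dependence on cache size can be sketched with the same kind of model. This example assumes a FIFO cache and uses hypothetical cache sizes that do not correspond to any specific GPU; the index buffer simply draws the same three triangles twice.

```python
from collections import deque


def count_vs_invocations(indices, cache_size):
    """Count shader runs, assuming a FIFO post-transform cache (a model only)."""
    cache = deque(maxlen=cache_size)
    invocations = 0
    for index in indices:
        if index not in cache:
            invocations += 1
            cache.append(index)
    return invocations


# Three triangles with nine distinct vertices, submitted twice in a row.
indices = [0, 1, 2, 3, 4, 5, 6, 7, 8] * 2

# Hypothetical cache sizes; actual sizes depend on the GPU model.
counts = {size: count_vs_invocations(indices, size) for size in (4, 8, 16)}
print(counts)  # {4: 18, 8: 18, 16: 9}
```

With a cache smaller than the nine-vertex working set, every vertex has been evicted by the time it is needed again, so the second pass gets no reuse. Once the cache is large enough to hold all nine vertices, the second pass costs no additional invocations.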