GPA reports the texture sampler is busy 95%... What's next?

Hi,

I am developing a DirectX-based medical imaging application that uses volume rendering. We used to require discrete video cards, but we are now re-engineering it to run on recent Intel processors (HD Graphics, as well as Sandy Bridge and Ivy Bridge).

We don't appear to be CPU bound. From GPA, I know that we spend about 1% of the time in the vertex shader. During continuous rendering, the pixel shader is about 50% utilized, and it appears to be stalled the other half of the time. We are sampling volume textures A LOT. As the title says, the texture sampler is busy 95% of the time. I suspect this is due to memory latency, but I don't know how to confirm this. I did not find a counter that indicates how long the sampler is waiting for memory. There is a counter that indicates whether the sampler is "stalled", but that is near zero all the time.

So what would be the logical next step in the performance analysis? I would like to know if we are limited by the sampler, memory bandwidth, or both.

Thanks in Advance

Jan

Hello,

First of all, a couple of things to try:

1) Be sure you have the latest version of Intel GPA (2012 R5, build 187105), and the latest graphics drivers available from the Intel download site (should be 15.28.7.2875/15.28.8.64.2875). We have seen some issues with "old" drivers in particular -- the latest driver has the most accurate metrics data.

2) For Ivy Bridge graphics, I believe that you have enabled additional metrics since you mention the "texture sampler busy" metric. But just to be sure:

On Intel® HD Graphics 4000/2500: to access metrics marked with the asterisk (*), you must explicitly enable the Intel® Graphics Performance Analyzers option in your BIOS settings:

Select Advanced
Select System Agent (SA) Configuration
Select Graphics Configuration
Enable the Intel® Graphics Performance Analyzers option
Reboot your machine

If the BIOS on your system does not include the Intel® Graphics Performance Analyzers option, update your BIOS to the latest version from Intel. After completing your performance monitoring activity, we recommend that you disable the Intel® Graphics Performance Analyzers BIOS option and reboot your machine.

After enabling these extra metrics, you should be able to see 8 different memory metrics (including texture reads). Hopefully this gives you the info you need. For example, check "GPU Memory Reads" to see whether you're having to fetch lots of data from the CPU. The GPU Memory Reads metric represents the number of bytes read from memory by the GPU, and only includes reads due to cache misses and explicitly uncached resources. For texture data, only reads that miss both the texture cache and the L3 cache are included in this total. Therefore, the GPU Texture Reads metric could be significantly higher than the GPU Memory Reads metric if the L3 cache is effectively utilized.
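As a rough illustration of how these two metrics relate, the sketch below estimates a lower bound on the share of texture traffic served from the caches, along with the average memory read rate for a frame. The function name, its parameters, and the sample numbers are hypothetical; nothing here is GPA output or a GPA API, and the calculation assumes both metrics are byte counts taken over the same frame.

```python
def texture_cache_summary(texture_read_bytes, memory_read_bytes, frame_time_s):
    """Relate GPU Texture Reads to GPU Memory Reads for one frame.

    All inputs are hypothetical values copied by hand from the metrics view;
    this helper is an illustration, not part of any GPA interface.
    """
    if texture_read_bytes <= 0 or frame_time_s <= 0:
        raise ValueError("texture_read_bytes and frame_time_s must be positive")

    # GPU Memory Reads also counts non-texture traffic, so this ratio is an
    # upper bound on the share of texture reads that missed both caches.
    max_miss_ratio = min(1.0, memory_read_bytes / texture_read_bytes)

    return {
        # Lower bound on the share of texture reads served by a cache.
        "min_cache_hit_ratio": 1.0 - max_miss_ratio,
        # Average read rate from memory over the frame, in GB/s.
        "memory_read_gb_per_s": memory_read_bytes / frame_time_s / 1e9,
    }


# Made-up numbers: 400 MB sampled, 100 MB read from memory, 50 ms frame.
summary = texture_cache_summary(400e6, 100e6, 0.050)
print(summary)
```

If the memory read rate comes out close to what the platform's memory can sustain, bandwidth is a plausible suspect. If it is low and the cache hit ratio is high while the sampler is still busy, the sampler itself is the more likely limit.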

You also didn't indicate if you are using Intel GPA System Analyzer or Intel GPA Frame Analyzer. If you're using Intel GPA Frame Analyzer, you'll be able to see all the metrics together, and something may end up being unusual and therefore suspect. I would also recommend that you look at the documentation for each of the metrics -- there are various "hints" available for how to fix certain issues, and you may need to track down a number of them before seeing one or more issues (for example, http://software.intel.com/sites/products/documentation/gpa/12.5/index.ht...). In that section, we suggest "Examine the GPU EUs Stalls metric to see amount of EUs stalls. If the percentage is high and the Texture Sampler Busy is close to 100%, most likely you have a texture bottleneck."
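The documentation hint quoted above can be written down as a simple check. This is only a sketch: the function name and both threshold values are assumptions chosen for illustration, since the documentation says "high" and "close to 100%" without giving numbers.

```python
def likely_texture_bottleneck(eu_stall_pct, sampler_busy_pct,
                              stall_threshold=40.0, busy_threshold=90.0):
    """Apply the 'EU stalls high and sampler busy near 100%' hint.

    Both thresholds are assumed values for illustration only; pick limits
    that make sense for your own captures.
    """
    return eu_stall_pct >= stall_threshold and sampler_busy_pct >= busy_threshold


# Figures from this thread: shader stalled about half the time, sampler 95% busy.
print(likely_texture_bottleneck(50.0, 95.0))
```

With the numbers reported in the original post, the check points toward a texture bottleneck, which matches the suspicion there. It does not by itself separate sampler throughput from memory latency.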

You can also try various experiments, such as "2x2 textures" to isolate potential performance bottlenecks.
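One way to read the result of such an experiment is to compare frame times with and without the override. The sketch below assumes you note the two frame times by hand; the function names and the 1.2x cutoff are hypothetical and only illustrate the comparison.

```python
def experiment_speedup(baseline_ms, experiment_ms):
    """Return how many times faster the frame is with the experiment enabled."""
    if baseline_ms <= 0 or experiment_ms <= 0:
        raise ValueError("frame times must be positive")
    return baseline_ms / experiment_ms


def texture_bound_hint(baseline_ms, with_2x2_textures_ms, min_speedup=1.2):
    """Suggest a texture-related limit if tiny textures speed the frame up.

    min_speedup is an assumed cutoff for illustration, not a documented value.
    """
    return experiment_speedup(baseline_ms, with_2x2_textures_ms) >= min_speedup


# Made-up frame times: 50 ms normally, 30 ms with 2x2 textures.
print(texture_bound_hint(50.0, 30.0))
```

A large speedup with 2x2 textures suggests that texture fetch cost (sampler or memory) dominates the frame, while little or no change points to a limit elsewhere.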

You also indicated that you didn't believe that you were CPU-bound -- did you use any "overrides" in Intel GPA System Analyzer to help verify this?

Hopefully this gets you started. If this brings up more questions, probably the next step is to get one of your frame capture files so that we can see all the data at once, and get a better picture of what's happening in the GPU.

Regards,

Neal

Hi,

Just checking... did any of my comments help resolve your original questions?

Regards,

Neal

Nice of you to ask, but not yet.

I have switched to a new machine with an Ivy Bridge CPU, since I understand it provides more performance counters that will help me figure out what the bottleneck is for our volume texture sampling. Once I get GPA running, I can continue on this project.

Hello,

Please realize that for Ivy Bridge graphics, you'll want to set a BIOS option that provides access to a larger set of GPU metrics. This is documented here: http://software.intel.com/sites/products/documentation/gpa/12.5/hh_goto....

Regards,

Neal
