Intel® VTune™ Amplifier
Average bandwidth of data transfers between the CPU and the GPU. In some cases (for example, with clEnqueueMapBuffer), a transfer may report a very high bandwidth value because the memory is not copied but shared through the L3 cache.
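The effect of mapped buffers on this metric can be illustrated with a rough sketch. It assumes the average is computed as total bytes divided by total transfer time, which this description does not confirm. The `Transfer` record, the `average_bandwidth_gbps` helper, and all sizes and durations are hypothetical values chosen for illustration, not tool output.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    """One hypothetical CPU<->GPU transfer record (not a tool data structure)."""
    name: str
    size_bytes: int
    duration_s: float


def average_bandwidth_gbps(transfers):
    """Assumed formula: total bytes / total time, in GB/s (1 GB = 1e9 bytes)."""
    total_bytes = sum(t.size_bytes for t in transfers)
    total_time = sum(t.duration_s for t in transfers)
    if total_time <= 0:
        raise ValueError("total transfer time must be positive")
    return total_bytes / total_time / 1e9


# Illustrative numbers only: a real copy of 64 MB that takes 8 ms (about 8 GB/s).
copied = Transfer("clEnqueueWriteBuffer", 64 * 10**6, 0.008)

# A map of the same 64 MB that returns in 20 microseconds because the memory
# is shared rather than copied (about 3200 GB/s of apparent bandwidth).
mapped = Transfer("clEnqueueMapBuffer", 64 * 10**6, 0.00002)

if __name__ == "__main__":
    for label, group in (("copy only", [copied]),
                         ("map only", [mapped]),
                         ("copy + map", [copied, mapped])):
        print(f"{label}: {average_bandwidth_gbps(group):.2f} GB/s")
```

Under these assumptions, the mapped transfer alone reports a value far above what a physical copy could sustain, and including it in the average pulls the overall figure upward. A high value for a map operation therefore reflects shared memory rather than actual data movement.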