Subdevices and CPU affinity


Hi,

Details:
I tried to work with subdevices on the CPU, assuming that this could help me control which cores the kernels run on. To be more precise, I have some host threads that need to give performance guarantees, and I want to be sure that the OpenCL kernels do not run on the same cores. For the host threads I can set the CPU affinity; for the OpenCL kernels I cannot. I thought that subdevices could solve that problem in some way.
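For reference, this is roughly what setting the affinity of a host thread looks like on Linux. It is only a minimal sketch, assuming glibc's `sched_setaffinity`/`sched_getaffinity`; the helper names (`first_allowed_cpu`, `pin_current_thread`, `current_thread_allowed_on`) are made up for illustration and are not taken from the attached main.c.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>

/* Return the lowest CPU index the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}

/* Restrict the calling thread to a single logical CPU.
 * Returns 0 on success, -1 on error (errno is set by the kernel). */
static int pin_current_thread(int cpu)
{
    cpu_set_t set;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return 1 if the calling thread may run on `cpu`, 0 if not, -1 on error. */
static int current_thread_allowed_on(int cpu)
{
    cpu_set_t set;
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    return CPU_ISSET(cpu, &set) ? 1 : 0;
}
```

A host thread would call `pin_current_thread(n)` once at startup. Nothing comparable is exposed for the worker threads created by the OpenCL runtime, which is the gap described here.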

While digging into that topic, I came across some peculiarities.

1. I found that the Intel OpenCL runtime creates one thread per CPU core, each with a specific CPU affinity set. This can be seen in gdb or htop. It is strange, however, that the CPU affinity of those threads is not constant for the whole runtime (i.e. it is reset from time to time).
2. I also found that some of those threads seem to be set to the same affinity. This could also be a side effect of the refresh rate in htop, so that I cannot see when the affinity has changed.
3. Subdevices cannot be created by affinity domain (CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN), although clGetDeviceInfo(..., CL_DEVICE_PARTITION_PROPERTIES, ...) reports it as supported.
4. When subdevices are created using CL_DEVICE_PARTITION_EQUALLY, the number of utilized cores seems to be one less than specified (i.e. partitioning equally into subdevices with 4 compute units each will actually use only 3 CPU cores).
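To take the htop refresh rate out of the picture for points 1 and 2, the affinities can also be read from inside the process at a chosen moment. The following is a minimal sketch, assuming Linux with /proc mounted and glibc's `sched_getaffinity`; `dump_thread_affinities` is a hypothetical helper name, not part of the attached code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dirent.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Print the allowed CPUs of every thread in the current process.
 * Returns the number of threads reported, or -1 if /proc is unavailable. */
static int dump_thread_affinities(FILE *out)
{
    DIR *dir = opendir("/proc/self/task");
    struct dirent *entry;
    int threads = 0;

    assert(out != NULL);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        char *end;
        long tid = strtol(entry->d_name, &end, 10);
        cpu_set_t set;

        /* Skip "." and ".." and anything else that is not a numeric tid. */
        if (end == entry->d_name || *end != '\0')
            continue;
        /* The thread may have exited between readdir and this call. */
        if (sched_getaffinity((pid_t)tid, sizeof(set), &set) != 0)
            continue;

        fprintf(out, "tid %ld:", tid);
        for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu)
            if (CPU_ISSET(cpu, &set))
                fprintf(out, " %d", cpu);
        fputc('\n', out);
        ++threads;
    }
    closedir(dir);
    return threads;
}
```

Calling `dump_thread_affinities(stderr)` right before and right after enqueueing a kernel would show whether two runtime threads really share a core at the same moment, or whether the affinity only changed between two htop refreshes.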

Questions:
- Is it possible to set the cpu affinity per subdevice or to a running OpenCL kernel?
- Can you reproduce and explain the behavior above? Does it make sense that two running kernels share one CPU core, even though they are running on separate subdevices?

Setup:
- Linux 3.7.6-1-ARCH #1 SMP PREEMPT x86_64 GNU/Linux
- Intel Core i7 2600K
- Intel OpenCL SDK 2013 Build 56860

Attachments: a minimal running code example and some additional information about my system setup.

Kind Regards,
Tobias.

Attachment (Size)
clinfo.txt (9.66 KB)
main.c (4.21 KB)
proc-cpuinfo.txt (7.15 KB)
Yuri Kulakov (Intel)

Hi Tobias,

Thank you for the information. We were able to reproduce the behavior you mentioned. The investigation is in progress. We will inform you of the results.

Thanks,
Yuri

Aharon Abramson (Intel)

Hi, Tobias.

We have found a bug in thread affinitization - both in root device and sub-devices. We have fixed the root device problem for the next release (gold). For the sub-device problem we have filed a high-priority bug.

Thanks, Aharon.
