OPENCL MIC DEVICE HW EXCEPTION error

Fernando G.

Hi, 

I am trying to run a piece of code on an Intel Xeon Phi accelerator, without success. I repeatedly get this error: *** OPENCL MIC DEVICE HW EXCEPTION ***: Segmentation fault (Address not mapped to object [0x7fcafd726640]) (the reported address is different every time). I am fairly sure that the cl_variables have the correct bounds, i.e., I am not writing out of bounds, so the memory should not have been corrupted.

Besides, I have two Xeon Phi accelerators installed in the same host, but the OpenCL driver recognizes only one of them. Is there anything I have missed?

I should clarify that this very same code runs without trouble using NVIDIA's OpenCL driver on NVIDIA GPUs.

Thanks in advance.

Hi Fernando,

Is it possible to attach a minimal reproducer for the segfault issue?

As for the 2 installed accelerators and 1 detected device, I will get back to you once I have clarified it.

Thanks,
Yuri

Fernando,

Can you list the Device ID of your part? Also the PCI config space subsystem vendor ID and revision ID would be useful. Just so we know what part we are dealing with.

- Chuck

An update regarding OpenCL support for multiple Xeon Phi devices: the current release (XE 2013 Beta) supports only 1 device. Multiple devices will be supported in the next release later this year.

Thanks,

Yuri

Fernando G.

Hi all, 

Here are the results of running /opt/intel/mic/bin/micinfo:

MicInfo Utility Log

Created Tue Jan 15 11:44:02 2013

System Info
Host OS : Linux
OS Version : 2.6.32-279.19.1.el6.x86_64
Driver Version : 4346-16
MPSS Version : 2.1.4346-16
Host Physical Memory : 65886 MB
CPU Family : GenuineIntel Family 6 Model 45 Stepping 7
CPU Speed : 1200.000
Threads per Core : 2

Device No: 0, Device Name: Intel(R) Xeon Phi(TM) coprocessor

Version
Flash Version : 2.1.01.0375
UOS Version : 2.6.34.11-g65c0cd9
Device Serial Number : ADKC23000122

Board
Vendor ID : 8086
Device ID : 225d
SubSystem ID : 2500
MIC Processor Stepping ID : 1
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 4096 bytes
MIC Processor Model : 0x01
MIC Processor Model Ext : 0x00
MIC Processor Type : 0x00
MIC Processor Family : 0x0b
MIC Processor Family Ext : 0x00
MIC Silicon Stepping : B0
Board SKU : ES2-P1330
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS

Core
Total No of Active Cores: 57
Voltage : 1049000 uV
Frequency : 1100000 kHz

Thermal
Fan Speed Control : On
SMC Firmware Version : 1.6.3983
FSC Strap : 14 MHz
Fan RPM : 2700
Fan PWM : 50
Die Temp : 55 C

GDDR
GDDR Vendor : Hynix
GDDR Version : 0x3
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1000000 uV

Device No: 1, Device Name: Intel(R) Xeon Phi(TM) coprocessor

Version
Flash Version : 2.1.01.0375
UOS Version : 2.6.34.11-g65c0cd9
Device Serial Number : ADKC22900378

Board
Vendor ID : 8086
Device ID : 225d
SubSystem ID : 2500
MIC Processor Stepping ID : 1
PCIe Width : x16
PCIe Speed : 5 GT/s
PCIe Max payload size : 256 bytes
PCIe Max read req size : 4096 bytes
MIC Processor Model : 0x01
MIC Processor Model Ext : 0x00
MIC Processor Type : 0x00
MIC Processor Family : 0x0b
MIC Processor Family Ext : 0x00
MIC Silicon Stepping : B0
Board SKU : ES2-P1330
ECC Mode : Enabled
SMC HW Revision : Product 300W Active CS

Core
Total No of Active Cores: 57
Voltage : 1042000 uV
Frequency : 1100000 kHz

Thermal
Fan Speed Control : On
SMC Firmware Version : 1.6.3983
FSC Strap : 14 MHz
Fan RPM : 2700
Fan PWM : 50
Die Temp : 53 C

GDDR
GDDR Vendor : Hynix
GDDR Version : 0x3
GDDR Density : 2048 Mb
GDDR Size : 5952 MB
GDDR Technology : GDDR5
GDDR Speed : 5.000000 GT/s
GDDR Frequency : 2500000 kHz
GDDR Voltage : 1000000 uV

As for the segmentation fault, please find attached a minimal example that reproduces it. calGrad.cpp can be compiled in two ways (I assume you are on Linux):

c++ -o gradTest calGrad.cpp -lOpenCL

With this compilation, the calGrad.cl file, which contains the crashing kernel, is used.

c++ -o gradTest -D__CALGRAD2__ calGrad.cpp -lOpenCL

With this compilation, the calGrad2.cl file, which contains the kernel that does not crash, is used.

The difference between them is that in the former the cl_ngroup variable is updated inside the kernel by the work-item with global ID get_global_size(0)-1, while in the latter cl_ngroup is written outside the kernel, in the .cpp file. The first one crashes with one of the messages mentioned in the first post.

Thanks for your help.

Attachments:

calgrad.tar.bz2 (492.58 KB)

Fernando,

Thank you for the code. I was able to reproduce the behaviour. I will get back to you after initial investigation.

Thanks,
Yuri

Hi Fernando,

Sorry it took so long to answer. It looks like this is an issue in the kernel itself.

When the last work-item modifies cl_ngroup[0] at line 63, the other work-items have not necessarily finished executing. Work-items execute in parallel, so this change can affect work-items that have only just started and read the value at line 14.

Thanks,
Yuri
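
The explanation above can be illustrated with a small host-only C++ model. This is only a sketch under stated assumptions, not the code from the attachment: the kernel body is not shown in this thread, so the functions run_in_place and run_with_host_update, the order parameter, and the values 8 and 2 are all hypothetical. The model replays the "work-items" one after another in a chosen order, which is enough to show that the in-kernel update gives order-dependent results, whereas updating the value on the host after the kernel has finished (the approach described for calGrad2) does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical model of the crashing variant: every work-item reads the
// shared counter on entry, and the last work-item overwrites it in place.
// `order` is the sequence in which work-items happen to run (a permutation
// of 0..n-1); OpenCL makes no promise about that sequence.
// Returns the value of the counter that each work-item observed on entry.
std::vector<int> run_in_place(const std::vector<std::size_t>& order,
                              int& ngroup, int new_value) {
    assert(!order.empty());
    std::vector<int> seen(order.size());
    const std::size_t last = order.size() - 1;
    for (std::size_t gid : order) {
        seen[gid] = ngroup;      // read on entry
        if (gid == last) {
            ngroup = new_value;  // in-kernel write by the last work-item
        }
    }
    return seen;
}

// Hypothetical model of the safe variant: the "kernel" only reads the
// counter, and the new value is applied by the host once every work-item
// has completed.
std::vector<int> run_with_host_update(const std::vector<std::size_t>& order,
                                      int& ngroup, int new_value) {
    assert(!order.empty());
    std::vector<int> seen(order.size());
    for (std::size_t gid : order) {
        seen[gid] = ngroup;      // read-only inside the "kernel"
    }
    ngroup = new_value;          // host-side write after the kernel is done
    return seen;
}
```

With four work-items, a counter that starts at 8 and a new value of 2, run_in_place returns {8, 8, 8, 8} for the order {0, 1, 2, 3} but {2, 2, 2, 8} for the order {3, 2, 1, 0}, while run_with_host_update returns {8, 8, 8, 8} for both. In a real parallel run the interleaving is not under the program's control, which would be consistent with the same source appearing to work on one device and failing on another.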

Fernando G.

Hi Yuri,

Thanks for the answer. I'll keep that behavior in mind when running kernels on Intel accelerators.
