The Compute Architecture of Intel Processor Graphics Gen7.5

We have released a new whitepaper explaining the architecture of Intel® Processor Graphics Gen7.5, specifically the components involved in running compute applications on Processor Graphics. This is the architecture that supports compute APIs such as OpenCL, DirectX Compute Shader, Renderscript, and C++ AMP.

We'd be grateful to hear your feedback on the whitepaper contents. Tell us what you think!

regards -stephen

Great job!  

I'd be a lot more interested in Intel's OpenCL if:

1) There was GPU-supported fp64, ideally at a 1:2 performance ratio.

2) The CPU supported a vector fp64 rsqrt.

3) The native CPU and GPU fp64 rsqrt had OpenCL-level numerical precision.

It would be good if there were a document comparing the native double precision (in ulp) of the different hardware (Nvidia, AMD and Intel).
My perception is that AMD and Nvidia have higher native precision, or at least that they are constantly striving to improve it; e.g., the 290X has "precision improvements to the native LOG and EXP operations" and so on.
My impression is that Intel expects you to buy a software library that has to do many more iterations to achieve the same result.
Maybe this is just a case of insufficient documentation?
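
To make the ulp comparison and the "more iterations" point above concrete, here is a minimal Python sketch, not any vendor's implementation. It assumes an fp32-rounded value as a stand-in for a low-precision hardware rsqrt estimate, refines it with Newton-Raphson steps, and reports the error in fp64 ulps. The function names are hypothetical, and `math.ulp` requires Python 3.9 or later.

```python
import math
import struct


def to_float32(x: float) -> float:
    """Round a Python float (fp64) to the nearest fp32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def ulp_error(approx: float, reference: float) -> float:
    """Distance between approx and reference, in fp64 ulps of the reference."""
    return abs(approx - reference) / math.ulp(reference)


def refine_rsqrt(x: float, y: float) -> float:
    """One Newton-Raphson step for y ~ 1/sqrt(x)."""
    return y * (1.5 - 0.5 * x * y * y)


def rsqrt_error_by_iteration(x: float, steps: int = 3) -> list:
    """Return the ulp error of the seed and of each refined estimate."""
    reference = 1.0 / math.sqrt(x)
    # Assumption: an fp32-rounded value models a low-precision native estimate.
    y = to_float32(reference)
    errors = [ulp_error(y, reference)]
    for _ in range(steps):
        y = refine_rsqrt(x, y)
        errors.append(ulp_error(y, reference))
    return errors


if __name__ == "__main__":
    for i, err in enumerate(rsqrt_error_by_iteration(2.0)):
        print(f"after {i} Newton step(s): {err:.1f} ulp")
```

Each Newton-Raphson step roughly doubles the number of correct bits until fp64 rounding dominates, which is why a less precise starting estimate costs extra iterations to reach the same final accuracy.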


Your Gen8 presentation and whitepaper are excellent.

The PDFs were posted here:

The Gen8 IGP looks like it has become a truly general-purpose compute platform.

One question: are FP16 FMA operations available? If so, can each EU perform 16 FP16 FMAs per clock?

Architecture, yes; driver not yet.

Intel processor graphics architecture supports 16-bit float FMAs as of Gen8.

Taking OpenCL 1.2 and 2.0 as an API example, the OpenCL C “half” data type for 16-bit floats is currently an “optional” feature, enabled by the Khronos cl_khr_fp16 extension.
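
Because `half` support is optional, host code usually checks the device's extension string before relying on it. The sketch below assumes that string has already been obtained (in OpenCL it comes from `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`, as a space-separated list). The helper name and the sample strings are made up for illustration.

```python
def has_extension(extensions: str, name: str) -> bool:
    """Return True if `name` appears as a whole token in the extension string."""
    # Split on whitespace so that a longer name sharing a prefix does not match.
    return name in extensions.split()


if __name__ == "__main__":
    # Hypothetical extension strings, not taken from any real driver.
    with_fp16 = "cl_khr_global_int32_base_atomics cl_khr_fp16 cl_khr_icd"
    without_fp16 = "cl_khr_global_int32_base_atomics cl_khr_icd"

    for label, ext in (("device A", with_fp16), ("device B", without_fp16)):
        if has_extension(ext, "cl_khr_fp16"):
            # Kernels would then enable it with:
            #   #pragma OPENCL EXTENSION cl_khr_fp16 : enable
            print(label, "reports cl_khr_fp16")
        else:
            print(label, "does not report cl_khr_fp16; fall back to float")
```

Matching whole tokens rather than substrings avoids a false positive if a driver ever exposes an extension whose name merely starts with `cl_khr_fp16`.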


As feedback for feature prioritization decisions, could you say which OS platform, which compute API, and what kind of device (tablet, laptop, server…) you want to target with FP16?



regards, -stephen

Thanks for the clarification.

I'm looking to use FP16 FMAs in an OpenCL kernel on both Windows and Linux (and someday OS X).

As for devices, anything with a display: tablet, laptop, or workstation.
