Execution Units as SIMD


I am currently unrolling loops on the HD 3000 GPU inside the i5-2500K, on the assumption that the Execution Units handle threads in much the same way as the stream processors do in the Nvidia environment.
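
To make the question concrete, here is a minimal sketch in plain C of the kind of unrolling meant. It is not the actual compute shader code; the function name `sum_unrolled4` and the choice of a simple sum are illustrative assumptions only.

```c
#include <stddef.h>

/* Hypothetical example: sum an array with the loop unrolled by 4.
   The four accumulators are independent of each other, which is the
   property that would let SIMD-style hardware process them side by side. */
float sum_unrolled4(const float *x, size_t n)
{
    float a0 = 0.0f, a1 = 0.0f, a2 = 0.0f, a3 = 0.0f;
    size_t i = 0;

    /* Main body: four elements per iteration. */
    for (; i + 4 <= n; i += 4) {
        a0 += x[i];
        a1 += x[i + 1];
        a2 += x[i + 2];
        a3 += x[i + 3];
    }

    /* Remainder: at most three leftover elements. */
    for (; i < n; ++i)
        a0 += x[i];

    return (a0 + a1) + (a2 + a3);
}
```

Whether unrolling like this pays off depends on whether each EU really runs such independent operations in lock-step, which is what the questions below are about.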

Can the Execution Units in the i5's GPU be seen in this way? Can each EU be viewed as a SIMD device?

Is there a good description of what an EU actually is with regard to processing with a Compute Shader? I have not been able to find one.


In other words, is the Intel HD 3000 GPU architecture similar to the 48-core Nvidia GT 520 or the 96-core Nvidia GT 430?

Is there some similarity between these architectures? What is this EU doing in the GPU? SSE and AVX in the regular cores are quite simple, yet this GPU EU is a puzzle.

Hi magicfoot,
Is there a specific application that you are looking to use on the HD 3000? Are you using OpenCL or porting shaders to do GPGPU on Intel HD Graphics? Conceptually, the EUs are similar to shader cores, but they are not related to AVX/SSE, as these are CPU features. Understanding what you are trying to do will help me answer your question better.
Thanks
-deepak

Hi deepak,

I have programmed an algorithm on the i5-2500K, using its 4 cores and AVX. I used OpenMP and AVX, and the results are correct.

I then programmed this same algorithm with DirectX 11 for the HD 3000 GPU on the i5-2500K and got good performance, at least of the same order as the 4 cores with AVX. The HD 3000 GPU was a little quicker than the Nvidia GTS 240 (128 CUDA cores) running the same algorithm in CUDA.

I understand the Nvidia GTS240 architecture fairly well and understand what I am programming.

I do not understand much about the HD 3000, apart from the fact that there are 12 CISC-like EUs. I apply some threads to these EUs and they produce very good results. My question now:

Is the Intel EU architecture like the Nvidia SM? Does the EU have an AVX-like SIMD facility, equivalent to Nvidia's warp? If these are 12 conventional cores working together, why is there no bottleneck? What am I dealing with when I program the EUs with DirectX 11?

Kind regards.
