Architecture and usage question

Hi all,

I've started experimenting with the Media SDK and comparing results between an i7-860 and i7-2600, and while I'm getting a good speedup with the Sandy Bridge CPU (about 2x), I have some basic questions about what's going on under the hood.

1) Is there a difference between the AVX instruction set and the new enc/dec capability of Sandy Bridge? At first I thought they were the same thing, but after reading many discussions and watching some of the Intel intro videos, it sounds like the enc/dec which the Media SDK hooks into is a coprocessor (or SOC?) with a high-level interface, while AVX is a new instruction set. Is this correct?

2) At first I was not able to run the sample AVC encoder with the -hw flag on the i7-2600. After installing the latest graphics drivers the -hw flag worked. At this point where is the dividing line between the CPU and GPU? Is GPU functionality now embedded in the Sandy Bridge CPU, is the GPU using some functionality from the CPU, or is the enc/dec functionality in Sandy Bridge really part of a separate Intel GPU? Since the Media SDK is targeting Intel GPUs, I'm a bit confused as to how it really works with Sandy Bridge.

3) When I run the sample encoder application on the i7-2600 (without the -hw flag) it runs about twice as fast as on the i7-860. Is this software level using the AVX instruction set? When I run with the -hw flag (on the i7-2600), I see no timing difference (against non -hw on the i7-2600). Is this expected?

4) When the Media SDK is being run in SW mode, is it being redirected to IPP calls?

5) I guess it comes down to, what is SW vs. HW implementation in the Media SDK on Sandy Bridge? Is SW using IPP (and with it AVX instructions) while HW is using this new dedicated enc/dec "thing" which exists on the CPU/GPU?

I'm sure I will get some very useful links to diagrams and overviews about Sandy Bridge in response, but if I could also get answers to the specific questions I asked, I'd really appreciate it - I think it will also help me to better interpret the links I receive :).



Hi Peter,

Please see the answers below. I hope they help in understanding the capabilities of the Intel Media SDK and the 2nd generation Intel Core processor family.

1) Sandy Bridge (2nd generation Intel Core processor family) supports HW encode and decode using dedicated HW blocks (part of the on-chip GPU).

Intel AVX provides SIMD extensions that accelerate execution on the processor cores at the instruction level. You can find more details/samples/references on Intel AVX here:

The GPU and the processor cores are located on the same chip, but the GPU HW encode/decode blocks are not connected to the Intel AVX instruction set and do not share resources with the processor cores, except indirectly through the shared cache and memory controller. The GPU and processor also share TDP.

2) See above.

Regarding using Intel Media SDK:
- If HW encode/decode is selected as the target the HW blocks will be used to accelerate the workload.
- If SW encode/decode is explicitly selected, then a pure SW implementation of the codec will be used.

3) A clear performance comparison between platforms using the Intel Media SDK samples is tricky. The measured performance will depend on the platform configuration, such as amount of memory, storage device speed (SSD/HD), clock frequency, etc. Aside from that, the performance will vary greatly depending on encode target (MPEG2, H.264), quality setting, resolution, etc.

Also note that sample_encode reads the raw YUV file from disk, which is a bottleneck unrelated to the CPU or GPU.
As far as pure processor performance is concerned, Sandy Bridge will comfortably outperform older platforms due to its architectural improvements.

4) The Intel Media SDK SW codec implementation partially uses the Intel IPP library, which is optimized for AVX.

5) For further details on the 2nd generation Intel Core processor family, please refer to the links below:


Hi Petter,

Thanks for the detailed set of answers. I think I have a much better picture of what Sandy Bridge is now, but I have a couple of new questions based on your answers:

1) If the GPU core is on the CPU, is it possible to add a graphics board and still access the embedded enc/dec capability? Probably yes, but it's worth asking.

2) Is there a recommended way to profile encode performance? I was going to put a Sync call after the EncodeFrameAsync() call and time that, but that may unfairly include transfers (if there are any?) in the timing which may be overlapping with the async call.

3) Is any on-motherboard GPU functionality still on a separate chip, or is everything embedded in the CPU now?

4) From what I've read it sounds like the embedded GPU may have access to the L3 cache which should reduce PCI transfer overhead. Is this true?

Thanks again,


Hi Peter,

Please see answers below.

1) Yes, it is possible if the system supports the "switchable graphics" feature. This enables the primary graphics device to be switched between the Sandy Bridge GPU and the discrete graphics card. The Sandy Bridge GPU must be the primary GPU for the Intel Media SDK to be able to use HW acceleration.

2) Sorry. We do not have any specific external recommendations on how to profile Intel Media SDK.

3) The GPU is fully integrated with the CPU and on the same chip (on the 2nd generation Intel Core family of processors).

4) Yes
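
On the profiling question (2), one possible approach, offered as a rough sketch and not as an official recommendation, is to time a whole batch of frames end to end instead of a single EncodeFrameAsync()/sync pair, so that any overlap between asynchronous submission and completion is counted once in the total. The frames are assumed to be loaded into memory beforehand so that disk reads stay out of the measurement. Everything named here (`Frame`, `Token`, `EncodeTiming`, `time_batch`, and the `submit`/`wait` callbacks) is hypothetical and is not part of the Media SDK; the callbacks are stand-ins that you would wire to your own submit and sync calls.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical stand-ins, NOT Media SDK types: a Frame is raw data already
// held in memory, and a Token identifies one in-flight asynchronous job.
using Frame = std::vector<unsigned char>;
using Token = int;

struct EncodeTiming {
    std::size_t frames;    // frames submitted and completed
    double total_seconds;  // wall-clock time for the whole batch

    double frames_per_second() const {
        return total_seconds > 0.0 ? frames / total_seconds : 0.0;
    }
};

// Measures wall-clock time for an entire batch. `submit` starts an
// asynchronous job and returns its token; `wait` blocks until that job is
// done. At most `max_in_flight` jobs are outstanding at any time.
EncodeTiming time_batch(const std::vector<Frame>& frames,
                        const std::function<Token(const Frame&)>& submit,
                        const std::function<void(Token)>& wait,
                        std::size_t max_in_flight)
{
    assert(max_in_flight > 0);
    using Clock = std::chrono::steady_clock;  // monotonic, suited to intervals

    std::deque<Token> pending;
    const auto start = Clock::now();

    for (const Frame& frame : frames) {
        // Keep the pipeline full, but drain the oldest job when at the limit.
        if (pending.size() == max_in_flight) {
            wait(pending.front());
            pending.pop_front();
        }
        pending.push_back(submit(frame));
    }
    // Drain whatever is still in flight before stopping the clock.
    while (!pending.empty()) {
        wait(pending.front());
        pending.pop_front();
    }

    const std::chrono::duration<double> elapsed = Clock::now() - start;
    return {frames.size(), elapsed.count()};
}
```

Running the same batch once against a HW session and once against a SW session, with identical in-memory input, should give a fairer comparison than per-frame timings. Setting `max_in_flight` to 1 degenerates to the submit-then-sync pattern described in the question.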

