A lot of SIMD code?

The rule (4 or 9) "Don't forget to use SIMD instructions" is interesting, but I have one small question.

I work on video and I am the code optimizer at my company.
I have written a lot of SIMD routines to optimize our watermarking system, but I have a concern, which can be expressed as follows:

Suppose I have 3 or 4 threads on a single processor (possibly with HT), all executing the same job. These jobs have been really well optimized with SIMD (from 15 ms/frame without SIMD down to 5 ms with optimizations). There is only one SSE2 unit and one MMX unit, so how does the processor manage this amount of SIMD code?

Is there nothing like instruction-level parallelism for SIMD? Is there no risk of decreasing performance when we use lots of SIMD code?

MMX/SSE/SSE2 code and Hyper-Threading work very well together. With HT enabled and 2 running threads, you can increase throughput by around 20%, even with well-optimized SIMD code.
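If you want to check this on your own machine, a rough sketch along the following lines might help. Everything in it is an assumption for illustration: `add_saturate_u8` is a hypothetical stand-in for a real watermarking kernel (a saturating add of two 8-bit planes using SSE2, with a scalar fallback when SSE2 is not available), `elapsed_ms` is a made-up helper name, and the buffer sizes and frame counts are arbitrary. It only measures wall-clock time for N threads running the same SIMD job, so treat the numbers as indicative, not as a proper benchmark.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>  // SSE2 intrinsics
#define HAVE_SSE2 1
#endif

// Hypothetical per-frame kernel: out[i] = min(a[i] + b[i], 255).
void add_saturate_u8(const std::uint8_t* a, const std::uint8_t* b,
                     std::uint8_t* out, std::size_t n) {
    std::size_t i = 0;
#ifdef HAVE_SSE2
    // Process 16 pixels per iteration with unaligned loads/stores.
    for (; i + 16 <= n; i += 16) {
        __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out + i),
                         _mm_adds_epu8(va, vb));
    }
#endif
    // Scalar tail (and full fallback when SSE2 is not available).
    for (; i < n; ++i) {
        unsigned s = static_cast<unsigned>(a[i]) + b[i];
        out[i] = static_cast<std::uint8_t>(s > 255 ? 255 : s);
    }
}

// Runs `frames` kernel calls on each of `num_threads` threads and returns
// the total wall-clock time in milliseconds.
double elapsed_ms(int num_threads, int frames, std::size_t frame_bytes) {
    std::vector<unsigned> sinks(static_cast<std::size_t>(num_threads), 0);
    auto worker = [&sinks, frames, frame_bytes](int id) {
        // Each thread works on its own buffers, like independent video jobs.
        std::vector<std::uint8_t> a(frame_bytes, 200), b(frame_bytes, 100),
            out(frame_bytes, 0);
        unsigned checksum = 0;
        for (int f = 0; f < frames; ++f) {
            a[0] = static_cast<std::uint8_t>(f);  // vary the input per frame
            add_saturate_u8(a.data(), b.data(), out.data(), frame_bytes);
            checksum += out[0] + out[frame_bytes - 1];
        }
        sinks[static_cast<std::size_t>(id)] = checksum;  // keep the result live
    };

    auto start = std::chrono::steady_clock::now();
    std::vector<std::thread> threads;
    for (int t = 0; t < num_threads; ++t) threads.emplace_back(worker, t);
    for (auto& th : threads) th.join();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling, for example, `elapsed_ms(1, 2000, 720 * 576)` and then `elapsed_ms(2, 2000, 720 * 576)` and comparing frames per millisecond (`num_threads * frames / elapsed`) gives a first idea of whether two threads of SIMD code get more total work done than one on your processor.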

Here are some concrete examples of speedups with HT and heavy SIMD usage (MMX and SSE).

See the "Kribi" scores there:

