Archives

Posts from Michael Stoner (Intel)

Michael Stoner (Intel)

Mike Stoner is a Senior Applications Engineer with Intel's Software Solutions Group. He has been with Intel since 1996, mainly helping software developers optimize code for Intel platforms and driving related improvements into future CPU designs. Prior to joining Intel, Mike received his MS and BS in Electrical Engineering from The Ohio State University.

Restructuring loops for LAME mp3 high-pass filter

By Michael Stoner (Intel) (7 posts) on September 10, 2009 at 12:01 pm
Comments (0)

Here’s another quick performance tip for LAME mp3 encoding. This nested loop in the function ‘L3psycho_anal_ns’ is a hotspot for constant bit-rate encoding:

    for (i = 0; i < 576; i++)
    {
        FLOAT   sum1, sum2;
        sum1 = firbuf[i + 10];
        sum2 = 0.0;
        for (j = 0; j < ((NSFIRLEN - 1) / 2) - 1; j += [...]

Category: Open Source, Software Engineering, Visual Computing

Using SSE4.1 for mp3 encoding quantization

By Michael Stoner (Intel) (7 posts) on January 7, 2009 at 3:16 pm
Comments (0)

In this post I'd like to promote the new SSE4.1 instruction set extension as it relates to the quantization loop I wrote about a few months ago. As you may recall, the modified code from ‘quantize_xrpow_lines’ looked like this:

    for(i=0; i < l; i++)
    {
        float x0 = xr[i] * istep;
        int [...]

Category: Open Source, Parallel Programming, Software Engineering, Visual Computing

Another tip for faster mp3 encoding

By Michael Stoner (Intel) (7 posts) on October 31, 2008 at 4:05 pm
Comments (2)

In this entry I want to highlight a loop in the ‘count_bits’ function which yielded a 1.15x app-level gain when we coaxed it to vectorize with the Intel Compiler. After disabling Takehiro’s float-to-int hack, this was the top hotspot in our constant bit-rate encoding workload:

    for (l = -width; l < 0; l++)
        if (xr[j [...]

Category: Open Source, Parallel Programming, Software Engineering, Visual Computing

Open source project - LAME mp3 encoder optimization

By Michael Stoner (Intel) (7 posts) on October 6, 2008 at 4:17 pm
Comments (0)

One of the nice things about working on open source code is that any interesting findings can be freely discussed, such as in this blog.  With that in mind I recently took up a project to optimize performance of the popular LAME mp3 encoder.  Over the years I had seen LAME used in several other [...]

Category: Open Source, Parallel Programming, Software Engineering, Visual Computing

Assessing the accelerator buzz: Another tip for faster Monte Carlo computing

By Michael Stoner (Intel) (7 posts) on July 30, 2008 at 5:18 pm
Comments (0)

Continuing with the GaussianRand example, a 1.5x gain is nice, but were there additional opportunities for performance gains? Of course there were! (That was a rhetorical question…) Since floating-point divides are among the longer-latency operations, we should look at the two that are coded into the do/while loop to normalize the random [...]

Category: Financial Services Industry, Parallel Programming
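
The excerpt above is cut off before it shows the loop, so what follows is only a minimal sketch of the general idea, not the code from the post. It assumes the two divides scale an integer from `rand()` by `RAND_MAX`; the helper names `uniform_div`, `uniform_mul` and `max_abs_diff` are hypothetical and exist only for this illustration.

```c
#include <stdlib.h>

/* Baseline (hypothetical helper): one floating-point divide per sample,
 * mapping an integer in [0, RAND_MAX] to a double in [-1, 1]. */
static double uniform_div(int r)
{
    return 2.0 * ((double)r / (double)RAND_MAX) - 1.0;
}

/* Same mapping with the divide replaced by a multiply. The reciprocal is a
 * compile-time constant, so no divide is left in the per-sample path. */
static double uniform_mul(int r)
{
    const double scale = 2.0 / (double)RAND_MAX;
    return (double)r * scale - 1.0;
}

/* Largest absolute difference between the two versions over a sweep of
 * inputs. The results can differ in the last bits because the rounding
 * happens at different points. */
static double max_abs_diff(void)
{
    double worst = 0.0;
    for (int i = 0; i <= 1000; i++) {
        int r = (RAND_MAX / 1000) * i;
        double d = uniform_div(r) - uniform_mul(r);
        if (d < 0.0)
            d = -d;
        if (d > worst)
            worst = d;
    }
    return worst;
}
```

Because the two versions are not guaranteed to be bit-identical, a change like this is worth checking against the accuracy the application needs before adopting it.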

Assessing the accelerator buzz: Vectorization of Monte Carlo algorithms

By Michael Stoner (Intel) (7 posts) on July 15, 2008 at 2:45 pm
Comments (0)

Now we’ll take a look at optimizing something more interesting and complex. Since we can’t show much of the customer source we work on, we’ll look at some public domain code from the internet, specifically this Box-Muller random number transformation from http://www.taygeta.com/random/gaussian.html:

    for (int i = 0; i < LENGTH; i++)
    {
        double w, [...]

Category: Financial Services Industry, Parallel Programming

Assessing the accelerator buzz: Tips and Tricks for Intel® Compiler vectorization

By Michael Stoner (Intel) (7 posts) on June 26, 2008 at 10:08 am
Comments (3)

Here at Intel we have spent much of the last year assessing the rising buzz about GPGPUs and other accelerator cards in the financial services community. These technologies promise tremendous computing capability, but often we see performance claims that are exaggerated by comparing the best possible accelerator implementation to a poorly optimized version of the [...]

Category: Financial Services Industry, Parallel Programming