Intel® Summary Statistics Library: how fast is the algorithm for detection of outliers?
By Dmitry Kabaev (Intel) (9 posts) on December 26, 2008 at 2:38 am
In one of my previous posts I described the scheme for detecting outliers in datasets, which is an important component of the Intel® Summary Statistics Library. We included an optimized version of this algorithm in the recently released Update for the first version of the package. To get an idea of the algorithm's speed, I measured its performance on machines based on two Intel CPUs: the Intel® Xeon® E5440 at 2.83 GHz and the Intel® Core™ i7 at 2.93 GHz. For these experiments I generated the dataset from a multivariate Gaussian distribution. The dimension of the Gaussian vector, p, varies from 50 to 1,000, and the number of observations, n, from 20,000 to 100,000. Outliers are generated in a way similar to that in my previous post. The two graphs below show the performance of outlier detection in Intel® Summary Statistics Library 1.0 Update. For p=50 the algorithm runs in less than 0.5 seconds, so this case is not shown on the graphs.
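For readers who want to try a similar measurement, the experiment can be sketched as a small timing harness. This is only an illustration under stated assumptions, using nothing but the Python standard library: it does not call the Intel® Summary Statistics Library, and `detect_outliers` is a hypothetical stand-in (a per-coordinate robust z-score rule), not the library's algorithm. The contamination scheme in `make_dataset` (shifting a fixed fraction of rows by a constant) and all function names are likewise assumptions, and the sizes are kept tiny compared with the ranges of p and n used in the post.

```python
import random
import statistics
import time


def make_dataset(n, p, outlier_fraction=0.05, shift=10.0, seed=0):
    """Draw n observations of a p-dimensional standard Gaussian vector.

    The first n * outlier_fraction rows are shifted by `shift` in every
    coordinate to act as outliers (an assumed contamination scheme).
    """
    rng = random.Random(seed)
    n_out = int(n * outlier_fraction)
    rows = []
    for i in range(n):
        offset = shift if i < n_out else 0.0
        rows.append([rng.gauss(0.0, 1.0) + offset for _ in range(p)])
    return rows, n_out


def detect_outliers(rows, threshold=3.0):
    """Stand-in detector, NOT the library's algorithm.

    Flags a row when the root-mean-square of its robust z-scores
    (median / MAD per coordinate) exceeds `threshold`.
    """
    p = len(rows[0])
    centers, scales = [], []
    for j in range(p):
        col = [r[j] for r in rows]
        med = statistics.median(col)
        mad = statistics.median(abs(x - med) for x in col)
        centers.append(med)
        scales.append(1.4826 * mad)  # MAD rescaled to estimate sigma for Gaussian data
    flagged = []
    for i, r in enumerate(rows):
        d2 = sum(((x - c) / s) ** 2 for x, c, s in zip(r, centers, scales))
        if (d2 / p) ** 0.5 > threshold:
            flagged.append(i)
    return flagged


def time_detection(n, p):
    """Time only the detection step, excluding data generation."""
    rows, n_out = make_dataset(n, p)
    start = time.perf_counter()
    flagged = detect_outliers(rows)
    elapsed = time.perf_counter() - start
    return elapsed, len(flagged), n_out


if __name__ == "__main__":
    # Tiny grid for illustration; the post uses p = 50..1,000 and n = 20,000..100,000.
    for p in (5, 10):
        for n in (500, 1000):
            elapsed, found, planted = time_detection(n, p)
            print(f"p={p:3d} n={n:5d} flagged={found}/{planted} time={elapsed:.4f}s")
```

Timing only the detection call, and not the data generation, mirrors the intent of the measurement described above. Absolute times from this pure-Python sketch say nothing about the performance of the optimized library.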
If the dimension of the task, p, is 1,000 and the number of observations is 100,000, the whole procedure takes less than one minute on the Intel® Core™ i7 CPU based machine and a little longer on the Intel® Xeon® E5440.
In other words, the Intel® Core™ i7 CPU is up to 2x faster than the Intel® Xeon® E5440 in this specific application. The graph below, which compares the two platforms, factors out the CPU clock speed. Since the Intel® Core™ i7 CPU also runs at a higher frequency, the actual speed-up of the outlier detection algorithm on this platform is even higher.
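The relation between a frequency-normalized comparison and the actual speed-up can be made concrete with a little arithmetic. The sketch below assumes that "factoring out the CPU speed" means comparing elapsed clock cycles (time multiplied by frequency) rather than elapsed time; that reading, and the function names, are assumptions. Only the two clock frequencies come from the post. No measured timings are included, so plug in your own.

```python
XEON_E5440_GHZ = 2.83  # clock frequency stated in the post
CORE_I7_GHZ = 2.93     # clock frequency stated in the post


def raw_speedup(t_slow, t_fast):
    """Plain wall-clock speed-up: how many times faster the fast machine is."""
    return t_slow / t_fast


def normalized_speedup(t_slow, f_slow_ghz, t_fast, f_fast_ghz):
    """Speed-up with clock frequency factored out (ratio of elapsed cycles)."""
    return (t_slow * f_slow_ghz) / (t_fast * f_fast_ghz)


def raw_from_normalized(normalized, f_slow_ghz, f_fast_ghz):
    """Recover the wall-clock speed-up from a frequency-normalized one."""
    return normalized * f_fast_ghz / f_slow_ghz


if __name__ == "__main__":
    ratio = CORE_I7_GHZ / XEON_E5440_GHZ
    print(f"frequency ratio: {ratio:.4f}")  # prints "frequency ratio: 1.0353"
```

Under this reading, the wall-clock speed-up is the normalized speed-up multiplied by the frequency ratio 2.93 / 2.83, that is, about 3.5% higher than the normalized figure.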
Categories: Intel SW Partner Program, Parallel Programming, Software Tools
Tags: bio-engeneering, covariance matrix, estimation, life sciences, moments, outliers detection, statistics, summary statistics