GDC09: parallel load, store and other ops with Larrabee
By Michael J Huelskoetter (90 posts) on August 14, 2009 at 5:24 am
“You must finish what you start.” That is my credo for today regarding GDC09: only yesterday I blogged about one of the planned tech sessions on Larrabee programming, and today I'm publishing a preview of the second session.
This tech talk is called "SIMD programming with Larrabee: Second Glance at the New Instructions in action" and deals with register-based programming for Larrabee software. Intel's Steve Hughes and Steve McCalla will dive deep into the topic and talk about the fact that ...
… every Larrabee CPU core has its own 32 vector registers, each 16 elements wide. With 32-bit elements, that makes 512-bit-wide SIMD registers per processor core, and they can be used for operations that execute in parallel.
… Larrabee programming offers two data layouts for vector operations: SOA (Structure of Arrays) and AOS (Array of Structures). Each makes sense under certain conditions, and Steve & Steve will demonstrate in detail which one is the better choice in which case, using simple math examples such as matrix calculations.
… predication uses eight 16-bit-wide mask registers to execute vector comparison operations in parallel. Predication can also be used for typical x86 loops.
… Gather and Scatter are also important when it comes to Larrabee coding. Gather is a load instruction that is executed on 16 floating-point values simultaneously. Scatter goes the other way, namely storing these 512-bit-wide vectors in one go. We will also learn that Gather and Scatter are limited only by cache speed.
… there is already a C++ Larrabee prototype library that consists of a single header and no longer needs any .lib or .dll files. We'll also hear that debugging will become easier and that existing platforms will be supported.
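To make these points a bit more tangible before the session, here is a minimal sketch in plain C++ that emulates the ideas with ordinary loops: a 16-lane vector, an SOA layout, a 16-bit mask for predication, and gather/scatter through an index vector. Please note that every name in it (`Vec16f`, `Mask16`, `load16`, `less_than`, `masked_add`, `gather`, `scatter`) is made up for illustration. It is not the API of the Larrabee prototype library, and it says nothing about how the hardware actually executes these operations.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names throughout: a portable scalar emulation for illustration,
// not the Larrabee prototype library.
constexpr std::size_t kLanes = 16;             // 16 lanes of 32 bits each
static_assert(kLanes * 32 == 512, "one emulated register is 512 bits wide");

using Vec16f = std::array<float, kLanes>;      // stands in for one vector register
using Vec16i = std::array<std::int32_t, kLanes>;
using Mask16 = std::uint16_t;                  // one bit per lane

// SOA layout: every component has its own contiguous array, so 16
// consecutive x values can fill one vector without any shuffling.
struct PointsSoA {
    std::vector<float> x, y;
};

// Load 16 consecutive floats starting at index `first`.
Vec16f load16(const std::vector<float>& src, std::size_t first) {
    assert(first + kLanes <= src.size());
    Vec16f v{};
    for (std::size_t i = 0; i < kLanes; ++i) v[i] = src[first + i];
    return v;
}

// Comparison that produces a mask: bit i is set where a[i] < b[i].
Mask16 less_than(const Vec16f& a, const Vec16f& b) {
    Mask16 m = 0;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (a[i] < b[i]) m = static_cast<Mask16>(m | (1u << i));
    return m;
}

// Predicated add: lanes whose mask bit is clear keep their value from `a`.
Vec16f masked_add(const Vec16f& a, const Vec16f& b, Mask16 m) {
    Vec16f r = a;
    for (std::size_t i = 0; i < kLanes; ++i)
        if (m & (1u << i)) r[i] = a[i] + b[i];
    return r;
}

// Gather: lane i reads base[idx[i]].
Vec16f gather(const std::vector<float>& base, const Vec16i& idx) {
    Vec16f v{};
    for (std::size_t i = 0; i < kLanes; ++i) {
        assert(idx[i] >= 0 && static_cast<std::size_t>(idx[i]) < base.size());
        v[i] = base[static_cast<std::size_t>(idx[i])];
    }
    return v;
}

// Scatter: lane i writes v[i] to base[idx[i]].
void scatter(std::vector<float>& base, const Vec16i& idx, const Vec16f& v) {
    for (std::size_t i = 0; i < kLanes; ++i) {
        assert(idx[i] >= 0 && static_cast<std::size_t>(idx[i]) < base.size());
        base[static_cast<std::size_t>(idx[i])] = v[i];
    }
}
```

With an AOS layout (one struct per point holding x and y side by side), filling a vector with 16 x values would need a strided read, which is exactly the kind of access that `gather` stands for here. Which layout wins in which case is what the session promises to show.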
So there are many good reasons to mark your calendar. You can either join the session in person at GDC'09 on Tuesday morning from 10:10 to 11:00 am CET, or stay tuned to this blog for a dedicated article: we will video blog what Steve & Steve talk about, and why Larrabee will be important for many developers in the future.
Categories: Events, Game Development, Graphics & Media, Parallel Programming
Tags: GDC, GDC 2009