Execution

Introduction to Parallel Programming video lecture series – Part 01 “Why Parallel? Why Now?”

This lecture is the first part of the “Introduction to Parallel Programming” video series. It defines parallel computing, explains why parallel computing is becoming mainstream, and explains why explicit parallel programming is necessary. It sets the tone for the other 11 parts in the series.

Running time: 9:51

Optimize Code for the Most-Often Used Code Path


Challenge

Overcome a limitation of optimizing compilers: they do not know which code-execution path is most likely to be used. For example, an optimizer can refine a long series of if statements so that it runs at great speed, but if it does not know that in the majority of runs the very last test is the one that succeeds, it cannot rearrange the sequence for the best possible performance. It has to work on the assumption that all if tests in the sequence are equally probable.
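As a minimal sketch of the idea, assume profiling has shown that lowercase letters dominate the input. The enum, the function names, and the frequency assumption are all hypothetical and serve only to illustrate the point. Because the tests are mutually exclusive, moving the most likely one to the front does not change the result; it only shortens the common path.

```c
/* Hypothetical character classes, used only for this illustration. */
enum char_class { CC_OTHER, CC_DIGIT, CC_UPPER, CC_LOWER };

/* Original order: the assumed common case (lowercase) is tested last,
   so most calls first pay for two failed range checks. */
enum char_class classify_original(unsigned char c)
{
    if (c >= '0' && c <= '9') return CC_DIGIT;
    if (c >= 'A' && c <= 'Z') return CC_UPPER;
    if (c >= 'a' && c <= 'z') return CC_LOWER;
    return CC_OTHER;
}

/* Reordered by hand: the assumed most frequent test comes first.
   The ranges do not overlap, so the order affects speed, not results. */
enum char_class classify_reordered(unsigned char c)
{
    if (c >= 'a' && c <= 'z') return CC_LOWER;
    if (c >= '0' && c <= '9') return CC_DIGIT;
    if (c >= 'A' && c <= 'Z') return CC_UPPER;
    return CC_OTHER;
}
```

Whether the reordering pays off depends on the real input distribution, so it is worth confirming with measurements before and after the change.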

Excess Riscification on the Pentium® 4 Processor


    Challenge

    Avoid excess code RISC-ification on the Pentium® 4 processor when using the Microsoft Visual Studio* C++ .NET* 2003 compiler. RISC-ification is a compiler optimization that was developed for the Pentium® processor: code was RISC-ified, scheduled for the U and V pipelines, and then CISC-ified again in cases where no scheduling opportunities arose.

Unpredictable Conditional Branches on 32-Bit Intel® Architecture


    Challenge

    Eliminate unpredictable conditional branches in code. Eliminating these branches improves performance because it does the following:

    • Reduces the possibility of mispredictions
    • Reduces the number of required branch target buffer (BTB) entries; conditional branches that are never taken do not consume BTB resources

     

    Consider a line of C code that has a condition dependent upon one of the constants:
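    One hypothetical instance of such a statement, together with a branch-free rewrite, might look like the sketch below. The names CONST1 and CONST2, their values, and both function names are assumptions made for illustration and are not the article's own listing. Many compilers already turn the simple conditional form into a conditional move, so the hand-written mask version is only worth keeping if measurements show a benefit.

```c
#include <stdint.h>

/* Hypothetical constants, chosen only for this illustration. */
enum { CONST1 = 7, CONST2 = 42 };

/* Straightforward form: may compile to a conditional branch. */
int32_t select_branchy(int32_t a, int32_t b)
{
    return (a < b) ? CONST1 : CONST2;
}

/* Branch-free form: (a < b) evaluates to 1 or 0, and subtracting it
   from zero in unsigned arithmetic gives an all-ones or all-zero mask.
   The mask then keeps or discards the difference CONST1 - CONST2. */
int32_t select_branchless(int32_t a, int32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(a < b);
    uint32_t diff = (uint32_t)(CONST1 - CONST2);
    return (int32_t)((mask & diff) + (uint32_t)CONST2);
}
```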

Source-Level Optimizations for the Pentium® M Processor


    Challenge

    Promote excellent performance on the Pentium® M processor, as well as the Pentium® 4 processor, by means of application-level and source-level optimizations. Substantial performance gains are possible by focusing such efforts on application hotspots identified using the Intel® VTune™ Performance Analyzer.
