performance optimization

Optimize Financial Applications using Intel® Math Kernel Library

Intel® Math Kernel Library (Intel® MKL) contains a wealth of highly optimized math functions that are fundamental to a wide variety of financial applications. Intel MKL uses industry-standard interfaces and can be easily integrated into your current application framework. This webinar provides an overview of how Intel MKL can accelerate financial applications. Topics include:

  • Developers
  • Apple OS X*
  • Linux*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • C/C++
  • Fortran
  • Beginner
  • Intermediate
  • Intel® C++ Compiler
  • Intel® C++ Composer XE
  • Intel® Fortran Compiler
  • Intel® Fortran Composer XE
  • Intel® Math Kernel Library
  • Learning Lab
  • FSI
  • Financial Services
  • Intel MKL Training
  • performance optimization
  • Development Tools
  • Financial Services Industry
  • Optimization
  • Parallel Computing
  • Sid Meier’s Civilization* V Finds the Graphics Sweet Spot

    The Sid Meier’s Civilization* series has a successful 20-year history. This white paper describes how Firaxis used Intel® GPA to ensure that Civilization V (Civ5) offers the best possible mix of graphics and game performance for the vast majority of systems.
  • Developers
  • Game Development
  • Intel® Threading Building Blocks
  • Intel® INDE
  • Graphics Performance Analyzers
  • Intel® VTune™ Amplifier XE
  • Performance analysis
  • performance optimization
  • Civilization
  • sid meier
  • visual computing
  • Firaxis
  • Civ5
  • Task Analyzer
  • Game Development
  • Graphics
  • Optimization

    There are eight MEM_TRANS_RETIRED.LOAD_LATENCY_GT_* precise events available on Intel® Microarchitecture Codename Sandy Bridge. These events let you pinpoint loads whose latency exceeded a given threshold, measured in CPU clock cycles. For example, the MEM_TRANS_RETIRED.LOAD_LATENCY_GT_4 event is for loads exceeding 4 clocks in latency, and the MEM_TRANS_RETIRED.LOAD_LATENCY_GT_512 event is for loads exceeding 512 clocks.

  • Intel® VTune™ Amplifier XE
  • performance tuning
  • performance optimization
  • performance profiler
  • event-based sampling
  • Optimize Code for the Most-Often Used Code Path


    Overcome a limitation of optimizing compilers: they do not know which code-execution path is most likely to be used. For example, an optimizer can refine a long series of if statements so that it runs at great speed; but if it does not know that, in the majority of runs, the very last test is the one that matches, it cannot rearrange the sequence for the best possible performance. It has to assume that all if tests in the sequence are equally probable.
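
    As a rough illustration of the point above, the following C sketch compares two orderings of the same if chain. The message kinds, the return values, and the assumption that KIND_DATA is by far the most frequent case are all hypothetical; the tests counter exists only to make the cost of each ordering visible and would not appear in production code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message kinds. Assumption for this sketch: profiling has
 * shown that KIND_DATA accounts for the large majority of calls. */
enum kind { KIND_RESET = 0, KIND_ERROR = 1, KIND_CONTROL = 2, KIND_DATA = 3 };

/* Original ordering: the most frequent kind is tested last, so the common
 * case pays for every comparison in the chain. */
int handle_original_order(enum kind k, long *tests)
{
    assert(tests != NULL);
    *tests += 1;
    if (k == KIND_RESET) return 10;
    *tests += 1;
    if (k == KIND_ERROR) return 20;
    *tests += 1;
    if (k == KIND_CONTROL) return 30;
    *tests += 1;
    if (k == KIND_DATA) return 40;
    return -1; /* unknown kind */
}

/* Reordered by hand: the most frequent kind is tested first. The results
 * are identical; only the number of comparisons on the hot path changes. */
int handle_hot_path_first(enum kind k, long *tests)
{
    assert(tests != NULL);
    *tests += 1;
    if (k == KIND_DATA) return 40;
    *tests += 1;
    if (k == KIND_RESET) return 10;
    *tests += 1;
    if (k == KIND_ERROR) return 20;
    *tests += 1;
    if (k == KIND_CONTROL) return 30;
    return -1; /* unknown kind */
}
```

    Calling both functions on the same stream of messages returns the same values, but the reordered version performs far fewer comparisons when KIND_DATA dominates. Reordering by hand is only one option; compilers that support profile-guided optimization can derive branch probabilities from a recorded training run instead.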

  • Execution
  • performance optimization
  • Parallel Computing
  • Create Cache-Data Blocks


    Take advantage of data-cache locality with cache-data blocking. Loops that iterate repeatedly over large data arrays should be restructured so that the large array is subdivided into smaller blocks, or tiles. Each block is sized to fit within the data cache, so every data element in it can be reused while it is still cached before the loop moves on to the next block or tile.
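
    One common place to apply this is matrix multiplication. The C sketch below is a minimal example under stated assumptions, not a tuned kernel: the function names are hypothetical, the matrices are square and stored in row-major order, and the block size is left as a parameter because the best value depends on the cache sizes of the target processor.

```c
#include <assert.h>
#include <stddef.h>

/* Reference version: C += A * B for n x n row-major matrices. For large n,
 * each pass over a column of B touches a new cache line on every step. */
void matmul_naive(size_t n, const double *a, const double *b, double *c)
{
    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < n; ++j) {
            double sum = c[i * n + j];
            for (size_t k = 0; k < n; ++k)
                sum += a[i * n + k] * b[k * n + j];
            c[i * n + j] = sum;
        }
}

static size_t min_size(size_t x, size_t y) { return x < y ? x : y; }

/* Blocked version: the three outer loops walk over tiles of at most
 * block x block elements; the three inner loops finish all work on the
 * current tiles before moving on, so their data is reused while cached. */
void matmul_blocked(size_t n, size_t block,
                    const double *a, const double *b, double *c)
{
    assert(block > 0);
    for (size_t ii = 0; ii < n; ii += block)
        for (size_t kk = 0; kk < n; kk += block)
            for (size_t jj = 0; jj < n; jj += block) {
                /* Clamp tile edges so n need not be a multiple of block. */
                size_t i_end = min_size(ii + block, n);
                size_t k_end = min_size(kk + block, n);
                size_t j_end = min_size(jj + block, n);
                for (size_t i = ii; i < i_end; ++i)
                    for (size_t k = kk; k < k_end; ++k) {
                        double a_ik = a[i * n + k];
                        for (size_t j = jj; j < j_end; ++j)
                            c[i * n + j] += a_ik * b[k * n + j];
                    }
            }
}
```

    Both functions accumulate into c, so the caller is expected to zero it first. A reasonable starting point is a block size for which three tiles of doubles fit in the targeted cache level, refined by measurement. Blocking changes the order of the floating-point additions, so results can differ from the naive version in the last bits for general inputs.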

  • Memory cache
  • performance optimization
  • Parallel Computing