Server

Intel® Trace Analyzer and Collector 9.0 Beta Update 1 Readme

The Intel® Trace Analyzer and Collector is a low-overhead scalable event-tracing library with graphical analysis that reduces the time it takes an application developer to enable maximum performance of cluster applications. This Beta package is for users who develop on and build for Intel® 64 architectures on Linux* and Windows*, as well as customers running on the Intel® Xeon Phi™ coprocessor on Linux*. You must have a valid license to download, install and use this product.

    Intel® MPI Library 5.0 Beta Update 1 Readme

    The Intel® MPI Library is a high-performance interconnect-independent multi-fabric library implementation of the industry-standard Message Passing Interface, v3.0 (MPI-3.0) specification. This Beta package is for MPI users who develop on and build for Intel® 64 architectures on Linux* and Windows*, as well as customers running on the Intel® Xeon Phi™ coprocessor on Linux*. You must have a valid license to download, install and use this product.
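
    As a quick, hedged illustration of the interface the library implements (this sketch is not part of the readme; any MPI implementation should build it, for example via Intel MPI's mpiicpc wrapper):

        #include <mpi.h>
        #include <cstdio>

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);               // start the MPI runtime
            int rank = 0, size = 0;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank); // this process's id
            MPI_Comm_size(MPI_COMM_WORLD, &size); // total process count
            std::printf("Hello from rank %d of %d\n", rank, size);
            MPI_Finalize();                       // shut the runtime down
            return 0;
        }

    Launched with, for example: mpirun -n 4 ./hello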

    The Chronicles of Phi - part 4 - Hyper-Thread Phalanx – tiled_HT2

    The prior part (3) of this blog showed the effects of the first-level implementation of the Hyper-Thread Phalanx. The change in programming yielded a 9.7% improvement in performance for the small model, and little to no improvement for the large model. That left part 3 with two questions:

    What is non-optimal about this strategy?
    And: What can be improved?

    There are two things: one is obvious, and the other is not so obvious.

    Data alignment
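
    A minimal sketch of the alignment idea (my illustration, not the blog's tiled_HT2 code): on the coprocessor the vector registers are 512 bits wide, so hot arrays should start on 64-byte boundaries to permit aligned vector loads and stores.

        #include <cstdio>
        #include <cstdint>

        // 64 bytes = one cache line and one 512-bit vector register on the
        // Xeon Phi coprocessor. alignas is standard C++11; the Intel compiler
        // also accepts __declspec(align(64)) and provides _mm_malloc for
        // aligned heap allocations.
        alignas(64) static float a[1024];

        int main() {
            // Prints 0 when the array starts on a 64-byte boundary.
            std::printf("a mod 64 = %d\n",
                        (int)(reinterpret_cast<std::uintptr_t>(a) % 64));
            return 0;
        }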

    TBB's parallel_sort is 5x slower on Xeon Phi 5110P than on E5-2670??

    Dear all,

    I'm benchmarking accelerated libraries for key-value sorting. Intel mentions parallel_sort in hundreds of tutorials, but never goes beyond mentioning it: there are almost no compilable code examples for MIC, and no performance numbers except one slide here. As a result, I had to write my own sample (attached). Surprisingly, for me the Xeon Phi 5110P processes 256M elements 5x slower than the Xeon CPU. I doubt that's normal.
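
    For reference, a minimal compilable key-value sort of the kind the post describes (my own reconstruction, not the attached sample; the size here is kept small for a quick check):

        #include <tbb/parallel_sort.h>
        #include <cstdint>
        #include <cstdlib>
        #include <vector>

        struct KV {
            std::uint32_t key;
            std::uint32_t value;
        };

        // parallel_sort requires a strict weak ordering; compare by key only.
        static bool operator<(const KV &a, const KV &b) { return a.key < b.key; }

        int main() {
            // 1M elements here; the post benchmarks 256M.
            std::vector<KV> data(1u << 20);
            for (std::size_t i = 0; i < data.size(); ++i)
                data[i] = { (std::uint32_t)std::rand(), (std::uint32_t)i };

            tbb::parallel_sort(data.begin(), data.end());
            return 0;
        }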

    FMA Support

    Hello guys, sorry for a basic question. I've been looking for architectures that support FMA. I know Sandy Bridge doesn't support it, and Haswell does. But what about Ivy Bridge? Does Ivy Bridge support FMA?

    Best regards.
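
    For what it's worth, a run-time check is easy to sketch; this assumes a compiler that provides the GCC-style __builtin_cpu_supports builtin:

        #include <cstdio>

        int main() {
            // Queries the CPUID feature bits at run time for FMA3 support.
            if (__builtin_cpu_supports("fma"))
                std::printf("This CPU supports FMA3\n");
            else
                std::printf("No FMA3 on this CPU\n");
            return 0;
        }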

    Intel® Math Kernel Library Parallel Direct Sparse Solver for Clusters

    The Intel® Math Kernel Library Parallel Direct Sparse Solver for Clusters (CPARDISO) is a powerful tool set for solving systems of linear equations with sparse matrices millions of rows and columns in size.

    CPARDISO provides an advanced implementation of modern algorithms and can be considered an extension of Intel MKL PARDISO to cluster computations.
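
    A hedged sketch of the call sequence, using MKL's C interface cluster_sparse_solver on a toy 5x5 SPD system (matrix data and parameter choices are illustrative only):

        #include <mpi.h>
        #include <cstdio>
        #include "mkl_cluster_sparse_solver.h"

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank = 0;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            // 5x5 SPD tridiagonal matrix, upper triangle in 1-based CSR.
            MKL_INT n = 5;
            MKL_INT ia[6] = {1, 3, 5, 7, 9, 10};
            MKL_INT ja[9] = {1, 2, 2, 3, 3, 4, 4, 5, 5};
            double  a[9]  = {2, -1, 2, -1, 2, -1, 2, -1, 2};
            double  b[5]  = {1, 1, 1, 1, 1}, x[5];

            void   *pt[64]    = {0};  // solver handle, must start zeroed
            MKL_INT iparm[64] = {0};  // iparm[0] = 0 selects default settings
            MKL_INT perm[5]   = {0};
            MKL_INT maxfct = 1, mnum = 1, mtype = 2 /* real SPD */;
            MKL_INT nrhs = 1, msglvl = 0, error = 0;
            int comm = MPI_Comm_c2f(MPI_COMM_WORLD);  // Fortran comm handle

            // Phase 13 = analysis + factorization + solve in one call.
            MKL_INT phase = 13;
            cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                                  a, ia, ja, perm, &nrhs, iparm, &msglvl,
                                  b, x, &comm, &error);
            if (rank == 0 && error == 0)
                for (int i = 0; i < 5; ++i)
                    std::printf("x[%d] = %f\n", i, x[i]);

            phase = -1;  // release internal solver memory
            cluster_sparse_solver(pt, &maxfct, &mnum, &mtype, &phase, &n,
                                  a, ia, ja, perm, &nrhs, iparm, &msglvl,
                                  b, x, &comm, &error);

            MPI_Finalize();
            return 0;
        }

    With default iparm settings the whole matrix is supplied on MPI rank 0; build with an MPI compiler wrapper and link against the MKL cluster components.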

    Tech reviewer needed for upcoming book about Intel compilers, tools and libraries

    Hello,

    I am currently working on a book about parallel programming using the Intel compilers and libraries. A chapter will be dedicated to the Xeon Phi, so I need an Intel tech reviewer for that chapter, preferably an Intel employee with hands-on experience on the Xeon Phi. As a tech reviewer, you will appear in the book's front matter, including a short bio. The book is part of the ApressOpen platform.

    If you are interested in reviewing the book, please contact John Aiken at Intel Industry Education (formerly Intel Press).

    About RAPL PAPI events

    Does the RAPL event PACKAGE_ENERGY:PACKAGE0 include DRAM_ENERGY:PACKAGE0 and PP0_ENERGY:PACKAGE0? Or should the total system energy be obtained by adding the joules of all three events together?
    PACKAGE_ENERGY:PACKAGE0   176.450363J   (Average Power 42.9W)
    PACKAGE_ENERGY:PACKAGE1    75.812454J   (Average Power 18.4W)
    DRAM_ENERGY:PACKAGE0       11.899246J   (Average Power 2.9W)
    DRAM_ENERGY:PACKAGE1        8.341141J   (Average Power 2.0W)
    PP0_ENERGY:PACKAGE0       118.029236J   (Average Power 28.7W)
    PP0_ENERGY:PACKAGE1        16.759064J   (Average Power 4.1W)
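
    For reference, a minimal sketch of reading one of these counters through PAPI; the rapl::: event-name spelling follows the PAPI RAPL component convention and may differ per build (check papi_native_avail on your system):

        #include <cstdio>
        #include <papi.h>

        int main() {
            if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
                return 1;

            int evset = PAPI_NULL;
            if (PAPI_create_eventset(&evset) != PAPI_OK)
                return 1;
            // RAPL counters are exposed by PAPI's rapl component.
            if (PAPI_add_named_event(evset,
                    "rapl:::PACKAGE_ENERGY:PACKAGE0") != PAPI_OK)
                return 1;

            long long nj = 0;
            PAPI_start(evset);
            /* ... workload being measured ... */
            PAPI_stop(evset, &nj);

            // The RAPL component reports energy in nanojoules.
            std::printf("PACKAGE_ENERGY:PACKAGE0 = %.6f J\n", nj / 1e9);
            return 0;
        }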
    