The Texas Advanced Computing Center (TACC) at the University of Texas supports cutting-edge research in nearly every field of science, powering the discoveries of tomorrow. TACC has deployed the first large-scale system integrated with the new Intel Xeon Phi coprocessor technology. The system, called Stampede, went into full-scale production in early 2013 and is currently ranked #6 in the Top500 list of the world's most powerful systems. Today, more than 5,000 science and engineering users from around the world, working on a variety of numerical and data-intensive workloads, have access to the Stampede system.
TACC staff work with these researchers to optimize their applications for the Stampede system. Using Intel tools and compilers, TACC staff help researchers run their models and simulations more efficiently and accurately, to help solve some of the big challenges in science today and tomorrow. TACC also works aggressively to train scientific computing users in parallel programming techniques: several thousand users attend TACC training in person or online every year.
“TACC is committed to using advanced computation to advance science and society, and we firmly believe that concurrency is the central challenge of future programming. With the debut of the Stampede system, million-thread applications are now a reality. Concurrency should be a part of all training in programming.” – Dan Stanzione, Deputy Director, TACC.
Examples of Parallel Programming & Technical Compute Application Modernization efforts:
- The Memory Access Centric Performance Optimization tool, or MACPO, generates memory traces of the important data structures by code segment. These memory traces are processed to determine the access and reuse patterns of data in each thread for each structure, allowing new levels of parallel code optimization.
- The “R” language is increasingly popular as a high-level language for exploring data with statistical methods. R has widespread adoption in life sciences, image processing, and other critical areas of science. TACC worked with Intel to enable Automatic Offload (AO) of R codes to the Xeon Phi through the Intel Math Kernel Library (MKL): you can find out more about how this is done in this tutorial. Note that the material also covers a coprocessor-aware implementation of Python.
- PCIT is a network comparison algorithm used in life sciences to infer gene networks from large databases of genomic data. TACC has worked with the biologists and statisticians who originated this method to develop a parallel, vectorized version of the code that achieves enormous speedups over the original implementation on both Intel Xeon processors and Intel Xeon Phi coprocessors.
- Ray tracing libraries that provide vastly improved rendering of a scene compared with conventional rasterization techniques, at better performance than a GPU. A demo will be highlighted by Intel and TACC at SC’13.
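To make the idea of per-thread, per-structure reuse analysis in the MACPO item more concrete, here is a minimal Python sketch. It is not MACPO's implementation or output format: the trace record layout `(thread_id, structure, address)` and the function name `reuse_distances` are assumptions made for illustration. It computes a standard reuse distance, meaning the number of distinct addresses a thread touched in a structure since its previous access to the same address.

```python
from collections import defaultdict


def reuse_distances(trace):
    """Compute reuse distances per (thread, data structure).

    `trace` is an iterable of (thread_id, structure, address) records.
    This record layout is hypothetical, chosen for illustration only.

    Returns a dict mapping (thread_id, structure) to a list of reuse
    distances. First-time accesses have no distance and are skipped.
    """
    # One recency stack per (thread, structure); most recent address is last.
    stacks = defaultdict(list)
    distances = defaultdict(list)

    for thread_id, structure, address in trace:
        key = (thread_id, structure)
        stack = stacks[key]
        if address in stack:
            pos = stack.index(address)
            # Distinct addresses accessed since the last touch of `address`.
            distances[key].append(len(stack) - 1 - pos)
            stack.pop(pos)
        # Mark `address` as the most recently used entry.
        stack.append(address)

    return dict(distances)


# Tiny hand-written trace: thread 0 alternates between two addresses in "a".
sample_trace = [
    (0, "a", 0x10),
    (0, "a", 0x18),
    (0, "a", 0x10),
    (0, "a", 0x10),
    (1, "a", 0x10),
    (0, "b", 0x40),
    (0, "a", 0x18),
]
print(reuse_distances(sample_trace))
```

Small distances suggest data that is likely to still be in cache when reused, while large distances point to structures that may benefit from reordering or blocking. The list-based stack keeps the sketch short but costs linear time per access, so a real tool would need a more efficient structure for long traces.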
TACC is thrilled to be part of the Intel Parallel Computing Center program and looks forward to collaborating with other centers for the betterment of the entire science community.