Anouar Benali obtained a Ph.D. in Theoretical Physical Chemistry from the University of Toulouse (France) in 2010. He is an Assistant Computational Scientist at the Argonne Leadership Computing Facility and a fellow of the Computation Institute at the University of Chicago. His work focuses on implementing and accelerating quantum Monte Carlo (QMC) algorithms for high-performance computers.
Luke Shulenburger is a staff scientist at Sandia National Laboratories working on electronic structure calculations of materials, with a particular focus on extremes of temperature and pressure. He received his Ph.D. from the University of Illinois at Urbana-Champaign in 2008 and was a postdoctoral researcher at the Carnegie Institution of Washington until moving to Sandia in 2010.
Quantum Monte Carlo (QMC) has emerged as an important tool for extreme-scale calculations of complex material properties. QMCPACK is a code for calculating the electronic structure of materials with unprecedented accuracy. It works by stochastically solving the many-body Schrödinger equation. This method is uniquely suited for calculations of technologically important materials and has been shown to be predictive for a wide range of materials and molecules. Over the past decade, the sizes of the physical problems and of the computational facilities have remained firmly in a regime where the method scales nearly linearly with the number of computational elements available. The coming exascale era will allow consideration of larger problems involving thousands of electrons. These problems will need to use millions of threads, further straining this scaling relationship. Additionally, the constant memory needed to evaluate single-particle wavefunctions will outgrow the fast device memory expected in heterogeneous architectures. Through the Intel® Parallel Computing Center(s) (Intel® PCC), we aim to increase the current vectorization of the code, parallelize the work for each "walker" to achieve good parallel efficiency using nested threading, and finally develop a caching scheme that allows the use of slower main memory on heterogeneous platforms with only a minor performance penalty. This project will pilot extreme-scale threading and vectorization in a popular QMC code and will disseminate the experience gained to other QMC codes, allowing the study of larger and more realistic systems with predictive accuracy.
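To make the "walker" idea concrete, here is a minimal sketch of variational Monte Carlo for a toy problem: a single particle in a one-dimensional harmonic well. It is an illustration under stated assumptions, not QMCPACK's implementation. The trial wavefunction exp(-alpha x^2), the Metropolis sampler, and every function name and parameter (`local_energy`, `run_walker`, `vmc_energy`, `step_size`, `seed`) are hypothetical choices made for this example. Each walker is an independent random walk, which is why work of this kind distributes naturally across many computational elements.

```python
import math
import random


def local_energy(x, alpha):
    """Local energy for psi(x) = exp(-alpha * x^2) with
    H = -1/2 d^2/dx^2 + 1/2 x^2 (units where hbar = m = omega = 1)."""
    return alpha + x * x * (0.5 - 2.0 * alpha * alpha)


def run_walker(alpha, n_steps, step_size, rng):
    """Advance one walker with Metropolis sampling of |psi|^2 and
    return the average local energy along its path."""
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)
        # log of |psi(x_new)|^2 / |psi(x)|^2
        log_ratio = -2.0 * alpha * (x_new * x_new - x * x)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            x = x_new
        total += local_energy(x, alpha)
    return total / n_steps


def vmc_energy(alpha, n_walkers=8, n_steps=2000, step_size=1.0, seed=0):
    """Average the energy over independent walkers. Each walker has its
    own random stream, so the walkers could run on separate threads."""
    energies = [
        run_walker(alpha, n_steps, step_size, random.Random(seed + w))
        for w in range(n_walkers)
    ]
    return sum(energies) / n_walkers


if __name__ == "__main__":
    for alpha in (0.3, 0.5, 0.7):
        print(alpha, vmc_energy(alpha))
```

With alpha = 0.5 the trial function is the exact ground state, so the local energy equals 0.5 at every sampled point. Other values of alpha give a noisy estimate that lies above 0.5 on average. In a production code the loop over walkers is where the outer level of parallelism sits, and nested threading would then split the work inside each walker.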
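Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.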