Thomas Ludwig received his doctoral degree and the German habilitation degree at the Technische Universität München, where he conducted research on HPC from 1988 to 2001. From 2001 to 2009 he held a chair for parallel computing at the Universität Heidelberg. In 2009 he moved to Hamburg. He is now director of the German Climate Computing Center (DKRZ) and professor at the Universität Hamburg. His research covers high-volume data storage, energy efficiency, and performance analysis concepts and tools for parallel systems.
Michael Kuhn is a postdoctoral researcher in the Scientific Computing group at the Universität Hamburg, where he also received his doctoral degree in computer science in 2015. He conducts research in the area of high performance I/O with a special focus on I/O interfaces and data reduction techniques. His other interests include file systems and high performance computing in general.
Dr. Julian Kunkel is a postdoctoral researcher in the research department of the German Climate Computing Center (DKRZ), which is run jointly with the Scientific Computing group at the Universität Hamburg. Julian became interested in HPC storage in 2003 during his computer science studies. Since then, he has been researching methods to improve the efficiency of storage systems in general. Besides his main goal of providing efficient and performance-portable I/O, his HPC-related interests are data reduction techniques, performance analysis of parallel applications and parallel I/O, management of cluster systems, cost-efficiency considerations, and software engineering of scientific software. In 2013, he defended his thesis on monitoring and simulation of parallel programs at the application and system levels.
The Scientific Computing group conducts research on high performance I/O optimizations, energy efficiency, and simulation of cluster infrastructure. We have expertise in parallel programming and environmental modeling.
Due to the increasing gap between computational speed, network speed and storage capacity, it has become necessary to investigate data reduction techniques. Storage systems have become a significant part of the total cost of ownership because of the growing number of storage devices and their associated acquisition costs and energy consumption.
Ultimately, we are aiming for compression support in Lustre at multiple levels:
- Client-side compression allows using the available network and storage capacity more efficiently,
- Client hints empower applications to provide information useful for compression and
- Adaptive compression makes it possible to choose appropriate settings depending on performance metrics and projected benefits.
Compression will be completely transparent to applications because the client and/or server will perform it on their behalf. Users will nevertheless be able to tune Lustre's behavior, for example to trade off performance against compression ratio. With client-side compression, single-stream performance, which is a common bottleneck, benefits directly because less data has to cross the network. Initial studies have shown that a compression ratio of 1.5 can be achieved for scientific data using LZ4.
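To make the reasoning about compression ratio and single-stream performance concrete, the following is a minimal sketch and not Lustre's implementation. It uses Python's standard-library zlib as a stand-in for LZ4, and the helper names (`compression_ratio`, `effective_throughput`, `should_compress`) as well as the bandwidth figures are hypothetical. It also assumes that compression and network transfer overlap, so a stream is limited by the slower of the two stages.

```python
import zlib


def compression_ratio(data: bytes, level: int = 1) -> float:
    """Return uncompressed size divided by compressed size.

    zlib is only a stand-in here; the project itself reports results for LZ4.
    """
    if not data:
        return 1.0
    return len(data) / len(zlib.compress(data, level))


def effective_throughput(link_mb_s: float, ratio: float, compress_mb_s: float) -> float:
    """Application-visible throughput of one stream with client-side compression.

    Assumption: compression and transfer are pipelined. The link then carries
    `ratio` times more application data per second, but the stream cannot
    run faster than the compressor itself.
    """
    return min(link_mb_s * ratio, compress_mb_s)


def should_compress(link_mb_s: float, ratio: float, compress_mb_s: float) -> bool:
    """Simple adaptive rule: compress only if it beats sending raw data."""
    return effective_throughput(link_mb_s, ratio, compress_mb_s) > link_mb_s


if __name__ == "__main__":
    # Hypothetical numbers: a 1000 MB/s link, the ratio of 1.5 quoted above,
    # and a compressor that sustains 2000 MB/s on one core.
    print(effective_throughput(1000.0, 1.5, 2000.0))
    print(should_compress(1000.0, 1.5, 2000.0))
    # A compressor slower than the link would make things worse.
    print(should_compress(1000.0, 1.5, 800.0))
```

Under these assumptions, a ratio of 1.5 raises the achievable single-stream rate by up to 50 percent as long as the compressor keeps up with the link. An adaptive scheme of the kind listed above would make a similar decision from measured rather than fixed values.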
Data Compression for Climate Data (Michael Kuhn, Julian Kunkel, Thomas Ludwig), In Supercomputing Frontiers and Innovations, Series: Volume 3, Number 1, pp. 75–94, (Editors: Jack Dongarra, Vladimir Voevodin), 2016-06
Analyzing Data Properties using Statistical Sampling – Illustrated on Scientific File Formats (Julian Kunkel), In Supercomputing Frontiers and Innovations, Series: Volume 3, Number 3, pp. 19–33, (Editors: Jack Dongarra, Vladimir Voevodin), 2016-10
Linux kernel contributions – LZ4 vendoring in the kernel (LZ4 implementation updated in Linux 4.11)
HowTo – Build Guide for Lustre and ZFS
Design proposal - LAD’16
Anna Fuchs, 6/20/2016, Enhanced Adaptive Compression in Lustre, ISC 17, Conference