Learn more about the operating modes for Intel® Optane™ DC persistent memory, and get a look at some sample code.
Welcome. My name is Eduardo Berrocal from Intel. In this video, I will go over the different modes supported in Intel® Optane™ DC persistent memory modules and how your application can benefit from them. Let's get started.
Persistent memory supports three different operating modes: Memory mode, Application Direct (or App-Direct) mode, and mixed mode, which is a combination of the two. Memory mode allows the system to use persistent memory as main memory.
In this case, DRAM is used as a cache and cannot be explicitly managed by the operating system. This means data placement is controlled by the memory controller. In App-Direct mode, persistent memory is exposed to the OS as special block devices, and applications can use such devices for data persistence. In this case, DRAM is still used as main memory.
You can also use persistent memory in App-Direct mode as a regular block device and put a regular file system on it as if it were an SSD. It is even possible to boot an OS from it, which allows you to configure systems with no disks at all.
Memory mode is appealing for cases where applications need more memory, but expanding the system's memory with only DRAM is either very expensive or simply not possible. If you are unsure whether your application will benefit from more memory, you can run a memory consumption analysis with VTune™ Amplifier to determine your application's memory footprint. Another tool that you can use is Platform Profiler, which is more lightweight than VTune Amplifier and allows you to run the analysis for a longer period of time.
Either way, a footprint of 90% or more is a good sign that you might benefit from a memory expansion. Remember that Intel Optane DC persistent memory is very fast but not quite as fast as DRAM. Given that in memory mode DRAM is used as a cache, a good indication that your application will perform well in memory mode is if its working set, that is, the memory that is used most often, fits inside the DRAM capacity.
You can run memory access analysis in VTune Amplifier to see each memory object that was allocated by the application, its size, and the number of loads and stores that access it. Using this analysis, we can determine the working set by adding up the total memory allocated for those objects that are accessed the most.
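The working-set estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MemObject` record, the function names, and the 90% access-share cutoff are hypothetical and are not part of any VTune Amplifier export format. The idea is simply to rank objects by loads plus stores, add up the sizes of the most-accessed ones, and compare the result with the DRAM capacity.

```python
from dataclasses import dataclass


@dataclass
class MemObject:
    """One row of a per-object memory access report (hypothetical layout)."""
    name: str
    size_bytes: int
    loads: int
    stores: int


def estimate_working_set(objects, access_share=0.9):
    """Add up the sizes of the most-accessed objects until they account
    for `access_share` of all loads and stores (an assumed cutoff)."""
    total = sum(o.loads + o.stores for o in objects)
    if total == 0:
        return 0
    ranked = sorted(objects, key=lambda o: o.loads + o.stores, reverse=True)
    covered = 0
    working_set = 0
    for obj in ranked:
        if covered >= access_share * total:
            break
        covered += obj.loads + obj.stores
        working_set += obj.size_bytes
    return working_set


def fits_in_dram(objects, dram_bytes, access_share=0.9):
    """True if the estimated working set fits in the DRAM cache."""
    return estimate_working_set(objects, access_share) <= dram_bytes
```

With a report where one small index receives almost all accesses, the estimate is the size of that index alone, which suggests the workload would behave well with DRAM acting as a cache.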
If memory mode does not give you all the performance you expected, which might be because your application's workload is not cache friendly, you might want to switch to App-Direct mode and handle memory placement yourself. For this, you can take advantage of memkind, a library developed by Intel. It is a heap manager built on top of jemalloc that provides a unified, malloc-like interface across a [INAUDIBLE] set of available memory pools in the system.
If you would like to learn more about memkind, please be sure to see our video, Managing Volatile Memory with Memkind, in the links. To maximize performance, you might want to place hot objects, those that are used the most, in DRAM, while leaving warm objects for persistent memory. It is also recommended to direct as many writes to DRAM as possible, given that writes do not perform as well in persistent memory as reads do. You can do that by placing the objects that are written to most often in DRAM while leaving mostly-read objects for persistent memory.
Run memory access analysis in VTune Amplifier again to gather the information you need to transform your application. You can use total loads and stores to check which objects are used the most, or just the total number of stores to understand which objects are written to the most in your application.
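One way to turn those per-object counts into a placement decision is a simple greedy plan, sketched here in Python. Everything in it is an assumption made for illustration: the `ObjStats` record, the `plan_placement` function, and in particular the `store_weight` factor that ranks stores above loads are hypothetical and are not taken from memkind or VTune Amplifier. In a real application, the resulting "dram" and "pmem" labels would be mapped to the corresponding allocator calls.

```python
from typing import NamedTuple


class ObjStats(NamedTuple):
    """Per-object counters from a memory access report (hypothetical layout)."""
    name: str
    size_bytes: int
    loads: int
    stores: int


def plan_placement(objects, dram_budget_bytes, store_weight=4):
    """Greedily assign the hottest, most-written objects to DRAM.

    Objects are ranked by loads + store_weight * stores, so write-heavy
    objects are preferred for DRAM. Anything that no longer fits in the
    remaining DRAM budget is left for persistent memory.
    """
    ranked = sorted(
        objects,
        key=lambda o: o.loads + store_weight * o.stores,
        reverse=True,
    )
    placement = {}
    remaining = dram_budget_bytes
    for obj in ranked:
        if obj.size_bytes <= remaining:
            placement[obj.name] = "dram"
            remaining -= obj.size_bytes
        else:
            placement[obj.name] = "pmem"
    return placement
```

A greedy plan like this is not optimal, but it keeps the policy easy to reason about: a large, mostly-read table ends up in persistent memory, while small, frequently written objects stay in DRAM.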
At this point, I know what you're thinking. These are all volatile use cases. Where is the persistence in persistent memory?
To determine if your application will benefit from the persistent aspect of persistent memory, you need to determine if your application is bottlenecked by I/O. You can do this by running the storage device analysis in VTune Amplifier. Excessive iowait or a large queue depth might be a good indication that your application can benefit from using persistent memory as storage.
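For a quick first look before running a full analysis, the iowait share can also be sampled directly from the operating system. The sketch below is not the VTune Amplifier storage device analysis; it is a swapped-in, Linux-only approximation that assumes the standard `/proc/stat` layout (user, nice, system, idle, iowait, irq, softirq, steal, guest, guest_nice). The function names are hypothetical, and what counts as "excessive" is left to the reader.

```python
def parse_cpu_times(stat_line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = stat_line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_fraction(before, after):
    """Share of CPU time spent in iowait between two samples.

    Only the first eight counters are summed, because guest time is
    already accounted for in the user and nice counters.
    """
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return deltas[4] / total  # index 4 is the iowait counter


def read_cpu_times(path="/proc/stat"):
    """Read the current aggregate CPU counters (Linux only)."""
    with open(path) as stat_file:
        return parse_cpu_times(stat_file.readline())
```

Taking one sample with `read_cpu_times()` before the workload runs and one after, then passing both to `iowait_fraction`, gives the share of CPU time that was spent waiting on I/O during the run.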
Of course, using persistent memory as storage is low-hanging fruit. The best approach to really benefit from what persistent memory has to offer is to avoid the traditional I/O stack altogether. This is achieved by transforming your data from traditional serialized formats optimized for block I/O to data structures residing in persistent memory.
The main advantage here is a unified data model and the possibility to access your data at cache [INAUDIBLE] directly from user space. By using the persistent aspect of Intel Optane DC persistent memory modules as well as their large capacity, up to six terabytes for a two-socket server, you can save precious work that would otherwise need to be recomputed.
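The idea of working on data in place, rather than serializing it through read and write calls, can be illustrated with a plain memory-mapped file. This is a minimal sketch under assumptions, not PMDK: the fixed-size `RECORD` layout and the `open_pool`, `put`, `get`, and `demo` names are hypothetical, and the demo uses a temporary file as a stand-in for a file on a persistent memory file system. It also makes no attempt at failure atomicity, which is one of the things a dedicated library would help with.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: two unsigned 64-bit integers (key, value).
RECORD = struct.Struct("<QQ")


def open_pool(path, n_records):
    """Create or open a fixed-size file and map it into the address space."""
    size = n_records * RECORD.size
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)      # new space is zero-filled, old data is kept
        return mmap.mmap(fd, size)  # shared, writable mapping by default
    finally:
        os.close(fd)                # the mapping stays valid after the close


def put(pool, slot, key, value):
    """Update a record in place and flush the mapping to the backing file."""
    RECORD.pack_into(pool, slot * RECORD.size, key, value)
    pool.flush()


def get(pool, slot):
    """Read a record straight from the mapping, with no read() call."""
    return RECORD.unpack_from(pool, slot * RECORD.size)


def demo():
    """Write a record, drop the mapping, map the file again, read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "pool.bin")
        pool = open_pool(path, n_records=4)
        put(pool, 2, 7, 42)
        pool.close()
        reopened = open_pool(path, n_records=4)
        try:
            return get(reopened, 2)
        finally:
            reopened.close()
```

The record survives closing and reopening the mapping without any explicit serialization step, which is the property the paragraph above describes.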
This is the case with some in-memory databases. In one case, for example, storing the indexes in persistent memory allowed the database server to reboot in just 17 seconds, as opposed to 35 minutes when DRAM was used to keep those indexes.
If you are wondering how you can transform your code for persistent memory, you should know that Intel has developed the Persistent Memory Developer Kit, or PMDK, a collection of open source libraries and tools that provide low-level primitives as well as useful high-level structures to help you get up to speed. If you want to learn more about the topics covered in this video, please check out the links. Thanks for watching.