This video provides an introduction to the Storage Networking Industry Association (SNIA) persistent memory programming model, illustrated with code examples using the Persistent Memory Development Kit (PMDK).
Welcome. My name is Eduardo Berrocal from Intel. In this video, I will go over the programming model for persistent memory.
Persistent memory-aware, or PMEM-aware, file systems are needed in order to use Intel® Optane™ DC persistent memory in Application Direct, or App Direct, mode. The main reason is that file systems provide useful functionality, such as the ability to store, name, and easily find information, as well as the ability to protect such information between different users and applications. The difference from a traditional file system lies in how data is read and written.
In the case of PMEM-aware file systems, all data accesses go directly to the persistent media, bypassing any I/O caching done by the OS. The best use of this feature is with memory-mapped files. Once the file is memory mapped into the address space of the application, all loads and stores originating from the application go directly to the persistent media, and data accesses are done at cache-line granularity.
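As a rough illustration of the memory-mapped access path described above, here is a minimal sketch using plain POSIX calls. The helper names store_via_mmap and load_via_mmap are hypothetical, and the sketch assumes a Linux-like system. It runs on any file system; on a PMEM-aware (DAX) file system the same stores would reach the persistent media without going through the page cache, and a real program would normally rely on PMDK rather than raw mmap.

```cpp
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical helper: map `path` (created if missing and sized to `len`
// bytes), store `text` at offset 0 with an ordinary memcpy, flush, and unmap.
// Returns true on success.
bool store_via_mmap(const std::string &path, const std::string &text,
                    std::size_t len = 4096)
{
    if (text.size() > len)
        return false;
    int fd = open(path.c_str(), O_CREAT | O_RDWR, 0666);
    if (fd < 0)
        return false;
    if (ftruncate(fd, static_cast<off_t>(len)) != 0) {
        close(fd);
        return false;
    }
    void *addr = mmap(nullptr, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED)
        return false;

    // From here on, the file is accessed with regular loads and stores.
    std::memcpy(addr, text.data(), text.size());

    // Stores may still sit in CPU caches (or the page cache on a regular
    // file system), so flush explicitly before relying on durability.
    bool ok = msync(addr, len, MS_SYNC) == 0;
    munmap(addr, len);
    return ok;
}

// Hypothetical helper: map the file read-only and copy out the first `n`
// bytes. `n` must be non-zero and must not exceed the file size.
std::string load_via_mmap(const std::string &path, std::size_t n)
{
    int fd = open(path.c_str(), O_RDONLY);
    if (fd < 0)
        return {};
    void *addr = mmap(nullptr, n, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (addr == MAP_FAILED)
        return {};
    std::string out(static_cast<const char *>(addr), n);
    munmap(addr, n);
    return out;
}
```

Calling store_via_mmap with a path and a short string, and then load_via_mmap with the same path and the string's length, should return the stored bytes.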
Bypassing I/O caching, however, does not mean bypassing CPU caching. Recent data writes might still reside, unflushed, in the CPU caches. Unfortunately, these are not protected against a sudden loss of power. If that were to happen, we might end up with corrupted data structures.
To avoid corruption, programmers need to design their data structures in such a way that temporary torn writes are tolerated, and make sure that the proper flushing instructions are issued at exactly the right time. As you can probably guess, this way of programming is not trivial. Fortunately, Intel has developed the Persistent Memory Development Kit, or PMDK, an open source collection of libraries and tools that provide low-level primitives, as well as useful, high-level abstractions, to help persistent memory programmers overcome these obstacles.
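To make "flushing at exactly the right time" a little more concrete, here is a minimal sketch of one common pattern: make the payload durable first, and only then set a flag that marks it as complete. The Record layout, the publish function, and the FlushFn callback are all hypothetical names for illustration. The callback stands in for a real flush primitive, such as pmem_persist from PMDK's libpmem.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>

// Hypothetical record layout: readers trust `payload` only when `valid` is 1.
struct Record {
    char payload[56];
    std::uint64_t valid; // 0 = ignore the payload, 1 = payload is complete
};

// Stand-in for a real flush primitive (for example, libpmem's pmem_persist).
using FlushFn = std::function<void(const void *addr, std::size_t len)>;

// Write the payload, flush it, and only then publish it by setting the flag.
// Returns false when `text` does not fit in the payload.
bool publish(Record &rec, const char *text, const FlushFn &flush)
{
    const std::size_t n = std::strlen(text);
    if (n >= sizeof(rec.payload))
        return false;

    // Invalidate first, so a crash during the update leaves a record that
    // readers will skip instead of a half-written payload marked as valid.
    rec.valid = 0;
    flush(&rec.valid, sizeof(rec.valid));

    // The payload may be torn by a power failure here; that is tolerated
    // because the flag still says "ignore".
    std::memset(rec.payload, 0, sizeof(rec.payload));
    std::memcpy(rec.payload, text, n);
    flush(rec.payload, sizeof(rec.payload));

    // Commit: the 8-byte flag is written and flushed last.
    rec.valid = 1;
    flush(&rec.valid, sizeof(rec.valid));
    return true;
}
```

Getting this ordering right by hand for every update is the burden that the transactions in PMDK are meant to remove.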
In the code sample that I present next, a simple doubly linked list with head and tail pointers, I use the C++ bindings for libpmemobj. libpmemobj is the most flexible library and provides transactions to avoid corrupting our data structures during writes.
The first thing we need in order to use libpmemobj is a pool. Pool is the terminology used in PMDK to refer to a file that will be memory mapped. We can create a pool either by using the command-line tool, pmempool, or inside our program by calling the create function in the pool class.
In libpmemobj, we need to define a root object. This root acts as the entry point to anchor all the other objects created in the pool. In the case of our linked list, the root consists of two pointers, corresponding to the head and the tail of the list. You can call the function root to retrieve a pointer to the root of a particular pool.
Due to the nature of virtual memory, the template class, persistent_ptr, is needed to safely store pointers to objects in persistent memory. It also helps programmers avoid mistakes due to mixing pointers to volatile objects with those to persistent objects. Regular variables residing in persistent memory need to be declared using the template class, p. This allows the library to add those variables automatically to transactions, thanks to operator overloading.
One way to define transactions in libpmemobj is using lambda functions. Lambdas allow us to insert anonymous functions in place to easily define code regions that need to be executed atomically, like the node insertion example presented here. In addition to providing atomic updates, libpmemobj transactions can also be synchronized in a multi-threaded application by passing a mutex variable to the transaction.
In some cases, applications do not care about the persistent aspect of persistent memory in App Direct mode and just want to use it as an extra pool of available memory. For this, developers can take advantage of the memkind library, also developed by Intel: a heap manager built on top of jemalloc, which provides a unified malloc-like interface across a heterogeneous set of available memory pools in the system.
There is much more to learn about Intel Optane DC Persistent Memory and persistent memory programming. I urge you to explore further by following the links, as well as watching our other persistent memory videos.