Introducing the Low Level Persistent Library (LLPL) for Java*

Introduction

In this article, I introduce the Low Level Persistent Library (LLPL), an open-source Java* library being developed by Intel for persistent memory programming. By providing Java access to persistent memory at the memory block level, LLPL gives developers a foundation for building custom abstractions or retrofitting existing code. Another open-source library being developed by Intel is Persistent Collections for Java (PCJ), which emphasizes persistent collections and provides higher-level functionality such as garbage collection. To learn more about PCJ, I encourage you to read "Code Sample: Introduction to Java* API for Persistent Memory Programming."

This article describes LLPL in detail, providing simple examples for every part of the library. I also show how to compile and run Java programs using LLPL. In a follow-up article, I plan to do a performance comparison between PCJ and LLPL by implementing and comparing two versions (one for each library) of the same persistent code. Stay tuned!

Here we assume you have a basic understanding of persistent memory concepts and are familiar with features of the PMDK. If not, visit the Intel® Developer Zone Persistent Memory Programming site, where you’ll find the information you need to get started.

Memory Allocation and Destruction

LLPL gives us two main abstractions: (1) persistent memory blocks, and (2) transactions. A persistent memory block is exactly what its name suggests: a contiguous block of memory carved out of a persistent memory heap. There are three types of blocks: MemoryBlock, PersistentMemoryBlock, and TransactionalMemoryBlock.

Memory block classes allow us to allocate and destroy blocks on a given heap, and to write and read the memory from these blocks by using different accessor methods. It is in block allocation and writing where we find the differences between these three classes, as we will see later on. Allocation is done within the different constructor methods. These constructors call a native allocator written in C++ using the Java Native Interface (JNI). The native allocator, in turn, calls the allocator from the libpmemobj library, which is part of the Persistent Memory Development Kit (PMDK). For a high-level overview of the implementation stack, see Figure 1:

Figure 1. A high-level overview of the LLPL implementation stack.

In the constructor for MemoryBlock, we see:

...
long allocSize = size + baseOffset();
heap.allocateAtomic(allocSize);
...

Here, baseOffset() is an abstract method in charge of returning the size of the metadata for the block. Given that different memory block classes have different needs, the size of their metadata is also different.
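To make the role of baseOffset() more concrete, here is a minimal sketch in plain Java. It is not LLPL code: the heap is modeled with a ByteBuffer, and the class name SketchBlock, the constant HEADER_BYTES, and the 8-byte header that stores the payload size are all assumptions made for illustration. The real metadata layout of each LLPL block class may differ.

```java
import java.nio.ByteBuffer;

// Illustrative only: a plain-Java model of "payload size + metadata size".
// HEADER_BYTES and the header layout are assumptions, not LLPL's real layout.
class SketchBlock {
    static final int HEADER_BYTES = Long.BYTES; // hypothetical metadata: the payload size

    private final ByteBuffer heap; // stands in for the persistent heap
    private final int address;     // where this block starts inside the heap

    SketchBlock(ByteBuffer heap, int address, long payloadSize) {
        this.heap = heap;
        this.address = address;
        heap.putLong(address, payloadSize); // metadata sits in front of the payload
    }

    // Total number of bytes the allocator must reserve for a given payload size.
    static long allocationSize(long payloadSize) {
        return payloadSize + HEADER_BYTES;
    }

    long size() {
        return heap.getLong(address);
    }

    // Payload offsets are shifted past the header before touching the heap.
    void setInt(int offset, int value) {
        heap.putInt(address + HEADER_BYTES + offset, value);
    }

    int getInt(int offset) {
        return heap.getInt(address + HEADER_BYTES + offset);
    }
}
```

In this model, a caller asking for 100 bytes causes 108 bytes to be reserved, and offset 0 of the payload lives 8 bytes past the start of the block.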

If we then go to look for heap.allocateAtomic(), we can see the call to the native method:

 ...
long allocateAtomic(long size) {
    return nativeAllocateAtomic(poolHandle, size, getAllocationClassIndex(size));
}
...

The above snippets are internal to the constructors. To allocate a new persistent block from your Java application, do the following:

 ...
TransactionalHeap h = TransactionalHeap.getHeap("/mnt/mem/persistent_heap", 2147483648L);
...
TransactionalMemoryBlock block = h.allocateMemoryBlock(size);
...

Heap initialization is also shown above. When we call getHeap(), the heap is created if it does not exist, or simply opened if it does. To allocate blocks of a particular class, we need to use the corresponding heap class. There are three of these as well: Heap, PersistentHeap, and TransactionalHeap. For allocation, we are using the allocateMemoryBlock() method provided by the TransactionalHeap class.

In these snippets, we are assuming that a persistent memory device—real or emulated using RAM—is mounted at /mnt/mem. To free the persistent memory of an allocated block, simply call freeMemoryBlock():

...
h.freeMemoryBlock(block);
...

The Root Object

Similar to the C/C++ libpmemobj library in PMDK (and to PCJ), we need a common root block, that is, an entry point, to anchor all the other blocks created in the persistent memory heap. Since LLPL is very low level, we can’t define what the type (class or struct) of the root object is. In the case of PMDK, this is defined during heap (pool, in PMDK terminology) creation. Since there is only one root object in the entire pool/heap, there isn’t a need to allocate this object explicitly by the application. Something similar happens in PCJ, although in that case the root object—called ObjectDirectory—is always of the same type, that is, a key-value collection of persistent objects, where keys are object names (in the form of strings), and values are object references.

In LLPL we need to set the root explicitly. To do that, we need to pass the handle of the block we want to use as root to setRoot(). The handle is nothing more than the relative address (offset) of the block with respect to the beginning of the heap. Assuming that h references our heap object:

...
h.setRoot(block.handle());
...

To recover the root block we do:

...
long rootHandle = h.getRoot();
if (rootHandle == 0) {
    System.out.println("Root block not found!");
    System.exit(1); // non-zero exit status signals the failure
}
MemoryBlock block = h.memoryBlockFromHandle(rootHandle);
...

As you can see in the snippet, the method getRoot() gives us the handle for the root block. After we get the root address, it is always good practice to check if it is zero (in which case the root for the heap has never been set). Finally, we create a MemoryBlock object using the root handle by calling the memoryBlockFromHandle() method. It is important to point out that, in this case, no new persistent memory is allocated. The MemoryBlock object we are creating resides only on volatile memory, but its fields address and size are set with the proper values in order to reference the block in persistent memory. In other words, we are creating a reference object. If still in doubt, Figure 2 should help clear things up:

 

Figure 2. A high-level description of persistent memory reference in LLPL. In this figure, the persistent block referenced has a size of 1024 bytes.
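The idea of a handle as a relative offset, and of a block object as a volatile reference, can be sketched in plain Java as follows. This is not LLPL code: SketchHeap and BlockRef are hypothetical names, the heap is a ByteBuffer, and storing the root handle in the first 8 bytes is an assumption made only so that the example is self-contained.

```java
import java.nio.ByteBuffer;

// Illustrative only: handles modeled as offsets into a ByteBuffer "heap".
// The root slot at offset 0 and the bump allocator are assumptions for this sketch.
class SketchHeap {
    private static final int ROOT_SLOT = 0;  // hypothetical location of the root handle
    private final ByteBuffer memory;
    private int next = Long.BYTES;           // first free offset, right after the root slot

    SketchHeap(int capacity) {
        memory = ByteBuffer.allocate(capacity); // zero-filled, so the root starts as 0
    }

    // Reserve `size` bytes and return the handle (offset from the start of the heap).
    long allocate(int size) {
        long handle = next;
        next += size;
        return handle;
    }

    void setRoot(long handle) { memory.putLong(ROOT_SLOT, handle); }

    long getRoot() { return memory.getLong(ROOT_SLOT); }

    // Build a reference object on the Java heap; no sketch-heap memory is allocated here.
    BlockRef blockFromHandle(long handle) { return new BlockRef(memory, (int) handle); }
}

// A reference object: it only records where the block lives inside the sketch heap.
class BlockRef {
    private final ByteBuffer memory;
    private final int address;

    BlockRef(ByteBuffer memory, int address) {
        this.memory = memory;
        this.address = address;
    }

    long handle() { return address; }
    void setInt(int offset, int value) { memory.putInt(address + offset, value); }
    int getInt(int offset) { return memory.getInt(address + offset); }
}
```

Because the allocator in this sketch never hands out offset 0, a root handle of 0 can safely mean "no root has been set", which mirrors the zero check in the snippet above.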

Reading and Writing Block Data

Reading and writing data inside persistent blocks is done through multiple accessor methods. The following list includes these methods, highlighting the differences between the three memory block classes:

Get Methods

The implementation of these methods is common to all classes. The methods expect an offset relative to the block address.

byte getByte(long offset)
short getShort(long offset)
int getInt(long offset)
long getLong(long offset)

Set Methods

In this case, method behavior depends on the class used to create the block.

  1. MemoryBlock: Writes to blocks of this class are raw writes; no implicit flushing is performed, so new data is not guaranteed to have reached the persistent media until you flush explicitly.
  2. PersistentMemoryBlock: Every write to a block of this class is flushed automatically. The programmer does not need to worry about flushing; when the method returns, the data is guaranteed to have been written all the way to the persistent media. Flushing on every write, however, may hurt performance, so use this class only when every write must be flushed as it happens.
  3. TransactionalMemoryBlock: Every write to a block of this class is performed transactionally. To understand why this matters in persistent memory, I recommend reading "An introduction to pmemobj (part 2) - transactions" and "C++ bindings for libpmemobj (part 6) - transactions" at pmem.io.

void setByte(long offset, byte value)
void setShort(long offset, short value)
void setInt(long offset, int value)
void setLong(long offset, long value)
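If the difference between a raw write and a flushed write is unfamiliar, the closest analogy in the standard Java library is probably a memory-mapped file. The sketch below uses an ordinary temporary file, not persistent memory and not LLPL; the class name FlushSketch is hypothetical. A put on the mapping is comparable in spirit to a raw write, and force() to an explicit flush.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Illustrative only: an ordinary memory-mapped file, not persistent memory and not LLPL.
class FlushSketch {
    // Writes a value, forces it to the file, then reads it back through a fresh mapping.
    static int writeFlushAndReload(int value) {
        try {
            Path file = Files.createTempFile("flush-sketch", ".bin");
            file.toFile().deleteOnExit();
            try (FileChannel ch = FileChannel.open(file,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64);
                map.putInt(0, value); // "raw" write: nothing forces it to storage yet
                map.force();          // explicit flush of the mapped region
            }
            try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
                MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, 64);
                return map.getInt(0);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(writeFlushAndReload(111));
    }
}
```

In these terms, a PersistentMemoryBlock behaves as if every put were followed by its own flush, which is convenient but costs more than batching several raw writes under one flush.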

Memory Copying Methods

In addition to the getters and setters for the basic types shown above, LLPL also gives us three general memory copying methods. The behavior here also depends on the class used to create the block. I avoid repeating that information here because the behavior is the same as for the set methods described above.

void copyFromMemoryBlock(AnyMemoryBlock srcBlock, long srcOffset, long dstOffset, long length)


This method copies length bytes of memory from srcBlock at srcOffset into the block at dstOffset.

void copyFromArray(byte[] srcArray, int srcOffset, long dstOffset, int length)

This method copies length bytes of memory from the position srcOffset of the array srcArray into the block at dstOffset.

void setMemory(byte val, long offset, long length)

This method copies the byte val multiple (length) times starting at offset.
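The argument order of these three methods is easy to mix up, so here is a small plain-Java model of their semantics on top of a byte array. It is only a sketch: CopySketch is a hypothetical class, it uses int offsets instead of long, and it says nothing about flushing or transactions.

```java
import java.util.Arrays;

// Illustrative only: the argument order of the three copy methods, modeled on byte arrays.
class CopySketch {
    final byte[] bytes;

    CopySketch(int size) { bytes = new byte[size]; }

    // Copy `length` bytes from src (starting at srcOffset) into this block at dstOffset.
    void copyFromBlock(CopySketch src, int srcOffset, int dstOffset, int length) {
        System.arraycopy(src.bytes, srcOffset, bytes, dstOffset, length);
    }

    // Copy `length` bytes from srcArray (starting at srcOffset) into this block at dstOffset.
    void copyFromArray(byte[] srcArray, int srcOffset, int dstOffset, int length) {
        System.arraycopy(srcArray, srcOffset, bytes, dstOffset, length);
    }

    // Fill `length` bytes starting at `offset` with the value `val`.
    void setMemory(byte val, int offset, int length) {
        Arrays.fill(bytes, offset, offset + length, val);
    }
}
```

Note that in all three cases the destination is the block the method is called on; only the source is passed as an argument.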

If you are curious, direct raw access to off-heap memory in LLPL is implemented using the sun.misc.Unsafe API.

Let’s now take a quick look, with a very simple code snippet, at how we use the above methods for reading and writing persistent data:

import lib.llpl.*;

public class LlplTest {
        public static void main (String[] args) {
                Heap h = Heap.getHeap("/mnt/mem/persistent_heap", 2147483648L);

                // allocate a MemoryBlock: raw writes, flushed explicitly below
                int HEADER_SIZE = Integer.BYTES;
                int size = 100;
                MemoryBlock block =
                         h.allocateMemoryBlock(HEADER_SIZE + Integer.BYTES * size, true);
                block.setInt(0, size);

                // writing some integers
                block.setInt(HEADER_SIZE + Integer.BYTES * 0, 111);
                block.setInt(HEADER_SIZE + Integer.BYTES * 1, 222);
                block.setInt(HEADER_SIZE + Integer.BYTES * 2, 333);
                block.flush(0, HEADER_SIZE + Integer.BYTES * 3);

                // reading them back
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 0) == 111);
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 1) == 222);
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 2) == 333);

                // setting 50 bytes to value=0x44 from offset 10
                block.setMemory((byte)0x44, 10, 50);
                block.flush(10, 50);

                // copying these 50 bytes from one block (offset 10) 
                // to another (at offset 0)
                MemoryBlock newBlock =
                         h.allocateMemoryBlock(HEADER_SIZE + Integer.BYTES * size, true);

                newBlock.copyFromMemoryBlock(block, 10, 0, 50);
                newBlock.flush(0, 50);

                // copying from an array of length 5 to block (at offset 80)
                byte[] array = new byte[]{1,3,6,2,9};
                newBlock.copyFromArray(array, 0, 80, 5);
                newBlock.flush(80, 5);
        }
}

In the above snippet, we flush explicitly several times, calling flush(long offset, long size) after each set of raw writes. In the next section, I show how transactions can also be used with MemoryBlock and PersistentMemoryBlock objects. If you want to learn more about the API, I recommend that you read the source code for the memory block and heap classes.

Transactions

It is time to introduce LLPL transactions. These transactions allow us to create transactional writes larger than simple field writes—or single memory copy operations—available with the TransactionalMemoryBlock class. Transactions are useful when multiple operations over a data structure, such as moving pointers, have the potential to permanently corrupt the data structure if not done atomically.

We can use transactions by either calling the static singleton (thread-local syntax) or by creating a new Transaction object (parameter passing syntax). In both situations, we have a single transaction per thread (although the second option usually has better performance). We also have a single transaction per thread when we do transaction nesting: only the outermost one is the transaction that counts.

...
TransactionalHeap h = TransactionalHeap.getHeap("/mnt/mem/persistent_heap", 2147483648L);
...
// allocate a TransactionalMemoryBlock
int size = 100;
TransactionalMemoryBlock block = h.allocateMemoryBlock(Integer.BYTES * size, true);

// thread-local syntax
Transaction.run(h, () -> {
    block.setInt(0, 777);
    block.setLong(10, 1000);
    block.setInt(50, 11);
});

// parameter passing syntax
Transaction t = new Transaction(h);
t.run (() -> {
    block.setInt(0, 888);
    block.setLong(10, 2000);
    block.setInt(50, 22);
});
...
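The rule that only the outermost transaction counts can be modeled with a per-thread depth counter. The following plain-Java sketch shows that behavior only; it is not how LLPL implements transactions, and the class NestingSketch and its events list are hypothetical helpers used to make the order of operations visible.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: flattened nesting with one transaction per thread.
// This models the behavior described in the text, not LLPL's internals.
class NestingSketch {
    private static final ThreadLocal<Integer> depth = ThreadLocal.withInitial(() -> 0);

    // Records what happened, for inspection; meant for single-threaded use in this sketch.
    static final List<String> events = new ArrayList<>();

    static void run(Runnable body) {
        int d = depth.get();
        if (d == 0) events.add("begin");  // only the outermost call starts a transaction
        depth.set(d + 1);
        try {
            body.run();
        } finally {
            depth.set(d);                 // restore the depth even if the body throws
        }
        if (d == 0) events.add("commit"); // only the outermost call commits
    }
}
```

Calling NestingSketch.run() inside another NestingSketch.run() therefore records a single "begin" and a single "commit", with the work of both bodies in between.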

Transactions can also be used with blocks that are not explicitly transactional, as mentioned above. For MemoryBlock objects, we need to add chunks of memory to the transaction using the addToTransaction(long offset, long size) method. For PersistentMemoryBlock objects, all we need to do is call the transactionalSet...() family of methods, which add the memory being written to the current transaction. In the following snippet, we add to the transaction the bytes corresponding to positions 7, 8, and 9 of a hypothetical array of integers residing in a MemoryBlock object:

...
// allocate a MemoryBlock (raw writes)
int size = 100;
MemoryBlock block = h.allocateMemoryBlock(Integer.BYTES * size, true);

Transaction.run(h, () -> {
    block.addToTransaction(Integer.BYTES * 7, Integer.BYTES * 3);
    block.setInt(Integer.BYTES * 7, 777);
    block.setInt(Integer.BYTES * 8, 888);
    block.setInt(Integer.BYTES * 9, 999);
});
...
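Why must a range be added to the transaction before it is written? Conceptually, adding a range takes a snapshot of its current contents so that they can be restored if the transaction does not complete. The sketch below models that idea with a tiny undo log over a ByteBuffer. It is a simplification under my own assumptions (the class UndoLogSketch is hypothetical, and a thrown exception stands in for a failure), not a description of LLPL or PMDK internals.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: a tiny undo log over a ByteBuffer. It shows why a range is
// registered before it is modified; it does not reproduce LLPL or PMDK internals.
class UndoLogSketch {
    private static final class Entry {
        final int offset;
        final byte[] oldBytes;

        Entry(int offset, byte[] oldBytes) {
            this.offset = offset;
            this.oldBytes = oldBytes;
        }
    }

    private final ByteBuffer memory;
    private final Deque<Entry> undo = new ArrayDeque<>();

    UndoLogSketch(ByteBuffer memory) { this.memory = memory; }

    // Snapshot the current contents of [offset, offset + size) so they can be restored.
    void addToTransaction(int offset, int size) {
        byte[] old = new byte[size];
        for (int i = 0; i < size; i++) old[i] = memory.get(offset + i);
        undo.push(new Entry(offset, old));
    }

    // Run the body; on failure restore every snapshot (newest first) and rethrow.
    void run(Runnable body) {
        try {
            body.run();
            undo.clear(); // commit: the snapshots are no longer needed
        } catch (RuntimeException e) {
            while (!undo.isEmpty()) {
                Entry entry = undo.pop();
                for (int i = 0; i < entry.oldBytes.length; i++) {
                    memory.put(entry.offset + i, entry.oldBytes[i]);
                }
            }
            throw e;
        }
    }
}
```

In this model, a write to a range that was never registered would simply not be rolled back, which is the situation addToTransaction() is meant to prevent for MemoryBlock objects.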

The same is true for a PersistentMemoryBlock object:

...
PersistentMemoryBlock pblock = h.allocateMemoryBlock(Integer.BYTES * size, true);

Transaction.run(h, () -> {
    pblock.transactionalSetInt(Integer.BYTES * 0, 5);
    pblock.transactionalSetLong(Integer.BYTES * 10, 6);
    pblock.transactionalSetInt(Integer.BYTES * 50, 7);
});
...

How to Compile and Run

To use LLPL with your Java application, you need to have PMDK and LLPL installed on your system. To compile the Java classes, you need to specify the LLPL class path. Assuming you have LLPL installed on your home directory, do the following:

$ javac -cp .:/home/<username>/llpl/target/classes LlplTest.java

After that, you should see the generated *.class file. To run the main() method inside your class, you need to pass the LLPL class path again. You also need to set the java.library.path system property (the -D option below) to the location of the compiled native library used as a bridge between LLPL and PMDK:

$ java -cp .:/.../llpl/target/classes -Djava.library.path=/.../llpl/target/cppbuild LlplTest
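If the program fails to start with a class-not-found or native-library error, it can help to print what the JVM actually received. The following small diagnostic uses only standard system properties; the class name PathCheck is my own and it is not part of LLPL.

```java
import java.io.File;

// Illustrative only: prints where the JVM will look for classes and native libraries.
class PathCheck {
    // Splits a path list such as java.library.path into its individual entries.
    static String[] entries(String pathList) {
        return pathList == null || pathList.isEmpty()
                ? new String[0]
                : pathList.split(File.pathSeparator);
    }

    public static void main(String[] args) {
        for (String dir : entries(System.getProperty("java.library.path"))) {
            System.out.println("native library dir: " + dir);
        }
        for (String cp : entries(System.getProperty("java.class.path"))) {
            System.out.println("class path entry:   " + cp);
        }
    }
}
```

Run it with the same -cp and -Djava.library.path options as your application and check that the LLPL directories appear in the output.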

Summary

In this article, I presented an introduction to the Low Level Persistent Library (LLPL), an open-source Java library being developed by Intel for persistent memory programming. By providing Java access to persistent memory at the memory block level, LLPL gives developers a foundation for building custom abstractions or retrofitting existing code. Another open-source library being developed by Intel is Persistent Collections for Java (PCJ), which emphasizes persistent collections and provides higher-level functionality such as garbage collection.

In the article, I went over LLPL in detail, providing simple examples for every part of the library, and concluded by showing how Java programs using LLPL can be compiled and run. In a follow-up article, I plan to do a performance comparison between PCJ and LLPL by implementing and comparing two versions (one for each library) of the same persistent code (stay tuned).

About the Author

Eduardo Berrocal joined Intel as a Cloud Software Engineer in July 2017 after receiving his Ph.D. in Computer Science from the Illinois Institute of Technology (IIT) in Chicago, Illinois. His doctoral research interests were focused on (but not limited to) data analytics and fault tolerance for high-performance computing. In the past, he worked as a summer intern at Bell Labs (Nokia), as a research aide at Argonne National Laboratory, as a scientific programmer and web developer at the University of Chicago, and as an intern in the CESVIMA laboratory in Spain.

Resources

  1. Low Level Persistence Library, pmem/llpl
  2. Persistent Collections for Java, pmem/pcj
  3. Code Sample: Introduction to Java* API for Persistent Memory Programming
  4. The Persistent Memory Development Kit (PMDK)
  5. The Java Native Interface (JNI)
  6. How to Emulate Persistent Memory Using Dynamic Random-access Memory (DRAM)
  7. An introduction to pmemobj (part 2) – transactions
  8. C++ bindings for libpmemobj (part 6) – transactions
  9. Java Magic. Part 4: sun.misc.Unsafe