Introducing the Low Level Persistent Library (LLPL) for Java*

Introduction

In this article I introduce the Low Level Persistent Library (LLPL), an open-source Java* library being developed by Intel for persistent memory programming. By providing Java access to persistent memory at the memory block level, LLPL gives developers a foundation for building custom abstractions or retrofitting existing code. Another open-source library being developed by Intel is Persistent Collections for Java (PCJ), which emphasizes persistent collections and provides higher-level functionality such as garbage collection. To learn more about PCJ, I encourage you to read "Code Sample: Introduction to Java* API for Persistent Memory Programming."

This article describes LLPL in detail, providing simple examples for every part of the library. I also show how to compile and run Java programs using LLPL. In a follow-up article, I plan to do a performance comparison between PCJ and LLPL by implementing and comparing two versions (one for each library) of the same persistent code. Stay tuned!

Here we assume you have a basic understanding of persistent memory concepts and are familiar with features of the PMDK. If not, visit the Intel® Developer Zone Persistent Memory Programming site, where you’ll find the information you need to get started.

Memory Allocation and Destruction

LLPL gives us two abstractions: (1) persistent memory blocks, and (2) transactions. Persistent memory blocks are exactly what their name suggests: contiguous blocks of memory carved out of a persistent memory heap. There are three types of blocks (raw, flushable, and transactional), as shown in Figure 1.

Figure 1. UML class diagram for the LLPL’s persistent memory blocks.

Memory block classes allow us to allocate and destroy blocks on a given heap, and to read and write the memory of these blocks through different accessor methods. The accessor methods are where the three classes differ, as we will see later on. Allocation is done within the MemoryBlock constructor. This constructor calls a native allocator written in C++ through the Java Native Interface (JNI). The native allocator, in turn, calls the allocator from the libpmemobj library, which is part of the Persistent Memory Development Kit (PMDK). For a high-level overview of the implementation stack, see Figure 2. The allocation call inside the constructor follows the figure:

Figure 2. High-level overview of the LLPL implementation stack.

...
address = heap.nativeAllocate(size + baseOffset());
...

Here, baseOffset() is an abstract method (implemented in the child classes) that returns the size of the block's metadata. Because different memory block classes have different needs, the size of their metadata also differs. For example, the metadata for RawMemoryBlock occupies 8 bytes, while the metadata for FlushableMemoryBlock occupies 12 bytes.
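
To make the relationship between the requested size and the metadata concrete, here is a minimal, self-contained sketch. The class names (BlockSketch, RawSketch, FlushableSketch) and the allocationSize() method are hypothetical stand-ins, not LLPL classes; only the 8- and 12-byte metadata sizes come from the text above.

```java
// Hypothetical stand-ins for illustration; these are NOT the LLPL classes.
class BaseOffsetDemo {
    public static void main(String[] args) {
        System.out.println(new RawSketch(100).allocationSize());       // prints 108
        System.out.println(new FlushableSketch(100).allocationSize()); // prints 112
    }
}

abstract class BlockSketch {
    private final long payloadSize;

    BlockSketch(long payloadSize) {
        this.payloadSize = payloadSize;
    }

    // Size in bytes of the per-block metadata, supplied by each child class.
    abstract long baseOffset();

    // Total bytes requested from the allocator: payload plus metadata.
    long allocationSize() {
        return payloadSize + baseOffset();
    }
}

class RawSketch extends BlockSketch {
    RawSketch(long payloadSize) { super(payloadSize); }

    @Override
    long baseOffset() { return 8; }   // metadata size quoted in the article
}

class FlushableSketch extends BlockSketch {
    FlushableSketch(long payloadSize) { super(payloadSize); }

    @Override
    long baseOffset() { return 12; }  // metadata size quoted in the article
}
```

The point of the abstract method is that the parent class can compute the allocation size without knowing which kind of block it is.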

That snippet, however, is internal to the constructor. To allocate a new persistent block from your Java application, do the following:

...
Heap h = Heap.getHeap("/mnt/mem/persistent_heap", 2147483648L);
...
MemoryBlock<Transactional> block = h.allocateMemoryBlock(Transactional.class, size);
...

Heap initialization is also shown above. When we call getHeap(), the heap is either created if it does not exist, or just opened if it does. For allocation we are using the allocator method provided by the Heap class.
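
The "create if missing, otherwise open" behavior can be illustrated without LLPL by using a memory-mapped file from the standard library. This is only an analogy under stated assumptions: the class OpenOrCreateDemo and its methods are hypothetical, the file lives in the system temporary directory rather than on a persistent memory device, and nothing here reflects how getHeap() is implemented.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical analogy for "create or open"; this is NOT LLPL code.
class OpenOrCreateDemo {
    // Maps 'size' bytes of the file at 'path', creating the file if it is missing.
    static MappedByteBuffer openOrCreate(Path path, long size) {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.CREATE,
                StandardOpenOption.READ,
                StandardOpenOption.WRITE)) {
            // In READ_WRITE mode, mapping past the end of the file extends it.
            return channel.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes through a first mapping, then reads the value back through a second one.
    static int roundTrip() {
        try {
            Path path = Files.createTempFile("heap_sketch", ".bin");
            Files.delete(path);              // start from "the heap does not exist"
            path.toFile().deleteOnExit();    // clean up the temporary file later

            MappedByteBuffer first = openOrCreate(path, 4096);   // creates the file
            first.putInt(0, 42);
            first.force();                   // write the change back to the file

            MappedByteBuffer second = openOrCreate(path, 4096);  // opens the existing file
            return second.getInt(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```

The second call finds the file already in place and simply maps it, so the value written through the first mapping is still there.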

In these snippets we are assuming that a persistent memory device—real or emulated using RAM—is mounted at /mnt/mem. To free the persistent memory of an allocated block, simply call freeMemoryBlock():

...
h.freeMemoryBlock(block);
...

The Root Object

Similar to the C/C++ libpmemobj library in PMDK (and to PCJ), we need a common root block, that is, an entry point, to anchor all the other blocks created in the persistent memory heap. Since LLPL is very low level, we can’t define what the type (class or struct) of the root object is. In the case of PMDK, this is defined during heap (pool, in PMDK terminology) creation. Since there is only one root object in the entire pool/heap, there isn’t a need to allocate this object explicitly by the application. Something similar happens in PCJ, although in that case the root object—called ObjectDirectory—is always of the same type, that is, a key-value collection of persistent objects, where keys are object names (in the form of strings), and values are object references.

In LLPL we need to set the root explicitly. To do that, we need to pass the address of the block we want to use as root to setRoot(). Assuming that h references our heap object:

...
h.setRoot(block.address());
...

To recover the root block we do:

...
long rootAddr = h.getRoot();
if (rootAddr == 0) {
    // a zero address means the root has never been set for this heap
    System.out.println("Root block not found!");
    System.exit(1);
}
MemoryBlock<Transactional> block =
                   h.memoryBlockFromAddress(Transactional.class, rootAddr);
...

As you can see in the snippet, the method getRoot() gives us the address of the root block. After we get the root address, it is always good practice to check whether it is zero (in which case the root for the heap has never been set). Finally, we create a MemoryBlock object from the root address by calling the memoryBlockFromAddress() method. It is important to point out that, in this case, no new persistent memory is allocated. The MemoryBlock object we are creating resides only in volatile memory, but its address and size fields are set to the proper values in order to reference the block in persistent memory. In other words, we are creating a reference object. If you are still in doubt, Figure 3 should help clear things up:

Figure 3. High-level description of persistent memory reference in LLPL. In this figure, the persistent block referenced has a size of 1024 bytes.
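
The idea of a root address plus volatile reference objects can be sketched with plain Java. Everything in this example is a hypothetical model: ToyHeap, BlockRef, and the byte layout are assumptions made for illustration, and a ByteBuffer stands in for persistent memory (so nothing actually survives a restart).

```java
import java.nio.ByteBuffer;

// Hypothetical model of a root pointer and reference objects; NOT LLPL code.
class RootSketchDemo {
    // Volatile handle: it stores only where the block is and how big it is.
    static final class BlockRef {
        final long address;
        final long size;

        BlockRef(long address, long size) {
            this.address = address;
            this.size = size;
        }
    }

    // Toy heap over a ByteBuffer. The layout is an assumption made for this sketch:
    // bytes 0-7 hold the root address, bytes 8-15 the next free address,
    // and every block starts with an 8-byte size field.
    static final class ToyHeap {
        private static final int ROOT_SLOT = 0;
        private static final int NEXT_FREE_SLOT = 8;
        private static final long FIRST_BLOCK = 16;
        private final ByteBuffer memory;

        ToyHeap(ByteBuffer memory) {
            this.memory = memory;
            if (memory.getLong(NEXT_FREE_SLOT) == 0) {
                memory.putLong(NEXT_FREE_SLOT, FIRST_BLOCK); // first use of this memory
            }
        }

        BlockRef allocate(long size) {
            long address = memory.getLong(NEXT_FREE_SLOT);
            memory.putLong((int) address, size);
            memory.putLong(NEXT_FREE_SLOT, address + Long.BYTES + size);
            return new BlockRef(address, size);
        }

        void setRoot(long address) { memory.putLong(ROOT_SLOT, address); }

        long getRoot() { return memory.getLong(ROOT_SLOT); }

        // Creates a new handle only; nothing is allocated in 'memory'.
        BlockRef blockFromAddress(long address) {
            return new BlockRef(address, memory.getLong((int) address));
        }
    }

    // Sets a root, "reopens" the same memory, and returns the recovered block size.
    static long recoveredRootSize() {
        ByteBuffer memory = ByteBuffer.allocate(4096);

        ToyHeap heap = new ToyHeap(memory);
        if (heap.getRoot() != 0) {
            throw new IllegalStateException("fresh heap should have no root");
        }
        heap.setRoot(heap.allocate(1024).address);

        ToyHeap reopened = new ToyHeap(memory); // stands in for a later run
        long rootAddr = reopened.getRoot();
        return reopened.blockFromAddress(rootAddr).size;
    }

    public static void main(String[] args) {
        System.out.println(recoveredRootSize()); // prints 1024
    }
}
```

Note that blockFromAddress() builds a fresh Java object every time it is called, while the bytes it points to stay where they are.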

Reading and Writing Block Data

Reading and writing data inside persistent blocks is done through multiple accessor methods. The following list includes these methods, highlighting the differences between the three classes:

Get Methods

These methods are implemented in the parent class MemoryBlock, and hence are common to all three classes. The methods expect an offset relative to the block address.

byte getByte(long offset)
short getShort(long offset)
int getInt(long offset)
long getLong(long offset)

Set Methods

In this case, method behavior depends on the class used to create the block. For raw blocks, the set methods simply write the data to the specified destination. For flushable blocks, data is written and the modified memory addresses are added to an internal list called addressRanges, which is flushed when flush() is called (this parameterless flush is only available for flushable blocks). Finally, transactional blocks write the data using a transaction, which means that values are updated atomically. To understand why this is important in persistent memory, I recommend reading "An introduction to pmemobj (part 2) - transactions" and "C++ bindings for libpmemobj (part 6) - transactions" at pmem.io.

void setByte(long offset, byte value)
void setShort(long offset, short value)
void setInt(long offset, int value)
void setLong(long offset, long value)
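
The bookkeeping described for flushable blocks can be pictured with a small model. This is a sketch under assumptions: DirtyTrackingBlock is a hypothetical class backed by an ordinary ByteBuffer, and its flush() only counts and clears the recorded ranges instead of flushing CPU caches as a real persistent memory library would.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of dirty-range tracking; NOT the LLPL implementation.
class FlushTrackingDemo {
    static final class DirtyTrackingBlock {
        private final ByteBuffer data;
        // Each entry is {offset, length} of a write that has not been flushed yet.
        private final List<long[]> addressRanges = new ArrayList<>();

        DirtyTrackingBlock(int size) {
            data = ByteBuffer.allocate(size);
        }

        void setInt(long offset, int value) {
            data.putInt((int) offset, value);
            addressRanges.add(new long[] {offset, Integer.BYTES});
        }

        int getInt(long offset) {
            return data.getInt((int) offset);
        }

        int pendingRanges() {
            return addressRanges.size();
        }

        // A real implementation would flush each recorded range to persistent
        // memory here; this sketch only reports how many ranges were pending.
        int flush() {
            int flushed = addressRanges.size();
            addressRanges.clear();
            return flushed;
        }
    }

    // Returns {ranges pending before flush, ranges flushed, ranges pending after}.
    static int[] demo() {
        DirtyTrackingBlock block = new DirtyTrackingBlock(64);
        block.setInt(0, 111);
        block.setInt(4, 222);
        block.setInt(8, 333);
        int before = block.pendingRanges();
        int flushed = block.flush();
        return new int[] {before, flushed, block.pendingRanges()};
    }

    public static void main(String[] args) {
        int[] result = demo();
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // prints 3 3 0
    }
}
```

Deferring the flush lets several writes be made durable in one step, at the cost of remembering which ranges were touched.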

Memory Copying Methods

In addition to the getters and setters for the basic types shown above, LLPL also gives us three general memory copying methods. Behavior here also depends on the class used to create the block. I avoid repeating that information here because the behavior is the same as for the set methods described above.

void copyFromMemory(MemoryBlock<?> srcBlock, long srcOffset, long dstOffset, long length)

This method copies length bytes of memory from srcBlock at srcOffset into the block at dstOffset.

void copyFromArray(byte[] srcArray, int srcOffset, long dstOffset, int length)

This method copies length bytes of memory from the position srcOffset of the array srcArray into the block at dstOffset.

void setMemory(byte val, long offset, long length)

This method writes the byte val length times, starting at offset.
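
For readers who want to check their understanding of the argument order, the three operations can be mimicked on ordinary ByteBuffers. The static helpers below are hypothetical equivalents written for this sketch (the destination is passed explicitly as the first argument); they are not the LLPL methods and do not touch persistent memory.

```java
import java.nio.ByteBuffer;

// Hypothetical ByteBuffer equivalents of the three copy operations; NOT LLPL code.
class CopySketchDemo {
    // Copies 'length' bytes from src at srcOffset into dst at dstOffset.
    // Overlapping ranges within the same buffer are not handled by this sketch.
    static void copyFromMemory(ByteBuffer dst, ByteBuffer src,
                               int srcOffset, int dstOffset, int length) {
        for (int i = 0; i < length; i++) {
            dst.put(dstOffset + i, src.get(srcOffset + i));
        }
    }

    // Copies 'length' bytes from srcArray at srcOffset into dst at dstOffset.
    static void copyFromArray(ByteBuffer dst, byte[] srcArray,
                              int srcOffset, int dstOffset, int length) {
        for (int i = 0; i < length; i++) {
            dst.put(dstOffset + i, srcArray[srcOffset + i]);
        }
    }

    // Writes the byte 'val' 'length' times into dst, starting at 'offset'.
    static void setMemory(ByteBuffer dst, byte val, int offset, int length) {
        for (int i = 0; i < length; i++) {
            dst.put(offset + i, val);
        }
    }

    // Fills one buffer, copies part of it into a second one, and returns the
    // second buffer's bytes.
    static byte[] demo() {
        ByteBuffer block = ByteBuffer.allocate(404);
        ByteBuffer newBlock = ByteBuffer.allocate(404);

        setMemory(block, (byte) 0x44, 10, 50);       // bytes 10..59 of block
        copyFromMemory(newBlock, block, 10, 0, 50);  // into bytes 0..49 of newBlock
        copyFromArray(newBlock, new byte[] {1, 3, 6, 2, 9}, 0, 80, 5);
        return newBlock.array();
    }

    public static void main(String[] args) {
        byte[] out = demo();
        System.out.println(out[0] + " " + out[49] + " " + out[50] + " " + out[84]); // prints 68 68 0 9
    }
}
```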

If you are curious, direct raw access to off-heap memory in LLPL is implemented using sun.misc.Unsafe, an internal JDK class that is not part of the supported public Java API.

Let’s now take a quick look, with a very simple code snippet, at how we use the above methods for reading and writing persistent data:

import lib.llpl.*;

public class LlplTest {
        public static void main (String[] args) {
                Heap h = Heap.getHeap("/mnt/mem/persistent_heap", 2147483648L);

                // block allocation of Flushable class
                int HEADER_SIZE = Integer.BYTES;
                int size = 100;
                MemoryBlock<Flushable> block =
                         h.allocateMemoryBlock(Flushable.class, 
                                               HEADER_SIZE + Integer.BYTES * size);
                block.setInt(0, size);

                // writing some integers
                block.setInt(HEADER_SIZE + Integer.BYTES * 0, 111);
                block.setInt(HEADER_SIZE + Integer.BYTES * 1, 222);
                block.setInt(HEADER_SIZE + Integer.BYTES * 2, 333);
                block.flush();

                // reading them back (assert statements are only checked when the
                // JVM is started with -ea)
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 0) == 111);
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 1) == 222);
                assert (block.getInt(HEADER_SIZE + Integer.BYTES * 2) == 333);

                // setting 50 bytes to value=0x44 from offset 10
                block.setMemory((byte)0x44, 10, 50);
                block.flush();

                // copying these 50 bytes from one block (offset 10) 
                // to another (at offset 0)
                MemoryBlock<Flushable> newBlock =
                         h.allocateMemoryBlock(Flushable.class, 
                                               HEADER_SIZE + Integer.BYTES * size);

                newBlock.copyFromMemory(block, 10, 0, 50);
                newBlock.flush();

                // copying from an array of length 5 to block (at offset 80)
                byte[] array = {1, 3, 6, 2, 9};
                newBlock.copyFromArray(array, 0, 80, 5);
                newBlock.flush();
        }
}
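
The layout used in the listing above (a 4-byte header holding the element count, followed by the integers) can be wrapped in a few helpers so the offset arithmetic lives in one place. This is a sketch under assumptions: IntArrayLayoutDemo and its methods are hypothetical names, and a ByteBuffer takes the place of a persistent block.

```java
import java.nio.ByteBuffer;

// Hypothetical helpers for a "header + int array" layout; NOT part of LLPL.
class IntArrayLayoutDemo {
    static final int HEADER_SIZE = Integer.BYTES; // the header holds the element count

    // Byte offset of the element at 'index', counted from the start of the block.
    static int offsetOf(int index) {
        return HEADER_SIZE + Integer.BYTES * index;
    }

    static ByteBuffer create(int length) {
        ByteBuffer block = ByteBuffer.allocate(offsetOf(length));
        block.putInt(0, length);
        return block;
    }

    static int length(ByteBuffer block) {
        return block.getInt(0);
    }

    static void set(ByteBuffer block, int index, int value) {
        checkIndex(block, index);
        block.putInt(offsetOf(index), value);
    }

    static int get(ByteBuffer block, int index) {
        checkIndex(block, index);
        return block.getInt(offsetOf(index));
    }

    private static void checkIndex(ByteBuffer block, int index) {
        if (index < 0 || index >= length(block)) {
            throw new IndexOutOfBoundsException("index " + index);
        }
    }

    public static void main(String[] args) {
        ByteBuffer block = create(100);
        set(block, 0, 111);
        set(block, 1, 222);
        set(block, 2, 333);
        System.out.println(length(block) + " " + get(block, 1) + " " + offsetOf(2)); // prints 100 222 12
    }
}
```

Keeping the count in the header means a later reader of the block can recover the array length from the block itself.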

Regarding writes, there is more to the API than this. For example, it is possible to do transactional writes on raw blocks (the API supports methods such as setTransactionalByte(), and so on). Also, blocks of any class can call flush(long offset, long size) to flush a given memory range. To keep the size of this article manageable, I am skipping those. If you want to learn more, I recommend reading the source code for the memory block classes as well as the Heap class.

Transactions

It is time to introduce LLPL transactions. Transactions let us make transactional writes that span more than the single field writes (or single memory copy operations) available in the memory block classes. They are useful when multiple operations on a data structure, such as moving pointers, could permanently corrupt it if not done atomically.

We can use transactions either by calling the static singleton (thread-local syntax) or by creating a new Transaction object (parameter passing syntax). In both cases we have a single transaction per thread (although the second option usually performs better). We also have a single transaction per thread when we nest transactions: only the outermost one counts.

...
Heap h = Heap.getHeap("/mnt/mem/persistent_heap", 2147483648L);
...
// block allocation of Transactional class
int size = 100;
MemoryBlock<Transactional> block = 
         h.allocateMemoryBlock(Transactional.class, Integer.BYTES * size);

// thread-local syntax
Transaction.run(() -> {
    block.setInt(0, 777);
    block.setLong(10, 1000);
    block.setInt(50, 11);
});

// parameter passing syntax
Transaction t = new Transaction();
t.execute(() -> {
    block.setInt(0, 888);
    block.setLong(10, 2000);
    block.setInt(50, 22);
});
...
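
To see why grouping writes matters, and what "only the outermost one counts" means for nesting, consider the following toy undo log. It is a sketch under assumptions and not how LLPL or PMDK implement transactions: ToyTransaction is hypothetical, it operates on an ordinary ByteBuffer, and its log lives in volatile memory, so it models atomicity on failure but not recovery after a crash.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy undo-log transaction with flat nesting; NOT the LLPL implementation.
class UndoLogDemo {
    static final class ToyTransaction {
        private final ByteBuffer memory;
        // Each entry is {offset, previous value}; the newest entry is on top.
        private final Deque<int[]> undoLog = new ArrayDeque<>();
        private int depth = 0;
        private boolean failed = false;

        ToyTransaction(ByteBuffer memory) {
            this.memory = memory;
        }

        void setInt(int offset, int value) {
            if (depth > 0) {
                undoLog.push(new int[] {offset, memory.getInt(offset)});
            }
            memory.putInt(offset, value);
        }

        void run(Runnable body) {
            depth++;
            boolean ok = false;
            try {
                body.run();
                ok = true;
            } finally {
                if (!ok) {
                    failed = true;
                }
                depth--;
                if (depth == 0) {      // only the outermost call commits or aborts
                    if (failed) {
                        rollback();
                    }
                    undoLog.clear();
                    failed = false;
                }
            }
        }

        private void rollback() {
            while (!undoLog.isEmpty()) {
                int[] entry = undoLog.pop();       // undo the newest write first
                memory.putInt(entry[0], entry[1]);
            }
        }
    }

    // Runs one committed and one aborted transaction; returns the final values.
    static int[] demo() {
        ByteBuffer memory = ByteBuffer.allocate(64);
        ToyTransaction tx = new ToyTransaction(memory);

        // Committed: the nested call joins the outer transaction.
        tx.run(() -> {
            tx.setInt(0, 777);
            tx.run(() -> tx.setInt(4, 11));
        });

        // Aborted: the failure undoes both writes, including the nested one.
        try {
            tx.run(() -> {
                tx.setInt(0, 888);
                tx.run(() -> tx.setInt(4, 22));
                throw new IllegalStateException("simulated failure");
            });
        } catch (IllegalStateException expected) {
            // the transaction has already been rolled back at this point
        }
        return new int[] {memory.getInt(0), memory.getInt(4)};
    }

    public static void main(String[] args) {
        int[] values = demo();
        System.out.println(values[0] + " " + values[1]); // prints 777 11
    }
}
```

After the aborted run, both locations still hold the values from the committed run, which is the all-or-nothing behavior a transaction is meant to provide.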

Transactions can also be used with raw blocks, as mentioned above. All we need to do is call the setTransactional...() family of methods to add the chunk of memory we want to include to the current transaction. In the following snippet, we add to the transaction the bytes corresponding to positions 7, 8, and 9 of a hypothetical array of integers:

...
// block allocation of Raw class
int size = 100;
MemoryBlock<Raw> block = 
         h.allocateMemoryBlock(Raw.class, Integer.BYTES * size);

Transaction.run(() -> {
    block.setTransactionalInt(Integer.BYTES * 7, 777);
    block.setTransactionalInt(Integer.BYTES * 8, 888);
    block.setTransactionalInt(Integer.BYTES * 9, 999);
});
...

How to Compile and Run

To use LLPL with your Java application, you need to have PMDK and LLPL installed on your system. To compile the Java classes, you need to specify the LLPL class path. Assuming you have LLPL installed in your home directory, do the following:

$ javac -cp .:/home/<username>/llpl/target/classes LlplTest.java

After that, you should see the generated *.class file. To run the main() method inside your class, you again need to pass the LLPL class path. You also need to set the java.library.path system property (the -D option in the command) to the location of the compiled native library used as a bridge between LLPL and PMDK:

$ java -cp .:/.../llpl/target/classes -Djava.library.path=/.../llpl/target/cppbuild LlplTest
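
If the native library fails to load at run time, it can help to print what the JVM actually received in java.library.path. The small diagnostic below uses only standard Java APIs; the class name LibraryPathCheck is hypothetical and the program is not part of LLPL.

```java
import java.io.File;
import java.util.regex.Pattern;

// Hypothetical diagnostic helper; it only inspects a standard system property.
class LibraryPathCheck {
    // Splits java.library.path into its individual directory entries.
    static String[] libraryPathEntries() {
        String value = System.getProperty("java.library.path", "");
        if (value.isEmpty()) {
            return new String[0];
        }
        return value.split(Pattern.quote(File.pathSeparator));
    }

    public static void main(String[] args) {
        for (String entry : libraryPathEntries()) {
            boolean isDirectory = new File(entry).isDirectory();
            System.out.println(entry + (isDirectory ? "" : "  (not a directory)"));
        }
    }
}
```

Run it with the same -Djava.library.path option you pass to your application and check that the directory containing the compiled native library appears in the output.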

Summary

In this article I presented an introduction to the Low Level Persistent Library (LLPL), an open-source Java library being developed by Intel for persistent memory programming. By providing Java access to persistent memory at the memory block level, LLPL gives developers a foundation for building custom abstractions or retrofitting existing code. Another open-source library being developed by Intel is Persistent Collections for Java (PCJ), which emphasizes persistent collections and provides higher-level functionality such as garbage collection.

In the article, I went over LLPL in detail, providing simple examples for every part of the library, and concluded by showing how Java programs using LLPL can be compiled and run. In a follow-up article, I plan to do a performance comparison between PCJ and LLPL by implementing and comparing two versions (one for each library) of the same persistent code (stay tuned).

About the Author

Eduardo Berrocal joined Intel as a Cloud Software Engineer in July 2017 after receiving his PhD in Computer Science from the Illinois Institute of Technology (IIT) in Chicago, Illinois. His doctoral research interests were focused on (but not limited to) data analytics and fault tolerance for high-performance computing. In the past he worked as a summer intern at Bell Labs (Nokia), as a research aide at Argonne National Laboratory, as a scientific programmer and web developer at the University of Chicago, and as an intern in the CESVIMA laboratory in Spain.

Resources

  1. Low Level Persistence Library, pmem/llpl
  2. Persistent Collections for Java, pmem/pcj
  3. Code Sample: Introduction to Java* API for Persistent Memory Programming
  4. The Persistent Memory Development Kit (PMDK)
  5. The Java Native Interface (JNI)
  6. How to Emulate Persistent Memory Using Dynamic Random-access Memory (DRAM)
  7. An introduction to pmemobj (part 2) – transactions
  8. C++ bindings for libpmemobj (part 6) – transactions
  9. Java Magic. Part 4: sun.misc.Unsafe