Persistent Memory Programming with Java*

Overview

The new Intel® Optane™ DC persistent memory redefines traditional architectures, offering large capacity and a new memory tier at an affordable cost. Java is a popular language of choice for Cloud and Enterprise workloads. Persistent memory provides design flexibility and significant performance opportunities in Java applications that need to store data persistently. Intel developed the Low-Level Persistence Library (LLPL) for flexible, high-performance access to persistent memory from Java. It has an easy-to-use, idiomatic Java API and is suitable for direct use or as a base for higher-level abstractions.

What You Will Learn

  • A brief overview of different ways to access persistent memory from Java today.
  • An in-depth exploration of LLPL.
  • A walkthrough of Java code samples to demonstrate the power of using LLPL and direct memory programming for the storage of persistent data.
  • For a basic understanding of persistent memory concepts and features of the PMDK, visit the Intel® Developer Zone at Persistent Memory.

Transcript

Welcome, everyone. Thank you for taking the time to attend this webinar titled Java Programming with Persistent Memory. My name is Olasoji Denloye. I'm a Software Engineer in the Developer Software Engineering Division at Intel, and I'll be the moderator for today's session.
Persistent memory is a new memory technology that has both memory and storage attributes. It is fast like DRAM and has high capacity like storage, and it's being used to accelerate the data center today. Java continues to be a popular language for data center applications such as databases. Today's webinar will touch on different ways of accessing persistent memory from Java, and provide an in-depth exploration of an open source library developed by Intel that offers high performance access to persistent memory from Java.
Our presenter today is my colleague, Steve Dohrmann. Steve is a Senior Staff Software Engineer at Intel. He has worked on various Java projects over the past 20 years, and is passionate about Java and its future. His Java development work includes the Java Media Framework, the Cryptography Framework, embedded pilot programming for Java, and currently enabling Java to use persistent memory.
Before we begin, I would like to remind everyone that this webinar will be available on demand after this live session, and you can access it via the same link. Also, if you have any questions at any time during the presentation, you can submit them using the Ask a Question tab on your screen. We plan to leave some time at the end for answering your questions.
And now I'll hand over to Steve to begin. Steve?
OK, thanks, Soji. Hi, everybody. Thanks for coming. I'll jump right into an overview. This is a list of five available paths to programming persistent memory with Java. The first three are Intel-developed open source libraries. The first one is the Low-Level Persistence Library, LLPL. We'll be talking about that for most of this session. The second one is a key-value store library called pmemkv. It's one of the libraries in the Persistent Memory Development Kit suite, and it offers bindings to multiple languages, including Java. So, that's a nice easy-to-use library. The third one in that list is another library we developed called Persistent Collections for Java. This is a high-level library offering persistent collection classes and other persistent classes, as well as things like automatic memory management. This is in an experimental state right now, but it, like the other two, is available on GitHub. And then there are two OpenJDK enhancements, both available in the current JDK 14 release. The first one, Persistent MappedByteBuffer, is just a small change to the MappedByteBuffer API, making it suitable to have MappedByteBuffers backed by persistent memory. And then the second one in that list is called the Foreign-Memory Access API. This is part of a new package coming in JDK 14, Java.Foreign, and it allows you to make foreign library calls without writing any C code, without writing JNI code, so a pure Java foreign function interface, and then the base level of this thing is this Memory Access API, which lets you write Java code for off-heap programming, including off-heap persistent memory programming.
Here's a high-level table comparing these five offerings. I'm not going to go into the details here. We're going to be talking mostly today about LLPL on the left. I'll just highlight the two biggest differences between these five things. The first is the minimum JDK version they require to run. You'll see the first three need JDK 8 or later, and the new things, of course, require JDK 14. The second big difference is the programming level, high-level versus low-level. The ones on the ends, LLPL and the Memory Access API, offer low-level programming with pointer-like things and the ability to create pointers that link structures, and the three in the middle are a little bit higher-level, and that's one of the big differences between these five.
So, let's jump into a deeper look at the LLPL library. As pointed out, it'll run on anything JDK 8 or later, and it's at Version 1.0 right now, released at the end of last year, and we're working on Version 1.1. It's a component of the PMDK, and in fact, it uses two of the libraries in the PMDK suite, libpmem and libpmemobj. You can build LLPL with either Maven or Make, and we're currently working on making LLPL available on Maven Central. When we tried to write down the goals of LLPL, it came to these three. The primary goal was to provide very high performance access to persistent memory from Java, and this is low-level access, which gives you a lot of flexibility in how to use it. In particular, you can use it as a direct programming API, as a library, or you can use LLPL as a base for some higher-level abstractions you want to create. And despite its low-level nature, our goal was to make the APIs idiomatic Java. So, we'll see those in the code and hopefully you'll see them as being easy to use despite being low-level APIs.
So, there are three primary elements in LLPL: heaps, memory blocks, and transactions. Now, a heap is just a pool of persistent memory and an allocator for it. You can create as many heaps as you want, of almost any size. Multi-terabyte heaps are doable. And, of course, these heaps are persistent, so we need to be able to get back to them after a restart, and you can just reopen these heaps. We'll see that in code. Unlike in Java, memory management for this persistent memory is manual. There's a pair of methods, allocate and free, that are used, and the heap API itself is thread safe. So, what you allocate is a block of memory, and you get back a MemoryBlock object that is the accessor API for that allocated block.
This MemoryBlock API consists of low-level setters and getters. You describe what bytes you want to read or write using a zero-based offset from the beginning of the MemoryBlock. You can refer to these allocated blocks long-term, and a reference to one block can be written into other memory blocks, so you can link blocks together and create reference-based data structures. These references come in the form of a Java long value called a handle, and these things are stable for the life of the heap. The MemoryBlock API itself does no locking or other thread confinement. The developer has both the responsibility and the freedom to create whatever concurrency scheme they find both safe and performant.
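To make the handle idea concrete, here is a minimal sketch. It is not LLPL: it is a toy heap over an ordinary in-memory ByteBuffer, where a handle is simply a block's starting offset. The names ToyBlockHeap, allocate, setLong, and getLong are hypothetical, and nothing here is persistent. It only shows how writing one block's handle into another block links the two.

```java
import java.nio.ByteBuffer;

// Hypothetical toy, not the LLPL API: one ByteBuffer, a bump allocator,
// and long handles that are plain offsets into the buffer.
class ToyBlockHeap {
    private final ByteBuffer memory;
    private int next = 8; // offset 0 is never handed out, so handle 0 can mean "no block"

    ToyBlockHeap(int capacity) {
        memory = ByteBuffer.allocate(capacity);
    }

    // Returns a handle: here simply the block's starting offset in the buffer.
    long allocate(int size) {
        if (next + size > memory.capacity()) {
            throw new OutOfMemoryError("toy heap full");
        }
        long handle = next;
        next += size;
        return handle;
    }

    // Offsets are zero-based from the start of the block named by the handle.
    void setLong(long handle, int offset, long value) {
        memory.putLong((int) handle + offset, value);
    }

    long getLong(long handle, int offset) {
        return memory.getLong((int) handle + offset);
    }

    public static void main(String[] args) {
        ToyBlockHeap heap = new ToyBlockHeap(1024);
        long a = heap.allocate(256);
        long b = heap.allocate(256);
        heap.setLong(b, 0, 111L);            // payload stored in block B
        heap.setLong(a, 16, b);              // link: B's handle written into block A
        long followed = heap.getLong(a, 16); // read the handle back out of A
        System.out.println(heap.getLong(followed, 0));
    }
}
```

Because the handle is an ordinary long, it can be stored anywhere a long can be stored, which is what makes reference-based structures such as lists and trees possible.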
The third component is transactions, and these allow you to do fail-safe writes, and to group these writes together into an atomic aggregate that will execute all together or act like it did not execute at all. A single thread participates in one transaction, and you can nest these things in any way you want to. They will, however, behave as one flattened transaction where the outermost transaction body creates the container, and all of the inner bodies will commit or abort together. You'll see in code that this produces a very easy-to-use implementation of a transaction, and we've integrated this transaction API carefully with Java's exception handling and with idiomatic things like lambdas. We use lambdas for transaction bodies.
Here's a high-level block diagram showing the Java heap on the left-hand side in DRAM, and a couple of LLPL heaps on the right-hand side in persistent memory, and what we do with this LLPL library is we create control objects that are regular Java objects and reside in the Java heap on the left there, and they are very thin objects. They basically have a pointer that points to a place in persistent memory, and that API inside of the Java object controls whatever it's pointing to. So, we have a couple of heap objects on the left, and we have some MemoryBlock objects pointing to memory blocks on the right. You only need to have one of these regular Java objects, these control objects, present if you're actively accessing that particular piece of persistent memory, and you can have, as you can see here, multiple heaps being accessed by the same Java process. The heaps remain separate, and in particular, the memory blocks themselves are heap specific, but you can copy memory between LLPL heaps if you want to.
When the Java process exits, of course, the Java heap and all of those control objects go away, but all of the persistent memory state remains in place and all the relationships between blocks of memory that you may have created remain in place. For example, on the right in LLPL Heap 1 we show arrows connecting MemoryBlock A to MemoryBlock B. That's basically writing one of those handle values into MemoryBlock A referring to MemoryBlock B, and that all stays persistent even after the Java process exits.
Here's a snapshot of the MemoryBlock API showing its low-level nature. We have setters and getters for Java integer scalar types, and then there are bulk operations to do memory copies, either between persistent blocks of memory or between a persistent block of memory and a Java byte array. There are a couple of other supporting methods on MemoryBlock, one to free the MemoryBlock when you want to deallocate it, and a getter to get the handle, essentially the numeric name for that MemoryBlock. We will see the use of these in code soon.
So, since we're trying to depend on this memory now, possibly long-term, using its persistence, we want to be able to implement a data integrity and consistency scheme that's suitable for our application. This will vary quite a bit from application to application, but specifically, as application developers, we want to be able to say something clear about the usability of the heap data after either an expected event, like a regular process exit, or an unexpected event, like a crash or a power failure. In support of building these kinds of policies, we will be talking about and showing in code two kinds of writes that you can do to modify persistent memory, either a durable write or a transactional write. If you use durable writes, the basis of the policy is that if those writes don't get interrupted, then when you reopen your heap, everything will be intact and usable. But sometimes you need stronger resilience to unexpected events, like a crash, so you can use transactional writes, and these give you a stronger guarantee. They say that even if you have a crash or a power failure, the consistency and the integrity of the writes that you did will be present after those events. We'll describe that in code too, but it lets you keep a sound and usable heap even in the face of events like a crash or power failure. LLPL offers these two kinds of writes and some other tools so that you can build such data integrity and consistency policies, either simple ones or very customized ones.
So, we talked about durable writes and transactional writes. Along with these two kinds of writes that are core to persistent memory programming, there are a couple of errors that surface, and we want to be able to take steps to mitigate or eliminate those errors. So, I'm going to add a third write in that top list called a volatile write, just to show that it's a component of the other two. When we do a regular write to DRAM, let's say we're setting a Java object field, we write the data, and it goes into the CPU cache, and it's made visible to other threads on the system, and our logic is served by that behavior, but it doesn't necessarily make it all the way into the DRAM memory module. When it gets flushed out to the modules depends on when space is needed in the cache. Persistent memory behaves exactly the same way in that sense. When we do a write, it starts with essentially a volatile write: the data goes into the CPU cache, but it isn't necessarily flushed all the way out to the persistent memory DIMMs, so we have to do that ourselves. At some point, to get a durable write, we have to flush it, so we know it's made it all the way to the persistent media.
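The write-then-flush pattern can be tried without persistent memory hardware. The following is a sketch using the standard MappedByteBuffer over an ordinary temporary file, so force() here flushes to a regular storage device, not to persistent memory DIMMs. On a DAX file system with JDK 14 or later, the Persistent MappedByteBuffer enhancement adds the map mode ExtendedMapMode.READ_WRITE_SYNC for this purpose; it is not used below because it needs real persistent memory. DurableWriteSketch and its method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch only: a regular file stands in for a persistent memory pool.
class DurableWriteSketch {

    // Writes a long at offset 0, then flushes it to the backing file.
    static void durableWrite(Path file, long value) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_WRITE, 0, 64);
            buf.putLong(0, value); // step 1: the "volatile" write, not yet guaranteed durable
            buf.force();           // step 2: the flush that makes the write durable
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Maps the file again and reads the long at offset 0.
    static long readBack(Path file) {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            return ch.map(FileChannel.MapMode.READ_ONLY, 0, 64).getLong(0);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience wrapper: write to a fresh temporary file and read the value back.
    static long roundTrip(long value) {
        try {
            Path file = Files.createTempFile("durable-write", ".bin");
            file.toFile().deleteOnExit();
            durableWrite(file, value);
            return readBack(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(12345L));
    }
}
```

Leaving out the force() call would usually still appear to work in a test like this, which is exactly why a forgotten flush is hard to catch by testing.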
For a transactional write, we do those two steps, but we do a first step before we write the data, and that is to tell a transaction what data we're about to modify. Specifically, we tell it the range of bytes we're going to modify, and what it'll do is create a backup of the data we're about to change in an undo buffer, and it can use that in the event that a transaction is interrupted to restore the data to its original state, its pre-transaction state, giving us this guarantee of consistency.
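The backup-then-write order can be sketched in plain Java. This is an illustration under loud assumptions: the data lives in an ordinary ByteBuffer, the undo buffer is an in-memory stack, and an "interruption" is simulated by an exception. A real implementation keeps the undo data in persistent memory so it can roll back when the heap is reopened after a crash. UndoLogSketch and its methods are hypothetical names, with addToTransaction borrowed from the talk only as a label.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of an undo buffer; not how LLPL or PMDK is implemented.
class UndoLogSketch {
    private static final class Saved {
        final int offset;
        final byte[] bytes;
        Saved(int offset, byte[] bytes) { this.offset = offset; this.bytes = bytes; }
    }

    private final ByteBuffer data = ByteBuffer.allocate(64);
    private final Deque<Saved> undoLog = new ArrayDeque<>();

    // Step 1: back up the byte range we are about to overwrite.
    void addToTransaction(int offset, int length) {
        byte[] copy = new byte[length];
        for (int i = 0; i < length; i++) {
            copy[i] = data.get(offset + i);
        }
        undoLog.push(new Saved(offset, copy));
    }

    // Step 2: the actual write (no automatic backup here, by design).
    void setLong(int offset, long value) { data.putLong(offset, value); }

    long getLong(int offset) { return data.getLong(offset); }

    // Runs the body; if it throws, restores every saved range, newest first, and rethrows.
    void transaction(Runnable body) {
        undoLog.clear();
        try {
            body.run();
        } catch (RuntimeException e) {
            while (!undoLog.isEmpty()) {
                Saved s = undoLog.pop();
                for (int i = 0; i < s.bytes.length; i++) {
                    data.put(s.offset + i, s.bytes[i]);
                }
            }
            throw e;
        } finally {
            undoLog.clear(); // on success the backups are simply discarded
        }
    }
}
```

Note that if a caller skips addToTransaction before setLong, the write still happens but cannot be rolled back, which is the second programming error described next.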
With those two kinds of writes, specifically those steps in bold, come two possible programming errors. One is you just forget to flush a durable write, and the second is you forget to add a range to the transaction before you actually overwrite data. These are hard to test out of your code. They’re a little bit like race conditions in that sense. So, here is what LLPL has done to help with these errors. The general heap, which we'll show in code, gives you the flexibility you might want to create an arbitrary consistency scheme, but you do the flushes yourself and you do these “add to transaction range” calls yourself, so you can forget to do them and those bugs could be present. If that’s unacceptable, then you can, for durable writes, use a PersistentHeap and a corresponding MemoryBlock, and that heap guarantees that the writes you do are flushed. So, if your code compiles, you know that you didn't forget to flush because it's been done for you. For the stronger consistency guarantees of transactional writes, you can use a TransactionalHeap, and that gives you the guarantee that if your code compiles, all of your transactional writes were actually properly added to the transaction before the data was overwritten, so you can't forget to do that. You lose some flexibility using the PersistentHeap and TransactionalHeap, and potentially some performance, but you can choose between these three and have it match the data consistency and integrity policy you want to implement.
OK, before we go on to walk through some code, I wanted to give a motivating example. This is a collaboration we did with the open source community for the Cassandra database. We implemented a persistent memory storage engine for Cassandra, and the design of that is shown on this slide. This design actually came from a Cassandra PMC member, and you can see a box called Storage Engine. That's an implementation of an interface, a pluggable storage engine interface, that's being developed for Cassandra, and the front end of Cassandra is shown in the two boxes at the top. That doesn't change at all. This plugs into the back end storage of the database. And it's an interesting design. It's common in databases to shard data to give high concurrency access, and that's what this is. You have queues that receive work for whatever shard of data they're responsible for. This is all on one node in one storage engine. In fact, this is just one table in the database, and then a single thread behind each of those queues owns the data for that shard and it will do the reads and the writes to persistent data structures below there, shown in the triangles. Each one of those triangles is an implementation of a tree data structure, an adaptive radix tree. You see the link to the paper at the bottom. We implemented that adaptive radix tree from that paper. In fact, we implemented all of the persistent memory data structures using LLPL.
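The queue-per-shard, single-owner-thread idea can be sketched with standard Java executors. This is only an illustration of the concurrency scheme, assuming a plain HashMap as a stand-in for the persistent adaptive radix tree; it is not the storage engine's code, and ShardedStoreSketch and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: each shard has one work queue and one owning thread.
class ShardedStoreSketch implements AutoCloseable {
    private final List<ExecutorService> workers = new ArrayList<>();
    private final List<Map<String, String>> shards = new ArrayList<>();

    ShardedStoreSketch(int shardCount) {
        for (int i = 0; i < shardCount; i++) {
            // A single-thread executor is a queue with exactly one consumer thread.
            workers.add(Executors.newSingleThreadExecutor(r -> {
                Thread t = new Thread(r);
                t.setDaemon(true);
                return t;
            }));
            // Unsynchronized on purpose: only the shard's own thread ever touches it.
            shards.add(new HashMap<>());
        }
    }

    private int shardOf(String key) {
        return Math.floorMod(key.hashCode(), shards.size());
    }

    void put(String key, String value) {
        int s = shardOf(key);
        Callable<String> task = () -> shards.get(s).put(key, value);
        await(workers.get(s).submit(task));
    }

    String get(String key) {
        int s = shardOf(key);
        Callable<String> task = () -> shards.get(s).get(key);
        return await(workers.get(s).submit(task));
    }

    private static <T> T await(Future<T> future) {
        try {
            return future.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    @Override
    public void close() {
        for (ExecutorService w : workers) {
            w.shutdown();
        }
    }
}
```

Since all operations on a shard are serialized through its queue, the data structure behind it needs no locks, which fits the MemoryBlock API's choice to do no locking of its own.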
So, this worked out pretty well, and on the next slide, I'll show some results and just give a brief description of where the speed-ups came from. So, in the gray bar, we're showing the baseline Cassandra running with four very fast NVMe SSDs, and the blue is the persistent memory storage engine. We saw good speed-ups across the board for both reads and writes, and mixed workload, and the big speed-ups actually came from primarily two things. For example, in the read case, the number of instructions required to do a read operation dropped dramatically when we went from the block-based, disk-based storage in the baseline of Cassandra to the persistent memory storage engine based on these in-memory data structures. In fact, it dropped by a factor of three, so we start with a third of the work to do in order to do a query.
The second metric that got much better was the concurrency. By using this direct memory access and designing a specialized concurrency scheme for the data structures, the sharded scheme, we were able to apply many more threads in the same instance of Cassandra to perform reads, while staying within our SLA for the database. So, we actually got 2.3x better CPU utilization when using the memory-based data structures, and the product of those two numbers, three times 2.3, gives us about 7x of the up to 8x speed-up, and there were other improvements that made up the rest of it. So, we feel really good. It's just a brute-force way, using persistent memory and memory-based data structures, with pointers between nodes in a tree, to get big speed-ups in a very optimized database. Cassandra has been around a long time, and it’s very optimized already.
OK, we'll walk through about four, or maybe we can get to five, of these code samples, and I'd like to point out that in the LLPL repository on GitHub, there's an examples directory with some more complete examples, including, for example, an implementation of that adaptive radix tree using LLPL. All the things we've talked about so far, and all the code we're going to talk about, there are links to that in the last slide of this presentation.
So, I'm going to switch now to the code, and we should be seeing the first example, called Getting Started. This will show in code the basics of what we talked about in the first few slides there, and in fact, if we look at the first, say, 36 lines, we'll see essentially all the components of persistent memory programming. Well, let's walk through it.
In line 10, we see a string being created. This is a path. So, we're going to create an LLPL heap, and we name heaps with a path. The first part of that path, the Pmem part, is a base location where persistent memory was made available in a special kind of file system for persistent memory. It's actually a regular file system, but it has an option on it called DAX, Direct Access, and this is supported by recent kernels. It lets you refer to persistent memory pools with a convenient path, and use utilities with the heaps associated with this, but once you've set this up, we'll see that there's actually no file system involvement at all when we're doing reading and writing. There's no file system cache, and there's no kernel involvement when the reads and writes are happening. It's essentially just move instructions in code. That's where the speed comes from. OK, so let's walk through the rest of this.
Hey Steve. We can't actually see what you're sharing. Sorry to interrupt.
Oh, I'm sorry, I didn't switch. Thank you. I have to switch the… Sorry.
OK, that should work.
There we go. Thank you, Soji. Yes, sorry, line 10 is the string path I talked about here. By the way, we're just importing three classes in lines three, four, and five: Heap, MemoryBlock, and Transaction, the three we talked about. 
So, we're going to check and see if the path already exists in this file system, and if it does, then we're going to open the heap, but for this first run, it won't exist yet, and so in line 12, we're going to take the second part of the conditional there, and we'll call a static method on Heap called createHeap. It takes two parameters, the path we described there and then a Java long value, which is the number of bytes we want for the heap. We're not initialized yet, so in line 14 we'll go into this first block, and the first thing we're going to do is allocate a MemoryBlock from this heap. We do that in line 16 by calling an instance method on our heap called allocateMemoryBlock and giving it a size of 256 in the first parameter. The second parameter just says we're going to do a thread-safe atomic allocation here. We're not going to need a transaction. So, that's a Boolean with which you say whether you want a transactional allocation or not.
Now we have a block of memory and an accessor for it, and what we're going to do before we go further is something specific to persistent memory: we're going to make a kind of bookmark. We're going to call this block our starting data structure, our root data structure, and we're going to make a bookmark to it so we can get back to this root data structure when we restart our application. To do that, we get the handle to that block. I talked about that Java long; that's the numeric name for a block. We get it with the block.handle accessor call there in line 17, and we call heap.setRoot, passing that eight-byte Java long. We'll see later how we use that to get the MemoryBlock back.
OK, so we'll go ahead and do one of those durable writes we described, and in line 20, we're going to call block.setLong. This will set eight bytes, and the first parameter there is the offset within the block at which we want to set the long value, and so we'll write 12345 at offset zero, and now we need to flush that to make it durable. So, we call block.flush, and that takes two arguments as well. The first is an offset within the block to start flushing and the second is a byte count, which is how many bytes we want to flush, so eight in this case. And then we'll give a simple example of a transactional write in the next few lines.
In line 25, we're going to start a transaction by calling a static method on the Transaction class, create, and it takes two parameters. The first is the heap, since each transaction is heap specific, and the second is the body of the transaction, which in this case is a Java Runnable. Within the body, we're going to do a write, but because this is transactional, before we actually change any values we're going to tell the transaction about the range of bytes we're going to change. So, we use the block.addToTransaction call, and again, the first argument is the offset at which we're going to do the writing. Eight, let's say, we're picking that, and we're going to change eight bytes, so we tell it how many bytes to add to the transaction. That will do a backup of the original data, and now we can go ahead and do our write in line 27, setLong at offset eight, and we'll give it a different value, 23456.
So, we've created a heap, allocated a block, done a durable write and done a transactional write. We're going to do one more operation, and what we're going to do is create another block, and we're going to make a link between the two blocks. So, in line 32, we'll call our allocateMemoryBlock call again, again 256 bytes, let's say, and let's write a value into that new block in line 33. At offset zero within the new block, we'll write the value 111, and now we're going to flush, in this case, four bytes. We wrote an int. A Java int is four bytes, so we'll flush that, giving the offset zero and a byte count of four. Then in line 35, we're going to link the two blocks by writing a handle to the new block into the original block. So, let's write that handle at offset 16 in the original block. What we're going to write is a Java long value, but it's not an arithmetic value. It's a handle value, and we get that by calling handle on the new block. Now we've linked the two, we have to flush that long write, so we have another flush call on line 36 starting at offset 16 for eight bytes.
So, now we've got two blocks, and we've linked them together, and this is where this particular run would exit. So, when we restart our application, we'll come back up to line 11, and that heap path will exist because we did the create, and so we'll go into the first part of the conditional in line 12, and just reopen the heap by passing the path argument to the static openHeap method. And then we'll drop down to the else at line 38 and begin executing, and all we're going to do here is read all those values back and assert that the values are correct.
I'm going to scroll a little here, and in line 40 this is where we're bootstrapping ourselves back into the first MemoryBlock. We got back to the heap because it uses a path as a name, and that's a statically named thing in our code. We don't have that ability built into Java for other things, and that's what this root location in the heap gives you, this bootstrap ability. So, we'll call heap.getRoot, which gives us the handle to our original block back, and now we can dereference that handle into a MemoryBlock object in line 41 by calling an instance method on our heap, memoryBlockFromHandle. You pass a handle and you get back a memory block.
Now we're back to where we were, and we have an accessor for that first block, and we'll do a read on the two longs we wrote to that block in lines 42 and 43, and we're going to go get our other block back, the second block we created, and this is another call to memoryBlockFromHandle, but we're going to retrieve the handle by reading a handle from the first block. If you remember, we wrote that handle at offset 16 within the first block, so in line 44, we get the handle by reading that long there, pass that to the heap, and now we have a MemoryBlock object for the second block we created. We'll go ahead and read a value we wrote at offset zero there and assert that all three of the values we wrote are, in fact, showing up as correct.
If you were done with these blocks, you could deallocate them. In the normal case you might not do that as soon as you read them the first time, but just to show how it's done: in line 53, you can deallocate a block by just calling free on the block object, and the false there is the same Boolean we used on the allocate call, saying whether you want this free to be transactional. We're not using that in this case, and since we deallocated our original block, whose handle we had set into the root, we will reset the root to zero, which is an invalid handle value.
OK, so that’s the basics of all there is to persistent memory programming in the core operations.
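The two runs walked through above can be imitated with standard Java only. The sketch below is a hypothetical toy, not LLPL: it maps an ordinary temporary file, keeps a root slot and a bump allocator in a small header, and uses byte offsets as handles. Method names loosely echo the talk, but ToyFileHeap, its header layout, and the firstRun, secondRun, and demo helpers are all assumptions made for illustration. The transactional write at offset 8 is done here as a plain write, and one flush at the end stands in for the per-write flushes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical toy, NOT the LLPL API. Handles are byte offsets into the mapping.
class ToyFileHeap {
    private static final int ROOT = 0;    // 8 bytes: handle of the root block (0 = none)
    private static final int NEXT = 8;    // 8 bytes: next free offset
    private static final int HEADER = 16; // first allocatable offset
    private final MappedByteBuffer mem;

    private ToyFileHeap(MappedByteBuffer mem) { this.mem = mem; }

    private static MappedByteBuffer map(Path path, long size) {
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.CREATE,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            return ch.map(FileChannel.MapMode.READ_WRITE, 0, size);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static ToyFileHeap create(Path path, long size) {
        ToyFileHeap heap = new ToyFileHeap(map(path, size));
        heap.mem.putLong(ROOT, 0L);
        heap.mem.putLong(NEXT, HEADER);
        heap.mem.force();
        return heap;
    }

    static ToyFileHeap open(Path path) {
        try {
            return new ToyFileHeap(map(path, Files.size(path)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    long allocate(int size) {
        long handle = mem.getLong(NEXT);
        if (handle + size > mem.capacity()) throw new OutOfMemoryError("toy heap full");
        mem.putLong(NEXT, handle + size);
        return handle;
    }

    void setRoot(long handle) { mem.putLong(ROOT, handle); }
    long getRoot() { return mem.getLong(ROOT); }
    void setLong(long handle, int offset, long v) { mem.putLong((int) handle + offset, v); }
    long getLong(long handle, int offset) { return mem.getLong((int) handle + offset); }
    void setInt(long handle, int offset, int v) { mem.putInt((int) handle + offset, v); }
    int getInt(long handle, int offset) { return mem.getInt((int) handle + offset); }
    void flush() { mem.force(); }

    // First run: create the heap, bookmark a root block, write values, link a second block.
    static void firstRun(Path path) {
        ToyFileHeap heap = create(path, 4096);
        long block = heap.allocate(256);
        heap.setRoot(block);
        heap.setLong(block, 0, 12345L);
        heap.setLong(block, 8, 23456L);
        long other = heap.allocate(256);
        heap.setInt(other, 0, 111);
        heap.setLong(block, 16, other); // the link: second block's handle stored in the first
        heap.flush();
    }

    // Second run: reopen, go root -> first block -> linked block, and read everything back.
    static long[] secondRun(Path path) {
        ToyFileHeap heap = open(path);
        long block = heap.getRoot();
        long other = heap.getLong(block, 16);
        return new long[] { heap.getLong(block, 0), heap.getLong(block, 8), heap.getInt(other, 0) };
    }

    static long[] demo() {
        try {
            Path path = Files.createTempFile("toy-heap", ".bin");
            path.toFile().deleteOnExit();
            firstRun(path);
            return secondRun(path);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The point to notice is that secondRun starts with nothing but the path: the root slot gets it to the first block, and the handle stored at offset 16 gets it to the second.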
The second example I want to walk through just quickly shows that there are ways to control the sizing of heaps. Normally in a volatile programming environment we have one heap, it's owned by the kernel or the system, and applications compete for that DRAM, that volatile memory, as a resource. But in persistent applications, you want to have memory that's owned by the application, so it's both private and it can be retrieved. So, we have multiple heaps, and we have to manage the size of those as they compete for what is also a shared resource, persistent memory.
OK, so what we're seeing in line 14 is what we just did in the previous example. We're creating a fixed size heap. You pass the createHeap call a path, which in this case is a file name, and a long argument, which is the number of bytes you want reserved for this heap. So, you'll get 100 megabytes reserved for that particular heap, and no one else can get at those bytes. Sometimes you don't know how big a heap should be, and you need to be able to have it grow.
So, in the second case, in line 18, we're going to create another path, but this is not a file path. It's a directory path, and we'll go ahead in line 19 and create the directories if they don't exist, and we pass that path to the same call, createHeap, but this is an overloaded version that doesn't take a long and it will interpret this path as a directory and it will use that directory as the basis for creating a growable heap. It'll start at the minimum heap size, which is currently eight megabytes, and it'll grow in 128-megabyte chunks, as needed, as you allocate until memory is no longer available for allocation. You'll get an out of memory error then.
So, the third example is a combination of the first two. When you want a growable heap, but you don't want it to grow without limit, you want to put a limit on it, we create a directory path and pass that same directory path in line 26 to the createHeap, but now we also do give it a size. The size here is interpreted as a max size. So, it'll grow up to one gigabyte and then, if you continue to allocate, it'll throw an out of memory error.
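The three sizing policies (fixed, growable, growable with a limit) can be modeled with a few lines of bookkeeping. This is only a sketch of the policy as described in the talk, assuming simple byte counters in place of real persistent memory; GrowthPolicySketch and its factory methods are hypothetical names, and the unlimited variant here never fails, whereas a real heap fails when the device runs out of memory.

```java
// Hypothetical model of heap sizing policies; it tracks numbers, not memory.
class GrowthPolicySketch {
    private final long chunk;   // growth step in bytes (0 = never grow)
    private final long maxSize; // upper limit in bytes
    private long reserved;      // bytes currently reserved for the heap
    private long used;          // bytes handed out so far

    private GrowthPolicySketch(long initial, long chunk, long maxSize) {
        this.reserved = initial;
        this.chunk = chunk;
        this.maxSize = maxSize;
    }

    // Fixed size: everything is reserved up front and the heap never grows.
    static GrowthPolicySketch fixed(long size) {
        return new GrowthPolicySketch(size, 0L, size);
    }

    // Growable: starts small and grows in chunks; no limit is modeled here.
    static GrowthPolicySketch growable(long initial, long chunk) {
        return new GrowthPolicySketch(initial, chunk, Long.MAX_VALUE);
    }

    // Growable with a limit: grows in chunks but never beyond maxSize.
    static GrowthPolicySketch growableWithLimit(long initial, long chunk, long maxSize) {
        return new GrowthPolicySketch(initial, chunk, maxSize);
    }

    // Returns the offset of the new allocation, growing the reservation if needed.
    long allocate(long size) {
        while (used + size > reserved) {
            if (chunk == 0L || reserved >= maxSize) {
                throw new OutOfMemoryError("heap limit reached");
            }
            reserved = Math.min(reserved + chunk, maxSize);
        }
        long offset = used;
        used += size;
        return offset;
    }

    long reserved() { return reserved; }
}
```

With the numbers from the talk, a limited heap would be built as growableWithLimit(8 MB, 128 MB, 1 GB): it reserves little at first and throws only once the one-gigabyte ceiling is reached.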
There are two more advanced ways to manage heap sizes. Sometimes you don't have to have a file system mounted on a persistent memory device. You can use the device directly, and LLPL supports that. And sometimes you have more than one file system mounted for persistent memory use, and you want to join them together to make one heap out of both of those, and you can do that in LLPL too, shown in these commented-out examples.
So, let me move on to the third sample code here. It's showing how to use those other heaps we talked about. So far, we've just been showing code for the general purpose heap, and I'm going to show essentially the same Getting Started code for the TransactionalHeap and the PersistentHeap just to show the differences.
So, we're going to start out with almost the same code in line 13. We have a file path, and we'll see if we've already initialized that path and created a heap for it. If we have, we will open the heap, and if we haven't, we'll call createHeap. These are the same arguments, almost the same signatures; it's just the class name that has changed. We're using TransactionalHeap instead of Heap. If we're not initialized, we'll go ahead and do a transaction and do some writes. In this case, however, we're going to start a transaction in line 20, and hand it a body as a Runnable, but in line 21, we're going to do our allocation inside of the transaction. This is really common, and the reason for this is that you want to have consistent heap state if things get interrupted, so we want this whole body to happen together. So, in line 21, we allocate a MemoryBlock, and then in 22, we're going to do this setRoot call again. This looks almost exactly the same, but you'll notice that if we were interrupted between line 21 and 22, let's say we happened to get a power failure right at that instant, we would leak the block because we haven't managed to put our bookmark to the block in the root yet. But because this is inside the transaction, that allocation will roll back and there won't be any memory leak. So, this is a common idiom, to put your allocations and your initialization into a transaction.
We'll go ahead in line 23 and make a set call. This is the same long set we did at offset zero, but notice we didn't have to do an “add to transaction” here. That's done automatically because we're using the TransactionalMemoryBlock.
And then, if we look at the read code that we would run in the second run of this program, it's exactly the same calls except we're using TransactionalMemoryBlock. In a similar way, the use of the PersistentHeap is really similar to using the general heap. We'll jump all the way down to line 45, where we're running the first time through and using PersistentMemoryBlock and PersistentHeap. We will allocate the MemoryBlock and call setRoot there. We do a setLong in line 47, but we don't have to call flush. This is done automatically, and then the read code in lines 51 through 56 is the same, except again, we're using PersistentMemoryBlock and the associated PersistentHeap. So, if your code compiles, you know you didn't forget to flush with that code, and above, if your code compiles, you know you didn't forget to add to a transaction. Otherwise, mostly all the programming between the three heaps is the same.
OK, so I'm going to step through a few more examples of how transactions work, and if we start in line 13, we're going to create one TransactionalHeap here, and then in line 14, allocate one 1k block, and I'll just use those repeatedly for these little snippets underneath here.
Shown in line 17 is a simple transaction like we've been looking at, and we can think about this as a transaction that completes, and after the body exits, those writes are known to be present in the persistent memory. There's an alternate way to create transactions. That's shown in line 25. If you don't want to pass a body right there, but you want to have a transaction object that you can pass along as an argument in calls, you can use a different static method, also called create, that doesn't take a body. It just takes a heap and gives you a transaction object back.
And then in line 26, you can make a run call, an instance method run call on that transaction, and pass it a Runnable, in this case as the body. You'll notice after the first write in line 27, we're calling another method in our program here called otherWrites, and we're passing the transaction object and a MemoryBlock as arguments in that call. That is shown up at the top in line 8, and you can see otherWrites takes a transaction object and a MemoryBlock and just does a run call on the transaction, updating that transaction with an additional body. This is a nested transaction call, and being able to pass things like this is convenient for code that is factored, where a particular utility method will do, say, initialization of a header field, and the code is not all in the same place. So, we don't have a lexical nesting of transaction bodies here, but it conforms to a natural way you might want to write your code.
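The pattern of passing a transaction object into a helper method can be sketched as follows. This is a self-contained model, not LLPL code: ToyTransaction and its depth counter are assumptions for illustration, and otherWrites here is a hypothetical helper named after the method described above. The point it shows is that a nested run call joins the enclosing transaction, so there is a single commit when the outermost body exits.

```java
import java.util.ArrayList;
import java.util.List;

// Toy transaction object; NOT the LLPL Transaction class.
class ToyTransaction {
    private int depth = 0;
    private int commits = 0;

    // Runs body as part of this transaction; only the outermost call commits.
    void run(Runnable body) {
        depth++;
        try {
            body.run();
        } finally {
            depth--;
        }
        if (depth == 0) commits++;
    }

    int commits() { return commits; }
}

class NestedTransactionDemo {
    // Hypothetical helper: adds its own body to the transaction it is handed.
    static void otherWrites(ToyTransaction tx, List<String> block) {
        tx.run(() -> block.add("header initialized"));
    }

    public static void main(String[] args) {
        ToyTransaction tx = new ToyTransaction();
        List<String> block = new ArrayList<>();
        tx.run(() -> {
            block.add("first write");
            otherWrites(tx, block);   // nested call joins the same transaction
        });
        // prints "[first write, header initialized], commits = 1"
        System.out.println(block + ", commits = " + tx.commits());
    }
}
```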
OK, so nothing's gone wrong with either of these transactions, and they commit, and all of their bodies' changes to persistent memory take effect. If something were to go wrong, and we want to simulate that here, we would actually cause it if we ran this program. I'm not running these programs today, but they are all runnable, and a link to all of this code is in the last slide in the slide deck that we showed.
So, here in this transaction example, we're going to see how the transaction can roll back. In line 37, we'll start by writing an initial value, 777, into our MemoryBlock at offset 100, and now I have a try-catch here just to instrument the behavior, the control flow of a transaction. Normally, you wouldn't have to put this try-catch around there. We start our transaction in line 39 where we call create, and the body we hand in has two writes. The first one is a setLong that overwrites the first write we did above at offset 100 with a new value, 888, and then in 41, we're going to do another write, of the value 234, to offset 10,000 in our block. This is beyond the bounds of our 1k allocated MemoryBlock, so this is going to throw an index out of bounds exception. What we're going to instrument here is the basic behavior of transactions in LLPL: if an uncaught exception is thrown from a transaction body, it will immediately abort the transaction and then rethrow the exception that caused the abort. So, in this case, this index out of bounds exception will be seen by the transaction object, that will cause an abort of this incomplete transaction, and then it will rethrow the index out of bounds exception, which we'll catch at line 43. And what we're going to do in our catch there is just prove that the transaction rolled back. So, in line 44, we will read the value at offset 100 and assert that it was, in fact, rolled back to the original 777 value, and that'll be true, of course, outside of the try-catch as well.
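The abort-and-rethrow behavior just described can be modeled in plain Java. The sketch below is not LLPL code and makes no claim about how LLPL implements rollback; ToyTxBlock and its undo log are assumptions for illustration, with a java.nio.ByteBuffer standing in for the 1k block. It uses the same values as the walkthrough: 777 at offset 100, then 888, then an out-of-bounds write at offset 10,000.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

// Toy 1 KB block with an undo log; a stand-in for a transactional block, NOT LLPL.
class ToyTxBlock {
    private final ByteBuffer bytes = ByteBuffer.allocate(1024);
    private final Deque<long[]> undoLog = new ArrayDeque<>(); // entries: {offset, oldValue}
    private boolean inTransaction = false;

    void setLong(int offset, long value) {
        long old = bytes.getLong(offset); // throws IndexOutOfBoundsException if out of range
        if (inTransaction) undoLog.push(new long[] {offset, old});
        bytes.putLong(offset, value);
    }

    long getLong(int offset) { return bytes.getLong(offset); }

    // Runs body; on an uncaught exception, restores old values newest-first and rethrows.
    void transaction(Runnable body) {
        inTransaction = true;
        try {
            body.run();
        } catch (RuntimeException e) {
            while (!undoLog.isEmpty()) {
                long[] entry = undoLog.pop();
                bytes.putLong((int) entry[0], entry[1]);
            }
            throw e;
        } finally {
            undoLog.clear();
            inTransaction = false;
        }
    }
}

class RollbackDemo {
    public static void main(String[] args) {
        ToyTxBlock block = new ToyTxBlock();
        block.setLong(100, 777);
        try {
            block.transaction(() -> {
                block.setLong(100, 888);
                block.setLong(10_000, 234); // beyond the 1 KB block
            });
        } catch (IndexOutOfBoundsException e) {
            // prints "after abort: 777"
            System.out.println("after abort: " + block.getLong(100));
        }
    }
}
```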
Sometimes, you want to be able to recover from exceptions and not have them cause a transaction to abort. That's easy to do, and it's shown in lines 52 through 62: just put a try-catch inside of the transaction body, catch any recoverable errors, and deal with them before the transaction can actually see them. So, we'll do a set in line 53, and then we're going to imagine we're parsing a string. This one's going to fail because it's a decimal-based string and we're asking it to parse an integer value. So, it'll throw a number format exception, but since we're catching that inside the transaction body, the transaction won't see it, and no abort will happen. We'll deal with it in the way we want to here by just assigning a degenerate value to this local variable, and then we'll write that in line 61 as our second write to the MemoryBlock. And we'll see in lines 65 and 66 that, in fact, both writes happened just fine. One had this funny degenerate value, because that's what we decided to write, but no transaction abort happened.
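Here is a small sketch of that recovery pattern in plain Java. ToyRunner is a made-up stand-in for a transaction, not an LLPL class, and the values 123 and -1 are assumptions chosen for the example. The only thing it demonstrates is control flow: an exception caught inside the body never reaches the code that would abort.

```java
// Toy transaction runner; NOT LLPL. It records an abort only if an exception escapes the body.
class ToyRunner {
    boolean aborted = false;

    void transaction(Runnable body) {
        try {
            body.run();
        } catch (RuntimeException e) {
            aborted = true;
            throw e;
        }
    }
}

class RecoverableErrorDemo {
    // Performs two writes; the second value comes from parsing text as an int.
    static long[] writeWithRecovery(ToyRunner runner, String text) {
        long[] block = new long[2];   // stand-in for a memory block
        runner.transaction(() -> {
            block[0] = 123;
            int parsed;
            try {
                parsed = Integer.parseInt(text);
            } catch (NumberFormatException e) {
                parsed = -1;          // degenerate value chosen for this sketch
            }
            block[1] = parsed;
        });
        return block;
    }

    public static void main(String[] args) {
        ToyRunner runner = new ToyRunner();
        long[] block = writeWithRecovery(runner, "12.5");
        // prints "123 -1 aborted=false"
        System.out.println(block[0] + " " + block[1] + " aborted=" + runner.aborted);
    }
}
```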
And this last quick example just shows that sometimes you want to execute some code only when a transaction commits, and other code only when it aborts. This is a common idiom for writing commit or abort handlers in LLPL: you deliberately put a try-catch around your transaction, catching Throwable as a kind of catch-all, and you know that if you see anything there that came out of the transaction body, the transaction aborted. So you can make a note of that, and then in line 77, execute whatever code you want to be your abort handler code. And then we have a “finally” that holds the commit handler code, but first it checks the flag we set, and if we got an abort, we won't execute the commit handler code; otherwise we will.
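The handler idiom is ordinary Java control flow, so it can be sketched without any library. In the sketch below, body.run() is a placeholder for the transaction call, and the method name runWithHandlers and the event strings are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class HandlerDemo {
    // Runs body as a stand-in for a transaction and records which handler ran.
    static List<String> runWithHandlers(Runnable body) {
        List<String> events = new ArrayList<>();
        boolean aborted = false;
        try {
            body.run();                          // in real code, the transaction call
        } catch (Throwable t) {
            aborted = true;                      // make a note that the transaction aborted
            events.add("abort handler");
        } finally {
            if (!aborted) events.add("commit handler");
        }
        return events;
    }

    public static void main(String[] args) {
        // prints "[commit handler]"
        System.out.println(runWithHandlers(() -> {}));
        // prints "[abort handler]"
        System.out.println(runWithHandlers(() -> { throw new IllegalStateException("boom"); }));
    }
}
```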
And lastly, in line 86: to maintain proper rollback of interrupted transactions, it's important that if multiple threads can access the same region of memory during transactional operations, you isolate the transaction code so that just one thread at a time runs it for the duration of the transaction. One simple way to do that in Java and in LLPL is just to put a synchronized block around the transaction, and in this case, we're going to lock on the block object since that's the thing that's actually shared.
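As a rough illustration of locking on the shared block, the sketch below has two threads perform a read-modify-write on the same buffer. A java.nio.ByteBuffer stands in for the memory block and the increment stands in for a transaction body; both are assumptions for the example, not LLPL code.

```java
import java.nio.ByteBuffer;

class SynchronizedTransactionDemo {
    // Stand-in for a transaction body: a read-modify-write on the shared block.
    static void incrementInTransaction(ByteBuffer block) {
        synchronized (block) {   // hold the lock for the whole "transaction"
            long current = block.getLong(0);
            block.putLong(0, current + 1);
        }
    }

    // Runs two threads that each increment the counter at offset 0.
    static long runTwoThreads(int incrementsPerThread) {
        ByteBuffer block = ByteBuffer.allocate(1024);
        Runnable work = () -> {
            for (int i = 0; i < incrementsPerThread; i++) incrementInTransaction(block);
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (block) {
            return block.getLong(0);
        }
    }

    public static void main(String[] args) {
        System.out.println(runTwoThreads(10_000)); // prints 20000
    }
}
```

Without the synchronized block, the two threads could interleave their reads and writes and lose increments.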
OK, I had one more example, but I don't think we're going to have time to walk through it. You can see this code on the website. When you do offset-based programming, it's very flexible, but it can get tedious and it's error prone, so there are various ways to abstract this offset-based programming and hide the offset arithmetic. One common way is to take a regular Java class, create some static fields that reference offsets within a block, and make the one instance field of this Java class a MemoryBlock, an LLPL MemoryBlock. Then you can write setters and getters that use the static offsets described above to can up this offset-based access and arithmetic. So, after you write this once, from the outside, the setters and getters, in this case just getters, look like normal Java calls and you aren't having to validate the correctness of offsets over and over.
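The accessor-class pattern might look something like the sketch below. It is not the example from the website: the Employee class, its fields, and its layout are hypothetical, and a java.nio.ByteBuffer stands in for the LLPL MemoryBlock so the code runs on its own.

```java
import java.nio.ByteBuffer;

// Hypothetical record type laid out in a block; ByteBuffer stands in for a MemoryBlock.
class Employee {
    private static final int ID_OFFSET = 0;      // long, 8 bytes
    private static final int AGE_OFFSET = 8;     // int, 4 bytes
    private static final int SALARY_OFFSET = 12; // long, 8 bytes
    static final int SIZE = 20;                  // total bytes used by one record

    private final ByteBuffer block;              // the single instance field

    Employee(ByteBuffer block, long id, int age, long salary) {
        this.block = block;
        block.putLong(ID_OFFSET, id);
        block.putInt(AGE_OFFSET, age);
        block.putLong(SALARY_OFFSET, salary);
    }

    // From the outside these look like normal Java getters; the offsets stay hidden.
    long getId() { return block.getLong(ID_OFFSET); }
    int getAge() { return block.getInt(AGE_OFFSET); }
    long getSalary() { return block.getLong(SALARY_OFFSET); }
}

class EmployeeDemo {
    public static void main(String[] args) {
        Employee e = new Employee(ByteBuffer.allocate(Employee.SIZE), 42L, 30, 100_000L);
        // prints "42 30 100000"
        System.out.println(e.getId() + " " + e.getAge() + " " + e.getSalary());
    }
}
```

Keeping the offsets as private constants means the layout is validated in one place, and callers cannot pass a wrong offset.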
OK, Soji, that's all I had for today. I'll hand it back to you.
Thank you, Steve, for that detailed presentation and sample code walkthrough. At this time, I'll open it up for questions.
As a reminder, this webinar is being recorded and will be available on demand after this live session. Also, remember to ask your questions using the Ask a Question tab.
There are a few questions that have already come in. The first question is about garbage collection. Does garbage collection work in persistent heaps? Will this be true in some libraries and not in others? I believe this is in reference to LLPL and PCJ, which you mentioned, and perhaps the Java feature that allows you to place the whole heap in persistent memory. So, does garbage collection work in persistent heaps?
So, in LLPL, all of the memory management for persistent heaps is manual. Of course, the garbage collection is working fine for the control objects, but when a control object, one of these MemoryBlock objects, on the Java heap gets collected, nothing happens on the persistent side. In PCJ, this experimental library we have, we specifically coupled the Java garbage collection to automatic memory management, a reference counting based scheme on the persistent heap, to make the same reachability-based lifetime that is familiar from regular Java objects the memory lifetime behavior for persistent objects. But there's a lot of overhead there, and that is, as you can see, more complex and more difficult, and it's still experimental. So, it's a wonderful goal, and as a longtime Java programmer, I'd really love to have automatic memory management, but first things first: we have LLPL to give this highest performance access to persistent memory, and it can be used as a base to build other things. In fact, we did build PCJ using a base very similar to LLPL.
For the third option that was mentioned, the whole heap, we didn't talk about it, but there's a whole-heap-on-persistent-memory option, and that's been in Java since JDK12. It's strictly for volatile use, so there's no persistence available in that use of persistent memory, but you do get full garbage collection behavior there. In fact, it's just a regular Java heap, or part of the heap, that's been placed in persistent memory. So, garbage collection is available in experimental form in a persistent way, it's available in a production form in a volatile way, and the rest of the cases are manual. Thank you.
The next question is about persistent byte buffers in Java 14. How is LLPL different?
OK, well, there are two big differences. One is that byte buffers are ubiquitous. They've been around forever, everybody knows them, but they are limited to two gigabytes in size because they're indexed and sized with Java integers. LLPL doesn't have that limit. Everything is sized with longs, so you can have a terabyte-sized heap and very big memory blocks. The second difference is that LLPL gives you handles and essentially the ability to build pointer-based data structures, so you can link blocks of memory. You can allocate small blocks and link them together. This is key in building memory-based data structures such as radix trees and so forth. This kind of thing is hard to do with mapped byte buffers. They were really designed for an I/O type of application, literally as a buffer, and they might not scale that well for building linked data structures. Those are the two biggest differences. It's great we have the mapped byte buffers, but LLPL scales better with size, and you can build a lot of things over LLPL that might be difficult with mapped byte buffers.
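To illustrate what linking blocks through handles means, here is a toy sketch of a singly linked list in which each node stores the handle of the next node. ToyHandleHeap, its node layout, and the bump allocator are assumptions for illustration, not the LLPL API. A small ByteBuffer backs the toy heap only to keep the code self-contained, and the int casts it forces are exactly the size limit mentioned above.

```java
import java.nio.ByteBuffer;

// Toy heap where a "handle" is a byte offset; NOT the LLPL API.
class ToyHandleHeap {
    private static final int VALUE_OFFSET = 0; // long value stored in the node
    private static final int NEXT_OFFSET = 8;  // long handle of the next node, 0 = end
    private static final int NODE_SIZE = 16;

    private final ByteBuffer bytes = ByteBuffer.allocate(4096);
    private int nextFree = NODE_SIZE;          // offset 0 is reserved so handle 0 means "null"

    // Allocates a node holding value that links to nextHandle; returns the new node's handle.
    long allocateNode(long value, long nextHandle) {
        int offset = nextFree;
        nextFree += NODE_SIZE;
        bytes.putLong(offset + VALUE_OFFSET, value);
        bytes.putLong(offset + NEXT_OFFSET, nextHandle);
        return offset;
    }

    // Follows the chain of handles starting at head and sums the stored values.
    long sum(long head) {
        long total = 0;
        for (long h = head; h != 0; h = bytes.getLong((int) h + NEXT_OFFSET)) {
            total += bytes.getLong((int) h + VALUE_OFFSET);
        }
        return total;
    }
}

class HandleListDemo {
    public static void main(String[] args) {
        ToyHandleHeap heap = new ToyHandleHeap();
        long head = 0;
        for (long value = 1; value <= 3; value++) {
            head = heap.allocateNode(value, head); // new node points at the previous head
        }
        System.out.println(heap.sum(head)); // prints 6
    }
}
```

Because a handle is just a long stored in memory, the list survives as long as the head handle is written somewhere reachable, which is the role the root plays in the earlier examples.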
The next question is about memory blocks. What happens if I allocate a memory block and don't store the handle or deallocate the block? Is that a memory leak?
Yes, it is, and, in fact, we saw in the transaction example that if we put the allocation and whatever, some setting, some way to reference that block later into the same transaction, we won't get a leak. So, the allocation, we get back a handle, if we don't write that down somewhere, put it somewhere in memory that we can get at it later, then it will leak. There's actually a way to iterate through all the allocations in a heap, but we don't support it directly in LLPL, and so you want to take care of your allocations either with a transaction or simply knowing that you weren't interrupted between the time you allocated a block, and the time that you wrote its value somewhere.
The next question is about the Cassandra storage engine that you mentioned. What is the status of that?
It is not upstreamed. It's still in progress and has been for quite some time. The pluggable storage engine API I talked about is unfinished, and that really needs to be done, again, in collaboration with the Cassandra open source community. The last time we talked with them about this, they were very excited about moving forward with that and persistent memory, but that would be after Version 4.0 of Cassandra ships, and that has been in progress for a long time. It's a stabilizing release of Cassandra, and the doors have been closed to new big features until after that ships. They ran a few alphas, they're on a beta now, so it should be soon, but we have more work to do before the ability to have a persistent memory storage engine can be upstreamed into the trunk.
And the last question that we have is about heaps. How do you decide which one to use? I believe you touched on it a bit in the presentation.
Yes, I think if you have mission critical data, where the integrity of your heap is really critical, in maybe financial cases or other things, using the transactional heap gives you compile time knowledge that all of your writes are going to roll back according to however your transaction bodies were written, and it makes it easier to get all the long-term consistency of your heap in place. On the other hand, there are a lot of applications that maybe need a mixture of writes. They'll write their main data in just a durable way, and they'll write some sort of log, maybe a commit log, in a transactional way. So, if you need that mixture, then either the persistent heap or the regular heap is usable. And for the most flexibility, the general purpose heap lets you do anything, so it's trading off some guarantees you get, particularly compile time guarantees, versus flexibility and potentially maximum speed. They're all fast, they really are, but you can make that trade-off by choosing among the different heaps.
Well, that was the last question. Thanks again, everyone for joining us for this live webinar, and thank you for your questions, and thank you, Steve. Goodbye, everyone.
Bye.
