Code Sample: Using Libpmemobj to Manage Persistent Memory Arrays in C++

File(s): Download
License: 3-Clause BSD License
Optimized for...
OS: Linux* kernel version 4.3 or higher
Hardware: Emulated: See How to Emulate Persistent Memory Using Dynamic Random-access Memory (DRAM)
Software (programming language, tool, IDE, framework): Intel® C++ Compiler and Persistent Memory Development Kit (PMDK)
Prerequisites: Familiarity with C++

Introduction

This code sample uses the C++ bindings for libpmemobj, a persistent memory library in the PMDK, to demonstrate how to manage persistent memory arrays. Using the command line, you can allocate, reallocate, free, and print arrays of integers. Because the arrays live in persistent memory, their state is retained after a power failure or application crash. In this example, we will examine code snippets that demonstrate several concepts, including persistent pointers, transactions, and pools. The entire sample code can be found on GitHub*.

Persistent Memory

This article assumes that you have a basic understanding of persistent memory (PMEM) concepts and are familiar with features of the Persistent Memory Development Kit (PMDK). If not, visit the Intel® Developer Zone Persistent Memory site, where you'll find the information you need to get started.

Read further to learn how persistent memory was used to implement an array in C++.

Data Structures

Since there can be multiple arrays at a time, each array_list struct is a node in a linked list that holds the array name, the array size, the actual array, and a pointer to the next node. The array_list declaration can be seen below. Just below that, head is declared; it points to the first node in the list.

struct array_list {
   char name[MAX_BUFFLEN];
   p<size_t> size;
   persistent_ptr<int[]> array;
   persistent_ptr<array_list> next;
};

persistent_ptr<array_list> head = nullptr;

As you can see, persistent variables are declared in two ways in the array_list definition:

  1. With the persistent template class p<> for basic types.
  2. Using the persistent_ptr<> for pointers to complex types.

The size field is declared using the p<> template, with size_t as the type. Wrapping it in p<> lets the library track changes made to it inside a transaction, which matters because the value of size can change during the life of the array. This is not the case for name, which is set once during object construction and never changed, so it remains a plain char array. The array and next fields both use persistent_ptr<>. The array field holds the array of integers and is where space will be allocated, reallocated, and freed. The next field is our pointer to the next element in the array_list linked list.

Let’s now take a look at the rest of the code where we’ll see more persistent memory being implemented.

Code

find_array is a private function used throughout the example to simplify working with the linked list. It walks the array_list linked list until it finds the array with the specified name, and returns a null pointer if no such array exists. The optional find_prev parameter is set to true when the caller wants the item immediately before the matching one instead. We’ll see this used in the delete_array method later on.

persistent_ptr<array_list> find_array(const char *name, bool find_prev = false);
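The lookup described above can be sketched in ordinary (volatile) C++. This is only an illustration of the traversal logic, not the sample's actual implementation: the node struct, the find_node name, and the plain pointers standing in for persistent_ptr<> are all assumptions, as is the convention of returning the match itself when it has no predecessor.

```cpp
#include <cassert>
#include <cstring>

// Volatile stand-in for array_list: plain pointers replace persistent_ptr<>,
// so nothing here survives a restart. (Hypothetical type for illustration.)
struct node {
    char name[32];
    node *next;
};

// Walks the list from head. With find_prev == false, returns the node whose
// name matches. With find_prev == true, returns the node just before the
// match; if the match is the first node it has no predecessor, so this
// sketch returns the match itself (an assumed convention).
node *find_node(node *head, const char *name, bool find_prev = false)
{
    assert(name != nullptr);
    node *prev = nullptr;
    for (node *cur = head; cur != nullptr; prev = cur, cur = cur->next) {
        if (std::strcmp(cur->name, name) == 0) {
            if (find_prev && prev != nullptr)
                return prev;
            return cur;
        }
    }
    return nullptr; // no node with that name
}
```

With this convention, a caller that asks for the predecessor and gets back the head node has to check whether the head itself is the match, which is the case the Free section handles by comparing against head.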

Main

Inside the main function, we first parse the inputs and see if the file name passed in matches a file that already exists. If the file exists, it is opened, and all data previously stored there is accessible. If the file does not exist, a new one is created and opened. This file is accessed using the pop variable, which stands for pool object pointer; this variable is used throughout the code.

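const char *file = argv[1];
pool<examples::pmem_array> pop;

// A nonzero result from file_exists() is treated as "file not found",
// so a new pool is created; otherwise the existing pool is opened.
if (file_exists(file) != 0) {
    pop = pool<examples::pmem_array>::create(
        file, LAYOUT, POOLSIZE, CREATE_MODE_RW);
} else {
    pop = pool<examples::pmem_array>::open(file, LAYOUT);
}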

Next, we further parse the inputs to see which operation the user passed in. The function parse_array_op returns an array_op defined by this enum:

enum class array_op {
    UNKNOWN,
    PRINT,
    FREE,
    REALLOC,
    ALLOC,

    MAX_ARRAY_OP
 };
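The article does not show the body of parse_array_op, so the following is a minimal sketch of one way such a function could map the command-line word to the enum. The lookup table and the exact string comparison are assumptions; the operation words are the ones used in the commands shown later (print, free, realloc, alloc).

```cpp
#include <cassert>
#include <cstring>

enum class array_op { UNKNOWN, PRINT, FREE, REALLOC, ALLOC, MAX_ARRAY_OP };

// Hypothetical implementation: compares the argument against each known
// operation word and returns UNKNOWN when nothing matches.
array_op parse_array_op(const char *str)
{
    assert(str != nullptr);
    static const struct {
        const char *word;
        array_op op;
    } table[] = {
        {"print", array_op::PRINT},
        {"free", array_op::FREE},
        {"realloc", array_op::REALLOC},
        {"alloc", array_op::ALLOC},
    };
    for (const auto &entry : table) {
        if (std::strcmp(str, entry.word) == 0)
            return entry.op;
    }
    return array_op::UNKNOWN; // handled by the default case of a switch
}
```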

Once the operation is determined, a switch statement dispatches the inputs to the function that handles the request. In each case, the number of command-line arguments is checked. If the count does not match what the operation expects, the program usage is printed.

array_op op = parse_array_op(argv[2]);

switch (op) {
    case array_op::PRINT:
        if (argc == 4)
            arr->print_array(name);
        else arr->print_usage(op, prog_name);
        break;
    case array_op::FREE:
        if (argc == 4)
            arr->delete_array(pop, name);
        else arr->print_usage(op, prog_name);
        break;
    case array_op::REALLOC:
        if (argc == 5)
            arr->resize(pop, name, atoi(argv[4]));
        else arr->print_usage(op, prog_name);
        break;
    case array_op::ALLOC:
        if (argc == 5)
            arr->add_array(pop, name, atoi(argv[4]));
        else arr->print_usage(op, prog_name);
        break;
    default:
        std::cout << "Ruh roh! You passed an invalid operation!!" << std::endl;
        arr->print_usage(op, prog_name);
        break;
}

Let’s take a look at each function.

Print

The print operation is invoked by running the following command:

$ ./example-array <file_name> print <array_name>

The print_array function takes in an array name. The find_array helper function is used to determine if an array with that name exists. If the returned array_list object is a null pointer, then a message is printed saying that no array with that name was found. If an array with that name was successfully found, it can be accessed from the returned pointer to the array_list object. You will see that this is how most of the functions start.

After the array is located, its contents are printed to the screen. This is how it would look if an array named myArray of size 8 was printed.

$ ./example-array file print myArray
myArray = [0, 1, 2, 3, 4, 5, 6, 7]

The entire print_array function can be seen below:

void
print_array(const char *name)
{
    persistent_ptr<array_list> arr = find_array(name);
    if (arr == nullptr) {
        std::cout << "No array found with name: " << name << std::endl;
    } else {
        std::cout << arr->name << " = [";
        // print all but the last element followed by a separator;
        // assumes size >= 1, otherwise size - 1 would wrap around
        for (size_t i = 0; i < arr->size - 1; i++) {
            std::cout << arr->array[i] << ", ";
        }
        std::cout << arr->array[arr->size - 1] << "]" << std::endl;
    }
}
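Because size is unsigned, the expression size - 1 in print_array would wrap around to a very large value if an array of size 0 ever reached it. The following is a hedged sketch of an alternative formatting loop that writes the separator before every element except the first, so an empty array is also handled. The format_array name and the use of std::vector and std::string in place of the persistent types are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Builds "name = [v0, v1, ...]" for a volatile vector of ints.
// Array names are assumed to be non-empty.
std::string format_array(const std::string &name, const std::vector<int> &values)
{
    assert(!name.empty());
    std::ostringstream out;
    out << name << " = [";
    for (std::size_t i = 0; i < values.size(); i++) {
        if (i != 0)
            out << ", "; // separator goes before every element but the first
        out << values[i];
    }
    out << "]";
    return out.str();
}
```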

Free

Arrays can be freed by running the following command:

$ ./example-array <file_name> free <array_name>

When a user specifies an array to free, the delete_array function is called. Again, find_array is called to locate the array in the linked list. This time, though, find_array is called with the optional parameter, find_prev, set to true. This returns the array just previous to the one we are looking for.

If no array with that name is found, a message is printed and the function returns. If the array is found, we set cur_arr to point to the array we want to delete. In most cases, cur_arr is set to prev_arr->next, since prev_arr is the element right before the one we want to delete. If there is only one element in the linked list, though, or if the array we want to delete is the first element in the list, cur_arr is set to head.

The libpmemobj C++ bindings offer three types of transactions: manual, automatic, and closure. This sample uses the closure form, transaction::exec_tx, which runs a lambda as a single transaction. Transactions are used here to wrap code that modifies persistent data, so that the changes either take effect completely or not at all. This prevents the inconsistent state that could otherwise result if a power failure or process crash occurred in the middle of writing to memory.

transaction::exec_tx(pop, [&] {
    if (head == cur_arr)
        head = cur_arr->next;
    else 
        prev_arr->next = cur_arr->next;
    
    delete_persistent<int[]>(cur_arr->array, cur_arr->size);
    delete_persistent<array_list>(cur_arr);
});

Inside the transaction, the “if” statement checks whether head equals cur_arr. This is the case when the array we’re searching for is the first element in the list. If head does equal cur_arr, head is simply reassigned to point to cur_arr->next. If the array we’re searching for is not the first one in the list, though, prev_arr’s pointer to next is now reassigned to cur_arr’s pointer to next. This action removes any pointers to the array we want deleted. Next, to free the memory that was being used, we delete the array object and the array_list element. The whole delete_array process is illustrated in the figure below:

[Figure: the delete_array process illustrated]

Further details about transactions can be found in the article C++ Transactions for Persistent Memory Programming. Explore the source code on GitHub to see the full delete_array method.
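To make the all-or-nothing behavior concrete, here is a toy model in ordinary C++. It is not how libpmemobj implements transactions and it provides no crash safety: it works on volatile memory and treats an exception as a stand-in for a failure. The toy_tx class and the run_all_or_nothing function are hypothetical names invented for this sketch. The idea shown is the undo log: each old value is recorded before it is overwritten, and a failure restores the recorded values.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy undo log for ints in ordinary memory (illustration only).
class toy_tx {
public:
    // Record the current value of target, then overwrite it.
    void set(int &target, int value)
    {
        undo_.emplace_back(&target, target);
        target = value;
    }

    // Restore every recorded value, newest first.
    void rollback()
    {
        for (auto it = undo_.rbegin(); it != undo_.rend(); ++it)
            *it->first = it->second;
        undo_.clear();
    }

private:
    std::vector<std::pair<int *, int>> undo_;
};

// Runs body as one unit. Returns true if body completed; if body threw,
// undoes every change made through the toy_tx and returns false.
bool run_all_or_nothing(const std::function<void(toy_tx &)> &body)
{
    toy_tx tx;
    try {
        body(tx);
        return true;
    } catch (const std::exception &) {
        tx.rollback();
        return false;
    }
}
```

If a body updates two values and then throws, both values return to what they held before the call, which mirrors why the pointer updates and the two deletions in delete_array are grouped into one transaction.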

Alloc

A user can allocate a new array by running this command:

$ ./example-array <file_name> alloc <array_name> <size>

Alloc will trigger the add_array function which requires the pool object pointer (pop), the array name and the size of the array you want to create. We first check to see if an array with that name exists. If one does, a note is posted to the terminal asking if the user would rather reallocate this array and then the realloc instructions are posted. If the size is acceptable, and the name is not already taken, we then enter the transaction.

Inside the transaction, we create a new_array object which will be filled with function inputs: name and size. The persistent array is allocated in the array field, and for now next is set to nullptr.

Using a for loop, we assign values to the array field. This will be helpful when printing an array after reallocating it because we will be able to clearly see how the array was enlarged or shrunk.

Now we insert that new_array object at the front of the linked list by setting new_array->next to head, then pointing head at new_array.

transaction::exec_tx(pop, [&] {
    auto new_array = make_persistent<array_list>();
    // copy the name, leaving room for the terminating null
    strncpy(new_array->name, name, MAX_BUFFLEN - 1);
    new_array->name[MAX_BUFFLEN - 1] = '\0';
    new_array->size = (size_t)size;
    new_array->array = make_persistent<int[]>(size);
    new_array->next = nullptr;

    // assign values to new_array->array
    for (size_t i = 0; i < new_array->size; i++)
        new_array->array[i] = (int)i;

    // insert the new element at the front of the list
    new_array->next = head;
    head = new_array;
});

Realloc

Realloc changes the size of a pre-existing array. To do so, a user runs this command:

$ ./example-array <file_name> realloc <array_name> <size>

Realloc calls the resize function, which takes the pop, name, and size variables. find_array is used to locate the array_list object by name. A message is printed to the command line if the array doesn’t exist or if the size is smaller than 1. If the size is valid and the array exists, we enter the transaction.

Inside the transaction, rather than resizing the array, which isn’t easy in C++, we instead allocate a new array of the desired size and copy over the values from the prior allocation. In the code below you can see that new_array is our new persistent pointer to an array of integers. Next, the values are copied over from arr->array to new_array. To wrap up, the previous array is deleted by using the delete_persistent function, the size field of arr is updated, and the array field is now set to point to new_array.

void
resize(pool_base &pop, const char *name, int size)
{
    persistent_ptr<array_list> arr = find_array(name);
    if (arr == nullptr) {
        std::cout << "No array found with name: " << name << std::endl;
    } else if (size < 1) {
        std::cout << "size must be a positive integer" << std::endl;
        print_usage(array_op::REALLOC, prog_name);
    } else {
        transaction::exec_tx(pop, [&] {
            persistent_ptr<int[]> new_array = make_persistent<int[]>(size);
            size_t copy_size = arr->size;
            if ((size_t)size < arr->size)
                copy_size = (size_t)size;
            for (size_t i = 0; i < copy_size; i++){
                new_array[i]=arr->array[i];
            }
            delete_persistent<int[]>(arr->array, arr->size);
            arr->size = (size_t)size;
            arr->array = new_array;
        });
    }
}

It is important to note that if the size of the new array is larger than the current array, the new indices are filled with zeros. If the new array is smaller, then previously stored data will be lost. This is demonstrated below:

First, allocate and print an array named arr of size 8:

libpmemobj-cpp/build$ ./example-array newFile alloc arr 8
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4, 5, 6, 7]

Now we reallocate arr to be of size 5. This will copy the first 5 values:

libpmemobj-cpp/build$ ./example-array newFile realloc arr 5
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4]

Since only the first five values were copied, the other ones were lost. Now when we reallocate to be a larger array, new indices are filled with zeros:

libpmemobj-cpp/build$ ./example-array newFile realloc arr 12
libpmemobj-cpp/build$ ./example-array newFile print arr
arr = [0, 1, 2, 3, 4, 0, 0, 0, 0, 0, 0, 0]
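The allocate, copy, and swap sequence used by resize can be tried out in ordinary C++ without persistent memory. The sketch below is a volatile analogue under stated assumptions: the int_array struct and the use of std::unique_ptr are stand-ins for the size and array fields of array_list, and nothing here is transactional or crash-safe.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Volatile stand-in for the size and array fields of array_list.
struct int_array {
    std::size_t size = 0;
    std::unique_ptr<int[]> data;
};

// Allocates a zero-filled buffer of new_size elements, copies over
// min(old size, new size) values, then replaces the old buffer.
void resize(int_array &arr, std::size_t new_size)
{
    assert(new_size >= 1); // mirrors the article's rejection of sizes below 1
    // new int[n]() value-initializes, so every element starts at zero
    std::unique_ptr<int[]> fresh(new int[new_size]());
    std::size_t copy_size = std::min(arr.size, new_size);
    std::copy_n(arr.data.get(), copy_size, fresh.get());
    arr.data = std::move(fresh); // the old buffer is released here
    arr.size = new_size;
}
```

Growing the array leaves the extra elements at zero, and shrinking it drops everything past the new size, matching the three command-line runs shown above.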

Building and Running

After building the source, navigate to the /build/examples directory. Operations include alloc, realloc, free, and print. The alloc and realloc operations also take a size argument.

$ ./example-array <file_name> <print|alloc|free|realloc> <array_name> [<size>]

Summary

This example demonstrates how the libpmemobj library for persistent memory can be used to create a program that allocates, reallocates, frees, and prints arrays of integers. The examples of transactions, pools, and persistent pointers shown here are good references for developers looking to learn the basics of persistent memory programming. To learn more, check out other articles on the Intel® Developer Zone Persistent Memory site, and visit the pmem GitHub page.

About the Author

Kelly Lyon is a developer advocate at Intel Corporation with three years of previous experience as a software engineer. Kelly is dedicated to advocating for users and looks forward to bringing clarity to complex ideas by researching and providing simple, easy-to-understand guides and tutorials. Follow her journey on Twitter*.
