Use Intel® Inspector for Persistent Memory Out-of-Order Store Analysis

Direct Access to Persistent Memory

One of the biggest benefits of Intel® Optane™ DC Persistent Memory is direct access mode. A program can map a block of persistent memory into its address space and use it like regular byte-addressable memory. Access to that memory is expected to be faster than going through the system I/O stack, and data stored there survives process termination and power loss once it has been flushed from the CPU caches. For example, a long-running computation could keep its data array in that region and resume after an interruption, because all previously stored data is still there.
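As a rough illustration of the mapping step, the sketch below maps a file and updates a running total with ordinary loads and stores. It assumes POSIX `mmap`; the helper names `map_region` and `resume_and_advance` are hypothetical, and a regular file stands in for a file on a persistent memory aware file system.

```cpp
// Sketch only: a regular file stands in for persistent memory (assumption).
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: opens (creating if needed) `path`, sizes it to `bytes`,
// and maps it read/write. Returns nullptr on failure.
inline void* map_region(const char* path, std::size_t bytes)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) return nullptr;
    if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) { close(fd); return nullptr; }
    void* p = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : p;
}

// Hypothetical long-running computation: adds `steps` to a total kept in the
// mapped region and returns the new total. Returns 0 if mapping fails.
inline std::uint64_t resume_and_advance(const char* path, std::uint64_t steps)
{
    const std::size_t bytes = 4096;
    auto* total = static_cast<std::uint64_t*>(map_region(path, bytes));
    if (!total) return 0;
    *total += steps;               // ordinary store into the mapped memory
    msync(total, bytes, MS_SYNC);  // stand-in for cache flushes on real persistent memory
    std::uint64_t result = *total;
    munmap(total, bytes);
    return result;
}
```

Calling `resume_and_advance` twice on the same file continues from the total left by the first call, which is the "continue after an interrupt" behavior described above.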

Direct access also makes programming more challenging, because developers need to keep the consistency of their data structures in mind. Since an interruption can happen at any point, the program must keep its persistent data in a valid state at all times.

Sample Application

The first application continuously stores records in persistent memory for later reading. Each record has a fixed size and carries a flag indicating whether its content is valid. The program either stores data into a record or reserves it for later modification. The number of records varies and can be large enough that scanning the entire block to count them would be time consuming. Therefore, a header block that holds the number of stored records was added.

struct header_t
{
    uint32_t counter;
    uint8_t reserved[60];
};

struct record_t
{
    char name[63];
    char valid;
};

The ‘store’ code looks like this:

for (int i = 0; i < RecordsToWrite; i++)
{
    //Store number of records
    header->counter++;

    if(rand() % 2 == 0)
    {
        //Store valid record
        snprintf(records[i].name, sizeof(records[i].name),
                 "record #%u", i + 1);
        _mm_clflush(records[i].name);

        records[i].valid = true;
    }
    else
    {
        //Store empty record
        records[i].valid = false;
    }
    _mm_clflush(&records[i].valid);
}

The application needs to be fault tolerant and must not corrupt its data structures if it is interrupted unexpectedly (e.g., by process termination or power failure). In practice, this means the data structures must be consistent at every stage of execution.

The second application reads the memory and prints all valid records from that data file using code like:

for (uint32_t i = 0; i < header->counter; i++)
{
    // If record is valid, print it to console
    if (records[i].valid)
    {
        std::cout << "found valid record:\n";
        std::cout << "  name    = " << records[i].name << "\n";
    }
}

If we run these applications, we get the expected results and everything looks good. However, the Persistence Inspector tool within Intel® Inspector can check whether there are any potential persistent memory programming errors.

Check Application Correctness

The first run is to capture how the application stores data into persistent memory. To do this, run the following command:

pmeminsp cb -- out_of_order write

In the next step, we want to see how the application loads that data:

pmeminsp ca -- out_of_order read

Once the data has been collected, analyze the application's correctness with respect to data consistency. To generate a report in out-of-order store analysis mode, run the following command:

pmeminsp rp -check-out-of-order-store -insp -- out_of_order

The previous commands store the collected data in the ‘.pmeminspdata’ subfolder of the current directory.

To view the data in the Intel® Inspector graphical user interface (UI), run the following command:

inspxe-gui .pmeminspdata

There are several problems reported:

(Screenshot: Intel Inspector GUI)

Intel Inspector reports that the application stores the record ‘valid’ flag and the total records counter in the wrong order. This is a real problem if the application terminates between incrementing the counter and setting the ‘valid’ flag. Since the counter is incremented before the record data is stored, the counter in persistent memory can already include a record whose entry has not been initialized. A subsequent read of that record then sees whatever bytes happen to be there, so the result is undefined.
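The failure can be reproduced in ordinary memory. The sketch below is a simplified model, not the article's test program: `image_t`, `make_dirty_image`, and the two `..._then_crash` functions are hypothetical names, and the record area is assumed to hold leftover bytes (0xAA) from earlier use. Each store function stops where the simulated interruption happens.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstring>

struct header_t { uint32_t counter; uint8_t reserved[60]; };
struct record_t { char name[63]; char valid; };

// Hypothetical in-memory model of the data file: one header, a few records.
struct image_t {
    header_t header;
    record_t records[4];
};

// Builds an image whose record area contains leftover garbage (assumption).
inline image_t make_dirty_image()
{
    image_t img;
    std::memset(&img, 0xAA, sizeof(img));
    img.header.counter = 0;
    return img;
}

// Original ordering, interrupted right after the counter is incremented.
inline void store_counter_first_then_crash(image_t& img)
{
    img.header.counter++;
    // simulated crash: the record itself is never written
}

// Corrected ordering, interrupted after the record is written
// but before the counter is incremented.
inline void store_record_first_then_crash(image_t& img)
{
    const uint32_t i = img.header.counter;
    std::snprintf(img.records[i].name, sizeof(img.records[i].name),
                  "record #%u", i + 1);
    img.records[i].valid = 1;
    // simulated crash: the counter is never incremented
}

// Same logic as the reader loop: counts the records it would print.
inline uint32_t count_records_seen_as_valid(const image_t& img)
{
    uint32_t n = 0;
    for (uint32_t i = 0; i < img.header.counter; i++)
        if (img.records[i].valid) n++;
    return n;
}
```

With the counter-first ordering the reader treats the garbage record as valid; with the record-first ordering the reader simply does not see the unfinished record.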

The Intel Inspector UI merges similar problems into one group and shows that the store to “header->counter” is also out of order with respect to different portions of the ‘name’ member. However, we can ignore the ordering with respect to ‘name’, because ‘name’ depends on the ‘valid’ flag, which is set only after the current record is initialized.

Fixing Inconsistent Data

To fix this problem, move the counter increment to the end of the loop body:

for (int i = 0; i < RecordsToWrite; i++)
{
    if(rand() % 2 == 0)
    {
        //Store valid record
        snprintf(records[i].name, sizeof(records[i].name),
                 "record #%u", i + 1);
        _mm_clflush(records[i].name);

        records[i].valid = true;
    }
    else
    {
        //Store empty record
        records[i].valid = false;
    }
    _mm_clflush(&records[i].valid);
    
    //Increment number of records
    header->counter++;
}

Store the record’s content first and flush it to ensure that the data has reached persistent memory. Only then can the total records counter be safely incremented. This ordering ensures that the reader code never loads uninitialized or partially initialized record entries.

Next, recompile the application and repeat the analysis of memory stores to see whether any problems remain. Run the following command one more time:

pmeminsp cb -- out_of_order write

Then generate the problem report, this time printing it to the console:

pmeminsp rp -check-out-of-order-store -- out_of_order

The report now shows only two problems. The first one concerns the initial store to the ‘counter’ member. The second one suggests reordering the stores of individual characters into the ‘name’ member. That might be worth doing, but since the entire ‘name’ data block depends on the ‘valid’ flag, it can safely be ignored.

#===============================================================================
# Diagnostic # 1: Out-of-order stores
#-------------------
  Memory store
    of size 4 at address 0x20AA9230000 (offset 0x0 in d:\testapp\outoforder.dat)
    in d:\testapp\out_of_order.exe!save_data_file_fixed at main_article.cpp:110 - 0x13D8
    in d:\testapp\out_of_order.exe!main at main_article.cpp:190 - 0x1886

  is out of order with respect to

  memory store
    of size 1 at address 0x20AA923007F (offset 0x7F in d:\testapp\outoforder.dat)
    in d:\testapp\out_of_order.exe!save_data_file_fixed at main_article.cpp:126 - 0x14A0
    in d:\testapp\out_of_order.exe!main at main_article.cpp:190 - 0x1886


#===============================================================================
# Diagnostic # 2: Out-of-order stores
#-------------------
  Memory store
    of size 1 at address 0x20AA9230080 (offset 0x80 in d:\testapp\outoforder.dat)
    in c:\windows\system32\msvcr120d.dll!mbctombb_l at <unknown_file>:<unknown_line> - 0xC4F20
    in c:\windows\system32\msvcr120d.dll!mbctombb_l at <unknown_file>:<unknown_line> - 0xC3C93
    in c:\windows\system32\msvcr120d.dll!snprintf at <unknown_file>:<unknown_line> - 0x2D05B
    in d:\testapp\out_of_order.exe!save_data_file_fixed at main_article.cpp:118 - 0x144D
    in d:\testapp\out_of_order.exe!main at main_article.cpp:190 - 0x1886

  is out of order with respect to

  memory store
    of size 8 at address 0x20AA9230088 (offset 0x88 in d:\testapp\outoforder.dat)
    in c:\windows\system32\msvcr120d.dll!mbctombb_l at <unknown_file>:<unknown_line> - 0xC4F20
    in c:\windows\system32\msvcr120d.dll!mbctombb_l at <unknown_file>:<unknown_line> - 0xC5083
    in c:\windows\system32\msvcr120d.dll!mbctombb_l at <unknown_file>:<unknown_line> - 0xC4C38
    in c:\windows\system32\msvcr120d.dll!snprintf at <unknown_file>:<unknown_line> - 0x2D05B
    in d:\testapp\out_of_order.exe!save_data_file_fixed at main_article.cpp:118 - 0x144D
    in d:\testapp\out_of_order.exe!main at main_article.cpp:190 - 0x1886
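The file offsets in the report can be mapped back to structure members. The sketch below assumes that the records start immediately after the header in the data file, which is consistent with the reported offsets; the helper functions `record_offset`, `valid_offset`, and `name_offset` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

struct header_t { uint32_t counter; uint8_t reserved[60]; };
struct record_t { char name[63]; char valid; };

// Both structures occupy exactly 64 bytes.
static_assert(sizeof(header_t) == 64, "header fills one 64-byte block");
static_assert(sizeof(record_t) == 64, "record fills one 64-byte block");

// Assumption: record i starts right after the header in the data file.
constexpr std::size_t record_offset(std::size_t i)
{
    return sizeof(header_t) + i * sizeof(record_t);
}

// File offset of the 'valid' flag of record i.
constexpr std::size_t valid_offset(std::size_t i)
{
    return record_offset(i) + offsetof(record_t, valid);
}

// File offset of character `ch` in the 'name' member of record i.
constexpr std::size_t name_offset(std::size_t i, std::size_t ch)
{
    return record_offset(i) + offsetof(record_t, name) + ch;
}

// Diagnostic #1: offset 0x0 is header->counter, 0x7F is records[0].valid.
static_assert(offsetof(header_t, counter) == 0x0, "counter at offset 0x0");
static_assert(valid_offset(0) == 0x7F, "records[0].valid at offset 0x7F");

// Diagnostic #2: offsets 0x80 and 0x88 both fall inside records[1].name.
static_assert(name_offset(1, 0) == 0x80, "records[1].name[0] at offset 0x80");
static_assert(name_offset(1, 8) == 0x88, "records[1].name[8] at offset 0x88");
```

Under that layout, each record fits in a single 64-byte block, so when the mapping is 64-byte aligned one `_mm_clflush` on any address inside a record covers the whole record.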

Conclusion

Persistent memory is a powerful technology that can bring various classes of applications to the next level, but it introduces new challenges and requires more attention to code correctness. This example shows how the Persistence Inspector tool in Intel Inspector can be used to verify an application and fix problems that lead to sporadic failures.
