Intel® ISA-L Erasure Coding

As applications have scaled to the data center, so have demands on the storage infrastructure that supports them. Storage availability and fault tolerance have become crucial challenges. To ensure storage systems meet data center availability requirements, two techniques are pervasive: multi-replication (typically triple replication) and Reed-Solomon erasure codes (RS EC), both of which keep data available despite single or dual failures. Triple replication has the advantage of simplicity, but requires that a system’s raw storage capacity be (at least) 3X the design capacity. By contrast, RS EC has historically been computationally intensive, but is far more flexible and space efficient, enabling raw capacity to be only 1.5X the design capacity. For storage applications requiring large data sets, this difference in the underlying availability algorithm can translate into huge differences in capital and operating expenditures. The Intel® Intelligent Storage Acceleration Library (Intel® ISA-L) includes support for erasure coding. This video describes how erasure coding can benefit a storage application and explains the Intel ISA-L implementation, which uses Reed-Solomon erasure codes (RS EC).
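The 3X and 1.5X figures follow from simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 4 data + 2 parity layout is an assumed configuration (not one stated in the video) chosen because it tolerates two failures at 1.5X raw capacity.

```python
def replication_factor(copies: int) -> float:
    """Raw-to-usable capacity ratio when every block is stored `copies` times."""
    return float(copies)


def erasure_coding_factor(data_fragments: int, parity_fragments: int) -> float:
    """Raw-to-usable capacity ratio for a k data + m parity fragment layout."""
    return (data_fragments + parity_fragments) / data_fragments


usable_tb = 100  # hypothetical design capacity in terabytes

# Triple replication: three full copies of everything.
triple_raw_tb = replication_factor(3) * usable_tb

# Assumed 4+2 layout: any two of the six fragments may be lost.
ec_raw_tb = erasure_coding_factor(4, 2) * usable_tb

print(f"triple replication: {triple_raw_tb:.0f} TB raw")
print(f"4+2 erasure coding: {ec_raw_tb:.0f} TB raw")
```

Wider layouts (more data fragments per parity fragment) push the ratio closer to 1X, at the cost of touching more nodes per read or rebuild.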

Watch the rest of the videos in this series:

What is Intel® ISA-L? 
Intel® ISA-L Compression Algorithms
Intel® ISA-L Semi-Dynamic Compression Code Sample Walk Through
Intel® ISA-L Cryptographic Hashing
Intel® ISA-L: Cryptographic Hashing Code Sample Overview
Intel® ISA-L Erasure Coding
Intel® ISA-L Erasure Coding Sample Application Overview

Hi. I'm Praveen from Intel. In this video, we're going to talk about Intel Intelligent Storage Acceleration Library erasure coding.

A lot of people intuitively understand RAID, but erasure coding is a little more esoteric. Many of the clouds are using erasure coding, especially people who build systems that scale to many nodes, more than 10 to 20.

For these systems, erasure codes make a lot of sense because they give you all of the same redundancy guarantees as triple replication but with half the raw data footprint, or potentially less, depending on how you configure your erasure-coded system.

Triple replication is the process by which a single copy of data is mirrored in at least two other places. Say one or two whole nodes disappear off the network: you still haven't lost access to the data. Essentially, erasure coding will likewise continue to give access to that data even when there are failures.
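To make the recovery idea concrete, here is a minimal pure-Python sketch of a Reed-Solomon style erasure code over GF(2^8). It is not the Intel ISA-L API or its implementation: the function names, the 4 data + 2 parity layout, and the choice of a Cauchy encoding matrix are all assumptions made for illustration.

```python
# GF(2^8) log/antilog tables for the polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d).
EXP = [0] * 512
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= 0x11D
for power in range(255, 512):
    EXP[power] = EXP[power - 255]


def gf_mul(a, b):
    """Multiply two field elements; addition in this field is XOR."""
    return 0 if a == 0 or b == 0 else EXP[LOG[a] + LOG[b]]


def gf_inv(a):
    """Multiplicative inverse of a nonzero field element."""
    return EXP[255 - LOG[a]]


def cauchy_rows(k, m):
    """m x k Cauchy matrix; stacked under the identity, any k rows are invertible."""
    return [[gf_inv((k + i) ^ j) for j in range(k)] for i in range(m)]


def combine(coefs, frags):
    """Byte-wise linear combination of fragments with the given coefficients."""
    out = bytearray(len(frags[0]))
    for coef, frag in zip(coefs, frags):
        for pos, byte in enumerate(frag):
            out[pos] ^= gf_mul(coef, byte)
    return bytes(out)


def encode(data, m):
    """Return m parity fragments for a list of k equal-length data fragments."""
    return [combine(row, data) for row in cauchy_rows(len(data), m)]


def invert(matrix):
    """Gauss-Jordan matrix inversion over GF(2^8)."""
    n = len(matrix)
    aug = [list(row) + [int(r == c) for c in range(n)] for r, row in enumerate(matrix)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = gf_inv(aug[col][col])
        aug[col] = [gf_mul(scale, v) for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [v ^ gf_mul(factor, p) for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def decode(fragments, k, m):
    """Rebuild the k data fragments; lost fragments are passed as None."""
    generator = [[int(r == c) for c in range(k)] for r in range(k)] + cauchy_rows(k, m)
    alive = [i for i, frag in enumerate(fragments) if frag is not None][:k]
    if len(alive) < k:
        raise ValueError("too many fragments lost")
    inverse = invert([generator[i] for i in alive])
    survivors = [fragments[i] for i in alive]
    return [combine(row, survivors) for row in inverse]


data = [b"ABCD", b"EFGH", b"IJKL", b"MNOP"]   # 4 data fragments
parity = encode(data, 2)                       # 2 parity fragments
stored = data + parity                         # one fragment per node
stored[1] = None                               # simulate two failed nodes
stored[3] = None
recovered = decode(stored, 4, 2)
print(recovered == data)
```

Any two of the six fragments can be dropped and the data is still recoverable; losing a third raises an error. The per-byte Python loops here are far too slow for real storage systems and are only meant to show the algebra.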

If we can shrink the cost of providing those access guarantees (in this case, to half the cost by using an erasure coding scheme as opposed to triple replication), then that presents giant savings in operating and capital expenditures. Just about anyone, enterprises or hyperscalers, who is building systems above a few nodes can start taking advantage.

The reason erasure codes weren't ubiquitously adopted prior to ISA-L is that they were computationally expensive. The performance delta is gigantic.

The reason for ISA-L to implement these is to enable people to get these economies of storage media as we start looking toward the solid-state transformation. Just to give a sense of scale about the performance, looking at the example that I'm going to introduce in this playlist, we see roughly 5 gigabits per second per core of erasure coding calculations on the [? E5E4 ?] that we used in the example, which is quite substantial.

The flip side of throughput is latency, and especially [? software-in-use ?] latency. It is a key aspect for people who are building at larger scales. You want to minimize the latency that you incur in software, as any given operation is split up, parallelized, and touches thousands of systems.

That software latency compounds quite tremendously in those parallel systems. Removing that software latency and, as a side effect, getting high throughput is one of the main goals of Intel ISA-L.
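One way to see why latency compounds under fan-out is to ask how likely it is that at least one sub-request is slow when an operation must wait for all of them. The sketch below is a back-of-the-envelope model, not a measurement: the function name, the 1% slow-request probability, and the assumption that sub-requests are independent are all hypothetical.

```python
def p_any_slow(fanout: int, p_slow: float) -> float:
    """Probability that at least one of `fanout` independent sub-requests is slow."""
    return 1.0 - (1.0 - p_slow) ** fanout


# Hypothetical: each sub-request has a 1% chance of hitting a slow path.
for fanout in (1, 10, 100, 1000):
    print(f"fan-out {fanout:>4}: {p_any_slow(fanout, 0.01):.1%} chance of a slow operation")
```

Under these assumptions, a rare per-node delay becomes the common case once an operation touches hundreds or thousands of systems, which is why trimming latency on each node matters so much at scale.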

Thanks for watching. Be sure to watch the code sample portion of this Intel ISA-L erasure coding example and the rest of the playlist. Don't forget to like this video and subscribe.