Intel® ISA-L Cryptographic Hashing

Cryptographic hashing is attractive for applications such as deduplication and encryption because it offers a very low probability of collision. Learn how the Intel® Intelligent Storage Acceleration Library (Intel ISA-L) implementation takes maximum advantage of Intel® architecture and the inherent parallelism of the execution pipeline to provide great performance using a technique called multi-buffer hashing.

Watch the rest of the videos in this series:

What is Intel® ISA-L? 
Intel® ISA-L Compression Algorithms
Intel® ISA-L Semi-Dynamic Compression Code Sample Walk Through
Intel® ISA-L Cryptographic Hashing
Intel® ISA-L: Cryptographic Hashing Code Sample Overview
Intel® ISA-L Erasure Coding
Intel® ISA-L Erasure Coding Sample Application Overview

Hi. I'm Praveen from Intel. In this video, we're going to talk about Intel Intelligent Storage Acceleration Library Cryptographic Hashing. Don't forget to follow the links in the description for more information. 

What is cryptographic hashing? Check out this infographic. As you can see, if you give it a chunk of data, it outputs a digest. The key reason to do cryptographic hashing is that there's a very low probability of collision. For example, if one input chunk is the word dog and the other chunk is the word cat, it is not desirable for the two output digests to somehow align with each other, that is, to have the same value.
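The dog and cat example can be tried directly. This is a minimal sketch that uses Python's standard hashlib module rather than ISA-L itself; the helper name sha1_digest is only for illustration.

```python
import hashlib

def sha1_digest(chunk: bytes) -> str:
    """Return the SHA-1 digest of a chunk as a hex string."""
    return hashlib.sha1(chunk).hexdigest()

dog = sha1_digest(b"dog")
cat = sha1_digest(b"cat")

print(dog)
print(cat)
# A collision would mean both inputs produce the same digest.
print("collision" if dog == cat else "distinct digests")
```

The same input always gives the same digest, which is what makes the digest usable as a fingerprint of the chunk.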

When two digests do align, this is called a collision, and it is problematic. Intel ISA-L uses a novel technique called multi-buffer hashing, which takes maximum advantage of the Intel architecture and the inherent parallelism of the execution pipeline. The parallelism is not between cores, but within each core. ISA-L's cryptographic implementation will give you incredibly good performance because it can compute several hashes at once within a single call.

The key, though, is that it requires a parallel or asynchronous interface to take advantage of this, and that is not the easiest thing to use. Obtaining the best performance requires software to keep all the lanes full. If the software can compute four hashes for the computational price of one, the benefit only becomes visible when multiple chunks are submitted for hashing at once. The hash algorithms that ISA-L supports are as follows: SHA1, SHA2-256, SHA2-512, and MD5.
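To see why an asynchronous interface is awkward, here is a toy model of a job manager with four lanes. It is a sketch for illustration only: ToyLaneManager, submit, and flush are hypothetical names, not the ISA-L API, and a plain loop stands in for the SIMD step.

```python
import hashlib
from typing import Optional

class ToyLaneManager:
    """Toy model of a multi-buffer job manager (hypothetical, not the ISA-L API).

    Jobs wait in lanes; nothing is computed until every lane is occupied
    or the caller flushes.
    """

    def __init__(self, lanes: int = 4) -> None:
        self.lanes = lanes
        self.pending: list[tuple[int, bytes]] = []
        self.done: list[tuple[int, str]] = []

    def _run_batch(self) -> None:
        # Real code would process all lanes at once with vector instructions;
        # here a plain loop stands in for that step.
        for job_id, data in self.pending:
            self.done.append((job_id, hashlib.sha1(data).hexdigest()))
        self.pending.clear()

    def submit(self, job_id: int, data: bytes) -> Optional[tuple[int, str]]:
        """Queue a job; return a finished job if one is available, else None."""
        self.pending.append((job_id, data))
        if len(self.pending) == self.lanes:
            self._run_batch()
        return self.done.pop(0) if self.done else None

    def flush(self) -> Optional[tuple[int, str]]:
        """Force partially filled lanes to run; return a finished job or None."""
        if not self.done and self.pending:
            self._run_batch()
        return self.done.pop(0) if self.done else None

mgr = ToyLaneManager(lanes=4)
chunks = [b"chunk-%d" % i for i in range(6)]
results = {}

for i, chunk in enumerate(chunks):
    finished = mgr.submit(i, chunk)  # may return an earlier job, or None
    if finished:
        results[finished[0]] = finished[1]

# Drain whatever is still waiting in the lanes.
while (finished := mgr.flush()) is not None:
    results[finished[0]] = finished[1]

print(len(results), "digests collected")
```

The caller has to track which job each returned result belongs to and remember to flush at the end, which is the bookkeeping that makes this style of interface harder to use than a single blocking call.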

As previously stated, the asynchronous interface is a problem. So we use multihash, which wraps the asynchronous interface with a synchronous one. Let's take a look at this in practice. The difference with multihash, though, is that we haven't implemented it for all of these algorithms yet, only for SHA1. We also stitched two hashes together: SHA1 plus something called Murmur, which is another hash.

Multi-buffer hashing, in a nutshell, works like this: you take a bunch of hashes, stop using standard CPU instructions, and pile them all up into AVX instructions. You'll find parallelism within the execution pipeline that gets you four for the price of one. You get tremendously improved throughput and, therefore, improved average latency for hash calculations.

As I have said before, the challenge comes with the asynchronous interface, because you have to keep multiple lanes full. If you use this multi-buffer interface for just one chunk, it's probably going to take longer than it would if you calculated the hash the traditional way. So the challenge is to keep the lanes full: keeping enough parallelism, and enough chunks being submitted to your thread during hashing, to get maximum utilization of this parallelism.

This kind of problem is solved with a technique called multihash. It is a wrapper around the multi-buffer interface. You can give it a single buffer. It will break that buffer up into chunks and hash all the chunks in parallel for you. Then it will do one extra pass where it hashes the digests of all these parallel chunks.

So you don't lose anything. You have the same collision properties as SHA1, but the digest itself doesn't match the canonical output for that hash. It works fabulously if you need a synchronous interface and don't care whether or not the digest matches the canonical one.
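The idea of hashing chunks and then hashing their digests can be sketched in a few lines. This is an illustration of the concept only: toy_multihash and its four-way split are assumptions made for the example, the chunks are hashed one after another rather than in parallel lanes, and the output will not match what the ISA-L multihash functions produce.

```python
import hashlib

def toy_multihash(buffer: bytes, segments: int = 4) -> str:
    """Hash a single buffer as several chunks, then hash the chunk digests.

    Hypothetical layout for illustration; not the ISA-L multihash format.
    """
    size = max(1, -(-len(buffer) // segments))  # ceiling division
    chunks = [buffer[i:i + size] for i in range(0, len(buffer), size)] or [b""]
    # Each chunk could be hashed in its own lane; here they run one by one.
    chunk_digests = [hashlib.sha1(c).digest() for c in chunks]
    # The extra pass: hash the concatenated chunk digests.
    return hashlib.sha1(b"".join(chunk_digests)).hexdigest()

data = b"example payload " * 1024
multi = toy_multihash(data)
canonical = hashlib.sha1(data).hexdigest()

print("multihash-style:", multi)
print("canonical SHA1: ", canonical)
print("digests match:  ", multi == canonical)
```

The caller sees one synchronous call with one buffer in and one digest out, while the work inside is split into independent pieces that can fill the lanes. The price is that both sides must use the same scheme, because the result differs from plain SHA1 of the same data.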

Now let's talk about use cases. Deduplication is the obvious use case for this, but there are others too. There is data integrity, where you can compute these hashes so cheaply that you can use them as a substitute for CRC and so on.
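As a small illustration of the deduplication use case, the sketch below stores each chunk under its digest so identical chunks are kept only once. DedupStore is a hypothetical class written for this example with Python's hashlib; it is not part of ISA-L.

```python
import hashlib

class DedupStore:
    """Toy content-addressed store: identical chunks share one entry."""

    def __init__(self) -> None:
        self.blocks: dict[str, bytes] = {}  # digest -> chunk

    def put(self, chunk: bytes) -> str:
        """Store a chunk and return its digest, which acts as its key."""
        key = hashlib.sha256(chunk).hexdigest()
        # Only the first copy is stored; duplicates map to the same key.
        self.blocks.setdefault(key, chunk)
        return key

    def get(self, key: str) -> bytes:
        """Fetch a chunk; re-hashing it doubles as an integrity check."""
        chunk = self.blocks[key]
        if hashlib.sha256(chunk).hexdigest() != key:
            raise ValueError("stored chunk is corrupted")
        return chunk

store = DedupStore()
keys = [store.put(c) for c in (b"block A", b"block B", b"block A")]

print("chunks written:", len(keys))
print("chunks stored: ", len(store.blocks))
```

This only works safely because collisions are so unlikely: two different chunks mapping to the same key would silently lose data.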

Encryption is the other one. The reason these hashes exist is to sign cryptographically secure packets, and there are obvious implications there. In general, anywhere that data needs a fingerprint, you can use this hashing technique, especially if you can get at either large chunks that parallelize well, or many smaller chunks that you want to do in parallel.

Thanks for watching the video. Be sure to keep watching the Intel ISA-L playlist, where I will walk through an example of the Intel ISA-L implementation of multi-buffer hashing. Don't forget to like this video and subscribe to the Intel Software YouTube channel.