We just released a paper showing how to speed up a pair of independent functions or algorithms, such as a block cipher and a hash, that are often called sequentially on the same input buffer.
One can greatly improve the utilization of the underlying microarchitecture’s execution resources by combining two algorithms and computing them together at the same time; we call this “stitching”.
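To make the idea concrete, here is a minimal sketch of the shape of stitching in C. It is not the code from the paper: the cipher and hash are toy stand-ins (a xorshift keystream and FNV-1a, both chosen only for brevity), and every function name is hypothetical. The sequential version walks the input buffer twice, while the stitched version advances both independent dependency chains in a single loop, so an out-of-order core has more independent work available per iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins, NOT real ARC4 or MD5 (assumptions for illustration only). */
static inline uint32_t ks_next(uint32_t s) {   /* xorshift32 keystream step */
    s ^= s << 13;
    s ^= s >> 17;
    s ^= s << 5;
    return s;
}

static inline uint32_t fnv1a_step(uint32_t h, uint8_t b) { /* FNV-1a, 32-bit */
    return (h ^ b) * 16777619u;
}

/* Baseline: cipher pass, then hash pass, over the same input buffer.
 * key should be nonzero, since xorshift32 stays at zero forever. */
void cipher_and_hash_sequential(const uint8_t *in, uint8_t *out, size_t len,
                                uint32_t key, uint32_t *digest) {
    assert(len == 0 || (in != NULL && out != NULL));
    uint32_t s = key;
    for (size_t i = 0; i < len; i++) {
        s = ks_next(s);
        out[i] = in[i] ^ (uint8_t)s;
    }
    uint32_t h = 2166136261u;                  /* FNV offset basis */
    for (size_t i = 0; i < len; i++)
        h = fnv1a_step(h, in[i]);
    *digest = h;
}

/* Stitched: one pass; the cipher chain (s) and the hash chain (h) do not
 * depend on each other, so their operations can overlap in the pipeline. */
void cipher_and_hash_stitched(const uint8_t *in, uint8_t *out, size_t len,
                              uint32_t key, uint32_t *digest) {
    assert(len == 0 || (in != NULL && out != NULL));
    uint32_t s = key;
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        uint8_t b = in[i];                     /* load the input byte once */
        s = ks_next(s);                        /* cipher chain */
        h = fnv1a_step(h, b);                  /* hash chain */
        out[i] = b ^ (uint8_t)s;
    }
    *digest = h;
}
```

Both functions produce the same ciphertext and digest; only the scheduling differs. A C compiler is free to reorder or fuse these loops on its own, so this sketch shows the structure rather than a guaranteed speedup. Getting the most out of the technique likely means interleaving the two algorithms by hand at the instruction level, which a C sketch like this cannot capture.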
With this technique applied to cipher and hash pairs that are often called together, like ARC4 and MD5, or AES-128 and SHA-1, we were able to achieve significant performance improvements in the range of ~1.4X to ~1.9X.
Another highlight is that on Intel processors supporting AES-NI, like the recently released Intel Xeon® 5600 family, the modern and more secure AES-128 and AES-256 algorithms used with the SHA-1 hash no longer come with any performance penalty over the old ARC4-MD5 pair: when optimized and stitched, all the pairs are not only much faster overall, but also end up having about the same performance.
These results enable a notable performance improvement in secure web transaction processing. The paper is available here: http://download.intel.com/design/intarch/papers/323686.pdf
We plan on making the source code of our implementations available to the public sometime in the future. If you believe the implementations mentioned in the paper might benefit your business, please let us know!
Want a faster hash and block cipher? “Stitch” them!