Developer Reference

  • 2020
  • 10/21/2020
  • Public Content

Burrows-Wheeler Transform

The Burrows-Wheeler Transform (BWT) does not compress data itself, but it simplifies the structure of the input data and makes subsequent compression more effective. One distinctive feature of this method is that it operates on a block of data (as a rule, 512 KB to 2 MB in size). The main idea of the method is block sorting, which groups symbols with a similar context. Consider how BWT works on the input data block 'abracadabra'. The first step is to create a matrix containing all cyclic shifts (rotations) of the block. The first row is the input string, the second is created by shifting it to the left by one symbol, and so on:
abracadabra
bracadabraa
racadabraab
acadabraabr
cadabraabra
adabraabrac
dabraabraca
abraabracad
braabracada
raabracadab
aabracadabr
Then all rows are sorted in lexicographic order and numbered starting from 0:
0   aabracadabr
1   abraabracad
2   abracadabra
3   acadabraabr
4   adabraabrac
5   braabracada
6   bracadabraa
7   cadabraabra
8   dabraabraca
9   raabracadab
10  racadabraab
The last step is to write out the last column and the row index of the input string in the sorted matrix:
rdarcaaaabb, 2
This is the result of the forward BWT.
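The three steps above can be sketched directly in C. This is a naive illustration only, not the implementation used by any particular library: it sorts rotation offsets with `qsort` instead of materializing the matrix, and its worst-case cost grows roughly as n² log n, which is impractical for blocks of the sizes mentioned earlier. The name `bwt_forward_naive` and the file-scope comparator state are assumptions made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Comparator context. Standard qsort has no user-data argument, so this
   sketch keeps the block in file-scope variables (not thread-safe). */
static const unsigned char *g_src;
static size_t g_len;

/* Compare two cyclic rotations, each identified by its starting offset. */
static int cmp_rotations(const void *pa, const void *pb)
{
    size_t a = *(const size_t *)pa;
    size_t b = *(const size_t *)pb;
    for (size_t k = 0; k < g_len; k++) {
        unsigned char ca = g_src[(a + k) % g_len];
        unsigned char cb = g_src[(b + k) % g_len];
        if (ca != cb)
            return (ca < cb) ? -1 : 1;
    }
    /* Identical rotations: order by offset so the result is deterministic. */
    return (a > b) - (a < b);
}

/* Hypothetical helper: writes the last column of the sorted rotation matrix
   to dst (len bytes, not NUL-terminated) and returns the row index of the
   original string, or -1 if memory allocation fails. */
int bwt_forward_naive(const unsigned char *src, size_t len, unsigned char *dst)
{
    assert(src != NULL && dst != NULL && len > 0);

    size_t *rot = malloc(len * sizeof *rot);
    if (rot == NULL)
        return -1;

    for (size_t i = 0; i < len; i++)
        rot[i] = i;                      /* rotation i starts at offset i */

    g_src = src;
    g_len = len;
    qsort(rot, len, sizeof *rot, cmp_rotations);

    int index = -1;
    for (size_t i = 0; i < len; i++) {
        /* The last symbol of a rotation is the one cyclically preceding
           its starting offset. */
        dst[i] = src[(rot[i] + len - 1) % len];
        if (rot[i] == 0)
            index = (int)i;              /* row holding the original string */
    }

    free(rot);
    return index;
}
```

Called on the 11-byte block 'abracadabra', this sketch should produce the last column and index shown in the example above.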
The inverse BWT is performed as follows. The elements of the transformed string (the input of the inverse transform) are numbered in ascending order:
0   r
1   d
2   a
3   r
4   c
5   a
6   a
7   a
8   a
9   b
10  b
and sorted in lexicographic order by symbol, with equal symbols keeping their original relative order (a stable sort):
2   a
5   a
6   a
7   a
8   a
9   b
10  b
4   c
1   d
0   r
3   r
The resulting array of indexes is the vector of the inverse transform (Inv). The string is then reconstructed in the following manner:
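const char src[] = "rdarcaaaabb";
const int  Inv[] = {2,5,6,7,8,9,10,4,1,0,3};
const int  len   = 11;
char dst[12];
int  index = 2; // index of the initial string is known from the forward BWT
for (int i = 0; i < len; i++) {
    index  = Inv[index];  // follow the inverse transform vector
    dst[i] = src[index];  // emit the next symbol of the original string
}
dst[len] = '\0';          // dst now holds "abracadabra"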
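The fragment above takes Inv as given. A minimal sketch of how Inv could be built and then used is shown below, assuming 8-bit symbols and using a counting sort, which is stable by construction and so reproduces the ordering of equal symbols shown in the example. The function name `bwt_inverse_sketch` and its signature are hypothetical and are not part of any library API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: reconstructs the original block from the last column
   (src) and the row index produced by the forward transform. Writes len
   bytes to dst (not NUL-terminated). Returns 0 on success, -1 if memory
   allocation fails. */
int bwt_inverse_sketch(const unsigned char *src, size_t len,
                       size_t index, unsigned char *dst)
{
    assert(src != NULL && dst != NULL && len > 0 && index < len);

    size_t *inv = malloc(len * sizeof *inv);
    if (inv == NULL)
        return -1;

    /* Count occurrences of each symbol value. */
    size_t start[256] = {0};
    for (size_t i = 0; i < len; i++)
        start[src[i]]++;

    /* Turn counts into the first sorted position of each symbol value. */
    size_t sum = 0;
    for (int c = 0; c < 256; c++) {
        size_t count = start[c];
        start[c] = sum;
        sum += count;
    }

    /* Stable placement: positions of equal symbols stay in ascending order,
       so inv[k] is the position in src of the k-th smallest symbol. */
    for (size_t i = 0; i < len; i++)
        inv[start[src[i]]++] = i;

    /* Same reconstruction loop as in the fragment above. */
    for (size_t i = 0; i < len; i++) {
        index = inv[index];
        dst[i] = src[index];
    }

    free(inv);
    return 0;
}
```

For the example data, the sketch builds the same vector {2,5,6,7,8,9,10,4,1,0,3} before running the reconstruction loop. Counting sort is used here because it runs in time linear in the block length and needs no comparator; any other stable sort over the symbols would give the same Inv.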
