| November 19, 2009 1:00 PM PST | |
Abstract
In this article, Anatoliy Kuznetsov answers the questions and tells us about the open BitMagic C++ Library.
Introduction
In my regular browsing through 64-bit programming related websites, I often came across references to BitMagic C++ Library and realized that it had benefited a lot from using 64-bits. I decided to contact the author asking him to give us an interview about his research and developments.
Questions are asked by: Andrey Karpov - in-house developer at OOO "Program Verification Systems currently" working on PVS-Studio tool designed for verification modern C++ applications.
The answers are given by: Anatoliy Kuznetsov - chief software engineer at NCBI; developer of the open source BitMagic C++ Library.
Anatoliy, please tell us a few words about yourself. What projects are you involved in?
I am a chief software engineer, at present I am working for the bio-molecular data discovery and visualization team at NCBI (National Center for Biotechnology Information). Apart from my primary job, I am the chief developer and architect of the open-source BitMagic C++ Library.
I am a planning engineer by education, a graduate of the Lobachevskiy University of Nizhniy Novgorod.
What is BitMagic?
BitMagic has been developed as a universal template library for processing compressed bit vectors. The library solves several tasks:
It provides a bit container which is really compatible with STL in terms of ideology. It means that such container must support iterators, memory allocators and interact with algorithms and other STL containers.
The library can efficiently operate on very long and sparse vectors.
It enables the serialization vectors for further writing to a database or sending them over a network.
A developer is provided with a set of algorithms to implement set-theory operations and calculate distances and similarity metrics in multidimensional binary spaces.
Much consideration is given to optimization for popular fast calculation systems such as SSE.
What tasks addressed by BitMagic make it so attractive for developers?
The library turned out to be fairly a one-stop solution and apparently it won’t be easy to list all its possible applications. At present, the library offers more compelling results in the following spheres:
Bit and inverted indexes building for full-text search systems, acceleration of relational algebra operations (AND, OR, JOIN etc).
Development of non-standard extensions and indexes for existing databases (Oracle Cartridges, MS SQL extended stored procedures). Generally, such extensions help integrate scientific, geographic and other non-standard data into the database.
Data mining algorithms development.
In-memory indexes and databases development.
Development of precise access isolation systems with a large number of objects (security enhanced databases with the isolation of access to specific fields and columns).
Task management systems (for the computation cluster), real-time task state tracking systems, and the storage of task states described as Finite State Machines.
Tasks related to the representation and storage of strongly connected graphs.
What can you tell about the history of BitMagic development? What motivated you to create it?
For a long time my colleagues and I had worked on tasks related to large databases, analysis and visualization systems. The initial working version demonstrating bit-vector capabilities was produced by Maxim Shemanaryov (he is the developer of Antigrain Geometry, a wonderful 2D vector graphics library: http://www.antigrain.com). Then, some ideas for an equivalent representation of data sets were described by Koen Van Damm, an engineer from Europe, who worked on programming language parsers for the verification of complex systems. There were other sources as well. I decided to bring them all together in the form of a library suitable for repeated use in various projects.
What license BitMagic is distributed under? Where I can download it?
The library is free for commercial and non-commercial use and is available in the form of source texts. The only limitation is the the requirement to give credit to the library and its authors if you are using it in a released product.
You can check out all the materials here: http://bmagic.sourceforge.net.
Am I right saying that BitMagic has significant advantage if compiled as a 64-bit version?
Yes indeed, the library uses a series of optimization methods to accelerate processing in 64-bit systems or SIMD-enabled systems (128-bit SSE2).
Here are some factors leading to a faster execution of algorithms:
a wide machine word (logical operations are performed over a wide word);
the programmer (and the compiler) has access to additional registers and the lack of registers is not so critical (this an inherited disadvantage of x86 architecture);
memory alignment often makes the operation faster (128-bit alignment of addresses provides good results);
and of course it’s possible to place more objects and data for processing in a program’s memory. That’s a great benefit of the 64-bit version, which is clear to everyone.
At the moment the fastest method is to use 128-bit SSE2 optimizations with a 64-bit program. This mode combines the double number of x86 registers with a wide machine word to perform logical operations.
64-bit systems and programs are going through a real Renaissance. The migration to 64-bits will be faster than the move from 16 to 32. The launch of 64-bit Windows versions in the mainstream market and the availability of related tools (like the one your company is developing) will stimulate this process. In the environment of increasingly complex systems and larger amounts of code, such tools as PVS-Studio will be of great practical utility as they help reduce labor costs and the time to market.
Please tell us about the compression methods used in BitMagic.
The current 3.6.0 version of the library uses several compression methods.
"Bitvectors" in memory are split into blocks. When a block is not occupied or is occupied fully, it won’t be allocated. That is, the programmer can set bits in a wide range very far from zero. Thus setting a 100,000,000 bit won’t result in an exploded memory use which is often common to vectors with a two-dimensional linear model.
Blocks in memory can have an equivalent representation in the form of areas – the so-called gaps. Actually that’s a sort of RLE coding. Unlike RLE, our library doesn't lose the capability to execute logical operations or access random bits.
When serializing "bitvectors", a set of different methods is used: conversion to lists of integer numbers (representing nulls or ones) and list coding by the Elias Gamma Coding method. When implementing these methods, we do lose the random bit access capability but it is not so critical for disk writes if compared with reduced storage and input-output overheads.
Could you give some code examples demonstrating possible uses of BitMagic?
The first example simply creates 2 vectors, initializes them and performs the logical operation AND. Further, the enumerator class is used for the iteration and printing of values saved in the vector.
#include <iostream>
#include "bm.h"
using namespace std;
int main(void)
{
bm::bvector<> bv;
bv[10] = true; bv[100] = true; bv[10000] = true;
bm::bvector<> bv2(bv);
bv2[10000] = false;
bv &= bv2;
bm::bvector<>::enumerator en = bv.first();
bm::bvector<>::enumerator en_end = bv.end();
for (; en < en_end; ++en) {
cout << *en << endl;
}
return 0;
}
The next example demonstrates vector serialization and the use of compression.
#include <stdlib.h>
#include <iostream>
#include "bm.h"
#include "bmserial.h"
using namespace std;
// This procedure creates very dense bitvector.
// The resulting set will consists mostly from ON (1) bits
// interrupted with small gaps of 0 bits.
//
void fill_bvector(bm::bvector<>* bv)
{
for (unsigned i = 0; i < MAX_VALUE; ++i) {
if (rand() % 2500) {
bv->set_bit(i);
}
}
}
void print_statistics(const bm::bvector<>& bv)
{
bm::bvector<>::statistics st;
bv.calc_stat(&st);
cout << "Bits count:" << bv.count() << endl;
cout << "Bit blocks:" << st.bit_blocks << endl;
cout << "GAP blocks:" << st.gap_blocks << endl;
cout << "Memory used:"<< st.memory_used << endl;
cout << "Max.serialize mem.:" <<
st.max_serialize_mem << endl << endl;;
}
unsigned char* serialize_bvector(
bm::serializer<bm::bvector<> >& bvs,
bm::bvector<>& bv)
{
// It is reccomended to optimize
// vector before serialization.
bv.optimize();
bm::bvector<>::statistics st;
bv.calc_stat(&st);
cout << "Bits count:" << bv.count() << endl;
cout << "Bit blocks:" << st.bit_blocks << endl;
cout << "GAP blocks:" << st.gap_blocks << endl;
cout << "Memory used:"<< st.memory_used << endl;
cout << "Max.serialize mem.:" <<
st.max_serialize_mem << endl;
// Allocate serialization buffer.
unsigned char* buf =
new unsigned char[st.max_serialize_mem];
// Serialization to memory.
unsigned len = bvs.serialize(bv, buf, 0);
cout << "Serialized size:" << len << endl << endl;
return buf;
}
int main(void)
{
bm::bvector<> bv1;
bm::bvector<> bv2;
// set DGAP compression mode ON
bv2.set_new_blocks_strat(bm::BM_GAP);
fill_bvector(&bv1);
fill_bvector(&bv2);
// Prepare a serializer class
// for best performance it is best
// to create serilizer once and reuse it
// (saves a lot of memory allocations)
//
bm::serializer<bm::bvector<> > bvs;
// next settings provide lowest serilized size
bvs.byte_order_serialization(false);
bvs.gap_length_serialization(false);
bvs.set_compression_level(4);
unsigned char* buf1 = serialize_bvector(bvs, bv1);
unsigned char* buf2 = serialize_bvector(bvs, bv2);
// Serialized bvectors (buf1 and buf2) now ready to be
// saved to a database, file or send over a network.
// ...
// Deserialization.
bm::bvector<> bv3;
// As a result of desrialization bv3
// will contain all bits from
// bv1 and bv3:
// bv3 = bv1 OR bv2
bm::deserialize(bv3, buf1);
bm::deserialize(bv3, buf2);
print_statistics(bv3);
// After a complex operation
// we can try to optimize bv3.
bv3.optimize();
print_statistics(bv3);
delete [] buf1;
delete [] buf2;
return 0;
}
What are your plans for BitMagic further development?
We want to implement some new vector compression methods with parallel data procession capability.
In view of the mass expansion of Intel Core i5-i7-i9, it would be reasonable to release a SSE 4.2 version of the library. Intel has added some interesting features that can be used very efficiently. The most promising feature is hardware support for bit number calculation (Population Count).
We are experimenting with nVidia CUDA and other GPGPUs. Today graphic cards enable you to perform integer and logical operations - and their resources can be harnessed for algorithms working on data sets and compression.
References
- Elias Gamma encoding of bit-vector Delta gaps (D-Gaps). http://bmagic.sourceforge.net/dGap-gamma.html
- Hierarchical Compression. http://bmagic.sourceforge.net/hCompression.html
- D-Gap Compression. http://bmagic.sourceforge.net/dGap.html
- 64-bit Programming And Optimization. http://bmagic.sourceforge.net/bm64opt.html
- Optimization of memory allocations. http://bmagic.sourceforge.net/memalloc.html
- Bitvector as a container. http://bmagic.sourceforge.net/enum.html
- 128-bit SSE2 optimization. http://bmagic.sourceforge.net/bmsse2opt.html
- Using BM library in memory saving mode. http://bmagic.sourceforge.net/memsave.html
- Efficient distance metrics. http://bmagic.sourceforge.net/distopt.html
For more complete information about compiler optimizations, see our Optimization Notice.
Comments (0) 
Trackbacks (0)
Leave a comment 
Andrey Karpov
|

