There are two motivations for improving the concurrent vector. The first is to satisfy the C++ standard library (STL) requirements for containers as far as is sensible.
The second is to reduce memory blow-up when there are thousands of concurrent_vector instances, each storing only a few items. We have two solutions to this problem: optimizing the container for storing fewer than 16 items, and allowing the allocator to be replaced with one that favors lower memory usage over eliminating false sharing.
The August 15 development snapshot contains only part of these changes. Its concurrent_vector supports a custom allocator via a template argument, so you can now change the allocator used for the vector's items. The default, cache_aligned_allocator, pads memory requests to a multiple of the cache line size, so for small vectors it might make sense to replace it with something more memory-efficient.
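To get a feel for what that padding costs for small vectors, here is a minimal sketch of the rounding arithmetic. It is not TBB code: the helper names padded_size and padding_overhead are hypothetical, and the 128-byte default line size is only an assumption, since the real value is platform-specific and the real allocator may add overhead of its own.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of TBB): round a request up to a whole
// number of cache lines. The default line_size of 128 is an assumption.
inline std::size_t padded_size(std::size_t bytes, std::size_t line_size = 128) {
    assert(line_size > 0);
    return (bytes + line_size - 1) / line_size * line_size;
}

// Bytes lost to padding when storing n items of type T in one request.
template <typename T>
std::size_t padding_overhead(std::size_t n, std::size_t line_size = 128) {
    const std::size_t requested = n * sizeof(T);
    return padded_size(requested, line_size) - requested;
}
```

Under these assumptions, an 8-byte request occupies a full 128-byte line, so most of the memory is padding. If false sharing is not a concern for your data, passing a plain allocator such as std::allocator as the second template argument of concurrent_vector should avoid that rounding.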
More changes are coming soon, including standard constructors, comparison operators, and other STL-like methods.