## Billions of Integers per Second with SIMD

Came across some good work from **Daniel Lemire**, a CS professor at the University of Quebec, on highly efficient compression, encoding and decoding of integers. Full details here:

*Decoding billions of integers per second through vectorization*

It's a very nice use of *vectorisation* via the Streaming SIMD Extensions (SSE2/3/4) found in modern chipsets. Their SIMD-BP128 scheme uses vectorisation both for the bit packing performed during compress/decompress operations and for the delta encoding phase, where the differences between consecutive integers are stored instead of the integers themselves, with those differences calculated in a vectorised manner. Their scheme is **not patented**, and they have posted their code on GitHub, with C, C++ and Java libraries available.
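To give a feel for the vectorised delta step, here is a simplified sketch (my own illustration, not the authors' SIMD-BP128 code): SSE2 intrinsics can compute four differences at once by splicing the previous vector's last lane in front of the current vector before a single packed subtraction.

```cpp
#include <emmintrin.h> // SSE2 intrinsics
#include <cstddef>
#include <cstdint>

// Differences of four packed 32-bit integers at once. Each lane needs
// the element just before it; the predecessor of lane 0 is the last
// lane of the previous vector, spliced in with SSE2 byte shifts.
static inline __m128i delta4(__m128i curr, __m128i prev) {
    __m128i shifted = _mm_or_si128(_mm_slli_si128(curr, 4),   // [0, c0, c1, c2]
                                   _mm_srli_si128(prev, 12)); // [p3, 0, 0, 0]
    return _mm_sub_epi32(curr, shifted);
}

// Delta-encode n integers (n a multiple of 4); the first value is
// stored as-is, i.e. as its difference from an implicit zero.
void delta_encode(const uint32_t *in, uint32_t *out, size_t n) {
    __m128i prev = _mm_setzero_si128();
    for (size_t i = 0; i < n; i += 4) {
        __m128i curr = _mm_loadu_si128((const __m128i *)(in + i));
        _mm_storeu_si128((__m128i *)(out + i), delta4(curr, prev));
        prev = curr;
    }
}
```

On the input 100, 103, 107, 110, 115, 121, 128, 130 this produces 100 followed by the small gaps 3, 4, 3, 5, 6, 7, 2, which is exactly what makes the subsequent bit packing pay off.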

For the specific case of sorted integers, they have published a separate paper, given below, that also uses the available SIMD instructions to boost compression speed. They claim their S4-BP128-D4 scheme "uses as little as 0.7 CPU cycles per decoded integer while still providing state-of-the-art compression".

*SIMD Compression and the Intersection of Sorted Integers*

Conveniently, they provide the code for it on GitHub: https://github.com/lemire/SIMDCompressionAndIntersection.

In finance, most market data is effectively an append-only, immutable time series where every record has a new timestamp, and timestamps are just long offsets from a time epoch (such as midnight on 1/1/1900). Thus, using columnar storage, we get a "time" column that is a vector of non-decreasing integers/longs. These are very amenable to delta encoding or some more specialised form of sorted-integer compression.
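A quick sketch of why this works so well on a timestamp column (illustrative only; the tick values here are made up): the gaps between successive timestamps are tiny compared to the timestamps themselves, so after delta encoding each entry needs only a few bits.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Successive differences of a non-decreasing timestamp column.
// The first timestamp is kept separately as a base value, so the
// stored column is just the (small, non-negative) gaps between ticks.
std::vector<uint64_t> deltas(const std::vector<uint64_t> &ts) {
    std::vector<uint64_t> d;
    d.reserve(ts.size() - 1);
    for (size_t i = 1; i < ts.size(); ++i)
        d.push_back(ts[i] - ts[i - 1]);
    return d;
}

// Bit width of the largest delta: every delta in the column can be
// bit-packed into this many bits instead of a full 64-bit long.
unsigned bits_needed(const std::vector<uint64_t> &d) {
    uint64_t m = *std::max_element(d.begin(), d.end());
    unsigned b = 1;
    while (m >>= 1) ++b;
    return b;
}
```

For millisecond ticks arriving a few milliseconds apart, `bits_needed` comes out in the single digits, versus 64 bits for the raw longs; equal timestamps (bursts of trades) simply become zero deltas.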

*20 Jun 2016*
*Damien Wintour*