Using SSE2 to Evaluate a Hidden Markov Model with Viterbi Decoding


Introduction

The Streaming SIMD Extensions 2 (SSE2) technology introduces new Single Instruction Multiple Data (SIMD) double-precision floating-point instructions and new SIMD integer instructions into the IA-32 Intel® architecture. The double-precision SIMD instructions extend functionality in a manner analogous to the single-precision instructions introduced with the Streaming SIMD Extensions (SSE). The 128-bit SIMD integer extensions are a full superset of the 64-bit integer SIMD instructions, with additional instructions to support more integer data types, conversion between integer and floating-point data types, and efficient operations between the caches and system memory. These instructions provide a means to accelerate operations typical of 3D graphics, real-time physics, spatial (3D) audio, video encoding/decoding, encryption, and scientific applications. The 128-bit integer SIMD extensions in SSE2 technology process data 128 bits at a time using the XMM registers, so important algorithms, such as evaluating a Hidden Markov Model, can be made faster than earlier implementations based on MMX™ technology and SSE. This application note (AP-946) contains both the code and a description of how the SSE2 instructions can be used to implement a Viterbi algorithm to evaluate a Hidden Markov Model.

Application note AP-569, entitled Using MMX™ Instructions to Implement Viterbi Decoding, describes how to use the MMX instructions to gain a 2x improvement over scalar code. Another application note, AP-811, entitled Using the Streaming SIMD Extensions to Evaluate a Hidden Markov Model with Viterbi Decoding, shows how using the SSE instructions and operating on four data elements at a time (doubling the SIMD width) can further increase the performance gain. This application note describes how the SSE2 instructions provide a significant performance gain over the implementation that uses the SSE instructions.
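
To make the idea of a SIMD Viterbi step concrete, the following is a minimal sketch in C using SSE2 intrinsics. It is not the code from w_hmm.zip or from AP-946. It assumes a hypothetical 8-state model whose log-domain scores are stored as 16-bit integers, so that one XMM register holds the scores of all eight destination states. The names N_STATES, sat16, viterbi_step_scalar, and viterbi_step_sse2 are illustrative only. Both functions compute next[j] = max over i of (prev[i] + trans[i][j]) + emit[j] with saturating arithmetic, and the code assumes an x86 target with SSE2 enabled.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

#define N_STATES 8  /* assumption: eight int16 scores fill one 128-bit XMM register */

/* Clamp a 32-bit value to the int16 range, mirroring saturating SIMD adds. */
static int16_t sat16(int v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Scalar reference for one Viterbi time step (log domain, max-sum). */
static void viterbi_step_scalar(const int16_t prev[N_STATES],
                                const int16_t trans[N_STATES][N_STATES],
                                const int16_t emit[N_STATES],
                                int16_t next[N_STATES])
{
    for (int j = 0; j < N_STATES; j++) {
        int16_t best = INT16_MIN;
        for (int i = 0; i < N_STATES; i++) {
            int16_t s = sat16(prev[i] + trans[i][j]);
            if (s > best) best = s;
        }
        next[j] = sat16(best + emit[j]);
    }
}

/* SSE2 version: all eight destination states j are updated in parallel. */
static void viterbi_step_sse2(const int16_t prev[N_STATES],
                              const int16_t trans[N_STATES][N_STATES],
                              const int16_t emit[N_STATES],
                              int16_t next[N_STATES])
{
    __m128i best = _mm_set1_epi16(INT16_MIN);
    for (int i = 0; i < N_STATES; i++) {
        /* Broadcast the score of source state i to all eight lanes. */
        __m128i from_i = _mm_set1_epi16(prev[i]);
        /* Row i holds the transition scores from state i to states 0..7. */
        __m128i row = _mm_loadu_si128((const __m128i *)trans[i]);
        /* Saturating add, then keep the per-lane maximum so far. */
        best = _mm_max_epi16(best, _mm_adds_epi16(from_i, row));
    }
    /* Add the emission score of the current observation for each state. */
    best = _mm_adds_epi16(best, _mm_loadu_si128((const __m128i *)emit));
    _mm_storeu_si128((__m128i *)next, best);
}
```

Calling viterbi_step_sse2 once per observation, with emit set to that observation's emission scores, advances the whole score vector by one time step. Recovering the best state sequence would additionally require recording the winning source state per lane, which this sketch omits.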


Download Code Samples

Download w_hmm.zip

Download KernelTemplate.zip. This library is required to compile the application.


View entire article (PDF 162KB)



Comments

noeticusnaver.com

I've tried to download w_hmm.zip but I can't download this file.
How can I do it?

pavelaalexeev

I cannot download w_hmm.zip either...