SSE2

Absolute-Difference Motion Estimation for Intel® Pentium® 4 Processors

Introduction

The media extensions to the Intel Architecture (IA) instruction set include single-instruction, multiple-data (SIMD) instructions. Streaming SIMD Extensions 2 (SSE2) extends SIMD for the Intel NetBurst® microarchitecture with 144 new instructions. This paper describes an SSE2 implementation that calculates the absolute difference between two 16x16 blocks of pixels and compares its performance with non-SSE2 implementations. This SSE2 solution can be an integral part of a motion-estimation kernel.
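As a rough illustration of the computation described above, the following C sketch sums the absolute differences over a 16x16 block with the SSE2 `_mm_sad_epu8` intrinsic (the PSADBW instruction). It is not the paper's implementation: the function names, the `stride` parameter, the choice of unaligned loads, and the reading of "absolute difference" as a sum of absolute differences are all assumptions made for illustration. It also assumes an x86 or x86-64 compiler with SSE2 enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Scalar reference: sum of absolute differences over a 16x16 block.
 * `stride` (hypothetical parameter) is the byte distance between rows. */
uint32_t sad16x16_scalar(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    uint32_t sum = 0;
    for (int y = 0; y < 16; ++y) {
        for (int x = 0; x < 16; ++x) {
            int d = (int)cur[y * stride + x] - (int)ref[y * stride + x];
            sum += (uint32_t)(d < 0 ? -d : d);
        }
    }
    return sum;
}

/* SSE2 sketch: one 16-byte row per iteration. */
uint32_t sad16x16_sse2(const uint8_t *cur, const uint8_t *ref, size_t stride)
{
    assert(cur != NULL && ref != NULL);
    __m128i acc = _mm_setzero_si128();
    for (int y = 0; y < 16; ++y) {
        /* Unaligned loads, because block positions are arbitrary here. */
        __m128i a = _mm_loadu_si128((const __m128i *)(cur + y * stride));
        __m128i b = _mm_loadu_si128((const __m128i *)(ref + y * stride));
        /* _mm_sad_epu8 yields two partial sums, one per 64-bit lane. */
        acc = _mm_add_epi64(acc, _mm_sad_epu8(a, b));
    }
    /* Fold the high lane onto the low lane and extract the total. */
    __m128i hi = _mm_srli_si128(acc, 8);
    return (uint32_t)_mm_cvtsi128_si32(_mm_add_epi64(acc, hi));
}
```

A motion-estimation search would typically call such a routine once per candidate block position and keep the position with the smallest sum.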

  • pentium4
  • pentium
  • SSE2
  • Intel® Pentium® Processors

Reducing the Impact of Misaligned Memory Accesses


    Introduction

    Misalignment of memory access is a problem commonly encountered when optimizing code with Streaming SIMD Extensions 2 (SSE2). An SSE2 algorithm often requires loading and storing data 16 bytes at a time to match the size of the XMM registers. If alignment cannot be guaranteed, some part of the performance gain achieved by processing multiple data elements in parallel will be lost because either the compiler or assembly programmer must use unaligned move instructions.
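The distinction between aligned and unaligned moves can be sketched in C with intrinsics. The example below is an assumption-laden illustration, not code from the article: `is_aligned16` and `add_u8_sat` are hypothetical names, and the saturating byte add is only a stand-in workload. It takes the aligned path (`_mm_load_si128` / `_mm_store_si128`) when every pointer is 16-byte aligned and falls back to the unaligned forms otherwise.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Returns nonzero when p sits on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* dst[i] = saturate(a[i] + b[i]) for n bytes; n must be a multiple of 16. */
void add_u8_sat(uint8_t *dst, const uint8_t *a, const uint8_t *b, size_t n)
{
    assert(n % 16 == 0);
    if (is_aligned16(dst) && is_aligned16(a) && is_aligned16(b)) {
        /* Aligned path: these intrinsics require 16-byte aligned addresses. */
        for (size_t i = 0; i < n; i += 16) {
            __m128i va = _mm_load_si128((const __m128i *)(a + i));
            __m128i vb = _mm_load_si128((const __m128i *)(b + i));
            _mm_store_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
        }
    } else {
        /* Fallback: unaligned moves accept any address. */
        for (size_t i = 0; i < n; i += 16) {
            __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
            __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));
            _mm_storeu_si128((__m128i *)(dst + i), _mm_adds_epu8(va, vb));
        }
    }
}
```

Where the caller controls allocation, arranging for 16-byte aligned buffers (for example by declaring them as arrays of `__m128i`) lets the aligned path be taken every time.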

  • Developers
  • Intel® Streaming SIMD Extensions
  • SSE2
  • Graphics
  • Intel® Pentium® Processors

How to Vectorize Code on 32-Bit Intel® Architecture


    Challenge

    Vectorize code for greater performance. The SIMD features of Streaming SIMD Extensions (SSE), Streaming SIMD Extensions 2 (SSE2) and MMX™ technology require new methods of coding algorithms. One of them is vectorization. Vectorization is the process of transforming sequentially executing, or scalar, code into code that can execute in parallel, taking advantage of the SIMD architecture parallelism.
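To make the scalar-to-SIMD transformation concrete, here is a minimal C sketch that assumes a simple element-wise float addition as the workload. The function names `add_scalar` and `add_sse` are hypothetical and do not come from the article. The vectorized version handles four floats per iteration with SSE intrinsics and finishes any remaining elements with a scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics */

/* Scalar version: one element per iteration. */
void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

/* Vectorized version: four floats per iteration, then a scalar tail. */
void add_sse(float *dst, const float *a, const float *b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads and stores keep the sketch free of alignment rules. */
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)  /* leftover elements when n is not a multiple of 4 */
        dst[i] = a[i] + b[i];
}
```

Both functions should produce identical results for this workload, since each output element is a single IEEE addition in either form.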

  • SSE2
  • Intel® Pentium® Processors

x87 and SSE Floating Point Assists in IA-32: Flush-To-Zero (FTZ) and Denormals-Are-Zero (DAZ)

    Introduction

    This document details the difference between how assists are handled with x87 and Single Instruction Multiple Data (SIMD) instructions, and gives information on how to change their behavior when using Streaming SIMD Extensions (SSE) and SSE2.
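One way to change this behavior from C is to set the FTZ and DAZ bits in the MXCSR control register. The sketch below is an illustration under stated assumptions rather than the document's own code: the helper names are hypothetical, the bit positions (FTZ is bit 15, DAZ is bit 6) are those of the MXCSR layout, and it assumes a processor that supports DAZ, since writing an unsupported MXCSR bit raises a fault.

```c
#include <assert.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, SSE scalar intrinsics */

#define MXCSR_FTZ 0x8000u  /* bit 15: flush denormal results to zero */
#define MXCSR_DAZ 0x0040u  /* bit 6: treat denormal inputs as zero   */

/* Turns on FTZ and DAZ; returns the previous MXCSR value for restoring.
 * Assumes the CPU supports DAZ (hypothetical helper, not a library API). */
unsigned int enable_ftz_daz(void)
{
    unsigned int old = _mm_getcsr();
    _mm_setcsr(old | MXCSR_FTZ | MXCSR_DAZ);
    assert((_mm_getcsr() & (MXCSR_FTZ | MXCSR_DAZ)) == (MXCSR_FTZ | MXCSR_DAZ));
    return old;
}

/* Restores an MXCSR value saved earlier. */
void restore_mxcsr(unsigned int saved)
{
    _mm_setcsr(saved);
}

/* Multiplies with an SSE instruction so that MXCSR settings apply.
 * The volatile copies keep the compiler from folding the product. */
float sse_mul(float a, float b)
{
    volatile float va = a, vb = b;
    __m128 r = _mm_mul_ss(_mm_set_ss(va), _mm_set_ss(vb));
    return _mm_cvtss_f32(r);
}
```

With both bits set, a product such as `sse_mul(1.5e-38f, 0.5f)`, which would otherwise be a denormal, is returned as zero, and denormal inputs are read as zero. MXCSR is per-thread state, so saving and restoring it around the affected code keeps the change local.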

  • Developers
  • Intermediate
  • Intel® Streaming SIMD Extensions
  • SSE2
  • SSE
  • simd
  • Intel® Pentium® Processors