# Weird OpenMP Reduction

By Arch D. Robison (Intel), Updated

This is a computer translation of the original content. It is provided for general information only and should not be relied upon as complete or accurate.

Typical reductions in OpenMP* involve using a associative operator *op *to do local reductions, and then using a reduction clause to collect those local reductions. For example, the following code computes a dot product by computing local sums on each thread and then summing them.

// Returns dot product of two vectors int dot( int x[], int y[], int n ) { int sum = 0; #pragma omp parallel for reduction(+:sum) for( int i=0; i<n; ++i ) { sum += x[i]*y[i]; } return sum; }

The reduction clause specifies two things:

- When control enters the parallel region, each thread in the region gets a thread-private copy of sum, initialized to the identity element for +.
- When control leaves the parallel region, the original sum is updated by combining its value with the final values of thethread-private copies, using +. Since + is associative (or nearly so for floating-point), the final sum has the same value (or nearly so) as it would for serial execution of the code.

*without*using any addition operations inside the loop. It assumes two's complement arithmetic.

// Returns dot product of two vectors int dot( int x[], int y[], int n ) { int c = 0; int s = 0; #pragma omp parallel for reduction(+:c,s) for( int i=0; i<n; ++i ) { int p = x[i]*y[i]; int q = c^p; c &= p|s; c |= p&s; c <<= 1; s ^= q; } return c+s; }

Consider it a puzzle, not recommended practice. If you need a hint for why it works correctly, look up the datasheet for the classic TTL 74183 chip. [Last line edited 2015-4-1 because old link to 74183 datasheet broke.]

*Other names and brands may be claimed as the property of others