Macro Function for Matrix Transposition

Intel® Streaming SIMD Extensions (Intel® SSE) provide the following macro function to transpose a 4-by-4 matrix of single-precision floating-point values.

_MM_TRANSPOSE4_PS(row0, row1, row2, row3)

The arguments row0, row1, row2, and row3 are __m128 values whose elements form the corresponding rows of a 4-by-4 matrix. The macro overwrites these arguments with the transposed matrix, so they must be modifiable variables: after the call, row0 holds column 0 of the original matrix, row1 holds column 1 of the original matrix, and so on.
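As a minimal sketch of how the macro might be used, the following C function transposes a row-major array of 16 floats in place. The helper name transpose4x4 and the row-major layout are assumptions made for illustration; they are not part of the Intel SSE interface. Unaligned loads and stores are used so that the array needs no particular alignment.

```c
#include <xmmintrin.h>  /* Intel SSE intrinsics, including _MM_TRANSPOSE4_PS */

/* Hypothetical helper (not part of Intel SSE): transposes a 4-by-4 matrix
   stored in row-major order in m[0..15], in place. */
static void transpose4x4(float *m)
{
    /* Load each row into its own __m128 variable. */
    __m128 row0 = _mm_loadu_ps(m + 0);
    __m128 row1 = _mm_loadu_ps(m + 4);
    __m128 row2 = _mm_loadu_ps(m + 8);
    __m128 row3 = _mm_loadu_ps(m + 12);

    /* The macro overwrites its four arguments with the transposed rows. */
    _MM_TRANSPOSE4_PS(row0, row1, row2, row3);

    /* row0..row3 now hold columns 0..3 of the original matrix. */
    _mm_storeu_ps(m + 0, row0);
    _mm_storeu_ps(m + 4, row1);
    _mm_storeu_ps(m + 8, row2);
    _mm_storeu_ps(m + 12, row3);
}
```

For example, if m initially holds the values 0 through 15 in order, then after transpose4x4(m) the first four elements are 0, 4, 8, and 12, which is column 0 of the original matrix.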

The figure below illustrates the transposition performed by this macro.

Matrix Transposition Using _MM_TRANSPOSE4_PS Macro
