matrix inverse FLOPS

Hi,

How many FLOPs are required to invert a 16x16 MKL_Complex8 matrix using cpotrf and then cpotri?

How many CPU clocks should it take on an Atom E3826 CPU and on an i5-3470 CPU?

Is there any performance difference between a 32-bit and a 64-bit Linux operating system (for those specific CPUs)?

Thanks, Nimrod

 

Hi Nimrod,

The approximate flop count for (S/D)POTRF is 1/3*N^3 and for (S/D)POTRI is 2/3*N^3; for the complex case these are multiplied by four.
More precise formulas for the complex case, which make sense for such a small size, are:

CPOTRF_FLOPS = 6 * N * (N * (N * 1./6. + .5) + 1./3.) + 2 * N * 1./6. * (N * N - 1.);

CPOTRI_FLOPS = 6 * N * (N * (N * 1./3. + 1.) + 2./3.) + 2 * N * (N * (N * 1./3. - .5) + 1./6.);
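
To put numbers on these formulas for N = 16, here is a minimal sketch in C. The function names are hypothetical and not part of MKL. It assumes the usual counting convention in which the factors 6 and 2 weight each complex multiplication as 6 real flops and each complex addition as 2 real flops, which is how the two formulas above appear to be built.

```c
#include <stdio.h>

/* Hypothetical helper, not an MKL function: flop count for CPOTRF
   (complex Cholesky factorization) from the precise formula above. */
double cpotrf_flops(double n)
{
    double muls = n * (n * (n * 1. / 6. + .5) + 1. / 3.); /* complex multiplications */
    double adds = n * 1. / 6. * (n * n - 1.);             /* complex additions */
    return 6. * muls + 2. * adds;
}

/* Hypothetical helper, not an MKL function: flop count for CPOTRI
   (inverse from the Cholesky factor) from the precise formula above. */
double cpotri_flops(double n)
{
    double muls = n * (n * (n * 1. / 3. + 1.) + 2. / 3.); /* complex multiplications */
    double adds = n * (n * (n * 1. / 3. - .5) + 1. / 6.); /* complex additions */
    return 6. * muls + 2. * adds;
}

/* Approximate count for factorization plus inversion in the complex case:
   4 * (1/3 + 2/3) * N^3. */
double approx_inverse_flops(double n)
{
    return 4. * (1. / 3. + 2. / 3.) * n * n * n;
}

/* Convenience printer for a given matrix order n. */
void print_flop_counts(double n)
{
    printf("N = %.0f: CPOTRF %.0f, CPOTRI %.0f, total %.0f (approx. %.0f)\n",
           n, cpotrf_flops(n), cpotri_flops(n),
           cpotrf_flops(n) + cpotri_flops(n), approx_inverse_flops(n));
}
```

For N = 16 the precise formulas give 6256 flops for CPOTRF and 12272 for CPOTRI, 18528 in total, compared with 16384 from the approximate formula. The lower-order terms therefore add roughly 13% at this size, which is why the precise version is worth using for small matrices.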

 

Usually there is a performance difference between 32-bit and 64-bit code, which comes from the richer register set of the Intel 64 architecture and other improvements in the x86-64 Application Binary Interface (ABI).

Unfortunately, I don't have clock counts for these functions.

 

W.B.R., Alexander
 
