This is a question about optimizing the summation of two large arrays.
There are two arrays of type double, and the code for this problem is as follows:
#pragma omp parallel for
for (long i = 0; i < 5000000; i++)
    array1[i] += array2[i];
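To make the measurement reproducible, the loop above can be wrapped in a small self-contained sketch with a timer around it. The names add_in_place and time_add_ms, the use of std::vector, and the std::chrono timer are assumptions made for illustration only (std::chrono needs a C++11 compiler, so it is newer than VC++ 2008). The pragma only takes effect when OpenMP is enabled in the compiler options.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Adds array2 into array1 element by element (array1[i] += array2[i]).
// The pragma is ignored unless OpenMP is enabled at compile time.
void add_in_place(std::vector<double>& array1, const std::vector<double>& array2)
{
    assert(array1.size() == array2.size());
    const long n = static_cast<long>(array1.size());
    #pragma omp parallel for
    for (long i = 0; i < n; i++)
        array1[i] += array2[i];
}

// Hypothetical helper: runs add_in_place `repeats` times and returns the
// average wall-clock time per pass in milliseconds.
double time_add_ms(std::vector<double>& array1, const std::vector<double>& array2, int repeats)
{
    assert(repeats > 0);
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; r++)
        add_in_place(array1, array2);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return elapsed.count() / repeats;
}
```

Averaging over several passes is one way to reduce timer noise. Note that each pass modifies array1 again, so the values grow with every repeat.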
My computer is a "Dell PowerEdge 2900III 5U" with 2 x Xeon 5420 and 48 GB of memory.
The OS is MS Windows Server 2003 R2 Enterprise x64 Edition SP2.
The C++ compilers are VC++ 2008 and Intel C++ 11.0.061, and the solution platform is x64.
I compiled the program with both VC and IC, and the two results are basically the same.
I then used a function from Intel MKL 10.1 to do the computation, as follows:
cblas_daxpy(5000000, 1, array2, 1, array1, 1);
The performance of the program was no different.
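For context, the BLAS daxpy operation computes y := alpha * x + y. With alpha = 1 and unit strides, the call above performs the same element-wise update as the loop (array1[i] += array2[i]), reading and writing the same two arrays. The sketch below is a plain reference version of those semantics, not MKL's implementation. The name daxpy_reference is hypothetical, and the sketch assumes positive strides.

```cpp
#include <cassert>
#include <cstddef>

// Reference semantics of BLAS daxpy: y[i*incy] = alpha * x[i*incx] + y[i*incy].
// Hypothetical stand-in for illustration; assumes positive strides.
void daxpy_reference(std::size_t n, double alpha,
                     const double* x, std::size_t incx,
                     double* y, std::size_t incy)
{
    assert(incx > 0 && incy > 0);
    for (std::size_t i = 0; i < n; i++)
        y[i * incy] += alpha * x[i * incx];
}
```

Calling daxpy_reference(5000000, 1.0, array2, 1, array1, 1) would mirror the argument order of the cblas_daxpy call above.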
Then I used another function from Intel MKL 10.1:
vdAdd( n, a, b, y );
Program performance decreased significantly, to only about 80% of the original.
I would like to know how this problem can be optimized to improve program performance.