VS2010 installation problem

Hello,

I have recently installed VS2010 and Intel Parallel Studio XE, and I have been trying to use the ArrayNotation example from the Intel website. I am unable to resolve the following error:

"ERROR: identifier __assume_aligned is undefined"

I'm sure there is something basic I need to do, but I have been unable to sort it out. Additionally, the array notation in the code shows up with a red underline, as if it isn't recognized syntax. How can I resolve this?

Any help would be greatly appreciated; I have been pulling my hair out.

Regards,

Brad

Hello Brad,

Did you switch the solution to the Intel C++ Compiler?

The "__assume_aligned(a,n)" directive in particular is built in and requires no further action besides selecting "Use Intel C++".

Regards,

Georg Zitzlsberger

While gcc-compatibility in this respect has recognized value, I'm not certain it is implemented in past Windows compilers, where you may need the equivalent declspec. CEAN array notation should be recognized in any ICL 12.x version, so you should check that you have switched in ICL in place of CL.

Hello Tim,

It is supported, and it is semantically different from the __declspec(align(...)) variants; see the documentation:

Feature: Description

__declspec(align(n)): Directs the compiler to align the variable to an n-byte boundary. Address of the variable is address mod n=0.

__declspec(align(n,off)): Directs the compiler to align the variable to an n-byte boundary with offset off within each n-byte boundary. Address of the variable is address mod n=off.

__assume_aligned(a,n): Instructs the compiler to assume that array a is aligned on an n-byte boundary; used in cases where the compiler has failed to obtain alignment information.

Best regards,

Georg Zitzlsberger
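To illustrate the distinction in the table: an alignment declaration changes where the variable is placed, while an alignment assumption only promises the compiler something about a pointer it cannot see the origin of. The following is a minimal sketch, not code from the Intel sample. It uses standard C++11 alignas as a portable stand-in for __declspec(align(16)), and ASSUME_ALIGNED, is_aligned and sum_aligned are hypothetical names introduced here. The macro maps to __assume_aligned only when the classic Intel compiler is detected, and to the GCC/Clang builtin otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical portability macro (an assumption, not part of the thread).
#if defined(__INTEL_COMPILER)
  #define ASSUME_ALIGNED(p, n) __assume_aligned((p), (n))
#elif defined(__GNUC__)
  #define ASSUME_ALIGNED(p, n) \
      ((p) = static_cast<decltype(p)>(__builtin_assume_aligned((p), (n))))
#else
  #define ASSUME_ALIGNED(p, n) ((void)0)
#endif

const std::size_t kCount = 1024;

// Declaration: the array itself is placed on a 16-byte boundary.
// alignas is the C++11 stand-in for __declspec(align(16)).
alignas(16) float g_data[kCount];

// True when the address of p is a multiple of n (address mod n == 0).
bool is_aligned(const void* p, std::size_t n) {
    return reinterpret_cast<std::uintptr_t>(p) % n == 0;
}

// Assumption: inside the function the compiler only sees a pointer,
// so the caller's alignment guarantee has to be restated.
float sum_aligned(const float* p, std::size_t count) {
    assert(is_aligned(p, 16));   // the promise must actually hold
    ASSUME_ALIGNED(p, 16);
    float total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += p[i];
    }
    return total;
}
```

Passing a pointer that is not really 16-byte aligned to sum_aligned would break the promise, which is why the sketch checks it with an assert in debug builds.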

Hello Georg,

Thank you for your suggestion. I verified that I have been using the Intel C++ compiler, and I can build the project without errors. However, it doesn't appear to run any faster than the serial implementation. I also noticed that __assume_aligned is underlined in red. Any ideas on how to fix this?

Regards,

Brad Kimbrough

Hello,

I can build the ArrayNotation example, and it runs without error. However, it is not vectorizing one of the loops as outlined in the example documentation. Here is my code:

#include
#include
#include

#define S 1024
#define TCOUNT 16

// Use 16-byte alignment for a CPU with 128-bit vector registers. For the CPUs with Intel AVX support
// use 32-byte alignment, and use 64-byte alignment for Intel MIC architecture.
#define ALIGNMENT 16
#define S 1024
#define ITERS 1024*1024*10

// Request the compiler to use 16-byte alignments for the arrays.
__declspec(align(16)) float A[S], B[S], C[S];
__declspec(align(16)) int mask[S];

int main() {
    // Initialize the global arrays
    A[:] = 0.0f;
    B[:] = 1.0f / (A[:] + 1);
    C[:] = B[:];
    mask[:] = 0;
    mask[0:S/2:2] = 1;

    for (int i = 0; i < ITERS; i++) {
        // Invocation of the Array Notation implementation
        startTime = clock_it();
        longvector(A,B,C,1.1f,mask);
        endTime = clock_it();
        execTime += (endTime - startTime);
    }
    printf("Time taken in seconds with default Vector Length Array Notation implementation is %2.6f\n", execTime);
    return 0;
}

__declspec(noinline) void longvector(float A[S], float B[S], float C[S], float k, int mask[S]) {
    // Let the compiler know it is safe to assume that the function arguments
    // are ALIGNMENT-byte aligned.
    __assume_aligned(A,ALIGNMENT);
    __assume_aligned(B,ALIGNMENT);
    __assume_aligned(C,ALIGNMENT);
    if (mask[:]) {
        A[:] = B[:] + C[:] * k;
    }
}

The loop in the longvector() function is not being vectorized as it should. The report is:

ArrayNotation.cpp(56): warning : loop was not vectorized: existence of vector dependence.
ArrayNotation.cpp(56:5-56:5):VEC:?longvector@@YAXQAM00MQAH@Z: loop was not vectorized: existence of vector dependence
ArrayNotation.cpp(57:7-57:7):VEC:?longvector@@YAXQAM00MQAH@Z: potential FLOW dependence between A and B.
1> potential ANTI dependence between B and A.
ArrayNotation.cpp(56): warning : loop was not vectorized: existence of vector dependence.
ArrayNotation.cpp(56:5-56:5):VEC:?longvector@@YAXQAM00MQAH@Z: loop was not vectorized: existence of vector dependence
ArrayNotation.cpp(57:7-57:7):VEC:?longvector@@YAXQAM00MQAH@Z: potential FLOW dependence between A and B.
1> potential ANTI dependence between B and A.

This is the same report I get when building the scalar version of the code.

Any help would be much appreciated.

Thank you,

Brad Kimbrough
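For readers unfamiliar with the report: "potential FLOW/ANTI dependence between A and B" generally means the compiler cannot prove that the pointer arguments refer to non-overlapping memory. The sketch below writes out the scalar loop that the masked array-notation statement stands for and marks the pointers with the __restrict extension (accepted by the Intel, Microsoft, GCC and Clang compilers) to state that they do not alias. The function name longvector_scalar and the extra length parameter n are hypothetical, and whether this removes the message in the compiler version used above has not been verified here.

```cpp
#include <cassert>
#include <cstddef>

// Scalar equivalent of:  if (mask[:]) { A[:] = B[:] + C[:] * k; }
// __restrict is a promise from the caller that a, b, c and mask
// never overlap; the compiler does not check it.
void longvector_scalar(float* __restrict a,
                       const float* __restrict b,
                       const float* __restrict c,
                       float k,
                       const int* __restrict mask,
                       std::size_t n) {
    // Cheap debug-build sanity check of the no-overlap promise.
    assert(a != b && a != c);
    for (std::size_t i = 0; i < n; ++i) {
        if (mask[i]) {                 // elements with mask == 0 stay unchanged
            a[i] = b[i] + c[i] * k;
        }
    }
}
```

Calling it with the same array for a and b would violate the promise and lead to undefined behavior, so this only fits cases like the sample, where A, B and C are distinct global arrays.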

Hello Brad,

Thank you for the small code example; it makes the problem easy for us to reproduce. In the following I'm using the latest update (Intel Composer XE 2011 Update 11).

IntelliSense in Microsoft Visual Studio* underlines some keywords and directives because our integration does not register them. I've filed a ticket to fix that in a future release (DPD200294636). It's not critical, though; you can continue without problems.

Using the example you provided I see that function "longvector(...)" is vectorized (excerpt from the function):

.B2.2:                          ; Preds .B2.2 .B2.1
$LN110:
        movaps    xmm7, XMMWORD PTR [edx+esi*4]                 ;23.7
$LN111:
        cvtps2pd  xmm3, xmm7                                    ;23.7
$LN112:
        movdqu    xmm4, XMMWORD PTR [edi+esi*4]                 ;22.5
$LN113:
        movhlps   xmm7, xmm7                                    ;23.7
$LN114:
        pcmpeqd   xmm4, xmm5                                    ;22.5
$LN115:
        cvtps2pd  xmm0, xmm7                                    ;23.7
$LN116:
        movaps    xmm7, XMMWORD PTR [ecx+esi*4]                 ;23.7
$LN117:
        pxor      xmm4, xmm6                                    ;22.5
$LN118:
        cvtps2pd  xmm1, xmm7                                    ;23.7
$LN119:
        movhlps   xmm7, xmm7                                    ;23.7
$LN120:
        cvtps2pd  xmm7, xmm7                                    ;23.7
$LN121:
        mulpd     xmm1, xmm2                                    ;23.7
$LN122:
        mulpd     xmm7, xmm2                                    ;23.7
$LN123:
        addpd     xmm3, xmm1                                    ;23.7
$LN124:
        addpd     xmm0, xmm7                                    ;23.7
$LN125:
        movups    xmm1, XMMWORD PTR [eax+esi*4]                 ;23.7
$LN126:
        cvtpd2ps  xmm3, xmm3                                    ;23.7
$LN127:
        cvtpd2ps  xmm0, xmm0                                    ;23.7
$LN128:
        movlhps   xmm3, xmm0                                    ;23.7
$LN129:
        andps     xmm3, xmm4                                    ;23.7
$LN130:
        andnps    xmm4, xmm1                                    ;23.7
$LN131:
        orps      xmm3, xmm4                                    ;23.7
$LN132:
        movaps    XMMWORD PTR [eax+esi*4], xmm3                 ;23.7
$LN133:
        add       esi, 4                                        ;22.5
$LN134:
        cmp       esi, 1024                                     ;22.5
$LN135:
        jb        .B2.2         ; Prob 99%                      ;22.5

The *ps and *pd op-codes (e.g. andps, mulpd, etc.) indicate packed operations (p = packed), which is good!
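The pcmpeqd / andps / andnps / orps sequence in the listing is the usual way a masked assignment is done without branches: compute the new values for all lanes, then blend them with the old values according to the mask. A rough sketch of the same idea with SSE2 intrinsics follows. The function name masked_update4 and the macro SKETCH_HAVE_SSE2 are hypothetical, the sketch handles just four elements, and unlike the listing it stays in single precision instead of converting to double. A scalar fallback is included for builds without SSE2.

```cpp
#include <cassert>

#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
  #include <emmintrin.h>
  #define SKETCH_HAVE_SSE2 1
#else
  #define SKETCH_HAVE_SSE2 0
#endif

// Updates four consecutive elements: a[i] = b[i] + c[i] * k where mask[i] != 0.
void masked_update4(float* a, const float* b, const float* c, float k, const int* mask) {
    assert(a && b && c && mask);
#if SKETCH_HAVE_SSE2
    __m128 old_a  = _mm_loadu_ps(a);
    __m128 result = _mm_add_ps(_mm_loadu_ps(b),
                               _mm_mul_ps(_mm_loadu_ps(c), _mm_set1_ps(k)));

    // Lanes where mask == 0 become all ones (compare with pcmpeqd).
    __m128i m = _mm_loadu_si128(reinterpret_cast<const __m128i*>(mask));
    __m128 keep_old = _mm_castsi128_ps(_mm_cmpeq_epi32(m, _mm_setzero_si128()));

    // Blend: old value where mask == 0, new value elsewhere
    // (compare with andps / andnps / orps).
    __m128 blended = _mm_or_ps(_mm_and_ps(keep_old, old_a),
                               _mm_andnot_ps(keep_old, result));
    _mm_storeu_ps(a, blended);
#else
    for (int i = 0; i < 4; ++i) {
        if (mask[i]) {
            a[i] = b[i] + c[i] * k;
        }
    }
#endif
}
```

The unaligned load and store intrinsics are used so the sketch works for any pointer; with arrays known to be 16-byte aligned, the aligned variants correspond to the movaps instructions in the listing.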

So why don't you see a speedup?

  • I don't have your implementation of "clock_it()". Keep in mind, however, that some timer function implementations don't work as expected on multi-core systems.
    I'd recommend using the rdtsc intrinsic ("__rdtsc()"), which reads the clock ticks.
    Also, for benchmarking in general, you might turn off Intel SpeedStep, Intel Turbo Boost and (optionally) Intel Hyper-Threading, so that CPU performance is comparable between benchmark runs.
  • Maybe you're using an older compiler version that cannot vectorize the code you provided. The most recent version can: I get no warnings about dependencies in the vectorization report, and the assembly above is generated.
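The timing advice above could be sketched roughly as follows. This is not the sample's clock_it(), which was not posted. The names read_ticks, min_ticks and SKETCH_USE_RDTSC are hypothetical. On x86 builds the sketch reads the time-stamp counter through __rdtsc(); elsewhere it falls back to std::chrono::steady_clock, which is C++11 and an assumption about the toolchain. Taking the minimum over several repeats is an extra choice made here to damp run-to-run noise, not something suggested in the thread.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

#if defined(_MSC_VER) && (defined(_M_IX86) || defined(_M_X64))
  #include <intrin.h>
  #define SKETCH_USE_RDTSC 1
#elif defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
  #include <x86intrin.h>
  #define SKETCH_USE_RDTSC 1
#else
  #include <chrono>
  #define SKETCH_USE_RDTSC 0
#endif

// Returns a tick count: TSC ticks on x86, nanoseconds otherwise.
inline std::uint64_t read_ticks() {
#if SKETCH_USE_RDTSC
    return __rdtsc();
#else
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(
            std::chrono::steady_clock::now().time_since_epoch()).count());
#endif
}

// Runs kernel() `repeats` times and returns the smallest elapsed tick count.
template <typename Kernel>
std::uint64_t min_ticks(Kernel kernel, int repeats) {
    assert(repeats > 0);
    std::uint64_t best = std::numeric_limits<std::uint64_t>::max();
    for (int r = 0; r < repeats; ++r) {
        const std::uint64_t start = read_ticks();
        kernel();
        const std::uint64_t elapsed = read_ticks() - start;
        if (elapsed < best) {
            best = elapsed;
        }
    }
    return best;
}
```

A call such as min_ticks([&]() { longvector(A, B, C, 1.1f, mask); }, 100) would time the kernel from the sample. TSC ticks are not seconds, so the values are mainly useful for comparing two builds on the same machine.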

Best regards,

Georg Zitzlsberger

Hello,

I'd like to inform you that we cannot fix DPD200294636 (incorrect syntax highlighting of keywords/directives). The API for integrating with Microsoft Visual Studio* would require an unreasonable effort on our side. So please ignore the IntelliSense underlining in such cases.

Best regards,

Georg Zitzlsberger
