Architecture Code Analyzer Help needed

dkpmo's picture

Hi,

Today I came across the Architecture Code Analyzer project and got really interested in it. I have some code snippets that I would like to analyze statically to better understand their bottlenecks. I downloaded the program and inserted the start macro at the beginning of a function and the end macro at the end of the same function. My project is a DLL, so I first tried the object file (.obj) containing the function I'm interested in, taken from the Release folder of my Visual Studio solution directory:

iaca.exe -o output.txt -32 test.obj

This produced the following output: COULD NOT FIND START_MARKER.

Then I tried the DLL with

iaca.exe -o output.txt -32 test.dll

and I did get the following output:

Intel Architecture Code Analyzer Version - 20090115
Analyzed File - test.dll
Binary Format - 32BIT
INSTRUCTION NOT SUPPORTED(785) - imul edx, eax
INSTRUCTION NOT SUPPORTED(795) - cdq
INSTRUCTION NOT SUPPORTED(798) - idiv ecx
INSTRUCTION NOT SUPPORTED(806) - cmovz eax, ebx
INSTRUCTION NOT SUPPORTED(818) - cmovnle eax, ecx

Do you have any idea what is happening here? Can I use the tool on a DLL, or do I need an EXE file? Why does the program not recognize some of the instructions? The DLL is not built with AVX support, but it does require a Core 2 Duo processor (SSSE3).

One more question, about the future of the program: do you think it will be possible to model older processors? I think this should be straightforward to implement. There is no need for processors as old as the Pentium or Pentium II, but it would be nice if Core 2 and i7 processors were supported.

Appreciate any help,
Thanks

Israel Hirsh (Intel)'s picture
Hi,

Thanks for using the Intel Architecture Code Analyzer; I hope you'll benefit from it soon. I'm not sure why the tool failed to analyze the object file but succeeded with the DLL. It should work equally well with an object file or an EXE file, and code that uses only Core 2 instructions is fine. Are you sure you passed the correct object file as the tool's argument?

Anyway, you hit a known issue when you ran the tool on the DLL. The tool does not support a subset of the Intel Architecture instructions. The current version aborts the analysis when it encounters an unsupported instruction. The next version, expected later this month, will quietly ignore those instructions, i.e. it will supply incomplete information rather than none at all, and it will mark the instructions that are not included in the report.

The current version may still be useful to you if the unsupported instructions lie outside the "hot" part of the function of interest. If you insert the START and END macros around just that hot part, rather than at the beginning and end of the function, you may exclude the problematic instructions and get the data you need.
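As a rough sketch of that suggestion: the function below keeps its integer division (which compiles to idiv, one of the instructions reported above) outside the marked region, so only the accumulation loop is analyzed. The function averageOf is hypothetical, and the empty fallback macros are stand-ins so the sketch compiles without iacaMarks.h; in a real build the header supplies IACA_START and IACA_END. The compiler may still move instructions across the markers, so the marked region in the generated code should be checked.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: a real build includes iacaMarks.h, which defines IACA_START
// and IACA_END. These empty fallbacks are hypothetical stand-ins that only
// let the sketch compile without the header.
#ifndef IACA_START
#define IACA_START
#define IACA_END
#endif

// Hypothetical example function: the average of an integer array.
int averageOf(const int* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    if (count == 0)
        return 0;

    int total = 0;

    IACA_START                              // begin the analyzed region
    for (std::size_t i = 0; i < count; ++i)
        total += data[i];                   // hot part: accumulation only
    IACA_END                                // end the analyzed region

    // The division stays outside the markers, so an unsupported idiv
    // does not fall inside the analyzed region.
    return total / static_cast<int>(count);
}
```

Calling averageOf on the four values 2, 4, 6, 8 returns 5; the markers do not change the result, they only delimit what the tool reads.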

By the way, some of the unsupported instructions you ran into will be supported in the new tool version.

We do not plan to model older processors at this time.

Thanks again, Israel.

dkpmo's picture

Israel, thank you very much for the reply.

I'm sure that I used the proper OBJ file. The DLL was built from the same OBJ file, but maybe with the DLL the program quit when it saw the unsupported instructions, before it could print the same message as for the OBJ file.
I'm not sure whether the code I'm trying to analyze contains the unsupported instructions. I'll check this next week and let you know. I'll also try a few more times with different functions and report the results.

I'm waiting for the new version. This looks like a very nice and useful tool even without support for older processors. From my understanding, the Sandy Bridge architecture will be available first for mobile/desktop users, and Nehalem will continue to be the server architecture. So it would be nice to have at least Nehalem support. To me it looks like the only difference is the additional load port that is available on the newer architecture.

Regards,
Dilyan

Israel Hirsh (Intel)'s picture
Hi,

Tal, the developer of the Intel Architecture Code Analyzer, reminded me that for 64-bit object files you need to specify the -64 command-line option (see section 2.2 of the user's guide). If your code is 64-bit, this may be the issue.

Israel.

Israel Hirsh (Intel)'s picture

Oops, I just noticed that your code is 32-bit and that you did specify -32 for the object file analysis. Please check whether the problem persists, and we'll see what the best way to find the root cause is. Anyway, as long as the tool can analyze your DLL, you should be able to make progress.

Israel.

Tal Uliel (Intel)'s picture

Hello Dilyan,

Did you compile your object file with -Qipo? Object files compiled with -Qipo cannot be analyzed by the Intel Architecture Code Analyzer.

What kind of analysis are you planning to do? Are you comparing AVX and SSE code?

Tal

dkpmo's picture

Tal, Israel,

Thank you for your help. Unfortunately I'm still not able to get any results from the tool. I saved the following function in a file named test.cpp and compiled it with Intel Compiler 10.0.026:

#include "iacaMarks.h"
#include "ia32intrin.h"

__m64 inline shuffleBytes(const unsigned char pTbl[8], __m64 *indx )
{
    IACA_START

    __m64 *pTbl_Dist = (__m64*)&pTbl[0];
    __m64 *indxPtr = (__m64*)&indx[0];
    return _mm_shuffle_pi8(*pTbl_Dist, *indxPtr);

    IACA_END
}

I compile with icl /c test.cpp to generate the OBJ file. I'm not using interprocedural optimization (IPO), and according to the compiler documentation IPO is disabled by default. When I run the tool on the OBJ file (which is 32-bit), I still get the
COULD NOT FIND START_MARKER
error.
Do you have any idea what I'm doing wrong? Do I also need to turn off optimizations?

Thank you,
Dilyan

Tal Uliel (Intel)'s picture
What Intel compiler version are you using?

dkpmo's picture

The version is 10.0.026. I know it's a bit old, but this is what we use here at work. Do you think I should use a newer version? Which version are you using?

Tal Uliel (Intel)'s picture
Best Reply

Hello Dilyan,

I tried compiling your code and got the same result.

Two issues cause the Intel Architecture Code Analyzer to fail to find the markers:
1) the function is inlined
2) the end marker comes after the return statement

I think that because the function is inlined, the compiler creates some sort of intermediate code in the OBJ file that hides the markers from the Intel Architecture Code Analyzer.
When I removed the inline keyword, the tool was able to find the START marker but not the END marker. I think this was due to a compiler optimization, since the function ends at the return.

Because the function is inlined, the best way to analyze it is to put the markers around a call to it.

Another option is to add a stub function (named stab below) that calls shuffleBytes, and to put the markers around that call. However, due to compiler optimizations (again, that's what I suspect; I'm still checking), the result must be written to a proper destination.

#include "iacaMarks.h"
#include "ia32intrin.h"

__m64 inline shuffleBytes(const unsigned char pTbl[8], __m64 *indx )
{
    __m64 *pTbl_Dist = (__m64*)&pTbl[0];
    __m64 *indxPtr = (__m64*)&indx[0];
    return _mm_shuffle_pi8(*pTbl_Dist, *indxPtr);
}

void stab(const unsigned char pTbl[8], __m64 *indx)
{
    IACA_START
    *indx = shuffleBytes(pTbl, indx);
    IACA_END
}
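The same two rules (no inline on the marked function, and the END marker placed before the function returns with the result stored to a real destination) can also be illustrated without MMX intrinsics. This is only an untested-against-the-tool sketch, not something confirmed in this thread: shuffleBytesMarked is a hypothetical scalar stand-in for the 64-bit byte shuffle, and the empty fallback macros are placeholders for the definitions in iacaMarks.h.

```cpp
#include <cstdint>

// Assumption: a real build includes iacaMarks.h, which defines IACA_START
// and IACA_END. These empty fallbacks are hypothetical stand-ins.
#ifndef IACA_START
#define IACA_START
#define IACA_END
#endif

// Hypothetical scalar stand-in for the 64-bit byte shuffle: each output
// byte is table[index & 7], or 0 when bit 7 of the index byte is set.
// The function is not inline, and it writes its result through `out`,
// so nothing has to be returned after the END marker.
void shuffleBytesMarked(const std::uint8_t table[8],
                        const std::uint8_t index[8],
                        std::uint8_t out[8])
{
    IACA_START
    for (int i = 0; i < 8; ++i) {
        const std::uint8_t sel = index[i];
        out[i] = (sel & 0x80) ? std::uint8_t(0) : table[sel & 0x07];
    }
    IACA_END
}
```

Whether the compiler keeps both markers in place still depends on its optimization settings, so the output of the tool is the real check.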

I hope this is helpful,
Tal

dkpmo's picture

Hi Tal,

thank you very much for all your help! It worked!

When will the new version be available?

Thanks,
Dilyan

Israel Hirsh (Intel)'s picture

Rev 1.0.1 has already been posted (the user's guide says 1.0.2).
Israel.

dkpmo's picture

Thanks for the info, I'll try it out.
