compile cilk++ program with icc

Hi All,

I'm not quite sure whether this question should be directed here. But, I need some help:

I downloaded the Intel Parallel Studio XE Beta for Linux (Intel 64 version) and installed it on my Ubuntu 10.04 system. The installation seemed to go smoothly, and I can see /opt/intel/bin/icc, ifort, ...

But when I use it to compile my Cilk++ program from the command line, like this:

icc -o heat.64 -O3 -DNDEBUG -I include -Wall -Werror -m64 heat.cilk common.cilk heat_loops.cilk heat_recursive.cilk heat_util.cilk heat_tests.cilk heat_meta.cilk -L lib/x86_64
make: icc: Command not found
make: *** [heat.64] Error 127

Can anybody tell me what's wrong with that? And how to fix it?

Thanks!

Yuan

Hi Yuan.

I'm a Windows developer so I can't answer why you're not finding the icc command, but the Intel compiler doesn't recognize .cilk files. You'll want to either add the option that tells the compiler to treat your .cilk files as C++, or rename them.

- Barry

Barry,

Sorry, I pasted the wrong error message. Following your advice, I renamed all the .cilk files to use the .cpp suffix, and I compile from the command line as follows:

icc -o heat.64 -O3 -DNDEBUG -I include -Wall -Werror -mkl -xSSE4.2 -m64 heat.cpp common.cpp heat_loops.cpp heat_recursive.cpp heat_util.cpp heat_tests.cpp heat_meta.cpp -L lib/x86_64
mcpcom: No such file or directory
compilation aborted for heat.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for common.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for heat_loops.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for heat_recursive.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for heat_util.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for heat_tests.cpp (code 100)
mcpcom: No such file or directory
compilation aborted for heat_meta.cpp (code 100)
make: *** [heat.64] Error 100

More specifically:

icc -c heat.cpp
mcpcom: No such file or directory
compilation aborted for heat.cpp (code 100)

If I use cilk++ rather than icc on the command line, it compiles fine:
cilk++ -c heat.cpp

Could you give me some hints on what might be wrong?

Thanks!

Yuan

My guess is that there was a problem with your installation. icc is the "driver". It parses the command, and then passes it to mcpcom which is the meat of the compiler. I'd make sure that both icc and mcpcom are in a directory on your PATH. Certainly the commands "which icc" and "which mcpcom" would be interesting.
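The PATH check suggested above can also be scripted. The following is a minimal sketch, assuming a POSIX system. `find_on_path` is a hypothetical helper written for this illustration and is not part of any Intel tool. It does roughly what `which` does: it walks a colon-separated search path and returns the first entry the current user may execute.

```cpp
#include <cstdlib>
#include <sstream>
#include <string>
#include <unistd.h>  // access(), X_OK (POSIX)

// Return the first "<dir>/<name>" in a colon-separated search path that the
// current user may execute, or an empty string when there is none.
// find_on_path is a hypothetical helper, not part of any Intel tool.
std::string find_on_path(const std::string& name, const std::string& search_path) {
    std::istringstream dirs(search_path);
    std::string dir;
    while (std::getline(dirs, dir, ':')) {
        if (dir.empty()) continue;  // empty entries are skipped for simplicity
        const std::string candidate = dir + "/" + name;
        if (access(candidate.c_str(), X_OK) == 0) return candidate;
    }
    return std::string();
}

// Convenience wrapper that reads the real PATH environment variable.
std::string find_on_path(const std::string& name) {
    const char* path = std::getenv("PATH");
    return find_on_path(name, path ? path : "");
}
```

Calling `find_on_path("icc")` and `find_on_path("mcpcom")` should both return non-empty paths on a healthy installation. If the second call returns an empty string, the driver cannot find the compiler proper through PATH, which would be consistent with the "mcpcom: No such file or directory" message.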

Again, I work primarily on Windows, so I'm really not the right person to help you.

- Barry

I see!

I checked my installation directory, and it is organized rather strangely:

yuantang@Octave:/opt/intel$ ls -l
total 32
drwxr-xr-x 2 root root 4096 2010-06-25 11:14 bin
lrwxrwxrwx 1 root root 16 2010-06-25 11:14 compilerpro -> compilerpro-12.0
drwxr-xr-x 3 root root 4096 2010-06-25 11:14 compilerpro-12.0
drwxr-xr-x 14 root root 4096 2010-06-25 11:14 compilerpro-12.0.0.025
lrwxrwxrwx 1 root root 19 2010-06-25 11:14 include -> compilerpro/include
drwxr-xr-x 13 root root 4096 2010-06-25 11:14 inspector_xe
-rw-r--r-- 1 root root 5658 2010-06-25 11:14 intel_sdp_products.db
lrwxrwxrwx 1 root root 15 2010-06-25 11:14 ipp -> compilerpro/ipp
lrwxrwxrwx 1 root root 15 2010-06-25 11:14 lib -> compilerpro/lib
drwxr-xr-x 2 root root 4096 2010-06-25 11:12 licenses
lrwxrwxrwx 1 root root 15 2010-06-25 11:14 man -> compilerpro/man
lrwxrwxrwx 1 root root 15 2010-06-25 11:14 mkl -> compilerpro/mkl
drwxr-xr-x 4 root root 4096 2010-06-25 11:14 parallel_studio_xe_2011
lrwxrwxrwx 1 root root 15 2010-06-25 11:14 tbb -> compilerpro/tbb

The 'icc' under the 'bin' directory is a symbolic link to the 'icc' under 'compilerpro'. However, there is no 'mcpcom' under 'compilerpro', which is itself a symbolic link to 'compilerpro-12.0', though there is one under 'compilerpro-12.0.0.025'. So I used the executables under 'compilerpro-12.0.0.025' to compile my Cilk program, with all the source files renamed to use the .cpp suffix:

icc -o heat.64 -O3 -DNDEBUG -I/opt/intel/compilerpro-12.0.0.025/compiler/include/cilk -Wall -Werror -mkl -xSSE4.2 -m64 heat.cpp common.cpp heat_loops.cpp heat_recursive.cpp heat_util.cpp heat_tests.cpp heat_meta.cpp -L /opt/intel/compilerpro-12.0.0.025/compiler/lib/intel64 -L lib/x86_64
/usr/lib/gcc/x86_64-linux-gnu/4.4.3/../../../../lib64/crt1.o: In function `_start':
(.text+0x20): undefined reference to `main'
make: *** [heat.64] Error 1

Could you tell me what might be wrong with that?

My guess is that your application used cilk_main instead of main. cilk_main was a convenience function to allow you to avoid setting up a Cilk context and calling cilk_ctx::run() to convert from C/C++ linkage to Cilk linkage functions. "Behind your back" the Cilk V1 compiler was generating a main() which did the necessary boilerplate code to call cilk_main().

The version of Cilk implemented by the Intel compiler no longer uses a special linkage for Cilk functions. And the concept of a Cilk context has been removed. There is one Cilk context which is maintained by the Cilk runtime. As part of the prolog for a function which calls cilk_spawn the Cilk runtime will be initialized and your thread will "bind" to the Cilk runtime, and "unbind" as part of the epilog for that function. This means that all threads that use Cilk will share the same context, and the Cilk worker threads will be shared among them.

So:

  • If your application has a cilk_main(), rename the function to main().
  • If you're using cilk::context, remove it.
  • If you want to set the number of workers or otherwise manipulate the runtime, include cilk_api.h. That will bring in the OS-specific version of the Cilk runtime API. The comments in cilk_api_linux.h (or cilk_api_windows.h) should be sufficient for you to convert any of the Cilk V1 calls.
  • The most common error is to call cilk_set_param("nworkers", n) after the runtime has started. Since the runtime is initialized by a call in the prolog of a function that contains a cilk_spawn, you need to call cilk_set_param("nworkers") further up the calltree. Be sure to check the return code from cilk_set_param(). Non-zero indicates failure.

- Barry

Barry,

Thank you very much for the advice. You hit the right point! The problem was that I was using the old "cilk_main". After I renamed "cilk_main" to "main", the compilation passed and the application runs. However, compared with the original cilk++ compiler, the resulting code is much larger:

For the same program, cilk++ generates 66337 bytes of code, whereas icc generates 254485 bytes (106625 bytes if I remove the '-mkl' flag, which is still much larger than the cilk++ version). The code generated by icc is also much slower than the code generated by cilk++.

For example, my application built with the original cilk++ achieves at least 9.6 Gflops, whereas the icc-generated code produces at most 7.9 Gflops on the same machine.

I compiled the program as follows:

icc -o heat.64 -O3 -DNDEBUG -Wall -Werror -mkl -xSSE4.2 -I/opt/intel/compilerpro-12.0.0.025/compiler/include/cilk -m64 heat.cpp common.cpp heat_loops.cpp heat_recursive.cpp heat_util.cpp heat_tests.cpp heat_meta.cpp -L lib/x86_64

BTW: My machine is Intel Core i7 CPU X 980@3.33GHz with hyperthreading technology enabled. If I didn't set the number of workers in my program, will it by default use up all the cores, or will it just use a limited number of workers/cores for running my application?

Could you give me some advice what might be the problem of performance ?

BTW: I also tried to manually set the number of workers at the very beginning of my main() function as follows:

#include
int CurrentP = __cilkrts_get_worker_number();
std::cout << "currentP = "<< CurrentP << std::endl;
int workers = __cilkrts_set_param("nworkers", "12");
if (workers == 0) {
printf("set workers = 12\n");
std::cout << "Running on 12 workers " << std::endl;
} else {
printf("set workers failed!\n");
std::cout << "Running on " << CurrentP << " workers " << std::endl;
}

The compilation passed with this code, but if I run it, the code will exit with "segmentation fault".

Thanks!

Yuan

Best Reply

> Could you give me some advice what might be the problem of performance ?

For the moment, my advice would be patience.

Our first priority was to get things working with the new compiler. I can't speak to the code size, but I do know that there are additional optimizations coming to speed things up. I'm not sure if they've been released yet. And we've been discussing further optimizations with the compiler team for future versions.

In general, the Intel compiler does a better job of optimizing your C++ code. Unfortunately, there's more overhead in the Cilk functions. If you've got a reasonable amount of work being done in your C++ code, it should be a net gain. But as I said, we're working to make it better.

Beyond that, the fewer parameters you're passing to your spawned functions, the better. The compiler creates a "spawn helper" function which allows us to deal with temporary lifetime issues. Some of the optimizations have to do with reducing overhead in this function. Anyway, all of the parameters need to be forwarded through this function. The fewer there are, the less code is needed to forward them.

> If I didn't set the number of workers in my program, will it by default use up all the cores,
> or will it just use a limited number of workers/cores for running my application?

By default, the Cilk runtime will ask the OS how many processors you've got and create a worker thread for each one. This should be unchanged from Cilk V1. On Linux we call get_nprocs() to get the default for the number of workers. On Windows the equivalent information is obtained by a call to GetSystemInfo().

Note that the environment variable to override the default has changed. In Cilk V1, it was CILK_NPROC. This has been replaced by CILK_NWORKERS.

Of course, a call to __cilkrts_set_param("nworkers") overrides everything.
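To estimate what the default would be on a given Linux machine, the lookup described above can be modelled in a few lines. This is only a sketch of the described behaviour, not the runtime's own code. `expected_default_workers` is a hypothetical helper, and falling back to the processor count when CILK_NWORKERS is not a positive integer is an assumption.

```cpp
#include <cstdlib>
#include <sys/sysinfo.h>  // get_nprocs() (glibc, Linux only)

// Model of the default described above: CILK_NWORKERS wins when it holds a
// positive integer; otherwise use one worker per processor reported by the OS.
// expected_default_workers is a hypothetical helper, not a Cilk runtime call.
// How the real runtime treats a malformed CILK_NWORKERS value is an assumption.
int expected_default_workers() {
    const char* env = std::getenv("CILK_NWORKERS");
    if (env != nullptr) {
        char* end = nullptr;
        const long n = std::strtol(env, &end, 10);
        if (end != env && *end == '\0' && n > 0) return static_cast<int>(n);
    }
    return get_nprocs();
}
```

Note that get_nprocs() counts logical processors, so on a machine with hyperthreading enabled the result includes the hyperthreads.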

> The compilation passed with this code, but if I run it, the code will exit with "segmentation fault".

You're calling the wrong function.

__cilkrts_get_worker_number() returns the id of the worker you're currently executing on. Since the runtime hasn't started yet, that's invalid. The good news is that I found and fixed that bug a week or two ago. :o) The code now detects that you're not calling it from a worker and returns -1 instead of seg faulting.

The function you want to call is __cilkrts_get_nworkers(), which will return the number of workers. *HOWEVER*, this will initialize the runtime (since we can't tell you how many workers there are until we create them) so your call to __cilkrts_set_param("nworkers", "12") will fail, since the runtime is already started.
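The ordering rule is easy to get backwards, so here is a toy model of it. `ToyRuntime` is a hypothetical stand-in that mimics only the behaviour described in this thread: parameters can be set until the runtime starts, and asking for the worker count starts it. It is not the Cilk runtime, and for brevity it takes the value as an int where the real call takes a string.

```cpp
#include <string>

// ToyRuntime is a hypothetical stand-in that mimics only the ordering rule
// described in this thread. It is NOT the Cilk runtime.
class ToyRuntime {
public:
    explicit ToyRuntime(int default_workers)
        : workers_(default_workers), started_(false) {}

    // Same return convention as described above: 0 on success, non-zero on failure.
    int set_param(const std::string& name, int value) {
        if (started_) return 1;  // too late: the workers already exist
        if (name != "nworkers" || value < 1) return 1;
        workers_ = value;
        return 0;
    }

    // Asking for the worker count forces the runtime to start.
    int get_nworkers() {
        started_ = true;
        return workers_;
    }

private:
    int workers_;
    bool started_;
};
```

In this model, calling set_param("nworkers", 12) first and get_nworkers() second yields 12 workers. Reversing the two calls leaves the default in place and makes set_param() return non-zero, which is the failure the return-code check is meant to catch.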

- Barry

For the performance issues, you can also try comparing the performance of the Intel serialization against the Cilk++ SDK serialization. That would take any Cilk-specific overhead out of the picture and show whether there's anything compiler-related that should be looked at.
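A serialization is the same program with the Cilk keywords removed, so that it runs as ordinary serial C++. A minimal sketch of the idea follows. The `USE_CILK_HEADER` switch is a hypothetical build flag and the header name `<cilk/cilk.h>` is an assumption. Without the flag, the two keywords are defined away and the function becomes plain recursion.

```cpp
// USE_CILK_HEADER is a hypothetical build flag; the header name is an assumption.
#ifdef USE_CILK_HEADER
#include <cilk/cilk.h>
#else
// Serialization: make the keywords vanish so the code is ordinary C++.
#define cilk_spawn
#define cilk_sync
#endif

// Classic spawn/sync example; with the macros empty this is plain recursion.
long fib(int n) {
    if (n < 2) return n;
    long a = cilk_spawn fib(n - 1);  // may run in parallel in a Cilk build
    long b = fib(n - 2);
    cilk_sync;                       // wait for the spawned call, if any
    return a + b;
}
```

Timing the serialized build produced by each compiler isolates the quality of the generated C++ code from the cost of the spawn machinery.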

Brandon Hewitt
Technical Consulting Engineer

For 1:1 technical support: http://premier.intel.com

Software Product Support info: http://www.intel.com/software/support
