Cilk++ compiler bug when calling CLAPACK functions

Hello again,

As before, I am attempting to use Cilk++ with CLAPACK to perform high-performance matrix operations. I have tried to do my research but have been unable to resolve this on my own. Thanks so much for your excellent support! I'm new to Cilk++, so I apologize if I am missing something I should be doing.

The offending CLAPACK functions are dtrsm_ and dgemm_. You can find the netlib reference for these at http://www.netlib.org/blas/dtrsm.f and http://www.netlib.org/blas/dgemm.f respectively. The particular versions I am using have been built using the standard reference BLAS on a Fedora 13 machine with an Intel Core 2 Duo. The compiler bug only manifests itself when I attempt to introduce parallelism with either cilk_spawn or cilk_for. Without those parallel constructs, the same code compiles fine with the Cilk++ compiler.

My initial piece of code was written with cilk_for:

      cilk_for( int i = k+1; i < P; i++)
	{
	  dtrsm_(&SIDE, &UPLO, &TRANSA, &DIAG, &pSize, &pSize, &ALPHA, B[k][k], &pSize, B[i][k], &pSize);
	}

      int R = k+1;
      for( int i = k+1; i < P; i++)
	{
	  R++;
	  cilk_for( int j = k+1; j < R; j++)
	    {
	      dgemm_(&gemmTRANSA,&gemmTRANSB,&pSize,&pSize,&pSize,&gemmALPHA,B[i][k],&pSize,B[j][k],&pSize,&gemmBETA,B[i][j],&pSize);
	    }
	}
    }

This produced the following error (line 156 refers to the call to dgemm_, but if I remove the cilk_for around dgemm_, the same error moves to the line that calls dtrsm_):

blockChol3.cilk: In function int cilk my_main(int, char**):
blockChol3.cilk:148: warning: loop condition compares loop variable to wider type
blockChol3.cilk: In function void __cilk_loop_d_002(void*, long unsigned int, long unsigned int):
blockChol3.cilk:156: warning:  is used uninitialized in this function
blockChol3.cilk:113: note:  was declared here
blockChol3.cilk:156: internal compiler error: in make_decl_rtl, at varasm.c:1015
Please submit a full bug report,
with preprocessed source if appropriate.
This compiler version is 4.2.4 (Cilk Arts build 8503)
See  for instructions.

So I then tried a cilk_spawn approach:

for( int i = k+1; i < P; i++)
	{
	  cilk_spawn dtrsm_(&SIDE, &UPLO, &TRANSA, &DIAG, &pSize, &pSize, &ALPHA, B[k][k], &pSize, B[i][k], &pSize);
	}
      cilk_sync; 

      int R = k+1;
      for( int i = k+1; i < P; i++)
	{
	  R++;
	  for( int j = k+1; j < R; j++)
	    {
	      dgemm_(&gemmTRANSA,&gemmTRANSB,&pSize,&pSize,&pSize,&gemmALPHA,B[i][k],&pSize,B[j][k],&pSize,&gemmBETA,B[i][j],&pSize);
	    }
	  cilk_sync;
	}

Which produced the following error:

blockChol3.cilk: In function void cilk __cilk_spawn_f2c_dtrsm_001(long unsigned int, long unsigned int, long unsigned int, int, int, 
double (* __restrict__)[( + 1u)][( + 1u)][( + 1u)], double* __restrict__, char* __restrict__, 
char* __restrict__, char* __restrict__, char* __restrict__, integer* __restrict__, volatile __cilkrts_frame_t*):
blockChol3.cilk:147: warning:  is used uninitialized in this function
blockChol3.cilk:147: warning:  is used uninitialized in this function
blockChol3.cilk:147: internal compiler error: in make_decl_rtl, at varasm.c:1015
Please submit a full bug report,
with preprocessed source if appropriate.
This compiler version is 4.2.4 (Cilk Arts build 8503)
See  for instructions.

When I turn off compiler optimization, the following errors occur:
Using cilk_for:

blockChol3.cilk: In function int cilk my_main(int, char**):
blockChol3.cilk:148: warning: loop condition compares loop variable to wider type
blockChol3.cilk: In function void __cilk_loop_d_002(void*, long unsigned int, long unsigned int):
blockChol3.cilk:156: internal compiler error: in make_decl_rtl, at varasm.c:1015
Please submit a full bug report,
with preprocessed source if appropriate.
This compiler version is 4.2.4 (Cilk Arts build 8503)
See  for instructions.

Using cilk_spawn:

blockChol3.cilk: In function void cilk __cilk_spawn_f2c_dtrsm_001(long unsigned int, long unsigned int, long unsigned int, int, int, 
double (* __restrict__)[( + 1u)][( + 1u)][( + 1u)],double* __restrict__, 
char* __restrict__, char* __restrict__, char* __restrict__, char* __restrict__, integer* __restrict__, volatile __cilkrts_frame_t*):
blockChol3.cilk:147: internal compiler error: in make_decl_rtl, at varasm.c:1015
Please submit a full bug report,
with preprocessed source if appropriate.
This compiler version is 4.2.4 (Cilk Arts build 8503)
See  for instructions.

Finally, the last thing I tried was to specify C linkage. However, this is a feature I have not used much, so I'm not certain I'm doing it correctly. It appears to have made the compiler ignore the CLAPACK calls:

cilk_for( int i = k+1; i < P; i++)
	{
	  extern "C" dtrsm_(&SIDE, &UPLO, &TRANSA, &DIAG, &pSize, &pSize, &ALPHA, B[k][k], &pSize, B[i][k], &pSize);
	} 

      int R = k+1;
      for( int i = k+1; i < P; i++)
	{
	  R++;
	  cilk_for( int j = k+1; j < R; j++)
	    {
	      extern "C" dgemm_(&gemmTRANSA,&gemmTRANSB,&pSize,&pSize,&pSize,&gemmALPHA,B[i][k],&pSize,B[j][k],&pSize,&gemmBETA,B[i][j],&pSize);
	    }
	

Which produces the following error:

blockChol3.cilk: In function int cilk my_main(int, char**):
blockChol3.cilk:147: error: expected unqualified-id before string constant
blockChol3.cilk:148: warning: loop condition compares loop variable to wider type
blockChol3.cilk:156: error: expected unqualified-id before string constant
blockChol3.cilk:72: warning: unused variable SIDE
blockChol3.cilk:74: warning: unused variable TRANSA
blockChol3.cilk:75: warning: unused variable DIAG
blockChol3.cilk:78: warning: unused variable ALPHA
blockChol3.cilk:88: warning: unused variable gemmTRANSA
blockChol3.cilk:89: warning: unused variable gemmTRANSB
blockChol3.cilk:93: warning: unused variable gemmALPHA
blockChol3.cilk:98: warning: unused variable gemmBETA

Thanks for your time!
David


David,

I'm not sure I'll be able to help you, as the error is deep inside the gcc compiler and may have existed in the non-Cilk 4.2.4 compiler. However, there are several things worth trying.

1. The problem does not seem related to which Cilk construct you use, so you should prefer the original code using cilk_for to the modified code using cilk_spawn. It will be more efficient.

2. Try to get rid of the warnings. Initialize any uninitialized variables, make sure your comparisons are of equal-width values, etc.

3. Your idea of using extern "C" may have merit, but your use of extern "C" is incorrect. C linkage is an attribute of a function declaration, not of a function call. Try:

extern "C" return_type dtrsm_(argument declarations);

Then call the function as before. If dtrsm_ is being declared in a header file, try enclosing the entire header in extern "C" like this:

extern "C" {
#include "header.h"
}
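
To make the declaration-versus-call distinction concrete, here is a minimal sketch. The function scale_vec_ is a hypothetical stand-in that is defined in the same file so the example is self-contained; it is not a CLAPACK routine, and scaled_sum is likewise just an illustrative caller. For dtrsm_ and dgemm_ you would rely on the prototypes from your CLAPACK headers instead.

```cpp
#include <cassert>

// Hypothetical stand-in for a Fortran-style routine; NOT part of CLAPACK.
// The extern "C" belongs on the declaration, at file scope.
extern "C" int scale_vec_(const int *n, const double *alpha, double *x);

// Definition with C linkage. In real use this would come from the C library.
extern "C" int scale_vec_(const int *n, const double *alpha, double *x)
{
    for (int i = 0; i < *n; ++i)
        x[i] *= *alpha;
    return 0;
}

// The call site carries no linkage annotation at all.
double scaled_sum(double *x, int n, double alpha)
{
    int status = scale_vec_(&n, &alpha, x);
    assert(status == 0);

    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += x[i];
    return sum;
}
```

Writing extern "C" in front of the call itself, as in your last attempt, is what triggers "expected unqualified-id before string constant".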

I hope this helps.
-Pablo

Unfortunately, we don't have anybody with GCC expertise at the moment.

The Cilk technology is currently in beta as part of the Intel compiler for both 32- and 64-bit Intel chips on both Windows and Linux. If you are interested in signing up for the beta program, please visit http://software.intel.com/en-us/articles/intel-parallel-studio-microsoft-visual-studio-2010-support/.

The new version does away with the Cilk context and Cilk linkage issues, which makes using Cilk much easier.

- Barry

Hi Pablo, thanks again for your help!

Thanks for clearing up the meaning of the extern statement. I declared the offending headers as you suggested but it did not help.

I also converted everything to the type long int. The warnings about loop variables being compared to a wider type are gone.
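
For anyone following along, this is a minimal sketch of the kind of change I mean, assuming the loop bound is a long while the counter was an int. It uses a plain for loop so it builds with any C++ compiler, and the function name count_blocks is made up for illustration; it is not taken from blockChol3.cilk.

```cpp
#include <cassert>

// Illustrative only: the counter now has the same type as the bound,
// so the loop condition compares two values of equal width.
long count_blocks(long k, long P)
{
    assert(k >= 0 && "block index must be non-negative");

    long visited = 0;
    for (long i = k + 1; i < P; ++i)   // previously: int i
        ++visited;
    return visited;
}
```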

The error that I am getting is still:

cilk++ -o blockChol3 -Wall -O0 blockChol3.cilk 
../lapack/CLAPACK-3.2.1/tmglib_LINUX.a 
../lapack/CLAPACK-3.2.1/lapack_LINUX.a 
../lapack/CLAPACK-3.2.1/blas_LINUX.a 
../lapack/CLAPACK-3.2.1/F2CLIBS/libf2c.a 
-I../lapack/CLAPACK-3.2.1/INCLUDE/

blockChol3.cilk: In function void __cilk_loop_d_001(void*, long unsigned int, long unsigned int):
blockChol3.cilk:149: internal compiler error: in make_decl_rtl, at varasm.c:1015
Please submit a full bug report,
with preprocessed source if appropriate.
This compiler version is 4.2.4 (Cilk Arts build 8503)
See  for instructions.

Again, I am not experienced with either Cilk or CLAPACK, but it seems to me that the parallel code the Cilk++ compiler produces is what's having a problem with g++. Do you think it would be worth my while to try to compile my code with an older version of g++?

Any other ideas?

I can submit my code if it would help.

Thanks,
David

Hi Barry,

Thanks for the tip about your upcoming beta. I'm working as a grad student studying parallelism, so it's neat to see what directions the industry is moving in. Unfortunately, once I've got my code working, we want to measure parallel speedup on a 24-core AMD machine, so we won't be using your beta in our final product.

Thanks for your time,
David
