I am developing new software using the MKL implementation of ScaLAPACK/BLACS. On my small test cluster everything worked fine. However, since I moved to the 'big' cluster to start computations, the program sometimes fails with the following error (repeated for different ranks):
Rank 38 [Thu Apr 17 16:08:18 2014] [c7-1c1s8n3] Fatal error in MPI_Recv: Invalid tag, error stack:
MPI_Recv(192): MPI_Recv(buf=0x3847640, count=64, MPI_INT, src=37, tag=5000000, comm=0x84000004, status=0x7fffffff7418) failed
MPI_Recv(113): Invalid tag, value is 5000000
I am using the compiler and libraries from Intel Composer XE Edition 2013 SP1. The cluster is a Cray XC30 series machine and uses Cray's own MPI implementation. I have read that this MPI implementation has its own limit for the 'tag' parameter (I tried different versions of it). However, from what I have read, that limit should be higher than 5000000.
Has anybody else run into the same error? Is it related to the MKL implementation or to the MPI implementation?