Variable-length arrays and CilkPlus

It seems that I can use OpenMP together with CilkPlus array notation on variable-length arrays, but not _Cilk_for.  This is with the Intel compiler, build 13.1.2.183.  I get messages like:

junk.cpp(10): error: a variable captured by a lambda cannot have a type involving a variable-length array
              a[j][i] = (a[j][i] - __sec_reduce_add(a[0:j][i]*a[0:j][j])) / x;

It works perfectly if I replace the _Cilk_for by for and use '#pragma omp parallel for', and it works (but does not parallelise) if I do the same but use '#pragma simd'.  This is an example of the code:

void cholesky_cilkplus (double a[size][size]) {
    for (int j = 0; j < size; ++j) {
        a[j][0:j] = 0.0;
        double x = a[j][j] = sqrt(a[j][j]-__sec_reduce_add(a[0:j][j]*a[0:j][j]));
        _Cilk_for (int i = j+1; i < size; ++i)
            a[j][i] = (a[j][i] - __sec_reduce_add(a[0:j][i]*a[0:j][j])) / x;
    }
}

Since I raised this in the context of gcc, I have written a multi-dimensional array template and have passed it on to the WG21 scientific extension group for consideration.  But that is another matter.  What I would like to know is whether the restriction that the above combination is not supported is an oversight, deliberate, permanent etc.   Thanks for any comments.

I suppose the change in C11 which made VLA support optional doesn't help the case for making it part of a C++ standard.
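
As a minimal sketch of what "optional" means in practice: a C11 implementation that does not support VLAs defines the macro __STDC_NO_VLA__, so code can test for it and fall back to flat indexing. The function name trace_diag and the macro HAVE_VLA are made up for this illustration; compilers older than C11 do not define the macro at all, so the test below assumes a C99 or later compiler.

```c
#include <assert.h>
#include <stddef.h>

/* C11 defines __STDC_NO_VLA__ when VLAs are unsupported. */
#if defined(__STDC_NO_VLA__)
#define HAVE_VLA 0
#else
#define HAVE_VLA 1
#endif

/* Sum of the diagonal of an n-by-n matrix stored contiguously, row by row. */
double trace_diag(size_t n, const double *flat)
{
    assert(flat != NULL);
    double t = 0.0;
#if HAVE_VLA
    /* View the flat buffer through a pointer to variable-length rows. */
    const double (*a)[n] = (const double (*)[n])flat;
    for (size_t i = 0; i < n; ++i)
        t += a[i][i];
#else
    /* No VLA support: compute the offset by hand. */
    for (size_t i = 0; i < n; ++i)
        t += flat[i * n + i];
#endif
    return t;
}
```

Both branches compute the same value; only the indexing syntax differs.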

I think Cilk(tm) Plus is intended to be fully compatible with C99, but I don't find it well documented. I wouldn't expect cilk_for to be compatible with -openmp compilation, but better documentation and error messages seem important.  The position on Cilk reducers under OpenMP seems intentionally ambiguous, but there ought not to be a problem as long as the reducers don't invoke multiple workers.

The compiler version you quote, and the current one, have trouble with the Fortran analogues of VLA (automatic and allocatable) when -openmp is set. 13.1.192 appears to be more reliable on the Fortran side.  I haven't had any feedback on problem reports which I had to resubmit for the 14.0 compiler.

With regard to C11, yes :-(  I shall refrain from publishing more, because polite words fail me.

Thanks for that information about 192 - it may be useful to me with another hat on.  I will pass it on (I don't manage the system I am using).  My code uses OpenMP and CilkPlus only serially, but you have a point that I really ought to separate them out using scripting and #if.

This appears to be fixed in 14.0.  The following program compiles without any errors:

#include <stdio.h>
#include <cilk/cilk.h>

void do_test(int n)
{
    int arg[n];  /* variable-length array used inside the cilk_for body */
    cilk_for (int i = 0; i < n; i += n)
    {
        arg[i] = n - 1;
        printf("hello world %d\n", i);
    }
}

int main(int argc, char *argv[])
{
    int n = 1000;
    do_test(n);
    return 0;
}

    - Barry

Thanks very much.

A bit more on this.  I have separated them, and tried several approaches, and there doesn't appear to be ANY way of using CilkPlus 13.x for clean array code (as in, say, linear algebra) to provide the simplicity, convenience and safety of (say) Fortran+OpenMP.  I should appreciate a correction if I am wrong, but I think that it means waiting for 14.0 and trying again.

Almost all such array code needs multi-dimensional arrays whose size varies at run-time;  the best standard solution is to write a class to do it, but I can't get the array notation to work on either of the two classes I have tried - the classes support the same textual code as standard C++ multi-dimensional arrays, but that's not enough.  I think that _Cilk_for would work but, in itself,  that's no great improvement over OpenMP parallel for for such code.  To get the simplicity, convenience and safety, such code needs all of array notation, reducers and _Cilk_for.

Even assuming 14.0 allows this for variable-length built-in arrays, there will remain the issue of getting something through the standards process, but that's a task for another day.

It's been a while since we did the work, so I just went back and checked the dates in the source code repository.  Cilk support for Variable Length Arrays (VLAs) wasn't implemented until 14.0, so this isn't entirely surprising.  Expanding a frame's size when it's been stolen was tricky to get right in all of the cases.

14.0 (Intel Composer XE 2013 SP1) was released to manufacturing back in August, so it should be available to you now.

    - Barry

No joy, I am afraid.  My guess is that it can handle vectors because of the vector/pointer kludge, but not multidimensional ones, because that kludge isn't enough.  I tried with 14.0.0.080 and it was no better.  Here is the kernel of my test - it is just the LAPACK logic, and is used as the simplest example of an extremely common (almost dominating) requirement in scientific programming.

#define SUM __sec_reduce_add

int size;

void cholesky_cilkplus (double a[size][size]) {
    for (int j = 0; j < size; ++j) {
        a[j][0:j] = 0.0;
        double x = a[j][j] = sqrt(a[j][j]-SUM(a[0:j][j]*a[0:j][j]));
        _Cilk_for (int i = j+1; i < size; ++i)
            a[j][i] = (a[j][i] - SUM(a[0:j][i]*a[0:j][j])) / x;
    }
}

void solve_cilkplus (double a[size][size], double b[size][size]) {
/* It is too complicated to use const. */
    for (int i = 0; i < size; ++i) {
        _Cilk_for (int j = 0; j < size; ++j) {
            b[j][i] /= a[i][i];
            b[j][i+1:size-i-1] -= b[j][i] * a[i][i+1:size-i-1];
        }
    }
    _Cilk_for (int j = 0; j < size; ++j) {
        for (int i = size-1; i >= 0; --i)
            b[j][i] =
                (b[j][i]-SUM(a[i][i+1:size-i-1]*b[j][i+1:size-i-1]))/
                    a[i][i];
    }
}
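
For comparison, the OpenMP route that the opening post reports as working with variable-length arrays can be sketched in plain C99 for the solve step. This is an illustration, not the poster's code: the array-notation sections are expanded into scalar loops, the size is passed as a parameter n instead of being read from the global size, and the loop over right-hand sides is hoisted outermost so that a single '#pragma omp parallel for' covers both substitution phases. The name solve_omp is made up here.

```c
#include <assert.h>

/* Solve (U^T U) x = rhs in place for every row of b, where a holds the
   upper-triangular Cholesky factor U.  Each row of b is one right-hand
   side, so the rows can be processed independently. */
void solve_omp(int n, double a[n][n], double b[n][n])
{
    #pragma omp parallel for
    for (int j = 0; j < n; ++j) {
        /* Forward substitution with U^T. */
        for (int i = 0; i < n; ++i) {
            assert(a[i][i] != 0.0);
            b[j][i] /= a[i][i];
            /* b[j][i+1:n-i-1] -= b[j][i] * a[i][i+1:n-i-1] */
            for (int k = i + 1; k < n; ++k)
                b[j][k] -= b[j][i] * a[i][k];
        }
        /* Back substitution with U. */
        for (int i = n - 1; i >= 0; --i) {
            /* SUM(a[i][i+1:n-i-1] * b[j][i+1:n-i-1]) */
            double s = 0.0;
            for (int k = i + 1; k < n; ++k)
                s += a[i][k] * b[j][k];
            b[j][i] = (b[j][i] - s) / a[i][i];
        }
    }
}
```

Without an OpenMP flag the pragma is ignored and the code runs serially with the same result. What this version loses is exactly the point of the thread: the compact section syntax and the reducer.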

Thanks for the example.  I'll file a bug report for the Intel compiler.

Any news on this one?  That's merely for information and my own medium-term prioritisation, as I have plenty else to do!

14.0.0 resurfaced some bugs in this area, but 14.0.1 appears to have regained stability for local stack allocations on the Fortran side. 

In this example posted on Oct. 5, with a 14.0.1 Intel compiler I get repeated complaints:

nm.c(8): error: identifier "sqrt" is undefined  [due to omission of <math.h>]

nm.c(10): error: a variable captured by a lambda cannot have a type involving a variable-length array

....

so it seems the effort to support VLA with CEAN has been rolled back. (Same message for C99 or C++)

Thanks very much.  That's a useful update.

There's been no reported progress on the VLA capture issue.  I did find that Nick is not alone in reporting the problem.

I work with Composer XE 2013 SP1 Update 1 (package 139), which contains the latest published version of Composer XE 14, integrated with Visual Studio 2010 (SP1).

In my code I have an array of StepParameters structures (allocated through the pointer p2dStepParameters_ar) and want to use it as a 2D array whose row length is max_num_of_steps, declared as const int and initialized with a value passed in from outside the routine.

For this purpose I define the pointer a2d_StepPars_ray_step with the appropriate cast:

StepParameters (*a2d_StepPars_ray_step)[max_num_of_steps] =
                             (StepParameters (*)[max_num_of_steps])p2dStepParameters_ar;

I try to use this pointer as follows:

StepParameters *pa_curr_R_steps = a2d_StepPars_ray_step[iR]; // 'Line' of steps for given iR

This line compiles successfully inside a regular C for loop, but inside a cilk_for loop it causes the error:

a variable captured by a lambda cannot have a type involving a variable-length arr

Is this the bug mentioned above? When can we expect a fix?
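
One possible way to sidestep the variably-modified pointer type is to compute the row start with plain pointer arithmetic, so that the loop body only refers to an ordinary pointer and an integer. The sketch below is untested against the affected Intel compiler releases, so whether it avoids the error there is an assumption. The StepParameters fields and the names row_of and mark_rows are hypothetical stand-ins, not taken from the post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the poster's StepParameters structure. */
typedef struct {
    double length;
    int    flags;
} StepParameters;

/* Start of row iR in a flat buffer viewed as rows of row_len elements.
   No VLA type is involved, only pointer arithmetic. */
StepParameters *row_of(StepParameters *base, size_t row_len, size_t iR)
{
    assert(base != NULL);
    return base + iR * row_len;
}

/* Example use: tag every step with its row index.  The outer loop is the
   one that would become a cilk_for; it touches only base and row_len. */
void mark_rows(StepParameters *base, size_t n_rows, size_t row_len)
{
    for (size_t iR = 0; iR < n_rows; ++iR) {
        StepParameters *row = row_of(base, row_len, iR);
        for (size_t s = 0; s < row_len; ++s)
            row[s].flags = (int)iR;
    }
}
```

row_of(base, n, iR) yields the same address as indexing through a pointer of type StepParameters (*)[n], so existing per-row code should not need to change.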

 

So nobody has to go searching for this again, the various reports of this bug have been collected under CQ153038.  It is not yet fixed and I don't know when a fix will be available.

  - Barry

I happened on a case where VLA appears to work quite well with Cilk(tm) Plus on my corei7-4 Windows laptop, but fails on MIC native, until I apply a sufficient group of alignment attributes and assertions (which need ifdef variations to be accepted on Windows).  On the Windows side, vec-report6 still claims there is an unaligned access (identical to one which is reported aligned).  On the MIC side the work on correcting unaligned access has improved reliability and performance (but not to where it matches plain C).
