Here's an unlikely failure when a closure is executed after ArBB compilation.
It's a simple sparse-matrix dense-vector multiply, resulting in a dense vector. The sparse matrix has at least one non-zero entry in each row, which is generally true for realistic problems, so it can be thought of as a flattened nested container (i.e. dense vector) with companion dense vectors for segment length and coefficient column index. This example includes a dense vector for segment offset, as it helps illustrate the failure.
The closure function is...
void foo(dense &Ans, const dense &Vec, const dense &Spm,
const dense &Len, const dense &Idx, const dense &Off)
{
dense tmp = Spm*gather(Vec, Idx);
Ans = add_reduce(reshape_nested_lengths(tmp, Len)); // fails
// Ans = add_reduce(reshape_nested_offsets(tmp, Off)); // works
}
...we can use this data...
dense idx = dense::parse("{0,1,2,1,2,3}");
dense len = dense::parse("{1,2,2,1}");
dense off = dense::parse("{0,1,3,5}");
dense spm = dense::parse("{1.,.5,.5,.5,.5,1.}");
dense vec = dense::parse("{1.,2.,3.,4.}");
dense ans;
// answer will be {1, 2.5, 2.5, 4}
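As a cross-check on that expected answer, here is a minimal plain C++ sketch (no ArBB involved) of the same gather, multiply, and per-segment sum. The function name spmv_by_lengths and the use of std::vector with double and std::size_t are assumptions made for illustration only; they are not part of the ArBB code in this post.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reference routine: y = A*x, where the sparse matrix A is stored
// as flattened coefficients (spm), their column indices (idx), and the number
// of stored entries in each row (len).
std::vector<double> spmv_by_lengths(const std::vector<double>& spm,
                                    const std::vector<std::size_t>& idx,
                                    const std::vector<std::size_t>& len,
                                    const std::vector<double>& vec)
{
    assert(spm.size() == idx.size());
    std::vector<double> ans;
    ans.reserve(len.size());
    std::size_t pos = 0;                      // position in the flattened data
    for (std::size_t n : len) {               // one segment per matrix row
        assert(pos + n <= spm.size());
        double sum = 0.0;
        for (std::size_t k = 0; k < n; ++k, ++pos) {
            assert(idx[pos] < vec.size());
            sum += spm[pos] * vec[idx[pos]];  // gather, multiply, accumulate
        }
        ans.push_back(sum);
    }
    return ans;
}
```

With the data above, the per-entry products are 1, 1, 1.5, 1, 1.5, 4, and summing them in segments of lengths 1, 2, 2, 1 gives {1, 2.5, 2.5, 4}.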
...and compile this...
const closure &, const dense &, const dense &,
const dense &, const dense &, const dense &)>bar = capture(foo);
bar(ans, vec, spm, len, idx, off);
bar(ans, vec, spm, len, idx, off); // 2nd call fails, executable dies
But if we comment out the 2nd line of foo() and use the 3rd line instead, it works as expected (but is much slower, by the way, which is unexpected, as segment offsets are merely the running sum of the segment lengths).
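To make the "running sum" relationship concrete, here is a small plain C++ sketch that derives the offsets from the lengths. The helper name offsets_from_lengths is hypothetical and the code does not use ArBB; it only shows that off is the exclusive prefix sum of len.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: each segment's offset is the sum of the lengths of all
// segments before it (an exclusive running sum).
std::vector<std::size_t> offsets_from_lengths(const std::vector<std::size_t>& len)
{
    std::vector<std::size_t> off;
    off.reserve(len.size());
    std::size_t running = 0;
    for (std::size_t n : len) {
        off.push_back(running);  // offset of this segment
        running += n;            // advance past this segment
    }
    return off;
}
```

For len = {1,2,2,1} this yields {0,1,3,5}, which matches the off vector used above.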
- paul



