Cilkview bug

Hi,

   I'm getting some suspicious cilkview output (pasted below).  Notice that the work calculated in the cilk parallel region is implausibly large.

I'm using:

Cilkview version 2.0.0, build 3566, built Aug 20 2013 13:58:06
Using PIN 2.12, Build 59761

gcc (GCC) 4.8.0 20120618 (experimental)

gcc -o qsort.64 -O3 -g -Werror -gdwarf-3 -std=gnu99 -m64 -I/afs/csail.mit.edu/proj/courses/6.172/cilkutil/include qsort.c -lcilkrts -ldl

Whole Program Statistics
1) Parallelism Profile
Work : 13,138 instructions
Span : 5,129,893,803 instructions
Burdened span : 91,087,059,418 instructions
Parallelism : 0.00
Burdened parallelism : 0.00
Number of spawns/syncs: 20,000,000
Average instructions / strand : 0
Strands along span : 33,456,971
Average instructions / strand on span : 153
Total number of atomic instructions : 20,000,040
Frame count : 40,000,000

2) Speedup Estimate
2 processors: 0.00 - 0.00
4 processors: 0.00 - 0.00
8 processors: 0.00 - 0.00
16 processors: 0.00 - 0.00
32 processors: 0.00 - 0.00
64 processors: 0.00 - 0.00
128 processors: 0.00 - 0.00
256 processors: 0.00 - 0.00

Cilk Parallel Region(s) Statistics - Elapsed time: 0.793 seconds
1) Parallelism Profile
Work : 18,446,744,073,168,188,722 instructions
Span : 4,588,517,771 instructions
Burdened span : 90,545,683,386 instructions
Parallelism : 4020196715.76
Burdened parallelism : 203728586.32
Number of spawns/syncs: 20,000,000
Average instructions / strand : 381,237,467,926
Strands along span : 16,728,485
Average instructions / strand on span : 274
Total number of atomic instructions : 20,000,040
Frame count : 40,000,000
Entries to parallel region : 10

2) Speedup Estimate
2 processors: 1.90 - 2.00
4 processors: 3.80 - 4.00
8 processors: 7.60 - 8.00
16 processors: 15.20 - 16.00
32 processors: 30.40 - 32.00
64 processors: 60.80 - 64.00
128 processors: 121.60 - 128.00
256 processors: 243.20 - 256.00

Attachment: qsort.c (4.29 KB)

Is this run using one of our sample programs?

    - Barry

No, I'm running cilkview --trials=all ./qsort.64

qsort.64 is generated using the gcc command in my original post and the source is qsort.c which I included in my original post.  

Thanks,

Will

Ah.  I missed it in the original post.  I'll take a look and get back to you.

   - Barry

FYI, I'm unable to reproduce the problem with the latest build from the "cilkplus" branch.  I'm pulling the snapshot from the "cilkplus-4_8-branch" now.

    - Barry

Hello William,

   Is there any reason why you are forcing DWARF 3 (-gdwarf-3)? If not, could you please compile without that flag and try the Cilk tools again? In the past we have had issues with the metadata when the DWARF version is changed.

Thanks,

Balaji V. Iyer.

It's not reproducing for me with the cilkplus-4_8-branch either.

4.9: gcc (GCC) 4.9.0 20130520 (experimental)

4.8: gcc (GCC) 4.8.1 20130520 (prerelease)

Cilkview run with the image produced by the cilkplus-4_8-branch compiler:

Cilkview: Generating scalability data
Cilkview Scalability Analyzer V2.0.0, Build 3808
Sorting 1000000 integers
Running 10 trials
All sorts succeeded
Sort time: 6.694000e+00 seconds

Whole Program Statistics
1) Parallelism Profile
Work : 6,799,746,125 instructions
Span : 835,654,456 instructions
Burdened span : 851,217,096 instructions
Parallelism : 8.14
Burdened parallelism : 7.99
Number of spawns/syncs: 20,000,000
Average instructions / strand : 113
Strands along span : 1,407
Average instructions / strand on span : 593,926
Total number of atomic instructions : 20,000,070
Frame count : 40,000,000

2) Speedup Estimate
2 processors: 1.65 - 2.00
4 processors: 2.44 - 4.00
8 processors: 3.21 - 8.00
16 processors: 3.82 - 8.14
32 processors: 4.21 - 8.14
64 processors: 4.44 - 8.14
128 processors: 4.57 - 8.14
256 processors: 4.63 - 8.14

Cilk Parallel Region(s) Statistics - Elapsed time: 0.631 seconds
1) Parallelism Profile
Work : 6,258,755,107 instructions
Span : 294,663,438 instructions
Burdened span : 310,226,078 instructions
Parallelism : 21.24
Burdened parallelism : 20.17
Number of spawns/syncs: 20,000,000
Average instructions / strand : 104
Strands along span : 703
Average instructions / strand on span : 419,151
Total number of atomic instructions : 20,000,070
Frame count : 40,000,000
Entries to parallel region : 10

2) Speedup Estimate
2 processors: 1.84 - 2.00
4 processors: 3.19 - 4.00
8 processors: 5.03 - 8.00
16 processors: 7.07 - 16.00
32 processors: 8.86 - 21.24
64 processors: 10.14 - 21.24
128 processors: 10.94 - 21.24
256 processors: 11.38 - 21.24

The copy of Cilkview I'm running is from the current build, but nothing's been done to it since the release you're using.

I just discussed the GCC version numbers with Balaji.  GCC does not automatically bump its version number when a fix is checked in, but your build is dated 2012 (20120618).  I suspect we've fixed the problem you're seeing in a more recent compiler.

    - Barry

Hey guys,

   Thanks - I updated my version of GCC and it all seems to be working fine.  It's a little bizarre that Cilkview behaved that way, but the subtlety may just be a curiosity left for the archaeologists of the next millennium...

Thanks,

Will
