understanding cilkview output

fahadsatti
I wrote a program to calculate and analyze an FFT. When I analyze the program with cilkview, I get the following output:

Statistics for fftp_out
1) Parallelism Profile
   Work : 4208964828 instructions
   Span : 102287518 instructions
   Burdened span : 104256627 instructions
   Parallelism : 41.15
   Burdened parallelism : 40.37
   Number of spawns/syncs: 286721
   Average instructions / strand : 4893
   Strands along span : 129
   Average instructions / strand on span : 792926
   Total number of atomic instructions : 10692
   Frame count : 7409661
2) Speedup Estimate
   2 procs: 1.92 - 2.00
   4 procs: 3.55 - 4.00
   8 procs: 6.18 - 8.00
   16 procs: 9.81 - 16.00
   32 procs: 13.88 - 32.00

Whole Program Statistics:
Cilkview Scalability Analyzer V1.1.0, Build 8503
1) Parallelism Profile
   Work : 8,099,110,662 instructions
   Span : 3,992,433,352 instructions
   Burdened span : 3,994,402,461 instructions
   Parallelism : 2.03
   Burdened parallelism : 2.03
   Number of spawns/syncs: 286,721
   Average instructions / strand : 9,415
   Strands along span : 259
   Average instructions / strand on span : 15,414,800
   Total number of atomic instructions : 6,229,955
   Frame count : 7409664
2) Speedup Estimate
   2 processors: 1.09 - 2.00
   4 processors: 1.14 - 2.03
   8 processors: 1.16 - 2.03
   16 processors: 1.18 - 2.03
   32 processors: 1.19 - 2.03

My first question is: why is the parallelism in the first output so different from the second? Secondly, I don't really understand what these two outputs refer to. Any help will be greatly appreciated.
William Leiserson (Intel)

The first output covers the section of code that you instrumented, that is, whatever was executed between cv.start() and cv.stop().

The second output covers the whole program. Typically, test programs begin by doing a bunch of serial work, like loading data structures, before they get to the parallel work. Serial work adds the same number of instructions to both work and span, which is why the whole-program parallelism comes out so much lower than that of the instrumented region. No graph is generated for the Whole Program Statistics, and it always appears last in the output. It's presented so that even if you didn't instrument any of your code, Cilkview will still give you an analysis of the total parallelism of your program.
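
To see the difference in numbers, here is a small sketch that redoes the arithmetic from the two reports in the question. It is plain Python rather than anything Cilkview ships; the dictionary and function names are made up for illustration. It assumes only that parallelism is work divided by span, and that speedup on P processors can exceed neither P nor the parallelism, which is consistent with the upper ends of the reported speedup ranges. The lower ends of those ranges are not reproduced here.

```python
# Figures copied from the Cilkview report in the question.
REGION = {"work": 4_208_964_828, "span": 102_287_518, "burdened_span": 104_256_627}
WHOLE = {"work": 8_099_110_662, "span": 3_992_433_352, "burdened_span": 3_994_402_461}


def parallelism(work, span):
    """Parallelism is total work divided by the span (critical path)."""
    return work / span


def speedup_upper_bound(procs, work, span):
    """Speedup can exceed neither the processor count nor the parallelism."""
    return min(procs, parallelism(work, span))


# Instructions that ran outside the cv.start()/cv.stop() region.
extra_work = WHOLE["work"] - REGION["work"]
extra_span = WHOLE["span"] - REGION["span"]

if __name__ == "__main__":
    for name, stats in (("region", REGION), ("whole program", WHOLE)):
        p = parallelism(stats["work"], stats["span"])
        bp = parallelism(stats["work"], stats["burdened_span"])
        print(f"{name}: parallelism {p:.2f}, burdened parallelism {bp:.2f}")
    print(f"work outside region: {extra_work:,}")
    print(f"span outside region: {extra_span:,}")
    for procs in (2, 4, 8, 16, 32):
        bound = speedup_upper_bound(procs, WHOLE["work"], WHOLE["span"])
        print(f"{procs} processors, whole program: at most {bound:.2f}")
```

The two parallelism figures match the report (41.15 and 2.03). The code outside the instrumented region contributes exactly 3,890,145,834 instructions to both work and span, so all of it sits on the critical path, which is what you would expect from purely serial setup code.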

William Leiserson (Intel)

Whoops. Sorry, I forgot about the duplicate thread.
