Annotated source listing with optimization reports

Published: 08/23/2016, Last updated: 08/23/2016

Intel® C++/Fortran Compiler 17.0 provides a new option that annotates source code lines with optimization reports.

Syntax
Windows OS: /Qopt-report-annotate[:<keyword>] 
Linux OS and OS X: -qopt-report-annotate[=<keyword>]
          Annotate source files with optimization reports in specified format
            html - annotate in HTML format
            text - annotate in text format (DEFAULT)

Windows OS: /Qopt-report-annotate-position:<keyword>
Linux OS and OS X: -qopt-report-annotate-position=<keyword>
          Specify the site where optimization reports appear in the annotated source
            caller - annotate at caller site
            callee - annotate at callee site
            both   - annotate at both caller and callee site
   
Scope and working scenario

This feature is aimed primarily at Windows command-line users and Linux users, who can use it to integrate the optimization reports with the source listing. The annotate-position option specifies the site where loop-related optimization reports appear in the annotated source for an inlined routine:
(1) The option applies only to functions that the compiler inlined somewhere. The keyword selects the call sites where the routine ended up inlined (caller), the inlined function itself (callee), or both.
(2) The option applies only to loop-related optimizations (i.e. the portion of the output that starts with LOOP BEGIN and ends with LOOP END).

For example

Compiling with the command line
#icc -c -qopt-report5 -qopt-report-annotate -qopt-report-annotate-position=callee annotate.c
produces an annotated source file called "annotate.c.annot" in the working directory.
Open the file (#cat annotate.c.annot) to view the annotations alongside the source code.
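
For orientation, here is a minimal sketch of what a source file like annotate.c could look like. Only the statements that appear in the listings in this article (the declarations, the loop, and the call to f(x)) come from the original. The value of N, the body of f(), the final scaling by h, and the caller print_pi() are assumptions added so the file compiles, and the line numbers will not match the positions quoted in the reports.

```c
/* Hypothetical stand-in for annotate.c; not the file used in the article. */
#include <stdio.h>

#define N 1000000 /* assumed value; the article does not state N */

/* Assumed integrand: 4/(1+x^2), whose integral over [0,1] is pi. */
double f(double x) {
    return 4.0 / (1.0 + x * x);
}

/* Callee: holds the loop that the LOOP BEGIN/LOOP END remarks describe.
   With -qopt-report-annotate-position=callee the loop report for an
   inlined copy is placed here, next to the loop itself. */
double f2(void) {
    int i;
    double sum, pi, x, h;
    h = (double)1.0 / (double)N;
    sum = 0.0;
    for (i = 0; i < N; i++) {
        x = h * (i - 0.5);
        sum = sum + f(x);
    }
    pi = h * sum; /* assumed final step; not shown in the article */
    return pi;
}

/* Hypothetical caller: if the compiler inlines f2() into this function,
   this call is the "caller site" that the position option refers to. */
void print_pi(void) {
    printf("pi is approximately %.6f\n", f2());
}
```

The file has no main() because the article compiles with -c only. Whether the compiler actually inlines f2() into print_pi() depends on the optimization level and its inlining heuristics, so the reports for this sketch may differ from the ones in this article.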

Loop optimization report:
16        h = (double)1.0/(double)N;
17        sum = 0.0;
18
19        for ( i=0; i<N ; i++ ){
//
//LOOP BEGIN at /<absolute path to source file>/annotate.c(19,3) inlined into /<absolute path to source file>/annotate.c(35,14)
//   remark #15305: vectorization support: vector length 2
//   remark #15399: vectorization support: unroll factor set to 4
//   remark #15309: vectorization support: normalized vectorization overhead 0.108
//   remark #15300: LOOP WAS VECTORIZED
//   remark #15475: --- begin vector cost summary ---
//   remark #15476: scalar cost: 46
//   remark #15477: vector cost: 25.500
//   remark #15478: estimated potential speedup: 1.800
//   remark #15486: divides: 1
//   remark #15487: type converts: 1
//   remark #15488: --- end vector cost summary ---
//   remark #25015: Estimate of max trip count of loop=125000000
//LOOP END
//
//LOOP BEGIN at /<absolute path to source file>/annotate.c(19,3)
//   remark #15305: vectorization support: vector length 2
//   remark #15399: vectorization support: unroll factor set to 4
//   remark #15309: vectorization support: normalized vectorization overhead 0.108
//   remark #15300: LOOP WAS VECTORIZED
//   remark #15475: --- begin vector cost summary ---
//   remark #15476: scalar cost: 46
//   remark #15477: vector cost: 25.500
//   remark #15478: estimated potential speedup: 1.800
//   remark #15486: divides: 1
//   remark #15487: type converts: 1
//   remark #15488: --- end vector cost summary ---
//   remark #25015: Estimate of max trip count of loop=125000000
//LOOP END
20            x = h*(i-0.5);
21            sum = sum + f(x);


Inline report:
13      double f2(){
//INLINE REPORT: (f2()) [3/3=100.0%] /<absolute path to source file>/annotate.c(13,12)
//  -> INLINE: (21,19) f(double) (isz = 2) (sz = 9)
//
///<absolute path to source file>/annotate.c(13,12):remark #34051: REGISTER ALLOCATION : [f2] /<absolute path to source file>/annotate.c:13
//
//    Hardware registers
//        Reserved     :    2[ rsp rip]
//        Available    :   39[ rax rdx rcx rbx rbp rsi rdi r8-r15 mm0-mm7 zmm0-zmm15]
//        Callee-save  :    6[ rbx rbp r12-r15]
//        Assigned     :   16[ rax rdx zmm0-zmm13]
//
//    Routine temporaries
//        Total         :      53
//            Global    :      15
//            Local     :      38
//        Regenerable   :       6
//        Spilled       :       0
//
//    Routine stack
//        Variables     :       0 bytes*
//            Reads     :       0 [0.00e+00 ~ 0.0%]
//            Writes    :       0 [0.00e+00 ~ 0.0%]
//        Spills        :       0 bytes*
//            Reads     :       0 [0.00e+00 ~ 0.0%]
//            Writes    :       0 [0.00e+00 ~ 0.0%]
//
//    Notes
//
//        *Non-overlapping variables and spills may share stack space,
//         so the total stack size might be less than this.
//
//
14        int i;
15        double sum, pi, x, h;
16        h = (double)1.0/(double)N;
17        sum = 0.0;

The reports above show that the inlining and loop optimization reports are generated inline with the source code lines.
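
Because the annotations are ordinary comment lines, an annotated listing can also be scanned by a small program. The sketch below assumes the annotation format shown in the listings above (reports opening with "LOOP BEGIN", and remark #15300 marking a vectorized loop). The function names are hypothetical helpers and not part of any Intel tool, and a source file that happens to contain the same strings in its own comments would be counted as well.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in text. */
static int count_occurrences(const char *text, const char *needle) {
    int count = 0;
    size_t len = strlen(needle);
    if (len == 0) {
        return 0;
    }
    for (const char *p = strstr(text, needle); p != NULL;
         p = strstr(p + len, needle)) {
        count++;
    }
    return count;
}

/* Each loop report in the annotated listing opens with "LOOP BEGIN". */
int count_loop_reports(const char *annotated) {
    return count_occurrences(annotated, "LOOP BEGIN");
}

/* Remark #15300 ("LOOP WAS VECTORIZED") marks a vectorized loop. */
int count_vectorized_loops(const char *annotated) {
    return count_occurrences(annotated, "remark #15300");
}
```

Passing the contents of annotate.c.annot as a single string gives a quick count of how many loop reports the file contains and how many of them report vectorization.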

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804