Annotated source listing with optimization reports

Published: 08/23/2016, Last updated: 08/23/2016

Intel® C++/Fortran Compiler 17.0 provides new options that annotate source code lines with optimization reports.

Syntax
Windows OS: /Qopt-report-annotate[:<keyword>] 
Linux OS and OS X: -qopt-report-annotate[=<keyword>]
          Annotate source files with optimization reports in specified format
            html - annotate in HTML format
            text - annotate in text format (DEFAULT)

Windows OS: /Qopt-report-annotate-position:<keyword>
Linux OS and OS X: -qopt-report-annotate-position=<keyword>
          Specify the site where optimization reports appear in the annotated source
            caller - annotate at caller site
            callee - annotate at callee site
            both   - annotate at both caller and callee sites
   
Scope and working scenario

This feature is aimed primarily at Windows command-line users and Linux users, who can use it to integrate optimization reports with the source listing. The -qopt-report-annotate-position option (/Qopt-report-annotate-position on Windows) specifies the site where loop-related optimization reports appear in the annotated source for inlined routines:
(1) The option applies only to functions that the compiler inlined somewhere. It places the reports at the call sites where the routine ended up inlined, at the inlined function itself, or at both.
(2) The option applies only to loop-related optimization reports (that is, the portion of the output that starts with LOOP BEGIN and ends with LOOP END).

For example

Compiling with the following command line produces an annotated source file called "annotate.c.annot" in the working directory:
#icc -c -qopt-report5 -qopt-report-annotate -qopt-report-annotate-position=callee annotate.c
Open the file (#cat annotate.c.annot) to view the annotations along with the source code.
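
The article does not show annotate.c itself. As a rough illustration only, a file along the following lines would be consistent with the fragments visible in the annotated output. The body of f2() follows the listing; the value of N, the integrand f(), the final pi = h*sum step and the caller estimate_pi() are assumptions introduced here, and the line numbers will not match the (19,3) and (35,14) positions in the reports.

```c
/* annotate.c: hypothetical stand-in for the article's test file.
 * Only the statements inside f2() up to the loop body come from the
 * annotated listing; everything else is an assumption for illustration. */

#define N 1000000   /* assumed value; the article does not show the real one */

/* Assumed integrand: the integral of 4/(1+x^2) over [0,1] equals pi. */
double f(double x)
{
    return 4.0 / (1.0 + x * x);
}

/* Callee: the loop below is the kind of loop that a
 * LOOP BEGIN ... LOOP END report block describes. */
double f2()
{
    int i;
    double sum, pi, x, h;
    h = (double)1.0/(double)N;
    sum = 0.0;

    for ( i=0; i<N ; i++ ){
        x = h*(i-0.5);
        sum = sum + f(x);
    }
    pi = h*sum;     /* assumed final step; not shown in the article */
    return pi;
}

/* Hypothetical caller: a call site into which the compiler may inline f2(). */
double estimate_pi(void)
{
    return f2();
}
```

Because the command above compiles with -c, the sketch needs no main(). With -qopt-report-annotate-position=callee the loop report would be expected next to the for loop inside f2(); with caller it would be expected at the call in estimate_pi(), provided the compiler actually inlines f2() there.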

Loop optimization report:
16        h = (double)1.0/(double)N;
17        sum = 0.0;
18
19        for ( i=0; i<N ; i++ ){
//
//LOOP BEGIN at /<absolute path to source file>/annotate.c(19,3) inlined into /<absolute path to source file>/annotate.c(35,14)
//   remark #15305: vectorization support: vector length 2
//   remark #15399: vectorization support: unroll factor set to 4
//   remark #15309: vectorization support: normalized vectorization overhead 0.108
//   remark #15300: LOOP WAS VECTORIZED
//   remark #15475: --- begin vector cost summary ---
//   remark #15476: scalar cost: 46
//   remark #15477: vector cost: 25.500
//   remark #15478: estimated potential speedup: 1.800
//   remark #15486: divides: 1
//   remark #15487: type converts: 1
//   remark #15488: --- end vector cost summary ---
//   remark #25015: Estimate of max trip count of loop=125000000
//LOOP END
//
//LOOP BEGIN at /<absolute path to source file>/annotate.c(19,3)
//   remark #15305: vectorization support: vector length 2
//   remark #15399: vectorization support: unroll factor set to 4
//   remark #15309: vectorization support: normalized vectorization overhead 0.108
//   remark #15300: LOOP WAS VECTORIZED
//   remark #15475: --- begin vector cost summary ---
//   remark #15476: scalar cost: 46
//   remark #15477: vector cost: 25.500
//   remark #15478: estimated potential speedup: 1.800
//   remark #15486: divides: 1
//   remark #15487: type converts: 1
//   remark #15488: --- end vector cost summary ---
//   remark #25015: Estimate of max trip count of loop=125000000
//LOOP END
20            x = h*(i-0.5);
21            sum = sum + f(x);


Inline report:
13      double f2(){
//INLINE REPORT: (f2()) [3/3=100.0%] /<absolute path to source file>/annotate.c(13,12)
//  -> INLINE: (21,19) f(double) (isz = 2) (sz = 9)
//
///<absolute path to source file>/annotate.c(13,12):remark #34051: REGISTER ALLOCATION : [f2] /<absolute path to source file>/annotate.c:13
//
//    Hardware registers
//        Reserved     :    2[ rsp rip]
//        Available    :   39[ rax rdx rcx rbx rbp rsi rdi r8-r15 mm0-mm7 zmm0-zmm15]
//        Callee-save  :    6[ rbx rbp r12-r15]
//        Assigned     :   16[ rax rdx zmm0-zmm13]
//
//    Routine temporaries
//        Total         :      53
//            Global    :      15
//            Local     :      38
//        Regenerable   :       6
//        Spilled       :       0
//
//    Routine stack
//        Variables     :       0 bytes*
//            Reads     :       0 [0.00e+00 ~ 0.0%]
//            Writes    :       0 [0.00e+00 ~ 0.0%]
//        Spills        :       0 bytes*
//            Reads     :       0 [0.00e+00 ~ 0.0%]
//            Writes    :       0 [0.00e+00 ~ 0.0%]
//
//    Notes
//
//        *Non-overlapping variables and spills may share stack space,
//         so the total stack size might be less than this.
//
//
14        int i;
15        double sum, pi, x, h;
16        h = (double)1.0/(double)N;
17        sum = 0.0;

The listings above show that the inline and loop optimization reports are generated inline with the source code lines.

Product and Performance Information

1

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804