September 21, 2010 6:29 PM PDT
by Levent Akyil
Leveraging the SIMD and SSE (Streaming SIMD Extensions) support available on target processors is one of the key optimization techniques JVMs use (or should use). The question is how to identify which JIT-compiled methods use SSE. This is a common question from Java developers.
The VTune analyzer's event-based sampling can help users pinpoint exactly which methods are optimized to use SSE. Simply collect the SIMD_INST_RETIRED.ANY event (Retired Streaming SIMD instructions (precise event)) to count the overall number of SIMD instructions retired.
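To have something concrete to sample, a small floating-point kernel is enough. The sketch below is only an illustration: the class name `SimdWorkload`, the array size, and the iteration count are my own assumptions, not part of SciMark2 or VTune. Whether the JIT actually emits SSE instructions for `dot` depends on the JVM, its flags, and the processor, which is exactly what the sampling run is meant to reveal.

```java
// Hypothetical workload for a sampling run; names and sizes are illustrative only.
class SimdWorkload {

    // Double-precision dot product. A JIT targeting SSE2 may compile the
    // multiply/add with scalar or packed SSE2 instructions, but that is not guaranteed.
    static double dot(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1 << 16;
        double[] a = new double[n];
        double[] b = new double[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2.0;
        }

        // Call the method many times so the JIT has a chance to compile it
        // and the sampler has enough retired instructions to attribute.
        double total = 0.0;
        for (int iter = 0; iter < 5000; iter++) {
            total += dot(a, b);
        }
        // Print the result so the loop cannot be optimized away entirely.
        System.out.println(total);
    }
}
```

If `dot` shows up with a high SIMD_INST_RETIRED.ANY sample count, the Jitted code for that method is using SIMD instructions; if it does not, the JVM is generating something else for it.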
The following events give a further breakdown of the SIMD_INST_RETIRED.ANY event. They are available on Core™ architecture-based processors.
| Symbol Name [VTune(TM) Analyzer help] | Description |
| --- | --- |
| FP_MMX_TRANS.TO_FP | Transitions from MMX (TM) Instructions to Floating Point Instructions. |
| FP_MMX_TRANS.TO_MMX | Transitions from Floating Point to MMX (TM) Instructions. |
| SIMD_ASSIST | SIMD assists invoked. |
| SIMD_COMP_INST_RETIRED.PACKED_DOUBLE | Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions. |
| SIMD_COMP_INST_RETIRED.PACKED_SINGLE | Retired computational Streaming SIMD Extensions (SSE) packed-single instructions. |
| SIMD_COMP_INST_RETIRED.SCALAR_DOUBLE | Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions. |
| SIMD_COMP_INST_RETIRED.SCALAR_SINGLE | Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions. |
| SIMD_INSTR_RETIRED | SIMD Instructions retired. |
| SIMD_INST_RETIRED.ANY | Retired Streaming SIMD instructions (precise event). |
| SIMD_INST_RETIRED.PACKED_DOUBLE | Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions. |
| SIMD_INST_RETIRED.PACKED_SINGLE | Retired Streaming SIMD Extensions (SSE) packed-single instructions. |
| SIMD_INST_RETIRED.SCALAR_DOUBLE | Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions. |
| SIMD_INST_RETIRED.SCALAR_SINGLE | Retired Streaming SIMD Extensions (SSE) scalar-single instructions. |
| SIMD_INST_RETIRED.VECTOR | Retired Streaming SIMD Extensions 2 (SSE2) vector integer instructions. |
| SIMD_SAT_INSTR_RETIRED | Saturated arithmetic instructions retired. |
| SIMD_SAT_UOP_EXEC | SIMD saturated arithmetic micro-ops executed. |
| SIMD_UOPS_EXEC | SIMD micro-ops executed (excluding stores). |
| SIMD_UOP_TYPE_EXEC.ARITHMETIC | SIMD packed arithmetic micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.LOGICAL | SIMD packed logical micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.MUL | SIMD packed multiply micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.PACK | SIMD pack micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.SHIFT | SIMD packed shift micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.UNPACK | SIMD unpack micro-ops executed. |
When I collect some of these events on SciMark2 (http://math.nist.gov/scimark2/), the results show that all of the benchmarks use SIMD instructions. It is also important to note that SIMD_INST_RETIRED.ANY should equal the sum of its sub-events, written here as SIMD_INST_RETIRED.xyz.
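A quick way to sanity-check a collection is to add up the sub-event counts for a method and compare the total with SIMD_INST_RETIRED.ANY. The sketch below assumes the counts have already been copied out of the analyzer into a map; the class `SimdEventCheck`, its method names, and every number in `main` are made up for illustration and are not measured results. Because sampled counts are estimates, I compare with a relative gap rather than demanding exact equality, which is my own assumption about how to read the data.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; not part of VTune. Counts are entered by hand.
class SimdEventCheck {
    static final String PREFIX = "SIMD_INST_RETIRED.";
    static final String ANY = "SIMD_INST_RETIRED.ANY";

    // Sum every SIMD_INST_RETIRED.* sub-event except .ANY itself.
    static long sumSubEvents(Map<String, Long> counts) {
        long total = 0;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            String name = e.getKey();
            if (name.startsWith(PREFIX) && !name.equals(ANY)) {
                total += e.getValue();
            }
        }
        return total;
    }

    // Relative difference between .ANY and the sum of its sub-events.
    static double relativeGap(Map<String, Long> counts) {
        long any = counts.getOrDefault(ANY, 0L);
        long sum = sumSubEvents(counts);
        if (any == 0) {
            return sum == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return Math.abs(any - sum) / (double) any;
    }

    public static void main(String[] args) {
        // Made-up counts for one method, for illustration only.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("SIMD_INST_RETIRED.ANY", 1000L);
        counts.put("SIMD_INST_RETIRED.PACKED_DOUBLE", 100L);
        counts.put("SIMD_INST_RETIRED.PACKED_SINGLE", 200L);
        counts.put("SIMD_INST_RETIRED.SCALAR_DOUBLE", 300L);
        counts.put("SIMD_INST_RETIRED.SCALAR_SINGLE", 50L);
        counts.put("SIMD_INST_RETIRED.VECTOR", 350L);

        System.out.println("sum of sub-events = " + sumSubEvents(counts));
        System.out.println("relative gap vs ANY = " + relativeGap(counts));
    }
}
```

A large gap would suggest that some sub-event was not collected in the same run, or that the sampling intervals differ enough between events to make the comparison unreliable.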
This article applies to: Tools, Intel® VTune™ Performance Analyzer for Linux* Knowledge Base, Intel® VTune™ Performance Analyzer for Windows* Knowledge Base