Identifying JVM SIMD and SSE Usage with the VTune™ Performance Analyzer

by Levent Akyil

Taking advantage of the SIMD and SSE (Streaming SIMD Extensions) support available on the target processor is one of the key optimization techniques JVMs use (or should use). The question is how to identify which JIT-compiled methods actually use SSE. Many Java developers have asked this.

The VTune analyzer's event-based sampling can help users pinpoint exactly which methods are optimized to use SSE. The simplest approach is to sample the SIMD_INST_RETIRED.ANY event (Retired Streaming SIMD instructions (precise event)), which counts the overall number of SIMD instructions retired.
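To have something concrete to sample, a small floating-point workload along the following lines could serve as a profiling target. This is only a sketch: the class name `SimdCandidateKernels` and its methods are hypothetical and not part of any benchmark mentioned here. Whether a given JVM compiles these loops to scalar SSE, packed SSE, or neither depends on the JVM, its version, and its settings, and that is exactly what the sampled events are meant to reveal.

```java
// Hypothetical profiling target: two simple floating-point loops that a
// JIT compiler may (or may not) translate into SSE/SSE2 instructions.
final class SimdCandidateKernels {

    // Single-precision y = a * x + y over the common length of x and y.
    static void saxpy(float a, float[] x, float[] y) {
        int n = Math.min(x.length, y.length);
        for (int i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }
    }

    // Double-precision dot product over the common length of x and y.
    static double dot(double[] x, double[] y) {
        int n = Math.min(x.length, y.length);
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += x[i] * y[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1 << 16;
        float[] fx = new float[n];
        float[] fy = new float[n];
        double[] dx = new double[n];
        double[] dy = new double[n];
        for (int i = 0; i < n; i++) {
            fx[i] = 1.0e-6f * i;
            dx[i] = 1.0e-6 * i;
            dy[i] = 1.0;
        }

        // Repeat the kernels many times so the methods become hot enough
        // to be JIT-compiled and to accumulate samples in the profiler.
        double checksum = 0.0;
        for (int rep = 0; rep < 20_000; rep++) {
            saxpy(0.5f, fx, fy);
            checksum += dot(dx, dy);
        }

        // Print results so the work cannot be optimized away as dead code.
        System.out.println(fy[n - 1] + " " + checksum);
    }
}
```

Running such a program under event-based sampling should attribute the SIMD events to `saxpy` and `dot` if the JIT-compiled code for them uses SSE instructions.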

The following events break this one event down further. They are the SIMD-related events available on Core™ architecture based processors.

| Symbol Name (VTune™ Analyzer help) | Description |
| --- | --- |
| FP_MMX_TRANS.TO_FP | Transitions from MMX™ Instructions to Floating Point Instructions. |
| FP_MMX_TRANS.TO_MMX | Transitions from Floating Point to MMX™ Instructions. |
| SIMD_ASSIST | SIMD assists invoked. |
| SIMD_COMP_INST_RETIRED.PACKED_DOUBLE | Retired computational Streaming SIMD Extensions 2 (SSE2) packed-double instructions. |
| SIMD_COMP_INST_RETIRED.PACKED_SINGLE | Retired computational Streaming SIMD Extensions (SSE) packed-single instructions. |
| SIMD_COMP_INST_RETIRED.SCALAR_DOUBLE | Retired computational Streaming SIMD Extensions 2 (SSE2) scalar-double instructions. |
| SIMD_COMP_INST_RETIRED.SCALAR_SINGLE | Retired computational Streaming SIMD Extensions (SSE) scalar-single instructions. |
| SIMD_INSTR_RETIRED | SIMD Instructions retired. |
| SIMD_INST_RETIRED.ANY | Retired Streaming SIMD instructions (precise event). |
| SIMD_INST_RETIRED.PACKED_DOUBLE | Retired Streaming SIMD Extensions 2 (SSE2) packed-double instructions. |
| SIMD_INST_RETIRED.PACKED_SINGLE | Retired Streaming SIMD Extensions (SSE) packed-single instructions. |
| SIMD_INST_RETIRED.SCALAR_DOUBLE | Retired Streaming SIMD Extensions 2 (SSE2) scalar-double instructions. |
| SIMD_INST_RETIRED.SCALAR_SINGLE | Retired Streaming SIMD Extensions (SSE) scalar-single instructions. |
| SIMD_INST_RETIRED.VECTOR | Retired Streaming SIMD Extensions 2 (SSE2) vector integer instructions. |
| SIMD_SAT_INSTR_RETIRED | Saturated arithmetic instructions retired. |
| SIMD_SAT_UOP_EXEC | SIMD saturated arithmetic micro-ops executed. |
| SIMD_UOPS_EXEC | SIMD micro-ops executed (excluding stores). |
| SIMD_UOP_TYPE_EXEC.ARITHMETIC | SIMD packed arithmetic micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.LOGICAL | SIMD packed logical micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.MUL | SIMD packed multiply micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.PACK | SIMD pack micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.SHIFT | SIMD packed shift micro-ops executed. |
| SIMD_UOP_TYPE_EXEC.UNPACK | SIMD unpack micro-ops executed. |

When I collect some of these events on SciMark2 (http://math.nist.gov/scimark2/), the results show that all of the benchmarks do in fact use SIMD instructions. It is also important to note that SIMD_INST_RETIRED.ANY should equal the sum of all its sub-events, that is, the events named SIMD_INST_RETIRED.xyz.

[Figure: sse_usage.JPG]
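The relationship between SIMD_INST_RETIRED.ANY and its sub-events can be checked mechanically once the counts are exported from the analyzer. The following is a minimal sketch, assuming the counts have already been copied into a map keyed by event name; the class `SimdEventCheck`, its methods, and the numbers in `main` are hypothetical placeholders, not measured data. Because event-based sampling produces statistical estimates, the sketch compares against a tolerance rather than requiring exact equality.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: compares SIMD_INST_RETIRED.ANY with the sum of the
// SIMD_INST_RETIRED.* sub-events (PACKED_DOUBLE, PACKED_SINGLE, and so on).
final class SimdEventCheck {

    private static final String PREFIX = "SIMD_INST_RETIRED.";
    private static final String ANY = "SIMD_INST_RETIRED.ANY";

    // Adds up every SIMD_INST_RETIRED.* count except the .ANY total itself.
    static long sumSubEvents(Map<String, Long> counts) {
        long total = 0;
        for (Map.Entry<String, Long> e : counts.entrySet()) {
            String name = e.getKey();
            if (name.startsWith(PREFIX) && !name.equals(ANY)) {
                total += e.getValue();
            }
        }
        return total;
    }

    // Relative difference between .ANY and the sub-event sum.
    static double relativeGap(Map<String, Long> counts) {
        long any = counts.getOrDefault(ANY, 0L);
        long sum = sumSubEvents(counts);
        if (any == 0) {
            return sum == 0 ? 0.0 : Double.POSITIVE_INFINITY;
        }
        return Math.abs(any - sum) / (double) any;
    }

    // True when the gap is within the given tolerance (e.g. 0.05 for 5%).
    static boolean isConsistent(Map<String, Long> counts, double tolerance) {
        return relativeGap(counts) <= tolerance;
    }

    public static void main(String[] args) {
        // Placeholder numbers for illustration only, not real measurements.
        Map<String, Long> counts = new LinkedHashMap<>();
        counts.put("SIMD_INST_RETIRED.ANY", 1_000L);
        counts.put("SIMD_INST_RETIRED.PACKED_DOUBLE", 100L);
        counts.put("SIMD_INST_RETIRED.PACKED_SINGLE", 50L);
        counts.put("SIMD_INST_RETIRED.SCALAR_DOUBLE", 700L);
        counts.put("SIMD_INST_RETIRED.SCALAR_SINGLE", 100L);
        counts.put("SIMD_INST_RETIRED.VECTOR", 40L);

        System.out.println("sub-event sum = " + sumSubEvents(counts));
        System.out.println("consistent    = " + isConsistent(counts, 0.05));
    }
}
```

A large gap would suggest that some sub-event was not collected or that the sampling run was too short for the estimates to settle.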

 

