Getting Started with Intel VTune
The goal of this article is to introduce Intel® VTune™ Amplifier XE 2013 and walk through a basic example of how it works. The example is based on Java code that has a serial version and a parallel version, and serves as a practical case of performance tuning.
No advanced Java programming techniques are used here to implement the parallelism, and the reports generated by VTune are not analyzed in depth.
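As an illustration of the kind of workload described above, here is a minimal sketch of a Java program with a serial and a parallel version of the same computation. It is not the article's actual code: the class name PrimeCountSketch, the prime-counting task, and the chunking scheme are all assumptions chosen only because they are CPU-bound and easy to verify. The parallel version uses nothing more than plain Thread objects.

```java
final class PrimeCountSketch {
    // Trial division: deliberately CPU-heavy so a profiler has something to measure.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int d = 2; (long) d * d <= n; d++) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Counts the primes in the half-open range [from, to).
    static int countRange(int from, int to) {
        int count = 0;
        for (int n = from; n < to; n++) {
            if (isPrime(n)) count++;
        }
        return count;
    }

    // Serial version: one thread walks the whole range.
    static int countSerial(int limit) {
        return countRange(0, limit);
    }

    // Parallel version (hypothetical scheme): equal-sized chunks, one plain Thread per chunk.
    static int countParallel(int limit, int threadCount) {
        int[] partial = new int[threadCount];
        Thread[] workers = new Thread[threadCount];
        int chunk = (limit + threadCount - 1) / threadCount;
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            final int from = Math.min(limit, id * chunk);
            final int to = Math.min(limit, from + chunk);
            workers[t] = new Thread(() -> partial[id] = countRange(from, to));
            workers[t].start();
        }
        int total = 0;
        try {
            for (int t = 0; t < threadCount; t++) {
                workers[t].join();   // join() makes the worker's write to partial[t] visible
                total += partial[t];
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return total;
    }

    public static void main(String[] args) {
        int limit = 2_000_000;
        int threads = Runtime.getRuntime().availableProcessors();
        long t0 = System.nanoTime();
        int serial = countSerial(limit);
        long t1 = System.nanoTime();
        int parallel = countParallel(limit, threads);
        long t2 = System.nanoTime();
        System.out.printf("serial:   %d primes in %d ms%n", serial, (t1 - t0) / 1_000_000);
        System.out.printf("parallel: %d primes in %d ms (%d threads)%n",
                parallel, (t2 - t1) / 1_000_000, threads);
    }
}
```

Both versions must return the same count; only the elapsed time should differ. The equal-sized chunks are not equal in cost, because larger numbers take longer to test, and that kind of load imbalance is something a profiler can help expose.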
Ambient Occlusion is an algorithm that approximates the reflection of light off non-reflective surfaces. Since calculating true light reflection is incredibly expensive and impractical given today's hardware, algorithms like ambient occlusion are used to get convincingly close. Ambient occlusion casts a ray from the origin through each pixel on the screen and looks for intersections between that ray and the objects in the scene. If there is an intersection (a "hit"), it searches for intersections again, this time using the "hit" as the origin. The more intersections this second search finds, the darker the pixel is shaded; the fewer it finds, the lighter. This imitates shadows, which is the goal of ambient occlusion.
Intel® Cilk™ Plus cilk_for is used to render multiple horizontal lines in parallel, while Intel Cilk Plus Array Notation is used to speed up the search for intersections with the "hit" as an origin. In the scalar implementation, the auto-vectorizer does a somewhat poor job of vectorizing the ambient occlusion calculation (the search for intersections with the "hit" as an origin), which can be fixed by adding a single Intel Cilk Plus SIMD Notation line.
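The two-stage search described above can be sketched as follows. This is a minimal illustration in plain Java, not the Intel Cilk Plus sample itself: a parallel stream over rows stands in for cilk_for, and nothing here reproduces the Array Notation or SIMD vectorization. The class name AmbientOcclusionSketch, the sphere-only scene, the random hemisphere sampling, and the sample counts are all assumptions made for the sketch.

```java
import java.util.Random;
import java.util.stream.IntStream;

final class AmbientOcclusionSketch {
    static final double EPS = 1e-4;

    static double dot(double[] a, double[] b) {
        return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    }

    // Spheres are {cx, cy, cz, radius}. Returns {distance, sphereIndex} for the
    // nearest hit in front of the ray origin, or null on a miss.
    static double[] nearestHit(double[] origin, double[] dir, double[][] spheres) {
        double[] best = null;
        for (int i = 0; i < spheres.length; i++) {
            double[] s = spheres[i];
            double[] oc = {origin[0] - s[0], origin[1] - s[1], origin[2] - s[2]};
            double b = dot(oc, dir);
            double disc = b * b - (dot(oc, oc) - s[3] * s[3]);
            if (disc < 0) continue;
            double t = -b - Math.sqrt(disc);   // near root only: origin assumed outside the sphere
            if (t > EPS && (best == null || t < best[0])) best = new double[] {t, i};
        }
        return best;
    }

    // Brightness for one primary ray from the camera at (0,0,0): 0 for background,
    // otherwise 1 minus the fraction of hemisphere rays from the hit that are blocked.
    static double shade(double[] dir, double[][] spheres, int samples, long seed) {
        double[] hit = nearestHit(new double[] {0, 0, 0}, dir, spheres);
        if (hit == null) return 0.0;
        double[] s = spheres[(int) hit[1]];
        double[] p = {dir[0] * hit[0], dir[1] * hit[0], dir[2] * hit[0]};
        double[] n = {(p[0] - s[0]) / s[3], (p[1] - s[1]) / s[3], (p[2] - s[2]) / s[3]};
        // Nudge the new origin off the surface so it does not re-hit its own sphere.
        double[] start = {p[0] + EPS * n[0], p[1] + EPS * n[1], p[2] + EPS * n[2]};
        Random rng = new Random(seed);
        int occluded = 0;
        for (int k = 0; k < samples; k++) {
            double[] d = {rng.nextGaussian(), rng.nextGaussian(), rng.nextGaussian()};
            double len = Math.sqrt(dot(d, d));
            double sign = dot(d, n) < 0 ? -1.0 : 1.0;   // flip into the hemisphere around n
            for (int c = 0; c < 3; c++) d[c] *= sign / len;
            if (nearestHit(start, d, spheres) != null) occluded++;
        }
        return 1.0 - (double) occluded / samples;
    }

    // Rows are independent, so they can be rendered in parallel (the role cilk_for plays).
    static double[][] render(int width, int height, double[][] spheres, int samples) {
        double[][] image = new double[height][width];
        IntStream.range(0, height).parallel().forEach(y -> {
            for (int x = 0; x < width; x++) {
                double[] dir = {(x + 0.5) / width * 2 - 1, 1 - (y + 0.5) / height * 2, -1};
                double len = Math.sqrt(dot(dir, dir));
                for (int c = 0; c < 3; c++) dir[c] /= len;
                image[y][x] = shade(dir, spheres, samples, (long) y * width + x);
            }
        });
        return image;
    }

    public static void main(String[] args) {
        double[][] spheres = {{0, 0, -3, 1}, {1.5, 0, -2, 1}};   // hypothetical scene
        String ramp = " .:-=+*#";
        for (double[] row : render(48, 24, spheres, 64)) {
            StringBuilder line = new StringBuilder();
            for (double v : row) line.append(ramp.charAt((int) Math.round(v * (ramp.length() - 1))));
            System.out.println(line);
        }
    }
}
```

Each row writes only to its own slice of the image, which is why the row loop can run in parallel without locking. The inner loop over sample rays is the part the text describes as the target for vectorization.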
The following samples demonstrate Intel® Cilk™ Plus implementations of popular classic algorithms and their performance benefits. Select a sample name for more detailed information.
Monte Carlo algorithms solve deterministic problems by using a probabilistic analogue. The algorithm requires repeated simulations of chance that lend themselves well to parallel processing and vectorization. The simulations in this example are run four ways: serially, with Intel® Cilk™ Plus Array Notation (AN) for vectorization, with Intel Cilk Plus cilk_for for parallelization, and with both vectorization and cilk_for. In this example, the Monte Carlo algorithm is used to estimate the valuation of a European swaption, which is fundamentally calculated from the difference between the strike price and the future estimated value, or forward swap rate. The Monte Carlo algorithm estimates the valuation by applying the initial conditions to a normal distribution over many simulations to calculate a normal valuation.
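The serial and parallel variants described above can be sketched as follows. This is a deliberately simplified illustration in plain Java, not the Intel Cilk Plus sample: a parallel stream stands in for cilk_for, and there is no counterpart to the Array Notation version. The payoff is assumed to be max(simulated rate - strike, 0) with a normally distributed forward swap rate, and the class name SwaptionMonteCarloSketch and all constants are hypothetical.

```java
import java.util.Random;
import java.util.stream.IntStream;

final class SwaptionMonteCarloSketch {
    // Hypothetical inputs, chosen only for illustration.
    static final double FORWARD_RATE = 0.05;
    static final double STRIKE = 0.045;
    static final double VOLATILITY = 0.01;
    static final int PATHS_PER_BLOCK = 10_000;

    // Simulates one block of paths with its own seeded generator and returns
    // the sum of payoffs. Per-block seeding keeps blocks independent of the
    // order in which they run, so serial and parallel runs see the same draws.
    static double simulateBlock(int block) {
        Random rng = new Random(42L + block);
        double sum = 0.0;
        for (int i = 0; i < PATHS_PER_BLOCK; i++) {
            double simulatedRate = FORWARD_RATE + VOLATILITY * rng.nextGaussian();
            sum += Math.max(simulatedRate - STRIKE, 0.0);
        }
        return sum;
    }

    // Serial version: blocks are simulated one after another.
    static double estimateSerial(int blocks) {
        double total = 0.0;
        for (int b = 0; b < blocks; b++) {
            total += simulateBlock(b);
        }
        return total / ((double) blocks * PATHS_PER_BLOCK);
    }

    // Parallel version: the same blocks, distributed across worker threads.
    static double estimateParallel(int blocks) {
        double total = IntStream.range(0, blocks)
                .parallel()
                .mapToDouble(SwaptionMonteCarloSketch::simulateBlock)
                .sum();
        return total / ((double) blocks * PATHS_PER_BLOCK);
    }

    public static void main(String[] args) {
        int blocks = 256;
        long t0 = System.nanoTime();
        double serial = estimateSerial(blocks);
        long t1 = System.nanoTime();
        double parallel = estimateParallel(blocks);
        long t2 = System.nanoTime();
        System.out.printf("serial:   %.6f in %d ms%n", serial, (t1 - t0) / 1_000_000);
        System.out.printf("parallel: %.6f in %d ms%n", parallel, (t2 - t1) / 1_000_000);
    }
}
```

Because every block owns its generator, the two versions agree up to floating-point summation order. Sharing a single generator across threads would instead make the result depend on scheduling.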