This webinar presents a practical case study of porting Tachyon, an open source ray tracer that is part of the SpecMPI suite, to the Intel® Xeon Phi™ coprocessor. The initial port revealed disappointing performance: the combined Intel® Xeon® processor and Intel Xeon Phi coprocessor version ran 2.6x slower than the Xeon-only version. Achieving good performance required code modifications that improved both the processor and coprocessor parts. Intel® Cluster Studio XE is used to pinpoint the problems, and the webinar highlights the key code changes that helped achieve significant improvements (up to 7x over the initial baseline, and a 1.8x speedup over the improved Xeon version). The application exploits parallelism at multiple levels: a symmetric MPI execution model, OpenMP-based multi-threading, and explicit SIMD (using SSE2/AVX/Xeon Phi instructions). Several software tools will be highlighted: Intel® Trace Analyzer and Collector and Intel® VTune™ Amplifier XE, used in combination with the MPI* and OpenMP* programming models, as well as a SIMD-enabled 3D vector operations library (reused and extended from Embree, the open source ray tracer by Intel Labs). Algorithmic changes include MPI-based dynamic scheduling, the introduction of explicit intrinsics-based SIMD support, and greater OpenMP parallelism capacity.
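To give a feel for why dynamic scheduling can matter when ranks run on hardware of different speeds, here is a minimal C++ sketch. It is not Tachyon's code and does not use MPI; it only models the completion time of a static even split against a "next idle worker takes the next tile" policy. The function names, the tile count, and the per-tile costs are all hypothetical, and the abstract does not state that load imbalance was the actual cause of the initial slowdown.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Completion time when tiles are divided evenly among workers up front.
// cost_per_tile[w] is a hypothetical time for worker w to render one tile.
double static_makespan(std::size_t tiles, const std::vector<double>& cost_per_tile) {
    const std::size_t workers = cost_per_tile.size();
    double worst = 0.0;
    for (std::size_t w = 0; w < workers; ++w) {
        // Any remainder is spread over the first (tiles % workers) workers.
        const std::size_t share = tiles / workers + (w < tiles % workers ? 1 : 0);
        worst = std::max(worst, static_cast<double>(share) * cost_per_tile[w]);
    }
    return worst;
}

// Completion time when each tile goes to whichever worker becomes idle first,
// which is the effect a dynamic (on-demand) scheduler aims for.
double dynamic_makespan(std::size_t tiles, const std::vector<double>& cost_per_tile) {
    std::vector<double> busy_until(cost_per_tile.size(), 0.0);
    for (std::size_t t = 0; t < tiles; ++t) {
        const std::size_t w = static_cast<std::size_t>(
            std::min_element(busy_until.begin(), busy_until.end()) - busy_until.begin());
        busy_until[w] += cost_per_tile[w];
    }
    return *std::max_element(busy_until.begin(), busy_until.end());
}
```

With made-up costs of 1.0 and 3.0 time units per tile and 120 tiles, `static_makespan` returns 180 (the slow worker holds everyone up), while `dynamic_makespan` returns 90. In a real symmetric MPI run, the hand-out of tiles would be done with messages between ranks rather than a shared loop, and communication cost would reduce the gain.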
Benchmark results were obtained prior to the implementation of recent software patches and firmware updates intended to address exploits referred to as "Spectre" and "Meltdown". Implementation of these updates may make these results inapplicable to your device or system.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more information, see Performance Benchmark Test Disclosure.