‹ Back to Video Series: Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors

Episode 4.9 Distributed-memory Parallelism and MPI


In the previous episodes of this chapter, we learned how to use vectorization to parallelize calculations across the vector lanes of each core. We then saw how to use OpenMP* to scale applications across the cores of a processor or coprocessor. Now, in this final episode of the chapter, we study the next level of parallelism: scaling across multiple compute devices and multiple compute nodes in a cluster environment, using the Message Passing Interface (MPI).