OpenMP*

Basic OMP Parallelized Program Not Scaling As Expected

#include <iostream>
#include <vector>
#include <stdexcept>
#include <sstream>
#include <omp.h>

std::vector<int> col_sums(const std::vector<std::vector<short>>& data) {
    std::size_t height = data.size(), width = data[0].size();
    std::vector<int> totalSums(width, 0);

    #pragma omp parallel
    {
        std::vector<int> threadSums(width, 0);  // per-thread partial sums

        // Use "omp for", not "omp parallel for": the latter would nest a
        // second parallel region inside the one already open above.
        #pragma omp for
        for (std::size_t i = 0; i < height; i++)
            for (std::size_t j = 0; j < width; j++)
                threadSums[j] += data[i][j];

        // Merge each thread's partial sums one thread at a time.
        #pragma omp critical
        for (std::size_t j = 0; j < width; j++)
            totalSums[j] += threadSums[j];
    }
    return totalSums;
}
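The per-thread buffer plus critical-section merge can also be written with OpenMP's array-section reduction, available since OpenMP 4.5. A minimal sketch (the function name is illustrative; if the compiler is not OpenMP-enabled the pragma is ignored and the loop runs serially with the same result):

```cpp
#include <cstddef>
#include <vector>

// Column sums using an OpenMP 4.5 array-section reduction: the runtime
// gives each thread a private copy of the section p[0:width] and combines
// the copies with + at the end of the loop.
std::vector<int> col_sums_reduction(const std::vector<std::vector<short>>& data) {
    std::size_t height = data.size(), width = data.empty() ? 0 : data[0].size();
    std::vector<int> sums(width, 0);
    int* p = sums.data();  // reductions over array sections need a pointer or array

    #pragma omp parallel for reduction(+ : p[0:width])
    for (std::size_t i = 0; i < height; i++)
        for (std::size_t j = 0; j < width; j++)
            p[j] += data[i][j];

    return sums;
}
```

This keeps the reduction bookkeeping in the runtime instead of hand-rolled critical sections, at the cost of requiring a compiler with OpenMP 4.5 support.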

Problems with Intel MPI

I have trouble running Intel MPI on a cluster whose nodes have different numbers of processors (12 and 32).

I use Intel MPI 4.0.3. It works correctly on 20 nodes with 12 processors each (Intel Xeon CPU X5650 @ 2.67 GHz), with all processors working. When I run it on the other 3 nodes with 32 processors each (Intel Xeon CPU E5-4620 v2 @ 2.00 GHz), they also work correctly.
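When rank counts differ per node, Intel MPI's machinefile format lets you state the count for each host explicitly. A sketch of such a launch (host names and the application binary below are placeholders, not from the original post):

```shell
# machinefile: one line per host, in the form "hostname:ranks-on-that-host"
cat > machines.txt <<'EOF'
node01:12
node21:32
EOF

# Launch 12 + 32 = 44 ranks total across the two node types
mpirun -machinefile machines.txt -n 44 ./my_app
```

Pinning ranks this way avoids relying on the default round-robin placement, which assumes homogeneous nodes.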

Parallel Programming and Optimization with Intel® Xeon Phi™ Coprocessors Developer Training Event

The 1-day seminar (CDT 101) features presentations on the available programming models and best optimization practices for the Intel Xeon Phi coprocessor, and on the usage of the Intel software development and diagnostic tools. CDT 101 is a prerequisite for hands-on labs, CDT 102.
