OpenMP*

OpenMP Shared Arrays

I have two questions about WRITE/READ operations on shared arrays.
 1) In my program, every iteration of an OpenMP-parallelized DO loop writes a different element of a given shared array. The results I get look correct, but I'm wondering whether this is fine or whether I should enclose the WRITE in a CRITICAL block.
 2) I also READ elements from a shared array without modifying them, and that seems to work.
Are these two procedures correct?
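For reference, here is a minimal sketch of both patterns, shown in C++ for concreteness even though the question concerns a Fortran DO loop (array names and sizes here are made up). Because each iteration writes a distinct element, and the read-only array is never modified inside the parallel region, neither access needs a CRITICAL block:

#include <cstdio>
#include <vector>
#include <omp.h>

int main() {
    const int n = 1000;
    std::vector<double> shared_out(n);      // shared array, written
    std::vector<double> shared_in(n, 2.0);  // shared array, read only

    // Each iteration writes a *different* element of shared_out, so no two
    // threads ever touch the same location: no CRITICAL section is needed.
    // Concurrent reads of shared_in are likewise race-free because no thread
    // modifies it inside the parallel region.
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        shared_out[i] = 3.0 * shared_in[i];
    }

    std::printf("shared_out[0] = %f\n", shared_out[0]);
    return 0;
}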

[Bug] Compilation fails on OS X Yosemite 10.10

# ProductName:    Mac OS X
# ProductVersion:    10.10.3
# BuildVersion:    14D136

curl -O https://www.openmprtl.org/sites/default/files/libomp_20150401_oss.tgz
gunzip -c libomp_20150401_oss.tgz | tar xopf -
cd libomp_oss

In lines 124..126 of libomp_oss/src/makefile.mk:
...
ifeq "$(os)" "mac"
    mac_os_new := $(shell /bin/sh -c 'if [[ "`sw_vers -productVersion`" > "10.6" ]]; then echo "1"; else echo "0"; fi')
endif
...
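The likely culprit is that the shell's [[ ... > ... ]] operator compares strings lexicographically: "10.10" sorts before "10.6" because the comparison stops at '1' < '6', so mac_os_new presumably ends up "0" on Yosemite even though 10.10 is newer than 10.6. A small C++ demonstration of the same lexicographic comparison:

#include <iostream>
#include <string>

int main() {
    // bash's [[ "a" > "b" ]] compares strings character by character, just
    // like std::string's operator>. At the third character '1' < '6', so
    // "10.10" sorts *before* "10.6" despite being the newer version.
    std::string yosemite = "10.10", snow_leopard = "10.6";
    std::cout << std::boolalpha
              << (yosemite > snow_leopard) << "\n";  // prints: false
    return 0;
}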

Inconsistent Speedup

Hi,

I'm new to OpenMP. I would like to ask about the speedup ratio.

I'm running C source code with OpenMP on an Intel Core i5-2410M.

Based on my understanding, speedup = (execution time using one thread) / (execution time using N threads).

The execution time recorded is time_diff in the attached code.
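Since the attached code isn't shown here, the following is only an assumed sketch of how time_diff might be measured, using omp_get_wtime() around the parallel region (the loop body is a placeholder):

#include <cstdio>
#include <omp.h>

int main() {
    const long N = 100000000L;
    double sum = 0.0;

    // Time the parallel region; running the same binary with
    // OMP_NUM_THREADS=1 and OMP_NUM_THREADS=N gives the two times whose
    // ratio is the speedup.
    double t_start = omp_get_wtime();
    #pragma omp parallel for reduction(+ : sum)
    for (long i = 0; i < N; i++) {
        sum += 1.0 / (i + 1.0);
    }
    double time_diff = omp_get_wtime() - t_start;

    std::printf("threads=%d  sum=%f  time_diff=%f s\n",
                omp_get_max_threads(), sum, time_diff);
    return 0;
}

Note that the i5-2410M has 2 physical cores (4 hardware threads via Hyper-Threading), so the ideal speedup is about 2x; with 3 or 4 threads the extra gain is typically small and some run-to-run variation is normal.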

Basic OMP Parallelized Program Not Scaling As Expected

#include <iostream>
#include <vector>
#include <stdexcept>
#include <sstream>
#include <omp.h>

std::vector<int> col_sums(std::vector<std::vector<short>>& data) {
    unsigned int height = data.size(), width = data[0].size();
    std::vector<int> totalSums(width, 0), threadSums(width, 0);

    #pragma omp parallel firstprivate(threadSums)
    {
        // Work-share the row loop across the existing team; a nested
        // "#pragma omp parallel for" here would spawn a new team per thread.
        #pragma omp for
        for (unsigned int i = 0; i < height; i++) {
            // Intel Cilk Plus array notation (Intel compiler): element-wise +=
            threadSums.data()[0:width] += data[i].data()[0:width];
        }

        // Merge each thread's private partial sums into the shared result.
        #pragma omp critical
        totalSums.data()[0:width] += threadSums.data()[0:width];
    }

    return totalSums;
}
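A hypothetical driver (sizes and values assumed) to exercise and time col_sums:

int main() {
    // 4096 rows x 512 columns, all ones, so every column sum should be 4096.
    std::vector<std::vector<short>> data(4096, std::vector<short>(512, 1));

    double t0 = omp_get_wtime();
    std::vector<int> sums = col_sums(data);
    double elapsed = omp_get_wtime() - t0;

    std::cout << "sums[0] = " << sums[0]
              << ", elapsed = " << elapsed << " s\n";
    return 0;
}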

Problems with Intel MPI

I have trouble running Intel MPI on a cluster whose nodes have different numbers of processors (12 and 32).

I use Intel MPI 4.0.3. It works correctly on 20 nodes with 12 processors each (Intel Xeon CPU X5650 @ 2.67 GHz), and when I then run it on 3 other nodes with 32 processors each (Intel Xeon CPU E5-4620 v2 @ 2.00 GHz) they work correctly too.
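For reference, with Intel MPI's Hydra process manager a heterogeneous job can list a per-node rank count in a hostfile passed via -f; the node names below are hypothetical:

# hosts.txt - ranks per node follow the colon
node01:12
node02:12
node21:32

mpirun -f hosts.txt -n 56 ./my_app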
