Threading

How to set a 'rec' structuring element in ippiMorphCloseBorder_8u_C1R

Hello,

I am using IPP 9.0 (Intel 64) to try to reduce the processing time of a morphological closing operation on a 2048x2048 image. I have used perfsys and see a significant reduction in processing time when Parm5 is 'rec'. I assume this denotes a rectangular structuring element, which is what I'm using.

function | Parm1 | Parm2 | Parm3 | Parm4 | Parm5 | Parm6 | Parm7 | Parm8 | Comment | Clocks | per | Time (usec)
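For readers unsure what a rectangular structuring element does, here is a minimal plain-C sketch of a closing (dilation followed by erosion) over a rectangular window. It is not IPP code and says nothing about how ippiMorphCloseBorder_8u_C1R is implemented or what perfsys means by 'rec'. The names `close_rect` and `rect_filter`, the replicated-border handling, and the odd-sized mask requirement are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Clamp an index into [0, n-1] so that border pixels are replicated. */
static int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

/* Max filter (dilate != 0) or min filter (dilate == 0) over a
   mask_w x mask_h rectangle centred on each pixel. */
static void rect_filter(const unsigned char *src, unsigned char *dst,
                        int width, int height,
                        int mask_w, int mask_h, int dilate)
{
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            unsigned char best = dilate ? 0 : 255;
            for (int dy = -(mask_h / 2); dy <= mask_h / 2; ++dy) {
                for (int dx = -(mask_w / 2); dx <= mask_w / 2; ++dx) {
                    int yy = clamp_index(y + dy, height);
                    int xx = clamp_index(x + dx, width);
                    unsigned char v = src[yy * width + xx];
                    if (dilate ? (v > best) : (v < best))
                        best = v;
                }
            }
            dst[y * width + x] = best;
        }
    }
}

/* Hypothetical helper: closing with a rectangular structuring element.
   Returns 0 on success, -1 if the temporary buffer cannot be allocated. */
int close_rect(const unsigned char *src, unsigned char *dst,
               int width, int height, int mask_w, int mask_h)
{
    /* Assumption: odd mask sizes, so the window is symmetric. */
    assert(mask_w % 2 == 1 && mask_h % 2 == 1);

    unsigned char *tmp = malloc((size_t)width * (size_t)height);
    if (tmp == NULL)
        return -1;

    rect_filter(src, tmp, width, height, mask_w, mask_h, 1); /* dilate */
    rect_filter(tmp, dst, width, height, mask_w, mask_h, 0); /* erode  */

    free(tmp);
    return 0;
}
```

Because a rectangle is separable into a row pass and a column pass, a library can plausibly process it much faster than an arbitrary mask, which would be consistent with the timing difference described above. That explanation is a guess, not something the post confirms.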

Accounting for CYCLE_ACTIVITY.CYCLES_NO_EXECUTE

Hi all,

I am using VTune's bandwidth profile to measure the fraction of time my software spends waiting on cache accesses on my HSW i7 processor. CYCLE_ACTIVITY.CYCLES_NO_EXECUTE gives this time. To break it down into the fraction of time waiting on L1, L2, and L3+Mem, I am trying to use CYCLE_ACTIVITY.STALLS_L1D_PENDING, ...STALLS_L2_PENDING, and STALLS_LDM_PENDING. However, the sum of these three counts is always greater than the CYCLES_NO_EXECUTE count.
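One possible explanation, offered as an assumption rather than a confirmed answer, is that the three events are nested instead of disjoint. As I understand it, STALLS_LDM_PENDING covers stalls with any load outstanding, STALLS_L1D_PENDING the subset with an L1D miss outstanding, and STALLS_L2_PENDING the subset with an L2 miss outstanding. Under that assumption, adding them counts the same cycles more than once, and a per-level split comes from differences. The sketch below shows only the arithmetic; the struct, the function name `split_stalls`, and the choice of denominator are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Fractions of total cycles attributed to each level (hypothetical type). */
typedef struct {
    double l1_bound;          /* load pending, but no L1D miss outstanding */
    double l2_bound;          /* L1D miss outstanding, but no L2 miss      */
    double l3_or_mem_bound;   /* L2 miss outstanding                       */
} stall_breakdown;

/* Subtract without wrapping: measured counts may not nest perfectly. */
static uint64_t sub_clamped(uint64_t a, uint64_t b)
{
    return a > b ? a - b : 0;
}

/* Assumes ldm_pending >= l1d_pending >= l2_pending (nested events). */
stall_breakdown split_stalls(uint64_t ldm_pending, uint64_t l1d_pending,
                             uint64_t l2_pending, uint64_t total_cycles)
{
    assert(total_cycles > 0);

    stall_breakdown b;
    b.l1_bound = (double)sub_clamped(ldm_pending, l1d_pending) / (double)total_cycles;
    b.l2_bound = (double)sub_clamped(l1d_pending, l2_pending) / (double)total_cycles;
    b.l3_or_mem_bound = (double)l2_pending / (double)total_cycles;
    return b;
}
```

With made-up counts of 600, 400, and 100 stall cycles out of 1000 total, the naive sum is 1100 cycles, while the differences attribute 200, 300, and 100 cycles to the three levels.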

iconv issue

Hi all,

I'm trying to build something for the Phi that depends on iconv. The library routines are present, but the following application fails when run on the Phi:

#include <stdlib.h>
#include <iconv.h>

int main () {
  iconv_t cd;
  cd = iconv_open("latin1","UTF-8");
  if(cd == (iconv_t)(-1)) exit(1);
  iconv_close(cd);

  exit(0);
}

If I build this using "icc -o iconv_test iconv_test.c" and run it on the host, it returns no error (exit code 0).
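To narrow down a failure like this, it can help to report why iconv_open failed instead of only exiting with status 1. The sketch below wraps the same call and returns the errno value. The function name `probe_iconv` is hypothetical, and nothing here is specific to the Phi.

```c
#include <assert.h>
#include <errno.h>
#include <iconv.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 0 if a conversion descriptor can be
   opened for the given charset pair, otherwise the errno value set
   by iconv_open (EINVAL when the conversion is not supported). */
int probe_iconv(const char *to, const char *from)
{
    assert(to != NULL && from != NULL);

    errno = 0;
    iconv_t cd = iconv_open(to, from);
    if (cd == (iconv_t)(-1)) {
        int err = errno;  /* save before any other library call */
        fprintf(stderr, "iconv_open(\"%s\", \"%s\") failed: %s\n",
                to, from, strerror(err));
        return err;
    }

    iconv_close(cd);
    return 0;
}
```

Calling `probe_iconv("latin1", "UTF-8")` and then `probe_iconv("ISO-8859-1", "UTF-8")` from `main` on both host and target would show whether the failure is tied to the alias or to the conversion itself.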

Using MPI parMETIS with cluster_sparse_solver

Hello.

I am optimizing the `cluster_sparse_solver` runtime. In my case, the majority of the runtime is taken by phase `11`, symbolic factorization, with METIS. Additionally, only a single node is used in an otherwise `MPI`-enabled application.

I was wondering if there is a way to use `parMETIS` for fill-reducing ordering, in order to benefit from the cluster environment. One thing that would help tremendously is the source code for `cluster_sparse_solver`.

The version of MKL in question is mkl 11.2u3, which was bundled with composer_xe 2015 3.187.

Thanks!

Getting Started with Intel® Threading Building Blocks (Intel® TBB)

Intel® Threading Building Blocks (Intel® TBB) is a runtime-based parallel programming model for C++ code that uses threads. It consists of a template-based runtime library to help you harness the latent performance of multicore processors. Use Intel TBB to write scalable applications that:

  • Specify logical parallel structure instead of threads
  • Emphasize data parallel programming
  • Take advantage of concurrent collections and parallel algorithms

Intel TBB is available as a standalone product as well as part of the following products:
