Parallel Computing

compile assembly code for Xeon Phi

Hi Guys, 

I am using a Xeon Phi in offload mode. I have written code with offload pragmas (main.cpp, micSolver.cpp). First, I generate the assembly with icc -S micSolver.cpp, which emits two files: (1) micSolver.s and (2) micSolverMIC.s. As expected, micSolverMIC.s contains the code to be run on the Phi.

My question is: how can I compile the assembly further into binary code? 'icc -c micSolver.s' works without problems, while 'icc -c micSolverMIC.s' gives the following errors. Does anyone have an idea how to compile this assembly code?

speed

I want to speed up my programs with the Intel C++ compiler. For some programs it helps a lot, but for others the result is not faster than the Visual Studio C++ build. I want to know why. Could someone give me some ideas or advice?

Instruct VTune to launch app in bash shell

I am having trouble launching my application from VTune. We have a number of bash shell scripts that need to run first to set up the environment, so I'd like VTune to launch my application in a new bash shell. If I run ps from the launched application, I see that VTune is launching /bin/sh. Unfortunately, on our system that is linked to dash (https://wiki.ubuntu.com/DashAsBinSh). Is there a way to instruct VTune to use a new bash shell instead (perhaps via an environment variable)?

 

Ryan
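
One possible workaround, independent of any VTune setting, is to make the launch target an explicit bash invocation that sources the setup scripts and then execs the real application. The sketch below only builds such a command line. The helper name `bash_launch_argv`, the script paths, and the application name are hypothetical, and whether the profiler accepts this as its target is an assumption to verify on your own setup.

```python
import shlex


def bash_launch_argv(setup_scripts, app_argv, bash="/bin/bash"):
    """Build an argv that sources setup_scripts in bash, then runs app_argv.

    All names and paths passed in are illustrative; nothing here is a
    VTune API. The result is an ordinary command line.
    """
    # Source each environment script in order; '&&' stops at the first failure.
    steps = ["source " + shlex.quote(script) for script in setup_scripts]
    # 'exec' replaces the bash process with the application, so the
    # application is not left running as a child of an extra shell.
    steps.append("exec " + " ".join(shlex.quote(arg) for arg in app_argv))
    return [bash, "-c", " && ".join(steps)]


if __name__ == "__main__":
    # Hypothetical script and application names, for illustration only.
    argv = bash_launch_argv(
        ["env/setup_a.sh", "env/setup_b.sh"],
        ["./my_app", "--input", "case 1.dat"],
    )
    print(shlex.join(argv))
```

Because bash is named explicitly, the command does not depend on what /bin/sh points to on the system.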

Activation

How do I activate an existing installation that was previously an evaluation? We bought a license, but I received it as text in an email body and don't know what to name the .lic file.

Fortran Memory not deallocated - Inspector indicates subroutine declaration

I am having trouble understanding this error. Using Inspector, I cleaned up a number of "Memory not deallocated" errors where the allocatable variables were in a module that persists through the life of the application.

However, one "Memory not deallocated" error remains, and I figure that it must be a false positive.

PARDISO Scalability

Hi,

I have a question about the scalability of the PARDISO solver. I'm using Intel MKL PARDISO with my finite element code. I tested a simple linear elastic problem with around 800 000 unknowns. The timing results are:

 

threads | pardiso phase 22 | pardiso phase 33 | sum pardiso | speedup pardiso phase 22 | speedup pardiso phase 33 | speedup sum pardiso
1 | 1433.922 | 18.8455 | 1452.7675 | 1 | 1 | 1
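
The speedup columns above are presumably each phase's single-thread time divided by its time at a given thread count. A minimal sketch of that calculation follows, assuming this definition. The function name `summarize` and the extra `efficiency_sum` and `share22` fields are illustrative additions, and only the 1-thread row from the post is used as input.

```python
def summarize(timings):
    """Compute per-phase and total speedup relative to the 1-thread run.

    timings maps thread count -> (phase22_time, phase33_time) and must
    contain an entry for 1 thread, which serves as the baseline.
    """
    base22, base33 = timings[1]
    base_sum = base22 + base33
    rows = []
    for threads in sorted(timings):
        t22, t33 = timings[threads]
        total = t22 + t33
        rows.append({
            "threads": threads,
            "sum": total,
            "speedup22": base22 / t22,
            "speedup33": base33 / t33,
            "speedup_sum": base_sum / total,
            # Parallel efficiency: speedup divided by the thread count.
            "efficiency_sum": base_sum / total / threads,
            # Fraction of the total time spent in phase 22.
            "share22": t22 / total,
        })
    return rows


if __name__ == "__main__":
    # Only the 1-thread row is available in the post; further thread
    # counts would be added here as they are measured.
    measured = {1: (1433.922, 18.8455)}
    for row in summarize(measured):
        print(row)
```

In the 1-thread row, phase 22 accounts for almost all of the summed time, so the total speedup will mostly follow how well phase 22 scales.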

offload error: buffer set state failed

Hi,

When I run my code, I get the error "offload error: buffer set state failed".

What does this error mean, and when can it occur? Please let me know if you have any idea about this.

 

Note: When I check the memory used by the application, it is consuming around 5.1 GB. I have 8 GB on my MIC.

Thanks

sivaramakrishna

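Trusted Tools for the New Android* World: Optimization Techniques, from Intel® SSE Intrinsics to Intel® Cilk™ Plus

By Zvi Danovich, Senior Software Application Engineer, Intel

Introduction

Most Android applications, even those based only on scripting and managed languages (Java*, HTML5, ...), ultimately use middleware functionality, because that functionality can take advantage of optimization.

This article describes the optimization needs and methods on Android and details a case study of optimizing a multimedia and augmented reality application.

Intel offers a variety of Intel® Atom™ processors for the Android platform (smartphones and tablets). They provide vector capabilities of at least the Intel® Supplemental Streaming SIMD Extensions 3 (Intel® SSSE3) level, and usually include two cores with Hyper-Threading.

Understand and use these optimization capabilities!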

  • Developers
  • Android*
  • Intel® Cilk™ Plus
  • Intel® Streaming SIMD Extensions
  • Graphics
  • Optimization
  • Parallel Computing