Optimization

Benefits of SSE/AVX processing when an integrated GPU is missing?

Some Intel processors have an on-chip GPU (e.g. the Intel Core i7-4770K with an HD Graphics 4600 GPU), whilst others do not (e.g. the Intel Core i7-3930K). I'm wondering what implications this has for SSE/AVX SIMD processing when such an integrated GPU is missing from the CPU. SSE/AVX is supported on many processors that have no embedded GPU, but does the missing GPU significantly reduce the benefit of using SSE/AVX compared to CPUs that have one?

Computing Double/Float

Hello,

 

I'm doing some financial computations (Monte Carlo, a massively parallel algorithm) in a benchmark case, and I wanted to analyze the potential difference in computation time between single and double precision. My problem is that I don't observe any difference at all between float and double. So my question is: is there a real difference, or am I just doing something wrong?
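One way to make the comparison concrete is to run the same kernel in both precisions and time each run. The sketch below is only an illustration and not the poster's code: the function names `mc_call_price` and `time_pricing`, the European call payoff, and all parameter values are assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <random>

// Monte Carlo price of a European call under Black-Scholes dynamics,
// with all arithmetic done in the chosen floating-point type Real.
// The model and every parameter value here are illustrative assumptions.
template <typename Real>
Real mc_call_price(std::size_t n_paths, unsigned seed) {
    assert(n_paths > 0);
    const Real s0 = 100, strike = 100, rate = Real(0.05), vol = Real(0.2), t = 1;
    const Real drift = (rate - Real(0.5) * vol * vol) * t;
    const Real diffusion = vol * std::sqrt(t);
    const Real discount = std::exp(-rate * t);

    std::mt19937 gen(seed);
    std::normal_distribution<Real> gauss(Real(0), Real(1));

    Real sum = 0;
    for (std::size_t i = 0; i < n_paths; ++i) {
        // Terminal asset price for one simulated path.
        const Real terminal = s0 * std::exp(drift + diffusion * gauss(gen));
        sum += std::max(terminal - strike, Real(0));
    }
    return discount * sum / static_cast<Real>(n_paths);
}

// Wall-clock seconds taken by one pricing run in the given precision.
template <typename Real>
double time_pricing(std::size_t n_paths, unsigned seed) {
    const auto start = std::chrono::steady_clock::now();
    // volatile keeps the compiler from discarding the unused result.
    volatile Real price = mc_call_price<Real>(n_paths, seed);
    (void)price;
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling `time_pricing<float>(n, seed)` and `time_pricing<double>(n, seed)` with the same arguments gives the two timings to compare. In a scalar loop like this one, random number generation and `exp` tend to dominate, and scalar float and double arithmetic have similar throughput on x86, so near-identical timings are plausible. A gap is more likely once the loop is vectorized (a SIMD register holds twice as many floats as doubles) or becomes limited by memory bandwidth.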

 

Practice example: profiling applications on an Intel® Xeon Phi™ coprocessor on the server from a client machine

Scenario: 
A Linux* server with an Intel(R) Xeon Phi(TM) coprocessor card runs a customized Linux* system with no X11 support, so the VTune™ Amplifier XE GUI cannot run on this server. The user has to collect and analyze the results from another machine (the client box).

Profiling an application which uses SIGNALS

Hello,
we are using Intel VTune 2015 to profile our application, which runs under CentOS 5.11.
Our application uses C++ signals for control flow. When we try to run a basic hotspots analysis using the amplxe-cl command-line tool with the following parameters:
-duration 20 --run-pass-thru=--profiling-signal=1
VTune yields the following error message when detaching after the 20-second duration. Besides the number 1, I also tried number 4, with no change in the results.

How to identify the cause for the high CPI rate

I was trying to identify why my program is slow. I noticed that one function has a high CPI value (4.5), and it says the reason may be one of:

  • Memory stalls 
  • Instruction starvation 
  • Branch misprediction 
  • Long latency instructions 

How can I explore these using VTune? Can anyone help me identify the specific reason for the high CPI?

I am using VTune 2015 U1 (trial version), and I am a Windows user.
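For reference, CPI is the number of unhalted clock cycles divided by the number of retired instructions, so a value of 4.5 means the function spends on average 4.5 cycles per instruction. A minimal sketch of that arithmetic follows; the helper name `cpi_rate` is hypothetical, and the assumption is that the two inputs are raw totals from events such as CPU_CLK_UNHALTED.THREAD and INST_RETIRED.ANY.

```cpp
#include <cstdint>
#include <limits>

// Cycles per instruction from two raw counter totals.
// Returns NaN when no instructions were retired, since the ratio is undefined.
// The function name and its use of raw totals are assumptions for illustration.
double cpi_rate(std::uint64_t clockticks, std::uint64_t instructions_retired) {
    if (instructions_retired == 0) {
        return std::numeric_limits<double>::quiet_NaN();
    }
    return static_cast<double>(clockticks) /
           static_cast<double>(instructions_retired);
}
```

With made-up totals of 4500 cycles and 1000 instructions, `cpi_rate(4500, 1000)` reproduces the 4.5 reported for the function. The ratio alone does not say which of the four listed causes applies; it only shows that many cycles pass without an instruction retiring.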

internal error: assertion failed at: "shared/cfe/edgcpfe/expr.c", line 3653

Trying to evaluate Intel Compiler 15, but I can't get it to compile our C++11 code base. I've worked around some other problems (both compile fine with g++-4.8 and clang++-3.5), but I'm having a harder time working around this one... Is this a known issue? The evaluation process is very difficult. It's unclear from the emails where or how to get support. With 30 days, it will take me 30 days to work around just getting our code to compile... argh!

PMU resource(s) currently being used by another profiling tool or process

Hi,

 

I would appreciate your help with the following. We are using Intel VTune on a cluster. I submit two different jobs that collect hardware counters on two different nodes/boxes of our cluster. The first job runs okay; the second fails with the error "Error: PMU resource(s) currently being used by another profiling tool or process." These are two different nodes on the cluster, so I cannot see why hardware counters cannot be used at the same time by two different jobs.
