I'm trying to parallelize a 2D FFT on a square NxN matrix. My serial 1-D FFT code is:
I'm running an OpenMP application (minimal example):
!$OMP PARALLEL DO
DO KR = 1, N   ! N: placeholder for the array size (not given in the original post)
   Array_Out(KR) = Array_Input(KR)
END DO
!$OMP END PARALLEL DO
I verified that there is no data race (I get the same results with only one thread). However, when I run the application in Inspector XE 2013, I get a message that I have cross-thread stack accesses.
How can I prevent this behavior, and what is its practical effect, if it does not change the results?
Thanks in advance for your replies,
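One possible reading of the report above, offered as an assumption rather than a diagnosis: a cross-thread stack access is flagged when one thread touches memory that lives on another thread's stack, which can happen when shared arrays are local (stack) variables of the thread that opens the parallel region. The sketch below is written in C rather than the post's Fortran, and the names `copy_array` and `heap_copy_matches` are hypothetical. It keeps both arrays on the heap so that worker threads never read or write the caller's stack. In Fortran, the comparable change would be to declare the arrays ALLOCATABLE.

```c
#include <assert.h>
#include <stdlib.h>

/* Copy n elements from in to out. The iterations are independent, so the
   loop can be split across threads. Without OpenMP enabled, the pragma is
   ignored and this is a plain serial loop. */
void copy_array(const double *in, double *out, int n)
{
    assert(in != NULL && out != NULL);
    #pragma omp parallel for
    for (int kr = 0; kr < n; ++kr) {
        out[kr] = in[kr];
    }
}

/* Hypothetical helper: allocate both arrays on the heap instead of as
   local (stack) arrays of the calling thread, run the copy, and return 1
   if every element matches, 0 otherwise (including allocation failure). */
int heap_copy_matches(int n)
{
    double *in = malloc((size_t)n * sizeof *in);
    double *out = malloc((size_t)n * sizeof *out);
    int ok = (in != NULL && out != NULL);
    if (ok) {
        for (int i = 0; i < n; ++i) {
            in[i] = (double)i;
        }
        copy_array(in, out, n);
        for (int i = 0; i < n; ++i) {
            if (out[i] != in[i]) {
                ok = 0;
            }
        }
    }
    free(in);
    free(out);
    return ok;
}
```

If the results are already correct, a report of this kind is usually informational: it points at shared stack memory that deserves a second look, not at a confirmed bug.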
When I use hardware transcoding, CPU usage is too high, even though the type information shows that the graphics card is being used to transcode the media. When I transcode 7 videos at the same time with hardware transcoding, CPU usage reaches nearly 100%.
I expected hardware transcoding to use the GPU rather than the CPU. Why does my transcoding program appear to use the CPU?
I am running into a problem when using the sample_vpp application to scale yuy2 video. In this test I used the compiled sample_vpp.exe provided in the MediaSDK installation. Note that this same process does work with rgb4 input and output.
Test case 1: sample_vpp -lib sw -sw 1920 -sh 1080 -scc yuy2 -dw 960 -dh 540 -dcc yuy2 -n 1 -i aspen.yuv -o aspen_scaled.yuv
The output image is the correct size but appears to be in a planar format, judging by how it displays in YUVTools. I then tried without scaling, keeping the output size equal to the input size:
I'm using Intel VTune Amplifier 2015 (Linux version). The sampling time for my workload is 180 seconds. I built my software with debug symbols enabled.
In VTune's project properties, I set the paths to the build, the source files, and the symbols. When I click Re-resolve, VTune takes more than an hour to finalize and display the results. The progress bar reaches 30% and stays stuck there, showing "finalizing results" for more than an hour.
What is the problem here? Why does it take so long to display the results when I click Re-resolve?
Download this guide (see Article Attachments, below) to learn how to identify performance issues on software running on the 5th generation Intel® Core™ processor family (based on Intel® Microarchitecture Codename Broadwell). The guide explains the General Exploration Analysis viewpoint available in Intel® VTune™ Amplifier XE. It also walks through some of the most common performance issues that the VTune Amplifier XE interface highlights, what each issue means, and some suggested ways to fix them.
I believe that there is a documentation bug in the pseudo-code for the IRET instruction in the current edition of Volume 2A of the Architectures Software Developer's Manual.
The case we're looking at is using IRET to switch from Ring-0 to Ring-3.
The prose for protected mode states:
Beta 2016 Update 2 has a problem with the new VS 2015 run-time libraries. Consider the following code (it should be compiled and linked as a Win32/64 DLL):
I need PRIMME-V1 to build SLEPc, so I am trying to build the application with the Intel 2015 suite on CentOS 6.5.
I am getting the following error: