
Intel LLVM Optimizer Optimization Flags

Hi all,

I'm using the Intel offline compiler and the LLVM-based optimizer with Intel-specific optimizations (oclopt). For the optimizer I have used the regular -O optimization levels (O1, O2, O3), which include many individual optimizations run in two passes. Since there are plenty of individual optimizations available, I'm interested in knowing which optimization flags potentially have a big impact on performance.

Can you please suggest some optimizations?

Best regards,
Robert

Same kernel but huge performance difference under Linux and Windows

Hi, 

I have managed to run my kernel on the iGPU under both Linux and Windows.

Officially, Linux does not support running kernels on the iGPU, but the open-source OpenCL project "Beignet" comes to the rescue.

So the following are the performance results for my kernel (a deblocking filter in HEVC). The times (in seconds) were not obtained by binding an event to the kernel launch in OpenCL, since that also depends on the OpenCL runtime implementation under Windows and Linux; instead, they were obtained with host-side CPU profiling utilities.

                      H2D     Kernel     D2H
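
For clarity, the host-side timing is done roughly as in the sketch below (a simplified sketch only: error checks are omitted, and the queue, kernel, buffer and size arguments are assumed to be set up elsewhere):

#include <CL/cl.h>
#include <time.h>

/* Wall-clock timer based on C11 timespec_get; any high-resolution host
 * timer (QueryPerformanceCounter, clock_gettime, ...) works the same way. */
static double now_sec(void)
{
    struct timespec ts;
    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}

/* Measure the H2D, kernel and D2H stages with host-side timestamps instead
 * of OpenCL event profiling. Blocking transfers and clFinish() make sure
 * each stage has really completed before the next timestamp is taken. */
static void time_stages(cl_command_queue queue, cl_kernel kernel,
                        cl_mem dev_buf, void *host_buf, size_t bytes,
                        size_t global_size,
                        double *h2d, double *krn, double *d2h)
{
    double t0 = now_sec();
    clEnqueueWriteBuffer(queue, dev_buf, CL_TRUE, 0, bytes, host_buf, 0, NULL, NULL);

    double t1 = now_sec();
    clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &global_size, NULL, 0, NULL, NULL);
    clFinish(queue);

    double t2 = now_sec();
    clEnqueueReadBuffer(queue, dev_buf, CL_TRUE, 0, bytes, host_buf, 0, NULL, NULL);

    double t3 = now_sec();
    *h2d = t1 - t0;
    *krn = t2 - t1;
    *d2h = t3 - t2;
}

The blocking calls keep the stages separable; with non-blocking transfers or an out-of-order queue the split would have to rely on events instead.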

Memory Leak in Windows CPU OCL 1.2/2.0, not so much GPU 1.2

Hello world,

I'm developing an asynchronous Windows application and have noticed a strange loss of system memory. My application tracks its memory usage internally, and when not using OpenCL at all this matches what the system reports through taskmgr. What's curious is that the size of the leak depends on which OpenCL version and device I use. Summarizing what taskmgr reports:

No OpenCL (vanilla C code): ~8 MB
OpenCL 2.0 Experimental CPU: ~1.2 GB
OpenCL 1.2 CPU: ~350 MB
OpenCL 1.2 GPU (HD 4600): ~40 MB
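
The stripped-down loop I use to isolate the growth looks roughly like this (a sketch only: error checks are omitted and the buffer size is a placeholder); I watch the working set in taskmgr before and after the loop:

#include <CL/cl.h>
#include <stdio.h>

int main(void)
{
    cl_platform_id platform;
    cl_device_id device;
    clGetPlatformIDs(1, &platform, NULL);
    clGetDeviceIDs(platform, CL_DEVICE_TYPE_CPU, 1, &device, NULL);

    /* Repeatedly create and release a context, queue and buffer; if every
     * object is released, the working set should stay flat across iterations. */
    for (int i = 0; i < 1000; ++i) {
        cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, NULL);
        /* clCreateCommandQueue is the 1.2 entry point; it is deprecated in 2.0
         * in favor of clCreateCommandQueueWithProperties, but both exist. */
        cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, NULL);
        cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE, 1 << 20, NULL, NULL);

        /* ... the real application enqueues work here ... */

        clReleaseMemObject(buf);
        clReleaseCommandQueue(queue);
        clReleaseContext(ctx);
    }

    printf("loop done, compare the working set reported by taskmgr\n");
    return 0;
}

If the working set still grows even though everything is released, the growth is happening inside the runtime rather than in my code.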

Announcing Intel® Data Analytics Acceleration Library 2016 Beta

We are pleased to announce the release of Intel® Data Analytics Acceleration Library 2016 Beta!

Intel® Data Analytics Acceleration Library is a C++ and Java API library of optimized analytics building blocks for all data analysis stages, from data acquisition to data mining and machine learning. It is an essential library for engineering high-performance data application solutions.

To join the free Beta program and get instructions on downloading the software, follow the Beta registration links.

Intel® XDK Update for February 2015: HTML5 Games, Sublime Text* & Easier to Get Started

We are gearing up for two of the biggest shows of the year: Game Developers Conference (GDC) in San Francisco and Mobile World Congress (MWC) in Barcelona, Spain, both the week of March 2. Come look for us in the Intel® Software booths at the shows; stop by and get an Intel XDK sticker!

Which fine grain SVM features are supported in the current Gen8 driver?

I don't have Broadwell hardware in front of me yet, so can you tell me which fine-grained SVM capabilities are supported in the latest driver on Gen8 devices? Just FINE_GRAIN_BUFFER?

If FINE_GRAIN_SYSTEM is supported, can an 8-16 GB host address space be shared?

The OpenCL 2.0 SVM article does a nice job of summarizing the capability bits. Can you list which are supported in the .4080 driver and which might eventually be supported?
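
For reference, this is roughly how I intend to check the capability bits once I have the hardware, using the standard OpenCL 2.0 device query (the device argument is assumed to be the Gen8 GPU device ID):

#include <CL/cl.h>
#include <stdio.h>

/* Query and print the SVM capability bits of a device (OpenCL 2.0). */
static void print_svm_caps(cl_device_id device)
{
    cl_device_svm_capabilities caps = 0;
    cl_int err = clGetDeviceInfo(device, CL_DEVICE_SVM_CAPABILITIES,
                                 sizeof(caps), &caps, NULL);
    if (err != CL_SUCCESS) {
        /* Pre-2.0 devices/drivers do not report CL_DEVICE_SVM_CAPABILITIES. */
        printf("CL_DEVICE_SVM_CAPABILITIES not available (err = %d)\n", err);
        return;
    }
    printf("coarse-grain buffer SVM: %s\n", (caps & CL_DEVICE_SVM_COARSE_GRAIN_BUFFER) ? "yes" : "no");
    printf("fine-grain buffer SVM:   %s\n", (caps & CL_DEVICE_SVM_FINE_GRAIN_BUFFER)   ? "yes" : "no");
    printf("fine-grain system SVM:   %s\n", (caps & CL_DEVICE_SVM_FINE_GRAIN_SYSTEM)   ? "yes" : "no");
    printf("SVM atomics:             %s\n", (caps & CL_DEVICE_SVM_ATOMICS)             ? "yes" : "no");
}

If FINE_GRAIN_SYSTEM shows up, I would expect plain malloc'ed host memory to be usable directly by kernels, which is what the 8-16 GB question is really about.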
