not vectorizing for no reason

Hi all,

I have isolated a small section of a loop in my code so that I can vectorize it and test other kinds of optimization as well (alignment, etc.).

Here is the actual code.

WORK1(:,:,kk) =  KAPPA_THIC(:,:,kbt,k,bid)  * SLX(:,:,kk,kbt,k,bid) * dz(k)

The optimization report (optrpt) says this:

LOOP BEGIN at loop.F90(91,13)
   remark #15541: outer loop was not auto-vectorized: consider using SIMD directive
   remark #25436: completely unrolled by 8

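Intel® INDE: Tools for Game Developers Using Commercial Game Engines


These are not easy days for the game development industry. On one hand, developers face ever-shorter product "half-lives" across multiple platforms; on the other, multiple operating system versions bring further challenges. Even optimizing a game for a single platform has become harder, because system complexity keeps growing and power consumption now plays a key role in how well a game performs. There are billions of Windows* and Android* devices today, which also makes the potential payback period on an investment quite long.

In this article, we describe how Intel® Integrated Native Development Experience (Intel® INDE), a cross-platform tool suite released last year, can help you easily create first-rate games that run with native performance on Windows* and Android* devices. These tools are very useful even if you use a third-party game engine such as Unity* or Epic Unreal Engine*. The Intel INDE tools can still provide additional capabilities that help your game stand out in a highly competitive market.

Intel INDE helps you create great games that give players a smooth, immersive gaming experience.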


  • Developers
  • Android*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • Windows*
  • C/C++
  • Beginner
  • Intermediate
  • Intel® INDE
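  • Creating Multi-Platform Games with Cocos2d-x Version 3.0 or Later


    In this tutorial, you will learn how to create a simple game using the Cocos2d-x framework, version 3.0 or later, in a Windows* development environment, and how to compile it to run on both Windows and Android*.

    What is Cocos2d-x?

    Cocos2d-x is a cross-platform framework for games (and other graphical applications such as interactive books). It is based on cocos2d for iOS*, but uses C++, JavaScript*, or Lua* instead of Objective-C*.

    One advantage of the framework is that it supports creating games that can be deployed on different platforms (Android, iOS, Win32, Windows* Phone, Mac*, Linux*, and others), which lets you keep a single code base and make only platform-specific adjustments for each target.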

  • Developers
  • Partners
  • Professors
  • Students
  • Android*
  • Microsoft Windows* (XP, Vista, 7)
  • Microsoft Windows* 8.x
  • Game Development
  • Windows*
  • C/C++
  • Beginner
  • cocos2d-x
  • android game
  • Microsoft Windows* 8 Desktop
  • Are there any instructions in k1om that can replace the lfence instruction in x86_64?

    I'm compiling Supersonic, an open-source database from Google, for Intel Phi using icc with the -mmic option.

    However, I found some lfence instructions in the source code, and it seems that Phi doesn't support lfence, so I want to replace lfence with some other instructions that Phi does support.

    Is this practicable? For example,

    Intel® RealSense™ SDK Gold R2 – Overview

    The Intel® RealSense™ SDK Gold R2 (v4.0) release is now available! This brief overview walks you through product improvements from Gold R1 (v3.1) and some of the new natural interaction modalities that can be used to create compelling applications using the Intel RealSense SDK.
  • Developers
  • Microsoft Windows* 8.x
  • Intel® RealSense™ Technology
  • C#
  • C/C++
  • Java*
  • JavaScript*
  • Unity
  • Intermediate
  • Intel® RealSense™ SDK
  • SDK
  • DCM
  • Gold R1
  • Gold R2 SDK
  • 3D Scanning
  • Pulse Estimator
  • Blob Tracking
  • How to set up an AVD correctly

    Dear Guys,

    Does anyone know how to set up an AVD correctly, so that the front camera on a laptop behaves like the front camera of a real device running the same application? I tried changing the settings, but the results are not good. I hope the Intel developers can help me resolve this issue. Thanks a lot.

    Set Simulator Camera   http://prntscr.com/77kzuy

    Running x64 Simulator and Camera http://prntscr.com/77kzw6

    Best Regards, Alex

    _mm_unpackhi_epi8 and _mm_unpacklo_epi8 to convert 16 signed chars into 2 signed short vectors

    I am using _mm_unpacklo_epi16 and _mm_unpackhi_epi16 with a vector of zeros as the second argument to convert signed/unsigned short vectors into 2 signed/unsigned integer vectors, i.e.:

    __m128i lowVec  = _mm_unpacklo_epi16(vecA, vec0);
    __m128i highVec = _mm_unpackhi_epi16(vecA, vec0);

    The same approach works fine for converting a vector of 16 unsigned chars into 2 unsigned short vectors using _mm_unpacklo_epi8 and _mm_unpackhi_epi8, yet when the input vector holds 16 signed chars, the short values in the result vectors are all 127 + the original values.
