The software development guide does a great job of presenting the possibilities offered by hardware swizzling on Xeon Phi, but I could not work out how to use swizzling in practice.
There appear to be some intrinsics available in "zmmintrin.h" (the only header file that appears to provide 512-bit SIMD intrinsics; it is included from "immintrin.h"). Some of them mention swizzling, but not all operations appear to support it, and only a handful of swizzling options appear to be supported. In particular, I could not find how to apply "lane shifting".
Where can I find documentation on how to use swizzling in practice? Is swizzling supposed to be used only from assembly?