Here is a solution using AVX2 (operating on 32-bit integer elements). Registers are written with the highest dword on the left, and x marks a don't-care element:

#1:

org: ymm0 = x x x x a3 a2 a1 a0

vpermq ymm0,ymm0,0x10 => ymm0 = x x a3 a2 x x a1 a0 / select qwords x1x0

vpunpckldq ymm0,ymm0,ymm0 => ymm0 = a3 a3 a2 a2 a1 a1 a0 a0 / interlace low dwords
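
For reference, here is a rough sketch of the same two-instruction sequence in C intrinsics. It assumes GCC or Clang on an x86-64 CPU with AVX2; the function name `dup_each_dword` and the array-based interface are only illustrative, not part of the original answer.

```c
#include <immintrin.h>
#include <stdint.h>

/* in:  a0 a1 a2 a3 (lowest address first)
   out: a0 a0 a1 a1 a2 a2 a3 a3
   The target attribute (GCC/Clang specific) enables AVX2 for this
   function only, so no -mavx2 flag is assumed. */
__attribute__((target("avx2")))
void dup_each_dword(const int32_t in[4], int32_t out[8])
{
    /* Load 128 bits; the upper half of v is undefined (the "x" elements). */
    __m256i v = _mm256_castsi128_si256(_mm_loadu_si128((const __m128i *)in));

    /* vpermq 0x10: qword 1 (a3 a2) moves to the low half of the upper lane. */
    v = _mm256_permute4x64_epi64(v, 0x10);

    /* vpunpckldq v,v: interleave the low dwords of each lane with themselves. */
    v = _mm256_unpacklo_epi32(v, v);

    _mm256_storeu_si256((__m256i *)out, v);
}
```

Only the low two qwords of the source are ever selected by the 0x10 immediate, so the undefined upper half never reaches the result.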

#2:

org: ymm0 = b3 a3 b2 a2 b1 a1 b0 a0

vpshufd ymm1,ymm0,0x08 => ymm1 = x x a3 a2 x x a1 a0 / select dwords xx20

vpshufd ymm2,ymm0,0x0d => ymm2 = x x b3 b2 x x b1 b0 / select dwords xx31

vpermq ymm1,ymm1,0x08 => ymm1 = x x x x a3 a2 a1 a0 / select qwords xx20

vpermq ymm2,ymm2,0x08 => ymm2 = x x x x b3 b2 b1 b0 / select qwords xx20
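
A possible intrinsics version of this de-interleave, again only as a sketch assuming GCC or Clang and an AVX2-capable x86-64 CPU. The name `deinterleave_dwords` and the pointer-based interface are made up for illustration.

```c
#include <immintrin.h>
#include <stdint.h>

/* in: a0 b0 a1 b1 a2 b2 a3 b3 (lowest address first)
   a:  a0 a1 a2 a3
   b:  b0 b1 b2 b3 */
__attribute__((target("avx2")))
void deinterleave_dwords(const int32_t in[8], int32_t a[4], int32_t b[4])
{
    __m256i v = _mm256_loadu_si256((const __m256i *)in);

    /* vpshufd: per lane, gather the even (a) or odd (b) dwords
       into the low qword of that lane. */
    __m256i va = _mm256_shuffle_epi32(v, 0x08);
    __m256i vb = _mm256_shuffle_epi32(v, 0x0d);

    /* vpermq 0x08: pack qwords 0 and 2 into the low 128 bits. */
    va = _mm256_permute4x64_epi64(va, 0x08);
    vb = _mm256_permute4x64_epi64(vb, 0x08);

    /* Store only the low 128 bits of each result. */
    _mm_storeu_si128((__m128i *)a, _mm256_castsi256_si128(va));
    _mm_storeu_si128((__m128i *)b, _mm256_castsi256_si128(vb));
}
```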

#3:

org: ymm1 = x x x x a3 a2 a1 a0; ymm2 = x x x x b3 b2 b1 b0

vpermq ymm1,ymm1,0x10 => ymm1 = x x a3 a2 x x a1 a0 / select qwords x1x0

vpermq ymm2,ymm2,0x10 => ymm2 = x x b3 b2 x x b1 b0 / select qwords x1x0

vpunpckldq ymm0,ymm1,ymm2 => ymm0 = b3 a3 b2 a2 b1 a1 b0 a0 / interlace low dwords
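
The interleave can be sketched with intrinsics in the same way, under the same assumptions (GCC or Clang, x86-64 with AVX2). `interleave_dwords` is a hypothetical name used only for this example.

```c
#include <immintrin.h>
#include <stdint.h>

/* a:   a0 a1 a2 a3 (lowest address first)
   b:   b0 b1 b2 b3
   out: a0 b0 a1 b1 a2 b2 a3 b3 */
__attribute__((target("avx2")))
void interleave_dwords(const int32_t a[4], const int32_t b[4], int32_t out[8])
{
    /* 128-bit loads; the upper halves are undefined ("x") and never selected. */
    __m256i va = _mm256_castsi128_si256(_mm_loadu_si128((const __m128i *)a));
    __m256i vb = _mm256_castsi128_si256(_mm_loadu_si128((const __m128i *)b));

    /* vpermq 0x10: spread qwords 0 and 1 across the two 128-bit lanes. */
    va = _mm256_permute4x64_epi64(va, 0x10);
    vb = _mm256_permute4x64_epi64(vb, 0x10);

    /* vpunpckldq: interleave the low dwords of each lane, a first, then b. */
    __m256i v = _mm256_unpacklo_epi32(va, vb);

    _mm256_storeu_si256((__m256i *)out, v);
}
```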

## Cross-lane operations, how?

Question #1: I have:

xmm0/mem128 = A3 A2 A1 A0

And I want to have:

ymm0 = A3 A3 A2 A2 A1 A1 A0 A0

Question #2: I have:

ymm0 = B3 A3 B2 A2 B1 A1 B0 A0

And I want to have:

xmm1/mem128 = A3 A2 A1 A0

xmm2/mem128 = B3 B2 B1 B0

Question #3: I have:

xmm1/mem128 = A3 A2 A1 A0

xmm2/mem128 = B3 B2 B1 B0

And I want to have:

ymm0 = B3 A3 B2 A2 B1 A1 B0 A0

How can I accomplish these seemingly trivial transformations, given AVX's cross-lane limitations?