In SNC2 mode there are 2 MCDRAM nodes (nodes 2 and 3), while in SNC4 mode there are 4 MCDRAM nodes (nodes 4, 5, 6, and 7). In both modes I want to bind my application's memory to the MCDRAM nodes only, not to the DDR nodes.
For SNC2 I use: numactl -m 2,3 <application>
For SNC4 I use: numactl -m 4,5,6,7 <application>
Application: Intel Caffe
Number of threads: 16/32/64/128
In SNC2, memory is allocated on node 2 and then on node 0 (DDR). I expected the allocation to use nodes 2 and 3, not nodes 2 and 0. Similarly, in SNC4 I observe that nodes 4 and 0 are used for memory, not nodes 4, 5, 6, and 7 as specified in the numactl command above.
Because of this, I see a performance difference, since the other MCDRAM (HBM) nodes are not being used for memory allocation. This doesn't make sense to me. Can anyone suggest why this may be happening?
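One way to confirm the per-node placement described above is to sum the `N<node>=<pages>` fields in `/proc/<pid>/numa_maps` for the running process. The following is only a minimal sketch: the function names `node_usage_kb` and `read_numa_maps` are hypothetical, and it assumes 4 kB pages for any mapping that does not report `kernelpagesize_kB`.

```python
import re
from collections import defaultdict

# Per-node page counts in a numa_maps line look like "N2=512".
NODE_FIELD = re.compile(r"^N(\d+)=(\d+)$")
# Page size of the mapping, e.g. "kernelpagesize_kB=4".
PAGE_SIZE_FIELD = re.compile(r"^kernelpagesize_kB=(\d+)$")


def node_usage_kb(numa_maps_text):
    """Return {node_id: kB} summed over every mapping in numa_maps text."""
    usage = defaultdict(int)
    for line in numa_maps_text.splitlines():
        page_kb = 4  # assumption: 4 kB pages if the field is missing
        counts = []
        for field in line.split():
            size_match = PAGE_SIZE_FIELD.match(field)
            if size_match:
                page_kb = int(size_match.group(1))
                continue
            node_match = NODE_FIELD.match(field)
            if node_match:
                counts.append((int(node_match.group(1)),
                               int(node_match.group(2))))
        # Convert page counts to kB once the page size for the line is known.
        for node, pages in counts:
            usage[node] += pages * page_kb
    return dict(usage)


def read_numa_maps(pid):
    """Read the numa_maps file of a process (Linux only)."""
    with open(f"/proc/{pid}/numa_maps") as handle:
        return handle.read()
```

Calling `node_usage_kb(read_numa_maps(pid))` while the application is running gives a per-node total in kB. The second column of each `numa_maps` line also shows the memory policy in effect for that mapping, which helps check whether the binding requested through numactl was actually applied.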