I am writing an application for the MIC architecture, and I want to know the bandwidth between each level of the memory hierarchy:
between the core and L1, between L1 and L2, and between L2 and memory. I need this information to evaluate my application.
Specifically, how many loads can be issued per clock cycle?
And how many cycles does it take to transfer a 64-byte cache line from L2 to L1?
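
One way to estimate these numbers empirically is a working-set sweep: read a buffer repeatedly and time it, once with a size that fits in L1, once with a size that fits only in L2, and once with a size that fits only in memory. Below is a minimal sketch in C, assuming POSIX `clock_gettime` is available. The function names `read_sweep` and `read_bandwidth` are hypothetical, not part of any MIC toolkit. The `volatile` loads stop the compiler from removing the loop, but they also prevent vectorization, so this measures scalar load throughput and will not reach the peak bandwidth of the hardware.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Read every element of buf `reps` times and return the running sum.
 * The volatile qualifier forces one real load per element (assumption:
 * scalar loads are acceptable for a first estimate). */
static uint64_t read_sweep(const volatile uint64_t *buf, size_t n, int reps)
{
    uint64_t sum = 0;
    for (int r = 0; r < reps; r++)
        for (size_t i = 0; i < n; i++)
            sum += buf[i];
    return sum;
}

/* Estimate read bandwidth in bytes per second for a working set of
 * `bytes` bytes. Returns a negative value on allocation failure or if
 * the elapsed time is too short to measure. The sum of all loaded
 * values is stored in *checksum so the result is observably used. */
static double read_bandwidth(size_t bytes, int reps, uint64_t *checksum)
{
    size_t n = bytes / sizeof(uint64_t);
    if (n == 0)
        return -1.0;

    uint64_t *buf = malloc(n * sizeof *buf);
    if (buf == NULL)
        return -1.0;

    /* Initialize the buffer; this also touches every page before timing. */
    for (size_t i = 0; i < n; i++)
        buf[i] = (uint64_t)i;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    *checksum = read_sweep(buf, n, reps);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    free(buf);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    if (secs <= 0.0)
        return -1.0;

    return (double)n * sizeof(uint64_t) * (double)reps / secs;
}
```

Calling `read_bandwidth` with a few different sizes and dividing the result by the core clock frequency gives a rough bytes-per-cycle figure for each level. A single-threaded run like this only reflects what one core can pull, not the aggregate bandwidth of the chip.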