Asymmetrical performance between PCIe peer-to-peer devices


I have developed a driver that enables direct communication between two PCIe devices. The default path is PCIe device A -> host memory -> PCIe device B; my framework allows the direct path A -> B. The problem is that while the direct path yields a significant speedup in one direction (A->B), in the other direction (B->A) performance drops by an order of magnitude.

I can't figure out the reason for this behavior (the firmware of both devices is closed-source). The chipset of the setup is the 5520 (IOH-36D). I have checked the manual, but I still have no clue. Is there an obvious reason for this asymmetry? If you can shed some light on my case, I would be glad.

