I was wondering if there has been any performance
comparison of the various (or any) Intel MPI messaging functions on an Intel64 platform
using large Linux pages (2MiB) vs the "standard" 4KiB pages.
There are several benefits when large pages are used, including the
reduction in TLB miss rates and the larger (2MiB) space for pre-fetching.
MPI stacks on other platforms, especially those coupled with an RDMA-based communications stack, have shown improvements when large pages are used to transfer large messages, as the VM operations do not get in the way as much.
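To put rough numbers on the TLB argument, here is a minimal sketch in Python that counts how many pages (and so how many address translations) a message buffer spans at each page size. The function name `pages_spanned` and the message sizes are made up for illustration, and the buffer is assumed to be page-aligned. It is only arithmetic and says nothing about measured Intel MPI performance.

```python
KIB = 1024
MIB = 1024 * KIB

SMALL_PAGE = 4 * KIB   # "standard" Linux page size on Intel64
LARGE_PAGE = 2 * MIB   # large page size discussed above


def pages_spanned(message_bytes: int, page_bytes: int) -> int:
    """Minimum number of pages covering a page-aligned buffer of this size.

    Hypothetical helper for illustration: integer ceiling division.
    """
    if message_bytes <= 0 or page_bytes <= 0:
        raise ValueError("sizes must be positive")
    return (message_bytes + page_bytes - 1) // page_bytes


if __name__ == "__main__":
    # Illustrative message sizes only, not taken from any benchmark.
    for size in (64 * KIB, 4 * MIB, 256 * MIB):
        small = pages_spanned(size, SMALL_PAGE)
        large = pages_spanned(size, LARGE_PAGE)
        print(f"{size // KIB} KiB message: {small} x 4KiB pages, {large} x 2MiB pages")
```

For a 4MiB message this gives 1024 standard pages against 2 large pages, which is the kind of gap that motivates the question.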
Is there any direction in Intel to make large pages more easily accessible to apps vs the Intel MPI stack?