huge page (2MB) does not seem to work

Dear forum,

I am trying to use 2MB huge pages on the coprocessor with native execution (MPI+OpenMP). I set up the server following this tutorial, using the libhugetlbfs library approach. But when the binary is running natively, I notice that the huge pages do not seem to be consumed at all:

# cat /proc/meminfo | grep HugePages
AnonHugePages:     24576 kB
HugePages_Total:     800   # this is my preallocated amount
HugePages_Free:      800   # always remains the same as the total
HugePages_Rsvd:        0
HugePages_Surp:        0

Could you help me with this problem? In addition, this document seems to indicate that huge pages are supposed to be enabled by default. If so, why are the pages still not being used?

Thanks for your time!

ps:

System info:

Linux version: 2.6.32-431.11.2.el6.x86_64
MPSS version: 3.2.1
MIC: 5110P
MIC uOS version: 2.6.38.8+mpss3.2.1
Result of "cat /sys/kernel/mm/transparent_hugepage/enabled" on the host: [always]


Hi King Crimson,

Nice to see you back on the forums.

There is some discussion going on in the background.

We will get back to you soon.

Regards
--
Taylor
 

Best Reply

I used to use the libhugetlbfs approach, but I seem to recall that it quit working for me with some upgrade of MPSS.  (I probably did not update everything in my code that needed updating.)   I did not follow up because at about the same time I found that it was easier to use mmap() with the MAP_ANONYMOUS and MAP_HUGETLB flags.
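For reference, a minimal sketch of that mmap() approach (the 16 MiB size and the error handling are illustrative, not something from this thread):

#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define LEN (16UL * 1024 * 1024)   /* 16 MiB, a multiple of the 2 MiB huge page size */

int main(void)
{
    /* Request anonymous memory explicitly backed by huge pages. This draws
       from the preallocated pool, so HugePages_Free in /proc/meminfo should
       drop while the program runs. */
    void *p = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap(MAP_HUGETLB)");   /* fails if the huge page pool is empty */
        return 1;
    }
    memset(p, 0, LEN);                 /* touch the pages so they are actually faulted in */
    munmap(p, LEN);
    return 0;
}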

With the transparent huge page option set, you should not need libhugetlbfs -- any sufficiently large static allocation should be put on large pages.  (I don't know if dynamic allocations other than mmap() will automatically use large pages.)

Another issue is the minimum size required to activate the transparent large page mechanism.  If an allocation is not 2MiB-aligned, then it will be filled with 4KiB pages until it reaches the next 2MiB boundary.   This is not really a problem for GiB-sized allocations, but it means that you can request up to (4 MiB - 8 KiB) and not get any large pages -- e.g., if the allocation starts at 2MiB+4KiB, the first 511 4KiB pages just fill the space up to the next 2MiB boundary, and the 511 4KiB pages that remain (one page short of a full 2MiB page) cannot form a large page either.
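One way to avoid that corner case is to align large allocations to the 2MiB boundary yourself. A minimal sketch, assuming a 128 MiB working set:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TWO_MIB (2UL * 1024 * 1024)

int main(void)
{
    void *p;
    /* Start the allocation on a 2 MiB boundary so the transparent huge page
       mechanism can back the whole region, instead of filling the unaligned
       head and tail with 4 KiB pages. */
    int rc = posix_memalign(&p, TWO_MIB, 64 * TWO_MIB);   /* 128 MiB */
    if (rc != 0) {
        fprintf(stderr, "posix_memalign failed: %d\n", rc);
        return 1;
    }
    memset(p, 0, 64 * TWO_MIB);   /* touching the memory faults in the 2 MiB pages */
    free(p);
    return 0;
}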

John D. McCalpin, PhD "Dr. Bandwidth"

John is correct in that it should be transparent now. Do you still need us to investigate whether you can still use libhugetlbfs, say, to minimize code changes?

Regards
--
Taylor

 

Dr. Kidd,

Yes, I'd appreciate that. Minimizing code changes is the primary reason I chose the library approach.


Dr Kidd? How did you find out my secret?

 

XD. this


Hello,

>> I am trying to use 2MB huge pages on the coprocessor with native execution (MPI+OpenMP). I set up the server following this tutorial, using the libhugetlbfs library approach. But when the binary is running natively, I notice that the huge pages do not seem to be consumed at all:

# cat /proc/meminfo | grep HugePages
AnonHugePages:     24576 kB

You need to look at the "AnonHugePages:     24576 kB" line and NOT the "HugePages_Total/Free" counters.

In the above case you are using (24576/1024)/2 = 12 2MB pages, and that is due to Transparent Huge Pages (THP). Did you look at AnonHugePages before and during your application run on the Xeon Phi?
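To make the check concrete, here is a tiny sketch that prints just that counter (plain "cat /proc/meminfo | grep AnonHugePages" on the card does the same thing):

#include <stdio.h>
#include <string.h>

/* Print only the AnonHugePages line from /proc/meminfo. Run it on the
   coprocessor before and during the application run; the value should
   grow as THP backs the application's heap. */
int main(void)
{
    FILE *f = fopen("/proc/meminfo", "r");
    char line[256];

    if (!f) {
        perror("fopen /proc/meminfo");
        return 1;
    }
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "AnonHugePages:", 14) == 0)
            fputs(line, stdout);
    fclose(f);
    return 0;
}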

Thanks

Karthik

Hmmm... does it work cross-platform?

team inivo

Vidura,

It depends on what you mean by working cross platform. If you mean, do you get 2MB pages on the coprocessor when you use the offload programming model to move data from the host, the answer is: Transparent Huge Pages does not apply there, but there is an environment variable, MIC_USE_2MB_BUFFERS, which sets a size threshold above which contiguous data on the coprocessor is allocated in 2MB pages. This environment variable is described in the C++ compiler reference manual (https://software.intel.com/en-us/node/512835).
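For example, a hedged sketch using the offload pragmas (the 64K threshold and the 32 MB array are arbitrary illustrative choices, not values from this thread):

/* Sketch only: compile with the Intel compiler and run with, e.g.,
 *     export MIC_USE_2MB_BUFFERS=64K
 * so that coprocessor buffers larger than 64 KB land on 2 MB pages. */
#include <stdio.h>

#define N (1 << 22)            /* 4 Mi doubles = 32 MB, well above the 64K threshold */
static double data[N];

int main(void)
{
    int i;
    /* inout() allocates a 32 MB buffer for data[] on the coprocessor; with
       MIC_USE_2MB_BUFFERS set as above, that buffer is backed by 2 MB pages. */
    #pragma offload target(mic) inout(data)
    {
        for (i = 0; i < N; i++)
            data[i] += 1.0;
    }
    printf("data[0] = %f\n", data[0]);
    return 0;
}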

Frances
