Memory allocation issue after upgrading to DPDK18.11

This is my first time upgrading the DPDK version (from DPDK 16.07 to DPDK 18.11).

I hope this is the relevant forum for a detailed SW question. Please let me know otherwise.

Some relevant setup info:

  • Two NUMA nodes.
  • A single huge page size of 1G.
  • A total of 95 huge pages: 48 allocated on socket 0 and 47 on socket 1.

After upgrading to 18.11, the application fails at startup while trying to allocate a memory block that is larger than Greatest_free_size, with no huge pages left to allocate. With the same application and the same number of huge pages, these allocations succeeded with DPDK 16.07.

I compared the allocation behavior in 16.07 and 18.11. When memory is allocated statically, as in 16.07, there is no issue (Free_size always equals Greatest_free_size). When memory is allocated dynamically, a new page is allocated whenever there is not enough contiguous memory, so I guess we end up with holes and, as a result, no free memory is left.

     1. When working dynamically, doesn't DPDK try to allocate contiguously? If it does, why do we run out of memory?
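
The gap between Free_size and Greatest_free_size can be illustrated with a small self-contained model. This is not DPDK code: the heap is reduced to an array of free-block sizes in MB, and the helper names (total_free_mb, greatest_free_mb, can_alloc_contiguous) are made up for the illustration. It only shows why a contiguous request can fail while the total free memory would be enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model, not DPDK code: each entry is the size (in MB) of one free
 * block that is NOT contiguous with any other free block. */

/* Sum of all free blocks, analogous to Free_size. */
static uint64_t total_free_mb(const uint64_t *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    uint64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += blocks[i];
    return sum;
}

/* Largest single free block, analogous to Greatest_free_size. */
static uint64_t greatest_free_mb(const uint64_t *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);
    uint64_t best = 0;
    for (size_t i = 0; i < n; i++)
        if (blocks[i] > best)
            best = blocks[i];
    return best;
}

/* A contiguous request succeeds only if a single block can hold it. */
static int can_alloc_contiguous(const uint64_t *blocks, size_t n,
                                uint64_t req_mb)
{
    return greatest_free_mb(blocks, n) >= req_mb;
}
```

With free blocks of 300, 200 and 500 MB, the total free size is 1000 MB, but a 600 MB contiguous request still fails because the largest block is 500 MB.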

To work around this I enabled the legacy memory flag. However, with it enabled I get the following error: EAL: Could not find space for memseg. Please increase CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE in configuration.

With the default configuration, each segment list has 32 segments (as determined by memseg_primary_init), with 4 lists per socket.

In remap_segment (reached via eal_legacy_hugepage_init -> remap_needed_hugepages), we go through each memseg list searching for n=48 free segments. However, each list has only 32 segments, so remapping fails.
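
The failure described above can be sketched with a simplified model. This is not the actual remap_segment code: struct seg_list, find_list_for_run and total_free_slots are hypothetical names, and each list is reduced to a capacity and a used count. The model assumes, as observed, that the whole run of n segments must fit inside one list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one memseg list: only counts are tracked. */
struct seg_list {
    unsigned int capacity; /* segments the list can hold */
    unsigned int used;     /* segments already occupied  */
};

/* Free slots summed over all lists. */
static unsigned int total_free_slots(const struct seg_list *lists,
                                     size_t n_lists)
{
    assert(lists != NULL || n_lists == 0);
    unsigned int total = 0;
    for (size_t i = 0; i < n_lists; i++)
        total += lists[i].capacity - lists[i].used;
    return total;
}

/* Index of the first list that can hold a run of n segments on its own,
 * or -1 if no single list is large enough. */
static int find_list_for_run(const struct seg_list *lists, size_t n_lists,
                             unsigned int n)
{
    assert(lists != NULL || n_lists == 0);
    for (size_t i = 0; i < n_lists; i++)
        if (lists[i].capacity - lists[i].used >= n)
            return (int)i;
    return -1;
}
```

With 4 empty lists of 32 segments each, there are 128 free slots in total, yet a run of 48 finds no home because no single list can hold it. With 64 segments per list, the first list is enough.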

As a workaround I increased CONFIG_RTE_MAX_MEM_MB_PER_LIST from 32768 to 65536, so each segment list now holds 64 segments. Modifying CONFIG_RTE_MAX_MEMSEG_PER_TYPE and/or CONFIG_RTE_MAX_MEM_PER_TYPE, as suggested by the error message, had no effect because the calculation takes the minimum of the per-type and per-list values, and the per-list value was the smaller one.
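
The arithmetic behind these numbers can be sketched as follows. This is a simplified model, not the code from memseg_primary_init: segs_per_list and min_mem_mb_per_list are hypothetical helpers, and the per-list segment cap of 8192 used in the usage note is an assumption about the default CONFIG_RTE_MAX_MEMSEG_PER_LIST value. It assumes the segment count per list is the smaller of that cap and the per-list memory limit divided by the page size.

```c
#include <assert.h>
#include <stdint.h>

/* Segments one list can hold: limited both by the per-list memory cap
 * (in MB) and by a cap on the number of segments per list. */
static unsigned int segs_per_list(uint64_t max_mem_mb_per_list,
                                  uint64_t page_mb,
                                  unsigned int max_memseg_per_list)
{
    assert(page_mb > 0);
    uint64_t by_mem = max_mem_mb_per_list / page_mb;
    return by_mem < max_memseg_per_list ? (unsigned int)by_mem
                                        : max_memseg_per_list;
}

/* Smallest per-list memory cap (in MB) that lets a single list hold
 * n_pages pages, assuming they all have to land in one list. */
static uint64_t min_mem_mb_per_list(unsigned int n_pages, uint64_t page_mb)
{
    return (uint64_t)n_pages * page_mb;
}
```

For 1G pages (1024 MB), segs_per_list(32768, 1024, 8192) gives 32 and segs_per_list(65536, 1024, 8192) gives 64, matching the values above. Under the same assumptions, 48 pages on one socket need a per-list cap of at least 48 * 1024 = 49152 MB, so the required value grows with the number of pages per socket.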

     2. Why do we check for n free segments in a single list instead of across all lists?

     3. Is increasing the number of segments per list the correct way to handle this? What if we have more huge pages (resulting in more than 64 per socket)? Would we need to increase it again?



