* DPDK 19.11.5 Legacy Memory Design Query
@ 2022-09-14  7:30 Umakiran Godavarthi (ugodavar)
  2022-09-21  6:50 ` Umakiran Godavarthi (ugodavar)
  0 siblings, 1 reply; 11+ messages in thread
From: Umakiran Godavarthi (ugodavar) @ 2022-09-14  7:30 UTC (permalink / raw)
  To: anatoly.burakov, dev


Hi Anatoly/DPDK-Developers

I am working on DPDK 19.11.5 Legacy Memory design and have a query about how to boot up in Legacy memory mode.


  1.  The Linux kernel boots up with ‘N’ huge pages in total, and all ‘N’ of them are free initially.
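
The total and free counts from step 1 can be checked before going any further. Below is a minimal sketch, assuming the counters are read from /proc/meminfo-style text (the HugePages_Total and HugePages_Free lines); the struct and function names are hypothetical and are not part of our application or of DPDK.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Huge page counters as the kernel reports them in /proc/meminfo. */
struct hugepage_counts {
    long total; /* HugePages_Total */
    long free;  /* HugePages_Free */
};

/*
 * Scan meminfo-formatted text for one "Key:   value" line.
 * Returns the value, or -1 if no line starts with the key.
 */
static long meminfo_field(const char *text, const char *key)
{
    size_t key_len = strlen(key);
    const char *line = text;

    while (line != NULL && *line != '\0') {
        if (strncmp(line, key, key_len) == 0 && line[key_len] == ':') {
            long value;

            if (sscanf(line + key_len + 1, "%ld", &value) == 1)
                return value;
            return -1;
        }
        line = strchr(line, '\n');
        if (line != NULL)
            line++; /* step past the newline to the next line */
    }
    return -1;
}

/* Fill both counters; returns 0 on success, -1 if either one is missing. */
static int parse_hugepage_counts(const char *text, struct hugepage_counts *out)
{
    assert(text != NULL && out != NULL);
    out->total = meminfo_field(text, "HugePages_Total");
    out->free = meminfo_field(text, "HugePages_Free");
    return (out->total < 0 || out->free < 0) ? -1 : 0;
}
```

Feeding it the contents of /proc/meminfo right after boot should give total == free == N if step 1 holds.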


  2.  We calculate how many huge pages the data path needs (the driver needs buffers for all queues) by sorting the 2MB huge page memory fragments by size, and find that we need ‘X’ pages. We then set the kernel's nr_hugepages setting (/proc/sys/vm/nr_hugepages) to X pages. We also store the sorted physical memory fragments as POOL_0, POOL_1, POOL_2, etc. (the pool count only has to be enough for all ports and queues to initialize).

For example, suppose the host's huge pages are fragmented like this, for a total of 500 pages from the kernel reservation in step 1:

250, 90, 80, 70, 10 -> sum is 500 (N pages)

We need only 350 pages (based on the number of ports and queues the DPDK application needs).

So we take the 250, 90 and 10 page fragments.

That gives 3 pools in total: POOL_0 -> 250 pages, POOL_1 -> 90 pages, POOL_2 -> 10 pages
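
The selection in this example can be reproduced with a simple greedy pass. This is only a sketch under an assumed rule (walk the fragments largest first and take each one that still fits in the remaining demand); the function name select_pools is hypothetical, and the rule is inferred from the numbers above rather than taken from our actual code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Greedy selection (an assumed rule): frags[] is sorted largest first;
 * take every fragment that still fits in the remaining demand.
 * Chosen sizes are written to pools[] (capacity n).
 * Returns the number of pools, or -1 if the demand is not met exactly.
 */
static int select_pools(const unsigned *frags, size_t n, unsigned need,
                        unsigned *pools)
{
    int count = 0;
    unsigned remaining = need;

    assert(frags != NULL && pools != NULL);
    for (size_t i = 0; i < n && remaining > 0; i++) {
        if (frags[i] > 0 && frags[i] <= remaining) {
            pools[count++] = frags[i];
            remaining -= frags[i];
        }
    }
    return remaining == 0 ? count : -1;
}
```

With fragments 250, 90, 80, 70, 10 and a demand of 350, the pass takes 250 and 90, skips 80 and 70 because only 10 pages are still needed, and then takes 10.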


  3.  We boot up DPDK by calling rte_eal_init().


  4.  Then we walk the DPDK memory segment lists and, for each FBARRAY, find the pages used by DPDK and unmap the remaining pages with the code below. The idea is to free the huge pages held in the DPDK process's virtual memory: free huge pages will then be 0, because X pages are used by DPDK and all unneeded pages are freed in this step.
Sample code for step 4:

              /* dpdk_find_and_free_unused() is called once per memory segment list (MSL) */
              rte_memseg_list_walk_thread_unsafe(dpdk_find_and_free_unused, NULL);

              static int
              dpdk_find_and_free_unused(const struct rte_memseg_list *msl,
                                        void *arg UNUSED)
              {
                      struct rte_fbarray *arr;
                      void *addr;
                      int ms_idx;

                      /* the FBARRAY pointer is derived from the MSL */
                      arr = (struct rte_fbarray *) &msl->memseg_arr;

                      /*
                       * use size of 2 instead of 1 to find the next free slot but
                       * not hole.
                       */
                      ms_idx = rte_fbarray_find_next_n_free(arr, 0, 2);

                      if (ms_idx >= 0) {
                              addr = RTE_PTR_ADD(msl->base_va, ms_idx * msl->page_sz);
                              munmap(addr, RTE_PTR_DIFF(RTE_PTR_ADD(msl->base_va, msl->len), addr));
                      }
                      return 0; /* 0 continues the walk to the next list */
              }
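
To make the step 4 arithmetic easier to reason about, here is a small stand-alone model of it. It is a sketch only: a plain byte array stands in for the FBARRAY occupancy, find_next_n_free() merely mimics the role of rte_fbarray_find_next_n_free(), and all three helper names are hypothetical. Whether a real memory segment list can have used slots after the first run of two free slots is not established here; the model just makes that condition checkable.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the index of the first run of n consecutive free slots at or
 * after start, or -1 if there is none. used[i] != 0 marks a slot in use.
 */
static int find_next_n_free(const unsigned char *used, size_t len,
                            size_t start, size_t n)
{
    size_t run = 0;

    assert(used != NULL && n > 0);
    for (size_t i = start; i < len; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == n)
            return (int)(i + 1 - n);
    }
    return -1;
}

/* Bytes covered by an munmap() that starts at slot idx and runs to the end. */
static size_t tail_unmap_len(size_t len, size_t page_sz, int idx)
{
    return (len - (size_t)idx) * page_sz;
}

/* Count in-use slots at or after idx, i.e. used pages inside that tail. */
static size_t used_in_tail(const unsigned char *used, size_t len, int idx)
{
    size_t count = 0;

    for (size_t i = (size_t)idx; i < len; i++)
        count += used[i] ? 1 : 0;
    return count;
}
```

If used_in_tail() is ever non-zero for a list, the munmap in step 4 would also remove pages that are still marked as used in that list.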


  5.  With nr_hugepages at ‘X’ and free huge pages at 0, we create the mbuf pools using the RTE API, and this is where we hit a crash. (We make sure the ‘X’ pages are split into multiple pools according to the memory fragmentation given to the primary process, so that, pool by pool, DPDK should find a physically contiguous memory segment and allocate successfully.)

                   struct rte_mempool *pool;

                   pool = rte_pktmbuf_pool_create(name, num_mbufs,
                                   RTE_MIN(num_mbufs/4, MBUF_CACHE_SIZE),
                                   MBUF_PRIV_SIZE,
                                   frame_len + RTE_PKTMBUF_HEADROOM,
                                   rte_socket_id());
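
As a rough cross-check of how many pages one such pool needs, the sketch below computes a lower bound from the create parameters. The 128-byte mbuf header and 128-byte headroom are assumptions (typical defaults, not values confirmed for our build), the helper names are hypothetical, and the estimate ignores mempool object headers, the per-core cache, alignment, and objects that cannot straddle a page boundary, so the real requirement will be higher.

```c
#include <assert.h>
#include <stddef.h>

#define HUGEPAGE_SZ   (2u * 1024u * 1024u) /* 2MB huge pages */
#define MBUF_HDR_SZ   128u /* assumed sizeof(struct rte_mbuf) */
#define MBUF_HEADROOM 128u /* assumed RTE_PKTMBUF_HEADROOM */

/* Lower bound on the bytes occupied by the mbufs themselves. */
static size_t pool_min_bytes(size_t num_mbufs, size_t priv_size,
                             size_t frame_len)
{
    assert(num_mbufs > 0);
    return num_mbufs * (MBUF_HDR_SZ + priv_size + frame_len + MBUF_HEADROOM);
}

/* The same lower bound, rounded up to whole 2MB pages. */
static size_t pool_min_pages(size_t num_mbufs, size_t priv_size,
                             size_t frame_len)
{
    size_t bytes = pool_min_bytes(num_mbufs, priv_size, frame_len);

    return (bytes + HUGEPAGE_SZ - 1) / HUGEPAGE_SZ;
}
```

For a pool of 46080 mbufs with frame_len 2048, and assuming a private size of 0, this gives 106168320 bytes, or at least 51 pages.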



  6.  Sometimes, at random, pool creation in step 5 crashes for one of the pools stored in step 2, while all ports and queues are being initialized.

               The core dump shows a backtrace in the DPDK EAL like this:

    #6  sigcrash (signo=11, info=0x7fff1c1867f0, ctx=0x7fff1c1866c0)
    #7
    #8  malloc_elem_can_hold () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #9  find_suitable_element () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #10  malloc_heap_alloc () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #11  rte_malloc_socket () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #12  rte_mempool_create_empty () from ./usr/lib64/dpdk-19/librte_mempool.so.20.0
    #13  rte_pktmbuf_pool_create_by_ops () from ./usr/lib64/dpdk-19/librte_mbuf.so.20.0
    #14  rte_pktmbuf_pool_create () from ./usr/lib64/dpdk-19/librte_mbuf.so.20.0
    #15  dpdk_create_mbuf_pool (mem_chunk_tbl=0x555d556de8e0 , num_mbufs=46080, frame_len=2048, name=0x7fff1c1873c0 "DPDK_POOL_0")

                We see that find_suitable_element() is unable to find a suitable element in the DPDK memory segment lists it searches during heap allocation. It returns NULL, and the NULL dereference crashes the boot-up process.

Please share any comments on steps 1-6 of this boot-up process, and any likely reason for the crash.

We suspect step 4. When the unused FBARRAY pages are freed at the end, it should be the least contiguous memory segments that get freed, right? (After finding 2 free pages, we munmap the entire remaining length in step 4 to free the virtual memory.)

Please also share your thoughts on the FBARRAY design: is it expected to map segments from most contiguous to least contiguous within the virtual address space?

If so, we believe our most contiguous segments from step 2 remain safe even after steps 3 and 4. Please correct my understanding if anything is wrong.


Thanks
Umakiran



end of thread, other threads:[~2022-10-10 15:15 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-09-14  7:30 DPDK 19.11.5 Legacy Memory Design Query Umakiran Godavarthi (ugodavar)
2022-09-21  6:50 ` Umakiran Godavarthi (ugodavar)
2022-09-22  8:08   ` Umakiran Godavarthi (ugodavar)
2022-09-22  9:00     ` Dmitry Kozlyuk
2022-09-23 11:20       ` Umakiran Godavarthi (ugodavar)
2022-09-23 11:47         ` Dmitry Kozlyuk
2022-09-23 12:12           ` Umakiran Godavarthi (ugodavar)
2022-09-23 13:10             ` Dmitry Kozlyuk
2022-09-26 12:55               ` Umakiran Godavarthi (ugodavar)
2022-09-26 13:06                 ` Umakiran Godavarthi (ugodavar)
2022-10-10 15:15                   ` Dmitry Kozlyuk
