From: "Umakiran Godavarthi (ugodavar)" <ugodavar@cisco.com>
To: "anatoly.burakov@intel.com" <anatoly.burakov@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Subject: DPDK 19.11.5 Legacy Memory Design Query
Date: Wed, 14 Sep 2022 07:30:54 +0000	[thread overview]
Message-ID: <SJ0PR11MB484677B272ABA8D4C336925CDD469@SJ0PR11MB4846.namprd11.prod.outlook.com> (raw)


Hi Anatoly/DPDK-Developers

I am working with the DPDK 19.11.5 legacy memory design and have a query about how we boot up in legacy memory mode.


  1.  The Linux kernel boots up with ‘N’ huge pages reserved, all ‘N’ initially free.


  2.  We calculate the huge pages we need for the data path (the driver needs buffers for all queues) by sorting the 2MB huge-page memory fragments; we find we need ‘X’ pages, so we set the kernel sysfs attribute nr_hugepages (/proc/sys/vm/nr_hugepages) to X. We also store the sorted physical memory fragments as POOL_0, POOL_1, POOL_2, etc. (just to get a pool count sufficient for all ports and queues to initialize).

For example, suppose the host's huge-page memory is fragmented like this, totalling the 500 pages we get from the step-1 kernel reservation:

250, 90, 80, 70, 10 -> sum is 500 (N pages)

We need only 350 pages (350 based on the number of ports and queues the DPDK application needs),

so we pick 250, 90 and 10.

That gives 3 pools in total: POOL_0 -> 250 pages, POOL_1 -> 90 pages, POOL_2 -> 10 pages.


  3.  We boot up DPDK with rte_eal_init().


  4.  Then we walk the DPDK memory segment lists and, for each FBARRAY, find the pages DPDK actually uses and unmap the rest with the code below (the idea is to release the huge pages held in the DPDK process's virtual memory). free_hugepages then drops to 0, since X pages are in use by DPDK and all unneeded pages are freed in this step.

Sample code for step 4:

              rte_memseg_list_walk_thread_unsafe(dpdk_find_and_free_unused, NULL); /* dpdk_find_and_free_unused() is called for each memory segment list; the FBARRAY pointer is derived from the MSL as below */

              static int
              dpdk_find_and_free_unused(const struct rte_memseg_list *msl,
                                        void *arg __rte_unused)
              {
                      int ms_idx;
                      void *addr;
                      struct rte_fbarray *arr =
                              (struct rte_fbarray *)&msl->memseg_arr;

                      /*
                       * Use a run length of 2 instead of 1 to find the next
                       * free slot rather than a hole.
                       */
                      ms_idx = rte_fbarray_find_next_n_free(arr, 0, 2);

                      if (ms_idx >= 0) {
                              addr = RTE_PTR_ADD(msl->base_va,
                                                 ms_idx * msl->page_sz);
                              munmap(addr, RTE_PTR_DIFF(
                                      RTE_PTR_ADD(msl->base_va, msl->len),
                                      addr));
                      }
                      return 0;
              }


  5.  With nr_hugepages as ‘X’ and free_hugepages as 0, we create mbuf pools using the RTE API, and we hit a crash. (We make sure the ‘X’ pages are split into multiple pools matching the memory fragmentation given to the primary process, so pool by pool we are confident DPDK can find physically contiguous memory segments and allocate successfully.)

                    struct rte_mempool *pool;

                    pool = rte_pktmbuf_pool_create(name, num_mbufs,
                                    RTE_MIN(num_mbufs / 4, MBUF_CACHE_SIZE),
                                    MBUF_PRIV_SIZE,
                                    frame_len + RTE_PKTMBUF_HEADROOM,
                                    rte_socket_id());



  6.  Sometimes we randomly hit a crash during pool creation in step 5, for one of the pools stored in step 2, while initializing the ports and queues later on.

               The DPDK EAL core produces a backtrace like this:

    #6  sigcrash (signo=11, info=0x7fff1c1867f0, ctx=0x7fff1c1866c0)
    #7
    #8  malloc_elem_can_hold () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #9  find_suitable_element () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #10  malloc_heap_alloc () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #11  rte_malloc_socket () from ./usr/lib64/dpdk-19/librte_eal.so.20.0
    #12  rte_mempool_create_empty () from ./usr/lib64/dpdk-19/librte_mempool.so.20.0
    #13  rte_pktmbuf_pool_create_by_ops () from ./usr/lib64/dpdk-19/librte_mbuf.so.20.0
    #14  rte_pktmbuf_pool_create () from ./usr/lib64/dpdk-19/librte_mbuf.so.20.0
    #15  dpdk_create_mbuf_pool (mem_chunk_tbl=0x555d556de8e0 , num_mbufs=46080, frame_len=2048, name=0x7fff1c1873c0 "DPDK_POOL_0")

                We see that find_suitable_element() is unable to find a suitable element in the DPDK memory segment lists it searches for the heap allocation; it returns NULL, and the NULL dereference crashes the boot-up process.

Please let me know any comments on the boot-up process (steps 1-6) and any likely reason for the crash.

We suspect step 4: freeing the unused FBARRAY pages at the end should release only the least contiguous memory segments, right? (After finding the first run of 2 free pages, we munmap the entire remaining length in step 4 to free virtual memory.)

Please share your thoughts on the FBARRAY design: is it expected to map segments from most contiguous to least contiguous in virtual address space?

If so, we believe our most contiguous segments from step 2 are safe even after steps 3 and 4. Please correct my understanding if anything is wrong.


Thanks
Umakiran


