* [dpdk-dev] [Bug 389] Crash in librte_kni driver due to noncontiguous pages
@ 2020-02-03 13:32 bugzilla
  0 siblings, 0 replies; only message in thread
From: bugzilla @ 2020-02-03 13:32 UTC (permalink / raw)
  To: dev

https://bugs.dpdk.org/show_bug.cgi?id=389

            Bug ID: 389
           Summary: Crash in librte_kni driver due to noncontiguous pages
           Product: DPDK
           Version: 18.11
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: critical
          Priority: Normal
         Component: core
          Assignee: dev@dpdk.org
          Reporter: scott_wasson@affirmednetworks.com
  Target Milestone: ---

We’re seeing a continuous crash since upgrading to 18.11; the kni FIFOs
apparently aren’t contiguous.  From user space’s perspective, the kni’s tx_q
straddles the 2MB page boundary at 0x17a600000.  The mbuf pointers in the ring
before this address are valid.  The tx_q’s write pointer indicates that there
are mbufs at 0x17a600000 and beyond, but those pointers are all NULL.
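
The straddling condition described above can be checked with a small helper.
This is only a sketch: the 2MB page size is taken from the report, and
HUGEPAGE_SIZE and straddles_hugepage() are hypothetical names, not DPDK APIs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed hugepage size: 2MB, matching the boundary described above. */
#define HUGEPAGE_SIZE (UINT64_C(2) * 1024 * 1024)

/*
 * Hypothetical helper: returns true if the byte range [addr, addr + len)
 * covers more than one hugepage.
 */
static bool
straddles_hugepage(uint64_t addr, size_t len)
{
	if (len == 0)
		return false;

	uint64_t first_page = addr / HUGEPAGE_SIZE;
	uint64_t last_page = (addr + len - 1) / HUGEPAGE_SIZE;

	return first_page != last_page;
}
```

A FIFO that starts 64 bytes below 0x17a600000 and is 128 bytes long would be
reported as straddling; one that ends exactly at the boundary would not.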

Because the rte_kni kernel module is loaded, the following workaround in eal.c
takes effect:

      /* Workaround for KNI which requires physical address to work */
      if (iova_mode == RTE_IOVA_VA &&
          rte_eal_check_module("rte_kni") == 1) {
              if (phys_addrs) {
                 iova_mode = RTE_IOVA_PA;

As a result, iova_mode is automatically forced to PA.

We determined that enabling --legacy-mem made the problem go away.  However,
it also moved the kni’s data structures, so they no longer straddle a
hugepage boundary.  Our concern is that the furniture may move around again
and bring us back to where we were.  Being tied to --legacy-mem is
undesirable in the long term anyway.

We also found that the following code patch helps (even without --legacy-mem):

index 3d2ffb2..5cc9d69 100644
--- a/lib/librte_kni/rte_kni.c
+++ b/lib/librte_kni/rte_kni.c
@@ -143,31 +143,31 @@ kni_reserve_mz(struct rte_kni *kni)
        char mz_name[RTE_MEMZONE_NAMESIZE];
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_TX_Q_MZ_NAME_FMT, kni->name);
-       kni->m_tx_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_tx_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_tx_q == NULL, tx_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_RX_Q_MZ_NAME_FMT, kni->name);
-       kni->m_rx_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_rx_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_rx_q == NULL, rx_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_ALLOC_Q_MZ_NAME_FMT, kni->name);
-       kni->m_alloc_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_alloc_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_alloc_q == NULL, alloc_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_FREE_Q_MZ_NAME_FMT, kni->name);
-       kni->m_free_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_free_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_free_q == NULL, free_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_REQ_Q_MZ_NAME_FMT, kni->name);
-       kni->m_req_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_req_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_req_q == NULL, req_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_RESP_Q_MZ_NAME_FMT, kni->name);
-       kni->m_resp_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_resp_q = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_resp_q == NULL, resp_q_fail);
        snprintf(mz_name, RTE_MEMZONE_NAMESIZE, KNI_SYNC_ADDR_MZ_NAME_FMT, kni->name);
-       kni->m_sync_addr = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, 0);
+       kni->m_sync_addr = rte_memzone_reserve(mz_name, KNI_FIFO_SIZE, SOCKET_ID_ANY, RTE_MEMZONE_IOVA_CONTIG);
        KNI_MEM_CHECK(kni->m_sync_addr == NULL, sync_addr_fail);
        return 0;

With the patch applied and --legacy-mem removed, the tx_q still straddles the
same 2MB page boundary, yet everything now seems OK.

This would seem to follow precedent in rte_mempool.c:

    /* if we're trying to reserve contiguous memory, add appropriate
     * memzone flag.
     */
    if (try_contig)
        flags |= RTE_MEMZONE_IOVA_CONTIG;

which I think explains why our mbufs haven’t seen data truncation issues.

Could you please explain why RTE_MEMZONE_IOVA_CONTIG is necessary in PA mode?
Isn’t contiguousness a fundamental property of physical addressing?

Are we still potentially vulnerable with --legacy-mem and without the above
code change?  Did we just get lucky because the furniture moved and doesn’t
straddle a page boundary at the moment?
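
To make the first question concrete, here is a minimal sketch of how a buffer
that is contiguous in virtual addresses can be backed by hugepages that are
not adjacent in physical memory.  It assumes, without confirmation from the
kernel module source, that the consumer reaches the FIFO as "physical base
plus offset".  The page table, the physical addresses and every name in it
(page_map, virt_to_phys_sim, linear_phys) are hypothetical.

```c
#include <stdint.h>

/* Assumed hugepage size: 2MB. */
#define HUGEPAGE_SZ UINT64_C(0x200000)

/* Hypothetical mapping of one virtual hugepage to one physical hugepage. */
struct page_map {
	uint64_t va;
	uint64_t pa;
};

/* Translate a virtual address through the table; returns 0 if unmapped. */
static uint64_t
virt_to_phys_sim(const struct page_map *tbl, int n, uint64_t va)
{
	for (int i = 0; i < n; i++) {
		if (va >= tbl[i].va && va - tbl[i].va < HUGEPAGE_SZ)
			return tbl[i].pa + (va - tbl[i].va);
	}
	return 0;
}

/*
 * Address computed by a consumer that only knows the physical address of
 * the buffer start and assumes the buffer is physically linear.
 */
static uint64_t
linear_phys(const struct page_map *tbl, int n, uint64_t base_va,
	    uint64_t offset)
{
	return virt_to_phys_sim(tbl, n, base_va) + offset;
}
```

With two virtually adjacent pages mapped to distant physical pages, the two
functions agree for offsets before the page boundary and disagree after it,
which would be consistent with valid entries before 0x17a600000 and unrelated
memory beyond it.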

We also tested with 19.11 and did not see the crash.  However, the 19.11
release notes say:

* Changed mempool allocation behavior.

Changed the mempool allocation behaviour so that objects no longer cross pages
by default. Note, this may consume more memory when using small memory pages.
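
The placement rule quoted above can be sketched as follows.  This is an
illustration of the idea only, not DPDK's mempool code; place_object() is a
hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: given the next free offset in a memory chunk, return
 * the offset at which an object of obj_size bytes should be placed so that
 * it does not cross a page boundary.  Objects larger than a page cannot
 * avoid crossing and are left where they are.
 */
static uint64_t
place_object(uint64_t offset, uint64_t obj_size, uint64_t page_size)
{
	uint64_t page_end = (offset / page_size + 1) * page_size;

	if (obj_size <= page_size && offset + obj_size > page_end)
		return page_end; /* skip the tail of the current page */

	return offset;
}
```

The bytes skipped at the end of each page are the extra memory the release
note warns about, and they make up a larger share of the pool when pages are
small.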


Thanks!

-Scott

