From: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
To: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: netdev@vger.kernel.org,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>,
	"Tariq Toukan" <tariqt@mellanox.com>,
	toshiaki.makita1@gmail.com, grygorii.strashko@ti.com,
	mcroce@redhat.com
Subject: Re: [PATCH net-next v1 08/11] xdp: tracking page_pool resources and safe removal
Date: Sat, 15 Jun 2019 12:33:40 +0300	[thread overview]
Message-ID: <20190615093339.GB3771@khorivan> (raw)
In-Reply-To: <156045052249.29115.2357668905441684019.stgit@firesoul>

On Thu, Jun 13, 2019 at 08:28:42PM +0200, Jesper Dangaard Brouer wrote:
Hi, Jesper

>This patch is needed before we can allow drivers to use page_pool for
>DMA-mappings. Today, with page_pool and the XDP return API, it is possible
>to remove the page_pool object (from the rhashtable) while there are still
>in-flight packet-pages. This is safely handled via RCU: failed lookups in
>__xdp_return() fall back to calling put_page() when the page_pool object
>is gone. But in case the page is still DMA-mapped, this results in the
>page not getting correctly DMA-unmapped.
>
>To solve this, the page_pool is extended with tracking of in-flight pages,
>and the XDP disconnect system queries the page_pool and waits, via a
>workqueue, for all in-flight pages to be returned.
>
>To avoid killing performance when tracking in-flight pages, the
>implementation uses two (unsigned) counters, placed on different
>cache-lines, from which the number of in-flight packets can be deduced.
>This is done by mapping the unsigned "sequence" counters onto signed
>two's-complement arithmetic operations. The same trick is used by the
>kernel's time_after() macros, described in kernel commits 1ba3aab3033b
>and 5a581b367b5, and also explained in RFC 1982.
>
>The trick is that these two incrementing counters only need to be read and
>compared when checking whether it is safe to free the page_pool structure,
>which only happens after the driver has disconnected the RX/alloc side.
>Thus, this is not on the fast-path.
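
Just to check that I read the counter trick right, a minimal sketch with
hypothetical field/helper names (the real patch may differ):

  /* hold_cnt is bumped on the alloc side, release_cnt on the return
   * side; each lives on its own cache-line.  Casting the unsigned
   * difference to signed gives the in-flight count even across u32
   * wrap-around, like time_after() / RFC 1982.
   */
  static inline s32 pool_inflight(u32 hold_cnt, u32 release_cnt)
  {
          return (s32)(hold_cnt - release_cnt);
  }

  /* Freeing the pool is only safe once nothing is in flight: */
  if (pool_inflight(hold_cnt, release_cnt) <= 0)
          /* ... free the page_pool ... */;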
>
>It is chosen that page_pool tracking is also enabled for the non-DMA
>use-case, as this can be used for statistics later.
>
>After this patch, using page_pool requires a more strict resource
>"release", e.g. via page_pool_release_page(), which was introduced in this
>patchset; the previous patches implement/fix this stricter requirement.
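
If I read the stricter "release" requirement right, a driver handing a
page out of page_pool control would do roughly this (a sketch only; I am
assuming page_pool_release_page() takes the pool and the page):

  /* Unmap the DMA address and drop the pool's in-flight tracking
   * before the page leaves page_pool control; after this it is an
   * ordinary page, refcounted via put_page().
   */
  page_pool_release_page(pool, page);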
>
>Drivers no longer call page_pool_destroy(). Drivers already call
>xdp_rxq_info_unreg(), which calls xdp_rxq_info_unreg_mem_model(), which
>will attempt to disconnect the mem id and, if the attempt fails, schedule
>the disconnect for later via a delayed workqueue.
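
So the deferred path, as I understand it, looks roughly like below (sketch
only; the function name and the page_pool_inflight() helper are assumed
from the description):

  static void mem_id_disconnect_defer_retry(struct work_struct *wq)
  {
          struct delayed_work *dwq = to_delayed_work(wq);
          struct xdp_mem_allocator *xa =
                  container_of(dwq, struct xdp_mem_allocator, defer_wq);

          /* Pages still in flight: re-check later instead of blocking */
          if (xa->mem.type == MEM_TYPE_PAGE_POOL &&
              page_pool_inflight(xa->page_pool) > 0) {
                  schedule_delayed_work(&xa->defer_wq, HZ);
                  return;
          }
          call_rcu(&xa->rcu, __xdp_mem_allocator_rcu_free);
  }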
>
>Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>---
> drivers/net/ethernet/mellanox/mlx5/core/en_main.c |    3 -
> include/net/page_pool.h                           |   41 ++++++++++---
> net/core/page_pool.c                              |   62 +++++++++++++++-----
> net/core/xdp.c                                    |   65 +++++++++++++++++++--
> 4 files changed, 136 insertions(+), 35 deletions(-)
>
>diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>index 2f647be292b6..6c9d4d7defbc 100644
>--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c

[...]

>--- a/net/core/xdp.c
>+++ b/net/core/xdp.c
>@@ -38,6 +38,7 @@ struct xdp_mem_allocator {
> 	};
> 	struct rhash_head node;
> 	struct rcu_head rcu;
>+	struct delayed_work defer_wq;
> };
>
> static u32 xdp_mem_id_hashfn(const void *data, u32 len, u32 seed)
>@@ -79,13 +80,13 @@ static void __xdp_mem_allocator_rcu_free(struct rcu_head *rcu)
>
> 	xa = container_of(rcu, struct xdp_mem_allocator, rcu);
>
>+	/* Allocator have indicated safe to remove before this is called */
>+	if (xa->mem.type == MEM_TYPE_PAGE_POOL)
>+		page_pool_free(xa->page_pool);
>+

What would you recommend doing in the following situation:

The same receive queue is shared between two network devices. The receive
ring is filled with pages from the page_pool, but you don't know which
port (ndev) a page will end up serving, because the device is recognized
only after the packet is received.

The API is designed so that an xdp rxq is bound to a network device and
each frame holds a reference to it, so the rxq's ndev must be static. That
means each netdev has its own rxq instance even when there is no need for
one. Thus, after your changes, a page must be returned to the pool it was
taken from, or released from the old pool and recycled in the new one
somehow.

And that is an inconvenience, at the least. It's hard to move pages
between pools without a performance penalty. There is no way to use a
common pool either, as unreg_rxq now drops the pool, and two rxqs can't
reference the same pool.
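
Roughly, the setup looks like this (a sketch; the pool/rxq names are
placeholders):

  /* One HW RX ring feeds two ports, but the API forces one rxq (and
   * one mem-model registration) per ndev.
   */
  xdp_rxq_info_reg(&rxq0, ndev0, 0);
  xdp_rxq_info_reg(&rxq1, ndev1, 0);

  /* Both rxqs would have to reference the one pool that fills the
   * shared ring ...
   */
  xdp_rxq_info_reg_mem_model(&rxq0, MEM_TYPE_PAGE_POOL, pool);
  xdp_rxq_info_reg_mem_model(&rxq1, MEM_TYPE_PAGE_POOL, pool);

  /* ... but unregistering either rxq now disconnects and eventually
   * frees that shared pool out from under the other one.
   */
  xdp_rxq_info_unreg(&rxq0);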

-- 
Regards,
Ivan Khoronzhuk

Thread overview: 24+ messages
2019-06-13 18:28 [PATCH net-next v1 00/11] xdp: page_pool fixes and in-flight accounting Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 01/11] net: page_pool: add helper function to retrieve dma addresses Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 02/11] net: page_pool: add helper function to unmap " Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 03/11] xdp: fix leak of IDA cyclic id if rhashtable_insert_slow fails Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 04/11] xdp: page_pool related fix to cpumap Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 05/11] veth: use xdp_release_frame for XDP_PASS Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 06/11] page_pool: introduce page_pool_free and use in mlx5 Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 07/11] mlx5: more strict use of page_pool API Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 08/11] xdp: tracking page_pool resources and safe removal Jesper Dangaard Brouer
2019-06-14 11:01   ` Ilias Apalodimas
2019-06-15  9:33   ` Ivan Khoronzhuk [this message]
2019-06-16 10:56     ` Tariq Toukan
2019-06-18 12:54       ` Ivan Khoronzhuk
2019-06-18 13:30         ` Ilias Apalodimas
2019-06-18 13:48           ` Ivan Khoronzhuk
2019-06-18 15:19         ` Jesper Dangaard Brouer
2019-06-18 17:54           ` Ivan Khoronzhuk
2019-06-19 11:12             ` Ivan Khoronzhuk
2019-06-13 18:28 ` [PATCH net-next v1 09/11] xdp: force mem allocator removal and periodic warning Jesper Dangaard Brouer
2019-06-15  8:59   ` Ivan Khoronzhuk
2019-06-18  8:37     ` Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 10/11] xdp: add tracepoints for XDP mem Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 11/11] page_pool: add tracepoints for page_pool with details need by XDP Jesper Dangaard Brouer
2019-06-15  2:41 ` [PATCH net-next v1 00/11] xdp: page_pool fixes and in-flight accounting David Miller
