netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Tariq Toukan <tariqt@mellanox.com>
To: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>,
	Jesper Dangaard Brouer <brouer@redhat.com>
Cc: "netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Toke Høiland-Jørgensen" <toke@toke.dk>,
	"toshiaki.makita1@gmail.com" <toshiaki.makita1@gmail.com>,
	"grygorii.strashko@ti.com" <grygorii.strashko@ti.com>,
	"mcroce@redhat.com" <mcroce@redhat.com>
Subject: Re: [PATCH net-next v1 08/11] xdp: tracking page_pool resources and safe removal
Date: Sun, 16 Jun 2019 10:56:25 +0000	[thread overview]
Message-ID: <a02856c1-46e7-4691-6bb9-e0efb388981f@mellanox.com> (raw)
In-Reply-To: <20190615093339.GB3771@khorivan>



On 6/15/2019 12:33 PM, Ivan Khoronzhuk wrote:
> On Thu, Jun 13, 2019 at 08:28:42PM +0200, Jesper Dangaard Brouer wrote:
> Hi, Jesper
> 
>> This patch is needed before we can allow drivers to use page_pool for
>> DMA-mappings. Today with page_pool and XDP return API, it is possible to
>> remove the page_pool object (from rhashtable), while there are still
>> in-flight packet-pages. This is safely handled via RCU and failed 
>> lookups in
>> __xdp_return() fallback to call put_page(), when page_pool object is 
>> gone.
>> In-case page is still DMA mapped, this will result in page note getting
>> correctly DMA unmapped.
>>
>> To solve this, the page_pool is extended with tracking in-flight 
>> pages. And
>> XDP disconnect system queries page_pool and waits, via workqueue, for all
>> in-flight pages to be returned.
>>
>> To avoid killing performance when tracking in-flight pages, the implement
>> use two (unsigned) counters, that in placed on different cache-lines, and
>> can be used to deduct in-flight packets. This is done by mapping the
>> unsigned "sequence" counters onto signed Two's complement arithmetic
>> operations. This is e.g. used by kernel's time_after macros, described in
>> kernel commit 1ba3aab3033b and 5a581b367b5, and also explained in 
>> RFC1982.
>>
>> The trick is these two incrementing counters only need to be read and
>> compared, when checking if it's safe to free the page_pool structure. 
>> Which
>> will only happen when driver have disconnected RX/alloc side. Thus, on a
>> non-fast-path.
>>
>> It is chosen that page_pool tracking is also enabled for the non-DMA
>> use-case, as this can be used for statistics later.
>>
>> After this patch, using page_pool requires more strict resource 
>> "release",
>> e.g. via page_pool_release_page() that was introduced in this 
>> patchset, and
>> previous patches implement/fix this more strict requirement.
>>
>> Drivers no-longer call page_pool_destroy(). Drivers already call
>> xdp_rxq_info_unreg() which call xdp_rxq_info_unreg_mem_model(), which 
>> will
>> attempt to disconnect the mem id, and if attempt fails schedule the
>> disconnect for later via delayed workqueue.
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
>> drivers/net/ethernet/mellanox/mlx5/core/en_main.c |    3 -
>> include/net/page_pool.h                           |   41 ++++++++++---
>> net/core/page_pool.c                              |   62 
>> +++++++++++++++-----
>> net/core/xdp.c                                    |   65 
>> +++++++++++++++++++--
>> 4 files changed, 136 insertions(+), 35 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c 
>> b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> index 2f647be292b6..6c9d4d7defbc 100644
>> --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
> 
> [...]
> 
>> --- a/net/core/xdp.c
>> +++ b/net/core/xdp.c
>> @@ -38,6 +38,7 @@ struct xdp_mem_allocator {
>>     };
>>     struct rhash_head node;
>>     struct rcu_head rcu;
>> +    struct delayed_work defer_wq;
>> };
>>
>> static u32 xdp_mem_id_hashfn(const void *data, u32 len, u32 seed)
>> @@ -79,13 +80,13 @@ static void __xdp_mem_allocator_rcu_free(struct 
>> rcu_head *rcu)
>>
>>     xa = container_of(rcu, struct xdp_mem_allocator, rcu);
>>
>> +    /* Allocator have indicated safe to remove before this is called */
>> +    if (xa->mem.type == MEM_TYPE_PAGE_POOL)
>> +        page_pool_free(xa->page_pool);
>> +
> 
> What would you recommend to do for the following situation:
> 
> Same receive queue is shared between 2 network devices. The receive ring is
> filled by pages from page_pool, but you don't know the actual port (ndev)
> filling this ring, because a device is recognized only after packet is 
> received.
> 
> The API is so that xdp rxq is bind to network device, each frame has 
> reference
> on it, so rxq ndev must be static. That means each netdev has it's own rxq
> instance even no need in it. Thus, after your changes, page must be 
> returned to
> the pool it was taken from, or released from old pool and recycled in 
> new one
> somehow.
> 
> And that is inconvenience at least. It's hard to move pages between 
> pools w/o
> performance penalty. No way to use common pool either, as unreg_rxq now 
> drops
> the pool and 2 rxqa can't reference same pool.
> 

Within the single netdev, separate page_pool instances are anyway 
created for different RX rings, working under different NAPI's.
So I don't understand your comment above about breaking some 
multi-netdev pool sharing use case...

Tariq

  reply	other threads:[~2019-06-16 10:56 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-13 18:28 [PATCH net-next v1 00/11] xdp: page_pool fixes and in-flight accounting Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 01/11] net: page_pool: add helper function to retrieve dma addresses Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 02/11] net: page_pool: add helper function to unmap " Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 03/11] xdp: fix leak of IDA cyclic id if rhashtable_insert_slow fails Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 04/11] xdp: page_pool related fix to cpumap Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 05/11] veth: use xdp_release_frame for XDP_PASS Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 06/11] page_pool: introduce page_pool_free and use in mlx5 Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 07/11] mlx5: more strict use of page_pool API Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 08/11] xdp: tracking page_pool resources and safe removal Jesper Dangaard Brouer
2019-06-14 11:01   ` Ilias Apalodimas
2019-06-15  9:33   ` Ivan Khoronzhuk
2019-06-16 10:56     ` Tariq Toukan [this message]
2019-06-18 12:54       ` Ivan Khoronzhuk
2019-06-18 13:30         ` Ilias Apalodimas
2019-06-18 13:48           ` Ivan Khoronzhuk
2019-06-18 15:19         ` Jesper Dangaard Brouer
2019-06-18 17:54           ` Ivan Khoronzhuk
2019-06-19 11:12             ` Ivan Khoronzhuk
2019-06-13 18:28 ` [PATCH net-next v1 09/11] xdp: force mem allocator removal and periodic warning Jesper Dangaard Brouer
2019-06-15  8:59   ` Ivan Khoronzhuk
2019-06-18  8:37     ` Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 10/11] xdp: add tracepoints for XDP mem Jesper Dangaard Brouer
2019-06-13 18:28 ` [PATCH net-next v1 11/11] page_pool: add tracepoints for page_pool with details need by XDP Jesper Dangaard Brouer
2019-06-15  2:41 ` [PATCH net-next v1 00/11] xdp: page_pool fixes and in-flight accounting David Miller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=a02856c1-46e7-4691-6bb9-e0efb388981f@mellanox.com \
    --to=tariqt@mellanox.com \
    --cc=brouer@redhat.com \
    --cc=grygorii.strashko@ti.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=ivan.khoronzhuk@linaro.org \
    --cc=mcroce@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=toke@toke.dk \
    --cc=toshiaki.makita1@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).