From: Ilias Apalodimas <ilias.apalodimas@linaro.org>
To: Mina Almasry <almasrymina@google.com>
Cc: "Shailend Chand" <shailend@google.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"David Ahern" <dsahern@kernel.org>,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Shakeel Butt" <shakeelb@google.com>
Subject: Re: [net-next v1 02/16] net: page_pool: create hooks for custom page providers
Date: Tue, 12 Dec 2023 10:07:18 +0200	[thread overview]
Message-ID: <CAC_iWjKikzwpjR0hBjYuRxgYjyqp_EYrrxoveB_2DgCxk6vWYw@mail.gmail.com> (raw)
In-Reply-To: <20231208005250.2910004-3-almasrymina@google.com>

Hi Mina,

Apologies for not joining the party earlier.

On Fri, 8 Dec 2023 at 02:52, Mina Almasry <almasrymina@google.com> wrote:
>
> From: Jakub Kicinski <kuba@kernel.org>
>
> The page providers which try to reuse the same pages will
> need to hold onto the ref, even if the page gets released from
> the pool - as releasing the page from the pp just transfers
> the "ownership" reference from pp to the provider, and the provider
> will wait for other references to be gone before feeding this
> page back into the pool.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
> Signed-off-by: Mina Almasry <almasrymina@google.com>
>
> ---
>
> This is implemented by Jakub in his RFC:
> https://lore.kernel.org/netdev/f8270765-a27b-6ccf-33ea-cda097168d79@redhat.com/T/
>
> I take no credit for the idea or implementation; I only added minor
> edits to make this workable with device memory TCP, and removed some
> hacky test code. This is a critical dependency of device memory TCP
> and thus I'm pulling it into this series to make it reviewable and
> mergeable.
>
> RFC v3 -> v1
> - Removed unused mem_provider. (Yunsheng).
> - Replaced memory_provider & mp_priv with netdev_rx_queue (Jakub).
>
> ---
>  include/net/page_pool/types.h | 12 ++++++++++
>  net/core/page_pool.c          | 43 +++++++++++++++++++++++++++++++----
>  2 files changed, 50 insertions(+), 5 deletions(-)
>
> diff --git a/include/net/page_pool/types.h b/include/net/page_pool/types.h
> index ac286ea8ce2d..0e9fa79a5ef1 100644
> --- a/include/net/page_pool/types.h
> +++ b/include/net/page_pool/types.h
> @@ -51,6 +51,7 @@ struct pp_alloc_cache {
>   * @dev:       device, for DMA pre-mapping purposes
>   * @netdev:    netdev this pool will serve (leave as NULL if none or multiple)
>   * @napi:      NAPI which is the sole consumer of pages, otherwise NULL
> + * @queue:     struct netdev_rx_queue this page_pool is being created for.
>   * @dma_dir:   DMA mapping direction
>   * @max_len:   max DMA sync memory size for PP_FLAG_DMA_SYNC_DEV
>   * @offset:    DMA sync address offset for PP_FLAG_DMA_SYNC_DEV
> @@ -63,6 +64,7 @@ struct page_pool_params {
>                 int             nid;
>                 struct device   *dev;
>                 struct napi_struct *napi;
> +               struct netdev_rx_queue *queue;
>                 enum dma_data_direction dma_dir;
>                 unsigned int    max_len;
>                 unsigned int    offset;
> @@ -125,6 +127,13 @@ struct page_pool_stats {
>  };
>  #endif
>
> +struct memory_provider_ops {
> +       int (*init)(struct page_pool *pool);
> +       void (*destroy)(struct page_pool *pool);
> +       struct page *(*alloc_pages)(struct page_pool *pool, gfp_t gfp);
> +       bool (*release_page)(struct page_pool *pool, struct page *page);
> +};
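
(For anyone new to the hooks, a provider would plug into these roughly as in
the sketch below. This is purely illustrative -- none of the foo_* names
exist in the patch.)

/* Hypothetical provider, for illustration only. */
static int foo_mp_init(struct page_pool *pool)
{
	/* Set up provider state; the patch stashes it in pool->mp_priv. */
	return 0;
}

static void foo_mp_destroy(struct page_pool *pool)
{
	/* Tear down whatever foo_mp_init() set up. */
}

static struct page *foo_mp_alloc_pages(struct page_pool *pool, gfp_t gfp)
{
	/* Hand the pool a page the provider already holds a reference on. */
	return NULL;	/* stub */
}

static bool foo_mp_release_page(struct page_pool *pool, struct page *page)
{
	/* Keep the "ownership" ref in the provider; more on this below. */
	return false;
}

static const struct memory_provider_ops foo_mp_ops = {
	.init		= foo_mp_init,
	.destroy	= foo_mp_destroy,
	.alloc_pages	= foo_mp_alloc_pages,
	.release_page	= foo_mp_release_page,
};
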
> +
>  struct page_pool {
>         struct page_pool_params_fast p;
>
> @@ -174,6 +183,9 @@ struct page_pool {
>          */
>         struct ptr_ring ring;
>
> +       void *mp_priv;
> +       const struct memory_provider_ops *mp_ops;
> +
>  #ifdef CONFIG_PAGE_POOL_STATS
>         /* recycle stats are per-cpu to avoid locking */
>         struct page_pool_recycle_stats __percpu *recycle_stats;
> diff --git a/net/core/page_pool.c b/net/core/page_pool.c
> index ca1b3b65c9b5..f5c84d2a4510 100644
> --- a/net/core/page_pool.c
> +++ b/net/core/page_pool.c
> @@ -25,6 +25,8 @@
>
>  #include "page_pool_priv.h"
>
> +static DEFINE_STATIC_KEY_FALSE(page_pool_mem_providers);

We could add the existing page pool mechanism as another 'provider',
but I assume it is coded like this for performance reasons (IOW, to skip
the expensive indirect call for the default case?)
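
My mental model of the trade-off, as a sketch (not code from the patch, and
the example_* names are made up):

static DEFINE_STATIC_KEY_FALSE(example_key);

static struct page *example_alloc(struct page_pool *pool, gfp_t gfp)
{
	/* With DEFINE_STATIC_KEY_FALSE(), static_branch_unlikely() compiles
	 * to a runtime-patched nop until someone calls static_branch_inc(),
	 * so pools without a provider never pay for the mp_ops load/test or
	 * the indirect call on the slow allocation path.
	 */
	if (static_branch_unlikely(&example_key) && pool->mp_ops)
		return pool->mp_ops->alloc_pages(pool, gfp);	/* provider */

	return __page_pool_alloc_pages_slow(pool, gfp);		/* default */
}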

> +
>  #define DEFER_TIME (msecs_to_jiffies(1000))
>  #define DEFER_WARN_INTERVAL (60 * HZ)
>
> @@ -174,6 +176,7 @@ static int page_pool_init(struct page_pool *pool,
>                           const struct page_pool_params *params)
>  {
>         unsigned int ring_qsize = 1024; /* Default */
> +       int err;
>
>         memcpy(&pool->p, &params->fast, sizeof(pool->p));
>         memcpy(&pool->slow, &params->slow, sizeof(pool->slow));
> @@ -234,10 +237,25 @@ static int page_pool_init(struct page_pool *pool,
>         /* Driver calling page_pool_create() also call page_pool_destroy() */
>         refcount_set(&pool->user_cnt, 1);
>
> +       if (pool->mp_ops) {
> +               err = pool->mp_ops->init(pool);
> +               if (err) {
> +                       pr_warn("%s() mem-provider init failed %d\n",
> +                               __func__, err);
> +                       goto free_ptr_ring;
> +               }
> +
> +               static_branch_inc(&page_pool_mem_providers);
> +       }
> +
>         if (pool->p.flags & PP_FLAG_DMA_MAP)
>                 get_device(pool->p.dev);
>
>         return 0;
> +
> +free_ptr_ring:
> +       ptr_ring_cleanup(&pool->ring, NULL);
> +       return err;
>  }
>
>  static void page_pool_uninit(struct page_pool *pool)
> @@ -519,7 +537,10 @@ struct page *page_pool_alloc_pages(struct page_pool *pool, gfp_t gfp)
>                 return page;
>
>         /* Slow-path: cache empty, do real allocation */
> -       page = __page_pool_alloc_pages_slow(pool, gfp);
> +       if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)

Why do we need the && pool->mp_ops check? In the init function, we only
bump page_pool_mem_providers if the ops are there.

> +               page = pool->mp_ops->alloc_pages(pool, gfp);
> +       else
> +               page = __page_pool_alloc_pages_slow(pool, gfp);
>         return page;
>  }
>  EXPORT_SYMBOL(page_pool_alloc_pages);
> @@ -576,10 +597,13 @@ void __page_pool_release_page_dma(struct page_pool *pool, struct page *page)
>  void page_pool_return_page(struct page_pool *pool, struct page *page)
>  {
>         int count;
> +       bool put;
>
> -       __page_pool_release_page_dma(pool, page);
> -
> -       page_pool_clear_pp_info(page);
> +       put = true;
> +       if (static_branch_unlikely(&page_pool_mem_providers) && pool->mp_ops)

ditto

> +               put = pool->mp_ops->release_page(pool, page);
> +       else
> +               __page_pool_release_page_dma(pool, page);
>
>         /* This may be the last page returned, releasing the pool, so
>          * it is not safe to reference pool afterwards.
> @@ -587,7 +611,10 @@ void page_pool_return_page(struct page_pool *pool, struct page *page)
>         count = atomic_inc_return_relaxed(&pool->pages_state_release_cnt);
>         trace_page_pool_state_release(pool, page, count);
>
> -       put_page(page);
> +       if (put) {
> +               page_pool_clear_pp_info(page);
> +               put_page(page);
> +       }
>         /* An optimization would be to call __free_pages(page, pool->p.order)
>          * knowing page is not part of page-cache (thus avoiding a
>          * __page_cache_release() call).
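
Side note on the bool return: per the commit message, a provider that wants
to keep recycling its own pages simply hangs on to the "ownership" ref and
re-feeds the page once every other reference is gone. Roughly (hypothetical,
the foo_* helper is invented):

static void foo_mp_maybe_refill(struct page_pool *pool, struct page *page)
{
	/* ->release_page() returned false, so the provider still holds the
	 * original reference; once it is the only one left, the page can be
	 * handed back to the provider's ->alloc_pages() fast path.
	 */
	if (page_ref_count(page) == 1)
		foo_mp_stash_for_reuse(pool->mp_priv, page);	/* made-up helper */
}
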
> @@ -857,6 +884,12 @@ static void __page_pool_destroy(struct page_pool *pool)
>
>         page_pool_unlist(pool);
>         page_pool_uninit(pool);
> +
> +       if (pool->mp_ops) {

Same here. Using a mix of pool->mp_ops and page_pool_mem_providers
will work, but since we always check the pointer at init time, can't we
simply rely on page_pool_mem_providers for the rest of the code?
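
Something like the (untested) sketch below, on e.g. the alloc path -- though
that of course only holds if a non-zero key guarantees mp_ops is set for
*this* pool, so maybe that's exactly why the extra check is there:

	if (static_branch_unlikely(&page_pool_mem_providers))
		page = pool->mp_ops->alloc_pages(pool, gfp);
	else
		page = __page_pool_alloc_pages_slow(pool, gfp);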

Thanks
/Ilias
> +               pool->mp_ops->destroy(pool);
> +               static_branch_dec(&page_pool_mem_providers);
> +       }
> +
>         kfree(pool);
>  }
>
> --
> 2.43.0.472.g3155946c3a-goog
>

Thread overview: 75+ messages
2023-12-08  0:52 [net-next v1 00/16] Device Memory TCP Mina Almasry
2023-12-08  0:52 ` [net-next v1 01/16] net: page_pool: factor out releasing DMA from releasing the page Mina Almasry
2023-12-10  3:49   ` Shakeel Butt
2023-12-12  8:11   ` Ilias Apalodimas
2023-12-08  0:52 ` [net-next v1 02/16] net: page_pool: create hooks for custom page providers Mina Almasry
2023-12-12  8:07   ` Ilias Apalodimas [this message]
2023-12-12 14:47     ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 03/16] queue_api: define queue api Mina Almasry
2023-12-14  1:15   ` Jakub Kicinski
2023-12-08  0:52 ` [net-next v1 04/16] gve: implement " Mina Almasry
2024-03-05 11:45   ` Arnd Bergmann
2023-12-08  0:52 ` [net-next v1 05/16] net: netdev netlink api to bind dma-buf to a net device Mina Almasry
2023-12-14  1:17   ` Jakub Kicinski
2023-12-08  0:52 ` [net-next v1 06/16] netdev: support binding dma-buf to netdevice Mina Almasry
2023-12-08 15:40   ` kernel test robot
2023-12-08 16:02   ` kernel test robot
2023-12-08 17:48   ` David Ahern
2023-12-08 19:22     ` Mina Almasry
2023-12-08 20:32       ` Mina Almasry
2023-12-09 23:29       ` David Ahern
2023-12-11  2:19         ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 07/16] netdev: netdevice devmem allocator Mina Almasry
2023-12-08 17:56   ` David Ahern
2023-12-08 19:27     ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 08/16] memory-provider: dmabuf devmem memory provider Mina Almasry
2023-12-08 22:48   ` Pavel Begunkov
2023-12-08 23:25     ` Mina Almasry
2023-12-10  3:03       ` Pavel Begunkov
2023-12-11  2:30         ` Mina Almasry
2023-12-11 20:35           ` Pavel Begunkov
2023-12-14 20:03             ` Mina Almasry
2023-12-19 23:55               ` Pavel Begunkov
2023-12-08 23:05   ` Pavel Begunkov
2023-12-12 12:25   ` Jason Gunthorpe
2023-12-12 13:07     ` Christoph Hellwig
2023-12-12 14:26     ` Mina Almasry
2023-12-12 14:39       ` Jason Gunthorpe
2023-12-12 14:58         ` Mina Almasry
2023-12-12 15:08           ` Jason Gunthorpe
2023-12-13  1:09             ` Mina Almasry
2023-12-13  2:19               ` David Ahern
2023-12-13  7:49   ` Yinjun Zhang
2023-12-08  0:52 ` [net-next v1 09/16] page_pool: device memory support Mina Almasry
2023-12-08  9:30   ` Yunsheng Lin
2023-12-08 16:05     ` Mina Almasry
2023-12-11  2:04       ` Yunsheng Lin
2023-12-11  2:26         ` Mina Almasry
2023-12-11  4:04           ` Mina Almasry
2023-12-11 11:51             ` Yunsheng Lin
2023-12-11 18:14               ` Mina Almasry
2023-12-12 11:17                 ` Yunsheng Lin
2023-12-12 14:28                   ` Mina Almasry
2023-12-13 11:48                     ` Yunsheng Lin
2023-12-13  7:52             ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 10/16] page_pool: don't release iov on elevanted refcount Mina Almasry
2023-12-08  0:52 ` [net-next v1 11/16] net: support non paged skb frags Mina Almasry
2023-12-08  0:52 ` [net-next v1 12/16] net: add support for skbs with unreadable frags Mina Almasry
2023-12-08  0:52 ` [net-next v1 13/16] tcp: RX path for devmem TCP Mina Almasry
2023-12-08 15:40   ` kernel test robot
2023-12-08 17:55   ` David Ahern
2023-12-08 19:23     ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 14/16] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags Mina Almasry
2023-12-12 19:08   ` Simon Horman
2023-12-08  0:52 ` [net-next v1 15/16] net: add devmem TCP documentation Mina Almasry
2023-12-12 19:14   ` Simon Horman
2023-12-08  0:52 ` [net-next v1 16/16] selftests: add ncdevmem, netcat for devmem TCP Mina Almasry
2023-12-08  1:47 ` [net-next v1 00/16] Device Memory TCP Mina Almasry
2023-12-08 17:57 ` David Ahern
2023-12-08 19:31   ` Mina Almasry
2023-12-10  3:48 ` Shakeel Butt
2023-12-12  5:58 ` Christoph Hellwig
2023-12-14  6:20 ` patchwork-bot+netdevbpf
2023-12-14  6:48   ` Christoph Hellwig
2023-12-14  6:51     ` Mina Almasry
2023-12-14  6:59       ` Christoph Hellwig
