All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mina Almasry <almasrymina@google.com>
To: David Ahern <dsahern@kernel.org>
Cc: "Shailend Chand" <shailend@google.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-arch@vger.kernel.org,
	linux-kselftest@vger.kernel.org, bpf@vger.kernel.org,
	linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org,
	"David S. Miller" <davem@davemloft.net>,
	"Eric Dumazet" <edumazet@google.com>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	"Christian König" <christian.koenig@amd.com>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Willem de Bruijn" <willemb@google.com>,
	"Kaiyuan Zhang" <kaiyuanz@google.com>
Subject: Re: [net-next v1 07/16] netdev: netdevice devmem allocator
Date: Fri, 8 Dec 2023 11:27:47 -0800	[thread overview]
Message-ID: <CAHS8izMysKCCoAoVK9KzQrfbtFfagaPMRYSUUjKTqJ-ZwJ53oA@mail.gmail.com> (raw)
In-Reply-To: <462da4bf-34f8-40c4-8772-9850b3127baf@kernel.org>

On Fri, Dec 8, 2023 at 9:56 AM David Ahern <dsahern@kernel.org> wrote:
>
> On 12/7/23 5:52 PM, Mina Almasry wrote:
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b8c8be5a912e..30667e4c3b95 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2120,6 +2120,41 @@ static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
> >       return err;
> >  }
> >
> > +struct page_pool_iov *netdev_alloc_dmabuf(struct netdev_dmabuf_binding *binding)
> > +{
> > +     struct dmabuf_genpool_chunk_owner *owner;
> > +     struct page_pool_iov *ppiov;
> > +     unsigned long dma_addr;
> > +     ssize_t offset;
> > +     ssize_t index;
> > +
> > +     dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
>
> Any reason not to allow allocation sizes other than PAGE_SIZE? e.g.,
> 2048 for smaller MTUs or 8192 for larger ones. It can be a property of
> page_pool and constant across allocations vs allowing different size for
> each allocation.

Only for simplicity. Supporting non-PAGE_SIZE is certainly possible,
but in my estimation it's a huge can of worms worthy of itss own
series. I find this series complicated to implement and review and
support as-is, and if reasonable I would like to punt that as a future
improvement.

At the minimum, I think the needed changes are:

1. The memory provider needs to report to the page pool the alloc size.
2. The page_pool needs to handle non-PAGE_SIZE memory regions.
3. The drivers need to handle non-PAGE_SIZE memory regions. Drivers
today handle fragged pages, but that is different because it's a
PAGE_SIZE region that is fragged. This is a non-PAGE_SIZE region in
the first place.
4. Any PAGE_SIZE assumptions in the entire net stack need to be removed.

At Google we mostly use page aligned MTUs so we're likely not that
interested in sub PAGE_SIZE allocations, but we are interested in n *
PAGE_SIZE allocations, but, I hope, in a separate followup effort.

-- 
Thanks,
Mina

WARNING: multiple messages have this Message-ID (diff)
From: Mina Almasry <almasrymina@google.com>
To: David Ahern <dsahern@kernel.org>
Cc: linux-doc@vger.kernel.org, "Kaiyuan Zhang" <kaiyuanz@google.com>,
	dri-devel@lists.freedesktop.org,
	"Eric Dumazet" <edumazet@google.com>,
	linux-kselftest@vger.kernel.org, "Shuah Khan" <shuah@kernel.org>,
	"Sumit Semwal" <sumit.semwal@linaro.org>,
	linux-arch@vger.kernel.org,
	"Willem de Bruijn" <willemdebruijn.kernel@gmail.com>,
	"Jeroen de Borst" <jeroendb@google.com>,
	"Jonathan Corbet" <corbet@lwn.net>,
	"Jakub Kicinski" <kuba@kernel.org>,
	"Paolo Abeni" <pabeni@redhat.com>,
	linux-media@vger.kernel.org,
	"Jesper Dangaard Brouer" <hawk@kernel.org>,
	"Arnd Bergmann" <arnd@arndb.de>,
	"Shailend Chand" <shailend@google.com>,
	"Shakeel Butt" <shakeelb@google.com>,
	"Harshitha Ramamurthy" <hramamurthy@google.com>,
	"Willem de Bruijn" <willemb@google.com>,
	netdev@vger.kernel.org,
	"Ilias Apalodimas" <ilias.apalodimas@linaro.org>,
	linux-kernel@vger.kernel.org,
	"Christian König" <christian.koenig@amd.com>,
	"Yunsheng Lin" <linyunsheng@huawei.com>,
	"Praveen Kaligineedi" <pkaligineedi@google.com>,
	bpf@vger.kernel.org, "David S. Miller" <davem@davemloft.net>
Subject: Re: [net-next v1 07/16] netdev: netdevice devmem allocator
Date: Fri, 8 Dec 2023 11:27:47 -0800	[thread overview]
Message-ID: <CAHS8izMysKCCoAoVK9KzQrfbtFfagaPMRYSUUjKTqJ-ZwJ53oA@mail.gmail.com> (raw)
In-Reply-To: <462da4bf-34f8-40c4-8772-9850b3127baf@kernel.org>

On Fri, Dec 8, 2023 at 9:56 AM David Ahern <dsahern@kernel.org> wrote:
>
> On 12/7/23 5:52 PM, Mina Almasry wrote:
> > diff --git a/net/core/dev.c b/net/core/dev.c
> > index b8c8be5a912e..30667e4c3b95 100644
> > --- a/net/core/dev.c
> > +++ b/net/core/dev.c
> > @@ -2120,6 +2120,41 @@ static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx)
> >       return err;
> >  }
> >
> > +struct page_pool_iov *netdev_alloc_dmabuf(struct netdev_dmabuf_binding *binding)
> > +{
> > +     struct dmabuf_genpool_chunk_owner *owner;
> > +     struct page_pool_iov *ppiov;
> > +     unsigned long dma_addr;
> > +     ssize_t offset;
> > +     ssize_t index;
> > +
> > +     dma_addr = gen_pool_alloc_owner(binding->chunk_pool, PAGE_SIZE,
>
> Any reason not to allow allocation sizes other than PAGE_SIZE? e.g.,
> 2048 for smaller MTUs or 8192 for larger ones. It can be a property of
> page_pool and constant across allocations vs allowing different size for
> each allocation.

Only for simplicity. Supporting non-PAGE_SIZE is certainly possible,
but in my estimation it's a huge can of worms worthy of itss own
series. I find this series complicated to implement and review and
support as-is, and if reasonable I would like to punt that as a future
improvement.

At the minimum, I think the needed changes are:

1. The memory provider needs to report to the page pool the alloc size.
2. The page_pool needs to handle non-PAGE_SIZE memory regions.
3. The drivers need to handle non-PAGE_SIZE memory regions. Drivers
today handle fragged pages, but that is different because it's a
PAGE_SIZE region that is fragged. This is a non-PAGE_SIZE region in
the first place.
4. Any PAGE_SIZE assumptions in the entire net stack need to be removed.

At Google we mostly use page aligned MTUs so we're likely not that
interested in sub PAGE_SIZE allocations, but we are interested in n *
PAGE_SIZE allocations, but, I hope, in a separate followup effort.

-- 
Thanks,
Mina

  reply	other threads:[~2023-12-08 19:28 UTC|newest]

Thread overview: 145+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-12-08  0:52 [net-next v1 00/16] Device Memory TCP Mina Almasry
2023-12-08  0:52 ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 01/16] net: page_pool: factor out releasing DMA from releasing the page Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-10  3:49   ` Shakeel Butt
2023-12-10  3:49     ` Shakeel Butt
2023-12-12  8:11   ` Ilias Apalodimas
2023-12-12  8:11     ` Ilias Apalodimas
2023-12-08  0:52 ` [net-next v1 02/16] net: page_pool: create hooks for custom page providers Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-12  8:07   ` Ilias Apalodimas
2023-12-12  8:07     ` Ilias Apalodimas
2023-12-12 14:47     ` Mina Almasry
2023-12-12 14:47       ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 03/16] queue_api: define queue api Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-14  1:15   ` Jakub Kicinski
2023-12-14  1:15     ` Jakub Kicinski
2023-12-08  0:52 ` [net-next v1 04/16] gve: implement " Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2024-03-05 11:45   ` Arnd Bergmann
2023-12-08  0:52 ` [net-next v1 05/16] net: netdev netlink api to bind dma-buf to a net device Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-14  1:17   ` Jakub Kicinski
2023-12-14  1:17     ` Jakub Kicinski
2023-12-08  0:52 ` [net-next v1 06/16] netdev: support binding dma-buf to netdevice Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08 15:40   ` kernel test robot
2023-12-08 15:40     ` kernel test robot
2023-12-08 16:02   ` kernel test robot
2023-12-08 16:02     ` kernel test robot
2023-12-08 17:48   ` David Ahern
2023-12-08 17:48     ` David Ahern
2023-12-08 19:22     ` Mina Almasry
2023-12-08 19:22       ` Mina Almasry
2023-12-08 20:32       ` Mina Almasry
2023-12-08 20:32         ` Mina Almasry
2023-12-09 23:29       ` David Ahern
2023-12-09 23:29         ` David Ahern
2023-12-11  2:19         ` Mina Almasry
2023-12-11  2:19           ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 07/16] netdev: netdevice devmem allocator Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08 17:56   ` David Ahern
2023-12-08 17:56     ` David Ahern
2023-12-08 19:27     ` Mina Almasry [this message]
2023-12-08 19:27       ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 08/16] memory-provider: dmabuf devmem memory provider Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08 22:48   ` Pavel Begunkov
2023-12-08 22:48     ` Pavel Begunkov
2023-12-08 23:25     ` Mina Almasry
2023-12-08 23:25       ` Mina Almasry
2023-12-10  3:03       ` Pavel Begunkov
2023-12-10  3:03         ` Pavel Begunkov
2023-12-11  2:30         ` Mina Almasry
2023-12-11  2:30           ` Mina Almasry
2023-12-11 20:35           ` Pavel Begunkov
2023-12-11 20:35             ` Pavel Begunkov
2023-12-14 20:03             ` Mina Almasry
2023-12-14 20:03               ` Mina Almasry
2023-12-19 23:55               ` Pavel Begunkov
2023-12-19 23:55                 ` Pavel Begunkov
2023-12-08 23:05   ` Pavel Begunkov
2023-12-08 23:05     ` Pavel Begunkov
2023-12-12 12:25   ` Jason Gunthorpe
2023-12-12 12:25     ` Jason Gunthorpe
2023-12-12 13:07     ` Christoph Hellwig
2023-12-12 14:26     ` Mina Almasry
2023-12-12 14:26       ` Mina Almasry
2023-12-12 14:39       ` Jason Gunthorpe
2023-12-12 14:39         ` Jason Gunthorpe
2023-12-12 14:58         ` Mina Almasry
2023-12-12 14:58           ` Mina Almasry
2023-12-12 15:08           ` Jason Gunthorpe
2023-12-12 15:08             ` Jason Gunthorpe
2023-12-13  1:09             ` Mina Almasry
2023-12-13  1:09               ` Mina Almasry
2023-12-13  2:19               ` David Ahern
2023-12-13  2:19                 ` David Ahern
2023-12-13  7:49   ` Yinjun Zhang
2023-12-13  7:49     ` Yinjun Zhang
2023-12-08  0:52 ` [net-next v1 09/16] page_pool: device memory support Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08  9:30   ` Yunsheng Lin
2023-12-08  9:30     ` Yunsheng Lin
2023-12-08 16:05     ` Mina Almasry
2023-12-08 16:05       ` Mina Almasry
2023-12-11  2:04       ` Yunsheng Lin
2023-12-11  2:04         ` Yunsheng Lin
2023-12-11  2:26         ` Mina Almasry
2023-12-11  2:26           ` Mina Almasry
2023-12-11  4:04           ` Mina Almasry
2023-12-11  4:04             ` Mina Almasry
2023-12-11 11:51             ` Yunsheng Lin
2023-12-11 11:51               ` Yunsheng Lin
2023-12-11 18:14               ` Mina Almasry
2023-12-11 18:14                 ` Mina Almasry
2023-12-12 11:17                 ` Yunsheng Lin
2023-12-12 11:17                   ` Yunsheng Lin
2023-12-12 14:28                   ` Mina Almasry
2023-12-12 14:28                     ` Mina Almasry
2023-12-13 11:48                     ` Yunsheng Lin
2023-12-13 11:48                       ` Yunsheng Lin
2023-12-13  7:52             ` Mina Almasry
2023-12-13  7:52               ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 10/16] page_pool: don't release iov on elevanted refcount Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 11/16] net: support non paged skb frags Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 12/16] net: add support for skbs with unreadable frags Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 13/16] tcp: RX path for devmem TCP Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08 15:40   ` kernel test robot
2023-12-08 15:40     ` kernel test robot
2023-12-08 17:55   ` David Ahern
2023-12-08 17:55     ` David Ahern
2023-12-08 19:23     ` Mina Almasry
2023-12-08 19:23       ` Mina Almasry
2023-12-08  0:52 ` [net-next v1 14/16] net: add SO_DEVMEM_DONTNEED setsockopt to release RX frags Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-12 19:08   ` Simon Horman
2023-12-12 19:08     ` Simon Horman
2023-12-08  0:52 ` [net-next v1 15/16] net: add devmem TCP documentation Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-12 19:14   ` Simon Horman
2023-12-12 19:14     ` Simon Horman
2023-12-08  0:52 ` [net-next v1 16/16] selftests: add ncdevmem, netcat for devmem TCP Mina Almasry
2023-12-08  0:52   ` Mina Almasry
2023-12-08  1:47 ` [net-next v1 00/16] Device Memory TCP Mina Almasry
2023-12-08  1:47   ` Mina Almasry
2023-12-08 17:57 ` David Ahern
2023-12-08 17:57   ` David Ahern
2023-12-08 19:31   ` Mina Almasry
2023-12-08 19:31     ` Mina Almasry
2023-12-10  3:48 ` Shakeel Butt
2023-12-10  3:48   ` Shakeel Butt
2023-12-12  5:58 ` Christoph Hellwig
2023-12-14  6:20 ` patchwork-bot+netdevbpf
2023-12-14  6:20   ` patchwork-bot+netdevbpf
2023-12-14  6:48   ` Christoph Hellwig
2023-12-14  6:51     ` Mina Almasry
2023-12-14  6:51       ` Mina Almasry
2023-12-14  6:59       ` Christoph Hellwig

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAHS8izMysKCCoAoVK9KzQrfbtFfagaPMRYSUUjKTqJ-ZwJ53oA@mail.gmail.com \
    --to=almasrymina@google.com \
    --cc=arnd@arndb.de \
    --cc=bpf@vger.kernel.org \
    --cc=christian.koenig@amd.com \
    --cc=corbet@lwn.net \
    --cc=davem@davemloft.net \
    --cc=dri-devel@lists.freedesktop.org \
    --cc=dsahern@kernel.org \
    --cc=edumazet@google.com \
    --cc=hawk@kernel.org \
    --cc=hramamurthy@google.com \
    --cc=ilias.apalodimas@linaro.org \
    --cc=jeroendb@google.com \
    --cc=kaiyuanz@google.com \
    --cc=kuba@kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-media@vger.kernel.org \
    --cc=linyunsheng@huawei.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=pkaligineedi@google.com \
    --cc=shailend@google.com \
    --cc=shakeelb@google.com \
    --cc=shuah@kernel.org \
    --cc=sumit.semwal@linaro.org \
    --cc=willemb@google.com \
    --cc=willemdebruijn.kernel@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.