netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net] skbuff: disable coalescing for page_pool recycling
@ 2022-03-24 17:29 Jean-Philippe Brucker
  2022-03-25  1:59 ` Yunsheng Lin
  0 siblings, 1 reply; 5+ messages in thread
From: Jean-Philippe Brucker @ 2022-03-24 17:29 UTC (permalink / raw)
  To: ilias.apalodimas, hawk
  Cc: davem, kuba, pabeni, netdev, linyunsheng, Jean-Philippe Brucker

Fix a use-after-free when using page_pool with page fragments. We
encountered this problem during normal RX in the hns3 driver:

(1) Initially we have three descriptors in the RX queue. The first one
    allocates PAGE1 through page_pool, and the other two allocate one
    half of PAGE2 each. Page references look like this:

                RX_BD1 _______ PAGE1
                RX_BD2 _______ PAGE2
                RX_BD3 _________/

(2) Handle RX on the first descriptor. Allocate SKB1, eventually added
    to the receive queue by tcp_queue_rcv().

(3) Handle RX on the second descriptor. Allocate SKB2 and pass it to
    netif_receive_skb():

    netif_receive_skb(SKB2)
      ip_rcv(SKB2)
        SKB3 = skb_clone(SKB2)

    SKB2 and SKB3 share a reference to PAGE2 through
    skb_shinfo()->dataref. The other ref to PAGE2 is still held by
    RX_BD3:

                      SKB2 ---+- PAGE2
                      SKB3 __/   /
                RX_BD3 _________/

 (3b) Now while handling TCP, coalesce SKB3 with SKB1:

      tcp_v4_rcv(SKB3)
        tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds
        kfree_skb_partial(SKB3)
          skb_release_data(SKB3)                // drops one dataref

                      SKB1 _____ PAGE1
                           \____
                      SKB2 _____ PAGE2
                                 /
                RX_BD3 _________/

    The problem is here: both SKB1 and SKB2 point to PAGE2 but SKB1 does
    not actually hold a reference to PAGE2. Without coalescing, when
    releasing both SKB2 and SKB3, a single reference to PAGE2 would be
    dropped. Now when releasing SKB1 and SKB2, two references to PAGE2
    will be dropped, resulting in underflow.

 (3c) Drop SKB2:

      af_packet_rcv(SKB2)
        consume_skb(SKB2)
          skb_release_data(SKB2)                // drops second dataref
            page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count

                      SKB1 _____ PAGE1
                           \____
                                 PAGE2
                                 /
                RX_BD3 _________/

(4) Userspace calls recvmsg()
    Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
    release the SKB3 page as well:

    tcp_eat_recv_skb(SKB1)
      skb_release_data(SKB1)
        page_pool_return_skb_page(PAGE1)
        page_pool_return_skb_page(PAGE2)        // drops second pp_frag_count

(5) PAGE2 is freed, but the third RX descriptor was still using it!
    In our case this causes IOMMU faults, but it would silently corrupt
    memory if the IOMMU was disabled.

A proper implementation would probably take another reference from the
page_pool at step (3b), but that seems too complicated for a fix. Keep
it simple for now, prevent coalescing for page_pool users.

Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
---

The Fixes tag says frag page recycling but I'm not sure, does it not
also affect full page recycling?  Coalescing is one case, are there
other places where we move page fragments between skbuffs?  I'm still
too unfamiliar with this code to figure it out.

Previously discussed here:
https://lore.kernel.org/netdev/YfFbDivUPbpWjh%2Fm@myrica/
---
 net/core/skbuff.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 10bde7c6db44..431f7ce88049 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -5276,11 +5276,11 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
 	if (skb_cloned(to))
 		return false;
 
-	/* The page pool signature of struct page will eventually figure out
-	 * which pages can be recycled or not but for now let's prohibit slab
-	 * allocated and page_pool allocated SKBs from being coalesced.
+	/* Prohibit adding page_pool allocated SKBs, because that would require
+	 * transferring the page fragment reference. For now let's also prohibit
+	 * slab allocated and page_pool allocated SKBs from being coalesced.
 	 */
-	if (to->pp_recycle != from->pp_recycle)
+	if (to->pp_recycle || from->pp_recycle)
 		return false;
 
 	if (len <= skb_tailroom(to)) {
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH net] skbuff: disable coalescing for page_pool recycling
  2022-03-24 17:29 [PATCH net] skbuff: disable coalescing for page_pool recycling Jean-Philippe Brucker
@ 2022-03-25  1:59 ` Yunsheng Lin
  2022-03-25  2:50   ` Alexander Duyck
  0 siblings, 1 reply; 5+ messages in thread
From: Yunsheng Lin @ 2022-03-25  1:59 UTC (permalink / raw)
  To: Jean-Philippe Brucker, ilias.apalodimas, hawk
  Cc: davem, kuba, pabeni, netdev, Alexander Duyck

+cc Alexander Duyck

On 2022/3/25 1:29, Jean-Philippe Brucker wrote:
> Fix a use-after-free when using page_pool with page fragments. We
> encountered this problem during normal RX in the hns3 driver:
> 
> (1) Initially we have three descriptors in the RX queue. The first one
>     allocates PAGE1 through page_pool, and the other two allocate one
>     half of PAGE2 each. Page references look like this:
> 
>                 RX_BD1 _______ PAGE1
>                 RX_BD2 _______ PAGE2
>                 RX_BD3 _________/
> 
> (2) Handle RX on the first descriptor. Allocate SKB1, eventually added
>     to the receive queue by tcp_queue_rcv().
> 
> (3) Handle RX on the second descriptor. Allocate SKB2 and pass it to
>     netif_receive_skb():
> 
>     netif_receive_skb(SKB2)
>       ip_rcv(SKB2)
>         SKB3 = skb_clone(SKB2)
> 
>     SKB2 and SKB3 share a reference to PAGE2 through
>     skb_shinfo()->dataref. The other ref to PAGE2 is still held by
>     RX_BD3:
> 
>                       SKB2 ---+- PAGE2
>                       SKB3 __/   /
>                 RX_BD3 _________/
> 
>  (3b) Now while handling TCP, coalesce SKB3 with SKB1:
> 
>       tcp_v4_rcv(SKB3)
>         tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds
>         kfree_skb_partial(SKB3)
>           skb_release_data(SKB3)                // drops one dataref
> 
>                       SKB1 _____ PAGE1
>                            \____
>                       SKB2 _____ PAGE2
>                                  /
>                 RX_BD3 _________/
> 
>     The problem is here: both SKB1 and SKB2 point to PAGE2 but SKB1 does
>     not actually hold a reference to PAGE2.

it seems the SKB1 *does* hold a reference to PAGE2 by calling __skb_frag_ref(),
which increments the page->_refcount instead of incrementing pp_frag_count,
as skb_cloned(SKB3) is true and __skb_frag_ref() does not handle page pool
case:

https://elixir.bootlin.com/linux/v5.17-rc1/source/net/core/skbuff.c#L5308

 Without coalescing, when
>     releasing both SKB2 and SKB3, a single reference to PAGE2 would be
>     dropped. Now when releasing SKB1 and SKB2, two references to PAGE2
>     will be dropped, resulting in underflow.
> 
>  (3c) Drop SKB2:
> 
>       af_packet_rcv(SKB2)
>         consume_skb(SKB2)
>           skb_release_data(SKB2)                // drops second dataref
>             page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count
> 
>                       SKB1 _____ PAGE1
>                            \____
>                                  PAGE2
>                                  /
>                 RX_BD3 _________/
> 
> (4) Userspace calls recvmsg()
>     Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
>     release the SKB3 page as well:
> 
>     tcp_eat_recv_skb(SKB1)
>       skb_release_data(SKB1)
>         page_pool_return_skb_page(PAGE1)
>         page_pool_return_skb_page(PAGE2)        // drops second pp_frag_count
> 
> (5) PAGE2 is freed, but the third RX descriptor was still using it!
>     In our case this causes IOMMU faults, but it would silently corrupt
>     memory if the IOMMU was disabled.
> 
> A proper implementation would probably take another reference from the
> page_pool at step (3b), but that seems too complicated for a fix. Keep
> it simple for now, prevent coalescing for page_pool users.
> 
> Fixes: 53e0961da1c7 ("page_pool: add frag page recycling support in page pool")
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
> 
> The Fixes tag says frag page recycling but I'm not sure, does it not
> also affect full page recycling?  Coalescing is one case, are there
> other places where we move page fragments between skbuffs?  I'm still
> too unfamiliar with this code to figure it out.
> 
> Previously discussed here:
> https://lore.kernel.org/netdev/YfFbDivUPbpWjh%2Fm@myrica/
> ---
>  net/core/skbuff.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 10bde7c6db44..431f7ce88049 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5276,11 +5276,11 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  	if (skb_cloned(to))
>  		return false;
>  
> -	/* The page pool signature of struct page will eventually figure out
> -	 * which pages can be recycled or not but for now let's prohibit slab
> -	 * allocated and page_pool allocated SKBs from being coalesced.
> +	/* Prohibit adding page_pool allocated SKBs, because that would require
> +	 * transferring the page fragment reference. For now let's also prohibit
> +	 * slab allocated and page_pool allocated SKBs from being coalesced.
>  	 */
> -	if (to->pp_recycle != from->pp_recycle)
> +	if (to->pp_recycle || from->pp_recycle)
>  		return false;
>  
>  	if (len <= skb_tailroom(to)) {
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH net] skbuff: disable coalescing for page_pool recycling
  2022-03-25  1:59 ` Yunsheng Lin
@ 2022-03-25  2:50   ` Alexander Duyck
  2022-03-28 12:02     ` Jean-Philippe Brucker
  0 siblings, 1 reply; 5+ messages in thread
From: Alexander Duyck @ 2022-03-25  2:50 UTC (permalink / raw)
  To: Yunsheng Lin, Jean-Philippe Brucker, ilias.apalodimas, hawk
  Cc: davem, kuba, pabeni, netdev



> -----Original Message-----
> From: Yunsheng Lin <linyunsheng@huawei.com>
> Sent: Thursday, March 24, 2022 7:00 PM
> To: Jean-Philippe Brucker <jean-philippe@linaro.org>;
> ilias.apalodimas@linaro.org; hawk@kernel.org
> Cc: davem@davemloft.net; kuba@kernel.org; pabeni@redhat.com;
> netdev@vger.kernel.org; Alexander Duyck <alexanderduyck@fb.com>
> Subject: Re: [PATCH net] skbuff: disable coalescing for page_pool recycling
> 
> +cc Alexander Duyck
> 
> On 2022/3/25 1:29, Jean-Philippe Brucker wrote:
> > Fix a use-after-free when using page_pool with page fragments. We
> > encountered this problem during normal RX in the hns3 driver:
> >
> > (1) Initially we have three descriptors in the RX queue. The first one
> >     allocates PAGE1 through page_pool, and the other two allocate one
> >     half of PAGE2 each. Page references look like this:
> >
> >                 RX_BD1 _______ PAGE1
> >                 RX_BD2 _______ PAGE2
> >                 RX_BD3 _________/
> >
> > (2) Handle RX on the first descriptor. Allocate SKB1, eventually added
> >     to the receive queue by tcp_queue_rcv().
> >
> > (3) Handle RX on the second descriptor. Allocate SKB2 and pass it to
> >     netif_receive_skb():
> >
> >     netif_receive_skb(SKB2)
> >       ip_rcv(SKB2)
> >         SKB3 = skb_clone(SKB2)
> >
> >     SKB2 and SKB3 share a reference to PAGE2 through
> >     skb_shinfo()->dataref. The other ref to PAGE2 is still held by
> >     RX_BD3:
> >
> >                       SKB2 ---+- PAGE2
> >                       SKB3 __/   /
> >                 RX_BD3 _________/
> >
> >  (3b) Now while handling TCP, coalesce SKB3 with SKB1:
> >
> >       tcp_v4_rcv(SKB3)
> >         tcp_try_coalesce(to=SKB1, from=SKB3)    // succeeds
> >         kfree_skb_partial(SKB3)
> >           skb_release_data(SKB3)                // drops one dataref
> >
> >                       SKB1 _____ PAGE1
> >                            \____
> >                       SKB2 _____ PAGE2
> >                                  /
> >                 RX_BD3 _________/
> >
> >     The problem is here: both SKB1 and SKB2 point to PAGE2 but SKB1 does
> >     not actually hold a reference to PAGE2.
> 
> it seems the SKB1 *does* hold a reference to PAGE2 by calling
> __skb_frag_ref(), which increments the page->_refcount instead of
> incrementing pp_frag_count, as skb_cloned(SKB3) is true and
> __skb_frag_ref() does not handle page pool
> case:
> 
> INVALID URI REMOVED
> rc1/source/net/core/skbuff.c*L5308__;Iw!!Bt8RZUm9aw!u944ZiA7uzBuFvccr
> rtR1xvondLNnkMf5xzM8xbbkosow-v5t-XdZJd6bMsByMx2Kw$

I'm confused here as well. I don't see a path where you can take ownership of the page without taking a reference.

Specifically the skb_head_is_locked() won't let you steal the head if the skb is cloned. And then for the frags they have an additional reference taken if the skb is cloned.

>  Without coalescing, when
> >     releasing both SKB2 and SKB3, a single reference to PAGE2 would be
> >     dropped. Now when releasing SKB1 and SKB2, two references to PAGE2
> >     will be dropped, resulting in underflow.
> >
> >  (3c) Drop SKB2:
> >
> >       af_packet_rcv(SKB2)
> >         consume_skb(SKB2)
> >           skb_release_data(SKB2)                // drops second dataref
> >             page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count
> >
> >                       SKB1 _____ PAGE1
> >                            \____
> >                                  PAGE2
> >                                  /
> >                 RX_BD3 _________/
> >
> > (4) Userspace calls recvmsg()
> >     Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
> >     release the SKB3 page as well:
> >
> >     tcp_eat_recv_skb(SKB1)
> >       skb_release_data(SKB1)
> >         page_pool_return_skb_page(PAGE1)
> >         page_pool_return_skb_page(PAGE2)        // drops second
> pp_frag_count
> >
> > (5) PAGE2 is freed, but the third RX descriptor was still using it!
> >     In our case this causes IOMMU faults, but it would silently corrupt
> >     memory if the IOMMU was disabled.

I think I see the problem. It is when you get into steps 4 and 5 that you are actually hitting the issue. When you coalesced the page you ended up switching the page from a page pool page to a reference counted page, but it is being stored in a page pool skb. That is the issue. Basically if the skb is a pp_recycle skb we should be incrementing the frag count, not the reference count. So essentially the logic should be that if to->pp_recycle is set but from is cloned then you need to return false. The problem isn't that they are both pp_recycle skbs, it is that the from was cloned and we are trying to merge that into a pp_recycle skb by adding to the reference count of the pages.

> > A proper implementation would probably take another reference from the
> > page_pool at step (3b), but that seems too complicated for a fix. Keep
> > it simple for now, prevent coalescing for page_pool users.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] skbuff: disable coalescing for page_pool recycling
  2022-03-25  2:50   ` Alexander Duyck
@ 2022-03-28 12:02     ` Jean-Philippe Brucker
  2022-03-28 12:18       ` Ilias Apalodimas
  0 siblings, 1 reply; 5+ messages in thread
From: Jean-Philippe Brucker @ 2022-03-28 12:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Yunsheng Lin, ilias.apalodimas, hawk, davem, kuba, pabeni, netdev

On Fri, Mar 25, 2022 at 02:50:46AM +0000, Alexander Duyck wrote:
> > >     The problem is here: both SKB1 and SKB2 point to PAGE2 but SKB1 does
> > >     not actually hold a reference to PAGE2.
> > 
> > it seems the SKB1 *does* hold a reference to PAGE2 by calling
> > __skb_frag_ref(), which increments the page->_refcount instead of
> > incrementing pp_frag_count, as skb_cloned(SKB3) is true and
> > __skb_frag_ref() does not handle page pool
> > case:
> > 
> > INVALID URI REMOVED
> > rc1/source/net/core/skbuff.c*L5308__;Iw!!Bt8RZUm9aw!u944ZiA7uzBuFvccr
> > rtR1xvondLNnkMf5xzM8xbbkosow-v5t-XdZJd6bMsByMx2Kw$
> 
> I'm confused here as well. I don't see a path where you can take ownership of the page without taking a reference.
> 
> Specifically the skb_head_is_locked() won't let you steal the head if the skb is cloned. And then for the frags they have an additional reference taken if the skb is cloned.
> 
> >  Without coalescing, when
> > >     releasing both SKB2 and SKB3, a single reference to PAGE2 would be
> > >     dropped. Now when releasing SKB1 and SKB2, two references to PAGE2
> > >     will be dropped, resulting in underflow.
> > >
> > >  (3c) Drop SKB2:
> > >
> > >       af_packet_rcv(SKB2)
> > >         consume_skb(SKB2)
> > >           skb_release_data(SKB2)                // drops second dataref
> > >             page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count
> > >
> > >                       SKB1 _____ PAGE1
> > >                            \____
> > >                                  PAGE2
> > >                                  /
> > >                 RX_BD3 _________/
> > >
> > > (4) Userspace calls recvmsg()
> > >     Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
> > >     release the SKB3 page as well:
> > >
> > >     tcp_eat_recv_skb(SKB1)
> > >       skb_release_data(SKB1)
> > >         page_pool_return_skb_page(PAGE1)
> > >         page_pool_return_skb_page(PAGE2)        // drops second
> > pp_frag_count
> > >
> > > (5) PAGE2 is freed, but the third RX descriptor was still using it!
> > >     In our case this causes IOMMU faults, but it would silently corrupt
> > >     memory if the IOMMU was disabled.
> 
> I think I see the problem. It is when you get into steps 4 and 5 that you are actually hitting the issue. When you coalesced the page you ended up switching the page from a page pool page to a reference counted page, but it is being stored in a page pool skb. That is the issue. Basically if the skb is a pp_recycle skb we should be incrementing the frag count, not the reference count.
> So essentially the logic should be that if to->pp_recycle is set but from is cloned then you need to return false. The problem isn't that they are both pp_recycle skbs, it is that the from was cloned and we are trying to merge that into a pp_recycle skb by adding to the reference count of the pages.

I agree with this, the problem is switching from a page_pool frag refcount
to a page refcount. I suppose we could change __skb_frag_ref() to increase
the pp_frag_count but that's probably best left as future improvement, I
don't want to break more than I fix here. I'll send a v2 with a check on
(cloned(from) && from->pp_recycle)

Thanks,
Jean

> 
> > > A proper implementation would probably take another reference from the
> > > page_pool at step (3b), but that seems too complicated for a fix. Keep
> > > it simple for now, prevent coalescing for page_pool users.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH net] skbuff: disable coalescing for page_pool recycling
  2022-03-28 12:02     ` Jean-Philippe Brucker
@ 2022-03-28 12:18       ` Ilias Apalodimas
  0 siblings, 0 replies; 5+ messages in thread
From: Ilias Apalodimas @ 2022-03-28 12:18 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: Alexander Duyck, Yunsheng Lin, hawk, davem, kuba, pabeni, netdev

Hi,

Apologies for chiming in late.

On Mon, 28 Mar 2022 at 15:02, Jean-Philippe Brucker
<jean-philippe@linaro.org> wrote:
>
> On Fri, Mar 25, 2022 at 02:50:46AM +0000, Alexander Duyck wrote:
> > > >     The problem is here: both SKB1 and SKB2 point to PAGE2 but SKB1 does
> > > >     not actually hold a reference to PAGE2.
> > >
> > > it seems the SKB1 *does* hold a reference to PAGE2 by calling
> > > __skb_frag_ref(), which increments the page->_refcount instead of
> > > incrementing pp_frag_count, as skb_cloned(SKB3) is true and
> > > __skb_frag_ref() does not handle page pool
> > > case:
> > >
> > > INVALID URI REMOVED
> > > rc1/source/net/core/skbuff.c*L5308__;Iw!!Bt8RZUm9aw!u944ZiA7uzBuFvccr
> > > rtR1xvondLNnkMf5xzM8xbbkosow-v5t-XdZJd6bMsByMx2Kw$
> >
> > I'm confused here as well. I don't see a path where you can take ownership of the page without taking a reference.
> >
> > Specifically the skb_head_is_locked() won't let you steal the head if the skb is cloned. And then for the frags they have an additional reference taken if the skb is cloned.
> >
> > >  Without coalescing, when
> > > >     releasing both SKB2 and SKB3, a single reference to PAGE2 would be
> > > >     dropped. Now when releasing SKB1 and SKB2, two references to PAGE2
> > > >     will be dropped, resulting in underflow.
> > > >
> > > >  (3c) Drop SKB2:
> > > >
> > > >       af_packet_rcv(SKB2)
> > > >         consume_skb(SKB2)
> > > >           skb_release_data(SKB2)                // drops second dataref
> > > >             page_pool_return_skb_page(PAGE2)    // drops one pp_frag_count
> > > >
> > > >                       SKB1 _____ PAGE1
> > > >                            \____
> > > >                                  PAGE2
> > > >                                  /
> > > >                 RX_BD3 _________/
> > > >
> > > > (4) Userspace calls recvmsg()
> > > >     Copies SKB1 and releases it. Since SKB3 was coalesced with SKB1, we
> > > >     release the SKB3 page as well:
> > > >
> > > >     tcp_eat_recv_skb(SKB1)
> > > >       skb_release_data(SKB1)
> > > >         page_pool_return_skb_page(PAGE1)
> > > >         page_pool_return_skb_page(PAGE2)        // drops second
> > > pp_frag_count
> > > >
> > > > (5) PAGE2 is freed, but the third RX descriptor was still using it!
> > > >     In our case this causes IOMMU faults, but it would silently corrupt
> > > >     memory if the IOMMU was disabled.
> >
> > I think I see the problem. It is when you get into steps 4 and 5 that you are actually hitting the issue. When you coalesced the page you ended up switching the page from a page pool page to a reference counted page, but it is being stored in a page pool skb. That is the issue. Basically if the skb is a pp_recycle skb we should be incrementing the frag count, not the reference count.
> > So essentially the logic should be that if to->pp_recycle is set but from is cloned then you need to return false. The problem isn't that they are both pp_recycle skbs, it is that the from was cloned and we are trying to merge that into a pp_recycle skb by adding to the reference count of the pages.
>
> I agree with this, the problem is switching from a page_pool frag refcount
> to a page refcount. I suppose we could change __skb_frag_ref() to increase
> the pp_frag_count but that's probably best left as future improvement, I
> don't want to break more than I fix here.

I would prefer if we managed to keep the skb bits completely agnostic
to page pool recycling (apart from the obvious pp_recycle flag which
belongs to an skb to begin with).

> I'll send a v2 with a check on
> (cloned(from) && from->pp_recycle)

Coalescing is helping a lot wrt to processing packets and speeding up
things,  so completely removing it from page pool allocated buffers is
not what we want.  Limiting that to cloned page pool allocated SKBs
sounds reasonable I guess.

Thanks
/Ilias

>
> Thanks,
> Jean
>
> >
> > > > A proper implementation would probably take another reference from the
> > > > page_pool at step (3b), but that seems too complicated for a fix. Keep
> > > > it simple for now, prevent coalescing for page_pool users.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-03-28 12:19 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-24 17:29 [PATCH net] skbuff: disable coalescing for page_pool recycling Jean-Philippe Brucker
2022-03-25  1:59 ` Yunsheng Lin
2022-03-25  2:50   ` Alexander Duyck
2022-03-28 12:02     ` Jean-Philippe Brucker
2022-03-28 12:18       ` Ilias Apalodimas

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).