netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
@ 2023-04-04  7:47 Liang Chen
  2023-04-04 11:18 ` Yunsheng Lin
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Liang Chen @ 2023-04-04  7:47 UTC (permalink / raw)
  To: ilias.apalodimas, hawk
  Cc: davem, edumazet, kuba, pabeni, netdev, liangchen.linux

Commit 1effe8ca4e34 allowed coalescing to proceed with non page pool page
and page pool page when @from is cloned, i.e.

to->pp_recycle    --> false
from->pp_recycle  --> true
skb_cloned(from)  --> true

However, it actually requires skb_cloned(@from) to hold true until
coalescing finishes in this situation. If the other cloned SKB is
released while the merging is in process, from_shinfo->nr_frags will be
set to 0 toward the end of the function, causing the increment of frag
page _refcount to be unexpectedly skipped resulting in inconsistent
reference counts. Later when SKB(@to) is released, it frees the page
directly even though the page pool page is still in use, leading to
use-after-free or double-free errors. So it should be prohibitted.

The double-free error message below prompted us to investigate:
BUG: Bad page state in process swapper/1  pfn:0e0d1
page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000
index:0x2 pfn:0xe0d1
flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000
raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000
page dumped because: nonzero _refcount

CPU: 1 PID: 0 Comm: swapper/1 Tainted: G            E      6.2.0+
Call Trace:
 <IRQ>
dump_stack_lvl+0x32/0x50
bad_page+0x69/0xf0
free_pcp_prepare+0x260/0x2f0
free_unref_page+0x20/0x1c0
skb_release_data+0x10b/0x1a0
napi_consume_skb+0x56/0x150
net_rx_action+0xf0/0x350
? __napi_schedule+0x79/0x90
__do_softirq+0xc8/0x2b1
__irq_exit_rcu+0xb9/0xf0
common_interrupt+0x82/0xa0
</IRQ>
<TASK>
asm_common_interrupt+0x22/0x40
RIP: 0010:default_idle+0xb/0x20

Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
---
 net/core/skbuff.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 050a875d09c5..9be23ece5f03 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -5598,17 +5598,14 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
 		return false;
 
 	/* In general, avoid mixing slab allocated and page_pool allocated
-	 * pages within the same SKB. However when @to is not pp_recycle and
-	 * @from is cloned, we can transition frag pages from page_pool to
-	 * reference counted.
-	 *
-	 * On the other hand, don't allow coalescing two pp_recycle SKBs if
-	 * @from is cloned, in case the SKB is using page_pool fragment
-	 * references (PP_FLAG_PAGE_FRAG). Since we only take full page
-	 * references for cloned SKBs at the moment that would result in
-	 * inconsistent reference counts.
+	 * pages within the same SKB. However don't allow coalescing two
+	 * pp_recycle SKBs if @from is cloned, in case the SKB is using
+	 * page_pool fragment references (PP_FLAG_PAGE_FRAG). Since we only
+	 * take full page references for cloned SKBs at the moment that would
+	 * result in inconsistent reference counts.
 	 */
-	if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
+	if ((to->pp_recycle != from->pp_recycle)
+		|| (from->pp_recycle && skb_cloned(from)))
 		return false;
 
 	if (len <= skb_tailroom(to)) {
-- 
2.18.2


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-04  7:47 [PATCH] skbuff: Fix a race between coalescing and releasing SKBs Liang Chen
@ 2023-04-04 11:18 ` Yunsheng Lin
  2023-04-04 15:51 ` Alexander H Duyck
  2023-04-11  0:26 ` Jakub Kicinski
  2 siblings, 0 replies; 17+ messages in thread
From: Yunsheng Lin @ 2023-04-04 11:18 UTC (permalink / raw)
  To: Liang Chen, ilias.apalodimas, hawk; +Cc: davem, edumazet, kuba, pabeni, netdev

On 2023/4/4 15:47, Liang Chen wrote:
> Commit 1effe8ca4e34 allowed coalescing to proceed with non page pool page
> and page pool page when @from is cloned, i.e.
> 
> to->pp_recycle    --> false
> from->pp_recycle  --> true
> skb_cloned(from)  --> true
> 
> However, it actually requires skb_cloned(@from) to hold true until
> coalescing finishes in this situation. If the other cloned SKB is
> released while the merging is in process, from_shinfo->nr_frags will be
> set to 0 toward the end of the function, causing the increment of frag
> page _refcount to be unexpectedly skipped resulting in inconsistent
> reference counts. Later when SKB(@to) is released, it frees the page
> directly even though the page pool page is still in use, leading to
> use-after-free or double-free errors. So it should be prohibitted.

Nice catch.

> 
> The double-free error message below prompted us to investigate:
> BUG: Bad page state in process swapper/1  pfn:0e0d1
> page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000
> index:0x2 pfn:0xe0d1
> flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
> raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000
> raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000
> page dumped because: nonzero _refcount
> 
> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G            E      6.2.0+
> Call Trace:
>  <IRQ>
> dump_stack_lvl+0x32/0x50
> bad_page+0x69/0xf0
> free_pcp_prepare+0x260/0x2f0
> free_unref_page+0x20/0x1c0
> skb_release_data+0x10b/0x1a0
> napi_consume_skb+0x56/0x150
> net_rx_action+0xf0/0x350
> ? __napi_schedule+0x79/0x90
> __do_softirq+0xc8/0x2b1
> __irq_exit_rcu+0xb9/0xf0
> common_interrupt+0x82/0xa0
> </IRQ>
> <TASK>
> asm_common_interrupt+0x22/0x40
> RIP: 0010:default_idle+0xb/0x20

A Fixes tag is needed for the backport of the fix.
Fixes: 1effe8ca4e34 ("skbuff: fix coalescing for page_pool fragment recycling")

And target the net branch for bugfix:
 [PATCH net] skbuff: Fix a race between coalescing and releasing SKBs

Otherwise, it looks good to me:
Reviewed-by: Yunsheng Lin <linyunsheng@huawei.com>

> 
> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> ---
>  net/core/skbuff.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 050a875d09c5..9be23ece5f03 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5598,17 +5598,14 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  		return false;
>  
>  	/* In general, avoid mixing slab allocated and page_pool allocated
> -	 * pages within the same SKB. However when @to is not pp_recycle and
> -	 * @from is cloned, we can transition frag pages from page_pool to
> -	 * reference counted.
> -	 *
> -	 * On the other hand, don't allow coalescing two pp_recycle SKBs if
> -	 * @from is cloned, in case the SKB is using page_pool fragment
> -	 * references (PP_FLAG_PAGE_FRAG). Since we only take full page
> -	 * references for cloned SKBs at the moment that would result in
> -	 * inconsistent reference counts.
> +	 * pages within the same SKB. However don't allow coalescing two
> +	 * pp_recycle SKBs if @from is cloned, in case the SKB is using
> +	 * page_pool fragment references (PP_FLAG_PAGE_FRAG). Since we only
> +	 * take full page references for cloned SKBs at the moment that would
> +	 * result in inconsistent reference counts.
>  	 */
> -	if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
> +	if ((to->pp_recycle != from->pp_recycle)
> +		|| (from->pp_recycle && skb_cloned(from)))
>  		return false;
>  
>  	if (len <= skb_tailroom(to)) {
> 

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-04  7:47 [PATCH] skbuff: Fix a race between coalescing and releasing SKBs Liang Chen
  2023-04-04 11:18 ` Yunsheng Lin
@ 2023-04-04 15:51 ` Alexander H Duyck
  2023-04-05  1:21   ` Jakub Kicinski
  2023-04-11  0:26 ` Jakub Kicinski
  2 siblings, 1 reply; 17+ messages in thread
From: Alexander H Duyck @ 2023-04-04 15:51 UTC (permalink / raw)
  To: Liang Chen, ilias.apalodimas, hawk; +Cc: davem, edumazet, kuba, pabeni, netdev

On Tue, 2023-04-04 at 15:47 +0800, Liang Chen wrote:
> Commit 1effe8ca4e34 allowed coalescing to proceed with non page pool page
> and page pool page when @from is cloned, i.e.
> 
> to->pp_recycle    --> false
> from->pp_recycle  --> true
> skb_cloned(from)  --> true
> 
> However, it actually requires skb_cloned(@from) to hold true until
> coalescing finishes in this situation. If the other cloned SKB is
> released while the merging is in process, from_shinfo->nr_frags will be
> set to 0 toward the end of the function, causing the increment of frag
> page _refcount to be unexpectedly skipped resulting in inconsistent
> reference counts. Later when SKB(@to) is released, it frees the page
> directly even though the page pool page is still in use, leading to
> use-after-free or double-free errors. So it should be prohibitted.
> 
> The double-free error message below prompted us to investigate:
> BUG: Bad page state in process swapper/1  pfn:0e0d1
> page:00000000c6548b28 refcount:-1 mapcount:0 mapping:0000000000000000
> index:0x2 pfn:0xe0d1
> flags: 0xfffffc0000000(node=0|zone=1|lastcpupid=0x1fffff)
> raw: 000fffffc0000000 0000000000000000 ffffffff00000101 0000000000000000
> raw: 0000000000000002 0000000000000000 ffffffffffffffff 0000000000000000
> page dumped because: nonzero _refcount
> 
> CPU: 1 PID: 0 Comm: swapper/1 Tainted: G            E      6.2.0+
> Call Trace:
>  <IRQ>
> dump_stack_lvl+0x32/0x50
> bad_page+0x69/0xf0
> free_pcp_prepare+0x260/0x2f0
> free_unref_page+0x20/0x1c0
> skb_release_data+0x10b/0x1a0
> napi_consume_skb+0x56/0x150
> net_rx_action+0xf0/0x350
> ? __napi_schedule+0x79/0x90
> __do_softirq+0xc8/0x2b1
> __irq_exit_rcu+0xb9/0xf0
> common_interrupt+0x82/0xa0
> </IRQ>
> <TASK>
> asm_common_interrupt+0x22/0x40
> RIP: 0010:default_idle+0xb/0x20
> 
> Signed-off-by: Liang Chen <liangchen.linux@gmail.com>
> ---
>  net/core/skbuff.c | 17 +++++++----------
>  1 file changed, 7 insertions(+), 10 deletions(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 050a875d09c5..9be23ece5f03 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -5598,17 +5598,14 @@ bool skb_try_coalesce(struct sk_buff *to, struct sk_buff *from,
>  		return false;
>  
>  	/* In general, avoid mixing slab allocated and page_pool allocated
> -	 * pages within the same SKB. However when @to is not pp_recycle and
> -	 * @from is cloned, we can transition frag pages from page_pool to
> -	 * reference counted.
> -	 *
> -	 * On the other hand, don't allow coalescing two pp_recycle SKBs if
> -	 * @from is cloned, in case the SKB is using page_pool fragment
> -	 * references (PP_FLAG_PAGE_FRAG). Since we only take full page
> -	 * references for cloned SKBs at the moment that would result in
> -	 * inconsistent reference counts.
> +	 * pages within the same SKB. However don't allow coalescing two
> +	 * pp_recycle SKBs if @from is cloned, in case the SKB is using
> +	 * page_pool fragment references (PP_FLAG_PAGE_FRAG). Since we only
> +	 * take full page references for cloned SKBs at the moment that would
> +	 * result in inconsistent reference counts.
>  	 */
> -	if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
> +	if ((to->pp_recycle != from->pp_recycle)
> +		|| (from->pp_recycle && skb_cloned(from)))
>  		return false;
>  
>  	if (len <= skb_tailroom(to)) {

I'm not quite sure I agree with the fix. Couldn't we just modify the
check further down that does:

        if (!skb_cloned(from))
                from_shinfo->nr_frags = 0;

And instead just make that:
	if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
                from_shinfo->nr_frags = 0;

With that we would retain the existing behavior and in the case of
cloned from frames we would take the references and let the original
from skb freed to take care of pulling the pages from the page pool.



^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-04 15:51 ` Alexander H Duyck
@ 2023-04-05  1:21   ` Jakub Kicinski
  2023-04-05  8:18     ` Liang Chen
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2023-04-05  1:21 UTC (permalink / raw)
  To: Alexander H Duyck
  Cc: Liang Chen, ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> I'm not quite sure I agree with the fix. Couldn't we just modify the
> check further down that does:
> 
>         if (!skb_cloned(from))
>                 from_shinfo->nr_frags = 0;
> 
> And instead just make that:
> 	if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
>                 from_shinfo->nr_frags = 0;
> 
> With that we would retain the existing behavior and in the case of
> cloned from frames we would take the references and let the original
> from skb freed to take care of pulling the pages from the page pool.

Sounds like a better fix, indeed. But this sort of code will require
another fat comment above to explain why. This:

	if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))

is much easier to understand, no?

We should at least include that in the explanatory comment, I reckon...

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-05  1:21   ` Jakub Kicinski
@ 2023-04-05  8:18     ` Liang Chen
  2023-04-05 14:50       ` Jakub Kicinski
  2023-04-05 15:06       ` Alexander Duyck
  0 siblings, 2 replies; 17+ messages in thread
From: Liang Chen @ 2023-04-05  8:18 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Alexander H Duyck, ilias.apalodimas, hawk, davem, edumazet,
	pabeni, netdev

On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > check further down that does:
> >
> >         if (!skb_cloned(from))
> >                 from_shinfo->nr_frags = 0;
> >
> > And instead just make that:
> >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> >                 from_shinfo->nr_frags = 0;
> >
> > With that we would retain the existing behavior and in the case of
> > cloned from frames we would take the references and let the original
> > from skb freed to take care of pulling the pages from the page pool.
>
> Sounds like a better fix, indeed. But this sort of code will require
> another fat comment above to explain why. This:
>
>         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
>
> is much easier to understand, no?
>
> We should at least include that in the explanatory comment, I reckon...

Sure, the idea of dealing with the case where @from transitioned into non cloned
skb in the function retains the existing behavior, and gives more
opportunities to
coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
here.

I will take a closer look at the code path for the fragstolen case
before making v2
patch  -  If @from transitioned into non cloned skb before "if
(skb_head_is_locked(from))"

Thanks for the reviews.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-05  8:18     ` Liang Chen
@ 2023-04-05 14:50       ` Jakub Kicinski
  2023-04-06  3:22         ` Liang Chen
  2023-04-05 15:06       ` Alexander Duyck
  1 sibling, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2023-04-05 14:50 UTC (permalink / raw)
  To: Liang Chen
  Cc: Alexander H Duyck, ilias.apalodimas, hawk, davem, edumazet,
	pabeni, netdev

On Wed, 5 Apr 2023 16:18:47 +0800 Liang Chen wrote:
> > Sounds like a better fix, indeed. But this sort of code will require
> > another fat comment above to explain why. This:
> >
> >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> >
> > is much easier to understand, no?
> >
> > We should at least include that in the explanatory comment, I reckon...  
> 
> Sure, the idea of dealing with the case where @from transitioned into non cloned
> skb in the function retains the existing behavior, and gives more
> opportunities to
> coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> here.

Well, that's pretty much what Alex suggested minus the optimization he
put in for "was never cloned" which is probably worth having. So if
you're gonna do this just use his code.

My point was that !from->pp_recycle requires the reader to understand
the relationship between this check and the previous condition at entry.
While to->pp_recycle == from->pp_recycle seems much more obvious to me -
directly shifting frags between skbs with different refcount styles is
dangerous.

Maybe it's just me, so whatever.
Make sure you write a good comment.

> I will take a closer look at the code path for the fragstolen case
> before making v2
> patch  -  If @from transitioned into non cloned skb before "if
> (skb_head_is_locked(from))"
> 
> Thanks for the reviews.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-05  8:18     ` Liang Chen
  2023-04-05 14:50       ` Jakub Kicinski
@ 2023-04-05 15:06       ` Alexander Duyck
  2023-04-06  3:28         ` Liang Chen
  1 sibling, 1 reply; 17+ messages in thread
From: Alexander Duyck @ 2023-04-05 15:06 UTC (permalink / raw)
  To: Liang Chen
  Cc: Jakub Kicinski, ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

On Wed, Apr 5, 2023 at 1:19 AM Liang Chen <liangchen.linux@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
> >
> > On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > > check further down that does:
> > >
> > >         if (!skb_cloned(from))
> > >                 from_shinfo->nr_frags = 0;
> > >
> > > And instead just make that:
> > >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> > >                 from_shinfo->nr_frags = 0;
> > >
> > > With that we would retain the existing behavior and in the case of
> > > cloned from frames we would take the references and let the original
> > > from skb freed to take care of pulling the pages from the page pool.
> >
> > Sounds like a better fix, indeed. But this sort of code will require
> > another fat comment above to explain why. This:
> >
> >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> >
> > is much easier to understand, no?
> >
> > We should at least include that in the explanatory comment, I reckon...
>
> Sure, the idea of dealing with the case where @from transitioned into non cloned
> skb in the function retains the existing behavior, and gives more
> opportunities to
> coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> here.
> I will take a closer look at the code path for the fragstolen case
> before making v2
> patch  -  If @from transitioned into non cloned skb before "if
> (skb_head_is_locked(from))"
>
> Thanks for the reviews.

Actually I am not sure that works now that I look at it closer. The
problem with using (!skb_cloned(from) && !from->pp_recycle) is that it
breaks the case where both from and to are pp_recycle without being
cloned.

So it probably needs to be something actually the setup Jakub
suggested would probably work better:
  if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))

Basically we just have to guarantee that if we are copying frags over
the frag destructor is the same for the origin and destination.
Otherwise we can take a reference and convert them to being reference
counted.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-05 14:50       ` Jakub Kicinski
@ 2023-04-06  3:22         ` Liang Chen
  0 siblings, 0 replies; 17+ messages in thread
From: Liang Chen @ 2023-04-06  3:22 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Alexander H Duyck, ilias.apalodimas, hawk, davem, edumazet,
	pabeni, netdev

On Wed, Apr 5, 2023 at 10:50 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Wed, 5 Apr 2023 16:18:47 +0800 Liang Chen wrote:
> > > Sounds like a better fix, indeed. But this sort of code will require
> > > another fat comment above to explain why. This:
> > >
> > >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > >
> > > is much easier to understand, no?
> > >
> > > We should at least include that in the explanatory comment, I reckon...
> >
> > Sure, the idea of dealing with the case where @from transitioned into non cloned
> > skb in the function retains the existing behavior, and gives more
> > opportunities to
> > coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> > here.
>
> Well, that's pretty much what Alex suggested minus the optimization he
> put in for "was never cloned" which is probably worth having. So if
> you're gonna do this just use his code.
>
> My point was that !from->pp_recycle requires the reader to understand
> the relationship between this check and the previous condition at entry.
> While to->pp_recycle == from->pp_recycle seems much more obvious to me -
> directly shifting frags between skbs with different refcount styles is
> dangerous.
>

Yeah, I agree with the point that to->pp_recycle == from->pp_recycle
is easier to understand, and will use it in the next iteration of the
patch. Thanks!

> Maybe it's just me, so whatever.
> Make sure you write a good comment.

Sure.
>
> > I will take a closer look at the code path for the fragstolen case
> > before making v2
> > patch  -  If @from transitioned into non cloned skb before "if
> > (skb_head_is_locked(from))"
> >

I took a closer look at the code path for the "fragstolen" case, and
it indeed requires a special handle for the situation addressed in
this patch. Something like,

    if( to->pp_recycle != from->pp_recycle )
        get_page(page);

before  "*fragstolen = true;".   But this makes the logic a bit
complicated. Anyway, I will include the logic in the v2 patch. Let's
see if it looks better.

> > Thanks for the reviews.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-05 15:06       ` Alexander Duyck
@ 2023-04-06  3:28         ` Liang Chen
  2023-04-06  9:56           ` Eric Dumazet
  0 siblings, 1 reply; 17+ messages in thread
From: Liang Chen @ 2023-04-06  3:28 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Jakub Kicinski, ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

On Wed, Apr 5, 2023 at 11:06 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 1:19 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > > > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > > > check further down that does:
> > > >
> > > >         if (!skb_cloned(from))
> > > >                 from_shinfo->nr_frags = 0;
> > > >
> > > > And instead just make that:
> > > >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> > > >                 from_shinfo->nr_frags = 0;
> > > >
> > > > With that we would retain the existing behavior and in the case of
> > > > cloned from frames we would take the references and let the original
> > > > from skb freed to take care of pulling the pages from the page pool.
> > >
> > > Sounds like a better fix, indeed. But this sort of code will require
> > > another fat comment above to explain why. This:
> > >
> > >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > >
> > > is much easier to understand, no?
> > >
> > > We should at least include that in the explanatory comment, I reckon...
> >
> > Sure, the idea of dealing with the case where @from transitioned into non cloned
> > skb in the function retains the existing behavior, and gives more
> > opportunities to
> > coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> > here.
> > I will take a closer look at the code path for the fragstolen case
> > before making v2
> > patch  -  If @from transitioned into non cloned skb before "if
> > (skb_head_is_locked(from))"
> >
> > Thanks for the reviews.
>
> Actually I am not sure that works now that I look at it closer. The
> problem with using (!skb_cloned(from) && !from->pp_recycle) is that it
> breaks the case where both from and to are pp_recycle without being
> cloned.

Yeah, it would break that case. Thanks!
>
> So it probably needs to be something actually the setup Jakub
> suggested would probably work better:
>   if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
>

I agree. That's better.
> Basically we just have to guarantee that if we are copying frags over
> the frag destructor is the same for the origin and destination.
> Otherwise we can take a reference and convert them to being reference
> counted.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-06  3:28         ` Liang Chen
@ 2023-04-06  9:56           ` Eric Dumazet
  2023-04-06 10:46             ` Ilias Apalodimas
  0 siblings, 1 reply; 17+ messages in thread
From: Eric Dumazet @ 2023-04-06  9:56 UTC (permalink / raw)
  To: Liang Chen
  Cc: Alexander Duyck, Jakub Kicinski, ilias.apalodimas, hawk, davem,
	pabeni, netdev

On Thu, Apr 6, 2023 at 5:28 AM Liang Chen <liangchen.linux@gmail.com> wrote:
>
> On Wed, Apr 5, 2023 at 11:06 PM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 1:19 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > >
> > > > On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > > > > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > > > > check further down that does:
> > > > >
> > > > >         if (!skb_cloned(from))
> > > > >                 from_shinfo->nr_frags = 0;
> > > > >
> > > > > And instead just make that:
> > > > >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> > > > >                 from_shinfo->nr_frags = 0;
> > > > >
> > > > > With that we would retain the existing behavior and in the case of
> > > > > cloned from frames we would take the references and let the original
> > > > > from skb freed to take care of pulling the pages from the page pool.
> > > >
> > > > Sounds like a better fix, indeed. But this sort of code will require
> > > > another fat comment above to explain why. This:
> > > >
> > > >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > > >
> > > > is much easier to understand, no?
> > > >
> > > > We should at least include that in the explanatory comment, I reckon...
> > >
> > > Sure, the idea of dealing with the case where @from transitioned into non cloned
> > > skb in the function retains the existing behavior, and gives more
> > > opportunities to
> > > coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> > > here.
> > > I will take a closer look at the code path for the fragstolen case
> > > before making v2
> > > patch  -  If @from transitioned into non cloned skb before "if
> > > (skb_head_is_locked(from))"
> > >
> > > Thanks for the reviews.
> >
> > Actually I am not sure that works now that I look at it closer. The
> > problem with using (!skb_cloned(from) && !from->pp_recycle) is that it
> > breaks the case where both from and to are pp_recycle without being
> > cloned.
>
> Yeah, it would break that case. Thanks!
> >
> > So it probably needs to be something actually the setup Jakub
> > suggested would probably work better:
> >   if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> >
>
> I agree. That's better.

Same feeling on my side.
I prefer not trying to merge mixed pp_recycle skbs "just because we
could" at the expense
of adding more code in a fast path.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-06  9:56           ` Eric Dumazet
@ 2023-04-06 10:46             ` Ilias Apalodimas
  2023-04-06 11:54               ` Liang Chen
  0 siblings, 1 reply; 17+ messages in thread
From: Ilias Apalodimas @ 2023-04-06 10:46 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Liang Chen, Alexander Duyck, Jakub Kicinski, hawk, davem, pabeni, netdev

On Thu, 6 Apr 2023 at 12:56, Eric Dumazet <edumazet@google.com> wrote:
>
> On Thu, Apr 6, 2023 at 5:28 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> >
> > On Wed, Apr 5, 2023 at 11:06 PM Alexander Duyck
> > <alexander.duyck@gmail.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 1:19 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > > >
> > > > > On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > > > > > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > > > > > check further down that does:
> > > > > >
> > > > > >         if (!skb_cloned(from))
> > > > > >                 from_shinfo->nr_frags = 0;
> > > > > >
> > > > > > And instead just make that:
> > > > > >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> > > > > >                 from_shinfo->nr_frags = 0;
> > > > > >
> > > > > > With that we would retain the existing behavior and in the case of
> > > > > > cloned from frames we would take the references and let the original
> > > > > > from skb freed to take care of pulling the pages from the page pool.
> > > > >
> > > > > Sounds like a better fix, indeed. But this sort of code will require
> > > > > another fat comment above to explain why. This:
> > > > >
> > > > >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > > > >
> > > > > is much easier to understand, no?
> > > > >
> > > > > We should at least include that in the explanatory comment, I reckon...
> > > >
> > > > Sure, the idea of dealing with the case where @from transitioned into non cloned
> > > > skb in the function retains the existing behavior, and gives more
> > > > opportunities to
> > > > coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> > > > here.
> > > > I will take a closer look at the code path for the fragstolen case
> > > > before making v2
> > > > patch  -  If @from transitioned into non cloned skb before "if
> > > > (skb_head_is_locked(from))"
> > > >
> > > > Thanks for the reviews.
> > >
> > > Actually I am not sure that works now that I look at it closer. The
> > > problem with using (!skb_cloned(from) && !from->pp_recycle) is that it
> > > breaks the case where both from and to are pp_recycle without being
> > > cloned.
> >
> > Yeah, it would break that case. Thanks!
> > >
> > > So it probably needs to be something actually the setup Jakub
> > > suggested would probably work better:
> > >   if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > >
> >
> > I agree. That's better.
>
> Same feeling on my side.
> I prefer not trying to merge mixed pp_recycle skbs "just because we
> could" at the expense
> of adding more code in a fast path.

+1 here.  The intention of recycling was to affect the normal path as
less as possible.  On top of that, we've some amount of race
conditions over the years, trying to squeeze more performance with
similar tricks.  I'd much rather be safe here, since recycling by
itself is a great performance boost

Regards
/Ilias

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-06 10:46             ` Ilias Apalodimas
@ 2023-04-06 11:54               ` Liang Chen
  2023-04-06 14:59                 ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Liang Chen @ 2023-04-06 11:54 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Eric Dumazet, Alexander Duyck, Jakub Kicinski, hawk, davem,
	pabeni, netdev

On Thu, Apr 6, 2023 at 6:46 PM Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> On Thu, 6 Apr 2023 at 12:56, Eric Dumazet <edumazet@google.com> wrote:
> >
> > On Thu, Apr 6, 2023 at 5:28 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> > >
> > > On Wed, Apr 5, 2023 at 11:06 PM Alexander Duyck
> > > <alexander.duyck@gmail.com> wrote:
> > > >
> > > > On Wed, Apr 5, 2023 at 1:19 AM Liang Chen <liangchen.linux@gmail.com> wrote:
> > > > >
> > > > > On Wed, Apr 5, 2023 at 9:21 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > > > > >
> > > > > > On Tue, 04 Apr 2023 08:51:18 -0700 Alexander H Duyck wrote:
> > > > > > > I'm not quite sure I agree with the fix. Couldn't we just modify the
> > > > > > > check further down that does:
> > > > > > >
> > > > > > >         if (!skb_cloned(from))
> > > > > > >                 from_shinfo->nr_frags = 0;
> > > > > > >
> > > > > > > And instead just make that:
> > > > > > >       if (!skb->cloned || (!skb_cloned(from) && !from->pp_recycle))
> > > > > > >                 from_shinfo->nr_frags = 0;
> > > > > > >
> > > > > > > With that we would retain the existing behavior and in the case of
> > > > > > > cloned from frames we would take the references and let the original
> > > > > > > from skb freed to take care of pulling the pages from the page pool.
> > > > > >
> > > > > > Sounds like a better fix, indeed. But this sort of code will require
> > > > > > another fat comment above to explain why. This:
> > > > > >
> > > > > >         if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > > > > >
> > > > > > is much easier to understand, no?
> > > > > >
> > > > > > We should at least include that in the explanatory comment, I reckon...
> > > > >
> > > > > Sure, the idea of dealing with the case where @from transitioned into non cloned
> > > > > skb in the function retains the existing behavior, and gives more
> > > > > opportunities to
> > > > > coalesce skbs. And it seems (!skb_cloned(from) && !from->pp_recycle) is enough
> > > > > here.
> > > > > I will take a closer look at the code path for the fragstolen case
> > > > > before making v2
> > > > > patch  -  If @from transitioned into non cloned skb before "if
> > > > > (skb_head_is_locked(from))"
> > > > >
> > > > > Thanks for the reviews.
> > > >
> > > > Actually I am not sure that works now that I look at it closer. The
> > > > problem with using (!skb_cloned(from) && !from->pp_recycle) is that it
> > > > breaks the case where both from and to are pp_recycle without being
> > > > cloned.
> > >
> > > Yeah, it would break that case. Thanks!
> > > >
> > > > So it probably needs to be something actually the setup Jakub
> > > > suggested would probably work better:
> > > >   if (to->pp_recycle == from->pp_recycle && !skb_cloned(from))
> > > >
> > >
> > > I agree. That's better.
> >
> > Same feeling on my side.
> > I prefer not trying to merge mixed pp_recycle skbs "just because we
> > could" at the expense
> > of adding more code in a fast path.
>
> +1 here.  The intention of recycling was to affect the normal path as
> less as possible.  On top of that, we've some amount of race
> conditions over the years, trying to squeeze more performance with
> similar tricks.  I'd much rather be safe here, since recycling by
> itself is a great performance boost
>
> Regards
> /Ilias

Sorry, I didn't check my inbox before sending out the v2 patch.

Yeah, It is a bit complicated as we expected. The patch is sent out.
Please take a look to see if it is the way to go, or We should stay
with the current patch for simplicity reasons. Thanks!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-06 11:54               ` Liang Chen
@ 2023-04-06 14:59                 ` Jakub Kicinski
  2023-04-07  0:45                   ` Liang Chen
  0 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2023-04-06 14:59 UTC (permalink / raw)
  To: Liang Chen
  Cc: Ilias Apalodimas, Eric Dumazet, Alexander Duyck, hawk, davem,
	pabeni, netdev

On Thu, 6 Apr 2023 19:54:23 +0800 Liang Chen wrote:
> > > Same feeling on my side.
> > > I prefer not trying to merge mixed pp_recycle skbs "just because we
> > > could" at the expense
> > > of adding more code in a fast path.  
> >
> > +1 here.  The intention of recycling was to affect the normal path as
> > less as possible.  On top of that, we've some amount of race
> > conditions over the years, trying to squeeze more performance with
> > similar tricks.  I'd much rather be safe here, since recycling by
> > itself is a great performance boost
> 
> Sorry, I didn't check my inbox before sending out the v2 patch.

I can discard v2 from patchwork, let's continue the conversation
here.

> Yeah, It is a bit complicated as we expected. The patch is sent out.
> Please take a look to see if it is the way to go, or We should stay
> with the current patch for simplicity reasons. Thanks!

Sounds like you know what Eric and Ilias agreed with, I'm a bit
confused.. are we basically going back to v1? (hopefully with coding
style fixed)

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-06 14:59                 ` Jakub Kicinski
@ 2023-04-07  0:45                   ` Liang Chen
  0 siblings, 0 replies; 17+ messages in thread
From: Liang Chen @ 2023-04-07  0:45 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Ilias Apalodimas, Eric Dumazet, Alexander Duyck, hawk, davem,
	pabeni, netdev

On Thu, Apr 6, 2023 at 10:59 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 6 Apr 2023 19:54:23 +0800 Liang Chen wrote:
> > > > Same feeling on my side.
> > > > I prefer not trying to merge mixed pp_recycle skbs "just because we
> > > > could" at the expense
> > > > of adding more code in a fast path.
> > >
> > > +1 here.  The intention of recycling was to affect the normal path as
> > > less as possible.  On top of that, we've some amount of race
> > > conditions over the years, trying to squeeze more performance with
> > > similar tricks.  I'd much rather be safe here, since recycling by
> > > itself is a great performance boost
> >
> > Sorry, I didn't check my inbox before sending out the v2 patch.
>
> I can discard v2 from patchwork, let's continue the conversation
> here.
>
> > Yeah, It is a bit complicated as we expected. The patch is sent out.
> > Please take a look to see if it is the way to go, or We should stay
> > with the current patch for simplicity reasons. Thanks!
>
> Sounds like you know what Eric and Ilias agreed with, I'm a bit
> confused.. are we basically going back to v1? (hopefully with coding
> style fixed)

Sorry for making the confusion. I just felt the v2 patch complicated
the code a bit.  I am not quite sure if the performance gain justifies
the added complexity, and would really appreciate it if you could make
the decision on which patch to pick up. Thanks a lot!

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-04  7:47 [PATCH] skbuff: Fix a race between coalescing and releasing SKBs Liang Chen
  2023-04-04 11:18 ` Yunsheng Lin
  2023-04-04 15:51 ` Alexander H Duyck
@ 2023-04-11  0:26 ` Jakub Kicinski
  2023-04-12  7:27   ` Liang Chen
  2 siblings, 1 reply; 17+ messages in thread
From: Jakub Kicinski @ 2023-04-11  0:26 UTC (permalink / raw)
  To: Liang Chen; +Cc: ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

On Tue,  4 Apr 2023 15:47:33 +0800 Liang Chen wrote:
> -	if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
> +	if ((to->pp_recycle != from->pp_recycle)
> +		|| (from->pp_recycle && skb_cloned(from)))
>  		return false;

It should be formatted like this:

	if (to->pp_recycle != from->pp_recycle ||
	    (from->pp_recycle && skb_cloned(from)))

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-11  0:26 ` Jakub Kicinski
@ 2023-04-12  7:27   ` Liang Chen
  2023-04-12 13:58     ` Jakub Kicinski
  0 siblings, 1 reply; 17+ messages in thread
From: Liang Chen @ 2023-04-12  7:27 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

Sure. I have addressed it and submitted the updated patch for your
review as v3. Thank you for pointing it out.

Thanks,
Liang

On Tue, Apr 11, 2023 at 8:26 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Tue,  4 Apr 2023 15:47:33 +0800 Liang Chen wrote:
> > -     if (to->pp_recycle != (from->pp_recycle && !skb_cloned(from)))
> > +     if ((to->pp_recycle != from->pp_recycle)
> > +             || (from->pp_recycle && skb_cloned(from)))
> >               return false;
>
> It should be formatted like this:
>
>         if (to->pp_recycle != from->pp_recycle ||
>             (from->pp_recycle && skb_cloned(from)))

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH] skbuff: Fix a race between coalescing and releasing SKBs
  2023-04-12  7:27   ` Liang Chen
@ 2023-04-12 13:58     ` Jakub Kicinski
  0 siblings, 0 replies; 17+ messages in thread
From: Jakub Kicinski @ 2023-04-12 13:58 UTC (permalink / raw)
  To: Liang Chen; +Cc: ilias.apalodimas, hawk, davem, edumazet, pabeni, netdev

On Wed, 12 Apr 2023 15:27:13 +0800 Liang Chen wrote:
> Sure. I have addressed it and submitted the updated patch for your
> review as v3. Thank you for pointing it out.

I know, I don't understand why you have to send this note, I can see
the patch. And it's still in patchwork which as you know (as I trust
you read our process documentation) means it's going to be reviewed.
The judeo-christiano-muslim world is going thru its spring celebrations
so the reviews will be slower than usual. 

Please be patient.

And please don't top post.

^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2023-04-12 13:58 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-04  7:47 [PATCH] skbuff: Fix a race between coalescing and releasing SKBs Liang Chen
2023-04-04 11:18 ` Yunsheng Lin
2023-04-04 15:51 ` Alexander H Duyck
2023-04-05  1:21   ` Jakub Kicinski
2023-04-05  8:18     ` Liang Chen
2023-04-05 14:50       ` Jakub Kicinski
2023-04-06  3:22         ` Liang Chen
2023-04-05 15:06       ` Alexander Duyck
2023-04-06  3:28         ` Liang Chen
2023-04-06  9:56           ` Eric Dumazet
2023-04-06 10:46             ` Ilias Apalodimas
2023-04-06 11:54               ` Liang Chen
2023-04-06 14:59                 ` Jakub Kicinski
2023-04-07  0:45                   ` Liang Chen
2023-04-11  0:26 ` Jakub Kicinski
2023-04-12  7:27   ` Liang Chen
2023-04-12 13:58     ` Jakub Kicinski

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).