linux-kernel.vger.kernel.org archive mirror
* [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
@ 2021-07-09  6:29 Ilias Apalodimas
  2021-07-09 14:34 ` Alexander Duyck
  2021-07-15  4:00 ` Yunsheng Lin
  0 siblings, 2 replies; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-09  6:29 UTC (permalink / raw)
  To: netdev
  Cc: Ilias Apalodimas, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, linux-kernel

As Alexander points out, when we are trying to recycle a cloned/expanded
SKB we might trigger a race.  The recycling code relies on the
pp_recycle bit to trigger, which we carry over to cloned SKBs.
If that cloned SKB gets expanded or we take references to the frags,
call skb_release_data() and overwrite skb->head, we end up creating
separate instances accessing the same page frags.  Since
skb_release_data() will first try to recycle the frags, there's a
potential race between the original and cloned SKB, since both will
have the pp_recycle bit set.

Fix this by explicitly marking those SKBs as not recyclable.
The atomic_sub_return effectively limits us to a single release case,
and when we are calling skb_release_data we are also releasing the
option to perform the recycling, or releasing the pages from the page pool.

Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
Reported-by: Alexander Duyck <alexanderduyck@fb.com>
Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
---
Changes since v1:
- Set the recycle bit to 0 during skb_release_data instead of the 
  individual functions triggering the issue, in order to catch all
  cases
 net/core/skbuff.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 12aabcda6db2..f91f09a824be 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
 	if (skb->cloned &&
 	    atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
 			      &shinfo->dataref))
-		return;
+		goto exit;
 
 	skb_zcopy_clear(skb, true);
 
@@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
 		kfree_skb_list(shinfo->frag_list);
 
 	skb_free_head(skb);
+exit:
+	skb->pp_recycle = 0;
 }
 
 /*
-- 
2.32.0.rc0



* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-09  6:29 [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets Ilias Apalodimas
@ 2021-07-09 14:34 ` Alexander Duyck
  2021-07-09 17:29   ` Ilias Apalodimas
  2021-07-12 11:52   ` Jesper Dangaard Brouer
  2021-07-15  4:00 ` Yunsheng Lin
  1 sibling, 2 replies; 15+ messages in thread
From: Alexander Duyck @ 2021-07-09 14:34 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Netdev, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	LKML

On Thu, Jul 8, 2021 at 11:30 PM Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> As Alexander points out, when we are trying to recycle a cloned/expanded
> SKB we might trigger a race.  The recycling code relies on the
> pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> If that cloned SKB gets expanded or if we get references to the frags,
> call skbb_release_data() and overwrite skb->head, we are creating separate
> instances accessing the same page frags.  Since the skb_release_data()
> will first try to recycle the frags,  there's a potential race between
> the original and cloned SKB, since both will have the pp_recycle bit set.
>
> Fix this by explicitly those SKBs not recyclable.
> The atomic_sub_return effectively limits us to a single release case,
> and when we are calling skb_release_data we are also releasing the
> option to perform the recycling, or releasing the pages from the page pool.
>
> Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> ---
> Changes since v1:
> - Set the recycle bit to 0 during skb_release_data instead of the
>   individual fucntions triggering the issue, in order to catch all
>   cases
>  net/core/skbuff.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 12aabcda6db2..f91f09a824be 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
>         if (skb->cloned &&
>             atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>                               &shinfo->dataref))
> -               return;
> +               goto exit;
>
>         skb_zcopy_clear(skb, true);
>
> @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
>                 kfree_skb_list(shinfo->frag_list);
>
>         skb_free_head(skb);
> +exit:
> +       skb->pp_recycle = 0;
>  }
>
>  /*
> --
> 2.32.0.rc0
>

This is probably the cleanest approach with the least amount of
change, but one thing I am concerned with in this approach is that we
end up having to dirty a cacheline that I am not sure is otherwise
touched during skb cleanup. I am not sure if that will be an issue or
not. If it is then an alternative or follow-on patch could move the
pp_recycle flag into the skb_shared_info flags itself and then make
certain that we clear it around the same time we are setting
shinfo->dataref to 1.
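
As a rough sketch of that alternative (illustrative only; SKBFL_PP_RECYCLE is
an invented flag name, not an existing skb_shared_info bit), the hint could be
cleared next to the dataref initialisation that already dirties that
cacheline:

	/* Hypothetical: recycle hint kept in skb_shared_info->flags. */
	static inline void skb_shinfo_reset_recycle(struct skb_shared_info *shinfo)
	{
		shinfo->flags &= ~SKBFL_PP_RECYCLE;	/* invented flag */
		atomic_set(&shinfo->dataref, 1);	/* same cacheline we already touch */
	}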

Otherwise this looks good to me.

Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-09 14:34 ` Alexander Duyck
@ 2021-07-09 17:29   ` Ilias Apalodimas
  2021-07-12 11:52   ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-09 17:29 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Netdev, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	LKML

On Fri, Jul 09, 2021 at 07:34:38AM -0700, Alexander Duyck wrote:
> On Thu, Jul 8, 2021 at 11:30 PM Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
> >
> > As Alexander points out, when we are trying to recycle a cloned/expanded
> > SKB we might trigger a race.  The recycling code relies on the
> > pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> > If that cloned SKB gets expanded or if we get references to the frags,
> > call skbb_release_data() and overwrite skb->head, we are creating separate
> > instances accessing the same page frags.  Since the skb_release_data()
> > will first try to recycle the frags,  there's a potential race between
> > the original and cloned SKB, since both will have the pp_recycle bit set.
> >
> > Fix this by explicitly those SKBs not recyclable.
> > The atomic_sub_return effectively limits us to a single release case,
> > and when we are calling skb_release_data we are also releasing the
> > option to perform the recycling, or releasing the pages from the page pool.
> >
> > Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> > Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> > Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > ---
> > Changes since v1:
> > - Set the recycle bit to 0 during skb_release_data instead of the
> >   individual fucntions triggering the issue, in order to catch all
> >   cases
> >  net/core/skbuff.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 12aabcda6db2..f91f09a824be 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
> >         if (skb->cloned &&
> >             atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >                               &shinfo->dataref))
> > -               return;
> > +               goto exit;
> >
> >         skb_zcopy_clear(skb, true);
> >
> > @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> >                 kfree_skb_list(shinfo->frag_list);
> >
> >         skb_free_head(skb);
> > +exit:
> > +       skb->pp_recycle = 0;
> >  }
> >
> >  /*
> > --
> > 2.32.0.rc0
> >
> 
> This is probably the cleanest approach with the least amount of
> change, but one thing I am concerned with in this approach is that we
> end up having to dirty a cacheline that I am not sure is otherwise
> touched during skb cleanup. I am not sure if that will be an issue or
> not. If it is then an alternative or follow-on patch could move the
> pp_recycle flag into the skb_shared_info flags itself and then make
> certain that we clear it around the same time we are setting
> shinfo->dataref to 1.
> 

Yep, that's a viable alternative. Let's see if there's any measurable
impact.

> Otherwise this looks good to me.
> 
> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>

Thanks Alexander!



* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-09 14:34 ` Alexander Duyck
  2021-07-09 17:29   ` Ilias Apalodimas
@ 2021-07-12 11:52   ` Jesper Dangaard Brouer
  1 sibling, 0 replies; 15+ messages in thread
From: Jesper Dangaard Brouer @ 2021-07-12 11:52 UTC (permalink / raw)
  To: Alexander Duyck, Ilias Apalodimas
  Cc: brouer, Netdev, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Matteo Croce, LKML



On 09/07/2021 16.34, Alexander Duyck wrote:
> On Thu, Jul 8, 2021 at 11:30 PM Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
>>
>> As Alexander points out, when we are trying to recycle a cloned/expanded
>> SKB we might trigger a race.  The recycling code relies on the
>> pp_recycle bit to trigger,  which we carry over to cloned SKBs.
>> If that cloned SKB gets expanded or if we get references to the frags,
>> call skbb_release_data() and overwrite skb->head, we are creating separate
>> instances accessing the same page frags.  Since the skb_release_data()
>> will first try to recycle the frags,  there's a potential race between
>> the original and cloned SKB, since both will have the pp_recycle bit set.
>>
>> Fix this by explicitly those SKBs not recyclable.
>> The atomic_sub_return effectively limits us to a single release case,
>> and when we are calling skb_release_data we are also releasing the
>> option to perform the recycling, or releasing the pages from the page pool.
>>
>> Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
>> Reported-by: Alexander Duyck <alexanderduyck@fb.com>
>> Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
>> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
>> ---
>> Changes since v1:
>> - Set the recycle bit to 0 during skb_release_data instead of the
>>    individual fucntions triggering the issue, in order to catch all
>>    cases
>>   net/core/skbuff.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>> index 12aabcda6db2..f91f09a824be 100644
>> --- a/net/core/skbuff.c
>> +++ b/net/core/skbuff.c
>> @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
>>          if (skb->cloned &&
>>              atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>>                                &shinfo->dataref))
>> -               return;
>> +               goto exit;
>>
>>          skb_zcopy_clear(skb, true);
>>
>> @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
>>                  kfree_skb_list(shinfo->frag_list);
>>
>>          skb_free_head(skb);
>> +exit:
>> +       skb->pp_recycle = 0;
>>   }
>>
>>   /*
>> --
>> 2.32.0.rc0
>>
> 
> This is probably the cleanest approach with the least amount of
> change, but one thing I am concerned with in this approach is that we
> end up having to dirty a cacheline that I am not sure is otherwise
> touched during skb cleanup. I am not sure if that will be an issue or
> not. If it is then an alternative or follow-on patch could move the
> pp_recycle flag into the skb_shared_info flags itself and then make
> certain that we clear it around the same time we are setting
> shinfo->dataref to 1.
> 

The skb->cloned and skb->pp_recycle (bitfields) are on the same
cache-line (incl. nohdr, destructor, active_extensions).  Thus, we know
this must be in the CPU's cache, regardless of this change.  I do
acknowledge that it might be in the cache-coherency "Shared" state, and
when writing skb->pp_recycle=0 the CPU *might* have to change the
cache-coherency state, but I don't expect this to be a performance
problem.
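
For reference, the relevant bits sit together in struct sk_buff roughly like
this (abridged from include/linux/skbuff.h around v5.13/v5.14; exact
neighbours vary by version and config, so double-check against the tree):

	__u8			cloned:1,
				nohdr:1,
				fclone:2,
				peeked:1,
				head_frag:1,
				pfmemalloc:1,
				pp_recycle:1; /* page_pool recycle indicator */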

> Otherwise this looks good to me.
> 
> Reviewed-by: Alexander Duyck <alexanderduyck@fb.com>
  Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

I've gone over the code path with Ilias on IRC and convinced
myself that this fix is correct, thus ACK.



* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-09  6:29 [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets Ilias Apalodimas
  2021-07-09 14:34 ` Alexander Duyck
@ 2021-07-15  4:00 ` Yunsheng Lin
  2021-07-15 10:00   ` Ilias Apalodimas
  2021-07-15 14:25   ` Alexander Duyck
  1 sibling, 2 replies; 15+ messages in thread
From: Yunsheng Lin @ 2021-07-15  4:00 UTC (permalink / raw)
  To: Ilias Apalodimas, netdev
  Cc: Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	linux-kernel

On 2021/7/9 14:29, Ilias Apalodimas wrote:
> As Alexander points out, when we are trying to recycle a cloned/expanded
> SKB we might trigger a race.  The recycling code relies on the
> pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> If that cloned SKB gets expanded or if we get references to the frags,
> call skbb_release_data() and overwrite skb->head, we are creating separate
> instances accessing the same page frags.  Since the skb_release_data()
> will first try to recycle the frags,  there's a potential race between
> the original and cloned SKB, since both will have the pp_recycle bit set.
> 
> Fix this by explicitly those SKBs not recyclable.
> The atomic_sub_return effectively limits us to a single release case,
> and when we are calling skb_release_data we are also releasing the
> option to perform the recycling, or releasing the pages from the page pool.
> 
> Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> ---
> Changes since v1:
> - Set the recycle bit to 0 during skb_release_data instead of the 
>   individual fucntions triggering the issue, in order to catch all 
>   cases
>  net/core/skbuff.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 12aabcda6db2..f91f09a824be 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
>  	if (skb->cloned &&
>  	    atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>  			      &shinfo->dataref))
> -		return;
> +		goto exit;

Is it possible this patch may break the head frag page for the original skb,
supposing its head frag page is from the page pool and the change below clears
pp_recycle for the original skb, causing a page leak for the page pool?

>  
>  	skb_zcopy_clear(skb, true);
>  
> @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
>  		kfree_skb_list(shinfo->frag_list);
>  
>  	skb_free_head(skb);
> +exit:
> +	skb->pp_recycle = 0;
>  }
>  
>  /*
> 


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15  4:00 ` Yunsheng Lin
@ 2021-07-15 10:00   ` Ilias Apalodimas
  2021-07-15 10:38     ` Ilias Apalodimas
  2021-07-15 14:25   ` Alexander Duyck
  1 sibling, 1 reply; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-15 10:00 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Networking, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	open list

On Thu, 15 Jul 2021 at 07:01, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/7/9 14:29, Ilias Apalodimas wrote:
> > As Alexander points out, when we are trying to recycle a cloned/expanded
> > SKB we might trigger a race.  The recycling code relies on the
> > pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> > If that cloned SKB gets expanded or if we get references to the frags,
> > call skbb_release_data() and overwrite skb->head, we are creating separate
> > instances accessing the same page frags.  Since the skb_release_data()
> > will first try to recycle the frags,  there's a potential race between
> > the original and cloned SKB, since both will have the pp_recycle bit set.
> >
> > Fix this by explicitly those SKBs not recyclable.
> > The atomic_sub_return effectively limits us to a single release case,
> > and when we are calling skb_release_data we are also releasing the
> > option to perform the recycling, or releasing the pages from the page pool.
> >
> > Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> > Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> > Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > ---
> > Changes since v1:
> > - Set the recycle bit to 0 during skb_release_data instead of the
> >   individual fucntions triggering the issue, in order to catch all
> >   cases
> >  net/core/skbuff.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 12aabcda6db2..f91f09a824be 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
> >       if (skb->cloned &&
> >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >                             &shinfo->dataref))
> > -             return;
> > +             goto exit;
>
> Is it possible this patch may break the head frag page for the original skb,
> supposing it's head frag page is from the page pool and below change clears
> the pp_recycle for original skb, causing a page leaking for the page pool?
>

So this would eventually leak the DMA mapping if the skb is cloned (and
the dataref is now +1) and we free the original skb first?

> >
> >       skb_zcopy_clear(skb, true);
> >
> > @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> >               kfree_skb_list(shinfo->frag_list);
> >
> >       skb_free_head(skb);
> > +exit:
> > +     skb->pp_recycle = 0;
> >  }
> >
> >  /*
> >


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 10:00   ` Ilias Apalodimas
@ 2021-07-15 10:38     ` Ilias Apalodimas
  2021-07-15 10:47       ` Yunsheng Lin
  0 siblings, 1 reply; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-15 10:38 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Networking, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	open list

On Thu, 15 Jul 2021 at 13:00, Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> On Thu, 15 Jul 2021 at 07:01, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >
> > On 2021/7/9 14:29, Ilias Apalodimas wrote:
> > > As Alexander points out, when we are trying to recycle a cloned/expanded
> > > SKB we might trigger a race.  The recycling code relies on the
> > > pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> > > If that cloned SKB gets expanded or if we get references to the frags,
> > > call skbb_release_data() and overwrite skb->head, we are creating separate
> > > instances accessing the same page frags.  Since the skb_release_data()
> > > will first try to recycle the frags,  there's a potential race between
> > > the original and cloned SKB, since both will have the pp_recycle bit set.
> > >
> > > Fix this by explicitly those SKBs not recyclable.
> > > The atomic_sub_return effectively limits us to a single release case,
> > > and when we are calling skb_release_data we are also releasing the
> > > option to perform the recycling, or releasing the pages from the page pool.
> > >
> > > Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> > > Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> > > Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> > > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > > ---
> > > Changes since v1:
> > > - Set the recycle bit to 0 during skb_release_data instead of the
> > >   individual fucntions triggering the issue, in order to catch all
> > >   cases
> > >  net/core/skbuff.c | 4 +++-
> > >  1 file changed, 3 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index 12aabcda6db2..f91f09a824be 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
> > >       if (skb->cloned &&
> > >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> > >                             &shinfo->dataref))
> > > -             return;
> > > +             goto exit;
> >
> > Is it possible this patch may break the head frag page for the original skb,
> > supposing it's head frag page is from the page pool and below change clears
> > the pp_recycle for original skb, causing a page leaking for the page pool?
> >
>
> So this would leak eventually dma mapping if the skb is cloned (and
> the dataref is now +1) and we are freeing the original skb first?
>

Apologies for the noise, my description was not complete.
The case you are thinking of is cloning an SKB and then expanding the original?

thanks
/Ilias


> > >
> > >       skb_zcopy_clear(skb, true);
> > >
> > > @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> > >               kfree_skb_list(shinfo->frag_list);
> > >
> > >       skb_free_head(skb);
> > > +exit:
> > > +     skb->pp_recycle = 0;
> > >  }
> > >
> > >  /*
> > >


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 10:38     ` Ilias Apalodimas
@ 2021-07-15 10:47       ` Yunsheng Lin
  2021-07-15 12:37         ` Ilias Apalodimas
  0 siblings, 1 reply; 15+ messages in thread
From: Yunsheng Lin @ 2021-07-15 10:47 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Networking, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	open list

On 2021/7/15 18:38, Ilias Apalodimas wrote:
> On Thu, 15 Jul 2021 at 13:00, Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
>>
>> On Thu, 15 Jul 2021 at 07:01, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>>>
>>> On 2021/7/9 14:29, Ilias Apalodimas wrote:
>>>> As Alexander points out, when we are trying to recycle a cloned/expanded
>>>> SKB we might trigger a race.  The recycling code relies on the
>>>> pp_recycle bit to trigger,  which we carry over to cloned SKBs.
>>>> If that cloned SKB gets expanded or if we get references to the frags,
>>>> call skbb_release_data() and overwrite skb->head, we are creating separate
>>>> instances accessing the same page frags.  Since the skb_release_data()
>>>> will first try to recycle the frags,  there's a potential race between
>>>> the original and cloned SKB, since both will have the pp_recycle bit set.
>>>>
>>>> Fix this by explicitly those SKBs not recyclable.
>>>> The atomic_sub_return effectively limits us to a single release case,
>>>> and when we are calling skb_release_data we are also releasing the
>>>> option to perform the recycling, or releasing the pages from the page pool.
>>>>
>>>> Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
>>>> Reported-by: Alexander Duyck <alexanderduyck@fb.com>
>>>> Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
>>>> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
>>>> ---
>>>> Changes since v1:
>>>> - Set the recycle bit to 0 during skb_release_data instead of the
>>>>   individual fucntions triggering the issue, in order to catch all
>>>>   cases
>>>>  net/core/skbuff.c | 4 +++-
>>>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
>>>> index 12aabcda6db2..f91f09a824be 100644
>>>> --- a/net/core/skbuff.c
>>>> +++ b/net/core/skbuff.c
>>>> @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
>>>>       if (skb->cloned &&
>>>>           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>>>>                             &shinfo->dataref))
>>>> -             return;
>>>> +             goto exit;
>>>
>>> Is it possible this patch may break the head frag page for the original skb,
>>> supposing it's head frag page is from the page pool and below change clears
>>> the pp_recycle for original skb, causing a page leaking for the page pool?
>>>
>>
>> So this would leak eventually dma mapping if the skb is cloned (and
>> the dataref is now +1) and we are freeing the original skb first?
>>
> 
> Apologies for the noise, my description was not complete.
> The case you are thinking is clone an SKB and then expand the original?

Yes.
It seems we might need a different pp_recycle bit for the head frag and the
data frags.

> 
> thanks
> /Ilias
> 
> 
>>>>
>>>>       skb_zcopy_clear(skb, true);
>>>>
>>>> @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
>>>>               kfree_skb_list(shinfo->frag_list);
>>>>
>>>>       skb_free_head(skb);
>>>> +exit:
>>>> +     skb->pp_recycle = 0;
>>>>  }
>>>>
>>>>  /*
>>>>
> .
> 


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 10:47       ` Yunsheng Lin
@ 2021-07-15 12:37         ` Ilias Apalodimas
  0 siblings, 0 replies; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-15 12:37 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Networking, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	open list

On Thu, 15 Jul 2021 at 13:48, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/7/15 18:38, Ilias Apalodimas wrote:
> > On Thu, 15 Jul 2021 at 13:00, Ilias Apalodimas
> > <ilias.apalodimas@linaro.org> wrote:
> >>
> >> On Thu, 15 Jul 2021 at 07:01, Yunsheng Lin <linyunsheng@huawei.com> wrote:
> >>>
> >>> On 2021/7/9 14:29, Ilias Apalodimas wrote:
> >>>> As Alexander points out, when we are trying to recycle a cloned/expanded
> >>>> SKB we might trigger a race.  The recycling code relies on the
> >>>> pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> >>>> If that cloned SKB gets expanded or if we get references to the frags,
> >>>> call skbb_release_data() and overwrite skb->head, we are creating separate
> >>>> instances accessing the same page frags.  Since the skb_release_data()
> >>>> will first try to recycle the frags,  there's a potential race between
> >>>> the original and cloned SKB, since both will have the pp_recycle bit set.
> >>>>
> >>>> Fix this by explicitly those SKBs not recyclable.
> >>>> The atomic_sub_return effectively limits us to a single release case,
> >>>> and when we are calling skb_release_data we are also releasing the
> >>>> option to perform the recycling, or releasing the pages from the page pool.
> >>>>
> >>>> Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> >>>> Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> >>>> Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> >>>> Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> >>>> ---
> >>>> Changes since v1:
> >>>> - Set the recycle bit to 0 during skb_release_data instead of the
> >>>>   individual fucntions triggering the issue, in order to catch all
> >>>>   cases
> >>>>  net/core/skbuff.c | 4 +++-
> >>>>  1 file changed, 3 insertions(+), 1 deletion(-)
> >>>>
> >>>> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> >>>> index 12aabcda6db2..f91f09a824be 100644
> >>>> --- a/net/core/skbuff.c
> >>>> +++ b/net/core/skbuff.c
> >>>> @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
> >>>>       if (skb->cloned &&
> >>>>           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >>>>                             &shinfo->dataref))
> >>>> -             return;
> >>>> +             goto exit;
> >>>
> >>> Is it possible this patch may break the head frag page for the original skb,
> >>> supposing it's head frag page is from the page pool and below change clears
> >>> the pp_recycle for original skb, causing a page leaking for the page pool?
> >>>
> >>
> >> So this would leak eventually dma mapping if the skb is cloned (and
> >> the dataref is now +1) and we are freeing the original skb first?
> >>
> >
> > Apologies for the noise, my description was not complete.
> > The case you are thinking is clone an SKB and then expand the original?
>
> Yes.
> It seems we might need different pp_recycle bit for head frag and data frag.

We could just reset the pp_recycle flag in pskb_carve_inside_header,
pskb_expand_head and pskb_carve_inside_nonlinear, which were the
three functions that might trigger the race to begin with.  The point
of adding it in skb_release_data was to have a catch-all for all
future cases ...
Let me stare at it a bit more in case I can come up with something better.
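
For that narrower version, a sketch of what e.g. pskb_expand_head() could do
right after installing the new head (illustrative only, not an actual diff):

	/* The skb now owns a private, kmalloc'ed head and its own frag
	 * references, so stop advertising the page_pool hint; otherwise
	 * this skb could race with its clone over recycling the shared
	 * page_pool pages.
	 */
	skb->pp_recycle = 0;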

Thanks
/Ilias
>
> >
> > thanks
> > /Ilias
> >
> >
> >>>>
> >>>>       skb_zcopy_clear(skb, true);
> >>>>
> >>>> @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> >>>>               kfree_skb_list(shinfo->frag_list);
> >>>>
> >>>>       skb_free_head(skb);
> >>>> +exit:
> >>>> +     skb->pp_recycle = 0;
> >>>>  }
> >>>>
> >>>>  /*
> >>>>
> > .
> >


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15  4:00 ` Yunsheng Lin
  2021-07-15 10:00   ` Ilias Apalodimas
@ 2021-07-15 14:25   ` Alexander Duyck
  2021-07-15 14:45     ` Ilias Apalodimas
  1 sibling, 1 reply; 15+ messages in thread
From: Alexander Duyck @ 2021-07-15 14:25 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Ilias Apalodimas, Netdev, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, LKML

On Wed, Jul 14, 2021 at 9:02 PM Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/7/9 14:29, Ilias Apalodimas wrote:
> > As Alexander points out, when we are trying to recycle a cloned/expanded
> > SKB we might trigger a race.  The recycling code relies on the
> > pp_recycle bit to trigger,  which we carry over to cloned SKBs.
> > If that cloned SKB gets expanded or if we get references to the frags,
> > call skbb_release_data() and overwrite skb->head, we are creating separate
> > instances accessing the same page frags.  Since the skb_release_data()
> > will first try to recycle the frags,  there's a potential race between
> > the original and cloned SKB, since both will have the pp_recycle bit set.
> >
> > Fix this by explicitly those SKBs not recyclable.
> > The atomic_sub_return effectively limits us to a single release case,
> > and when we are calling skb_release_data we are also releasing the
> > option to perform the recycling, or releasing the pages from the page pool.
> >
> > Fixes: 6a5bcd84e886 ("page_pool: Allow drivers to hint on SKB recycling")
> > Reported-by: Alexander Duyck <alexanderduyck@fb.com>
> > Suggested-by: Alexander Duyck <alexanderduyck@fb.com>
> > Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
> > ---
> > Changes since v1:
> > - Set the recycle bit to 0 during skb_release_data instead of the
> >   individual fucntions triggering the issue, in order to catch all
> >   cases
> >  net/core/skbuff.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 12aabcda6db2..f91f09a824be 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -663,7 +663,7 @@ static void skb_release_data(struct sk_buff *skb)
> >       if (skb->cloned &&
> >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >                             &shinfo->dataref))
> > -             return;
> > +             goto exit;
>
> Is it possible this patch may break the head frag page for the original skb,
> supposing it's head frag page is from the page pool and below change clears
> the pp_recycle for original skb, causing a page leaking for the page pool?

I don't see how. The assumption here is that when atomic_sub_return
gets down to 0, we will still have an skb with skb->pp_recycle set and
it will flow down and encounter skb_free_head below. All we are doing
is skipping those steps and clearing skb->pp_recycle for all but the
last buffer, and the last one to free it will trigger the recycling.

> >
> >       skb_zcopy_clear(skb, true);
> >
> > @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> >               kfree_skb_list(shinfo->frag_list);
> >
> >       skb_free_head(skb);
> > +exit:
> > +     skb->pp_recycle = 0;

Note the path here. We don't clear skb->pp_recycle for the last buffer
where "dataref == 0" until *AFTER* the head has been freed, and all
clones will have skb->pp_recycle = 1 as long as they are a clone of
the original skb that had it set.


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 14:25   ` Alexander Duyck
@ 2021-07-15 14:45     ` Ilias Apalodimas
  2021-07-15 14:57       ` Alexander Duyck
  0 siblings, 1 reply; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-15 14:45 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Yunsheng Lin, Netdev, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, LKML

> > >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,

[...]

> > >                             &shinfo->dataref))
> > > -             return;
> > > +             goto exit;
> >
> > Is it possible this patch may break the head frag page for the original skb,
> > supposing it's head frag page is from the page pool and below change clears
> > the pp_recycle for original skb, causing a page leaking for the page pool?
> 
> I don't see how. The assumption here is that when atomic_sub_return
> gets down to 0 we will still have an skb with skb->pp_recycle set and
> it will flow down and encounter skb_free_head below. All we are doing
> is skipping those steps and clearing skb->pp_recycle for all but the
> last buffer and the last one to free it will trigger the recycling.

I think the assumption here is that 
1. We clone an skb
2. The original skb goes into pskb_expand_head()
3. skb_release_data() will be called for the original skb

But with the dataref bumped, we'll skip the recycling for it, and we'll also
skip recycling or unmapping the current head (which is a page_pool-mapped
buffer).

> 
> > >
> > >       skb_zcopy_clear(skb, true);
> > >
> > > @@ -674,6 +674,8 @@ static void skb_release_data(struct sk_buff *skb)
> > >               kfree_skb_list(shinfo->frag_list);
> > >
> > >       skb_free_head(skb);
> > > +exit:
> > > +     skb->pp_recycle = 0;
> 
> Note the path here. We don't clear skb->pp_recycle for the last buffer
> where "dataref == 0" until *AFTER* the head has been freed, and all
> clones will have skb->pp_recycle = 1 as long as they are a clone of
> the original skb that had it set.


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 14:45     ` Ilias Apalodimas
@ 2021-07-15 14:57       ` Alexander Duyck
  2021-07-15 15:02         ` Ilias Apalodimas
  0 siblings, 1 reply; 15+ messages in thread
From: Alexander Duyck @ 2021-07-15 14:57 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Yunsheng Lin, Netdev, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, LKML

On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
<ilias.apalodimas@linaro.org> wrote:
>
> > > >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>
> [...]
>
> > > >                             &shinfo->dataref))
> > > > -             return;
> > > > +             goto exit;
> > >
> > > Is it possible this patch may break the head frag page for the original skb,
> > > supposing it's head frag page is from the page pool and below change clears
> > > the pp_recycle for original skb, causing a page leaking for the page pool?
> >
> > I don't see how. The assumption here is that when atomic_sub_return
> > gets down to 0 we will still have an skb with skb->pp_recycle set and
> > it will flow down and encounter skb_free_head below. All we are doing
> > is skipping those steps and clearing skb->pp_recycle for all but the
> > last buffer and the last one to free it will trigger the recycling.
>
> I think the assumption here is that
> 1. We clone an skb
> 2. The original skb goes into pskb_expand_head()
> 3. skb_release_data() will be called for the original skb
>
> But with the dataref bumped, we'll skip the recycling for it but we'll also
> skip recycling or unmapping the current head (which is a page_pool mapped
> buffer)

Right, but in that case it is the clone that is left holding the
original head, and the skb->pp_recycle flag is set on the clone since it
was copied from the original when we cloned it. What we have
essentially done is transferred the responsibility for freeing it from
the original to the clone.

If you think about it, the result is the same as if step 2 were to go
into kfree_skb. We would still be calling skb_release_data and the
dataref would be decremented without the original freeing the page. We
have to wait until all the clones are freed and dataref reaches 0
before the head can be recycled.
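
To make the ordering concrete, a minimal sketch of the case above (illustrative
only; build_pp_skb() stands in for a driver building an skb whose head comes
from a page_pool and marking it for recycling, and error handling is omitted):

	struct sk_buff *orig = build_pp_skb();			/* pp_recycle == 1 */
	struct sk_buff *clone = skb_clone(orig, GFP_ATOMIC);	/* dataref == 2 */

	pskb_expand_head(orig, 0, 0, GFP_ATOMIC);	/* orig gets a private head;
							 * skb_release_data() takes the
							 * "exit" path and clears
							 * orig->pp_recycle */

	kfree_skb(orig);	/* frees the new head and drops the extra frag
				 * refs with a plain put, no recycling */
	kfree_skb(clone);	/* dataref drops to 0; the clone still has
				 * pp_recycle set and recycles/unmaps the
				 * original page_pool head (and frags) once */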


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 14:57       ` Alexander Duyck
@ 2021-07-15 15:02         ` Ilias Apalodimas
  2021-07-16  2:30           ` Yunsheng Lin
  0 siblings, 1 reply; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-15 15:02 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: Yunsheng Lin, Netdev, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, LKML

On Thu, Jul 15, 2021 at 07:57:57AM -0700, Alexander Duyck wrote:
> On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
> <ilias.apalodimas@linaro.org> wrote:
> >
> > > > >           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >
> > [...]
> >
> > > > >                             &shinfo->dataref))
> > > > > -             return;
> > > > > +             goto exit;
> > > >
> > > > Is it possible this patch may break the head frag page for the original skb,
> > > > supposing it's head frag page is from the page pool and below change clears
> > > > the pp_recycle for original skb, causing a page leaking for the page pool?
> > >
> > > I don't see how. The assumption here is that when atomic_sub_return
> > > gets down to 0 we will still have an skb with skb->pp_recycle set and
> > > it will flow down and encounter skb_free_head below. All we are doing
> > > is skipping those steps and clearing skb->pp_recycle for all but the
> > > last buffer and the last one to free it will trigger the recycling.
> >
> > I think the assumption here is that
> > 1. We clone an skb
> > 2. The original skb goes into pskb_expand_head()
> > 3. skb_release_data() will be called for the original skb
> >
> > But with the dataref bumped, we'll skip the recycling for it but we'll also
> > skip recycling or unmapping the current head (which is a page_pool mapped
> > buffer)
> 
> Right, but in that case it is the clone that is left holding the
> original head and the skb->pp_recycle flag is set on the clone as it
> was copied from the original when we cloned it. 

Ah yes, that's what I missed

> What we have
> essentially done is transferred the responsibility for freeing it from
> the original to the clone.
> 
> If you think about it the result is the same as if step 2 was to go
> into kfree_skb. We would still be calling skb_release_data and the
> dataref would be decremented without the original freeing the page. We
> have to wait until all the clones are freed and dataref reaches 0
> before the head can be recycled.

Yep sounds correct

Thanks
/Ilias


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-15 15:02         ` Ilias Apalodimas
@ 2021-07-16  2:30           ` Yunsheng Lin
  2021-07-16  6:43             ` Ilias Apalodimas
  0 siblings, 1 reply; 15+ messages in thread
From: Yunsheng Lin @ 2021-07-16  2:30 UTC (permalink / raw)
  To: Ilias Apalodimas, Alexander Duyck
  Cc: Netdev, Alexander Duyck, David S. Miller, Jakub Kicinski,
	Alexander Lobakin, Jonathan Lemon, Willem de Bruijn, Miaohe Lin,
	Guillaume Nault, Cong Wang, Jesper Dangaard Brouer, Matteo Croce,
	LKML

On 2021/7/15 23:02, Ilias Apalodimas wrote:
> On Thu, Jul 15, 2021 at 07:57:57AM -0700, Alexander Duyck wrote:
>> On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
>> <ilias.apalodimas@linaro.org> wrote:
>>>
>>>>>>           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
>>>
>>> [...]
>>>
>>>>>>                             &shinfo->dataref))
>>>>>> -             return;
>>>>>> +             goto exit;
>>>>>
>>>>> Is it possible this patch may break the head frag page for the original skb,
>>>>> supposing it's head frag page is from the page pool and below change clears
>>>>> the pp_recycle for original skb, causing a page leaking for the page pool?
>>>>
>>>> I don't see how. The assumption here is that when atomic_sub_return
>>>> gets down to 0 we will still have an skb with skb->pp_recycle set and
>>>> it will flow down and encounter skb_free_head below. All we are doing
>>>> is skipping those steps and clearing skb->pp_recycle for all but the
>>>> last buffer and the last one to free it will trigger the recycling.
>>>
>>> I think the assumption here is that
>>> 1. We clone an skb
>>> 2. The original skb goes into pskb_expand_head()
>>> 3. skb_release_data() will be called for the original skb
>>>
>>> But with the dataref bumped, we'll skip the recycling for it but we'll also
>>> skip recycling or unmapping the current head (which is a page_pool mapped
>>> buffer)
>>
>> Right, but in that case it is the clone that is left holding the
>> original head and the skb->pp_recycle flag is set on the clone as it
>> was copied from the original when we cloned it. 
> 
> Ah yes, that's what I missed
> 
>> What we have
>> essentially done is transferred the responsibility for freeing it from
>> the original to the clone.
>>
>> If you think about it the result is the same as if step 2 was to go
>> into kfree_skb. We would still be calling skb_release_data and the
>> dataref would be decremented without the original freeing the page. We
>> have to wait until all the clones are freed and dataref reaches 0
>> before the head can be recycled.
> 
> Yep sounds correct

Ok, I suppose the fraglist skb is handled similarly to a regular skb, right?

Also, this patch might need respinning, as its state in patchwork is "Changes
Requested".

> 
> Thanks
> /Ilias
> .
> 


* Re: [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets
  2021-07-16  2:30           ` Yunsheng Lin
@ 2021-07-16  6:43             ` Ilias Apalodimas
  0 siblings, 0 replies; 15+ messages in thread
From: Ilias Apalodimas @ 2021-07-16  6:43 UTC (permalink / raw)
  To: Yunsheng Lin
  Cc: Alexander Duyck, Netdev, Alexander Duyck, David S. Miller,
	Jakub Kicinski, Alexander Lobakin, Jonathan Lemon,
	Willem de Bruijn, Miaohe Lin, Guillaume Nault, Cong Wang,
	Jesper Dangaard Brouer, Matteo Croce, LKML

On Fri, 16 Jul 2021 at 05:30, Yunsheng Lin <linyunsheng@huawei.com> wrote:
>
> On 2021/7/15 23:02, Ilias Apalodimas wrote:
> > On Thu, Jul 15, 2021 at 07:57:57AM -0700, Alexander Duyck wrote:
> >> On Thu, Jul 15, 2021 at 7:45 AM Ilias Apalodimas
> >> <ilias.apalodimas@linaro.org> wrote:
> >>>
> >>>>>>           atomic_sub_return(skb->nohdr ? (1 << SKB_DATAREF_SHIFT) + 1 : 1,
> >>>
> >>> [...]
> >>>
> >>>>>>                             &shinfo->dataref))
> >>>>>> -             return;
> >>>>>> +             goto exit;
> >>>>>
> >>>>> Is it possible this patch may break the head frag page for the original skb,
> >>>>> supposing it's head frag page is from the page pool and below change clears
> >>>>> the pp_recycle for original skb, causing a page leaking for the page pool?
> >>>>
> >>>> I don't see how. The assumption here is that when atomic_sub_return
> >>>> gets down to 0 we will still have an skb with skb->pp_recycle set and
> >>>> it will flow down and encounter skb_free_head below. All we are doing
> >>>> is skipping those steps and clearing skb->pp_recycle for all but the
> >>>> last buffer and the last one to free it will trigger the recycling.
> >>>
> >>> I think the assumption here is that
> >>> 1. We clone an skb
> >>> 2. The original skb goes into pskb_expand_head()
> >>> 3. skb_release_data() will be called for the original skb
> >>>
> >>> But with the dataref bumped, we'll skip the recycling for it but we'll also
> >>> skip recycling or unmapping the current head (which is a page_pool mapped
> >>> buffer)
> >>
> >> Right, but in that case it is the clone that is left holding the
> >> original head and the skb->pp_recycle flag is set on the clone as it
> >> was copied from the original when we cloned it.
> >
> > Ah yes, that's what I missed
> >
> >> What we have
> >> essentially done is transferred the responsibility for freeing it from
> >> the original to the clone.
> >>
> >> If you think about it the result is the same as if step 2 was to go
> >> into kfree_skb. We would still be calling skb_release_data and the
> >> dataref would be decremented without the original freeing the page. We
> >> have to wait until all the clones are freed and dataref reaches 0
> >> before the head can be recycled.
> >
> > Yep sounds correct
>
> Ok, I suppose the fraglist skb is handled similar as the regular skb, right?
>

Yes, even in the fragments case your cloned/expanded SKB will still
have the recycle bit set, so it will try to recycle them or unmap them.

> Also, this patch might need respinning as the state of this patch is "Changes
> Requested" in patchwork.

Thanks, I'll respin it and add a comment explaining why

>
> >
> > Thanks
> > /Ilias
> > .
> >


Thread overview: 15+ messages
2021-07-09  6:29 [PATCH 1/1 v2] skbuff: Fix a potential race while recycling page_pool packets Ilias Apalodimas
2021-07-09 14:34 ` Alexander Duyck
2021-07-09 17:29   ` Ilias Apalodimas
2021-07-12 11:52   ` Jesper Dangaard Brouer
2021-07-15  4:00 ` Yunsheng Lin
2021-07-15 10:00   ` Ilias Apalodimas
2021-07-15 10:38     ` Ilias Apalodimas
2021-07-15 10:47       ` Yunsheng Lin
2021-07-15 12:37         ` Ilias Apalodimas
2021-07-15 14:25   ` Alexander Duyck
2021-07-15 14:45     ` Ilias Apalodimas
2021-07-15 14:57       ` Alexander Duyck
2021-07-15 15:02         ` Ilias Apalodimas
2021-07-16  2:30           ` Yunsheng Lin
2021-07-16  6:43             ` Ilias Apalodimas
