netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg
@ 2020-08-13 11:58 Miaohe Lin
  2020-08-13 12:34 ` Willem de Bruijn
  0 siblings, 1 reply; 5+ messages in thread
From: Miaohe Lin @ 2020-08-13 11:58 UTC (permalink / raw)
  To: davem, kuznet, yoshfuji, kuba, willemb; +Cc: netdev, linux-kernel, linmiaohe

The var extra_uref is introduced to pass the initial reference taken in
sock_zerocopy_alloc to the first generated skb. But now we may fail to pass
the initial reference with newly allocated UDP or RAW uarg when the skb is
zcopied.

If the skb is zcopied, we always set extra_uref to false. This is fine with
reallocted uarg because no extra ref is taken by UDP and RAW zerocopy. But
if uarg is newly allocated via sock_zerocopy_alloc(), we lost the initial
reference because extra_uref is false and we missed to pass it to the first
generated skb.

To fix this, we should set extra_uref to true if UDP or RAW uarg is newly
allocated when the skb is zcopied.

Fixes: 522924b58308 ("net: correct udp zerocopy refcnt also when zerocopy only on append")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 net/ipv4/ip_output.c  | 4 +++-
 net/ipv6/ip6_output.c | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 61f802d5350c..78d3b5d48617 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1019,7 +1019,9 @@ static int __ip_append_data(struct sock *sk,
 		uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
+		/* Only ref on newly allocated uarg. */
+		if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
+			extra_uref = true;
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index c78e67d7747f..0f82923239a9 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1476,7 +1476,9 @@ static int __ip6_append_data(struct sock *sk,
 		uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
 		if (!uarg)
 			return -ENOBUFS;
-		extra_uref = !skb_zcopy(skb);	/* only ref on new uarg */
+		/* Only ref on newly allocated uarg. */
+		if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
+			extra_uref = true;
 		if (rt->dst.dev->features & NETIF_F_SG &&
 		    csummode == CHECKSUM_PARTIAL) {
 			paged = true;
-- 
2.19.1


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg
  2020-08-13 11:58 [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg Miaohe Lin
@ 2020-08-13 12:34 ` Willem de Bruijn
  0 siblings, 0 replies; 5+ messages in thread
From: Willem de Bruijn @ 2020-08-13 12:34 UTC (permalink / raw)
  To: Miaohe Lin
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, Network Development, linux-kernel

On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
>
> The var extra_uref is introduced to pass the initial reference taken in
> sock_zerocopy_alloc to the first generated skb. But now we may fail to pass
> the initial reference with newly allocated UDP or RAW uarg when the skb is
> zcopied.
>
> If the skb is zcopied, we always set extra_uref to false. This is fine with
> reallocted uarg because no extra ref is taken by UDP and RAW zerocopy. But
> if uarg is newly allocated via sock_zerocopy_alloc(), we lost the initial
> reference because extra_uref is false and we missed to pass it to the first
> generated skb.
>
> To fix this, we should set extra_uref to true if UDP or RAW uarg is newly
> allocated when the skb is zcopied.

extra_uref is true if there is no previous skb to append to or there
is a previous skb, but that does not have zerocopy data associated yet
(because the previous call(s) did not set MSG_ZEROCOPY).

In other words, when first (allocating and) associating a zerocopy
struct with the skb.



> Fixes: 522924b58308 ("net: correct udp zerocopy refcnt also when zerocopy only on append")
> Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
> ---
>  net/ipv4/ip_output.c  | 4 +++-
>  net/ipv6/ip6_output.c | 4 +++-
>  2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
> index 61f802d5350c..78d3b5d48617 100644
> --- a/net/ipv4/ip_output.c
> +++ b/net/ipv4/ip_output.c
> @@ -1019,7 +1019,9 @@ static int __ip_append_data(struct sock *sk,
>                 uarg = sock_zerocopy_realloc(sk, length, skb_zcopy(skb));
>                 if (!uarg)
>                         return -ENOBUFS;
> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
> +               /* Only ref on newly allocated uarg. */
> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
> +                       extra_uref = true;

SOCK_STREAM does not use __ip_append_data.

This leaves as new branch skb_zcopy(skb) && skb_zcopy(skb) != uarg.

This function can only acquire a uarg through sock_zerocopy_realloc,
which on skb_zcopy(skb) only returns the existing uarg or NULL (for
not SOCK_STREAM).

So I don't see when that condition can happen.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg
@ 2020-08-15  2:02 linmiaohe
  0 siblings, 0 replies; 5+ messages in thread
From: linmiaohe @ 2020-08-15  2:02 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, Network Development, linux-kernel

Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>On Fri, Aug 14, 2020 at 10:17 AM linmiaohe <linmiaohe@huawei.com> wrote:
>>
>
>I don't think that can happen.
>
>The question is when this branch is false
>
>                next = (u32)atomic_read(&sk->sk_zckey);
>                if ((u32)(uarg->id + uarg->len) == next) {
>
>I cannot come up with a case. I think it might be vestigial. The goal is to ensure to append only a consecutive range of notification IDs.
>Each notification ID corresponds to a sendmsg invocation with MSG_ZEROCOPY. In both TCP and UDP with corking, data is ordered and access to changes to these fields happen together as a transaction:
>
>                /* realloc only when socket is locked (TCP, UDP cork),
>                 * so uarg->len and sk_zckey access is serialized
>                 */

So what I concerned is just a theoretically problems. If we can always guarantee sock_zerocopy_realloc only returns the existing uarg or NULL when on skb_zcopy(skb),
bad things won't happen.

Many thanks for your detailed explaination.


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg
  2020-08-14  8:17 linmiaohe
@ 2020-08-14 15:44 ` Willem de Bruijn
  0 siblings, 0 replies; 5+ messages in thread
From: Willem de Bruijn @ 2020-08-14 15:44 UTC (permalink / raw)
  To: linmiaohe
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, Network Development, linux-kernel

On Fri, Aug 14, 2020 at 10:17 AM linmiaohe <linmiaohe@huawei.com> wrote:
>
> Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
> >On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
> >>
> >> The var extra_uref is introduced to pass the initial reference taken
> >> in sock_zerocopy_alloc to the first generated skb. But now we may fail
> >> to pass the initial reference with newly allocated UDP or RAW uarg
> >> when the skb is zcopied.
> >
> >extra_uref is true if there is no previous skb to append to or there is a previous skb, but that does not have zerocopy data associated yet (because the previous call(s) did not set MSG_ZEROCOPY).
> >
> >In other words, when first (allocating and) associating a zerocopy struct with the skb.
>
> Many thanks for your explaination. The var extra_uref plays the role as you say. I just borrowed the description of var extra_uref from previous commit log here.
>
> >
> >> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
> >> +               /* Only ref on newly allocated uarg. */
> >> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
> >> +                       extra_uref = true;
> >
> >SOCK_STREAM does not use __ip_append_data.
> >
> >This leaves as new branch skb_zcopy(skb) && skb_zcopy(skb) != uarg.
> >
> >This function can only acquire a uarg through sock_zerocopy_realloc, which on skb_zcopy(skb) only returns the existing uarg or NULL (for not SOCK_STREAM).
> >
> >So I don't see when that condition can happen.
> >
>
> On skb_zcopy(skb), we returns the existing uarg iff (uarg->id + uarg->len == atomic_read(&sk->sk_zckey)) in sock_zerocopy_realloc. So we may get a newly allocated
> uarg via sock_zerocopy_alloc(). Though we may not trigger this codepath now, it's still a potential problem that we may missed the right trace to uarg.

I don't think that can happen.

The question is when this branch is false

                next = (u32)atomic_read(&sk->sk_zckey);
                if ((u32)(uarg->id + uarg->len) == next) {

I cannot come up with a case. I think it might be vestigial. The goal
is to ensure to append only a consecutive range of notification IDs.
Each notification ID corresponds to a sendmsg invocation with
MSG_ZEROCOPY. In both TCP and UDP with corking, data is ordered and
access to changes to these fields happen together as a transaction:

                /* realloc only when socket is locked (TCP, UDP cork),
                 * so uarg->len and sk_zckey access is serialized
                 */

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg
@ 2020-08-14  8:17 linmiaohe
  2020-08-14 15:44 ` Willem de Bruijn
  0 siblings, 1 reply; 5+ messages in thread
From: linmiaohe @ 2020-08-14  8:17 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI,
	Jakub Kicinski, Network Development, linux-kernel

Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote:
>On Thu, Aug 13, 2020 at 1:59 PM Miaohe Lin <linmiaohe@huawei.com> wrote:
>>
>> The var extra_uref is introduced to pass the initial reference taken 
>> in sock_zerocopy_alloc to the first generated skb. But now we may fail 
>> to pass the initial reference with newly allocated UDP or RAW uarg 
>> when the skb is zcopied.
>
>extra_uref is true if there is no previous skb to append to or there is a previous skb, but that does not have zerocopy data associated yet (because the previous call(s) did not set MSG_ZEROCOPY).
>
>In other words, when first (allocating and) associating a zerocopy struct with the skb.

Many thanks for your explaination. The var extra_uref plays the role as you say. I just borrowed the description of var extra_uref from previous commit log here.

>
>> -               extra_uref = !skb_zcopy(skb);   /* only ref on new uarg */
>> +               /* Only ref on newly allocated uarg. */
>> +               if (!skb_zcopy(skb) || (sk->sk_type != SOCK_STREAM && skb_zcopy(skb) != uarg))
>> +                       extra_uref = true;
>
>SOCK_STREAM does not use __ip_append_data.
>
>This leaves as new branch skb_zcopy(skb) && skb_zcopy(skb) != uarg.
>
>This function can only acquire a uarg through sock_zerocopy_realloc, which on skb_zcopy(skb) only returns the existing uarg or NULL (for not SOCK_STREAM).
>
>So I don't see when that condition can happen.
>

On skb_zcopy(skb), we returns the existing uarg iff (uarg->id + uarg->len == atomic_read(&sk->sk_zckey)) in sock_zerocopy_realloc. So we may get a newly allocated
uarg via sock_zerocopy_alloc(). Though we may not trigger this codepath now, it's still a potential problem that we may missed the right trace to uarg.

Or am I still miss anything? Thanks.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2020-08-16  2:02 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-13 11:58 [PATCH] net: correct zerocopy refcnt with newly allocated UDP or RAW uarg Miaohe Lin
2020-08-13 12:34 ` Willem de Bruijn
2020-08-14  8:17 linmiaohe
2020-08-14 15:44 ` Willem de Bruijn
2020-08-15  2:02 linmiaohe

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).