All of lore.kernel.org
 help / color / mirror / Atom feed
From: John Fastabend <john.fastabend@gmail.com>
To: Cong Wang <xiyou.wangcong@gmail.com>,
	John Fastabend <john.fastabend@gmail.com>
Cc: Linux Kernel Network Developers <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>, Cong Wang <cong.wang@bytedance.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Jakub Sitnicki <jakub@cloudflare.com>,
	Lorenz Bauer <lmb@cloudflare.com>
Subject: Re: [Patch bpf] udp: fix a memory leak in udp_read_sock()
Date: Fri, 21 May 2021 15:09:32 -0700	[thread overview]
Message-ID: <60a82f9c96de2_1c22f2086e@john-XPS-13-9370.notmuch> (raw)
In-Reply-To: <CAM_iQpWxJrXhdxyhO6O+h1d9dz=4BBk8i-EYrVG6v8ix_0gCnQ@mail.gmail.com>

Cong Wang wrote:
> On Thu, May 20, 2021 at 10:43 AM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Cong Wang wrote:
> > > On Wed, May 19, 2021 at 2:54 PM John Fastabend <john.fastabend@gmail.com> wrote:
> > > >
> > > > Cong Wang wrote:
> > > > > On Wed, May 19, 2021 at 12:06 PM John Fastabend
> > > > > <john.fastabend@gmail.com> wrote:
> > > > > >
> > > > > > Cong Wang wrote:
> > > > > > > On Tue, May 18, 2021 at 12:56 PM John Fastabend
> > > > > > > <john.fastabend@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Cong Wang wrote:
> > > > > > > > > On Mon, May 17, 2021 at 10:36 PM John Fastabend
> > > > > > > > > <john.fastabend@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Cong Wang wrote:
> > > > > > > > > > > From: Cong Wang <cong.wang@bytedance.com>
> > > > > > > > > > >

[...]

> >
> > "We are not dropping the packet" you'll need to be more explicit on
> > how this path is not dropping the skb,
> 
> You know it is cloned, don't you? So if we clone an skb and deliver
> the clone then free the original, what is dropped here? Absolutely
> nothing.
> 
> By "drop", we clearly mean nothing is delivered. Or do you have
> any different definition of "drop"?

This is my meaning of 'drop'.

> 
> >
> >   udp_read_sock()
> >     skb = skb_recv_udp()
> >      __skb_recv_udp()
> >        __skb_try_recv_from_queue()
> >         __skb_unlink()              // skb is off the queue
> >     used = recv_actor()
> >       sk_psock_verdict_recv()
> 
> Why do you intentionally ignore the fact the skb is cloned
> and the clone is delivered??

I'm only concerned about the error case here.

> 
> >         return 0;
> >     if (used <= 0) {
> >       kfree(skb)         // skb is unlink'd and kfree'd
> 
> Why do you ignore the other kfree_skb() I added in this patch?
> Which is clearly on the non-error path. This is why I said the
> skb needs to be freed _regardless_ of error or not. You just
> keep ignoring it.

Agree it needs a kfree_skb in both cases I'm not ignoring it. My
issue with this fix is it is not complete. We need some counter
to increment the udp stats so we can learn about these drops.
And it looks like an extra two lines will get us that.

> 
> 
> >       break;
> >     return 0
> >
> > So if in the error case the skb is unlink'd from the queue and
> > kfree'd where is it still existing, how do we get it back? It
> > sure looks dropped to me. Yes, the kfree() is needed to not
> > leak it, but I'm saying we don't want to drop packets silently.
> 
> See above, you clearly ignored the other kfree_skb() which is
> on non-error path.

Ignored in this thread because its correct and looks good to me.
My only issue is with the error path.

> 
> > The convention in UDP space looks to be inc sk->sk_drop and inc
> > the MIB. When we have to debug this on deployed systems and
> > packets silently get dropped its going to cause lots of pain so
> > lets be sure we get the counters correct.
> 
> Sure, let me quote what I already said:
> " A separate patch is need to check desc->error, if really needed."

Check desc->error or even just used <= 0 is sufficient here. Yes it is
really needed I don't want to debug this in a prod environment
without it.

> 
> This patch, as its subject tells, aims to address a memory leak, not
> to address error counters. BTW, TCP does not increase error

OK either add the counters in this patch or as a series of two
patches so we get a complete fix in one series. Without it some
box out there will randomly drop UDP packets, probably DNS
packets for extra fun, and we will have no way of knowing other
than sporadic packet loss. Unless your arguing against having the
counters or that the counters don't make sense for some reason?

> counters either, yet another reason it deserves a separate patch
> to address both.

TCP case is different if we drop packets in a TCP error case
thats not a 'lets increment the counters' problem the drop needs
to be fixed we can't let data fall out of a TCP stream because
no one will retransmit it. We've learned this the hard way.

> 
> Thanks.

  reply	other threads:[~2021-05-21 22:09 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-17  2:23 [Patch bpf] udp: fix a memory leak in udp_read_sock() Cong Wang
2021-05-17 16:22 ` Song Liu
2021-05-18  5:36 ` John Fastabend
2021-05-18 16:54   ` Cong Wang
2021-05-18 19:56     ` John Fastabend
2021-05-18 21:21       ` Cong Wang
2021-05-19 19:06         ` John Fastabend
2021-05-19 20:17           ` Cong Wang
2021-05-19 21:54             ` John Fastabend
2021-05-19 23:26               ` Cong Wang
2021-05-20 17:42                 ` John Fastabend
2021-05-20 20:14                   ` Cong Wang
2021-05-21 22:09                     ` John Fastabend [this message]
2021-05-21 23:39                       ` Cong Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=60a82f9c96de2_1c22f2086e@john-XPS-13-9370.notmuch \
    --to=john.fastabend@gmail.com \
    --cc=bpf@vger.kernel.org \
    --cc=cong.wang@bytedance.com \
    --cc=daniel@iogearbox.net \
    --cc=jakub@cloudflare.com \
    --cc=lmb@cloudflare.com \
    --cc=netdev@vger.kernel.org \
    --cc=xiyou.wangcong@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.