From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Brendan Jackman <jackmanb@google.com>
Cc: KP Singh <kpsingh@kernel.org>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH bpf-next] libbpf: Fix signed overflow in ringbuf_process_ring
Date: Wed, 28 Apr 2021 11:13:34 -0700
Message-ID: <CAEf4Bzb1ZNotcB44cDauAkAbs4R=UvPRKP1KWNbLg1k1jH25mA@mail.gmail.com>
In-Reply-To: <CA+i-1C2bNk0Mx_9KkuyOjvQh_y7KFrHBU-869P+8oTFq8HdVcw@mail.gmail.com>

On Wed, Apr 28, 2021 at 1:18 AM Brendan Jackman <jackmanb@google.com> wrote:
>
> On Wed, 28 Apr 2021 at 01:19, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Tue, Apr 27, 2021 at 4:05 PM KP Singh <kpsingh@kernel.org> wrote:
> > >
> > > On Tue, Apr 27, 2021 at 11:34 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Tue, Apr 27, 2021 at 10:09 AM Brendan Jackman <jackmanb@google.com> wrote:
> > > > >
> > > > > One of our benchmarks running in (Google-internal) CI pushes data
> > > > > through the ringbuf faster than userspace is able to consume
> > > > > it. In this case it seems we're actually able to get >INT_MAX entries
> > > > > > in a single ring_buffer__consume call. ASAN detected that cnt
> > > > > overflows in this case.
> > > > >
> > > > > Fix by just setting a limit on the number of entries that can be
> > > > > consumed.
> > > > >
> > > > > > Fixes: bf99c936f947 ("libbpf: Add BPF ring buffer support")
> > > > > Signed-off-by: Brendan Jackman <jackmanb@google.com>
> > > > > ---
> > > > >  tools/lib/bpf/ringbuf.c | 3 ++-
> > > > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/tools/lib/bpf/ringbuf.c b/tools/lib/bpf/ringbuf.c
> > > > > index e7a8d847161f..445a21df0934 100644
> > > > > --- a/tools/lib/bpf/ringbuf.c
> > > > > +++ b/tools/lib/bpf/ringbuf.c
> > > > > @@ -213,8 +213,8 @@ static int ringbuf_process_ring(struct ring* r)
> > > > >         do {
> > > > >                 got_new_data = false;
> > > > >                 prod_pos = smp_load_acquire(r->producer_pos);
> > > > > -               while (cons_pos < prod_pos) {
> > > > > +               /* Don't read more than INT_MAX, or the return vale won't make sense. */
> > > > > +               while (cons_pos < prod_pos && cnt < INT_MAX) {
> > > >
> > > > ring_buffer__poll() is assumed to not return until all the enqueued
> > > > messages are consumed. That's the requirement for the "adaptive"
> > > > notification scheme to work properly. So this will break that and
> > > > cause the next ring_buffer__poll() to never wake up.
>
> Ah yes, good point, thanks.
>
> > > > We could use __u64 internally and then cap it to INT_MAX on return
> > > > maybe? But honestly, this sounds like an artificial corner case, if
> > > > you are producing data faster than you can consume it and it goes
> > > > beyond INT_MAX, something is seriously broken in your application and
>
> Yes it's certainly artificial but IMO it's still highly desirable for
> libbpf to hold up its side of the bargain even when the application is
> behaving very strangely like this.

One can also argue that if an application consumed more than 2 billion
messages in one go, that's an error. ;-P But of course that is not
great.

>
> [...]
>
> > I think we have two alternatives here:
> > 1) consume all but cap return to INT_MAX
> > 2) consume all but return long long as return result
> >
> > Third alternative is to have another API with maximum number of
> > samples to consume. But then user needs to know what they are doing
> > (e.g., they do FORCE on BPF side, or they do their own epoll_wait, or
> > they do ring_buffer__poll with timeout = 0, etc).
> >
> > I'm just not sure anyone would want to understand all the
> > implications. And it's easy to miss those implications. So maybe let's
> > do long long (or __s64) return type instead?
>
> Wouldn't changing the API to 64 bit return type break existing users
> on some ABIs?
>

Yes, it might, not perfect.

> I think capping the return value to INT_MAX and adding a note to the
> function definition comment would also be fine, it doesn't feel like a
> very complex thing for the user to understand: "Returns number of
> records consumed (or INT_MAX, whichever is less)".

Yep, let's cap. But to not penalize the hot loop with extra checks,
let's use int64_t internally for counting and only cap it before the
return.
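
For illustration, a minimal sketch of that approach. This is not the
actual libbpf code: cap_count() and consume_all() are hypothetical
names, and the loop is simplified to advance one position per record
instead of by the record length.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not a libbpf function): clamp a 64-bit record
 * count to the int range that existing callers expect. */
static int cap_count(int64_t cnt)
{
	assert(cnt >= 0);
	return cnt > INT_MAX ? INT_MAX : (int)cnt;
}

/* Hypothetical stand-in for the consume loop: every pending record is
 * consumed and counted in a 64-bit counter, so the loop condition needs
 * no extra bound check. The count is clamped once, before returning. */
static int consume_all(uint64_t cons_pos, uint64_t prod_pos)
{
	int64_t cnt = 0;

	while (cons_pos < prod_pos) {
		cons_pos++;	/* assumption: real code advances by record length */
		cnt++;
	}
	return cap_count(cnt);
}
```

With this shape the loop still drains everything that was enqueued,
and the caller sees "number of records consumed (or INT_MAX, whichever
is less)".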
