Re: [PATCH v2 bpf-next 4/6] libbpf: don't enforce verifier log levels on libbpf side

From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Timo Beckers <timo@incline.eu>
Cc: Lorenz Bauer <lmb@isovalent.com>,
	Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@kernel.org, robin.goegge@isovalent.com,
	kernel-team@meta.com
Subject: Re: [PATCH v2 bpf-next 4/6] libbpf: don't enforce verifier log levels on libbpf side
Date: Fri, 31 Mar 2023 08:59:56 -0700	[thread overview]
Message-ID: <CAEf4BzarW-zUiP19_Xn5-1YgVkLXES4L1rqk1m5H7cqgKqMxNQ@mail.gmail.com> (raw)
In-Reply-To: <68b1465b-e970-15c3-37a6-f81a639bc71c@incline.eu>

On Fri, Mar 31, 2023 at 2:39 AM Timo Beckers <timo@incline.eu> wrote:
>
> On 3/30/23 23:05, Andrii Nakryiko wrote:
> > On Thu, Mar 30, 2023 at 10:13 AM Lorenz Bauer <lmb@isovalent.com> wrote:
> >> On Wed, Mar 29, 2023 at 12:56 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >>> This basically prevents any forward compatibility. And we either way
> >>> just return -EINVAL, which would otherwise be returned from bpf()
> >>> syscall anyways.
> >> In your cover letter you make the argument that applications can opt
> >> out of the behaviour, but I think shows that this isn't entirely true.
> >> Apps linking old libbpf won't be able to fix their breakage without
> >> updating libbpf. This is especially annoying when you have to support
> >> multiple old versions where doing this isn't straightforward.
> >>
> > Ok, technically, you are correct. If you somehow managed to get a
> > bleeding edge kernel, but outdated libbpf, you won't be able to
> > specify log_level = 8. This is not the only place where too old libbpf
> > would limit you from using bleeding edge kernel features, though, and
> > we have to live with that (though try our best to avoid such
> > dependencies, of course).
> >
> > But in practice you get the freshest libbpf way before your kernel
> > becomes the freshest one, so I don't think this is a big deal in
> > practice.
> Hey Andrii,
>

Hi Timo, thanks for chiming in!

> In an ideal world, yes, but not how it works out in practice. Anything
> that ships in a container obviously misses out on distro packages of the
> underlying host that are tied to the running kernel. This looks like it's
> quickly becoming a majority use case of bpf in the wild, barring of course
> Android and systemd.
>
> In practice, we get to deal with rather large version swings in both
> directions; users may want to run the latest tools on 4.19, or last
> year's stable tool release on 6.1, so the need for both backwards- and
> forwards-compatibility is definitely there.

Yep, that sucks. And just reminds us again and again what I've been
consistently telling anyone who wanted to listen: statically linking
against libbpf is the best strategy to avoid unnecessary headaches.
Control your destiny (i.e., dependencies).

But in general, yes, I totally agree that unnecessary changes that
require the newest libbpf should be avoided. I think that in this case
with BPF_LOG_FIXED it's justified, though, and I think you are
agreeing as well, see below.

>
> Fortunately, things don't break that often. :) The biggest papercut
> we've had recently was the introduction of enum64, which just flat
> out requires the latest userspace if kernel btf is needed.

Yep, BTF changes are most painful and are felt across the entire range
of tools, unfortunately. This is offtopic, but this is one of the
reasons we were discussing with Alan how to make BTF more
self-describing. I still think we should do this, no one had time to
take this on, yet.

> >> Take this as another plea to make this opt in and instead work
> >> together to make this a default on the lib side. :)
> > Please, help me understand the arguments against making rotating mode
> > a default, now that we return -ENOSPC on truncation. In which scenario
> > this difference matters?
> I think there may be a misunderstanding. As you mentioned, there's rarely
> anything useful at the start of the log. I think the opt-in behaviour
> under discussion here is the lack of ENOSPC when the buffer is undersized.
> Userspace should set a flag if it supports receiving a partial log without
> expecting ENOSPC.

Great, thanks for specifics. I think we can finally be on the same
page. Pretty much the only change in this v2 is exactly the -ENOSPC on
log truncation in *either mode*. See version log from cover letter:

  - return -ENOSPC even in rotating log mode for preserving backwards
    compatibility (Lorenz);

>
> I've lost track of the exact changes that are now on the table with your
> second patch set floating around. The way I understand it, there are
> multiple isolated behavioural changes, so let's discuss them separately:
>
> - Log will include the tail instead of the head. This is a good change, no
>    argument there, and personally I wouldn't bother with a flag until
>    someone complains. Old userspace is (worst case) already used to retrying
>    with bigger buffers until ENOSPC is gone, and (best case) gets a few
> pages
>    of actually useful log if it doesn't retry.

Agreed. And that's how it behaves, you don't need a flag to get this
behavior and that's exactly what I'm arguing with Lorenz to preserve.

>
> - log_size_actual: if populated by the kernel, userspace can do a perfect
>    retry with the correct buffer size. Userspace will naturally be able to
>    pick this up when their bpf_attr gets updated. No opt-in/flags needed
>    because not a breaking change.

Yes. That's what the follow up patch set is doing. No new flag, just a
new field which kernel populates. See [0]

  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=735213&state=*

>
> - No more ENOSPC: this needs to be opt-in, as old userspace relies on ENOSPC
>    to drive the retry loop. If the kernel no longer returns ENOSPC by
> default,
>    userspace will think it received a full log and won't be able to detect
>    truncation if it doesn't yet know about the log_size_actual field.
>    From what I gather, we're also in agreement on this. Idea for a flag
> name:
>    BPF_LOG_PARTIAL?

Ok, this is where the misunderstanding is. We don't have a mode where
the verifier succeeds on log truncation. And I don't even plan to add
a flag to enable this no-ENOSPC mode. V2 always returns -ENOSPC on
truncation.

The only difference is that on truncation the user gets the tail of
the log, and that's the default behavior. And BPF_LOG_FIXED is there
only for uncommon cases where the beginning of the log is needed in
case of truncation. That's it.

>
> Hope this can move the discussion forward, it looked to me like we were just
> talking past each other. Thanks for addressing our feedback so far!

Yep, I hope so too. I think this v2 is exactly behaving like you want
and expect, and you guys missed that we do now return -ENOSPC even in
new rotating mode. Please let me know if this clarifies things,
thanks!

> > 1. If there is no truncation and the user provides a big enough buffer
> > (which my second patch set makes it even easier to do for libraries),
> > there is no difference, they get identical log contents and behavior.
> >
> > 2. If there was truncation, in both cases we get -ENOSPC. The contents
> > will differ. In one case we get the beginning of a long log with no
> > details about what actually caused the failure (useless in pretty much
> > any circumstances) versus you get the last N bytes of log, all the way
> > to actual error and some history leading towards it. Which is what we
> > used to debug and understand verification issues.
> >
> > What is the situation where the beginning of the log is preferable? I
> > had exactly one case where I actually wanted the beginning of the log,
> > that was when I was debugging some bug in the verifier when
> > implementing open-coded iterators. This bug was happening early and
> > causing an infinite loop, so I wanted to see the first few pages of
> > the output to catch how it all started. But that was a development bug
> > of a tricky feature, definitely not something we expect for end users
> > to deal with. And it was literally *once* that I needed this.
> >
> > Why are we fighting to preserve this much less useful behavior as a
> > default, if there is no reduction of functionality for end-users?
> > Library writers have full access to union bpf_attr and can opt-out
> > easily (though again, why?). Normal end users will never have to ask
> > for BPF_LOG_FIXED behavior. Maybe some advanced tool-building users
> > will want BPF_LOG_FIXED (like veristat, for example), but then it's in
> > their best interest to have fresh enough libbpf anyways.
> >
> > So instead of "I want X over Y", let's discuss "*why* X is better than Y"?
> >
> >> Apps linking old libbpf won't be able to fix their breakage without
> >> updating libbpf. This is especially annoying when you have to support
> > What sort of breakage would be there to fix?
> >
> > Also keep in mind that not all use cases use BPF library's high-level
> > code that does all this fancy log buf manipulations. There are
> > legitimate cases where tools/applications want direct access to
> > log_buf, so needing to do extra feature detection to get rotating mode
> > (but falling back without failing to fixed mode on the old kernel) is
> > just an unnecessary nuisance.
>