Re: [PATCH v2 bpf-next 4/6] libbpf: don't enforce verifier log levels on libbpf side

From: Timo Beckers <timo@incline.eu>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Lorenz Bauer <lmb@isovalent.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	martin.lau@kernel.org, robin.goegge@isovalent.com,
	kernel-team@meta.com
Subject: Re: [PATCH v2 bpf-next 4/6] libbpf: don't enforce verifier log levels on libbpf side
Date: Fri, 31 Mar 2023 11:39:48 +0200	[thread overview]
Message-ID: <68b1465b-e970-15c3-37a6-f81a639bc71c@incline.eu> (raw)
In-Reply-To: <CAEf4BzbH7tB+zaK=DJtpR+SXqhNqwYMwiru9xpuAhGpaaFrJsg@mail.gmail.com>

On 3/30/23 23:05, Andrii Nakryiko wrote:
> On Thu, Mar 30, 2023 at 10:13 AM Lorenz Bauer <lmb@isovalent.com> wrote:
>> On Wed, Mar 29, 2023 at 12:56 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>>> This basically prevents any forward compatibility. And we either way
>>> just return -EINVAL, which would otherwise be returned from bpf()
>>> syscall anyways.
>> In your cover letter you make the argument that applications can opt
>> out of the behaviour, but I think shows that this isn't entirely true.
>> Apps linking old libbpf won't be able to fix their breakage without
>> updating libbpf. This is especially annoying when you have to support
>> multiple old versions where doing this isn't straightforward.
>>
> Ok, technically, you are correct. If you somehow managed to get a
> bleeding edge kernel, but outdated libbpf, you won't be able to
> specify log_level = 8. This is not the only place where too old libbpf
> would limit you from using bleeding edge kernel features, though, and
> we have to live with that (though try our best to avoid such
> dependencies, of course).
>
> But in practice you get the freshest libbpf way before your kernel
> becomes the freshest one, so I don't think this is a big deal in
> practice.
Hey Andrii,

In an ideal world, yes, but not how it works out in practice. Anything
that ships in a container obviously misses out on distro packages of the
underlying host that are tied to the running kernel. This looks like it's
quickly becoming a majority use case of bpf in the wild, barring of course
Android and systemd.

In practice, we get to deal with rather large version swings in both
directions; users may want to run the latest tools on 4.19, or last
year's stable tool release on 6.1, so the need for both backwards- and
forwards-compatibility is definitely there.

Fortunately, things don't break that often. :) The biggest papercut
we've had recently was the introduction of enum64, which just flat
out requires the latest userspace if kernel btf is needed.
>> Take this as another plea to make this opt in and instead work
>> together to make this a default on the lib side. :)
> Please, help me understand the arguments against making rotating mode
> a default, now that we return -ENOSPC on truncation. In which scenario
> this difference matters?
I think there may be a misunderstanding. As you mentioned, there's rarely
anything useful at the start of the log. I think the opt-in behaviour
under discussion here is the lack of ENOSPC when the buffer is undersized.
Userspace should set a flag if it supports receiving a partial log without
expecting ENOSPC.

I've lost track of the exact changes that are now on the table with your
second patch set floating around. The way I understand it, there are
multiple isolated behavioural changes, so let's discuss them separately:

- Log will include the tail instead of the head. This is a good change, no
   argument there, and personally I wouldn't bother with a flag until
   someone complains. Old userspace is (worst case) already used to retrying
   with bigger buffers until ENOSPC is gone, and (best case) gets a few 
pages
   of actually useful log if it doesn't retry.

- log_size_actual: if populated by the kernel, userspace can do a perfect
   retry with the correct buffer size. Userspace will naturally be able to
   pick this up when their bpf_attr gets updated. No opt-in/flags needed
   because not a breaking change.

- No more ENOSPC: this needs to be opt-in, as old userspace relies on ENOSPC
   to drive the retry loop. If the kernel no longer returns ENOSPC by 
default,
   userspace will think it received a full log and won't be able to detect
   truncation if it doesn't yet know about the log_size_actual field.
   From what I gather, we're also in agreement on this. Idea for a flag 
name:
   BPF_LOG_PARTIAL?

Hope this can move the discussion forward, it looked to me like we were just
talking past each other. Thanks for addressing our feedback so far!
> 1. If there is no truncation and the user provides a big enough buffer
> (which my second patch set makes it even easier to do for libraries),
> there is no difference, they get identical log contents and behavior.
>
> 2. If there was truncation, in both cases we get -ENOSPC. The contents
> will differ. In one case we get the beginning of a long log with no
> details about what actually caused the failure (useless in pretty much
> any circumstances) versus you get the last N bytes of log, all the way
> to actual error and some history leading towards it. Which is what we
> used to debug and understand verification issues.
>
> What is the situation where the beginning of the log is preferable? I
> had exactly one case where I actually wanted the beginning of the log,
> that was when I was debugging some bug in the verifier when
> implementing open-coded iterators. This bug was happening early and
> causing an infinite loop, so I wanted to see the first few pages of
> the output to catch how it all started. But that was a development bug
> of a tricky feature, definitely not something we expect for end users
> to deal with. And it was literally *once* that I needed this.
>
> Why are we fighting to preserve this much less useful behavior as a
> default, if there is no reduction of functionality for end-users?
> Library writers have full access to union bpf_attr and can opt-out
> easily (though again, why?). Normal end users will never have to ask
> for BPF_LOG_FIXED behavior. Maybe some advanced tool-building users
> will want BPF_LOG_FIXED (like veristat, for example), but then it's in
> their best interest to have fresh enough libbpf anyways.
>
> So instead of "I want X over Y", let's discuss "*why* X is better than Y"?
>
>> Apps linking old libbpf won't be able to fix their breakage without
>> updating libbpf. This is especially annoying when you have to support
> What sort of breakage would be there to fix?
>
> Also keep in mind that not all use cases use BPF library's high-level
> code that does all this fancy log buf manipulations. There are
> legitimate cases where tools/applications want direct access to
> log_buf, so needing to do extra feature detection to get rotating mode
> (but falling back without failing to fixed mode on the old kernel) is
> just an unnecessary nuisance.