Re: [PATCH bpf-next v2 2/3] bpf: Introduce ARG_PTR_TO_WRITABLE_MEM

From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Hao Luo <haoluo@google.com>, Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	KP Singh <kpsingh@kernel.org>, bpf <bpf@vger.kernel.org>
Subject: Re: [PATCH bpf-next v2 2/3] bpf: Introduce ARG_PTR_TO_WRITABLE_MEM
Date: Tue, 26 Oct 2021 12:22:51 -0700	[thread overview]
Message-ID: <CAADnVQKo+jrFO4FVm=rm8q--hkHwBe9-iwDrdBzWW_aFxQ5KxA@mail.gmail.com> (raw)
In-Reply-To: <CAEf4Bzb2LdZrYVP+h5HxKS+H5tj-s7h_4xir_c3+bihaU5z_yQ@mail.gmail.com>

On Tue, Oct 26, 2021 at 11:45 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Oct 26, 2021 at 10:59 AM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Oct 25, 2021 at 10:14 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > Instead of adding new types,
> > > > can we do something like this instead:
> > > >
> > > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > > index c8a78e830fca..5dbd2541aa86 100644
> > > > --- a/include/linux/bpf_verifier.h
> > > > +++ b/include/linux/bpf_verifier.h
> > > > @@ -68,7 +68,8 @@ struct bpf_reg_state {
> > > >                         u32 btf_id;
> > > >                 };
> > > >
> > > > -               u32 mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
> > > > +               u32 rd_mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
> > > > +               u32 wr_mem_size; /* for PTR_TO_MEM | PTR_TO_MEM_OR_NULL */
> > >
> > > This seems more confusing, it's technically possible to express a
> > > memory pointer from which you can read X bytes, but can write Y bytes.
> >
> > I'm fine it being a new flag instead of wr_mem_size.
> >
> > > I actually liked the idea that helpers will be explicit about whether
> > > they can write into a memory or only read from it.
> > >
> > > Apart from a few more lines of code, are there any downsides to having
> > > PTR_TO_MEM vs PTR_TO_RDONLY_MEM?
> >
> > because it's a churn and non scalable long term.
> > It's not just PTR_TO_RDONLY_MEM.
> > It's also ARG_PTR_TO_RDONLY_MEM,
> > and RET_PTR_TO_RDONLY_MEM,
> > and PTR_TO_RDONLY_MEM_OR_NULL
> > and *_OR_BTF_ID,
> > and *_OR_BTF_ID_OR_NULL.
> > It felt that expressing readonly-ness as a flag in bpf_reg_state
> > will make it easier to get right in the code and extend in the future.
>
> That's true, but while it's easy to add a flag to bpf_reg_state, it's
> not easy to do the same for BPF helper input (ARG_PTR_xxx) and output
> (RET_PTR_xxx) restrictions. So unless we extend ARG_PTR and RET_PTR
> with flags, it seems more consistent to keep the same pure enum
> approach for reg_state.
>
> > May be we will have a kernel vs user flag for PTR_TO_MEM in the future.
> > If we use different name to express that we will have:
> > PTR_TO_USER_RDONLY_MEM and
> > PTR_TO_USER_MEM
> > plus all variants of ARG_* and RET_* and *_OR_NULL.
> > With a flag approach it will be just another flag in bpf_reg_state.
>
> All true, but then maybe we should rethink how we do all those enums.
> And instead of having all the _OR_NULL variants, it should be
> ARG_NULLABLE/REG_NULLABLE/RET_NULLABLE flag that can be or-ed with the
> basic set of register/input/output type enums? Same for ARG_RDONLY
> flag. Same could technically be done for USER vs KERNEL memory in the
> future.

Exactly. OR_NULL is such a flag and we already struggled to
differentiate that flag with truly_not_equal_to_NULL and may_be_NULL.
That's why all bpf_skc* helpers have additional run-time !NULL check.

ARG_NULLABLE/REG_NULLABLE/RET_NULLABLE would make it cleaner.
And ARG_RDONLY would fit that model well.

> It's definitely a bunch of code changes, but if we are worried about
> an explosion of enum values, it might be the right move?
>
> On the other hand, if there are all those different variations and
> each is handled slightly differently, we'll have to have different
> logic for each of them. And whether it's an enum + flags, or a few
> more enumerators, doesn't change anything fundamentally. I feel like
> enums make code discovery a bit simpler in practice, but it's
> subjective.

I think it's a bit of a mess already.
ARG_PTR_TO_BTF_ID has may_be_NULL flag.
Just like ARG_PTR_TO_BTF_ID_SOCK_COMMON.
but RET_PTR_TO_BTF_ID doesn't.
PTR_TO_BTF_ID doesn't have that may_be_NULL assumption either.

imo cleaning up OR_NULL will be very nice.
RDONLY would be an addition on top.
We can probably fold UNINIT as a flag too.

All that will be a big change, but I think it's worth it.