Re: [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros

From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	kernel-team@fb.com, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros
Date: Tue, 7 Mar 2023 13:54:36 -0800	[thread overview]
Message-ID: <CAEf4BzZr6n21r1Kvv6+9iUg8JJ7fnAWRrRZA3rYY2kOZADP8aw@mail.gmail.com> (raw)
In-Reply-To: <20230306001228.mpu4f6l5uosoo26v@MacBook-Pro-6.local>

On Sun, Mar 5, 2023 at 4:12 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Mar 04, 2023 at 03:28:03PM -0800, Andrii Nakryiko wrote:
> > On Sat, Mar 4, 2023 at 12:34 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Mar 02, 2023 at 03:50:13PM -0800, Andrii Nakryiko wrote:
> > > > Add bpf_for_each(), bpf_for() and bpf_repeat() macros that make writing
> > > > open-coded iterator-based loops much more convenient and natural. These
> > > > macro utilize cleanup attribute to ensure proper destruction of the
> > > > iterator and thanks to that manage to provide an ergonomic very close to
> > > > C language for construct. Typical integer loop would look like:
> > > >
> > > >   int i;
> > > >   int arr[N];
> > > >
> > > >   bpf_for(i, 0, N) {
> > > >   /* verifier will know that i >= 0 && i < N, so could be used to
> > > >    * directly access array elements with no extra checks
> > > >    */
> > > >    arr[i] = i;
> > > >   }
> > > >
> > > > bpf_repeat() is very similar, but it doesn't expose iteration number and
> > > > is mean as a simple "repeat action N times":
> > > >
> > > >   bpf_repeat(N) { /* whatever */ }
> > > >
> > > > Note that break and continue inside the {} block work as expected.
> > > >
> > > > bpf_for_each() is a generalization over any kind of BPF open-coded
> > > > iterator allowing to use for-each-like approach instead of calling
> > > > low-level bpf_iter_<type>_{new,next,destroy}() APIs explicitly. E.g.:
> > > >
> > > >   struct cgroup *cg;
> > > >
> > > >   bpf_for_each(cgroup, cg, some, input, args) {
> > > >       /* do something with each cg */
> > > >   }
> > > >
> > > > Would call (right now hypothetical) bpf_iter_cgroup_{new,next,destroy}()
> > > > functions to form a loop over cgroups, where `some, input, args` are
> > > > passed verbatim into constructor as
> > > > bpf_iter_cgroup_new(&it, some, input, args).
> > > >
> > > > As a demonstration, add pyperf variant based on bpf_for() loop.
> > > >
> > > > Also clean up few tests that either included bpf_misc.h header
> > > > unnecessarily from user-space or included it before any common types are
> > > > defined.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > >  .../bpf/prog_tests/bpf_verif_scale.c          |  6 ++
> > > >  .../bpf/prog_tests/uprobe_autoattach.c        |  1 -
> > > >  tools/testing/selftests/bpf/progs/bpf_misc.h  | 76 +++++++++++++++++++
> > > >  tools/testing/selftests/bpf/progs/lsm.c       |  4 +-
> > > >  tools/testing/selftests/bpf/progs/pyperf.h    | 14 +++-
> > > >  .../selftests/bpf/progs/pyperf600_iter.c      |  7 ++
> > > >  .../selftests/bpf/progs/pyperf600_nounroll.c  |  3 -
> > > >  7 files changed, 101 insertions(+), 10 deletions(-)
> > > >  create mode 100644 tools/testing/selftests/bpf/progs/pyperf600_iter.c
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > index 5ca252823294..731c343897d8 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > @@ -144,6 +144,12 @@ void test_verif_scale_pyperf600_nounroll()
> > > >       scale_test("pyperf600_nounroll.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > >  }
> > > >
> > > > +void test_verif_scale_pyperf600_iter()
> > > > +{
> > > > +     /* open-coded BPF iterator version */
> > > > +     scale_test("pyperf600_iter.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > > +}
> > > > +
> > > >  void test_verif_scale_loop1()
> > > >  {
> > > >       scale_test("loop1.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > index 6558c857e620..d5b3377aa33c 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > @@ -3,7 +3,6 @@
> > > >
> > > >  #include <test_progs.h>
> > > >  #include "test_uprobe_autoattach.skel.h"
> > > > -#include "progs/bpf_misc.h"
> > > >
> > > >  /* uprobe attach point */
> > > >  static noinline int autoattach_trigger_func(int arg1, int arg2, int arg3,
> > > > diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > index f704885aa534..08a791f307a6 100644
> > > > --- a/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > @@ -75,5 +75,81 @@
> > > >  #define FUNC_REG_ARG_CNT 5
> > > >  #endif
> > > >
> > > > +struct bpf_iter;
> > > > +
> > > > +extern int bpf_iter_num_new(struct bpf_iter *it__uninit, int start, int end) __ksym;
> > > > +extern int *bpf_iter_num_next(struct bpf_iter *it) __ksym;
> > > > +extern void bpf_iter_num_destroy(struct bpf_iter *it) __ksym;
> > > > +
> > > > +#ifndef bpf_for_each
> > > > +/* bpf_for_each(iter_kind, elem, args...) provides generic construct for using BPF
> > > > + * open-coded iterators without having to write mundane explicit low-level
> > > > + * loop. Instead, it provides for()-like generic construct that can be used
> > > > + * pretty naturally. E.g., for some hypothetical cgroup iterator, you'd write:
> > > > + *
> > > > + * struct cgroup *cg, *parent_cg = <...>;
> > > > + *
> > > > + * bpf_for_each(cgroup, cg, parent_cg, CG_ITER_CHILDREN) {
> > > > + *     bpf_printk("Child cgroup id = %d", cg->cgroup_id);
> > > > + *     if (cg->cgroup_id == 123)
> > > > + *         break;
> > > > + * }
> > > > + *
> > > > + * I.e., it looks almost like high-level for each loop in other languages,
> > > > + * supports continue/break, and is verifiable by BPF verifier.
> > > > + *
> > > > + * For iterating integers, the difference betwen bpf_for_each(num, i, N, M)
> > > > + * and bpf_for(i, N, M) is in that bpf_for() provides additional proof to
> > > > + * verifier that i is in [N, M) range, and in bpf_for_each() case i is `int
> > > > + * *`, not just `int`. So for integers bpf_for() is more convenient.
> > > > + */
> > > > +#define bpf_for_each(type, cur, args...) for (                                                 \
> > > > +     /* initialize and define destructor */                                            \
> > > > +     struct bpf_iter ___it __attribute__((cleanup(bpf_iter_##type##_destroy))),        \
> > >
> > > We should probably say somewhere that it requires C99 with some flag that allows
> > > declaring variables inside the loop.
> >
> > yes, I'll add a comment. I think cleanup attribute isn't standard as
> > well, I'll mention it. This shouldn't be restrictive, though, as we
> > expect very modern Clang (or eventually GCC), which definitely will
> > support all of that. And I feel like most people don't restrict their
> > BPF-side code to strict C89 anyways.
>
> yep. No UX concerns. A comment to manage expectations should be enough.
>

added


> > >
> > > Also what are the rules for attr(cleanup()).
> > > When does it get called?
> >
> > From GCC documentation:
> >
> >   > The cleanup attribute runs a function when the variable goes out of scope.
> >
> > So given ___it is bound to for loop, any code path that leads to loop
> > exit (so, when condition turns false or *breaking* out of the loop,
> > which is why I use cleanup, this was a saving grace for this approach
> > to work at all).
>
> +1
>
> >
> > > My understanding that the visibility of ___it is only within for() body.
> >
> > right
> >
> > > So when the prog does:
> > > bpf_for(i, 0, 10) sum += i;
> > > bpf_for(i, 0, 10) sum += i;
> > >
> > > the compiler should generate bpf_iter_num_destroy right after each bpf_for() ?
> >
> > Conceptually, yes, but see the note about breaking out of the loop
> > above. How actual assembly code is generated is beyond our control. If
> > the compiler generates multiple separate code paths, each with its own
> > destroy, that's fine as well. No assumptions are made in the verifier,
> > we just need to see one bpf_iter_<type>_destroy() for each instance of
> > iterator.
> >

[...]

> > One restriction with this approach is that I can't define both `struct
> > bpf_iter __it` and `int i` inside for loop, so cur/i has to be
> > declared and passed into bpf_for/bpf_for_each by user explicitly. So
> > for bpf_repeat() to expose i would require always doing:
> >
> > int i;
> >
> > bpf_repeat(i, 100) { /* i is set to 0, 1, ..., 99 */ }
> >
> > which, if you don't care about iteration number, is a bit of waste. So
> > don't know, I'm undecided on bpf_repeat with i.
>
> It feels that the proposed bpf_repeat() without 'i' is cleaner.
>

agreed, I'm keeping it as is

> > >
> > > In such case should we add __builtin_constant_p() check for 'start' and 'end' ?
> >
> > that seems to defy the purpose, as if you know start/end statically,
> > you might as well just write either unrolled loop or bounded for()
> > loop
> >
> > > int arr[100];
> > > if (var < 100)
> > >   bpf_for(i, 0, global_var) sum += arr[i];
> > > will fail the verifier and the users might complain of dumb verifier.
> >
> >

[...]

> >
> > I can't do that, as for() allows only one type of variables to be
> > defined (note `*___p` hack as well), so there is no place to remember
> > start/end, unfortunately...
>
> I see. Abusing bpf_iter like:
> for (struct bpf_iter_num __it, *___p, start_end = (struct bpf_iter_num) {start, end};
> is probably too ugly just to READ_ONCE(start).
> Especially since bpf_iter_num is opaque.

yeah, that seems like too much

>
> Please add a comment to warn users that if start/end are variables they
> better not change during bpf_for().
> I guess plain C loop: for (int i = 0; i < j; i++) will go crazy too if 'j' changes.
> I'm not sure what standard says here.
> Whether compiler has to reload 'j' every iteration or not.

no idea, but yep, will add a comment

>
> > So it's a tradeoff. I can drop range validation, but then every
> > example and lots of production code would be re-checking these
> > conditions.
>
> Understood. Keep it as-is.
>
> > >
> > > > +     });                                                                               \
> > > > +     /* nothing here  */                                                               \
>
> btw that 'nothing here' comment is distracting when reading this macro.

ok, dropped this