All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net,
	kernel-team@fb.com, Tejun Heo <tj@kernel.org>
Subject: Re: [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros
Date: Tue, 7 Mar 2023 13:54:36 -0800	[thread overview]
Message-ID: <CAEf4BzZr6n21r1Kvv6+9iUg8JJ7fnAWRrRZA3rYY2kOZADP8aw@mail.gmail.com> (raw)
In-Reply-To: <20230306001228.mpu4f6l5uosoo26v@MacBook-Pro-6.local>

On Sun, Mar 5, 2023 at 4:12 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Mar 04, 2023 at 03:28:03PM -0800, Andrii Nakryiko wrote:
> > On Sat, Mar 4, 2023 at 12:34 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Mar 02, 2023 at 03:50:13PM -0800, Andrii Nakryiko wrote:
> > > > Add bpf_for_each(), bpf_for() and bpf_repeat() macros that make writing
> > > > open-coded iterator-based loops much more convenient and natural. These
> > > > macro utilize cleanup attribute to ensure proper destruction of the
> > > > iterator and thanks to that manage to provide an ergonomic very close to
> > > > C language for construct. Typical integer loop would look like:
> > > >
> > > >   int i;
> > > >   int arr[N];
> > > >
> > > >   bpf_for(i, 0, N) {
> > > >   /* verifier will know that i >= 0 && i < N, so could be used to
> > > >    * directly access array elements with no extra checks
> > > >    */
> > > >    arr[i] = i;
> > > >   }
> > > >
> > > > bpf_repeat() is very similar, but it doesn't expose iteration number and
> > > > is mean as a simple "repeat action N times":
> > > >
> > > >   bpf_repeat(N) { /* whatever */ }
> > > >
> > > > Note that break and continue inside the {} block work as expected.
> > > >
> > > > bpf_for_each() is a generalization over any kind of BPF open-coded
> > > > iterator allowing to use for-each-like approach instead of calling
> > > > low-level bpf_iter_<type>_{new,next,destroy}() APIs explicitly. E.g.:
> > > >
> > > >   struct cgroup *cg;
> > > >
> > > >   bpf_for_each(cgroup, cg, some, input, args) {
> > > >       /* do something with each cg */
> > > >   }
> > > >
> > > > Would call (right now hypothetical) bpf_iter_cgroup_{new,next,destroy}()
> > > > functions to form a loop over cgroups, where `some, input, args` are
> > > > passed verbatim into constructor as
> > > > bpf_iter_cgroup_new(&it, some, input, args).
> > > >
> > > > As a demonstration, add pyperf variant based on bpf_for() loop.
> > > >
> > > > Also clean up few tests that either included bpf_misc.h header
> > > > unnecessarily from user-space or included it before any common types are
> > > > defined.
> > > >
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > >  .../bpf/prog_tests/bpf_verif_scale.c          |  6 ++
> > > >  .../bpf/prog_tests/uprobe_autoattach.c        |  1 -
> > > >  tools/testing/selftests/bpf/progs/bpf_misc.h  | 76 +++++++++++++++++++
> > > >  tools/testing/selftests/bpf/progs/lsm.c       |  4 +-
> > > >  tools/testing/selftests/bpf/progs/pyperf.h    | 14 +++-
> > > >  .../selftests/bpf/progs/pyperf600_iter.c      |  7 ++
> > > >  .../selftests/bpf/progs/pyperf600_nounroll.c  |  3 -
> > > >  7 files changed, 101 insertions(+), 10 deletions(-)
> > > >  create mode 100644 tools/testing/selftests/bpf/progs/pyperf600_iter.c
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > index 5ca252823294..731c343897d8 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/bpf_verif_scale.c
> > > > @@ -144,6 +144,12 @@ void test_verif_scale_pyperf600_nounroll()
> > > >       scale_test("pyperf600_nounroll.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > >  }
> > > >
> > > > +void test_verif_scale_pyperf600_iter()
> > > > +{
> > > > +     /* open-coded BPF iterator version */
> > > > +     scale_test("pyperf600_iter.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > > +}
> > > > +
> > > >  void test_verif_scale_loop1()
> > > >  {
> > > >       scale_test("loop1.bpf.o", BPF_PROG_TYPE_RAW_TRACEPOINT, false);
> > > > diff --git a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > index 6558c857e620..d5b3377aa33c 100644
> > > > --- a/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > +++ b/tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c
> > > > @@ -3,7 +3,6 @@
> > > >
> > > >  #include <test_progs.h>
> > > >  #include "test_uprobe_autoattach.skel.h"
> > > > -#include "progs/bpf_misc.h"
> > > >
> > > >  /* uprobe attach point */
> > > >  static noinline int autoattach_trigger_func(int arg1, int arg2, int arg3,
> > > > diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > index f704885aa534..08a791f307a6 100644
> > > > --- a/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > +++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
> > > > @@ -75,5 +75,81 @@
> > > >  #define FUNC_REG_ARG_CNT 5
> > > >  #endif
> > > >
> > > > +struct bpf_iter;
> > > > +
> > > > +extern int bpf_iter_num_new(struct bpf_iter *it__uninit, int start, int end) __ksym;
> > > > +extern int *bpf_iter_num_next(struct bpf_iter *it) __ksym;
> > > > +extern void bpf_iter_num_destroy(struct bpf_iter *it) __ksym;
> > > > +
> > > > +#ifndef bpf_for_each
> > > > +/* bpf_for_each(iter_kind, elem, args...) provides generic construct for using BPF
> > > > + * open-coded iterators without having to write mundane explicit low-level
> > > > + * loop. Instead, it provides for()-like generic construct that can be used
> > > > + * pretty naturally. E.g., for some hypothetical cgroup iterator, you'd write:
> > > > + *
> > > > + * struct cgroup *cg, *parent_cg = <...>;
> > > > + *
> > > > + * bpf_for_each(cgroup, cg, parent_cg, CG_ITER_CHILDREN) {
> > > > + *     bpf_printk("Child cgroup id = %d", cg->cgroup_id);
> > > > + *     if (cg->cgroup_id == 123)
> > > > + *         break;
> > > > + * }
> > > > + *
> > > > + * I.e., it looks almost like high-level for each loop in other languages,
> > > > + * supports continue/break, and is verifiable by BPF verifier.
> > > > + *
> > > > + * For iterating integers, the difference betwen bpf_for_each(num, i, N, M)
> > > > + * and bpf_for(i, N, M) is in that bpf_for() provides additional proof to
> > > > + * verifier that i is in [N, M) range, and in bpf_for_each() case i is `int
> > > > + * *`, not just `int`. So for integers bpf_for() is more convenient.
> > > > + */
> > > > +#define bpf_for_each(type, cur, args...) for (                                                 \
> > > > +     /* initialize and define destructor */                                            \
> > > > +     struct bpf_iter ___it __attribute__((cleanup(bpf_iter_##type##_destroy))),        \
> > >
> > > We should probably say somewhere that it requires C99 with some flag that allows
> > > declaring variables inside the loop.
> >
> > yes, I'll add a comment. I think cleanup attribute isn't standard as
> > well, I'll mention it. This shouldn't be restrictive, though, as we
> > expect very modern Clang (or eventually GCC), which definitely will
> > support all of that. And I feel like most people don't restrict their
> > BPF-side code to strict C89 anyways.
>
> yep. No UX concerns. A comment to manage expectations should be enough.
>

added


> > >
> > > Also what are the rules for attr(cleanup()).
> > > When does it get called?
> >
> > From GCC documentation:
> >
> >   > The cleanup attribute runs a function when the variable goes out of scope.
> >
> > So given ___it is bound to for loop, any code path that leads to loop
> > exit (so, when condition turns false or *breaking* out of the loop,
> > which is why I use cleanup, this was a saving grace for this approach
> > to work at all).
>
> +1
>
> >
> > > My understanding that the visibility of ___it is only within for() body.
> >
> > right
> >
> > > So when the prog does:
> > > bpf_for(i, 0, 10) sum += i;
> > > bpf_for(i, 0, 10) sum += i;
> > >
> > > the compiler should generate bpf_iter_num_destroy right after each bpf_for() ?
> >
> > Conceptually, yes, but see the note about breaking out of the loop
> > above. How actual assembly code is generated is beyond our control. If
> > the compiler generates multiple separate code paths, each with its own
> > destroy, that's fine as well. No assumptions are made in the verifier,
> > we just need to see one bpf_iter_<type>_destroy() for each instance of
> > iterator.
> >

[...]

> > One restriction with this approach is that I can't define both `struct
> > bpf_iter __it` and `int i` inside for loop, so cur/i has to be
> > declared and passed into bpf_for/bpf_for_each by user explicitly. So
> > for bpf_repeat() to expose i would require always doing:
> >
> > int i;
> >
> > bpf_repeat(i, 100) { /* i is set to 0, 1, ..., 99 */ }
> >
> > which, if you don't care about iteration number, is a bit of waste. So
> > don't know, I'm undecided on bpf_repeat with i.
>
> It feels that the proposed bpf_repeat() without 'i' is cleaner.
>

agreed, I'm keeping it as is

> > >
> > > In such case should we add __builtin_constant_p() check for 'start' and 'end' ?
> >
> > that seems to defy the purpose, as if you know start/end statically,
> > you might as well just write either unrolled loop or bounded for()
> > loop
> >
> > > int arr[100];
> > > if (var < 100)
> > >   bpf_for(i, 0, global_var) sum += arr[i];
> > > will fail the verifier and the users might complain of dumb verifier.
> >
> >

[...]

> >
> > I can't do that, as for() allows only one type of variables to be
> > defined (note `*___p` hack as well), so there is no place to remember
> > start/end, unfortunately...
>
> I see. Abusing bpf_iter like:
> for (struct bpf_iter_num __it, *___p, start_end = (struct bpf_iter_num) {start, end};
> is probably too ugly just to READ_ONCE(start).
> Especially since bpf_iter_num is opaque.

yeah, that seems like too much

>
> Please add a comment to warn users that if start/end are variables they
> better not change during bpf_for().
> I guess plain C loop: for (int i = 0; i < j; i++) will go crazy too if 'j' changes.
> I'm not sure what standard says here.
> Whether compiler has to reload 'j' every iteration or not.

no idea, but yep, will add a comment

>
> > So it's a tradeoff. I can drop range validation, but then every
> > example and lots of production code would be re-checking these
> > conditions.
>
> Understood. Keep it as-is.
>
> > >
> > > > +     });                                                                               \
> > > > +     /* nothing here  */                                                               \
>
> btw that 'nothing here' comment is distracting when reading this macro.

ok, dropped this

  reply	other threads:[~2023-03-07 21:55 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-02 23:49 [PATCH bpf-next 00/17] BPF open-coded iterators Andrii Nakryiko
2023-03-02 23:49 ` [PATCH bpf-next 01/17] bpf: improve stack slot state printing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 02/17] bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER} Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 03/17] selftests/bpf: enhance align selftest's expected log matching Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 04/17] bpf: honor env->test_state_freq flag in is_state_visited() Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 05/17] selftests/bpf: adjust log_fixup's buffer size for proper truncation Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 06/17] bpf: clean up visit_insn()'s instruction processing Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 07/17] bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 08/17] bpf: ensure that r0 is marked scratched after any function call Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 09/17] bpf: move kfunc_call_arg_meta higher in the file Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 10/17] bpf: mark PTR_TO_MEM as non-null register type Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 11/17] bpf: generalize dynptr_get_spi to be usable for iters Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 12/17] bpf: add support for fixed-size memory pointer returns for kfuncs Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 13/17] bpf: add support for open-coded iterator loops Andrii Nakryiko
2023-03-04 20:02   ` Alexei Starovoitov
2023-03-04 23:27     ` Andrii Nakryiko
2023-03-05 23:46       ` Alexei Starovoitov
2023-03-07 21:54         ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 14/17] bpf: implement number iterator Andrii Nakryiko
2023-03-04 20:21   ` Alexei Starovoitov
2023-03-04 23:27     ` Andrii Nakryiko
2023-03-05 23:49       ` Alexei Starovoitov
2023-03-02 23:50 ` [PATCH bpf-next 15/17] selftests/bpf: add bpf_for_each(), bpf_for(), and bpf_repeat() macros Andrii Nakryiko
2023-03-04 20:34   ` Alexei Starovoitov
2023-03-04 23:28     ` Andrii Nakryiko
2023-03-06  0:12       ` Alexei Starovoitov
2023-03-07 21:54         ` Andrii Nakryiko [this message]
2023-03-02 23:50 ` [PATCH bpf-next 16/17] selftests/bpf: add iterators tests Andrii Nakryiko
2023-03-04 20:39   ` Alexei Starovoitov
2023-03-04 23:29     ` Andrii Nakryiko
2023-03-06  0:14       ` Alexei Starovoitov
2023-03-04 21:09   ` Jiri Olsa
2023-03-04 23:29     ` Andrii Nakryiko
2023-03-02 23:50 ` [PATCH bpf-next 17/17] selftests/bpf: add number iterator tests Andrii Nakryiko
2023-03-04 19:30 ` [PATCH bpf-next 00/17] BPF open-coded iterators patchwork-bot+netdevbpf

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzZr6n21r1Kvv6+9iUg8JJ7fnAWRrRZA3rYY2kOZADP8aw@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.