netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>
Subject: Re: [PATCH v2 bpf-next 04/14] libbpf: make RELO_CALL work for multi-prog sections and sub-program calls
Date: Wed, 2 Sep 2020 12:52:27 -0700	[thread overview]
Message-ID: <CAEf4BzYkpJWUqeNJfVpyC1Cf5ThHXkpxV-yrgY0XfAmEie6tKg@mail.gmail.com> (raw)
In-Reply-To: <20200902053628.bqqytnpebrum7heh@ast-mbp.dhcp.thefacebook.com>

On Tue, Sep 1, 2020 at 10:36 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Aug 31, 2020 at 06:49:53PM -0700, Andrii Nakryiko wrote:
> > +
> > +static int
> > +bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
> > +                    struct bpf_program *prog)
> > +{
> > +     size_t sub_insn_idx, insn_idx, new_cnt;
> > +     struct bpf_program *subprog;
> > +     struct bpf_insn *insns, *insn;
> > +     struct reloc_desc *relo;
> > +     int err;
> > +
> > +     err = reloc_prog_func_and_line_info(obj, main_prog, prog);
> > +     if (err)
> > +             return err;
> > +
> > +     for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
> > +             insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
> > +             if (!insn_is_subprog_call(insn))
> > +                     continue;
> > +
> > +             relo = find_prog_insn_relo(prog, insn_idx);
> > +             if (relo && relo->type != RELO_CALL) {
> > +                     pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
> > +                             prog->name, insn_idx, relo->type);
> > +                     return -LIBBPF_ERRNO__RELOC;
> > +             }
> > +             if (relo) {
> > +                     /* sub-program instruction index is a combination of
> > +                      * an offset of a symbol pointed to by relocation and
> > +                      * call instruction's imm field; for global functions,
> > +                      * call always has imm = -1, but for static functions
> > +                      * relocation is against STT_SECTION and insn->imm
> > +                      * points to a start of a static function
> > +                      */
> > +                     sub_insn_idx = relo->sym_off / BPF_INSN_SZ + insn->imm + 1;
> > +             } else {
> > +                     /* if subprogram call is to a static function within
> > +                      * the same ELF section, there won't be any relocation
> > +                      * emitted, but it also means there is no additional
> > +                      * offset necessary, insns->imm is relative to
> > +                      * instruction's original position within the section
> > +                      */
>
> Great two comments. Thanks.
>
> > +                     sub_insn_idx = prog->sec_insn_off + insn_idx + insn->imm + 1;
> > +             }
> > +
> > +             /* we enforce that sub-programs should be in .text section */
> > +             subprog = find_prog_by_sec_insn(obj, obj->efile.text_shndx, sub_insn_idx);
> > +             if (!subprog) {
> > +                     pr_warn("prog '%s': no .text section found yet sub-program call exists\n",
> > +                             prog->name);
> > +                     return -LIBBPF_ERRNO__RELOC;
> > +             }
> > +
> > +             /* if subprogram hasn't been used in current main program,
> > +              * relocate it and append at the end of main program code
> > +              */
>
> This one is quite confusing.
> "hasn't been used" isn't right.
> This subprog was used, but wasn't appeneded yet. That's what sub_insn_off is tracking.

"hasn't been used *yet*" would be more precise, meaning: up until the
current instruction there were no calls to that subprogram.

> Also "relocate and append it" is not right either.
> It's "append and start relocating".

Right, order of actions is wrong, I'll fix the wording.

> Probably shouldn't call it 'main' and 'subprog'.
> It equally applies to 'subprog' and 'another subprog'.

Yes, you are right, I'll update comments to be less confusing.

>
> > +             if (subprog->sub_insn_off == 0) {
> > +                     subprog->sub_insn_off = main_prog->insns_cnt;
> > +
> > +                     new_cnt = main_prog->insns_cnt + subprog->insns_cnt;
> > +                     insns = libbpf_reallocarray(main_prog->insns, new_cnt, sizeof(*insns));
> > +                     if (!insns) {
> > +                             pr_warn("prog '%s': failed to realloc prog code\n", main_prog->name);
> > +                             return -ENOMEM;
> > +                     }
> > +                     main_prog->insns = insns;
> > +                     main_prog->insns_cnt = new_cnt;
> > +
> > +                     memcpy(main_prog->insns + subprog->sub_insn_off, subprog->insns,
> > +                            subprog->insns_cnt * sizeof(*insns));
> > +
> > +                     pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
> > +                              main_prog->name, subprog->insns_cnt, subprog->name);
> > +
> > +                     err = bpf_object__reloc_code(obj, main_prog, subprog);
> > +                     if (err)
> > +                             return err;
> > +             }
> > +
> > +             /* main_prog->insns memory could have been re-allocated, so
> > +              * calculate pointer again
> > +              */
> > +             insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
> > +             /* calculate correct instruction position within main prog */
>
> may be: "calculate position within the prog being relocated?"

no-no, in this case it's an instruction index within the main
(entry-point) BPF program with all the used subprograms appended. So
even if we have subprog1 calling another subprog2, all the instruction
indices are calculated within the main BPF program's "system of
coordinates", because each main BPF program can have a different
subset of functions appended and subprogs might be in a different
order and at different positions.

>
> > +             insn->imm = subprog->sub_insn_off - (prog->sub_insn_off + insn_idx) - 1;
>
> I think the algorithm is sound.
> Could you add a better description of it?
> May be some small diagram to illustrate how it recursively relocates?
> That it starts with main, walks some number of insn, when it sees pseudo_call to
> not yet appended subprog, it adds it to the end and recursively starts relocating it.
> That subprog can have relos too. If they're pointing to not yet appended subprog it will be
> added again and that 2nd subprog will start relocating while the main and 1st subprog
> will be pending.

Ok, I'll try to give some better overview in the comments.

> The algorithm didn't have to be recursive, but I guess it's fine to keep this way.
> It's simple enough. I haven't thought through how it can look without recursion.
> Probably a bunch of book keeping of things to relocate would have been necessary.

It's a graph traversing problem to figure out which subprograms need
to be appended. I did it DFS style because it's a bit simpler that way
(there is no separate pass for detecting used subprogs and then
relocating all of them), plus libbpf already relies on recursion (at
least for CO-RE), so I didn't feel bad about it. It's possible to do
it with BFS, though, by maintaining a queue of to-be-processed
subprogs, mark and append them. Then as a second pass relocate calls.
But as you said, it's simple enough on a high-level, that I'd stick to
what I have. I'll improve the comments, though.

  reply	other threads:[~2020-09-02 19:52 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-01  1:49 [PATCH v2 bpf-next 00/14] Add libbpf full support for BPF-to-BPF calls Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 01/14] libbpf: ensure ELF symbols table is found before further ELF processing Andrii Nakryiko
2020-09-02  5:43   ` John Fastabend
2020-09-01  1:49 ` [PATCH v2 bpf-next 02/14] libbpf: parse multi-function sections into multiple BPF programs Andrii Nakryiko
2020-09-02  6:08   ` John Fastabend
2020-09-01  1:49 ` [PATCH v2 bpf-next 03/14] libbpf: support CO-RE relocations for multi-prog sections Andrii Nakryiko
2020-09-02  6:14   ` John Fastabend
2020-09-01  1:49 ` [PATCH v2 bpf-next 04/14] libbpf: make RELO_CALL work for multi-prog sections and sub-program calls Andrii Nakryiko
2020-09-02  5:36   ` Alexei Starovoitov
2020-09-02 19:52     ` Andrii Nakryiko [this message]
2020-09-01  1:49 ` [PATCH v2 bpf-next 05/14] libbpf: implement generalized .BTF.ext func/line info adjustment Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 06/14] libbpf: add multi-prog section support for struct_ops Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 07/14] selftests/bpf: add selftest for multi-prog sections and bpf-to-bpf calls Andrii Nakryiko
2020-09-02  5:41   ` Alexei Starovoitov
2020-09-02 19:57     ` Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 08/14] tools/bpftool: replace bpf_program__title() with bpf_program__section_name() Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 09/14] selftests/bpf: don't use deprecated libbpf APIs Andrii Nakryiko
2020-09-01  1:49 ` [PATCH v2 bpf-next 10/14] libbpf: deprecate notion of BPF program "title" in favor of "section name" Andrii Nakryiko
2020-09-01  1:50 ` [PATCH v2 bpf-next 11/14] selftests/bpf: turn fexit_bpf2bpf into test with subtests Andrii Nakryiko
2020-09-01  1:50 ` [PATCH v2 bpf-next 12/14] selftests/bpf: convert pyperf, strobemeta, and l4lb_noinline to __noinline Andrii Nakryiko
2020-09-02  5:45   ` Alexei Starovoitov
2020-09-02 19:58     ` Andrii Nakryiko
2020-09-01  1:50 ` [PATCH v2 bpf-next 13/14] selftests/bpf: modernize xdp_noinline test w/ skeleton and __noinline Andrii Nakryiko
2020-09-01  1:50 ` [PATCH v2 bpf-next 14/14] selftests/bpf: convert cls_redirect selftest to use __noinline Andrii Nakryiko
2020-09-02  5:46   ` Alexei Starovoitov
2020-09-02 19:58     ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzYkpJWUqeNJfVpyC1Cf5ThHXkpxV-yrgY0XfAmEie6tKg@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andriin@fb.com \
    --cc=ast@fb.com \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kernel-team@fb.com \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).