bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Fāng-ruì Sòng" <maskray@google.com>
Cc: Yonghong Song <yhs@fb.com>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Lorenz Bauer <lmb@cloudflare.com>
Subject: Re: [PATCH bpf-next v2] docs/bpf: add llvm_reloc.rst to explain llvm bpf relocations
Date: Mon, 7 Jun 2021 21:22:12 -0700	[thread overview]
Message-ID: <CAEf4BzZF6aQsR-y=tVObq_euvmu92gmm8TPUTc_XY3VmZjpLEA@mail.gmail.com> (raw)
In-Reply-To: <CAFP8O3J4_aaT+POmB6H6mihuP1-VQ4ww1nVrHxEvd70S5ODEUw@mail.gmail.com>

On Mon, Jun 7, 2021 at 3:08 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>
> On Mon, Jun 7, 2021 at 2:06 PM Yonghong Song <yhs@fb.com> wrote:
> >
> >
> >
> > On 6/5/21 2:03 PM, Fāng-ruì Sòng wrote:
> > > On Tue, May 25, 2021 at 11:52 AM Yonghong Song <yhs@fb.com> wrote:
> > >>
> > >>
> > >>
> > >> On 5/25/21 11:29 AM, Fangrui Song wrote:
> > >>> I have a review queue with a huge pile of LLVM patches and have only
> > >>> skimmed through this.
> > >>>
> > >>> First, if the size benefit of REL over RELA isn't deem that necessary,
> > >>> I will highly recommend RELA for simplicity and robustness.
> > >>> REL is error-prone.
> > >>
> > >> The worry is backward compatibility. Because of BPF ELF format
> > >> is so intervened with bpf eco system (loading, bpf map, etc.),
> > >> a lot of tools in the wild already implemented to parse REL...
> > >> It will be difficult to change...
> > >
> > > It seems that the design did not get enough initial scrutiny...
> > > (On https://reviews.llvm.org/D101336  , a reviewer who has apparently
> > > never contributed to lld/ELF clicked LGTM without actual reviewing the
> > > patch and that was why I have to click "Request Changes").
> > >
> > > I worry that keeping the current state as-is can cause much
> > > maintenance burden in the LLVM MC layer, linker, and other binary
> > > utilities.
> > > Some things can be improved without breaking backward compatibility.
> > >
> > >>>
> > >>> On 2021-05-24, Yonghong Song wrote:
> > >>>> LLVM upstream commit
> > >>>> https://reviews.llvm.org/D102712
> > >>>> made some changes to bpf relocations to make them
> > >>>> llvm linker lld friendly. The scope of
> > >>>> existing relocations R_BPF_64_{64,32} is narrowed
> > >>>> and new relocations R_BPF_64_{ABS32,ABS64,NODYLD32}
> > >>>> are introduced.
> > >>>>
> > >>>> Let us add some documentation about llvm bpf
> > >>>> relocations so people can understand how to resolve
> > >>>> them properly in their respective tools.
> > >>>>
> > >>>> Cc: John Fastabend <john.fastabend@gmail.com>
> > >>>> Cc: Lorenz Bauer <lmb@cloudflare.com>
> > >>>> Signed-off-by: Yonghong Song <yhs@fb.com>
> > >>>> ---
> > >>>> Documentation/bpf/index.rst            |   1 +
> > >>>> Documentation/bpf/llvm_reloc.rst       | 240 +++++++++++++++++++++++++
> > >>>> tools/testing/selftests/bpf/README.rst |  19 ++
> > >>>> 3 files changed, 260 insertions(+)
> > >>>> create mode 100644 Documentation/bpf/llvm_reloc.rst
> > >>>>

[...]

> > >>>> +  Enum  ELF Reloc Type     Description      BitSize  Offset
> > >>>> Calculation
> > >>>> +  0     R_BPF_NONE         None
> > >>>> +  1     R_BPF_64_64        ld_imm64 insn    32       r_offset + 4  S
> > >>>> + IA
> > >>>> +  2     R_BPF_64_ABS64     normal data      64       r_offset      S
> > >>>> + IA
> > >>>> +  3     R_BPF_64_ABS32     normal data      32       r_offset      S
> > >>>> + IA
> > >>>> +  4     R_BPF_64_NODYLD32  .BTF[.ext] data  32       r_offset      S
> > >>>> + IA
> > >>>> +  10    R_BPF_64_32        call insn        32       r_offset + 4  (S
> > >>>> + IA) / 8 - 1
> > >>>
> > >>> Shifting the offset by 4 looks weird. R_386_32 applies at r_offset.
> > >>> The call instruction  R_BPF_64_32 is strange. Such special calculation
> > >>> should not be named R_BPF_64_32.
> > >>
> > >> Again, we have a backward compatibility issue here. I would like to
> > >> provide an alias for it in llvm relocation header file, but I don't
> > >> know how to do it.
> > >
> > > It is very confusing that R_BPF_64_64 has a 32-bit value.
> >
> > If you like, we can make it as 64bit value.
> > R_BPF_64_64 is for ld_imm64 insn which is a 16byte insn.
> > The bytes 4-7 and 12-15 forms a 64bit value for the instruction.
> > We can do
> >       write32/read32 for bytes 4-7
> >       write32/read32 for bytes 12-15
> > for the relocation. Currently, we limit to bytes 4-7 since
> > in BPF it is really unlikely we have section offset > 4G.
> > But we could extend to full 64bit section offset.
>
> Such semantics have precedents, e.g. R_AARCH64_ADD_ABS_LO12_NC.
>
> For BPF, the name can be ABS_LO32: absolute, the low 32-bit value,
> with relocation overflow checking.
> There will be an out-of-range relocation if the value is outside
> [-2**31, 2**32).
>
> If the value is byte aligned, it will be more natural if you shift r_offset
> so that the linker just relocates some bytes starting at r_offset, instead of
> r_offset+4 or r_offset+12.
>
> ABS_LO32_NC (no-checking) + ABS_HI32 (absolute, the high 32-bit value) can be
> introduced in the fugure.
>
> > > Since its computation is the same as R_BPF_64_ABS32, can R_BPF_64_64
> > > be deprecated in favor of R_BPF_64_ABS32?
> >
> > Its computation is the same but R_BPF_64_ABS32 starts from offset
> > and R_BPF_64_64 starts from offset + 4.
> >
> > For R_BPF_64_64, the relocation offset is the *start* of the instruction
> > hence +4 is needed to actually read/write addend.
> >
> > To deprecate R_BPF_64_64 to be the same as R_BPF_64_ABS32, we will
> > need to change relocation offset. This will break existing libbpf
> > and cilium and likely many other tools, so I would prefer not
> > to do this.
>
> You can add a new relocation type. Backward compatibility is good.
> There can only be forward compatibility issues.

New relocation emitted by compiler for cases where R_BPF_64_64 is
emitted today will break all the existing BPF applications using
anything but bleeding-edge libbpf. It was kind of ok (but even that
already breaks selftests/bpf on bpf tree, for instance) to emit new
kinds of relocations (R_BPF_64_ABS32 and R_BPF_64_ABS64) for some
cases where normally R_BPF_64_64/R_BPF_64_32 would be used, but only
because r_offset and the rest of semantics didn't change and because
most existing uses (including libbpf) weren't strict about checking
relocation type.

But emitting new relocation that changes the meaning of r_offset is
absolutely not acceptable.

>
> I see some relocation types which are deemed fundamental on other architectures
> are just being introduced. How could they work without these new relocation
> types anyway?
>
> > >
> > > There is nothing preventing a relocation type from being used as data
> > > in some cases while code in other cases.
> > > R_BPF_64_64 can be renamed to indicate that it is deprecated.
> > > R_BPF_64_32 can be confused with R_BPF_64_ABS32. You may rename
> > > R_BPF_64_32 to say, R_BPF_64_CALL32.
> > >
> > > For compatibility, only the values matter, not the names.
> > > E.g. on x86, some unused GNU_PROPERTY values were renamed to
> > > GNU_PROPERTY_X86_COMPAT_ISA_1_USED ("COMPAT" for compatibility) while
> > > their values were kept.
> > > Two aarch64 relocation types have been renamed.
> >
> > Renaming sounds a more attractive choice. But a lot of other tools
> > have already used the name and it will be odd and not user friendly
> > to display a different name from llvm.
> >
> > For example elfutils, we have
> >    backends/bpf_symbol.c:    case R_BPF_64_64:
> >    libelf/elf.h:#define R_BPF_64_64
> >
> > My /usr/include/elf.h (from glibc-headers-2.28-149.el8.x86_64) has:
> >    /* BPF specific declarations.  */
> >
> >    #define R_BPF_NONE              0       /* No reloc */
> >    #define R_BPF_64_64             1
> >    #define R_BPF_64_32             10
> >
> > I agree the name is a little misleading, but renaming may cause
> > some confusion in bpf ecosystem. So we prefer to stay as is, but
> > with documentation to explain what each relocation intends to do.
>
> There are only 3 relocation types. R_BPF_NONE is good.
> There is still time figuring out proper naming and fixing them today.
> Otherwise I foresee that the naming problem will cause continuous confusion to
> other toolchain folks.
>
> Assuming R_BPF_64_ABS_LO32 convey the correct semantics, you can do
>
>     #define R_BPF_64_ABS_LO32       1
>     #define R_BPF_64_64             1 /* deprecated alias */
>
> Similarly, for BPF_64_32:
>
>     #define R_BPF_64_CALL32         10
>     #define R_BPF_64_32             10 /* deprecated alias */
>

I think renaming is fine given we leave old constants as aliases to
new ones. llvm-readelf output would change, but I doubt anyone is
doing anything programmatic with such human-oriented output, so I
think that's ok.

> > >>>
> > >>>> +For example, ``R_BPF_64_64`` relocation type is used for ``ld_imm64``
> > >>>> instruction.
> > >>>> +The actual to-be-relocated data (0 or section offset)
> > >>>> +is stored at ``r_offset + 4`` and the read/write
> > >>>> +data bitsize is 32 (4 bytes). The relocation can be resolved with
> > >>>> +the symbol value plus implicit addend. Note that the ``BitSize`` is
> > >>>> 32 which
> > >>>> +means the section offset must be less than or equal to ``UINT32_MAX``
> > >>>> and this
> > >>>> +is enforced by LLVM BPF backend.
> > >>>> +

[...]

> > But we don't want relocations in .BTF/.BTF.ext to be resolved with
> > actually addresses. They will be processed by bpf libraries and the
> > kernel. The reason not to have relocation resolution
> > is not due to their names, but due
> > to their functionality. If we intend to load dwarf to kernel, we
> > will issue R_BPF_64_NODYLD32 to dwarf as well.
>
> Is the problem due to REL relocations not being idempotent?
>
> > One can argue we should have fine control in RuntimeDyld so
> > that which section to have runtime relocation resolution
> > and which section does not. I don't know whether
> > ExecutionEngine/RuntimeDyld agree with that or not. But
> > BPF backend perspective, R_BPF_64_ABS32 and R_BPF_64_NODYLD32
> > are two different relocations, they may do something common
> > in some places like lld, but they may do different things
> > in other places like dyld.
>
> If RELA is used, will the tool be happy if you just unconditionally
> apply relocations?
>
> You are introducing new relocation types anyway, so migrating to RELA may
> simplify a bunch of things. The issue is only about forward compatibility
> for some tools.

It's not about breaking some tools, it's about breaking *every*
libbpf-based application. Emitting new relocation where either
semantics change (different meaning for r_offset) or its type changes
(REL vs RELA) is breaking everything with 100% reliability.

[...]

  reply	other threads:[~2021-06-08  4:22 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-25  3:33 [PATCH bpf-next v2] docs/bpf: add llvm_reloc.rst to explain llvm bpf relocations Yonghong Song
2021-05-25 18:29 ` Fangrui Song
2021-05-25 18:52   ` Yonghong Song
2021-06-05 21:03     ` Fāng-ruì Sòng
2021-06-07 21:06       ` Yonghong Song
2021-06-07 22:08         ` Fāng-ruì Sòng
2021-06-08  4:22           ` Andrii Nakryiko [this message]
2021-06-08  4:32           ` Yonghong Song
2021-06-08  5:51             ` Fāng-ruì Sòng
2021-06-08 15:49               ` Alexei Starovoitov
2021-06-08 16:33                 ` Fāng-ruì Sòng
2021-06-08 18:32                   ` Alexei Starovoitov
2021-06-08 23:10                     ` Fāng-ruì Sòng
2021-06-08 23:23                       ` Alexei Starovoitov
2021-05-25 15:25 Yonghong Song
2021-05-25 15:31 ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzZF6aQsR-y=tVObq_euvmu92gmm8TPUTc_XY3VmZjpLEA@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=lmb@cloudflare.com \
    --cc=maskray@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).