All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: "Fāng-ruì Sòng" <maskray@google.com>
Cc: Yonghong Song <yhs@fb.com>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Andrii Nakryiko <andrii@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Lorenz Bauer <lmb@cloudflare.com>
Subject: Re: [PATCH bpf-next v2] docs/bpf: add llvm_reloc.rst to explain llvm bpf relocations
Date: Mon, 7 Jun 2021 21:22:12 -0700	[thread overview]
Message-ID: <CAEf4BzZF6aQsR-y=tVObq_euvmu92gmm8TPUTc_XY3VmZjpLEA@mail.gmail.com> (raw)
In-Reply-To: <CAFP8O3J4_aaT+POmB6H6mihuP1-VQ4ww1nVrHxEvd70S5ODEUw@mail.gmail.com>

On Mon, Jun 7, 2021 at 3:08 PM Fāng-ruì Sòng <maskray@google.com> wrote:
>
> On Mon, Jun 7, 2021 at 2:06 PM Yonghong Song <yhs@fb.com> wrote:
> >
> >
> >
> > On 6/5/21 2:03 PM, Fāng-ruì Sòng wrote:
> > > On Tue, May 25, 2021 at 11:52 AM Yonghong Song <yhs@fb.com> wrote:
> > >>
> > >>
> > >>
> > >> On 5/25/21 11:29 AM, Fangrui Song wrote:
> > >>> I have a review queue with a huge pile of LLVM patches and have only
> > >>> skimmed through this.
> > >>>
> > >>> First, if the size benefit of REL over RELA isn't deem that necessary,
> > >>> I will highly recommend RELA for simplicity and robustness.
> > >>> REL is error-prone.
> > >>
> > >> The worry is backward compatibility. Because of BPF ELF format
> > >> is so intervened with bpf eco system (loading, bpf map, etc.),
> > >> a lot of tools in the wild already implemented to parse REL...
> > >> It will be difficult to change...
> > >
> > > It seems that the design did not get enough initial scrutiny...
> > > (On https://reviews.llvm.org/D101336  , a reviewer who has apparently
> > > never contributed to lld/ELF clicked LGTM without actual reviewing the
> > > patch and that was why I have to click "Request Changes").
> > >
> > > I worry that keeping the current state as-is can cause much
> > > maintenance burden in the LLVM MC layer, linker, and other binary
> > > utilities.
> > > Some things can be improved without breaking backward compatibility.
> > >
> > >>>
> > >>> On 2021-05-24, Yonghong Song wrote:
> > >>>> LLVM upstream commit
> > >>>> https://reviews.llvm.org/D102712
> > >>>> made some changes to bpf relocations to make them
> > >>>> llvm linker lld friendly. The scope of
> > >>>> existing relocations R_BPF_64_{64,32} is narrowed
> > >>>> and new relocations R_BPF_64_{ABS32,ABS64,NODYLD32}
> > >>>> are introduced.
> > >>>>
> > >>>> Let us add some documentation about llvm bpf
> > >>>> relocations so people can understand how to resolve
> > >>>> them properly in their respective tools.
> > >>>>
> > >>>> Cc: John Fastabend <john.fastabend@gmail.com>
> > >>>> Cc: Lorenz Bauer <lmb@cloudflare.com>
> > >>>> Signed-off-by: Yonghong Song <yhs@fb.com>
> > >>>> ---
> > >>>> Documentation/bpf/index.rst            |   1 +
> > >>>> Documentation/bpf/llvm_reloc.rst       | 240 +++++++++++++++++++++++++
> > >>>> tools/testing/selftests/bpf/README.rst |  19 ++
> > >>>> 3 files changed, 260 insertions(+)
> > >>>> create mode 100644 Documentation/bpf/llvm_reloc.rst
> > >>>>

[...]

> > >>>> +  Enum  ELF Reloc Type     Description      BitSize  Offset
> > >>>> Calculation
> > >>>> +  0     R_BPF_NONE         None
> > >>>> +  1     R_BPF_64_64        ld_imm64 insn    32       r_offset + 4  S
> > >>>> + IA
> > >>>> +  2     R_BPF_64_ABS64     normal data      64       r_offset      S
> > >>>> + IA
> > >>>> +  3     R_BPF_64_ABS32     normal data      32       r_offset      S
> > >>>> + IA
> > >>>> +  4     R_BPF_64_NODYLD32  .BTF[.ext] data  32       r_offset      S
> > >>>> + IA
> > >>>> +  10    R_BPF_64_32        call insn        32       r_offset + 4  (S
> > >>>> + IA) / 8 - 1
> > >>>
> > >>> Shifting the offset by 4 looks weird. R_386_32 applies at r_offset.
> > >>> The call instruction  R_BPF_64_32 is strange. Such special calculation
> > >>> should not be named R_BPF_64_32.
> > >>
> > >> Again, we have a backward compatibility issue here. I would like to
> > >> provide an alias for it in llvm relocation header file, but I don't
> > >> know how to do it.
> > >
> > > It is very confusing that R_BPF_64_64 has a 32-bit value.
> >
> > If you like, we can make it as 64bit value.
> > R_BPF_64_64 is for ld_imm64 insn which is a 16byte insn.
> > The bytes 4-7 and 12-15 forms a 64bit value for the instruction.
> > We can do
> >       write32/read32 for bytes 4-7
> >       write32/read32 for bytes 12-15
> > for the relocation. Currently, we limit to bytes 4-7 since
> > in BPF it is really unlikely we have section offset > 4G.
> > But we could extend to full 64bit section offset.
>
> Such semantics have precedents, e.g. R_AARCH64_ADD_ABS_LO12_NC.
>
> For BPF, the name can be ABS_LO32: absolute, the low 32-bit value,
> with relocation overflow checking.
> There will be an out-of-range relocation if the value is outside
> [-2**31, 2**32).
>
> If the value is byte aligned, it will be more natural if you shift r_offset
> so that the linker just relocates some bytes starting at r_offset, instead of
> r_offset+4 or r_offset+12.
>
> ABS_LO32_NC (no-checking) + ABS_HI32 (absolute, the high 32-bit value) can be
> introduced in the fugure.
>
> > > Since its computation is the same as R_BPF_64_ABS32, can R_BPF_64_64
> > > be deprecated in favor of R_BPF_64_ABS32?
> >
> > Its computation is the same but R_BPF_64_ABS32 starts from offset
> > and R_BPF_64_64 starts from offset + 4.
> >
> > For R_BPF_64_64, the relocation offset is the *start* of the instruction
> > hence +4 is needed to actually read/write addend.
> >
> > To deprecate R_BPF_64_64 to be the same as R_BPF_64_ABS32, we will
> > need to change relocation offset. This will break existing libbpf
> > and cilium and likely many other tools, so I would prefer not
> > to do this.
>
> You can add a new relocation type. Backward compatibility is good.
> There can only be forward compatibility issues.

New relocation emitted by compiler for cases where R_BPF_64_64 is
emitted today will break all the existing BPF applications using
anything but bleeding-edge libbpf. It was kind of ok (but even that
already breaks selftests/bpf on bpf tree, for instance) to emit new
kinds of relocations (R_BPF_64_ABS32 and R_BPF_64_ABS64) for some
cases where normally R_BPF_64_64/R_BPF_64_32 would be used, but only
because r_offset and the rest of semantics didn't change and because
most existing uses (including libbpf) weren't strict about checking
relocation type.

But emitting new relocation that changes the meaning of r_offset is
absolutely not acceptable.

>
> I see some relocation types which are deemed fundamental on other architectures
> are just being introduced. How could they work without these new relocation
> types anyway?
>
> > >
> > > There is nothing preventing a relocation type from being used as data
> > > in some cases while code in other cases.
> > > R_BPF_64_64 can be renamed to indicate that it is deprecated.
> > > R_BPF_64_32 can be confused with R_BPF_64_ABS32. You may rename
> > > R_BPF_64_32 to say, R_BPF_64_CALL32.
> > >
> > > For compatibility, only the values matter, not the names.
> > > E.g. on x86, some unused GNU_PROPERTY values were renamed to
> > > GNU_PROPERTY_X86_COMPAT_ISA_1_USED ("COMPAT" for compatibility) while
> > > their values were kept.
> > > Two aarch64 relocation types have been renamed.
> >
> > Renaming sounds a more attractive choice. But a lot of other tools
> > have already used the name and it will be odd and not user friendly
> > to display a different name from llvm.
> >
> > For example elfutils, we have
> >    backends/bpf_symbol.c:    case R_BPF_64_64:
> >    libelf/elf.h:#define R_BPF_64_64
> >
> > My /usr/include/elf.h (from glibc-headers-2.28-149.el8.x86_64) has:
> >    /* BPF specific declarations.  */
> >
> >    #define R_BPF_NONE              0       /* No reloc */
> >    #define R_BPF_64_64             1
> >    #define R_BPF_64_32             10
> >
> > I agree the name is a little misleading, but renaming may cause
> > some confusion in bpf ecosystem. So we prefer to stay as is, but
> > with documentation to explain what each relocation intends to do.
>
> There are only 3 relocation types. R_BPF_NONE is good.
> There is still time figuring out proper naming and fixing them today.
> Otherwise I foresee that the naming problem will cause continuous confusion to
> other toolchain folks.
>
> Assuming R_BPF_64_ABS_LO32 convey the correct semantics, you can do
>
>     #define R_BPF_64_ABS_LO32       1
>     #define R_BPF_64_64             1 /* deprecated alias */
>
> Similarly, for BPF_64_32:
>
>     #define R_BPF_64_CALL32         10
>     #define R_BPF_64_32             10 /* deprecated alias */
>

I think renaming is fine given we leave old constants as aliases to
new ones. llvm-readelf output would change, but I doubt anyone is
doing anything programmatic with such human-oriented output, so I
think that's ok.

> > >>>
> > >>>> +For example, ``R_BPF_64_64`` relocation type is used for ``ld_imm64``
> > >>>> instruction.
> > >>>> +The actual to-be-relocated data (0 or section offset)
> > >>>> +is stored at ``r_offset + 4`` and the read/write
> > >>>> +data bitsize is 32 (4 bytes). The relocation can be resolved with
> > >>>> +the symbol value plus implicit addend. Note that the ``BitSize`` is
> > >>>> 32 which
> > >>>> +means the section offset must be less than or equal to ``UINT32_MAX``
> > >>>> and this
> > >>>> +is enforced by LLVM BPF backend.
> > >>>> +

[...]

> > But we don't want relocations in .BTF/.BTF.ext to be resolved with
> > actually addresses. They will be processed by bpf libraries and the
> > kernel. The reason not to have relocation resolution
> > is not due to their names, but due
> > to their functionality. If we intend to load dwarf to kernel, we
> > will issue R_BPF_64_NODYLD32 to dwarf as well.
>
> Is the problem due to REL relocations not being idempotent?
>
> > One can argue we should have fine control in RuntimeDyld so
> > that which section to have runtime relocation resolution
> > and which section does not. I don't know whether
> > ExecutionEngine/RuntimeDyld agree with that or not. But
> > BPF backend perspective, R_BPF_64_ABS32 and R_BPF_64_NODYLD32
> > are two different relocations, they may do something common
> > in some places like lld, but they may do different things
> > in other places like dyld.
>
> If RELA is used, will the tool be happy if you just unconditionally
> apply relocations?
>
> You are introducing new relocation types anyway, so migrating to RELA may
> simplify a bunch of things. The issue is only about forward compatibility
> for some tools.

It's not about breaking some tools, it's about breaking *every*
libbpf-based application. Emitting new relocation where either
semantics change (different meaning for r_offset) or its type changes
(REL vs RELA) is breaking everything with 100% reliability.

[...]

  reply	other threads:[~2021-06-08  4:22 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-25  3:33 [PATCH bpf-next v2] docs/bpf: add llvm_reloc.rst to explain llvm bpf relocations Yonghong Song
2021-05-25 18:29 ` Fangrui Song
2021-05-25 18:52   ` Yonghong Song
2021-06-05 21:03     ` Fāng-ruì Sòng
2021-06-07 21:06       ` Yonghong Song
2021-06-07 22:08         ` Fāng-ruì Sòng
2021-06-08  4:22           ` Andrii Nakryiko [this message]
2021-06-08  4:32           ` Yonghong Song
2021-06-08  5:51             ` Fāng-ruì Sòng
2021-06-08 15:49               ` Alexei Starovoitov
2021-06-08 16:33                 ` Fāng-ruì Sòng
2021-06-08 18:32                   ` Alexei Starovoitov
2021-06-08 23:10                     ` Fāng-ruì Sòng
2021-06-08 23:23                       ` Alexei Starovoitov
2021-05-25 15:25 Yonghong Song
2021-05-25 15:31 ` Yonghong Song

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAEf4BzZF6aQsR-y=tVObq_euvmu92gmm8TPUTc_XY3VmZjpLEA@mail.gmail.com' \
    --to=andrii.nakryiko@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=kernel-team@fb.com \
    --cc=lmb@cloudflare.com \
    --cc=maskray@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.