From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>,
	Alexei Starovoitov <ast@kernel.org>, bpf <bpf@vger.kernel.org>,
	Heiko Carstens <hca@linux.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>
Subject: Re: [PATCH bpf-next 3/3] libbpf: Use bpf_probe_read_kernel
Date: Fri, 31 Jul 2020 10:41:53 -0700
Message-ID: <CAEf4BzbGMKPTUw=B1tC=NsYn7oUQb3tmUEghRd-URT1tu0hNiA@mail.gmail.com>
In-Reply-To: <6177128b-bef5-7445-bf00-8051f8efa3bc@iogearbox.net>

On Wed, Jul 29, 2020 at 3:12 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 7/30/20 12:05 AM, Andrii Nakryiko wrote:
> > On Wed, Jul 29, 2020 at 2:54 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >> On 7/29/20 11:36 PM, Andrii Nakryiko wrote:
> >>> On Wed, Jul 29, 2020 at 2:01 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 7/29/20 6:06 AM, Andrii Nakryiko wrote:
> >>>>> On Tue, Jul 28, 2020 at 2:16 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>>> On 7/28/20 9:11 PM, Andrii Nakryiko wrote:
> >>>>>>> On Tue, Jul 28, 2020 at 5:15 AM Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>>>>>>>
> >>>>>>>> Yet another adaptation to commit 0ebeea8ca8a4 ("bpf: Restrict
> >>>>>>>> bpf_probe_read{, str}() only to archs where they work") that makes more
> >>>>>>>> samples compile on s390.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> >>>>>>>
> >>>>>>> Sorry, we can't do this yet. This will break on older kernels that
> >>>>>>> don't yet have bpf_probe_read_kernel() implemented. Me and Yonghong
> >>>>>>> are working on extending the set of CO-RE relocations, which would
> >>>>>>> allow bpf_probe_read_kernel() detection on the BPF side, transparently
> >>>>>>> for an application, and will pick either bpf_probe_read() or
> >>>>>>> bpf_probe_read_kernel(). It should be ready soon (this or next week,
> >>>>>>> most probably), though it will depend on the latest Clang.
> >>>>>>> But for now, please don't change this.
> >>>>>>
> >>>>>> Could you elaborate what this means wrt dependency on latest clang? Given clang
> >>>>>> releases have a rather long cadence, what about existing users with current clang
> >>>>>> releases?
> >>>>>
> >>>>> So the overall idea is to use something like this to do kernel reads:
> >>>>>
> >>>>> static __always_inline int bpf_probe_read_universal(void *dst, u32 sz,
> >>>>>                                                     const void *src)
> >>>>> {
> >>>>>         if (bpf_core_type_exists(btf_bpf_probe_read_kernel))
> >>>>>                 return bpf_probe_read_kernel(dst, sz, src);
> >>>>>         else
> >>>>>                 return bpf_probe_read(dst, sz, src);
> >>>>> }
> >>>>>
> >>>>> And then use bpf_probe_read_universal() in BPF_CORE_READ and family.
> >>>>>
> >>>>> This approach relies on a few things:
> >>>>>
> >>>>> 1. each BPF helper has a corresponding btf_<helper-name> type defined for it
> >>>>> 2. bpf_core_type_exists(some_type) returns 0 or 1, depending on whether
> >>>>> the specified type is found in kernel BTF (so it needs kernel BTF, of
> >>>>> course). This is the part me and Yonghong are working on at the
> >>>>> moment.
> >>>>> 3. verifier's dead code elimination, which will leave only the
> >>>>> bpf_probe_read() or the bpf_probe_read_kernel() call and will remove the
> >>>>> other one. So on older kernels, there will never be an unsupported call
> >>>>> to bpf_probe_read_kernel().
> >>>>>
> >>>>> The new type existence relocation requires the latest Clang. So the
> >>>>> way to deal with older Clangs would be to just fall back to
> >>>>> bpf_probe_read() if we detect that Clang is too old and can't emit
> >>>>> the necessary relocation.
> >>>>
> >>>> Okay, seems reasonable overall. One question though: couldn't libbpf transparently
> >>>> fix up the selection of bpf_probe_read() vs bpf_probe_read_kernel()? E.g. it would
> >>>> probe the kernel whether bpf_probe_read_kernel() is available and if it is then it
> >>>> would rewrite the raw call number from the instruction from bpf_probe_read() into
> >>>> the one for bpf_probe_read_kernel()? I guess the question then becomes whether the
> >>>> original use for bpf_probe_read() was related to CO-RE. But I think this could also
> >>>> be overcome by adding a fake helper signature in libbpf with an unreasonably high
> >>>> number that is dedicated to probing mem via CO-RE and then libbpf picks the right
> >>>> underlying helper call number for the insn. That avoids fiddling with macros and
> >>>> need for new clang version, no (unless I'm missing something)?
> >>>
> >>> Libbpf could do it, but I'm a bit worried that unconditionally
> >>> changing bpf_probe_read() into bpf_probe_read_kernel() is going to be
> >>> wrong in some cases. If that wasn't the case, why wouldn't we just
> >>> re-purpose bpf_probe_read() into bpf_probe_read_kernel() in kernel
> >>> itself, right?
> >>
> >> Yes, that is correct, but I mentioned above that this new 'fake' helper call number
> >> that libbpf would be fixing up would only be used for bpf_probe_read{,str}() inside
> >> bpf_core_read.h.
> >>
> >> Small example, bpf_core_read.h would be changed to (just an extract):
> >>
> >> diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
> >> index eae5cccff761..4bddb2ddf3f0 100644
> >> --- a/tools/lib/bpf/bpf_core_read.h
> >> +++ b/tools/lib/bpf/bpf_core_read.h
> >> @@ -115,7 +115,7 @@ enum bpf_field_info_kind {
> >>     * (local) BTF, used to record relocation.
> >>     */
> >>    #define bpf_core_read(dst, sz, src)                                        \
> >> -       bpf_probe_read(dst, sz,                                             \
> >> +       bpf_probe_read_selector(dst, sz,                                                    \
> >>                          (const void *)__builtin_preserve_access_index(src))
> >>
> >>    /*
> >> @@ -124,7 +124,7 @@ enum bpf_field_info_kind {
> >>     * argument.
> >>     */
> >>    #define bpf_core_read_str(dst, sz, src)                                            \
> >> -       bpf_probe_read_str(dst, sz,                                         \
> >> +       bpf_probe_read_str_selector(dst, sz,                                        \
> >>                              (const void *)__builtin_preserve_access_index(src))
> >>
> >>    #define ___concat(a, b) a ## b
> >>
> >> And bpf_probe_read_{,str_}selector would be defined as e.g. ...
> >>
> >> static long (*bpf_probe_read_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -1;
> >> static long (*bpf_probe_read_str_selector)(void *dst, __u32 size, const void *unsafe_ptr) = (void *) -2;
> >>
> >> ... where libbpf would do the fix up to either 4 or 45 for insn->imm. But it's still
> >> confined to usage in bpf_core_read.h when the CO-RE macros are used.
> >
> > Ah, I see. Yeah, I suppose that would work as well. Do you prefer me
> > to go this way?
>
> I would suggest we should try this path given this can be used with any clang version
> that has the BPF backend, not just latest upstream git.

I have an even better solution, I think. Convert everything to
bpf_probe_read_kernel() or bpf_probe_read_user() unconditionally, but
let libbpf switch those two back to bpf_probe_read() if the
_kernel()/_user() variants are not yet available in the running kernel.
That should handle both the CO-RE helpers and pretty much any other use
case that was converted.
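
A minimal sketch of such a fallback pass, not actual libbpf code. The
struct, the enum names and downgrade_probe_reads() are made up for
illustration. Helper IDs 4 and 45 are the ones quoted above; 112-115 are
assumed to be the IDs that include/uapi/linux/bpf.h assigns to the
_user()/_kernel() variants. The caller is assumed to have already probed
the kernel and found those variants missing.

```c
#include <stddef.h>
#include <stdint.h>

/* Helper IDs; 112..115 are assumed to match include/uapi/linux/bpf.h. */
enum {
	FN_PROBE_READ            = 4,
	FN_PROBE_READ_STR        = 45,
	FN_PROBE_READ_USER       = 112,
	FN_PROBE_READ_KERNEL     = 113,
	FN_PROBE_READ_USER_STR   = 114,
	FN_PROBE_READ_KERNEL_STR = 115,
};

#define OP_CALL 0x85 /* BPF_JMP | BPF_CALL */

/* Simplified stand-in for struct bpf_insn (illustrative only). */
struct insn {
	uint8_t code;
	uint8_t dst_reg : 4;
	uint8_t src_reg : 4;
	int16_t off;
	int32_t imm;
};

/*
 * Hypothetical pass: rewrite calls to the _kernel()/_user() helpers into
 * the legacy bpf_probe_read{,_str}() IDs. Returns the number of call
 * instructions that were patched.
 */
static size_t downgrade_probe_reads(struct insn *insns, size_t cnt)
{
	size_t i, patched = 0;

	for (i = 0; i < cnt; i++) {
		struct insn *insn = &insns[i];

		/* Only helper calls; src_reg != 0 marks other call kinds. */
		if (insn->code != OP_CALL || insn->src_reg != 0)
			continue;

		switch (insn->imm) {
		case FN_PROBE_READ_KERNEL:
		case FN_PROBE_READ_USER:
			insn->imm = FN_PROBE_READ;
			patched++;
			break;
		case FN_PROBE_READ_KERNEL_STR:
		case FN_PROBE_READ_USER_STR:
			insn->imm = FN_PROBE_READ_STR;
			patched++;
			break;
		default:
			break;
		}
	}
	return patched;
}
```

Since only insn->imm changes, the BPF program source needs no macros and
no particular Clang version; the rewrite happens on the loaded
instructions before they are handed to the kernel.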


>
> Thanks,
> Daniel
