All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Yonghong Song <yhs@fb.com>, Florent Revest <revest@chromium.org>,
	KP Singh <kpsingh@chromium.org>, bpf <bpf@vger.kernel.org>,
	Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>,
	Florent Revest <revest@google.com>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper
Date: Fri, 18 Dec 2020 12:47:07 -0800	[thread overview]
Message-ID: <CAEf4BzZ7Pk677ZjUprRpsKzF3epBvouogxMU7YKN9i5pH2zQRg@mail.gmail.com> (raw)
In-Reply-To: <20201218203655.clqyeeamwicvej5z@ast-mbp>

On Fri, Dec 18, 2020 at 12:36 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Dec 18, 2020 at 10:53:57AM -0800, Andrii Nakryiko wrote:
> > On Thu, Dec 17, 2020 at 7:20 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Thu, Dec 17, 2020 at 09:26:09AM -0800, Yonghong Song wrote:
> > > >
> > > >
> > > > On 12/17/20 7:31 AM, Florent Revest wrote:
> > > > > On Mon, Dec 14, 2020 at 7:47 AM Yonghong Song <yhs@fb.com> wrote:
> > > > > > On 12/11/20 6:40 AM, Florent Revest wrote:
> > > > > > > On Wed, Dec 2, 2020 at 10:18 PM Alexei Starovoitov
> > > > > > > <alexei.starovoitov@gmail.com> wrote:
> > > > > > > > I still think that adopting printk/vsnprintf for this instead of
> > > > > > > > reinventing the wheel
> > > > > > > > is more flexible and easier to maintain long term.
> > > > > > > > Almost the same layout can be done with vsnprintf
> > > > > > > > with exception of \0 char.
> > > > > > > > More meaningful names, etc.
> > > > > > > > See Documentation/core-api/printk-formats.rst
> > > > > > >
> > > > > > > I agree this would be nice. I finally got a bit of time to experiment
> > > > > > > with this and I noticed a few things:
> > > > > > >
> > > > > > > First of all, because helpers only have 5 arguments, if we use two for
> > > > > > > the output buffer and its size and two for the format string and its
> > > > > > > size, we are only left with one argument for a modifier. This is still
> > > > > > > enough for our usecase (where we'd only use "%ps" for example) but it
> > > > > > > does not strictly-speaking allow for the same layout that Andrii
> > > > > > > proposed.
> > > > > >
> > > > > > See helper bpf_seq_printf. It packs all arguments for format string and
> > > > > > puts them into an array. bpf_seq_printf will unpack them as it parsed
> > > > > > through the format string. So it should be doable to have more than
> > > > > > "%ps" in format string.
> > > > >
> > > > > This could be a nice trick, thank you for the suggestion Yonghong :)
> > > > >
> > > > > My understanding is that this would also require two extra args (one
> > > > > for the array of arguments and one for the size of this array) so it
> > > > > would still not fit the 5 arguments limit I described in my previous
> > > > > email.
> > > > > eg: this would not be possible:
> > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > >                    const char *fmt, u32 fmt_size,
> > > > >                   const void *data, u32 data_len)
> > > >
> > > > Right. bpf allows only up to 5 parameters.
> > > > >
> > > > > Would you then suggest that we also put the format string and its
> > > > > length in the first and second cells of this array and have something
> > > > > along the line of:
> > > > > long bpf_snprintf(const char *out, u32 out_size,
> > > > >                    const void *args, u32 args_len) ?
> > > > > This seems like a fairly opaque signature to me and harder to verify.
> > > >
> > > > One way is to define an explicit type for args, something like
> > > >    struct bpf_fmt_str_data {
> > > >       char *fmt;
> > > >       u64 fmt_len;
> > > >       u64 data[];
> > > >    };
> > >
> > > that feels a bit convoluted.
> > >
> > > The reason I feel unease with the helper as was originally proposed
> > > and with Andrii's proposal is all the extra strlen and strcpy that
> > > needs to be done. In the helper we have to call kallsyms_lookup()
> > > which is ok interface for what it was desinged to do,
> > > but it's awkward to use to construct new string ("%s [%s]", sym, modname)
> > > or to send two strings into a ring buffer.
> > > Andrii's zero separator idea will simplify bpf prog, but user space
> > > would need to do strlen anyway if it needs to pretty print.
> > > If we take pain on converting addr to sym+modname let's figure out
> > > how to make it easy for the bpf prog to do and easy for user space to consume.
> > > That's why I proposed snprintf.
> >
> > I have nothing against snprintf support for symbols. But
> > bpf_ksym_resolve() solves only a partially overlapping problem, so
> > deserves to be added in addition to snprintf support. With snprintf,
> > it will be hard to avoid two lookups of the same symbol to print "%s
> > [%s]" form, so there is a performance loss, which is probably bigger
> > than a simple search for a zero-byte.
>
> I suspect we're not on the same page in terms of what printf can do.
> See Documentation/core-api/printk-formats.rst and lib/vsprintf.c:symbol_string()
> It's exactly one lookup in sprintf implementation.
> bpf_snprintf(buf, "%ps", addr) would be equivalent to
> {
>   ksym_resolve(sym, modname, addr, SYM | MOD);
>   printf("%s [%s]", sym, modname);
> }

Ah, I missed that we'll have a single specifier for "%s [%s]" format.
My assumption was that we have one for symbol name only and another
for symbol module. Yeah, then it's fine from the performance
perspective.

>
> > But bpf_ksym_resolve() can be
> > used flexibly. You can either do two separate bpf_ksym_resolve() calls
> > to get symbol name (and its length) and symbol's module (and its
> > length), if you need to process it programmatically in BPF program. Or
> > you can bundle it together and let user-space process it. User-space
> > will need to copy data anyways because it can't stay in
> > perfbuf/ringbuf for long. So scanning for zero delimiters will be
> > negligible, it will just bring data into cache. All I'm saying is that
> > ksym_resolve() gives flexibility which snprintf can't provide.
>
> Well, with snprintf there will be no way to print mod symbol
> without modname, but imo it's a good thing.
> What is the use case for getting mod symbol without modname?

For easier post-processing on the user side. Instead of parsing
"vmlinux_symbol" or "module_symbol [module_name]" (two non-uniform
variants already), user-space would just get two separate strings. I
just like APIs that don't assume how I am going to use them :), so
"symbol [module]" format is a bit more inconvenient than decomposed
pieces.

>
> > Additionally, with ksym_resolve() being able to return base address,
> > it's now possible to do a bunch of new stuff, from in-BPF
> > symbolization to additional things like correlating memory accesses or
> > function calls, etc.
>
> Getting adjusted base address could be useful some day, but why now? What for?

I proposed that only if we do bpf_ksym_resolve(). No need to support
that in snprintf case, of course.

>
> > bits), my point is that ksym_resolve() is more powerful than
> > snprintf(): the latter can be used pretty much only for
> > pretty-printing.
>
> Potentially yes. I think the stated goal was pretty printing.

That's fine if we do only snprintf, yes. But if a separate helper,
then we should think more broadly.

>
> >
> > >
> > > As far as 6 arg issue:
> > > long bpf_snprintf(const char *out, u32 out_size,
> > >                   const char *fmt, u32 fmt_size,
> > >                   const void *data, u32 data_len);
> > > Yeah. It won't work as-is, but fmt_size is unnecessary nowadays.
> > > The verifier understands read-only data.
> > > Hence the helper can be:
> > > long bpf_snprintf(const char *out, u32 out_size,
> >
> > With the power of BTF, we can also put these two correlated values
> > into a single struct and pass a pointer to it. It will take only one
> > parameter for one memory region. Alternative is the "fat pointer"
> > approach that Go and Rust use, but it's less flexible overall.
>
> I think it will be less flexible when output size is fixed by the type info.
> With explicit size the bpf_snprintf() can print directly into ringbuffer.
> Multiple bpf_snprintf() will be able to fill it one by one reducing
> space available at every step.
> bpf_snprintf() would need to return the number of bytes, of course.
> Just like probe_read_str.

Ok, I should have probably demonstrated with an example. I don't
propose to specify the size through BTF itself. I was thinking about:

struct bpf_mem_ptr {
    void *data;
    size_t size;
};


struct bpf_mem_ptr p = { ptr, 123 };
bpf_whatever_helper(&p, ...);


bpf_whatever_helper() will specify that the first argument has to be
PTR_TO_BTF_ID where btf_id corresponds to struct bpf_mem_ptr. Hope
this helps.

  reply	other threads:[~2020-12-18 20:48 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-11-26 16:57 [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper Florent Revest
2020-11-26 16:57 ` [PATCH bpf-next 2/2] selftests/bpf: Add bpf_kallsyms_lookup test Florent Revest
2020-12-02  0:57   ` Andrii Nakryiko
2020-11-27  2:32 ` [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper KP Singh
2020-11-27  9:25   ` Florent Revest
2020-11-27  9:27     ` Florent Revest
2020-11-27  7:35 ` Yonghong Song
2020-11-27  9:20   ` Florent Revest
2020-11-27 11:20   ` KP Singh
2020-11-27 16:09     ` Yonghong Song
2020-12-02  0:55       ` Andrii Nakryiko
2020-12-02 20:32         ` Florent Revest
2020-12-02 21:18           ` Alexei Starovoitov
2020-12-11 14:40             ` Florent Revest
2020-12-14  6:47               ` Yonghong Song
2020-12-17 15:31                 ` Florent Revest
2020-12-17 17:26                   ` Yonghong Song
2020-12-18  3:20                     ` Alexei Starovoitov
2020-12-18  4:39                       ` Yonghong Song
2020-12-18 18:53                       ` Andrii Nakryiko
2020-12-18 20:36                         ` Alexei Starovoitov
2020-12-18 20:47                           ` Andrii Nakryiko [this message]
2020-12-22 20:38                             ` Florent Revest
2020-12-22 20:52                       ` Florent Revest
2020-12-22 14:18                 ` Christoph Hellwig
2020-12-22 20:17                   ` Florent Revest
2020-12-23  7:50                     ` Christoph Hellwig
2020-12-02  0:47     ` Andrii Nakryiko
2020-11-27 17:20 ` kernel test robot
2020-11-27 17:20   ` kernel test robot
2020-11-27 17:20 ` [RFC PATCH] bpf: bpf_kallsyms_lookup_proto can be static kernel test robot
2020-11-27 17:20   ` kernel test robot
2020-11-29  1:07 ` [PATCH bpf-next 1/2] bpf: Add a bpf_kallsyms_lookup helper Alexei Starovoitov
2020-11-30 16:23   ` Florent Revest
2020-12-01  2:41     ` Alexei Starovoitov
2020-12-01 20:25       ` Florent Revest

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzZ7Pk677ZjUprRpsKzF3epBvouogxMU7YKN9i5pH2zQRg@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=alexei.starovoitov@gmail.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=kpsingh@chromium.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=revest@chromium.org \
    --cc=revest@google.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.