All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Alan Maguire <alan.maguire@oracle.com>
Cc: Alexei Starovoitov <ast@kernel.org>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Andrii Nakryiko <andrii@kernel.org>, Martin Lau <kafai@fb.com>,
	Song Liu <songliubraving@fb.com>, Yonghong Song <yhs@fb.com>,
	john fastabend <john.fastabend@gmail.com>,
	KP Singh <kpsingh@kernel.org>, Bill Wendling <morbo@google.com>,
	Shuah Khan <shuah@kernel.org>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	"open list:KERNEL SELFTEST FRAMEWORK" 
	<linux-kselftest@vger.kernel.org>,
	open list <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data
Date: Fri, 22 Jan 2021 12:05:00 -0800	[thread overview]
Message-ID: <CAEf4BzZBVjUQnPxG1hyxkoM5HLWyEm2VJjOg0MoogrBdm6QdEQ@mail.gmail.com> (raw)
In-Reply-To: <alpine.LRH.2.23.451.2101221612440.12992@localhost>

On Fri, Jan 22, 2021 at 8:31 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Thu, 21 Jan 2021, Andrii Nakryiko wrote:
>
> > On Wed, Jan 20, 2021 at 10:56 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Sun, Jan 17, 2021 at 2:22 PM Alan Maguire <alan.maguire@oracle.com> wrote:
> > > >
> > > > Add a BTF dumper for typed data, so that the user can dump a typed
> > > > version of the data provided.
> > > >
> > > > The API is
> > > >
> > > > int btf_dump__emit_type_data(struct btf_dump *d, __u32 id,
> > > >                              const struct btf_dump_emit_type_data_opts *opts,
> > > >                              void *data);
> > > >
> >
> > Two more things I realized about this API overnight:
> >
> > 1. It's error-prone to specify only the pointer to data without
> > specifying the size. If user screws up and scecifies wrong type ID or
> > if BTF data is corrupted, then this API would start reading and
> > printing memory outside the bounds. I think it's much better to also
> > require user to specify the size and bail out with error if we reach
> > the end of the allowed memory area.
>
> Yep, good point, especially given in the tracing context we will likely
> only have a subset of the data (e.g. part of the 16k representing a
> task_struct).  The way I was approaching this was to return -E2BIG
> and append a "..." to the dumped data denoting the data provided
> didn't cover the size needed to fully represent the type. The idea is
> the structure is too big for the data provided, hence E2BIG, but maybe
> there's a more intuitive way to do this? See below for more...
>

Hm... that's an interesting use case for sure, but seems reasonable to
support. "..." seems a bit misleading because it can be interpreted as
"we omitted some output for brevity", no? "<truncated>" or something
like that might be more obvious, but I'm just bikeshedding :)

> >
> > 2. This API would be more useful if it also returns the amount of
> > "consumed" bytes. That way users can do more flexible and powerful
> > pretty-printing of raw data. So on success we'll have >= 0 number of
> > bytes used for dumping given BTF type, or <0 on error. WDYT?
> >
>
> I like it! So
>
> 1. if a user provides a too-big data object, we return the amount we used; and
> 2. if a user provides a too-small data object, we append "..." to the dump
>   and return -E2BIG (or whatever error code).
>
> However I wonder for case 2 if it'd be better to use a snprintf()-like
> semantic rather than an error code, returning the amount we would have
> used. That way we easily detect case 1 (size passed in > return value),
> case 2 (size passed in < return value), and errors can be treated separately.
> Feels to me that dealing with truncated data is going to be sufficiently
> frequent it might be good not to classify it as an error. Let me know if
> you think that makes sense.

Hm... Yeah, that would work, I think, and would feel pretty natural.
On the other hand, it's easy to know the total input size needed by
calling btf__resolve_size(btf, type_id), so if user expects to provide
truncated input data and wants to know how much they should have
provided, they can easily do that.

Basically, I don't have strong preference here, though providing
truncated input data still feels more like an error, than a normal
situation... Maybe someone else want to weigh in? And -E2BIG is
distinctive enough in this case. So both would work fine, but not
clear which one is less surprising API.

>
> I'm working on v3, and hope to have something early next week, but a quick
> reply to a question below...
>
> > > > ...where the id is the BTF id of the data pointed to by the "void *"
> > > > argument; for example the BTF id of "struct sk_buff" for a
> > > > "struct skb *" data pointer.  Options supported are
> > > >
> > > >  - a starting indent level (indent_lvl)
> > > >  - a set of boolean options to control dump display, similar to those
> > > >    used for BPF helper bpf_snprintf_btf().  Options are
> > > >         - compact : omit newlines and other indentation
> > > >         - noname: omit member names
> > > >         - zero: show zero-value members
> > > >
> > > > Default output format is identical to that dumped by bpf_snprintf_btf(),
> > > > for example a "struct sk_buff" representation would look like this:
> > > >
> > > > struct sk_buff){
> > > >  (union){
> > > >   (struct){
> > >
> > > Curious, these explicit anonymous (union) and (struct), is that
> > > preferred way for explicitness, or is it just because it makes
> > > implementation simpler and thus was chosen? I.e., if the goal was to
> > > mimic C-style data initialization, you'd just have plain .next = ...,
> > > .prev = ..., .dev = ..., .dev_scratch = ..., all on the same level. So
> > > just checking for myself.
>
> The idea here is that we want to clarify if we're dealing with
> an anonymous struct or union.  I wanted to have things work
> like a C-style initializer as closely as possible, but I
> realized it's not legit to initialize multiple values in a
> union, and more importantly when we're trying to visually interpret
> data, we really want to know if an anonymous container of data is
> a structure (where all values represent different elements in the
> structure) or a union (where we're seeing multiple interpretations of
> the same value).

Yeah, fair enough.

>
> Thanks again for the detailed review!

Of course. But it's not clear if you agree with me on everything, so I
still hope to get replies later.

>
> Alan

  reply	other threads:[~2021-01-22 20:20 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-17 22:16 [PATCH v2 bpf-next 0/4] libbpf: BTF dumper support for typed data Alan Maguire
2021-01-17 22:16 ` [PATCH v2 bpf-next 1/4] libbpf: add btf_has_size() and btf_int() inlines Alan Maguire
2021-01-21  4:11   ` Andrii Nakryiko
2021-01-17 22:16 ` [PATCH v2 bpf-next 2/4] libbpf: make skip_mods_and_typedefs available internally in libbpf Alan Maguire
2021-01-21  4:13   ` Andrii Nakryiko
2021-01-17 22:16 ` [PATCH v2 bpf-next 3/4] libbpf: BTF dumper support for typed data Alan Maguire
2021-01-21  6:56   ` Andrii Nakryiko
2021-01-21 19:51     ` Andrii Nakryiko
2021-01-22 16:31       ` Alan Maguire
2021-01-22 20:05         ` Andrii Nakryiko [this message]
2021-01-17 22:16 ` [PATCH v2 bpf-next 4/4] selftests/bpf: add dump type data tests to btf dump tests Alan Maguire
2021-01-21  7:01   ` Andrii Nakryiko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAEf4BzZBVjUQnPxG1hyxkoM5HLWyEm2VJjOg0MoogrBdm6QdEQ@mail.gmail.com \
    --to=andrii.nakryiko@gmail.com \
    --cc=alan.maguire@oracle.com \
    --cc=andrii@kernel.org \
    --cc=ast@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=daniel@iogearbox.net \
    --cc=john.fastabend@gmail.com \
    --cc=kafai@fb.com \
    --cc=kpsingh@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=morbo@google.com \
    --cc=netdev@vger.kernel.org \
    --cc=shuah@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.