BPF Archive on lore.kernel.org
From: Stanislav Fomichev <sdf@fomichev.me>
To: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>,
	Stanislav Fomichev <sdf@google.com>,
	Network Development <netdev@vger.kernel.org>,
	bpf <bpf@vger.kernel.org>,
	"David S. Miller" <davem@davemloft.net>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Yonghong Song <yhs@fb.com>
Subject: Re: debug annotations for bpf progs. Was: [PATCH bpf-next 1/3] bpf: preserve command of the process that loaded the program
Date: Tue, 15 Oct 2019 15:26:12 -0700
Message-ID: <20191015222612.GB1897241@mini-arch>
In-Reply-To: <CAADnVQKuysEvFAX54+f0YPJ1+cgcRJbhrpVE7xmvLqu-ADrk+Q@mail.gmail.com>

On 10/15, Alexei Starovoitov wrote:
> On Fri, Oct 11, 2019 at 5:38 PM Stanislav Fomichev <sdf@fomichev.me> wrote:
> >
> > On 10/11, Alexei Starovoitov wrote:
> > > On Fri, Oct 11, 2019 at 9:21 AM Stanislav Fomichev <sdf@google.com> wrote:
> > > >
> > > > Even though we have the pointer to user_struct and can recover
> > > > uid of the user who has created the program, it usually contains
> > > > 0 (root) which is not very informative. Let's store the comm of the
> > > > calling process and export it via bpf_prog_info. This should help
> > > > answer the question "which process loaded this particular program".
> > > >
> > > > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > > > ---
> > > >  include/linux/bpf.h      | 1 +
> > > >  include/uapi/linux/bpf.h | 2 ++
> > > >  kernel/bpf/syscall.c     | 4 ++++
> > > >  3 files changed, 7 insertions(+)
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index 5b9d22338606..b03ea396afe5 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -421,6 +421,7 @@ struct bpf_prog_aux {
> > > >                 struct work_struct work;
> > > >                 struct rcu_head rcu;
> > > >         };
> > > > +       char created_by_comm[BPF_CREATED_COMM_LEN];
> > > >  };
> > > >
> > > >  struct bpf_array {
> > > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > > index a65c3b0c6935..4e883ecbba1e 100644
> > > > --- a/include/uapi/linux/bpf.h
> > > > +++ b/include/uapi/linux/bpf.h
> > > > @@ -326,6 +326,7 @@ enum bpf_attach_type {
> > > >  #define BPF_F_NUMA_NODE                (1U << 2)
> > > >
> > > >  #define BPF_OBJ_NAME_LEN 16U
> > > > +#define BPF_CREATED_COMM_LEN   16U
> > >
> > > Nack.
> > > 16 bytes is going to be useless.
> > > We found it the hard way with prog_name.
> > > If you want to embed additional debug information
> > > please use BTF for that.
> > BTF was my natural choice initially, but then I saw created_by_uid and
> > thought created_by_comm might have a chance :-)
> >
> > To clarify, by BTF you mean creating some unused global variable
> > and use its name as the debugging info? Or there is some better way?
> 
> I was thinking about adding new section to .btf.ext with this extra data,
> but global variable is a better idea indeed.
> We'd need to standardize such variables names, so that
> bpftool can parse and print it while doing 'bpftool prog show'.
> We see more and more cases where services use more than
> one program in single .c file to accomplish their goals.
> Tying such debug info (like 'created_by_comm') to each program
> individually isn't quite right.
> In that sense global variables are better, since they cover the
> whole .c file.
> Beyond 'created_by_comm' there are other things that people
> will likely want to know.
> Like which version of llvm was used to compile this .o file.
> Which unix user name compiled it.
> The name of service/daemon that will be using this .o
> and so on.
> Maybe some standard prefix to such global variables will do?
> Like "bpftool prog show" can scan global data for
> "__annotate_#name" and print both name and string contents ?
> For folks who regularly ssh into servers to debug bpf progs
> that will help a lot.
> Maybe some annotations llvm can automatically add to .o.
> Thoughts?
We started some proof-of-concept prototyping yesterday; the idea, roughly:
* the build system generates a build_info.h header which contains:
  char __attribute__((section("aux_timestamp"))) *__aux_<build timestamp> = "";
  char __attribute__((section("aux_version"))) *__aux_<version> = "";
  ...
* clang's -include flag pulls in this auto-generated file,
  so we don't rely on users including it
* 'bpftool btf dump id <id> | grep aux_' can be used for low-level debugging
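
To make the header idea concrete, here is a minimal sketch of what a generated
build_info.h could look like. The AUX_ANNOTATE macro and both example values
(an epoch timestamp and v1_2_3) are made up for illustration; only the
aux_<key> section and __aux_<value> variable naming come from the scheme above.

```c
/*
 * build_info.h: sketch of an auto-generated header.
 * Meant to be force-included, e.g.: clang -include build_info.h ...
 *
 * AUX_ANNOTATE is a hypothetical helper. It places an empty string
 * pointer named __aux_<value> into a section named aux_<key>, so the
 * payload lives in the symbol and section names, not in the data.
 */
#define AUX_ANNOTATE(key, value) \
	char __attribute__((section("aux_" #key), used)) *__aux_##value = ""

/* Example values only; a real build system would substitute its own. */
AUX_ANNOTATE(timestamp, 1571178372);
AUX_ANNOTATE(version, v1_2_3);
```

The 'used' attribute is an addition on my side so the compiler keeps the
otherwise unreferenced variables; values have to be valid identifier
suffixes, which is one of the reasons this is not pretty.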

It's not pretty, but it gets the job done. I agree that a convention
would make this more usable. If we can agree on a pre-defined section
prefix (aux_ in my case), bpftool could take the "variable" part from the
aux_<variable> section name, take the "value" part from the __aux_<value>
variable name, and print both.
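
A rough sketch of the name handling such a convention would need on the
tooling side, assuming the aux_ and __aux_ prefixes from above. aux_format()
is a hypothetical helper, not existing bpftool code, and it only deals with
the strings; finding the section and variable names in BTF is left out.

```c
#include <stdio.h>
#include <string.h>

#define AUX_SEC_PREFIX "aux_"   /* assumed section name prefix */
#define AUX_VAR_PREFIX "__aux_" /* assumed variable name prefix */

/*
 * Hypothetical helper: given a section name and a variable name that
 * follow the convention, write "<key>: <value>" into out.
 * Returns 0 on success, -1 if either name lacks its prefix or the
 * result does not fit into len bytes.
 */
static int aux_format(const char *sec, const char *var, char *out, size_t len)
{
	size_t sp = strlen(AUX_SEC_PREFIX);
	size_t vp = strlen(AUX_VAR_PREFIX);
	int n;

	if (strncmp(sec, AUX_SEC_PREFIX, sp) != 0 ||
	    strncmp(var, AUX_VAR_PREFIX, vp) != 0)
		return -1;

	/* Strip both prefixes and join what remains. */
	n = snprintf(out, len, "%s: %s", sec + sp, var + vp);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

For example, the pair ("aux_version", "__aux_v1_2_3") would come out as
"version: v1_2_3", and anything outside the aux_ sections is rejected.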

One thing I still have no idea how to implement with this scheme
is an alternative to created_by_comm. There is no easy way to
add BTF at load time. Ideas?

> "__annotate_#name" and print both name and string contents ?
As Andrii just pointed out, this requires knowing where to look
for the obj files to print the contents of the vars :-(

Thread overview: 18+ messages
2019-10-11 16:21 Stanislav Fomichev
2019-10-11 16:21 ` [PATCH bpf-next 2/3] tools/bpf: sync bpf.h Stanislav Fomichev
2019-10-11 16:21 ` [PATCH bpf-next 3/3] bpftool: print the comm of the process that loaded the program Stanislav Fomichev
2019-10-11 20:19   ` Martin Lau
2019-10-11 20:37     ` Stanislav Fomichev
2019-10-11 21:11       ` Arnaldo Carvalho de Melo
2019-10-11 21:30         ` Stanislav Fomichev
2019-10-12  0:10 ` [PATCH bpf-next 1/3] bpf: preserve command " Alexei Starovoitov
2019-10-12  0:38   ` Stanislav Fomichev
2019-10-15 21:21     ` debug annotations for bpf progs. Was: " Alexei Starovoitov
2019-10-15 22:14       ` Andrii Nakryiko
2019-10-15 22:24         ` Alexei Starovoitov
2019-10-15 22:33           ` Andrii Nakryiko
2019-10-15 22:48             ` Alexei Starovoitov
2019-10-15 22:26       ` Stanislav Fomichev [this message]
2019-10-16 14:01       ` Daniel Borkmann
2019-10-17 16:28         ` Stanislav Fomichev
2019-10-18  6:19           ` Alexei Starovoitov
