bpf.vger.kernel.org archive mirror
From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Hao Luo <haoluo@google.com>
Cc: Andrii Nakryiko <andriin@fb.com>, bpf <bpf@vger.kernel.org>,
	Networking <netdev@vger.kernel.org>,
	Alexei Starovoitov <ast@fb.com>,
	Daniel Borkmann <daniel@iogearbox.net>,
	Kernel Team <kernel-team@fb.com>,
	Arnaldo Carvalho de Melo <acme@kernel.org>,
	Song Liu <songliubraving@fb.com>,
	Quentin Monnet <quentin@isovalent.com>
Subject: Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
Date: Mon, 15 Jun 2020 12:08:26 -0700
Message-ID: <CAEf4BzYVY-sA_SRqxr-dxrkR5DPW6tv3tnNonK=4WPx6eEiZFQ@mail.gmail.com>
In-Reply-To: <CA+khW7hFZzp_K_xydSFw0O3LYB22_fC=Z4wG7i9Si+phGHn4cQ@mail.gmail.com>

On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> Thanks, Andrii,
> This change looks nice! A couple of comments:
> 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?

That choice is a very deliberate one. `extern const void` is the right
way in the C language to access linker-generated symbols, for instance,
which is quite similar to what the intent is here. Having a void type
makes it very explicit that you don't know/care about the value pointed
to by the extern address; the only operation you can perform is to get
its address.

Once we add kernel variables support, that's when types will start to
be specified and libbpf will do extra checks (type matching) and extra
work (generating ldimm64 with BTF ID, for instance), to allow C code
to access data pointed to by extern address.

Switching the type to u64 would be misleading, because it would allow
C code to implicitly dereference the value of the extern. E.g., there
is a big difference between:

extern u64 bla;

/* de-reference happens here, we get contents of memory pointed to by
 * "bla" symbol */
printf("%lld\n", bla);

/* here we get value of linker symbol/address of extern variable */
printf("%p\n", &bla);

Currently I explicitly support only the latter and want to prevent the
former, until we have kernel variables in BTF. Using `extern void`
makes the compiler enforce that only the &bla form is allowed.
Everything else is a compilation error.
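
As a rough user-space analogy (this is not BPF code), the same idiom
can be tried against linker-provided symbols. The sketch below assumes
Linux/ELF with a linker that provides `etext` and `edata` (see end(3));
the helper names text_end_addr() and data_end_addr() are made up for
illustration only.

```c
#include <stdint.h>

/*
 * Linker-provided symbols (assumption: Linux/ELF, GNU ld or lld).
 * Declared as `const void`, so they have no value and no size:
 * something like `long v = etext;` is rejected by the compiler.
 */
extern const void etext;
extern const void edata;

/* Hypothetical helper: the only thing we can do is take the address. */
uintptr_t text_end_addr(void)
{
	return (uintptr_t)&etext;
}

/* Hypothetical helper: same idea for the end of initialized data. */
uintptr_t data_end_addr(void)
{
	return (uintptr_t)&edata;
}
```

With a typed declaration (e.g. `extern char etext;`) the compiler would
happily let code read the byte at that address, which is exactly what
the void type prevents.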

> 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?

That's a bit of a hack on my part. The variable needs to point to some
type whose size matches the size of the datasec's varinfo entry. This
is checked and enforced by the kernel. I'm looking for a 4-byte int
because it's almost guaranteed to be present in the program's BTF, so
I won't have to add it explicitly (all BPF programs return int, so it
must already be in the program's BTF). An 8-byte long is less likely
to be there.

In the future, if we have a nicer way to extend BTF (and we will
soon), we can do this a bit better, but either way that .ksyms DATASEC
type isn't used for anything (there is no map with that DATASEC as a
value type), so it doesn't matter.

> Hao
> On Fri, Jun 12, 2020 at 3:35 PM Andrii Nakryiko <andriin@fb.com> wrote:
>> Add support for another (in addition to existing Kconfig) special kind of
>> externs in BPF code, kernel symbol externs. Such externs allow BPF code to
>> "know" kernel symbol address and either use it for comparisons with kernel
>> data structures (e.g., struct file's f_op pointer, to distinguish different
>> kinds of file), or, with the help of bpf_probe_read_kernel(), to follow
>> pointers and read data from global variables. Kernel symbol addresses are
>> found through /proc/kallsyms, which should be present in the system.
>> Currently, such kernel symbol variables are typeless: they have to be defined
>> as `extern const void <symbol>` and the only operation you can do (in C code)
>> with them is to take its address. Such extern should reside in a special
>> section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
>> vs weak semantics stays the same as with Kconfig externs. If symbol is not
>> found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
>> but will be defaulted to 0 for weak externs.
>> If the same symbol is defined multiple times in /proc/kallsyms, then it will
>> be error if any of the associated addresses differs. In that case, address is
>> ambiguous, so libbpf falls on the side of caution, rather than confusing user
>> with randomly chosen address.
>> In the future, once kernel is extended with variables BTF information, such
>> ksym externs will be supported in a typed version, which will allow BPF
>> program to read variable's contents directly, similarly to how it's done for
>> fentry/fexit input arguments.
>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>> ---
>>  tools/lib/bpf/bpf_helpers.h |   1 +
>>  tools/lib/bpf/btf.h         |   5 ++
>>  tools/lib/bpf/libbpf.c      | 138 ++++++++++++++++++++++++++++++++++--
>>  3 files changed, 139 insertions(+), 5 deletions(-)
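
To make the resolution rules from the quoted commit message concrete,
here is a minimal sketch of strong vs weak handling and of the
"same name, different address" check. It is not the code from the
patch: struct ksym_ext, both function names, and the specific error
codes (-EINVAL, -ESRCH) are assumptions made for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for an extern descriptor. */
struct ksym_ext {
	const char *name;
	bool is_weak;
	bool is_set;
	unsigned long long addr;
};

/* Feed one (name, addr) pair read from kallsyms into the descriptor. */
int ksym_ext_record(struct ksym_ext *ext, const char *name,
		    unsigned long long addr)
{
	if (strcmp(ext->name, name) != 0)
		return 0; /* some other symbol, ignore */
	if (ext->is_set && ext->addr != addr)
		return -EINVAL; /* ambiguous: same name, different address */
	ext->is_set = true;
	ext->addr = addr;
	return 0;
}

/* Called once after the whole file has been scanned. */
int ksym_ext_finalize(struct ksym_ext *ext)
{
	if (ext->is_set)
		return 0;
	if (!ext->is_weak)
		return -ESRCH; /* strong extern must be resolved */
	ext->addr = 0; /* weak extern defaults to 0 */
	return 0;
}
```

Repeated entries with an identical address are accepted; only a
conflicting address is treated as an error.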


>>  enum extern_type {
>>         EXT_UNKNOWN,
>> +       EXT_KSYM,
>>         EXT_KCFG,
>>  };
> Minor, let EXT_KSYM come after EXT_KCFG.

I wanted ksym externs to go before KCFG ones, but not sure why. I'll
double check, I don't think it should matter.



>> +static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
>> +{
>> +       char sym_type, sym_name[256];
>> +       unsigned long sym_addr;
>> +       struct extern_desc *ext;
>> +       int ret, err = 0;
>> +       FILE *f;
>> +
>> +       f = fopen("/proc/kallsyms", "r");
>> +       if (!f) {
>> +               err = -errno;
>> +               pr_warn("failed to open /proc/kallsyms: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       while (true) {
>> +               ret = fscanf(f, "%lx %c %s%*[^\n]\n",
>> +                            &sym_addr, &sym_type, sym_name);
> Maybe better follow the existing pattern in kernel (scripts/kallsyms.c https://github.com/torvalds/linux/blob/master/scripts/kallsyms.c#L177)

oh, didn't know about this "%499s" trick, will change.
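
For reference, a small sketch of the width-limited conversion, using
sscanf on a single line rather than fscanf on the file. The function
name parse_kallsyms_line() and the SYM_NAME_MAX constant are
hypothetical; the 255 in "%255s" is chosen to match the 256-byte
sym_name buffer from the patch, leaving room for the terminating NUL.

```c
#include <stdio.h>

#define SYM_NAME_MAX 256 /* hypothetical, mirrors sym_name[256] */

/*
 * Parse one kallsyms-style line: "<hex addr> <type> <name> [module]".
 * Returns 0 on success, -1 if the line does not have three fields.
 */
int parse_kallsyms_line(const char *line, unsigned long long *addr,
			char *type, char name[SYM_NAME_MAX])
{
	/* %255s stores at most 255 characters plus the trailing NUL,
	 * so an overlong symbol name cannot overflow the buffer. */
	int n = sscanf(line, "%llx %c %255s", addr, type, name);

	return n == 3 ? 0 : -1;
}
```

A plain "%s" has no such bound and will write past the end of the
buffer if the input token is longer than expected.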

>> +               if (ret == EOF && feof(f))
>> +                       break;
>> +               if (ret != 3) {
>> +                       err = -EINVAL;
>> +                       goto out;
>> +               }
>> +

