linux-perf-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Arnaldo Carvalho de Melo <acme@kernel.org>
To: Ingo Molnar <mingo@kernel.org>
Cc: "Jiri Olsa" <jolsa@kernel.org>,
	"Namhyung Kim" <namhyung@kernel.org>,
	"Clark Williams" <williams@redhat.com>,
	linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org,
	"Arnaldo Carvalho de Melo" <acme@redhat.com>,
	"Adrian Hunter" <adrian.hunter@intel.com>,
	"Alexei Starovoitov" <ast@fb.com>,
	"Andrii Nakryiko" <andrii.nakryiko@gmail.com>,
	"Daniel Borkmann" <daniel@iogearbox.net>,
	"Luis Cláudio Gonçalves" <lclaudio@redhat.com>,
	"Martin KaFai Lau" <kafai@fb.com>,
	"Song Liu" <songliubraving@fb.com>,
	"Wang Nan" <wangnan0@huawei.com>, "Yonghong Song" <yhs@fb.com>
Subject: [PATCH 17/35] perf bpf: Automatically add BTF ELF markers
Date: Thu,  7 Mar 2019 14:44:15 -0300	[thread overview]
Message-ID: <20190307174433.28819-18-acme@kernel.org> (raw)
In-Reply-To: <20190307174433.28819-1-acme@kernel.org>

From: Arnaldo Carvalho de Melo <acme@redhat.com>

The libbpf loader expects that some __btf_map_<MAP_NAME> structs be in
place with the keys and values types of maps so that one can store the
struct definitions and have them sent to the kernel via sys_bpf(fd, cmd
= BTF_LOAD) and then later be retrievable via sys_bpf(fd, cmd =
BPF_OBJ_GET_INFO_BY_FD) for use by tools such as 'bpftool map dump id
MAP_ID'.

Since we already have this for defining maps in 'perf trace' BPF events:

   bpf_map(name, _type, type_key, type_val, _max_entries)

As used in the tools/perf/examples/bpf/augmented_raw_syscalls.c:

 --- 8< ---

struct syscall {
        bool    enabled;
};

bpf_map(syscalls, ARRAY, int, struct syscall, 512);

 --- 8< ---

All we need is to get all that already available info, piggyback on the
'bpf_map' define in tools/perf/include/bpf/bpf.h, that is included by
'perf trace' BPF programs and do that without requiring changes to the
BPF programs already defining maps using 'bpf_map()'.

So this is what we have before this patch:

1) With this in ~/.perfconfig to dump .c events as .o, aka save a copy
   so that we can use the .o later as a pre-compiled BPF bytecode:

  # grep '\[llvm\]' -A2 ~/.perfconfig
  [llvm]
	dump-obj = true
	clang-opt = -g

  #
  # clang --version
  clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 7906282d3afec5dfdc2b27943fd6c0309086c507) (https://git.llvm.org/git/llvm.git/ a1b5de1ff8ae8bc79dc8e86e1f82565229bd0500)
  Target: x86_64-unknown-linux-gnu
  Thread model: posix
  InstalledDir: /opt/llvm/bin

2) Note the -g there so that we get clang to generate debuginfo, and
   since the target is 'bpf' it will generate the BTF info in this
   clang version (9.0).

3) Run a simple 'perf record' specifiying as an event the augmented_raw_syscalls.c
   source code:

  # perf record -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c sleep 1
  LLVM: dumping /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data ]

  # file /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), with debug_info, not stripped

4) Look at the BTF structs encoded in it:

  # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  syscall_enter_args	64	0
  augmented_filename	264	0
  syscall	1	0
  syscall_exit_args	24	0
  bpf_map	28	0
  #
  # pahole -F btf -C syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  # pahole -F btf -C syscall /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  struct syscall {
	  bool                       enabled;              /*     0     1 */

	  /* size: 1, cachelines: 1, members: 1 */
	  /* last cacheline: 1 bytes */
  };
  #

5) Ok, with just this we don't have the markers expected by the libbpf
   loader and when we run with this BPF bytecode, because we have:

  # grep '\[trace\]' -A1 ~/.perfconfig
  [trace]
	add_events = /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  #

6) Lets do a 'perf trace' system wide session using this BPF program:

   # perf trace -e *mmsg,open*
  Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR|O_CREAT|O_TRUNC, S_IRUSR|S_IWUSR) = 106
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  Cache2 I/O/6885 openat(AT_FDCWD, "/proc/self/mountinfo", O_RDONLY) = 121
  DNS Res~ver #3/23340 openat(AT_FDCWD, "/etc/hosts", O_RDONLY|O_CLOEXEC) = 106
  DNS Res~ver #3/23340 sendmmsg(106<socket:[3482690]>, 0x7f252f1fcaf0, 2, MSG_NOSIGNAL) = 2
  Cache2 I/O/6885 openat(AT_FDCWD, "/home/acme/.cache/mozilla/firefox/ina67tev.default/cache2/entries/BA220AB2914006A7AE96D27BE6EA13DD77519FCA", O_RDWR) = 106
  lighttpd/18915 openat(AT_FDCWD, "/proc/loadavg", O_RDONLY) = 12

7) While it runs lets see the maps that 'perf trace' + libbpf's BPF
  loader loaded into the kernel via sys_bpf(fd, BPF_BTF_LOAD, ...):

  # bpftool map list | tail -6
  149: perf_event_array  name __augmented_sys  flags 0x0
	  key 4B  value 4B  max_entries 8  memlock 4096B
  150: array  name syscalls  flags 0x0
	  key 4B  value 1B  max_entries 512  memlock 8192B
  151: hash  name pids_filtered  flags 0x0
	  key 4B  value 1B  max_entries 64  memlock 8192B
  #

8) Dump the "pids_filtered", map, that will have one entry per PID that
   'perf trace' wants filtered, which includes its own, to avoid a
   tracing feedback loop (perf trace shows the syscalls it does which
   generates more syscalls that it has to show that...), it also
   auto-filters the 'gnome-terminal' and 'sshd' parent PIDs, for the
   same reason:

  # bpftool map dump id 151
  key: a5 0c 00 00  value: 01
  key: 14 63 00 00  value: 01
  Found 2 elements
  #

9) Since there is no BTF info available, it does a generic hex dump :-\

10) Now, with this patch applied, we'll do steps 3 to 6 again and look
    with pahole if there are extra structs encoded in BTF:

  # pahole -F btf --sizes /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  syscall_enter_args	64	0
  augmented_filename	264	0
  syscall	1	0
  syscall_exit_args	24	0
  bpf_map	28	0
  ____btf_map___augmented_syscalls__	8	0
  ____btf_map_syscalls	8	0
  ____btf_map_pids_filtered	8	0
  #

11) Yes, those __btf_map_ + the map names, lets see how they look like:

  # pahole -F btf -C ____btf_map_syscalls /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o
  struct ____btf_map_syscalls {
	  int                        key;                  /*     0     4 */
	  struct syscall             value;                /*     4     1 */

	  /* size: 8, cachelines: 1, members: 2 */
	  /* padding: 3 */
	  /* last cacheline: 8 bytes */
  };
  #

12) Lets repeat step 7 to get the new map ids:

  # bpftool map list | tail -6
  155: perf_event_array  name __augmented_sys  flags 0x0
	  key 4B  value 4B  max_entries 8  memlock 4096B
  156: array  name syscalls  flags 0x0
	  key 4B  value 1B  max_entries 512  memlock 8192B
  157: hash  name pids_filtered  flags 0x0
	  key 4B  value 1B  max_entries 64  memlock 8192B
  #

13) And finally lets dump the 'pids_filtered':

  # bpftool map dump id 157
  [{
        "key": 3237,
        "value": true
    },{
        "key": 26435,
        "value": true
    }
  ]
  #

Looks much better! BTF info was used to interpret the key as an integer
and the value as a struct with just one boolean member, so to make it
more compact, show just the 'true' value where we saw '01'.

Now to make 'perf trace --dump-map' to use BTF!

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexei Starovoitov <ast@fb.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Luis Cláudio Gonçalves <lclaudio@redhat.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lkml.kernel.org/n/tip-ybuf9wpkm30xk28iq7jbwb40@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
---
 tools/perf/include/bpf/bpf.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/perf/include/bpf/bpf.h b/tools/perf/include/bpf/bpf.h
index 5df7ed9d9020..2eac6d804b2d 100644
--- a/tools/perf/include/bpf/bpf.h
+++ b/tools/perf/include/bpf/bpf.h
@@ -24,7 +24,13 @@ struct bpf_map SEC("maps") name = {				\
 	.key_size    = sizeof(type_key),			\
 	.value_size  = sizeof(type_val),			\
 	.max_entries = _max_entries,				\
-}
+};								\
+struct ____btf_map_##name {					\
+	type_key key;						\
+	type_val value;                                 	\
+};								\
+struct ____btf_map_##name __attribute__((section(".maps." #name), used)) \
+	____btf_map_##name = { }
 
 /*
  * FIXME: this should receive .max_entries as a parameter, as careful
-- 
2.20.1

  parent reply	other threads:[~2019-03-07 17:44 UTC|newest]

Thread overview: 39+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-03-07 17:43 [GIT PULL 00/35] perf/core improvements and fixes Arnaldo Carvalho de Melo
2019-03-07 17:43 ` [PATCH 01/35] perf, bpf: Consider events with attr.bpf_event as side-band events Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 02/35] perf probe: Clarify error message about not finding kernel modules debuginfo Arnaldo Carvalho de Melo
2019-03-07 23:30   ` Masami Hiramatsu
2019-03-07 23:58     ` Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 03/35] tools lib traceevent: Fix buffer overflow in arg_eval Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 04/35] perf: Mark expected switch fall-through Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 05/35] perf time-utils: Refactor time range parsing code Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 06/35] perf auxtrace: Improve address filter error message when there is no DSO Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 07/35] perf intel-pt: Fix divide by zero when TSC is not available Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 08/35] perf db-export: Add calls parent_id to enable creation of call trees Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 09/35] perf scripts python: export-to-sqlite.py: Export calls parent_id Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 10/35] perf scripts python: export-to-postgresql.py: Fix invalid input syntax for integer error Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 11/35] perf scripts python: export-to-postgresql.py: Export calls parent_id Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 12/35] perf scripts python: exported-sql-viewer.py: Factor out TreeWindowBase Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 13/35] perf scripts python: exported-sql-viewer.py: Improve TreeModel abstraction Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 14/35] perf scripts python: exported-sql-viewer.py: Factor out CallGraphModelBase Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 15/35] perf scripts python: exported-sql-viewer.py: Add call tree Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 16/35] perf beauty msg_flags: Add missing %s lost when adding prefix suppression logic Arnaldo Carvalho de Melo
2019-03-07 17:44 ` Arnaldo Carvalho de Melo [this message]
2019-03-07 17:44 ` [PATCH 18/35] perf clang: Remove needless extra semicolon Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 19/35] perf annotate: Calculate the max instruction name, align column to that Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 20/35] perf thread: Generalize function to copy from thread addr space from intel-bts code Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 21/35] perf diff: Support --time filter option Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 22/35] perf diff: Support --cpu " Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 23/35] perf diff: Support --pid/--tid filter options Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 24/35] perf script python: Remove mixed indentation Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 25/35] perf script python: Add Python3 support to futex-contention.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 26/35] perf script python: add Python3 support to check-perf-trace.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 27/35] perf script python: Add Python3 support to event_analyzing_sample.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 28/35] perf script python: Add Python3 support to intel-pt-events.py Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 29/35] perf c2c: Fix c2c report for empty numa node Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 30/35] perf hist: Add error path into hist_entry__init Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 31/35] perf hist: Fix memory leak of srcline Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 32/35] perf tools: Read and store caps/max_precise in perf_pmu Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 33/35] perf evsel: Probe for precise_ip with simple attr Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 34/35] perf session: Fix double free in perf_data__close Arnaldo Carvalho de Melo
2019-03-07 17:44 ` [PATCH 35/35] perf data: Force perf_data__open|close zero data->file.path Arnaldo Carvalho de Melo
2019-03-09 16:02 ` [GIT PULL 00/35] perf/core improvements and fixes Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190307174433.28819-18-acme@kernel.org \
    --to=acme@kernel.org \
    --cc=acme@redhat.com \
    --cc=adrian.hunter@intel.com \
    --cc=andrii.nakryiko@gmail.com \
    --cc=ast@fb.com \
    --cc=daniel@iogearbox.net \
    --cc=jolsa@kernel.org \
    --cc=kafai@fb.com \
    --cc=lclaudio@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-perf-users@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=namhyung@kernel.org \
    --cc=songliubraving@fb.com \
    --cc=wangnan0@huawei.com \
    --cc=williams@redhat.com \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).