* [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
@ 2022-04-07 12:52 Jiri Olsa
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
                   ` (4 more replies)
  0 siblings, 5 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-07 12:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

hi,
sending additional fix for symbol resolving in kprobe multi link
requested by Alexei and Andrii [1].

This speeds up bpftrace kprobe attachment, when using pure symbols
(3344 symbols) to attach:

Before:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )

After:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
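
  (i.e. roughly an 11-12x speedup: 6.5681 / 0.5661 is ~11.6)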


There are 2 reasons I'm sending this as RFC though..

  - I added a test that measures attachment speed on all possible functions
    from available_filter_functions, which is 48712 functions on my setup.
    The attach/detach time for that is under 2 seconds and the test will
    fail if it takes longer than that.. which might happen on different setups
    or on a loaded machine.. I'm not sure what's the best solution yet, separate
    bench application perhaps?

  - copy_user_syms function potentially allocates a lot of memory (~6MB in my
    tests when attaching ~48k functions). I haven't seen it fail yet,
    but it might need to be changed to allocate memory gradually if needed,
    do we care? ;-)
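
  (Rough math for the numbers above, assuming KSYM_NAME_LEN is 128 bytes:
   ~48k symbols * 128 bytes is ~6MB for the string buffer, plus
   ~48k * 8 bytes (~380kB) for the pointer array.)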

thanks,
jirka


[1] https://lore.kernel.org/bpf/CAEf4BzZtQaiUxQ-sm_hH2qKPRaqGHyOfEsW96DxtBHRaKLoL3Q@mail.gmail.com/
---
Jiri Olsa (4):
      kallsyms: Add kallsyms_lookup_names function
      fprobe: Resolve symbols with kallsyms_lookup_names
      bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
      selftests/bpf: Add attach bench test

 include/linux/kallsyms.h                                   |   6 ++++
 kernel/kallsyms.c                                          |  50 ++++++++++++++++++++++++++++++++-
 kernel/trace/bpf_trace.c                                   | 123 +++++++++++++++++++++++++++++++++++++++++++++++---------------------------------
 kernel/trace/fprobe.c                                      |  23 ++-------------
 tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c | 141 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/kprobe_multi_empty.c     |  12 ++++++++
 6 files changed, 283 insertions(+), 72 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/kprobe_multi_empty.c

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
@ 2022-04-07 12:52 ` Jiri Olsa
  2022-04-08  0:57   ` Masami Hiramatsu
                     ` (2 more replies)
  2022-04-07 12:52 ` [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names Jiri Olsa
                   ` (3 subsequent siblings)
  4 siblings, 3 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-07 12:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

Adding kallsyms_lookup_names function that resolves array of symbols
with single pass over kallsyms.

The user provides array of string pointers with count and pointer to
allocated array for resolved values.

  int kallsyms_lookup_names(const char **syms, u32 cnt,
                            unsigned long *addrs)

Before we iterate kallsyms we sort user provided symbols by name and
then use that in kallsyms iteration to find each kallsyms symbol in
user provided symbols.

We also check each symbol to pass ftrace_location, because this API
will be used for fprobe symbols resolving. This can be optional in
future if there's a need.
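
For illustration only (not part of this patch), a minimal caller sketch,
assuming the listed symbols exist in kallsyms and have ftrace locations:

  const char *syms[] = { "ksys_read", "ksys_write" };
  unsigned long addrs[2];

  /* single pass over kallsyms resolves both symbols, or the call fails */
  if (kallsyms_lookup_names(syms, 2, addrs))
          return -EINVAL;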

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/kallsyms.h |  6 +++++
 kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 54 insertions(+)

diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
index ce1bd2fbf23e..5320a5e77f61 100644
--- a/include/linux/kallsyms.h
+++ b/include/linux/kallsyms.h
@@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
 #ifdef CONFIG_KALLSYMS
 /* Lookup the address for a symbol. Returns 0 if not found. */
 unsigned long kallsyms_lookup_name(const char *name);
+int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
 
 extern int kallsyms_lookup_size_offset(unsigned long addr,
 				  unsigned long *symbolsize,
@@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
 	return 0;
 }
 
+int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
+{
+	return -ERANGE;
+}
+
 static inline int kallsyms_lookup_size_offset(unsigned long addr,
 					      unsigned long *symbolsize,
 					      unsigned long *offset)
diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index 79f2eb617a62..a3738ddf9e87 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -29,6 +29,8 @@
 #include <linux/compiler.h>
 #include <linux/module.h>
 #include <linux/kernel.h>
+#include <linux/bsearch.h>
+#include <linux/sort.h>
 
 /*
  * These will be re-linked against their real values
@@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
 	return __sprint_symbol(buffer, address, -1, 1, 1);
 }
 
+static int symbols_cmp(const void *a, const void *b)
+{
+	const char **str_a = (const char **) a;
+	const char **str_b = (const char **) b;
+
+	return strcmp(*str_a, *str_b);
+}
+
+struct kallsyms_data {
+	unsigned long *addrs;
+	const char **syms;
+	u32 cnt;
+	u32 found;
+};
+
+static int kallsyms_callback(void *data, const char *name,
+			     struct module *mod, unsigned long addr)
+{
+	struct kallsyms_data *args = data;
+
+	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
+		return 0;
+
+	addr = ftrace_location(addr);
+	if (!addr)
+		return 0;
+
+	args->addrs[args->found++] = addr;
+	return args->found == args->cnt ? 1 : 0;
+}
+
+int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
+{
+	struct kallsyms_data args;
+
+	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
+
+	args.addrs = addrs;
+	args.syms = syms;
+	args.cnt = cnt;
+	args.found = 0;
+	kallsyms_on_each_symbol(kallsyms_callback, &args);
+
+	return args.found == args.cnt ? 0 : -EINVAL;
+}
+
 /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
 struct kallsym_iter {
 	loff_t pos;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names
  2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
@ 2022-04-07 12:52 ` Jiri Olsa
  2022-04-08  0:58   ` Masami Hiramatsu
  2022-04-07 12:52 ` [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link Jiri Olsa
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 45+ messages in thread
From: Jiri Olsa @ 2022-04-07 12:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

Using kallsyms_lookup_names to speed up symbols lookup
in register_fprobe_syms API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/kallsyms.c     |  2 +-
 kernel/trace/fprobe.c | 23 ++---------------------
 2 files changed, 3 insertions(+), 22 deletions(-)

diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
index a3738ddf9e87..7d89da375c23 100644
--- a/kernel/kallsyms.c
+++ b/kernel/kallsyms.c
@@ -230,7 +230,7 @@ unsigned long kallsyms_lookup_name(const char *name)
 	return module_kallsyms_lookup_name(name);
 }
 
-#ifdef CONFIG_LIVEPATCH
+#if defined(CONFIG_LIVEPATCH) || defined(CONFIG_FPROBE)
 /*
  * Iterate over all symbols in vmlinux.  For symbols from modules use
  * module_kallsyms_on_each_symbol instead.
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 89d9f994ebb0..d466803dc2b2 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -88,36 +88,17 @@ NOKPROBE_SYMBOL(fprobe_exit_handler);
 /* Convert ftrace location address from symbols */
 static unsigned long *get_ftrace_locations(const char **syms, int num)
 {
-	unsigned long addr, size;
 	unsigned long *addrs;
-	int i;
 
 	/* Convert symbols to symbol address */
 	addrs = kcalloc(num, sizeof(*addrs), GFP_KERNEL);
 	if (!addrs)
 		return ERR_PTR(-ENOMEM);
 
-	for (i = 0; i < num; i++) {
-		addr = kallsyms_lookup_name(syms[i]);
-		if (!addr)	/* Maybe wrong symbol */
-			goto error;
+	if (!kallsyms_lookup_names(syms, num, addrs))
+		return addrs;
 
-		/* Convert symbol address to ftrace location. */
-		if (!kallsyms_lookup_size_offset(addr, &size, NULL) || !size)
-			goto error;
-
-		addr = ftrace_location_range(addr, addr + size - 1);
-		if (!addr) /* No dynamic ftrace there. */
-			goto error;
-
-		addrs[i] = addr;
-	}
-
-	return addrs;
-
-error:
 	kfree(addrs);
-
 	return ERR_PTR(-ENOENT);
 }
 
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
  2022-04-07 12:52 ` [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names Jiri Olsa
@ 2022-04-07 12:52 ` Jiri Olsa
  2022-04-08 23:26   ` Alexei Starovoitov
  2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-07 12:52 ` [RFC bpf-next 4/4] selftests/bpf: Add attach bench test Jiri Olsa
  2022-04-08 23:29 ` [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Alexei Starovoitov
  4 siblings, 2 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-07 12:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

Using the kallsyms_lookup_names function to speed up symbol lookup in
kprobe multi link attachment, replacing the current
kprobe_multi_resolve_syms function with it.

This speeds up bpftrace kprobe attachment:

Before:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )

After:

  # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
  ...
  0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
 1 file changed, 73 insertions(+), 50 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index b26f3da943de..2602957225ba 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
 	unsigned long entry_ip;
 };
 
+struct user_syms {
+	const char **syms;
+	char *buf;
+};
+
+static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
+{
+	const char __user **usyms_copy = NULL;
+	const char **syms = NULL;
+	char *buf = NULL, *p;
+	int err = -EFAULT;
+	unsigned int i;
+	size_t size;
+
+	size = cnt * sizeof(*usyms_copy);
+
+	usyms_copy = kvmalloc(size, GFP_KERNEL);
+	if (!usyms_copy)
+		return -ENOMEM;
+
+	if (copy_from_user(usyms_copy, usyms, size))
+		goto error;
+
+	err = -ENOMEM;
+	syms = kvmalloc(size, GFP_KERNEL);
+	if (!syms)
+		goto error;
+
+	/* TODO this potentially allocates a lot of memory (~6MB in my tests
+	 * when attaching ~40k functions). I haven't seen it fail yet,
+	 * but it could be changed to allocate memory gradually if needed.
+	 */
+	size = cnt * KSYM_NAME_LEN;
+	buf = kvmalloc(size, GFP_KERNEL);
+	if (!buf)
+		goto error;
+
+	for (p = buf, i = 0; i < cnt; i++) {
+		err = strncpy_from_user(p, usyms_copy[i], KSYM_NAME_LEN);
+		if (err == KSYM_NAME_LEN)
+			err = -E2BIG;
+		if (err < 0)
+			goto error;
+		syms[i] = p;
+		p += err + 1;
+	}
+
+	err = 0;
+	us->syms = syms;
+	us->buf = buf;
+
+error:
+	kvfree(usyms_copy);
+	if (err) {
+		kvfree(syms);
+		kvfree(buf);
+	}
+	return err;
+}
+
+static void free_user_syms(struct user_syms *us)
+{
+	kvfree(us->syms);
+	kvfree(us->buf);
+}
+
 static void bpf_kprobe_multi_link_release(struct bpf_link *link)
 {
 	struct bpf_kprobe_multi_link *kmulti_link;
@@ -2346,55 +2412,6 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long entry_ip,
 	kprobe_multi_link_prog_run(link, entry_ip, regs);
 }
 
-static int
-kprobe_multi_resolve_syms(const void __user *usyms, u32 cnt,
-			  unsigned long *addrs)
-{
-	unsigned long addr, size;
-	const char __user **syms;
-	int err = -ENOMEM;
-	unsigned int i;
-	char *func;
-
-	size = cnt * sizeof(*syms);
-	syms = kvzalloc(size, GFP_KERNEL);
-	if (!syms)
-		return -ENOMEM;
-
-	func = kmalloc(KSYM_NAME_LEN, GFP_KERNEL);
-	if (!func)
-		goto error;
-
-	if (copy_from_user(syms, usyms, size)) {
-		err = -EFAULT;
-		goto error;
-	}
-
-	for (i = 0; i < cnt; i++) {
-		err = strncpy_from_user(func, syms[i], KSYM_NAME_LEN);
-		if (err == KSYM_NAME_LEN)
-			err = -E2BIG;
-		if (err < 0)
-			goto error;
-		err = -EINVAL;
-		addr = kallsyms_lookup_name(func);
-		if (!addr)
-			goto error;
-		if (!kallsyms_lookup_size_offset(addr, &size, NULL))
-			goto error;
-		addr = ftrace_location_range(addr, addr + size - 1);
-		if (!addr)
-			goto error;
-		addrs[i] = addr;
-	}
-
-	err = 0;
-error:
-	kvfree(syms);
-	kfree(func);
-	return err;
-}
-
 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 {
 	struct bpf_kprobe_multi_link *link = NULL;
@@ -2438,7 +2455,13 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
 			goto error;
 		}
 	} else {
-		err = kprobe_multi_resolve_syms(usyms, cnt, addrs);
+		struct user_syms us;
+
+		err = copy_user_syms(&us, usyms, cnt);
+		if (err)
+			goto error;
+		err = kallsyms_lookup_names(us.syms, cnt, addrs);
+		free_user_syms(&us);
 		if (err)
 			goto error;
 	}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
                   ` (2 preceding siblings ...)
  2022-04-07 12:52 ` [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link Jiri Olsa
@ 2022-04-07 12:52 ` Jiri Olsa
  2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-08 23:29 ` [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Alexei Starovoitov
  4 siblings, 1 reply; 45+ messages in thread
From: Jiri Olsa @ 2022-04-07 12:52 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu
  Cc: netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

Adding a test that reads all functions from the ftrace available_filter_functions
file and attaches them all through the kprobe_multi API.

It checks that the attach and detach times are under 2 seconds
and prints stats info with the -v option, like on my setup:

  test_bench_attach: found 48712 functions
  test_bench_attach: attached in   1.069s
  test_bench_attach: detached in   0.373s
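
The function count can be cross checked with something like the following
(the exact number depends on the kernel config; module symbols, which
carry a '[module]' suffix, are skipped by the test):

  # grep -vc '\[' /sys/kernel/debug/tracing/available_filter_functions
  48712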

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../bpf/prog_tests/kprobe_multi_test.c        | 141 ++++++++++++++++++
 .../selftests/bpf/progs/kprobe_multi_empty.c  |  12 ++
 2 files changed, 153 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/progs/kprobe_multi_empty.c

diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index b9876b55fc0c..6798b54416de 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -2,6 +2,9 @@
 #include <test_progs.h>
 #include "kprobe_multi.skel.h"
 #include "trace_helpers.h"
+#include "kprobe_multi_empty.skel.h"
+#include "bpf/libbpf_internal.h"
+#include "bpf/hashmap.h"
 
 static void kprobe_multi_test_run(struct kprobe_multi *skel, bool test_return)
 {
@@ -301,6 +304,142 @@ static void test_attach_api_fails(void)
 	kprobe_multi__destroy(skel);
 }
 
+static inline __u64 get_time_ns(void)
+{
+	struct timespec t;
+
+	clock_gettime(CLOCK_MONOTONIC, &t);
+	return (__u64) t.tv_sec * 1000000000 + t.tv_nsec;
+}
+
+static size_t symbol_hash(const void *key, void *ctx __maybe_unused)
+{
+	return str_hash((const char *) key);
+}
+
+static bool symbol_equal(const void *key1, const void *key2, void *ctx __maybe_unused)
+{
+	return strcmp((const char *) key1, (const char *) key2) == 0;
+}
+
+#define DEBUGFS "/sys/kernel/debug/tracing/"
+
+static int get_syms(char ***symsp, size_t *cntp)
+{
+	size_t cap = 0, cnt = 0, i;
+	char *name, **syms = NULL;
+	struct hashmap *map;
+	char buf[256];
+	FILE *f;
+	int err;
+
+	/*
+	 * The available_filter_functions contains many duplicates,
+	 * but other than that all symbols are usable in kprobe multi
+	 * interface.
+	 * Filtering out duplicates by using hashmap__add, which won't
+	 * add existing entry.
+	 */
+	f = fopen(DEBUGFS "available_filter_functions", "r");
+	if (!f)
+		return -EINVAL;
+
+	map = hashmap__new(symbol_hash, symbol_equal, NULL);
+	err = libbpf_get_error(map);
+	if (err)
+		goto error;
+
+	while (fgets(buf, sizeof(buf), f)) {
+		/* skip modules */
+		if (strchr(buf, '['))
+			continue;
+		if (sscanf(buf, "%ms$*[^\n]\n", &name) != 1)
+			continue;
+		err = hashmap__add(map, name, NULL);
+		if (err) {
+			free(name);
+			if (err == -EEXIST)
+				continue;
+			goto error;
+		}
+		err = libbpf_ensure_mem((void **) &syms, &cap,
+					sizeof(*syms), cnt + 1);
+		if (err) {
+			free(name);
+			goto error;
+		}
+		syms[cnt] = name;
+		cnt++;
+	}
+
+	*symsp = syms;
+	*cntp = cnt;
+
+error:
+	fclose(f);
+	hashmap__free(map);
+	if (err) {
+		for (i = 0; i < cnt; i++)
+			free(syms[i]);
+		free(syms);
+	}
+	return err;
+}
+
+static void test_bench_attach(void)
+{
+	double attach_delta_ns, detach_delta_ns;
+	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);
+	struct kprobe_multi_empty *skel = NULL;
+	long attach_start_ns, attach_end_ns;
+	long detach_start_ns, detach_end_ns;
+	struct bpf_link *link = NULL;
+	char **syms = NULL;
+	size_t cnt, i;
+
+	if (!ASSERT_OK(get_syms(&syms, &cnt), "get_syms"))
+		return;
+
+	skel = kprobe_multi_empty__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "kprobe_multi_empty__open_and_load"))
+		goto cleanup;
+
+	opts.syms = (const char **) syms;
+	opts.cnt = cnt;
+
+	attach_start_ns = get_time_ns();
+	link = bpf_program__attach_kprobe_multi_opts(skel->progs.test_kprobe_empty,
+						     NULL, &opts);
+	attach_end_ns = get_time_ns();
+
+	if (!ASSERT_OK_PTR(link, "bpf_program__attach_kprobe_multi_opts"))
+		goto cleanup;
+
+	detach_start_ns = get_time_ns();
+	bpf_link__destroy(link);
+	detach_end_ns = get_time_ns();
+
+	attach_delta_ns = (attach_end_ns - attach_start_ns) / 1000000000.0;
+	detach_delta_ns = (detach_end_ns - detach_start_ns) / 1000000000.0;
+
+	fprintf(stderr, "%s: found %lu functions\n", __func__, cnt);
+	fprintf(stderr, "%s: attached in %7.3lfs\n", __func__, attach_delta_ns);
+	fprintf(stderr, "%s: detached in %7.3lfs\n", __func__, detach_delta_ns);
+
+	if (attach_delta_ns > 2.0)
+		PRINT_FAIL("attach time above 2 seconds\n");
+	if (detach_delta_ns > 2.0)
+		PRINT_FAIL("detach time above 2 seconds\n");
+
+cleanup:
+	kprobe_multi_empty__destroy(skel);
+	if (syms) {
+		for (i = 0; i < cnt; i++)
+			free(syms[i]);
+		free(syms);
+	}
+}
+
 void test_kprobe_multi_test(void)
 {
 	if (!ASSERT_OK(load_kallsyms(), "load_kallsyms"))
@@ -320,4 +459,6 @@ void test_kprobe_multi_test(void)
 		test_attach_api_syms();
 	if (test__start_subtest("attach_api_fails"))
 		test_attach_api_fails();
+	if (test__start_subtest("bench_attach"))
+		test_bench_attach();
 }
diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c b/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c
new file mode 100644
index 000000000000..be9e3d891d46
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char _license[] SEC("license") = "GPL";
+
+SEC("kprobe.multi/*")
+int test_kprobe_empty(struct pt_regs *ctx)
+{
+	return 0;
+}
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
@ 2022-04-08  0:57   ` Masami Hiramatsu
  2022-04-13  7:27     ` Jiri Olsa
  2022-04-08 23:19   ` Alexei Starovoitov
  2022-04-11 22:15   ` Andrii Nakryiko
  2 siblings, 1 reply; 45+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:57 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu,  7 Apr 2022 14:52:21 +0200
Jiri Olsa <jolsa@kernel.org> wrote:

> Adding kallsyms_lookup_names function that resolves array of symbols
> with single pass over kallsyms.
> 
> The user provides array of string pointers with count and pointer to
> allocated array for resolved values.
> 
>   int kallsyms_lookup_names(const char **syms, u32 cnt,
>                             unsigned long *addrs)
> 
> Before we iterate kallsyms we sort user provided symbols by name and
> then use that in kallsyms iteration to find each kallsyms symbol in
> user provided symbols.
> 
> We also check each symbol to pass ftrace_location, because this API
> will be used for fprobe symbols resolving. This can be optional in
> future if there's a need.

I like this idea very much :-)

> 
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/kallsyms.h |  6 +++++
>  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
> 
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index ce1bd2fbf23e..5320a5e77f61 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
>  #ifdef CONFIG_KALLSYMS
>  /* Lookup the address for a symbol. Returns 0 if not found. */
>  unsigned long kallsyms_lookup_name(const char *name);
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
>  
>  extern int kallsyms_lookup_size_offset(unsigned long addr,
>  				  unsigned long *symbolsize,
> @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
>  	return 0;
>  }
>  
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> +{
> +	return -ERANGE;
> +}
> +
>  static inline int kallsyms_lookup_size_offset(unsigned long addr,
>  					      unsigned long *symbolsize,
>  					      unsigned long *offset)
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 79f2eb617a62..a3738ddf9e87 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -29,6 +29,8 @@
>  #include <linux/compiler.h>
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> +#include <linux/bsearch.h>
> +#include <linux/sort.h>
>  
>  /*
>   * These will be re-linked against their real values
> @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
>  	return __sprint_symbol(buffer, address, -1, 1, 1);
>  }
>  
> +static int symbols_cmp(const void *a, const void *b)
> +{
> +	const char **str_a = (const char **) a;
> +	const char **str_b = (const char **) b;
> +
> +	return strcmp(*str_a, *str_b);
> +}
> +
> +struct kallsyms_data {
> +	unsigned long *addrs;
> +	const char **syms;
> +	u32 cnt;
> +	u32 found;

BTW, why do you use 'u32' for this arch independent code?
I think 'size_t' will make its role clearer.

> +};
> +
> +static int kallsyms_callback(void *data, const char *name,
> +			     struct module *mod, unsigned long addr)
> +{
> +	struct kallsyms_data *args = data;
> +
> +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> +		return 0;
> +
> +	addr = ftrace_location(addr);
> +	if (!addr)
> +		return 0;
> +
> +	args->addrs[args->found++] = addr;
> +	return args->found == args->cnt ? 1 : 0;
> +}
> +
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)

Ditto. I think 'size_t cnt' is better. 

Thank you,

> +{
> +	struct kallsyms_data args;
> +
> +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> +
> +	args.addrs = addrs;
> +	args.syms = syms;
> +	args.cnt = cnt;
> +	args.found = 0;
> +	kallsyms_on_each_symbol(kallsyms_callback, &args);
> +
> +	return args.found == args.cnt ? 0 : -EINVAL;
> +}
> +
>  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
>  struct kallsym_iter {
>  	loff_t pos;
> -- 
> 2.35.1
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names
  2022-04-07 12:52 ` [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names Jiri Olsa
@ 2022-04-08  0:58   ` Masami Hiramatsu
  0 siblings, 0 replies; 45+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:58 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, netdev,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu,  7 Apr 2022 14:52:22 +0200
Jiri Olsa <jolsa@kernel.org> wrote:

> Using kallsyms_lookup_names to speed up symbols lookup
> in register_fprobe_syms API.

OK, this looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/kallsyms.c     |  2 +-
>  kernel/trace/fprobe.c | 23 ++---------------------
>  2 files changed, 3 insertions(+), 22 deletions(-)
> 
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index a3738ddf9e87..7d89da375c23 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -230,7 +230,7 @@ unsigned long kallsyms_lookup_name(const char *name)
>  	return module_kallsyms_lookup_name(name);
>  }
>  
> -#ifdef CONFIG_LIVEPATCH
> +#if defined(CONFIG_LIVEPATCH) || defined(CONFIG_FPROBE)
>  /*
>   * Iterate over all symbols in vmlinux.  For symbols from modules use
>   * module_kallsyms_on_each_symbol instead.
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 89d9f994ebb0..d466803dc2b2 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -88,36 +88,17 @@ NOKPROBE_SYMBOL(fprobe_exit_handler);
>  /* Convert ftrace location address from symbols */
>  static unsigned long *get_ftrace_locations(const char **syms, int num)
>  {
> -	unsigned long addr, size;
>  	unsigned long *addrs;
> -	int i;
>  
>  	/* Convert symbols to symbol address */
>  	addrs = kcalloc(num, sizeof(*addrs), GFP_KERNEL);
>  	if (!addrs)
>  		return ERR_PTR(-ENOMEM);
>  
> -	for (i = 0; i < num; i++) {
> -		addr = kallsyms_lookup_name(syms[i]);
> -		if (!addr)	/* Maybe wrong symbol */
> -			goto error;
> +	if (!kallsyms_lookup_names(syms, num, addrs))
> +		return addrs;
>  
> -		/* Convert symbol address to ftrace location. */
> -		if (!kallsyms_lookup_size_offset(addr, &size, NULL) || !size)
> -			goto error;
> -
> -		addr = ftrace_location_range(addr, addr + size - 1);
> -		if (!addr) /* No dynamic ftrace there. */
> -			goto error;
> -
> -		addrs[i] = addr;
> -	}
> -
> -	return addrs;
> -
> -error:
>  	kfree(addrs);
> -
>  	return ERR_PTR(-ENOENT);
>  }
>  
> -- 
> 2.35.1
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
  2022-04-08  0:57   ` Masami Hiramatsu
@ 2022-04-08 23:19   ` Alexei Starovoitov
  2022-04-09 20:24     ` Jiri Olsa
  2022-04-12 20:46     ` Jiri Olsa
  2022-04-11 22:15   ` Andrii Nakryiko
  2 siblings, 2 replies; 45+ messages in thread
From: Alexei Starovoitov @ 2022-04-08 23:19 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 07, 2022 at 02:52:21PM +0200, Jiri Olsa wrote:
> Adding kallsyms_lookup_names function that resolves array of symbols
> with single pass over kallsyms.
> 
> The user provides array of string pointers with count and pointer to
> allocated array for resolved values.
> 
>   int kallsyms_lookup_names(const char **syms, u32 cnt,
>                             unsigned long *addrs)
> 
> Before we iterate kallsyms we sort user provided symbols by name and
> then use that in kallsyms iteration to find each kallsyms symbol in
> user provided symbols.
> 
> We also check each symbol to pass ftrace_location, because this API
> will be used for fprobe symbols resolving. This can be optional in
> future if there's a need.
> 
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/kallsyms.h |  6 +++++
>  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
> 
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index ce1bd2fbf23e..5320a5e77f61 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
>  #ifdef CONFIG_KALLSYMS
>  /* Lookup the address for a symbol. Returns 0 if not found. */
>  unsigned long kallsyms_lookup_name(const char *name);
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
>  
>  extern int kallsyms_lookup_size_offset(unsigned long addr,
>  				  unsigned long *symbolsize,
> @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
>  	return 0;
>  }
>  
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> +{
> +	return -ERANGE;
> +}
> +
>  static inline int kallsyms_lookup_size_offset(unsigned long addr,
>  					      unsigned long *symbolsize,
>  					      unsigned long *offset)
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 79f2eb617a62..a3738ddf9e87 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -29,6 +29,8 @@
>  #include <linux/compiler.h>
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> +#include <linux/bsearch.h>
> +#include <linux/sort.h>
>  
>  /*
>   * These will be re-linked against their real values
> @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
>  	return __sprint_symbol(buffer, address, -1, 1, 1);
>  }
>  
> +static int symbols_cmp(const void *a, const void *b)
> +{
> +	const char **str_a = (const char **) a;
> +	const char **str_b = (const char **) b;
> +
> +	return strcmp(*str_a, *str_b);
> +}
> +
> +struct kallsyms_data {
> +	unsigned long *addrs;
> +	const char **syms;
> +	u32 cnt;
> +	u32 found;
> +};
> +
> +static int kallsyms_callback(void *data, const char *name,
> +			     struct module *mod, unsigned long addr)
> +{
> +	struct kallsyms_data *args = data;
> +
> +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> +		return 0;
> +
> +	addr = ftrace_location(addr);
> +	if (!addr)
> +		return 0;
> +
> +	args->addrs[args->found++] = addr;
> +	return args->found == args->cnt ? 1 : 0;
> +}
> +
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> +{
> +	struct kallsyms_data args;
> +
> +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);

It's nice to share symbols_cmp for sort and bsearch,
but technically messing with the input argument 'syms' like this will cause
issues sooner or later.
Let's make the caller do the sort.
Unordered input will cause issue with bsearch, of course,
but it's a lesser evil. imo.
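
e.g. roughly something like this at the kprobe multi call site, with
symbols_cmp made available to callers in some form:

	sort(us.syms, cnt, sizeof(*us.syms), symbols_cmp, NULL);
	err = kallsyms_lookup_names(us.syms, cnt, addrs);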

> +
> +	args.addrs = addrs;
> +	args.syms = syms;
> +	args.cnt = cnt;
> +	args.found = 0;
> +	kallsyms_on_each_symbol(kallsyms_callback, &args);
> +
> +	return args.found == args.cnt ? 0 : -EINVAL;
> +}
> +
>  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
>  struct kallsym_iter {
>  	loff_t pos;
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-07 12:52 ` [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link Jiri Olsa
@ 2022-04-08 23:26   ` Alexei Starovoitov
  2022-04-09 20:24     ` Jiri Olsa
  2022-04-11 22:15     ` Andrii Nakryiko
  2022-04-11 22:15   ` Andrii Nakryiko
  1 sibling, 2 replies; 45+ messages in thread
From: Alexei Starovoitov @ 2022-04-08 23:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 07, 2022 at 02:52:23PM +0200, Jiri Olsa wrote:
> Using the kallsyms_lookup_names function to speed up symbol lookup in
> kprobe multi link attachment, replacing the current
> kprobe_multi_resolve_syms function with it.
> 
> This speeds up bpftrace kprobe attachment:
> 
> Before:
> 
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> 
> After:
> 
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
>  1 file changed, 73 insertions(+), 50 deletions(-)
> 
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index b26f3da943de..2602957225ba 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
>  	unsigned long entry_ip;
>  };
>  
> +struct user_syms {
> +	const char **syms;
> +	char *buf;
> +};
> +
> +static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
> +{
> +	const char __user **usyms_copy = NULL;
> +	const char **syms = NULL;
> +	char *buf = NULL, *p;
> +	int err = -EFAULT;
> +	unsigned int i;
> +	size_t size;
> +
> +	size = cnt * sizeof(*usyms_copy);
> +
> +	usyms_copy = kvmalloc(size, GFP_KERNEL);
> +	if (!usyms_copy)
> +		return -ENOMEM;
> +
> +	if (copy_from_user(usyms_copy, usyms, size))
> +		goto error;
> +
> +	err = -ENOMEM;
> +	syms = kvmalloc(size, GFP_KERNEL);
> +	if (!syms)
> +		goto error;
> +
> +	/* TODO this potentially allocates a lot of memory (~6MB in my tests
> +	 * when attaching ~40k functions). I haven't seen it fail yet,
> +	 * but it could be changed to allocate memory gradually if needed.
> +	 */

Why would 6MB kvmalloc fail?
If we don't have such memory the kernel will be ooming soon anyway.
I don't think we'll see this kvmalloc triggering oom in practice.
The verifier allocates a lot more memory to check large programs.

imo this approach is fine. It's simple.
Trying to do gradual alloc with realloc would be just guessing.

Another option would be to ask user space (libbpf) to do the sort.
There are pros and cons.
This vmalloc+sort is slightly better imo.

> +	size = cnt * KSYM_NAME_LEN;
> +	buf = kvmalloc(size, GFP_KERNEL);
> +	if (!buf)
> +		goto error;
> +
> +	for (p = buf, i = 0; i < cnt; i++) {
> +		err = strncpy_from_user(p, usyms_copy[i], KSYM_NAME_LEN);
> +		if (err == KSYM_NAME_LEN)
> +			err = -E2BIG;
> +		if (err < 0)
> +			goto error;
> +		syms[i] = p;
> +		p += err + 1;
> +	}
> +
> +	err = 0;
> +	us->syms = syms;
> +	us->buf = buf;
> +
> +error:
> +	kvfree(usyms_copy);
> +	if (err) {
> +		kvfree(syms);
> +		kvfree(buf);
> +	}
> +	return err;
> +}
> +
> +static void free_user_syms(struct user_syms *us)
> +{
> +	kvfree(us->syms);
> +	kvfree(us->buf);
> +}
> +
>  static void bpf_kprobe_multi_link_release(struct bpf_link *link)
>  {
>  	struct bpf_kprobe_multi_link *kmulti_link;
> @@ -2346,55 +2412,6 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long entry_ip,
>  	kprobe_multi_link_prog_run(link, entry_ip, regs);
>  }
>  
> -static int
> -kprobe_multi_resolve_syms(const void __user *usyms, u32 cnt,
> -			  unsigned long *addrs)
> -{
> -	unsigned long addr, size;
> -	const char __user **syms;
> -	int err = -ENOMEM;
> -	unsigned int i;
> -	char *func;
> -
> -	size = cnt * sizeof(*syms);
> -	syms = kvzalloc(size, GFP_KERNEL);
> -	if (!syms)
> -		return -ENOMEM;
> -
> -	func = kmalloc(KSYM_NAME_LEN, GFP_KERNEL);
> -	if (!func)
> -		goto error;
> -
> -	if (copy_from_user(syms, usyms, size)) {
> -		err = -EFAULT;
> -		goto error;
> -	}
> -
> -	for (i = 0; i < cnt; i++) {
> -		err = strncpy_from_user(func, syms[i], KSYM_NAME_LEN);
> -		if (err == KSYM_NAME_LEN)
> -			err = -E2BIG;
> -		if (err < 0)
> -			goto error;
> -		err = -EINVAL;
> -		addr = kallsyms_lookup_name(func);
> -		if (!addr)
> -			goto error;
> -		if (!kallsyms_lookup_size_offset(addr, &size, NULL))
> -			goto error;
> -		addr = ftrace_location_range(addr, addr + size - 1);
> -		if (!addr)
> -			goto error;
> -		addrs[i] = addr;
> -	}
> -
> -	err = 0;
> -error:
> -	kvfree(syms);
> -	kfree(func);
> -	return err;
> -}
> -
>  int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
>  {
>  	struct bpf_kprobe_multi_link *link = NULL;
> @@ -2438,7 +2455,13 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
>  			goto error;
>  		}
>  	} else {
> -		err = kprobe_multi_resolve_syms(usyms, cnt, addrs);
> +		struct user_syms us;
> +
> +		err = copy_user_syms(&us, usyms, cnt);
> +		if (err)
> +			goto error;
> +		err = kallsyms_lookup_names(us.syms, cnt, addrs);
> +		free_user_syms(&us);
>  		if (err)
>  			goto error;
>  	}
> -- 
> 2.35.1
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
                   ` (3 preceding siblings ...)
  2022-04-07 12:52 ` [RFC bpf-next 4/4] selftests/bpf: Add attach bench test Jiri Olsa
@ 2022-04-08 23:29 ` Alexei Starovoitov
  2022-04-09 20:24   ` Jiri Olsa
  4 siblings, 1 reply; 45+ messages in thread
From: Alexei Starovoitov @ 2022-04-08 23:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> hi,
> sending additional fix for symbol resolving in kprobe multi link
> requested by Alexei and Andrii [1].
> 
> This speeds up bpftrace kprobe attachment, when using pure symbols
> (3344 symbols) to attach:
> 
> Before:
> 
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> 
> After:
> 
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> 
> 
> There are 2 reasons I'm sending this as RFC though..
> 
>   - I added a test that measures attachment speed on all possible functions
>     from available_filter_functions, which is 48712 functions on my setup.
>     The attach/detach time for that is under 2 seconds and the test will
>     fail if it takes longer than that.. which might happen on different setups
>     or on a loaded machine.. I'm not sure what's the best solution yet, separate
>     bench application perhaps?

are you saying there is a bug in the code that you're still debugging?
or just worried about time?

I think it's better for it to be a part of selftest.
CI will take extra 2 seconds to run.
That's fine. It's a good stress test.

>   - copy_user_syms function potentially allocates a lot of memory (~6MB in my
>     tests when attaching ~48k functions). I haven't seen it fail yet,
>     but it might need to be changed to allocate memory gradually if needed,
>     do we care? ;-)

replied in the other email.

Thanks for working on this!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-08 23:29 ` [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Alexei Starovoitov
@ 2022-04-09 20:24   ` Jiri Olsa
  2022-04-11 22:15     ` Andrii Nakryiko
  0 siblings, 1 reply; 45+ messages in thread
From: Jiri Olsa @ 2022-04-09 20:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Fri, Apr 08, 2022 at 04:29:22PM -0700, Alexei Starovoitov wrote:
> On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> > hi,
> > sending additional fix for symbol resolving in kprobe multi link
> > requested by Alexei and Andrii [1].
> > 
> > This speeds up bpftrace kprobe attachment, when using pure symbols
> > (3344 symbols) to attach:
> > 
> > Before:
> > 
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > 
> > After:
> > 
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > 
> > 
> > There are 2 reasons I'm sending this as RFC though..
> > 
> >   - I added a test that measures attachment speed on all possible functions
> >     from available_filter_functions, which is 48712 functions on my setup.
> >     The attach/detach time for that is under 2 seconds and the test will
> >     fail if it takes longer than that.. which might happen on different setups
> >     or on a loaded machine.. I'm not sure what's the best solution yet, separate
> >     bench application perhaps?
> 
> are you saying there is a bug in the code that you're still debugging?
> or just worried about time?

just the time, I can make the test fail (cross the 2 seconds limit)
when the machine is loaded, like with running kernel build

but I couldn't reproduce this with just a parallel test_progs run

> 
> I think it's better for it to be a part of selftest.
> CI will take extra 2 seconds to run.
> That's fine. It's a good stress test.

ok, great

thanks,
jirka

> 
> >   - copy_user_syms function potentially allocates a lot of memory (~6MB in my
> >     tests when attaching ~48k functions). I haven't seen it fail yet,
> >     but it might need to be changed to allocate memory gradually if needed,
> >     do we care? ;-)
> 
> replied in the other email.
> 
> Thanks for working on this!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-08 23:19   ` Alexei Starovoitov
@ 2022-04-09 20:24     ` Jiri Olsa
  2022-04-12 20:46     ` Jiri Olsa
  1 sibling, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-09 20:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Fri, Apr 08, 2022 at 04:19:25PM -0700, Alexei Starovoitov wrote:
> On Thu, Apr 07, 2022 at 02:52:21PM +0200, Jiri Olsa wrote:
> > Adding kallsyms_lookup_names function that resolves array of symbols
> > with single pass over kallsyms.
> > 
> > The user provides array of string pointers with count and pointer to
> > allocated array for resolved values.
> > 
> >   int kallsyms_lookup_names(const char **syms, u32 cnt,
> >                             unsigned long *addrs)
> > 
> > Before we iterate kallsyms we sort user provided symbols by name and
> > then use that in kallsyms iteration to find each kallsyms symbol in
> > user provided symbols.
> > 
> > We also check each symbol to pass ftrace_location, because this API
> > will be used for fprobe symbols resolving. This can be optional in
> > future if there's a need.
> > 
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/linux/kallsyms.h |  6 +++++
> >  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+)
> > 
> > diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> > index ce1bd2fbf23e..5320a5e77f61 100644
> > --- a/include/linux/kallsyms.h
> > +++ b/include/linux/kallsyms.h
> > @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
> >  #ifdef CONFIG_KALLSYMS
> >  /* Lookup the address for a symbol. Returns 0 if not found. */
> >  unsigned long kallsyms_lookup_name(const char *name);
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
> >  
> >  extern int kallsyms_lookup_size_offset(unsigned long addr,
> >  				  unsigned long *symbolsize,
> > @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
> >  	return 0;
> >  }
> >  
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > +{
> > +	return -ERANGE;
> > +}
> > +
> >  static inline int kallsyms_lookup_size_offset(unsigned long addr,
> >  					      unsigned long *symbolsize,
> >  					      unsigned long *offset)
> > diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> > index 79f2eb617a62..a3738ddf9e87 100644
> > --- a/kernel/kallsyms.c
> > +++ b/kernel/kallsyms.c
> > @@ -29,6 +29,8 @@
> >  #include <linux/compiler.h>
> >  #include <linux/module.h>
> >  #include <linux/kernel.h>
> > +#include <linux/bsearch.h>
> > +#include <linux/sort.h>
> >  
> >  /*
> >   * These will be re-linked against their real values
> > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> >  	return __sprint_symbol(buffer, address, -1, 1, 1);
> >  }
> >  
> > +static int symbols_cmp(const void *a, const void *b)
> > +{
> > +	const char **str_a = (const char **) a;
> > +	const char **str_b = (const char **) b;
> > +
> > +	return strcmp(*str_a, *str_b);
> > +}
> > +
> > +struct kallsyms_data {
> > +	unsigned long *addrs;
> > +	const char **syms;
> > +	u32 cnt;
> > +	u32 found;
> > +};
> > +
> > +static int kallsyms_callback(void *data, const char *name,
> > +			     struct module *mod, unsigned long addr)
> > +{
> > +	struct kallsyms_data *args = data;
> > +
> > +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > +		return 0;
> > +
> > +	addr = ftrace_location(addr);
> > +	if (!addr)
> > +		return 0;
> > +
> > +	args->addrs[args->found++] = addr;
> > +	return args->found == args->cnt ? 1 : 0;
> > +}
> > +
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > +{
> > +	struct kallsyms_data args;
> > +
> > +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> 
> It's nice to share symbols_cmp for sort and bsearch,
> but technically messing with the input argument 'syms' like this will cause
> issues sooner or later.
> Let's make the caller do the sort.
> Unordered input will cause issue with bsearch, of course,
> but it's a lesser evil. imo.

ok, will move it out and make some proper comment for the
function mentioning the sort requirement for syms

thanks,
jirka

> 
> > +
> > +	args.addrs = addrs;
> > +	args.syms = syms;
> > +	args.cnt = cnt;
> > +	args.found = 0;
> > +	kallsyms_on_each_symbol(kallsyms_callback, &args);
> > +
> > +	return args.found == args.cnt ? 0 : -EINVAL;
> > +}
> > +
> >  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
> >  struct kallsym_iter {
> >  	loff_t pos;
> > -- 
> > 2.35.1
> > 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-08 23:26   ` Alexei Starovoitov
@ 2022-04-09 20:24     ` Jiri Olsa
  2022-04-11 22:15     ` Andrii Nakryiko
  1 sibling, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-09 20:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, netdev, bpf, lkml, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh

On Fri, Apr 08, 2022 at 04:26:10PM -0700, Alexei Starovoitov wrote:
> On Thu, Apr 07, 2022 at 02:52:23PM +0200, Jiri Olsa wrote:
> > Using the kallsyms_lookup_names function to speed up symbol lookup in
> > kprobe multi link attachment, replacing the current
> > kprobe_multi_resolve_syms function with it.
> > 
> > This speeds up bpftrace kprobe attachment:
> > 
> > Before:
> > 
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > 
> > After:
> > 
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
> >  1 file changed, 73 insertions(+), 50 deletions(-)
> > 
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index b26f3da943de..2602957225ba 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
> >  	unsigned long entry_ip;
> >  };
> >  
> > +struct user_syms {
> > +	const char **syms;
> > +	char *buf;
> > +};
> > +
> > +static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
> > +{
> > +	const char __user **usyms_copy = NULL;
> > +	const char **syms = NULL;
> > +	char *buf = NULL, *p;
> > +	int err = -EFAULT;
> > +	unsigned int i;
> > +	size_t size;
> > +
> > +	size = cnt * sizeof(*usyms_copy);
> > +
> > +	usyms_copy = kvmalloc(size, GFP_KERNEL);
> > +	if (!usyms_copy)
> > +		return -ENOMEM;
> > +
> > +	if (copy_from_user(usyms_copy, usyms, size))
> > +		goto error;
> > +
> > +	err = -ENOMEM;
> > +	syms = kvmalloc(size, GFP_KERNEL);
> > +	if (!syms)
> > +		goto error;
> > +
> > +	/* TODO this potentially allocates a lot of memory (~6MB in my tests
> > +	 * when attaching ~40k functions). I haven't seen it fail yet,
> > +	 * but it could be changed to allocate memory gradually if needed.
> > +	 */
> 
> Why would 6MB kvmalloc fail?
> If we don't have such memory the kernel will be ooming soon anyway.
> I don't think we'll see this kvmalloc triggering oom in practice.
> The verifier allocates a lot more memory to check large programs.
> 
> imo this approach is fine. It's simple.
> Trying to do gradual alloc with realloc would be just guessing.
> 
> Another option would be to ask user space (libbpf) to do the sort.
> There are pros and cons.
> This vmalloc+sort is slightly better imo.

ok, makes sense, will keep it

jirka

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-09 20:24   ` Jiri Olsa
@ 2022-04-11 22:15     ` Andrii Nakryiko
  2022-04-11 22:18       ` Alexei Starovoitov
  0 siblings, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Sat, Apr 9, 2022 at 1:24 PM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Fri, Apr 08, 2022 at 04:29:22PM -0700, Alexei Starovoitov wrote:
> > On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> > > hi,
> > > sending additional fix for symbol resolving in kprobe multi link
> > > requested by Alexei and Andrii [1].
> > >
> > > This speeds up bpftrace kprobe attachment, when using pure symbols
> > > (3344 symbols) to attach:
> > >
> > > Before:
> > >
> > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > >   ...
> > >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > >
> > > After:
> > >
> > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > >   ...
> > >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > >
> > >
> > > There are 2 reasons I'm sending this as RFC though..
> > >
> > >   - I added a test that measures attachment speed on all possible functions
> > >     from available_filter_functions, which is 48712 functions on my setup.
> > >     The attach/detach time for that is under 2 seconds and the test will
> > >     fail if it takes longer than that.. which might happen on different setups
> > >     or on a loaded machine.. I'm not sure what's the best solution yet, separate
> > >     bench application perhaps?
> >
> > are you saying there is a bug in the code that you're still debugging?
> > or just worried about time?
>
> just the time, I can make the test fail (cross the 2 seconds limit)
> when the machine is loaded, like with running kernel build
>
> but I couldn't reproduce this with just paralel test_progs run
>
> >
> > I think it's better for it to be a part of selftest.
> > CI will take extra 2 seconds to run.
> > That's fine. It's a good stress test.

I agree it's a good stress test, but I disagree on adding it as a
selftest. The speed will depend on the actual host machine. In VMs it
will be slower, on busier machines it will be slower, etc. Generally,
depending on some specific timing just causes unnecessary maintenance
headaches. We can have this as a benchmark, if someone thinks it's
very important. I'm impartial to having this regularly executed as
it's extremely unlikely that we'll accidentally regress from NlogN
back to N^2. And if there is some X% slowdown such a selftest is
unlikely to alarm us anyways. Sporadic failures will annoy us way
before that to the point of blacklisting this selftest in CI at the
very least.


>
> ok, great
>
> thanks,
> jirka
>
> >
> > >   - copy_user_syms function potentially allocates a lot of memory (~6MB in my
> > >     tests when attaching ~48k functions). I haven't seen it fail yet,
> > >     but it might need to be changed to allocate memory gradually if needed,
> > >     do we care? ;-)
> >
> > replied in the other email.
> >
> > Thanks for working on this!

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
  2022-04-08  0:57   ` Masami Hiramatsu
  2022-04-08 23:19   ` Alexei Starovoitov
@ 2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-12 20:28     ` Jiri Olsa
  2 siblings, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 7, 2022 at 5:52 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding kallsyms_lookup_names function that resolves array of symbols
> with single pass over kallsyms.
>
> The user provides array of string pointers with count and pointer to
> allocated array for resolved values.
>
>   int kallsyms_lookup_names(const char **syms, u32 cnt,
>                             unsigned long *addrs)
>
> Before we iterate kallsyms we sort user provided symbols by name and
> then use that in kallsyms iteration to find each kallsyms symbol in
> user provided symbols.
>
> We also check each symbol to pass ftrace_location, because this API
> will be used for fprobe symbols resolving. This can be optional in
> future if there's a need.
>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/kallsyms.h |  6 +++++
>  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 54 insertions(+)
>
> diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> index ce1bd2fbf23e..5320a5e77f61 100644
> --- a/include/linux/kallsyms.h
> +++ b/include/linux/kallsyms.h
> @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
>  #ifdef CONFIG_KALLSYMS
>  /* Lookup the address for a symbol. Returns 0 if not found. */
>  unsigned long kallsyms_lookup_name(const char *name);
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
>
>  extern int kallsyms_lookup_size_offset(unsigned long addr,
>                                   unsigned long *symbolsize,
> @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
>         return 0;
>  }
>
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> +{
> +       return -ERANGE;
> +}
> +
>  static inline int kallsyms_lookup_size_offset(unsigned long addr,
>                                               unsigned long *symbolsize,
>                                               unsigned long *offset)
> diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> index 79f2eb617a62..a3738ddf9e87 100644
> --- a/kernel/kallsyms.c
> +++ b/kernel/kallsyms.c
> @@ -29,6 +29,8 @@
>  #include <linux/compiler.h>
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> +#include <linux/bsearch.h>
> +#include <linux/sort.h>
>
>  /*
>   * These will be re-linked against their real values
> @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
>         return __sprint_symbol(buffer, address, -1, 1, 1);
>  }
>
> +static int symbols_cmp(const void *a, const void *b)

isn't this literally strcmp? Or will the compiler actually complain about
const void * vs const char *?

> +{
> +       const char **str_a = (const char **) a;
> +       const char **str_b = (const char **) b;
> +
> +       return strcmp(*str_a, *str_b);
> +}
> +
> +struct kallsyms_data {
> +       unsigned long *addrs;
> +       const char **syms;
> +       u32 cnt;
> +       u32 found;
> +};
> +
> +static int kallsyms_callback(void *data, const char *name,
> +                            struct module *mod, unsigned long addr)
> +{
> +       struct kallsyms_data *args = data;
> +
> +       if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> +               return 0;
> +
> +       addr = ftrace_location(addr);
> +       if (!addr)
> +               return 0;
> +
> +       args->addrs[args->found++] = addr;
> +       return args->found == args->cnt ? 1 : 0;
> +}
> +
> +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> +{
> +       struct kallsyms_data args;
> +
> +       sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> +
> +       args.addrs = addrs;
> +       args.syms = syms;
> +       args.cnt = cnt;
> +       args.found = 0;
> +       kallsyms_on_each_symbol(kallsyms_callback, &args);
> +
> +       return args.found == args.cnt ? 0 : -EINVAL;

ESRCH or ENOENT makes a bit more sense as an error?


> +}
> +
>  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
>  struct kallsym_iter {
>         loff_t pos;
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-07 12:52 ` [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link Jiri Olsa
  2022-04-08 23:26   ` Alexei Starovoitov
@ 2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-12 18:42     ` Jiri Olsa
  1 sibling, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 7, 2022 at 5:53 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Using kallsyms_lookup_names function to speed up symbols lookup in
> kprobe multi link attachment and replacing with it the current
> kprobe_multi_resolve_syms function.
>
> This speeds up bpftrace kprobe attachment:
>
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
>
> After:
>
>   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
>   ...
>   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
>  1 file changed, 73 insertions(+), 50 deletions(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index b26f3da943de..2602957225ba 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
>         unsigned long entry_ip;
>  };
>
> +struct user_syms {
> +       const char **syms;
> +       char *buf;
> +};
> +
> +static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
> +{
> +       const char __user **usyms_copy = NULL;
> +       const char **syms = NULL;
> +       char *buf = NULL, *p;
> +       int err = -EFAULT;
> +       unsigned int i;
> +       size_t size;
> +
> +       size = cnt * sizeof(*usyms_copy);
> +
> +       usyms_copy = kvmalloc(size, GFP_KERNEL);
> +       if (!usyms_copy)
> +               return -ENOMEM;

do you really need usyms_copy? why not just read one pointer at a time?

> +
> +       if (copy_from_user(usyms_copy, usyms, size))
> +               goto error;
> +
> +       err = -ENOMEM;
> +       syms = kvmalloc(size, GFP_KERNEL);
> +       if (!syms)
> +               goto error;
> +
> +       /* TODO this potentially allocates lot of memory (~6MB in my tests
> +        * with attaching ~40k functions). I haven't seen this to fail yet,
> +        * but it could be changed to allocate memory gradually if needed.
> +        */
> +       size = cnt * KSYM_NAME_LEN;

this reassignment of size makes the code hard to follow; you
can just do cnt * KSYM_NAME_LEN inside kvmalloc, since you don't use it
anywhere else

> +       buf = kvmalloc(size, GFP_KERNEL);
> +       if (!buf)
> +               goto error;
> +
> +       for (p = buf, i = 0; i < cnt; i++) {

like here, before doing strncpy_from_user() you can read usyms[i] from
user-space into a temporary variable, so no need for the extra kvmalloc?

> +               err = strncpy_from_user(p, usyms_copy[i], KSYM_NAME_LEN);
> +               if (err == KSYM_NAME_LEN)
> +                       err = -E2BIG;
> +               if (err < 0)
> +                       goto error;
> +               syms[i] = p;
> +               p += err + 1;
> +       }
> +
> +       err = 0;
> +       us->syms = syms;
> +       us->buf = buf;
> +

[...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-08 23:26   ` Alexei Starovoitov
  2022-04-09 20:24     ` Jiri Olsa
@ 2022-04-11 22:15     ` Andrii Nakryiko
  1 sibling, 0 replies; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:15 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Fri, Apr 8, 2022 at 4:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Apr 07, 2022 at 02:52:23PM +0200, Jiri Olsa wrote:
> > Using kallsyms_lookup_names function to speed up symbols lookup in
> > kprobe multi link attachment and replacing with it the current
> > kprobe_multi_resolve_syms function.
> >
> > This speeds up bpftrace kprobe attachment:
> >
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> >
> > After:
> >
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
> >  1 file changed, 73 insertions(+), 50 deletions(-)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index b26f3da943de..2602957225ba 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
> >       unsigned long entry_ip;
> >  };
> >
> > +struct user_syms {
> > +     const char **syms;
> > +     char *buf;
> > +};
> > +
> > +static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
> > +{
> > +     const char __user **usyms_copy = NULL;
> > +     const char **syms = NULL;
> > +     char *buf = NULL, *p;
> > +     int err = -EFAULT;
> > +     unsigned int i;
> > +     size_t size;
> > +
> > +     size = cnt * sizeof(*usyms_copy);
> > +
> > +     usyms_copy = kvmalloc(size, GFP_KERNEL);
> > +     if (!usyms_copy)
> > +             return -ENOMEM;
> > +
> > +     if (copy_from_user(usyms_copy, usyms, size))
> > +             goto error;
> > +
> > +     err = -ENOMEM;
> > +     syms = kvmalloc(size, GFP_KERNEL);
> > +     if (!syms)
> > +             goto error;
> > +
> > +     /* TODO this potentially allocates lot of memory (~6MB in my tests
> > +      * with attaching ~40k functions). I haven't seen this to fail yet,
> > +      * but it could be changed to allocate memory gradually if needed.
> > +      */
>
> Why would 6MB kvmalloc fail?
> If we don't have such memory the kernel will be ooming soon anyway.
> I don't think we'll see this kvmalloc triggering oom in practice.
> The verifier allocates a lot more memory to check large programs.
>
> imo this approach is fine. It's simple.
> Trying to do gradual alloc with realloc would be just guessing.
>
> Another option would be to ask user space (libbpf) to do the sort.
> There are pros and cons.
> This vmalloc+sort is slightly better imo.

Totally agree, the simpler the better. Also, when you are attaching to
tens of thousands of probes, 6MB isn't a lot of memory for whatever
you are trying to do, anyway :)

Even if libbpf had to sort it, the kernel would probably have to validate
that. Also, for binary search you'd still need to read in the string
itself, but if you did this "on demand", we'd be adding TOCTOU and
other headaches.

Simple is good.

>
> > +     size = cnt * KSYM_NAME_LEN;
> > +     buf = kvmalloc(size, GFP_KERNEL);
> > +     if (!buf)
> > +             goto error;
> > +
> > +     for (p = buf, i = 0; i < cnt; i++) {
> > +             err = strncpy_from_user(p, usyms_copy[i], KSYM_NAME_LEN);
> > +             if (err == KSYM_NAME_LEN)
> > +                     err = -E2BIG;
> > +             if (err < 0)
> > +                     goto error;

[...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-07 12:52 ` [RFC bpf-next 4/4] selftests/bpf: Add attach bench test Jiri Olsa
@ 2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-12  0:49     ` Masami Hiramatsu
  2022-04-12 15:44     ` Jiri Olsa
  0 siblings, 2 replies; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:15 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Thu, Apr 7, 2022 at 5:53 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding test that reads all functions from ftrace available_filter_functions
> file and attach them all through kprobe_multi API.
>
> It checks that the attach and detach times is under 2 seconds
> and printf stats info with -v option, like on my setup:
>
>   test_bench_attach: found 48712 functions
>   test_bench_attach: attached in   1.069s
>   test_bench_attach: detached in   0.373s
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  .../bpf/prog_tests/kprobe_multi_test.c        | 141 ++++++++++++++++++
>  .../selftests/bpf/progs/kprobe_multi_empty.c  |  12 ++
>  2 files changed, 153 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/progs/kprobe_multi_empty.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> index b9876b55fc0c..6798b54416de 100644
> --- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> +++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
> @@ -2,6 +2,9 @@
>  #include <test_progs.h>
>  #include "kprobe_multi.skel.h"
>  #include "trace_helpers.h"
> +#include "kprobe_multi_empty.skel.h"
> +#include "bpf/libbpf_internal.h"
> +#include "bpf/hashmap.h"
>
>  static void kprobe_multi_test_run(struct kprobe_multi *skel, bool test_return)
>  {
> @@ -301,6 +304,142 @@ static void test_attach_api_fails(void)
>         kprobe_multi__destroy(skel);
>  }
>
> +static inline __u64 get_time_ns(void)
> +{
> +       struct timespec t;
> +
> +       clock_gettime(CLOCK_MONOTONIC, &t);
> +       return (__u64) t.tv_sec * 1000000000 + t.tv_nsec;
> +}
> +
> +static size_t symbol_hash(const void *key, void *ctx __maybe_unused)
> +{
> +       return str_hash((const char *) key);
> +}
> +
> +static bool symbol_equal(const void *key1, const void *key2, void *ctx __maybe_unused)
> +{
> +       return strcmp((const char *) key1, (const char *) key2) == 0;
> +}
> +
> +#define DEBUGFS "/sys/kernel/debug/tracing/"
> +
> +static int get_syms(char ***symsp, size_t *cntp)
> +{
> +       size_t cap = 0, cnt = 0, i;
> +       char *name, **syms = NULL;
> +       struct hashmap *map;
> +       char buf[256];
> +       FILE *f;
> +       int err;
> +
> +       /*
> +        * The available_filter_functions contains many duplicates,
> +        * but other than that all symbols are usable in kprobe multi
> +        * interface.
> +        * Filtering out duplicates by using hashmap__add, which won't
> +        * add existing entry.
> +        */
> +       f = fopen(DEBUGFS "available_filter_functions", "r");

I'm really curious how you managed to attach to everything in
available_filter_functions, because when I try to do that I fail.
available_filter_functions has a bunch of functions that should not be
attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:

  void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);

So first, curious what I am doing wrong or rather why it succeeds in
your case ;)

But second, just wanted to plead to "fix" available_filter_functions to
not list stuff that should not be attachable. Can you please take a
look and check what's going on there and why we have notrace
functions (and what else should *NOT* be there)?


> +       if (!f)
> +               return -EINVAL;
> +
> +       map = hashmap__new(symbol_hash, symbol_equal, NULL);
> +       err = libbpf_get_error(map);
> +       if (err)
> +               goto error;
> +

[...]

> +
> +       attach_delta_ns = (attach_end_ns - attach_start_ns) / 1000000000.0;
> +       detach_delta_ns = (detach_end_ns - detach_start_ns) / 1000000000.0;
> +
> +       fprintf(stderr, "%s: found %lu functions\n", __func__, cnt);
> +       fprintf(stderr, "%s: attached in %7.3lfs\n", __func__, attach_delta_ns);
> +       fprintf(stderr, "%s: detached in %7.3lfs\n", __func__, detach_delta_ns);
> +
> +       if (attach_delta_ns > 2.0)
> +               PRINT_FAIL("attach time above 2 seconds\n");
> +       if (detach_delta_ns > 2.0)
> +               PRINT_FAIL("detach time above 2 seconds\n");

see my reply on the cover letter; any such "2 second" assumptions are
guaranteed to bite us. We've dealt with a lot of timing issues due to
CI being slower and more unpredictable in terms of performance, and I'd
like to avoid dealing with one more case like that.


> +
> +cleanup:
> +       kprobe_multi_empty__destroy(skel);
> +       if (syms) {
> +               for (i = 0; i < cnt; i++)
> +                       free(syms[i]);
> +               free(syms);
> +       }
> +}
> +
>  void test_kprobe_multi_test(void)
>  {
>         if (!ASSERT_OK(load_kallsyms(), "load_kallsyms"))
> @@ -320,4 +459,6 @@ void test_kprobe_multi_test(void)
>                 test_attach_api_syms();
>         if (test__start_subtest("attach_api_fails"))
>                 test_attach_api_fails();
> +       if (test__start_subtest("bench_attach"))
> +               test_bench_attach();
>  }
> diff --git a/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c b/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c
> new file mode 100644
> index 000000000000..be9e3d891d46
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/kprobe_multi_empty.c
> @@ -0,0 +1,12 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +
> +char _license[] SEC("license") = "GPL";
> +
> +SEC("kprobe.multi/*")
> +int test_kprobe_empty(struct pt_regs *ctx)
> +{
> +       return 0;
> +}
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-11 22:15     ` Andrii Nakryiko
@ 2022-04-11 22:18       ` Alexei Starovoitov
  2022-04-11 22:21         ` Andrii Nakryiko
  0 siblings, 1 reply; 45+ messages in thread
From: Alexei Starovoitov @ 2022-04-11 22:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Networking, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Mon, Apr 11, 2022 at 10:15 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sat, Apr 9, 2022 at 1:24 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > On Fri, Apr 08, 2022 at 04:29:22PM -0700, Alexei Starovoitov wrote:
> > > On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> > > > hi,
> > > > sending additional fix for symbol resolving in kprobe multi link
> > > > requested by Alexei and Andrii [1].
> > > >
> > > > This speeds up bpftrace kprobe attachment, when using pure symbols
> > > > (3344 symbols) to attach:
> > > >
> > > > Before:
> > > >
> > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > >   ...
> > > >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > > >
> > > > After:
> > > >
> > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > >   ...
> > > >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > > >
> > > >
> > > > There are 2 reasons I'm sending this as RFC though..
> > > >
> > > >   - I added test that meassures attachment speed on all possible functions
> > > >     from available_filter_functions, which is 48712 functions on my setup.
> > > >     The attach/detach speed for that is under 2 seconds and the test will
> > > >     fail if it's bigger than that.. which might fail on different setups
> > > >     or loaded machine.. I'm not sure what's the best solution yet, separate
> > > >     bench application perhaps?
> > >
> > > are you saying there is a bug in the code that you're still debugging?
> > > or just worried about time?
> >
> > just the time, I can make the test fail (cross the 2 seconds limit)
> > when the machine is loaded, like with running kernel build
> >
> > but I couldn't reproduce this with just paralel test_progs run
> >
> > >
> > > I think it's better for it to be a part of selftest.
> > > CI will take extra 2 seconds to run.
> > > That's fine. It's a good stress test.
>
> I agree it's a good stress test, but I disagree on adding it as a
> selftests. The speed will depend on actual host machine. In VMs it
> will be slower, on busier machines it will be slower, etc. Generally,
> depending on some specific timing just causes unnecessary maintenance
> headaches. We can have this as a benchmark, if someone things it's
> very important. I'm impartial to having this regularly executed as
> it's extremely unlikely that we'll accidentally regress from NlogN
> back to N^2. And if there is some X% slowdown such selftest is
> unlikely to alarm us anyways. Sporadic failures will annoy us way
> before that to the point of blacklisting this selftests in CI at the
> very least.

Such selftest shouldn't be measuring the speed, of course.
The selftest will be about:
1. not crashing
2. succeeding to attach and getting some meaningful data back.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-11 22:18       ` Alexei Starovoitov
@ 2022-04-11 22:21         ` Andrii Nakryiko
  2022-04-12 15:46           ` Jiri Olsa
  0 siblings, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-11 22:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Masami Hiramatsu, Networking, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Mon, Apr 11, 2022 at 3:18 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 11, 2022 at 10:15 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Sat, Apr 9, 2022 at 1:24 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > >
> > > On Fri, Apr 08, 2022 at 04:29:22PM -0700, Alexei Starovoitov wrote:
> > > > On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> > > > > hi,
> > > > > sending additional fix for symbol resolving in kprobe multi link
> > > > > requested by Alexei and Andrii [1].
> > > > >
> > > > > This speeds up bpftrace kprobe attachment, when using pure symbols
> > > > > (3344 symbols) to attach:
> > > > >
> > > > > Before:
> > > > >
> > > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > > >   ...
> > > > >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > > > >
> > > > > After:
> > > > >
> > > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > > >   ...
> > > > >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > > > >
> > > > >
> > > > > There are 2 reasons I'm sending this as RFC though..
> > > > >
> > > > >   - I added test that meassures attachment speed on all possible functions
> > > > >     from available_filter_functions, which is 48712 functions on my setup.
> > > > >     The attach/detach speed for that is under 2 seconds and the test will
> > > > >     fail if it's bigger than that.. which might fail on different setups
> > > > >     or loaded machine.. I'm not sure what's the best solution yet, separate
> > > > >     bench application perhaps?
> > > >
> > > > are you saying there is a bug in the code that you're still debugging?
> > > > or just worried about time?
> > >
> > > just the time, I can make the test fail (cross the 2 seconds limit)
> > > when the machine is loaded, like with running kernel build
> > >
> > > but I couldn't reproduce this with just paralel test_progs run
> > >
> > > >
> > > > I think it's better for it to be a part of selftest.
> > > > CI will take extra 2 seconds to run.
> > > > That's fine. It's a good stress test.
> >
> > I agree it's a good stress test, but I disagree on adding it as a
> > selftests. The speed will depend on actual host machine. In VMs it
> > will be slower, on busier machines it will be slower, etc. Generally,
> > depending on some specific timing just causes unnecessary maintenance
> > headaches. We can have this as a benchmark, if someone things it's
> > very important. I'm impartial to having this regularly executed as
> > it's extremely unlikely that we'll accidentally regress from NlogN
> > back to N^2. And if there is some X% slowdown such selftest is
> > unlikely to alarm us anyways. Sporadic failures will annoy us way
> > before that to the point of blacklisting this selftests in CI at the
> > very least.
>
> Such selftest shouldn't be measuring the speed, of course.
> The selftest will be about:
> 1. not crashing
> 2. succeeding to attach and getting some meaningful data back.

Yeah, that's totally fine with me. My biggest beef is using time as a
measure of test success, which will be flaky. Just a slow-ish test
doing a lot of work sounds totally fine.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-11 22:15   ` Andrii Nakryiko
@ 2022-04-12  0:49     ` Masami Hiramatsu
  2022-04-12 22:51       ` Andrii Nakryiko
  2022-04-13 16:44       ` Steven Rostedt
  2022-04-12 15:44     ` Jiri Olsa
  1 sibling, 2 replies; 45+ messages in thread
From: Masami Hiramatsu @ 2022-04-12  0:49 UTC (permalink / raw)
  To: Andrii Nakryiko, Steven Rostedt
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Mon, 11 Apr 2022 15:15:40 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> > +#define DEBUGFS "/sys/kernel/debug/tracing/"
> > +
> > +static int get_syms(char ***symsp, size_t *cntp)
> > +{
> > +       size_t cap = 0, cnt = 0, i;
> > +       char *name, **syms = NULL;
> > +       struct hashmap *map;
> > +       char buf[256];
> > +       FILE *f;
> > +       int err;
> > +
> > +       /*
> > +        * The available_filter_functions contains many duplicates,
> > +        * but other than that all symbols are usable in kprobe multi
> > +        * interface.
> > +        * Filtering out duplicates by using hashmap__add, which won't
> > +        * add existing entry.
> > +        */
> > +       f = fopen(DEBUGFS "available_filter_functions", "r");
> 
> I'm really curious how did you manage to attach to everything in
> available_filter_functions because when I'm trying to do that I fail.
> available_filter_functions has a bunch of functions that should not be
> attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> 
>   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);

Hmm, this sounds like a bug on the ftrace side. IIUC,
"available_filter_functions" only shows functions that are
instrumented by mcount, so we should not see any notrace functions in it.

Technically, this is done by the __no_instrument_function__ attribute.

#if defined(CC_USING_HOTPATCH)
#define notrace                 __attribute__((hotpatch(0, 0)))
#elif defined(CC_USING_PATCHABLE_FUNCTION_ENTRY)
#define notrace                 __attribute__((patchable_function_entry(0, 0)))
#else
#define notrace                 __attribute__((__no_instrument_function__))
#endif

> 
> So first, curious what I am doing wrong or rather why it succeeds in
> your case ;)
> 
> But second, just wanted to plea to "fix" available_filter_functions to
> not list stuff that should not be attachable. Can you please take a
> look and checks what's going on there and why do we have notrace
> functions (and what else should *NOT* be there)?

Can you share how you reproduced the issue? I'll check it.

Thank you,


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-11 22:15   ` Andrii Nakryiko
  2022-04-12  0:49     ` Masami Hiramatsu
@ 2022-04-12 15:44     ` Jiri Olsa
  2022-04-12 22:59       ` Andrii Nakryiko
  1 sibling, 1 reply; 45+ messages in thread
From: Jiri Olsa @ 2022-04-12 15:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Mon, Apr 11, 2022 at 03:15:40PM -0700, Andrii Nakryiko wrote:

SNIP

> > +static int get_syms(char ***symsp, size_t *cntp)
> > +{
> > +       size_t cap = 0, cnt = 0, i;
> > +       char *name, **syms = NULL;
> > +       struct hashmap *map;
> > +       char buf[256];
> > +       FILE *f;
> > +       int err;
> > +
> > +       /*
> > +        * The available_filter_functions contains many duplicates,
> > +        * but other than that all symbols are usable in kprobe multi
> > +        * interface.
> > +        * Filtering out duplicates by using hashmap__add, which won't
> > +        * add existing entry.
> > +        */
> > +       f = fopen(DEBUGFS "available_filter_functions", "r");
> 
> I'm really curious how did you manage to attach to everything in
> available_filter_functions because when I'm trying to do that I fail.

the new code makes the difference ;-) the main problem that kept me from
using available_filter_functions before was cases like:

  # cat available_filter_functions | grep sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall
  sys_ni_syscall

which when you try to resolve you'll find just one address:

  # cat /proc/kallsyms | egrep 'T sys_ni_syscall'
  ffffffff81170020 T sys_ni_syscall

this is caused by entries like:
    __SYSCALL(156, sys_ni_syscall)

when generating syscalls for a given arch

this is handled by the new code by removing duplicates when
reading available_filter_functions
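
roughly, the dedup part looks like the sketch below (simplified, not the
exact loop from the test -- error handling, module-symbol filtering and
growing of the syms array are left out; it just relies on hashmap__add
returning -EEXIST for names that were already added):

	char buf[256], *name;

	while (fgets(buf, sizeof(buf), f)) {
		buf[strcspn(buf, " \n")] = '\0';	/* keep the symbol name only */
		name = strdup(buf);
		if (!name)
			goto error;
		err = hashmap__add(map, name, NULL);
		if (err == -EEXIST) {
			free(name);			/* duplicate, e.g. sys_ni_syscall */
			continue;
		}
		if (err)
			goto error;
		syms[cnt++] = name;			/* growth of syms[] omitted here */
	}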



another case is the other way round, like with:

  # cat /proc/kallsyms | grep 't t_next'
  ffffffff8125c3f0 t t_next
  ffffffff8126a320 t t_next
  ffffffff81275de0 t t_next
  ffffffff8127efd0 t t_next
  ffffffff814d6660 t t_next

that has just one 'ftrace-able' instance:

  # cat available_filter_functions | grep '^t_next$'
  t_next

and this is handled by calling ftrace_location on the address when
resolving symbols, to ensure each resolved symbol lives in ftrace

> available_filter_functions has a bunch of functions that should not be
> attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> 
>   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> 
> So first, curious what I am doing wrong or rather why it succeeds in
> your case ;)
> 
> But second, just wanted to plea to "fix" available_filter_functions to
> not list stuff that should not be attachable. Can you please take a
> look and checks what's going on there and why do we have notrace
> functions (and what else should *NOT* be there)?

yes, seems like a bug ;-) it's in available_filter_functions
but it does not have 'call __fentry__' at the entry..

I was going to check on that, because you brought that up before,
but did not get to it yet

> 
> 
> > +       if (!f)
> > +               return -EINVAL;
> > +
> > +       map = hashmap__new(symbol_hash, symbol_equal, NULL);
> > +       err = libbpf_get_error(map);
> > +       if (err)
> > +               goto error;
> > +
> 
> [...]
> 
> > +
> > +       attach_delta_ns = (attach_end_ns - attach_start_ns) / 1000000000.0;
> > +       detach_delta_ns = (detach_end_ns - detach_start_ns) / 1000000000.0;
> > +
> > +       fprintf(stderr, "%s: found %lu functions\n", __func__, cnt);
> > +       fprintf(stderr, "%s: attached in %7.3lfs\n", __func__, attach_delta_ns);
> > +       fprintf(stderr, "%s: detached in %7.3lfs\n", __func__, detach_delta_ns);
> > +
> > +       if (attach_delta_ns > 2.0)
> > +               PRINT_FAIL("attach time above 2 seconds\n");
> > +       if (detach_delta_ns > 2.0)
> > +               PRINT_FAIL("detach time above 2 seconds\n");
> 
> see my reply on the cover letter, any such "2 second" assumption are
> guaranteed to bite us. We've dealt with a lot of timing issues due to
> CI being slower and more unpredictable in terms of performance, I'd
> like to avoid dealing with one more case like that.

right, I'll remove the check

thanks,
jirka

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link
  2022-04-11 22:21         ` Andrii Nakryiko
@ 2022-04-12 15:46           ` Jiri Olsa
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-12 15:46 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Masami Hiramatsu, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Mon, Apr 11, 2022 at 03:21:49PM -0700, Andrii Nakryiko wrote:
> On Mon, Apr 11, 2022 at 3:18 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Apr 11, 2022 at 10:15 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Sat, Apr 9, 2022 at 1:24 PM Jiri Olsa <olsajiri@gmail.com> wrote:
> > > >
> > > > On Fri, Apr 08, 2022 at 04:29:22PM -0700, Alexei Starovoitov wrote:
> > > > > On Thu, Apr 07, 2022 at 02:52:20PM +0200, Jiri Olsa wrote:
> > > > > > hi,
> > > > > > sending additional fix for symbol resolving in kprobe multi link
> > > > > > requested by Alexei and Andrii [1].
> > > > > >
> > > > > > This speeds up bpftrace kprobe attachment, when using pure symbols
> > > > > > (3344 symbols) to attach:
> > > > > >
> > > > > > Before:
> > > > > >
> > > > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > > > >   ...
> > > > > >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> > > > > >
> > > > > > After:
> > > > > >
> > > > > >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> > > > > >   ...
> > > > > >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> > > > > >
> > > > > >
> > > > > > There are 2 reasons I'm sending this as RFC though..
> > > > > >
> > > > > >   - I added test that meassures attachment speed on all possible functions
> > > > > >     from available_filter_functions, which is 48712 functions on my setup.
> > > > > >     The attach/detach speed for that is under 2 seconds and the test will
> > > > > >     fail if it's bigger than that.. which might fail on different setups
> > > > > >     or loaded machine.. I'm not sure what's the best solution yet, separate
> > > > > >     bench application perhaps?
> > > > >
> > > > > are you saying there is a bug in the code that you're still debugging?
> > > > > or just worried about time?
> > > >
> > > > just the time, I can make the test fail (cross the 2 seconds limit)
> > > > when the machine is loaded, like with running kernel build
> > > >
> > > > but I couldn't reproduce this with just paralel test_progs run
> > > >
> > > > >
> > > > > I think it's better for it to be a part of selftest.
> > > > > CI will take extra 2 seconds to run.
> > > > > That's fine. It's a good stress test.
> > >
> > > I agree it's a good stress test, but I disagree on adding it as a
> > > selftests. The speed will depend on actual host machine. In VMs it
> > > will be slower, on busier machines it will be slower, etc. Generally,
> > > depending on some specific timing just causes unnecessary maintenance
> > > headaches. We can have this as a benchmark, if someone things it's
> > > very important. I'm impartial to having this regularly executed as
> > > it's extremely unlikely that we'll accidentally regress from NlogN
> > > back to N^2. And if there is some X% slowdown such selftest is
> > > unlikely to alarm us anyways. Sporadic failures will annoy us way
> > > before that to the point of blacklisting this selftests in CI at the
> > > very least.
> >
> > Such selftest shouldn't be measuring the speed, of course.
> > The selftest will be about:
> > 1. not crashing
> > 2. succeeding to attach and getting some meaningful data back.
> 
> Yeah, that's totally fine with me. My biggest beef is using time as a
> measure of test success, which will be flaky. Just a slow-ish test
> doing a lot of work sounds totally fine.

ok, I'll remove the 2 seconds check

thanks,
jirka

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link
  2022-04-11 22:15   ` Andrii Nakryiko
@ 2022-04-12 18:42     ` Jiri Olsa
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-12 18:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Mon, Apr 11, 2022 at 03:15:32PM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 7, 2022 at 5:53 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Using kallsyms_lookup_names function to speed up symbols lookup in
> > kprobe multi link attachment and replacing with it the current
> > kprobe_multi_resolve_syms function.
> >
> > This speeds up bpftrace kprobe attachment:
> >
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   6.5681 +- 0.0225 seconds time elapsed  ( +-  0.34% )
> >
> > After:
> >
> >   # perf stat -r 5 -e cycles ./src/bpftrace -e 'kprobe:x* {  } i:ms:1 { exit(); }'
> >   ...
> >   0.5661 +- 0.0275 seconds time elapsed  ( +-  4.85% )
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  kernel/trace/bpf_trace.c | 123 +++++++++++++++++++++++----------------
> >  1 file changed, 73 insertions(+), 50 deletions(-)
> >
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index b26f3da943de..2602957225ba 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -2226,6 +2226,72 @@ struct bpf_kprobe_multi_run_ctx {
> >         unsigned long entry_ip;
> >  };
> >
> > +struct user_syms {
> > +       const char **syms;
> > +       char *buf;
> > +};
> > +
> > +static int copy_user_syms(struct user_syms *us, void __user *usyms, u32 cnt)
> > +{
> > +       const char __user **usyms_copy = NULL;
> > +       const char **syms = NULL;
> > +       char *buf = NULL, *p;
> > +       int err = -EFAULT;
> > +       unsigned int i;
> > +       size_t size;
> > +
> > +       size = cnt * sizeof(*usyms_copy);
> > +
> > +       usyms_copy = kvmalloc(size, GFP_KERNEL);
> > +       if (!usyms_copy)
> > +               return -ENOMEM;
> 
> do you really need usyms_copy? why not just read one pointer at a time?
> 
> > +
> > +       if (copy_from_user(usyms_copy, usyms, size))
> > +               goto error;
> > +
> > +       err = -ENOMEM;
> > +       syms = kvmalloc(size, GFP_KERNEL);
> > +       if (!syms)
> > +               goto error;
> > +
> > +       /* TODO this potentially allocates lot of memory (~6MB in my tests
> > +        * with attaching ~40k functions). I haven't seen this to fail yet,
> > +        * but it could be changed to allocate memory gradually if needed.
> > +        */
> > +       size = cnt * KSYM_NAME_LEN;
> 
> this reassignment of size is making it hard to follow the code, you
> can just do cnt * KSYM_NAME_LEN inside kvmalloc, you don't ever use it
> anywhere else

ok

> 
> > +       buf = kvmalloc(size, GFP_KERNEL);
> > +       if (!buf)
> > +               goto error;
> > +
> > +       for (p = buf, i = 0; i < cnt; i++) {
> 
> like here, before doing strncpy_from_user() you can read usyms[i] from
> user-space into temporary variable, no need for extra kvmalloc?

yes, that could work.. a single copy_from_user seemed faster than separate
get_user calls, but the per-pointer way avoids the extra memory
allocation.. so perhaps that's better
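
for reference, an untested sketch of that per-pointer variant, keeping
the names from the patch (usyms_copy and its bulk copy_from_user go
away; each user pointer is fetched with get_user right before its
string is copied):

	for (p = buf, i = 0; i < cnt; i++) {
		unsigned long usymbol;

		/* read one user pointer at a time instead of copying the
		 * whole pointer array into usyms_copy first */
		if (get_user(usymbol, (unsigned long __user *)usyms + i)) {
			err = -EFAULT;
			goto error;
		}
		err = strncpy_from_user(p, (const char __user *)usymbol, KSYM_NAME_LEN);
		if (err == KSYM_NAME_LEN)
			err = -E2BIG;
		if (err < 0)
			goto error;
		syms[i] = p;
		p += err + 1;
	}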

jirka

> 
> > +               err = strncpy_from_user(p, usyms_copy[i], KSYM_NAME_LEN);
> > +               if (err == KSYM_NAME_LEN)
> > +                       err = -E2BIG;
> > +               if (err < 0)
> > +                       goto error;
> > +               syms[i] = p;
> > +               p += err + 1;
> > +       }
> > +
> > +       err = 0;
> > +       us->syms = syms;
> > +       us->buf = buf;
> > +
> 
> [...]

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-11 22:15   ` Andrii Nakryiko
@ 2022-04-12 20:28     ` Jiri Olsa
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-12 20:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Mon, Apr 11, 2022 at 03:15:23PM -0700, Andrii Nakryiko wrote:

SNIP

> >  static inline int kallsyms_lookup_size_offset(unsigned long addr,
> >                                               unsigned long *symbolsize,
> >                                               unsigned long *offset)
> > diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> > index 79f2eb617a62..a3738ddf9e87 100644
> > --- a/kernel/kallsyms.c
> > +++ b/kernel/kallsyms.c
> > @@ -29,6 +29,8 @@
> >  #include <linux/compiler.h>
> >  #include <linux/module.h>
> >  #include <linux/kernel.h>
> > +#include <linux/bsearch.h>
> > +#include <linux/sort.h>
> >
> >  /*
> >   * These will be re-linked against their real values
> > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> >         return __sprint_symbol(buffer, address, -1, 1, 1);
> >  }
> >
> > +static int symbols_cmp(const void *a, const void *b)
> 
> isn't this literally strcmp? Or compiler will actually complain about
> const void * vs const char *?

yes..

kernel/kallsyms.c: In function ‘kallsyms_callback’:
kernel/kallsyms.c:597:73: error: passing argument 5 of ‘bsearch’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  597 |         if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), strcmp))
      |                                                                         ^~~~~~
      |                                                                         |
      |                                                                         int (*)(const char *, const char *)


> 
> > +{
> > +       const char **str_a = (const char **) a;
> > +       const char **str_b = (const char **) b;
> > +
> > +       return strcmp(*str_a, *str_b);
> > +}
> > +
> > +struct kallsyms_data {
> > +       unsigned long *addrs;
> > +       const char **syms;
> > +       u32 cnt;
> > +       u32 found;
> > +};
> > +
> > +static int kallsyms_callback(void *data, const char *name,
> > +                            struct module *mod, unsigned long addr)
> > +{
> > +       struct kallsyms_data *args = data;
> > +
> > +       if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > +               return 0;
> > +
> > +       addr = ftrace_location(addr);
> > +       if (!addr)
> > +               return 0;
> > +
> > +       args->addrs[args->found++] = addr;
> > +       return args->found == args->cnt ? 1 : 0;
> > +}
> > +
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > +{
> > +       struct kallsyms_data args;
> > +
> > +       sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> > +
> > +       args.addrs = addrs;
> > +       args.syms = syms;
> > +       args.cnt = cnt;
> > +       args.found = 0;
> > +       kallsyms_on_each_symbol(kallsyms_callback, &args);
> > +
> > +       return args.found == args.cnt ? 0 : -EINVAL;
> 
> ESRCH or ENOENT makes a bit more sense as an error?

ok

jirka

> 
> 
> > +}
> > +
> >  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
> >  struct kallsym_iter {
> >         loff_t pos;
> > --
> > 2.35.1
> >

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-08 23:19   ` Alexei Starovoitov
  2022-04-09 20:24     ` Jiri Olsa
@ 2022-04-12 20:46     ` Jiri Olsa
  2022-04-15  0:47       ` Masami Hiramatsu
  1 sibling, 1 reply; 45+ messages in thread
From: Jiri Olsa @ 2022-04-12 20:46 UTC (permalink / raw)
  To: Alexei Starovoitov, Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Fri, Apr 08, 2022 at 04:19:25PM -0700, Alexei Starovoitov wrote:
> On Thu, Apr 07, 2022 at 02:52:21PM +0200, Jiri Olsa wrote:
> > Adding kallsyms_lookup_names function that resolves array of symbols
> > with single pass over kallsyms.
> > 
> > The user provides array of string pointers with count and pointer to
> > allocated array for resolved values.
> > 
> >   int kallsyms_lookup_names(const char **syms, u32 cnt,
> >                             unsigned long *addrs)
> > 
> > Before we iterate kallsyms we sort user provided symbols by name and
> > then use that in kalsyms iteration to find each kallsyms symbol in
> > user provided symbols.
> > 
> > We also check each symbol to pass ftrace_location, because this API
> > will be used for fprobe symbols resolving. This can be optional in
> > future if there's a need.
> > 
> > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  include/linux/kallsyms.h |  6 +++++
> >  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+)
> > 
> > diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> > index ce1bd2fbf23e..5320a5e77f61 100644
> > --- a/include/linux/kallsyms.h
> > +++ b/include/linux/kallsyms.h
> > @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
> >  #ifdef CONFIG_KALLSYMS
> >  /* Lookup the address for a symbol. Returns 0 if not found. */
> >  unsigned long kallsyms_lookup_name(const char *name);
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
> >  
> >  extern int kallsyms_lookup_size_offset(unsigned long addr,
> >  				  unsigned long *symbolsize,
> > @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
> >  	return 0;
> >  }
> >  
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > +{
> > +	return -ERANGE;
> > +}
> > +
> >  static inline int kallsyms_lookup_size_offset(unsigned long addr,
> >  					      unsigned long *symbolsize,
> >  					      unsigned long *offset)
> > diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> > index 79f2eb617a62..a3738ddf9e87 100644
> > --- a/kernel/kallsyms.c
> > +++ b/kernel/kallsyms.c
> > @@ -29,6 +29,8 @@
> >  #include <linux/compiler.h>
> >  #include <linux/module.h>
> >  #include <linux/kernel.h>
> > +#include <linux/bsearch.h>
> > +#include <linux/sort.h>
> >  
> >  /*
> >   * These will be re-linked against their real values
> > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> >  	return __sprint_symbol(buffer, address, -1, 1, 1);
> >  }
> >  
> > +static int symbols_cmp(const void *a, const void *b)
> > +{
> > +	const char **str_a = (const char **) a;
> > +	const char **str_b = (const char **) b;
> > +
> > +	return strcmp(*str_a, *str_b);
> > +}
> > +
> > +struct kallsyms_data {
> > +	unsigned long *addrs;
> > +	const char **syms;
> > +	u32 cnt;
> > +	u32 found;
> > +};
> > +
> > +static int kallsyms_callback(void *data, const char *name,
> > +			     struct module *mod, unsigned long addr)
> > +{
> > +	struct kallsyms_data *args = data;
> > +
> > +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > +		return 0;
> > +
> > +	addr = ftrace_location(addr);
> > +	if (!addr)
> > +		return 0;
> > +
> > +	args->addrs[args->found++] = addr;
> > +	return args->found == args->cnt ? 1 : 0;
> > +}
> > +
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > +{
> > +	struct kallsyms_data args;
> > +
> > +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> 
> It's nice to share symbols_cmp for sort and bsearch,
> but messing technically input argument 'syms' like this will cause
> issues sooner or later.
> Lets make caller do the sort.
> Unordered input will cause issue with bsearch, of course,
> but it's a lesser evil. imo.
> 

Masami,
this logic bubbles up to register_fprobe_syms, because the user
provides the symbols as its argument. Can we still impose this assumption
on the 'syms' array, like with the comment change below?

FYI the bpf side does not use register_fprobe_syms, it uses
register_fprobe_ips, because it always needs the ips as the search
base for cookie values

thanks,
jirka


---
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index d466803dc2b2..28379c0e23e5 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -250,7 +250,7 @@ EXPORT_SYMBOL_GPL(register_fprobe_ips);
 /**
  * register_fprobe_syms() - Register fprobe to ftrace by symbols.
  * @fp: A fprobe data structure to be registered.
- * @syms: An array of target symbols.
+ * @syms: An array of target symbols, must be alphabetically sorted.
  * @num: The number of entries of @syms.
  *
  * Register @fp to the symbols given by @syms array. This will be useful if
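
for illustration, a hypothetical register_fprobe_syms caller would then
look roughly like below -- my_syms, my_fprobe and my_entry_handler are
made-up names, and symbols_cmp is a local copy of the comparator from
patch 1, since the series does not export it:

	/* hypothetical caller: syms must be sorted alphabetically before
	 * it is handed to register_fprobe_syms() */
	static const char *my_syms[] = { "vfs_open", "ksys_read", "ksys_write" };
	static struct fprobe my_fprobe = { .entry_handler = my_entry_handler };
	int err;

	sort(my_syms, ARRAY_SIZE(my_syms), sizeof(*my_syms), symbols_cmp, NULL);
	err = register_fprobe_syms(&my_fprobe, my_syms, ARRAY_SIZE(my_syms));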

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-12  0:49     ` Masami Hiramatsu
@ 2022-04-12 22:51       ` Andrii Nakryiko
  2022-04-16 14:21         ` Masami Hiramatsu
  2022-04-13 16:44       ` Steven Rostedt
  1 sibling, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-12 22:51 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Mon, Apr 11, 2022 at 5:49 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 11 Apr 2022 15:15:40 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > > +#define DEBUGFS "/sys/kernel/debug/tracing/"
> > > +
> > > +static int get_syms(char ***symsp, size_t *cntp)
> > > +{
> > > +       size_t cap = 0, cnt = 0, i;
> > > +       char *name, **syms = NULL;
> > > +       struct hashmap *map;
> > > +       char buf[256];
> > > +       FILE *f;
> > > +       int err;
> > > +
> > > +       /*
> > > +        * The available_filter_functions contains many duplicates,
> > > +        * but other than that all symbols are usable in kprobe multi
> > > +        * interface.
> > > +        * Filtering out duplicates by using hashmap__add, which won't
> > > +        * add existing entry.
> > > +        */
> > > +       f = fopen(DEBUGFS "available_filter_functions", "r");
> >
> > I'm really curious how did you manage to attach to everything in
> > available_filter_functions because when I'm trying to do that I fail.
> > available_filter_functions has a bunch of functions that should not be
> > attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> >
> >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
>
> Hmm, this sounds like a bug in ftrace side. IIUC, the
> "available_filter_functions" only shows the functions which is NOT
> instrumented by mcount, we should not see any notrace functions on it.
>
> Technically, this is done by __no_instrument_function__ attribute.
>
> #if defined(CC_USING_HOTPATCH)
> #define notrace                 __attribute__((hotpatch(0, 0)))
> #elif defined(CC_USING_PATCHABLE_FUNCTION_ENTRY)
> #define notrace                 __attribute__((patchable_function_entry(0, 0)))
> #else
> #define notrace                 __attribute__((__no_instrument_function__))
> #endif
>
> >
> > So first, curious what I am doing wrong or rather why it succeeds in
> > your case ;)
> >
> > But second, just wanted to plea to "fix" available_filter_functions to
> > not list stuff that should not be attachable. Can you please take a
> > look and checks what's going on there and why do we have notrace
> > functions (and what else should *NOT* be there)?
>
> Can you share how did you reproduce the issue? I'll check it.
>

$ sudo cat /sys/kernel/debug/tracing/available_filter_functions | grep
__bpf_tramp
__bpf_tramp_image_release
__bpf_tramp_image_put_rcu_tasks
__bpf_tramp_image_put_rcu
__bpf_tramp_image_put_deferred
__bpf_tramp_exit


__bpf_tramp_exit is a notrace function, so it shouldn't be here. Notice
that __bpf_tramp_enter (which is also notrace) is not in
available_filter_functions.

So it's quite bizarre and inconsistent.

> Thank you,
>
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-12 15:44     ` Jiri Olsa
@ 2022-04-12 22:59       ` Andrii Nakryiko
  0 siblings, 0 replies; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-12 22:59 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Masami Hiramatsu, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Tue, Apr 12, 2022 at 8:44 AM Jiri Olsa <olsajiri@gmail.com> wrote:
>
> On Mon, Apr 11, 2022 at 03:15:40PM -0700, Andrii Nakryiko wrote:
>
> SNIP
>
> > > +static int get_syms(char ***symsp, size_t *cntp)
> > > +{
> > > +       size_t cap = 0, cnt = 0, i;
> > > +       char *name, **syms = NULL;
> > > +       struct hashmap *map;
> > > +       char buf[256];
> > > +       FILE *f;
> > > +       int err;
> > > +
> > > +       /*
> > > +        * The available_filter_functions contains many duplicates,
> > > +        * but other than that all symbols are usable in kprobe multi
> > > +        * interface.
> > > +        * Filtering out duplicates by using hashmap__add, which won't
> > > +        * add existing entry.
> > > +        */
> > > +       f = fopen(DEBUGFS "available_filter_functions", "r");
> >
> > I'm really curious how did you manage to attach to everything in
> > available_filter_functions because when I'm trying to do that I fail.
>
> the new code makes the differece ;-) so the main problem I could not
> use available_filter_functions functions before were cases like:
>
>   # cat available_filter_functions | grep sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>   sys_ni_syscall
>
> which when you try to resolve you'll find just one address:
>
>   # cat /proc/kallsyms | egrep 'T sys_ni_syscall'
>   ffffffff81170020 T sys_ni_syscall
>
> this is caused by entries like:
>     __SYSCALL(156, sys_ni_syscall)
>
> when generating syscalls for given arch
>
> this is handled by the new code by removing duplicates when
> reading available_filter_functions
>
>
>
> another case is the other way round, like with:
>
>   # cat /proc/kallsyms | grep 't t_next'
>   ffffffff8125c3f0 t t_next
>   ffffffff8126a320 t t_next
>   ffffffff81275de0 t t_next
>   ffffffff8127efd0 t t_next
>   ffffffff814d6660 t t_next
>
> that has just one 'ftrace-able' instance:
>
>   # cat available_filter_functions | grep '^t_next$'
>   t_next
>
> and this is handled by calling ftrace_location on address when
> resolving symbols, to ensure each reasolved symbol lives in ftrace
>
> > available_filter_functions has a bunch of functions that should not be
> > attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> >
> >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> >
> > So first, curious what I am doing wrong or rather why it succeeds in
> > your case ;)
> >
> > But second, just wanted to plea to "fix" available_filter_functions to
> > not list stuff that should not be attachable. Can you please take a
> > look and checks what's going on there and why do we have notrace
> > functions (and what else should *NOT* be there)?
>
> yes, seems like a bug ;-) it's in available_filter_functions
> but it does not have 'call __fentry__' at the entry..
>
> I was going to check on that, because you brought that up before,
> but did not get to it yet

yeah, see also my reply to Masami. __bpf_tramp_exit and
__bpf_tramp_enter are two specific examples. Both are marked notrace,
but one is in available_filter_functions and the other is not. Neither
should be attachable, but with this local change you can see that one
of them (__bpf_tramp_exit) is:

$ git diff
diff --git a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
index b9876b55fc0c..77cff034d427 100644
--- a/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/kprobe_multi_test.c
@@ -165,8 +165,8 @@ static void test_attach_api_pattern(void)
 {
        LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);

-       test_attach_api("bpf_fentry_test*", &opts);
-       test_attach_api("bpf_fentry_test?", NULL);
+       test_attach_api("__bpf_tramp_enter", &opts);
+       test_attach_api("__bpf_tramp_exit", NULL);
 }


$ sudo ./test_progs -t kprobe_multi/attach_api_pattern -v
bpf_testmod.ko is already unloaded.
Loading bpf_testmod.ko...
Successfully loaded bpf_testmod.ko.
test_kprobe_multi_test:PASS:load_kallsyms 0 nsec
test_attach_api:PASS:fentry_raw_skel_load 0 nsec
libbpf: prog 'test_kprobe': failed to attach: Invalid argument
test_attach_api:FAIL:bpf_program__attach_kprobe_multi_opts unexpected error: -22
test_attach_api:PASS:fentry_raw_skel_load 0 nsec
test_attach_api:PASS:bpf_program__attach_kprobe_multi_opts 0 nsec

Quite weird.



>
> >
> >
> > > +       if (!f)
> > > +               return -EINVAL;
> > > +
> > > +       map = hashmap__new(symbol_hash, symbol_equal, NULL);
> > > +       err = libbpf_get_error(map);
> > > +       if (err)
> > > +               goto error;
> > > +
> >
> > [...]
> >
> > > +
> > > +       attach_delta_ns = (attach_end_ns - attach_start_ns) / 1000000000.0;
> > > +       detach_delta_ns = (detach_end_ns - detach_start_ns) / 1000000000.0;
> > > +
> > > +       fprintf(stderr, "%s: found %lu functions\n", __func__, cnt);
> > > +       fprintf(stderr, "%s: attached in %7.3lfs\n", __func__, attach_delta_ns);
> > > +       fprintf(stderr, "%s: detached in %7.3lfs\n", __func__, detach_delta_ns);
> > > +
> > > +       if (attach_delta_ns > 2.0)
> > > +               PRINT_FAIL("attach time above 2 seconds\n");
> > > +       if (detach_delta_ns > 2.0)
> > > +               PRINT_FAIL("detach time above 2 seconds\n");
> >
> > see my reply on the cover letter, any such "2 second" assumption are
> > guaranteed to bite us. We've dealt with a lot of timing issues due to
> > CI being slower and more unpredictable in terms of performance, I'd
> > like to avoid dealing with one more case like that.
>
> right, I'll remove the check
>
> thanks,
> jirka

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-08  0:57   ` Masami Hiramatsu
@ 2022-04-13  7:27     ` Jiri Olsa
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-13  7:27 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	netdev, bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Fri, Apr 08, 2022 at 09:57:01AM +0900, Masami Hiramatsu wrote:

SNIP

> >  /*
> >   * These will be re-linked against their real values
> > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> >  	return __sprint_symbol(buffer, address, -1, 1, 1);
> >  }
> >  
> > +static int symbols_cmp(const void *a, const void *b)
> > +{
> > +	const char **str_a = (const char **) a;
> > +	const char **str_b = (const char **) b;
> > +
> > +	return strcmp(*str_a, *str_b);
> > +}
> > +
> > +struct kallsyms_data {
> > +	unsigned long *addrs;
> > +	const char **syms;
> > +	u32 cnt;
> > +	u32 found;
> 
> BTW, why do you use 'u32' for this arch independent code?
> I think 'size_t' will make its role clearer.

right, will change

> 
> > +};
> > +
> > +static int kallsyms_callback(void *data, const char *name,
> > +			     struct module *mod, unsigned long addr)
> > +{
> > +	struct kallsyms_data *args = data;
> > +
> > +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > +		return 0;
> > +
> > +	addr = ftrace_location(addr);
> > +	if (!addr)
> > +		return 0;
> > +
> > +	args->addrs[args->found++] = addr;
> > +	return args->found == args->cnt ? 1 : 0;
> > +}
> > +
> > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> 
> Ditto. I think 'size_t cnt' is better. 

ok, thanks

jirka

> 
> Thank you,
> 
> > +{
> > +	struct kallsyms_data args;
> > +
> > +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> > +
> > +	args.addrs = addrs;
> > +	args.syms = syms;
> > +	args.cnt = cnt;
> > +	args.found = 0;
> > +	kallsyms_on_each_symbol(kallsyms_callback, &args);
> > +
> > +	return args.found == args.cnt ? 0 : -EINVAL;
> > +}
> > +
> >  /* To avoid using get_symbol_offset for every symbol, we carry prefix along. */
> >  struct kallsym_iter {
> >  	loff_t pos;
> > -- 
> > 2.35.1
> > 
> 
> 
> -- 
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-12  0:49     ` Masami Hiramatsu
  2022-04-12 22:51       ` Andrii Nakryiko
@ 2022-04-13 16:44       ` Steven Rostedt
  2022-04-13 16:45         ` Andrii Nakryiko
  1 sibling, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2022-04-13 16:44 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Tue, 12 Apr 2022 09:49:23 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> > I'm really curious how did you manage to attach to everything in
> > available_filter_functions because when I'm trying to do that I fail.
> > available_filter_functions has a bunch of functions that should not be
> > attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> > 
> >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);  

Did you only use the "notrace" on the prototype? I see the semicolon at
the end of your comment. It only affects the actual function itself,
not the prototype.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-13 16:44       ` Steven Rostedt
@ 2022-04-13 16:45         ` Andrii Nakryiko
  2022-04-13 16:59           ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-13 16:45 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Wed, Apr 13, 2022 at 9:44 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Tue, 12 Apr 2022 09:49:23 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > > I'm really curious how did you manage to attach to everything in
> > > available_filter_functions because when I'm trying to do that I fail.
> > > available_filter_functions has a bunch of functions that should not be
> > > attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> > >
> > >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
>
> Did you only use the "notrace" on the prototype? I see the semicolon at
> the end of your comment. It only affects the actual function itself,
> not the prototype.

notrace is both on declaration and on definition, see kernel/bpf/trampoline.c:

void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
{
        percpu_ref_put(&tr->pcref);
}


>
> -- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-13 16:45         ` Andrii Nakryiko
@ 2022-04-13 16:59           ` Steven Rostedt
  2022-04-13 19:02             ` Andrii Nakryiko
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2022-04-13 16:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Wed, 13 Apr 2022 09:45:52 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> > Did you only use the "notrace" on the prototype? I see the semicolon at
> > the end of your comment. It only affects the actual function itself,
> > not the prototype.  
> 
> notrace is both on declaration and on definition, see kernel/bpf/trampoline.c:

OK. Note, it only needs to be on the function, the prototype doesn't do
anything. But that shouldn't be the issue.

> 
> void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
> {
>         percpu_ref_put(&tr->pcref);
> }
> 

What compiler are you using? as this seems to be a compiler bug.
Because it's not ftrace that picks what functions to trace, but the
compiler itself.

-- Steve



^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-13 16:59           ` Steven Rostedt
@ 2022-04-13 19:02             ` Andrii Nakryiko
  0 siblings, 0 replies; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-13 19:02 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Wed, Apr 13, 2022 at 9:59 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Wed, 13 Apr 2022 09:45:52 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > > Did you only use the "notrace" on the prototype? I see the semicolon at
> > > the end of your comment. It only affects the actual function itself,
> > > not the prototype.
> >
> > notrace is both on declaration and on definition, see kernel/bpf/trampoline.c:
>
> OK. Note, it only needs to be on the function, the prototype doesn't do
> anything. But that shouldn't be the issue.
>
> >
> > void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
> > {
> >         percpu_ref_put(&tr->pcref);
> > }
> >
>
> What compiler are you using? as this seems to be a compiler bug.
> Because it's not ftrace that picks what functions to trace, but the
> compiler itself.

I build my local kernel with

$ gcc --version
gcc (GCC) 11.1.1 20210623 (Red Hat 11.1.1-6)


But we have the same issue in our production kernels which are most
probably built with some other version of GCC, but I don't know which
one.


>
> -- Steve
>
>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-12 20:46     ` Jiri Olsa
@ 2022-04-15  0:47       ` Masami Hiramatsu
  2022-04-15 22:39         ` Jiri Olsa
  0 siblings, 1 reply; 45+ messages in thread
From: Masami Hiramatsu @ 2022-04-15  0:47 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

Hi Jiri,

Sorry for replying later.

On Tue, 12 Apr 2022 22:46:15 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> On Fri, Apr 08, 2022 at 04:19:25PM -0700, Alexei Starovoitov wrote:
> > On Thu, Apr 07, 2022 at 02:52:21PM +0200, Jiri Olsa wrote:
> > > Adding kallsyms_lookup_names function that resolves array of symbols
> > > with single pass over kallsyms.
> > > 
> > > The user provides array of string pointers with count and pointer to
> > > allocated array for resolved values.
> > > 
> > >   int kallsyms_lookup_names(const char **syms, u32 cnt,
> > >                             unsigned long *addrs)
> > > 
> > > Before we iterate kallsyms we sort user provided symbols by name and
> > > then use that in kallsyms iteration to find each kallsyms symbol in
> > > user provided symbols.
> > > 
> > > We also check each symbol to pass ftrace_location, because this API
> > > will be used for fprobe symbols resolving. This can be optional in
> > > future if there's a need.
> > > 
> > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  include/linux/kallsyms.h |  6 +++++
> > >  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
> > >  2 files changed, 54 insertions(+)
> > > 
> > > diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> > > index ce1bd2fbf23e..5320a5e77f61 100644
> > > --- a/include/linux/kallsyms.h
> > > +++ b/include/linux/kallsyms.h
> > > @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
> > >  #ifdef CONFIG_KALLSYMS
> > >  /* Lookup the address for a symbol. Returns 0 if not found. */
> > >  unsigned long kallsyms_lookup_name(const char *name);
> > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
> > >  
> > >  extern int kallsyms_lookup_size_offset(unsigned long addr,
> > >  				  unsigned long *symbolsize,
> > > @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
> > >  	return 0;
> > >  }
> > >  
> > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > > +{
> > > +	return -ERANGE;
> > > +}
> > > +
> > >  static inline int kallsyms_lookup_size_offset(unsigned long addr,
> > >  					      unsigned long *symbolsize,
> > >  					      unsigned long *offset)
> > > diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> > > index 79f2eb617a62..a3738ddf9e87 100644
> > > --- a/kernel/kallsyms.c
> > > +++ b/kernel/kallsyms.c
> > > @@ -29,6 +29,8 @@
> > >  #include <linux/compiler.h>
> > >  #include <linux/module.h>
> > >  #include <linux/kernel.h>
> > > +#include <linux/bsearch.h>
> > > +#include <linux/sort.h>
> > >  
> > >  /*
> > >   * These will be re-linked against their real values
> > > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> > >  	return __sprint_symbol(buffer, address, -1, 1, 1);
> > >  }
> > >  
> > > +static int symbols_cmp(const void *a, const void *b)
> > > +{
> > > +	const char **str_a = (const char **) a;
> > > +	const char **str_b = (const char **) b;
> > > +
> > > +	return strcmp(*str_a, *str_b);
> > > +}
> > > +
> > > +struct kallsyms_data {
> > > +	unsigned long *addrs;
> > > +	const char **syms;
> > > +	u32 cnt;
> > > +	u32 found;
> > > +};
> > > +
> > > +static int kallsyms_callback(void *data, const char *name,
> > > +			     struct module *mod, unsigned long addr)
> > > +{
> > > +	struct kallsyms_data *args = data;
> > > +
> > > +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > > +		return 0;
> > > +
> > > +	addr = ftrace_location(addr);
> > > +	if (!addr)
> > > +		return 0;
> > > +
> > > +	args->addrs[args->found++] = addr;
> > > +	return args->found == args->cnt ? 1 : 0;
> > > +}
> > > +
> > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > > +{
> > > +	struct kallsyms_data args;
> > > +
> > > +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> > 
> > It's nice to share symbols_cmp for sort and bsearch,
> > but messing technically input argument 'syms' like this will cause
> > issues sooner or later.
> > Lets make caller do the sort.
> > Unordered input will cause issue with bsearch, of course,
> > but it's a lesser evil. imo.
> > 
> 
> Masami,
> this logic bubbles up to the register_fprobe_syms, because user
> provides symbols as its argument. Can we still force this assumption
> to the 'syms' array, like with the comment change below?
> 
> FYI the bpf side does not use register_fprobe_syms, it uses
> register_fprobe_ips, because it always needs ips as search
> base for cookie values

Hmm, in that case fprobe can call sort() in the register function.
That will be much easier and safer. In the bpf case, the input array will
be generated by the bpftool (not by hand), so it can ensure the
syms array is sorted. But we don't know how an fprobe user passes the syms
array, so register_fprobe_syms() will always require sort(). I don't like
such redundant requirements.
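
A minimal sketch of that (not the final patch), assuming the
kallsyms_lookup_names()/register_fprobe_ips() signatures used in this
series and a local strcmp()-based comparator:

#include <linux/fprobe.h>
#include <linux/kallsyms.h>
#include <linux/slab.h>
#include <linux/sort.h>
#include <linux/string.h>

static int symbols_cmp(const void *a, const void *b)
{
	const char * const *sym_a = a;
	const char * const *sym_b = b;

	return strcmp(*sym_a, *sym_b);
}

int register_fprobe_syms(struct fprobe *fp, const char **syms, int num)
{
	unsigned long *addrs;
	int ret;

	if (!fp || !syms || num <= 0)
		return -EINVAL;

	/* sort in place so kallsyms_lookup_names() can bsearch the array */
	sort(syms, num, sizeof(*syms), symbols_cmp, NULL);

	addrs = kcalloc(num, sizeof(*addrs), GFP_KERNEL);
	if (!addrs)
		return -ENOMEM;

	ret = kallsyms_lookup_names(syms, num, addrs);
	if (!ret)
		ret = register_fprobe_ips(fp, addrs, num);

	kfree(addrs);
	return ret;
}
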

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function
  2022-04-15  0:47       ` Masami Hiramatsu
@ 2022-04-15 22:39         ` Jiri Olsa
  0 siblings, 0 replies; 45+ messages in thread
From: Jiri Olsa @ 2022-04-15 22:39 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, netdev, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Fri, Apr 15, 2022 at 09:47:27AM +0900, Masami Hiramatsu wrote:
> Hi Jiri,
> 
> Sorry for replying later.
> 
> On Tue, 12 Apr 2022 22:46:15 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
> 
> > On Fri, Apr 08, 2022 at 04:19:25PM -0700, Alexei Starovoitov wrote:
> > > On Thu, Apr 07, 2022 at 02:52:21PM +0200, Jiri Olsa wrote:
> > > > Adding kallsyms_lookup_names function that resolves array of symbols
> > > > with single pass over kallsyms.
> > > > 
> > > > The user provides array of string pointers with count and pointer to
> > > > allocated array for resolved values.
> > > > 
> > > >   int kallsyms_lookup_names(const char **syms, u32 cnt,
> > > >                             unsigned long *addrs)
> > > > 
> > > > Before we iterate kallsyms we sort user provided symbols by name and
> > > > then use that in kallsyms iteration to find each kallsyms symbol in
> > > > user provided symbols.
> > > > 
> > > > We also check each symbol to pass ftrace_location, because this API
> > > > will be used for fprobe symbols resolving. This can be optional in
> > > > future if there's a need.
> > > > 
> > > > Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  include/linux/kallsyms.h |  6 +++++
> > > >  kernel/kallsyms.c        | 48 ++++++++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 54 insertions(+)
> > > > 
> > > > diff --git a/include/linux/kallsyms.h b/include/linux/kallsyms.h
> > > > index ce1bd2fbf23e..5320a5e77f61 100644
> > > > --- a/include/linux/kallsyms.h
> > > > +++ b/include/linux/kallsyms.h
> > > > @@ -72,6 +72,7 @@ int kallsyms_on_each_symbol(int (*fn)(void *, const char *, struct module *,
> > > >  #ifdef CONFIG_KALLSYMS
> > > >  /* Lookup the address for a symbol. Returns 0 if not found. */
> > > >  unsigned long kallsyms_lookup_name(const char *name);
> > > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs);
> > > >  
> > > >  extern int kallsyms_lookup_size_offset(unsigned long addr,
> > > >  				  unsigned long *symbolsize,
> > > > @@ -103,6 +104,11 @@ static inline unsigned long kallsyms_lookup_name(const char *name)
> > > >  	return 0;
> > > >  }
> > > >  
> > > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > > > +{
> > > > +	return -ERANGE;
> > > > +}
> > > > +
> > > >  static inline int kallsyms_lookup_size_offset(unsigned long addr,
> > > >  					      unsigned long *symbolsize,
> > > >  					      unsigned long *offset)
> > > > diff --git a/kernel/kallsyms.c b/kernel/kallsyms.c
> > > > index 79f2eb617a62..a3738ddf9e87 100644
> > > > --- a/kernel/kallsyms.c
> > > > +++ b/kernel/kallsyms.c
> > > > @@ -29,6 +29,8 @@
> > > >  #include <linux/compiler.h>
> > > >  #include <linux/module.h>
> > > >  #include <linux/kernel.h>
> > > > +#include <linux/bsearch.h>
> > > > +#include <linux/sort.h>
> > > >  
> > > >  /*
> > > >   * These will be re-linked against their real values
> > > > @@ -572,6 +574,52 @@ int sprint_backtrace_build_id(char *buffer, unsigned long address)
> > > >  	return __sprint_symbol(buffer, address, -1, 1, 1);
> > > >  }
> > > >  
> > > > +static int symbols_cmp(const void *a, const void *b)
> > > > +{
> > > > +	const char **str_a = (const char **) a;
> > > > +	const char **str_b = (const char **) b;
> > > > +
> > > > +	return strcmp(*str_a, *str_b);
> > > > +}
> > > > +
> > > > +struct kallsyms_data {
> > > > +	unsigned long *addrs;
> > > > +	const char **syms;
> > > > +	u32 cnt;
> > > > +	u32 found;
> > > > +};
> > > > +
> > > > +static int kallsyms_callback(void *data, const char *name,
> > > > +			     struct module *mod, unsigned long addr)
> > > > +{
> > > > +	struct kallsyms_data *args = data;
> > > > +
> > > > +	if (!bsearch(&name, args->syms, args->cnt, sizeof(*args->syms), symbols_cmp))
> > > > +		return 0;
> > > > +
> > > > +	addr = ftrace_location(addr);
> > > > +	if (!addr)
> > > > +		return 0;
> > > > +
> > > > +	args->addrs[args->found++] = addr;
> > > > +	return args->found == args->cnt ? 1 : 0;
> > > > +}
> > > > +
> > > > +int kallsyms_lookup_names(const char **syms, u32 cnt, unsigned long *addrs)
> > > > +{
> > > > +	struct kallsyms_data args;
> > > > +
> > > > +	sort(syms, cnt, sizeof(*syms), symbols_cmp, NULL);
> > > 
> > > It's nice to share symbols_cmp for sort and bsearch,
> > > but messing technically input argument 'syms' like this will cause
> > > issues sooner or later.
> > > Lets make caller do the sort.
> > > Unordered input will cause issue with bsearch, of course,
> > > but it's a lesser evil. imo.
> > > 
> > 
> > Masami,
> > this logic bubbles up to the register_fprobe_syms, because user
> > provides symbols as its argument. Can we still force this assumption
> > to the 'syms' array, like with the comment change below?
> > 
> > FYI the bpf side does not use register_fprobe_syms, it uses
> > register_fprobe_ips, because it always needs ips as search
> > base for cookie values
> 
> Hmm, in that case fprobe can call sort() in the register function.
> That will be much easier and safer. In the bpf case, the input array will
> be generated by the bpftool (not by hand), so it can ensure the
> syms array is sorted. But we don't know how an fprobe user passes the syms
> array, so register_fprobe_syms() will always require sort(). I don't like
> such redundant requirements.

ok, I'll add it to register_fprobe_syms

thanks,
jirka

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-12 22:51       ` Andrii Nakryiko
@ 2022-04-16 14:21         ` Masami Hiramatsu
  2022-04-18 17:33           ` Steven Rostedt
  2022-04-28 13:58           ` Steven Rostedt
  0 siblings, 2 replies; 45+ messages in thread
From: Masami Hiramatsu @ 2022-04-16 14:21 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Steven Rostedt, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

Hi,

On Tue, 12 Apr 2022 15:51:43 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Mon, Apr 11, 2022 at 5:49 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Mon, 11 Apr 2022 15:15:40 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > > > +#define DEBUGFS "/sys/kernel/debug/tracing/"
> > > > +
> > > > +static int get_syms(char ***symsp, size_t *cntp)
> > > > +{
> > > > +       size_t cap = 0, cnt = 0, i;
> > > > +       char *name, **syms = NULL;
> > > > +       struct hashmap *map;
> > > > +       char buf[256];
> > > > +       FILE *f;
> > > > +       int err;
> > > > +
> > > > +       /*
> > > > +        * The available_filter_functions contains many duplicates,
> > > > +        * but other than that all symbols are usable in kprobe multi
> > > > +        * interface.
> > > > +        * Filtering out duplicates by using hashmap__add, which won't
> > > > +        * add existing entry.
> > > > +        */
> > > > +       f = fopen(DEBUGFS "available_filter_functions", "r");
> > >
> > > I'm really curious how did you manage to attach to everything in
> > > available_filter_functions because when I'm trying to do that I fail.
> > > available_filter_functions has a bunch of functions that should not be
> > > attachable (e.g., notrace functions). Look just at __bpf_tramp_exit:
> > >
> > >   void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr);
> >
> > Hmm, this sounds like a bug on the ftrace side. IIUC,
> > "available_filter_functions" only shows the functions which are
> > instrumented by mcount, so we should not see any notrace functions in it.
> >
> > Technically, this is done by __no_instrument_function__ attribute.
> >
> > #if defined(CC_USING_HOTPATCH)
> > #define notrace                 __attribute__((hotpatch(0, 0)))
> > #elif defined(CC_USING_PATCHABLE_FUNCTION_ENTRY)
> > #define notrace                 __attribute__((patchable_function_entry(0, 0)))
> > #else
> > #define notrace                 __attribute__((__no_instrument_function__))
> > #endif
> >
> > >
> > > So first, curious what I am doing wrong or rather why it succeeds in
> > > your case ;)
> > >
> > > But second, just wanted to plead to "fix" available_filter_functions to
> > > not list stuff that should not be attachable. Can you please take a
> > > look and check what's going on there and why we have notrace
> > > functions (and what else should *NOT* be there)?
> >
> > Can you share how did you reproduce the issue? I'll check it.
> >
> 
> $ sudo cat /sys/kernel/debug/tracing/available_filter_functions | grep
> __bpf_tramp
> __bpf_tramp_image_release
> __bpf_tramp_image_put_rcu_tasks
> __bpf_tramp_image_put_rcu
> __bpf_tramp_image_put_deferred
> __bpf_tramp_exit
> 
> 
> __bpf_tramp_exit is notrace function, so shouldn't be here. Notice
> that __bpf_tramp_enter (which is also notrace) are not in
> available_filter_functions.

OK, I also confirmed that __bpf_tramp_exit is listed. (The others do not seem to be notrace.)

/sys/kernel/tracing # cat available_filter_functions | grep __bpf_tramp
__bpf_tramp_image_release
__bpf_tramp_image_put_rcu
__bpf_tramp_image_put_rcu_tasks
__bpf_tramp_image_put_deferred
__bpf_tramp_exit

My gcc is older one.
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1) 

But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed) 

ffffffff81208270 <__bpf_tramp_exit>:
ffffffff81208270:       55                      push   %rbp
ffffffff81208271:       48 89 e5                mov    %rsp,%rbp
ffffffff81208274:       53                      push   %rbx
ffffffff81208275:       48 89 fb                mov    %rdi,%rbx
ffffffff81208278:       e8 83 70 ef ff          callq  ffffffff810ff300 <__rcu_read_lock>
ffffffff8120827d:       31 d2                   xor    %edx,%edx


> 
> So it's quite bizarre and inconsistent.

Indeed. I guess there is a bug in scripts/recordmcount.pl.

Thank you,

> 
> > Thank you,
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-16 14:21         ` Masami Hiramatsu
@ 2022-04-18 17:33           ` Steven Rostedt
  2022-04-28 13:58           ` Steven Rostedt
  1 sibling, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2022-04-18 17:33 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Sat, 16 Apr 2022 23:21:03 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed) 
> 
> ffffffff81208270 <__bpf_tramp_exit>:
> ffffffff81208270:       55                      push   %rbp
> ffffffff81208271:       48 89 e5                mov    %rsp,%rbp
> ffffffff81208274:       53                      push   %rbx
> ffffffff81208275:       48 89 fb                mov    %rdi,%rbx
> ffffffff81208278:       e8 83 70 ef ff          callq  ffffffff810ff300 <__rcu_read_lock>
> ffffffff8120827d:       31 d2                   xor    %edx,%edx
> 
> 
> > 
> > So it's quite bizarre and inconsistent.  
> 
> Indeed. I guess there is a bug in scripts/recordmcount.pl.

Actually, x86 doesn't use that script. It either uses the C version, or
with latest gcc, it is created by the compiler itself.

I'll look deeper into it.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-16 14:21         ` Masami Hiramatsu
  2022-04-18 17:33           ` Steven Rostedt
@ 2022-04-28 13:58           ` Steven Rostedt
  2022-04-28 13:59             ` Steven Rostedt
  2022-04-28 18:59             ` Alexei Starovoitov
  1 sibling, 2 replies; 45+ messages in thread
From: Steven Rostedt @ 2022-04-28 13:58 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Sat, 16 Apr 2022 23:21:03 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> OK, I also confirmed that __bpf_tramp_exit is listed. (The others do not seem to be notrace.)
> 
> /sys/kernel/tracing # cat available_filter_functions | grep __bpf_tramp
> __bpf_tramp_image_release
> __bpf_tramp_image_put_rcu
> __bpf_tramp_image_put_rcu_tasks
> __bpf_tramp_image_put_deferred
> __bpf_tramp_exit
> 
> My gcc is older one.
> gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1) 
> 
> But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed) 
> 
> ffffffff81208270 <__bpf_tramp_exit>:
> ffffffff81208270:       55                      push   %rbp
> ffffffff81208271:       48 89 e5                mov    %rsp,%rbp
> ffffffff81208274:       53                      push   %rbx
> ffffffff81208275:       48 89 fb                mov    %rdi,%rbx
> ffffffff81208278:       e8 83 70 ef ff          callq  ffffffff810ff300 <__rcu_read_lock>
> ffffffff8120827d:       31 d2                   xor    %edx,%edx

You need to look deeper ;-)
> 
> 
> > 
> > So it's quite bizarre and inconsistent.  
> 
> Indeed. I guess there is a bug in scripts/recordmcount.pl.

No there isn't.

I added the addresses it was mapping and found this:

ffffffffa828f680 T __bpf_tramp_exit

(which is relocated, but it's trivial to map it with the actual function).

At the end of that function we have:

ffffffff8128f767:       48 8d bb e0 00 00 00    lea    0xe0(%rbx),%rdi
ffffffff8128f76e:       48 8b 40 08             mov    0x8(%rax),%rax
ffffffff8128f772:       e8 89 28 d7 00          call   ffffffff82002000 <__x86_indirect_thunk_array>
                        ffffffff8128f773: R_X86_64_PLT32        __x86_indirect_thunk_rax-0x4
ffffffff8128f777:       e9 4a ff ff ff          jmp    ffffffff8128f6c6 <__bpf_tramp_exit+0x46>
ffffffff8128f77c:       0f 1f 40 00             nopl   0x0(%rax)
ffffffff8128f780:       e8 8b df dc ff          call   ffffffff8105d710 <__fentry__>
                        ffffffff8128f781: R_X86_64_PLT32        __fentry__-0x4
ffffffff8128f785:       b8 f4 fd ff ff          mov    $0xfffffdf4,%eax
ffffffff8128f78a:       c3                      ret    
ffffffff8128f78b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)


Notice the call to fentry!

It's due to this:

void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
{
	percpu_ref_put(&tr->pcref);
}

int __weak
arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
			    const struct btf_func_model *m, u32 flags,
			    struct bpf_tramp_progs *tprogs,
			    void *orig_call)
{
	return -ENOTSUPP;
}

The weak function gets a call to ftrace, but it still gets compiled into
vmlinux but its symbol is dropped due to it being overridden. Thus, the
mcount_loc finds this call to fentry, and maps it to the symbol that is
before it, which just happened to be __bpf_tramp_exit.

I made that weak function "notrace" and the __bpf_tramp_exit disappeared
from the available_filter_functions list.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 13:58           ` Steven Rostedt
@ 2022-04-28 13:59             ` Steven Rostedt
  2022-04-28 18:59             ` Alexei Starovoitov
  1 sibling, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2022-04-28 13:59 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, lkml, Martin KaFai Lau,
	Song Liu, Yonghong Song, John Fastabend, KP Singh

On Thu, 28 Apr 2022 09:58:03 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> I added the addresses it was mapping and found this:
> 
> ffffffffa828f680 T __bpf_tramp_exit

I cut and pasted the kallsyms entry, not the available_filter_functions
entry, which had this:

  __bpf_tramp_exit (ffffffffa828f780)

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 13:58           ` Steven Rostedt
  2022-04-28 13:59             ` Steven Rostedt
@ 2022-04-28 18:59             ` Alexei Starovoitov
  2022-04-28 20:05               ` Steven Rostedt
  1 sibling, 1 reply; 45+ messages in thread
From: Alexei Starovoitov @ 2022-04-28 18:59 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Masami Hiramatsu, Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Networking, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Thu, Apr 28, 2022 at 6:58 AM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Sat, 16 Apr 2022 23:21:03 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > OK, I also confirmed that __bpf_tramp_exit is listed. (The others do not seem to be notrace.)
> >
> > /sys/kernel/tracing # cat available_filter_functions | grep __bpf_tramp
> > __bpf_tramp_image_release
> > __bpf_tramp_image_put_rcu
> > __bpf_tramp_image_put_rcu_tasks
> > __bpf_tramp_image_put_deferred
> > __bpf_tramp_exit
> >
> > My gcc is older one.
> > gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.1)
> >
> > But it seems that __bpf_tramp_exit() doesn't call __fentry__. (I objdump'ed)
> >
> > ffffffff81208270 <__bpf_tramp_exit>:
> > ffffffff81208270:       55                      push   %rbp
> > ffffffff81208271:       48 89 e5                mov    %rsp,%rbp
> > ffffffff81208274:       53                      push   %rbx
> > ffffffff81208275:       48 89 fb                mov    %rdi,%rbx
> > ffffffff81208278:       e8 83 70 ef ff          callq  ffffffff810ff300 <__rcu_read_lock>
> > ffffffff8120827d:       31 d2                   xor    %edx,%edx
>
> You need to look deeper ;-)
> >
> >
> > >
> > > So it's quite bizarre and inconsistent.
> >
> > Indeed. I guess there is a bug in scripts/recordmcount.pl.
>
> No there isn't.
>
> I added the addresses it was mapping and found this:
>
> ffffffffa828f680 T __bpf_tramp_exit
>
> (which is relocated, but it's trivial to map it with the actual function).
>
> At the end of that function we have:
>
> ffffffff8128f767:       48 8d bb e0 00 00 00    lea    0xe0(%rbx),%rdi
> ffffffff8128f76e:       48 8b 40 08             mov    0x8(%rax),%rax
> ffffffff8128f772:       e8 89 28 d7 00          call   ffffffff82002000 <__x86_indirect_thunk_array>
>                         ffffffff8128f773: R_X86_64_PLT32        __x86_indirect_thunk_rax-0x4
> ffffffff8128f777:       e9 4a ff ff ff          jmp    ffffffff8128f6c6 <__bpf_tramp_exit+0x46>
> ffffffff8128f77c:       0f 1f 40 00             nopl   0x0(%rax)
> ffffffff8128f780:       e8 8b df dc ff          call   ffffffff8105d710 <__fentry__>
>                         ffffffff8128f781: R_X86_64_PLT32        __fentry__-0x4
> ffffffff8128f785:       b8 f4 fd ff ff          mov    $0xfffffdf4,%eax
> ffffffff8128f78a:       c3                      ret
> ffffffff8128f78b:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>
>
> Notice the call to fentry!
>
> It's due to this:
>
> void notrace __bpf_tramp_exit(struct bpf_tramp_image *tr)
> {
>         percpu_ref_put(&tr->pcref);
> }
>
> int __weak
> arch_prepare_bpf_trampoline(struct bpf_tramp_image *tr, void *image, void *image_end,
>                             const struct btf_func_model *m, u32 flags,
>                             struct bpf_tramp_progs *tprogs,
>                             void *orig_call)
> {
>         return -ENOTSUPP;
> }
>
> The weak function gets a call to ftrace, but it still gets compiled into
> vmlinux but its symbol is dropped due to it being overridden. Thus, the
> mcount_loc finds this call to fentry, and maps it to the symbol that is
> before it, which just happened to be __bpf_tramp_exit.

Ouch. That _is_ a bug in recordmcount.

> I made that weak function "notrace" and the __bpf_tramp_exit disappeared
> from the available_filter_functions list.

That's a hack. We cannot rely on such hacks for all weak functions.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 18:59             ` Alexei Starovoitov
@ 2022-04-28 20:05               ` Steven Rostedt
  2022-04-28 23:32                 ` Andrii Nakryiko
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2022-04-28 20:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Masami Hiramatsu, Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Networking, bpf, lkml,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh

On Thu, 28 Apr 2022 11:59:55 -0700
Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:

> > The weak function gets a call to ftrace, but it still gets compiled into
> > vmlinux but its symbol is dropped due to it being overridden. Thus, the
> > mcount_loc finds this call to fentry, and maps it to the symbol that is
> > before it, which just happened to be __bpf_tramp_exit.  
> 
> Ouch. That _is_ a bug in recordmcount.

Exactly HOW is it a bug in recordmcount?

The job of recordmcount is to create a section of all the locations that
call fentry. That is EXACTLY what it did. No bug there! It did its job.

In fact, recordmcount probably didn't even get called. If you see this on
x86 with gcc version greater than 8 (which I do), recordmcount is not even
used. gcc creates this section internally instead.

> 
> > I made that weak function "notrace" and the __bpf_tramp_exit disappeared
> > from the available_filter_functions list.  
> 
> That's a hack. We cannot rely on such hacks for all weak functions.

Then don't do anything. The only thing this bug causes is perhaps some
confusion, because functions before weak functions that are overridden will
be listed incorrectly in the available_filter_functions file. And that's
because of the way it is created with respect to kallsyms.

If you enable __bpf_tramp_exit, it will not do anything to that function.
What it will do is enable the location inside of the weak function that no
longer has its symbol shown.

One solution is to simply get the end of the function that is provided by
kallsyms to make sure the fentry call location is inside the function, and
if it is not, then not show that function in available_filter_functions but
instead show something like "** unnamed function **" or whatever.

I could write a patch to do that when I get the time. But because the only
thing this causes is some confusion among users, and it does not cause any
functional issue, it is low priority.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 20:05               ` Steven Rostedt
@ 2022-04-28 23:32                 ` Andrii Nakryiko
  2022-04-28 23:53                   ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Andrii Nakryiko @ 2022-04-28 23:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Alexei Starovoitov, Masami Hiramatsu, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu, Apr 28, 2022 at 1:05 PM Steven Rostedt <rostedt@goodmis.org> wrote:
>
> On Thu, 28 Apr 2022 11:59:55 -0700
> Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> > > The weak function gets a call to ftrace, but it still gets compiled into
> > > vmlinux but its symbol is dropped due to it being overridden. Thus, the
> > > mcount_loc finds this call to fentry, and maps it to the symbol that is
> > > before it, which just happened to be __bpf_tramp_exit.
> >
> > Ouch. That _is_ a bug in recordmcount.
>
> Exactly HOW is it a bug in recordmcount?
>
> The job of recordmcount is to create a section of all the locations that
> call fentry. That is EXACTLY what it did. No bug there! It did its job.

But that __fentry__ call is not part of __bpf_tramp_exit, actually.
Whether to call it a bug or limitation is secondary. It marks
__bpf_tramp_exit as attachable through kprobe/ftrace while it really
isn't.

Below you are saying there is only user confusion. It's not just
confusion. You'll get an error when you try to attach to
__bpf_tramp_exit because __bpf_tramp_exit doesn't really have an
__fentry__ preamble, and thus the kernel itself will reject it as a
target. So when you build a generic tracing tool that fetches all the
attachable kprobes and filters out all the blacklisted ones, you still
end up with kprobe targets that are not attachable. It's definitely
more than an inconvenience, as I experienced first hand.

Can recordmcount or whoever does this be taught to use proper FUNC
symbol size to figure out boundaries of the function?

$ readelf -s ~/linux-build/default/vmlinux | rg __bpf_tramp_exit
129408: ffffffff811b2ba0    63 FUNC    GLOBAL DEFAULT    1 __bpf_tramp_exit

So only the first 63 bytes of instruction after __bpf_tramp_exit
should be taken into account. Everything else doesn't belong to
__bpf_tramp_exit. So even though objdump pretends that call __fentry__
is part of __bpf_tramp_exit, it's not.

ffffffff811b2ba0 <__bpf_tramp_exit>:
ffffffff811b2ba0:       53                      push   %rbx
ffffffff811b2ba1:       48 89 fb                mov    %rdi,%rbx
ffffffff811b2ba4:       e8 97 d2 f2 ff          call   ffffffff810dfe40 <__rcu_read_lock>
ffffffff811b2ba9:       48 8b 83 e0 00 00 00    mov    0xe0(%rbx),%rax
ffffffff811b2bb0:       a8 03                   test   $0x3,%al
ffffffff811b2bb2:       75 0a                   jne    ffffffff811b2bbe <__bpf_tramp_exit+0x1e>
ffffffff811b2bb4:       65 48 ff 08             decq   %gs:(%rax)
ffffffff811b2bb8:       5b                      pop    %rbx
ffffffff811b2bb9:       e9 d2 0e f3 ff          jmp    ffffffff810e3a90 <__rcu_read_unlock>
ffffffff811b2bbe:       48 8b 83 e8 00 00 00    mov    0xe8(%rbx),%rax
ffffffff811b2bc5:       f0 48 83 28 01          lock subq $0x1,(%rax)
ffffffff811b2bca:       75 ec                   jne    ffffffff811b2bb8 <__bpf_tramp_exit+0x18>
ffffffff811b2bcc:       48 8b 83 e8 00 00 00    mov    0xe8(%rbx),%rax
ffffffff811b2bd3:       48 8d bb e0 00 00 00    lea    0xe0(%rbx),%rdi
ffffffff811b2bda:       ff 50 08                call   *0x8(%rax)
ffffffff811b2bdd:       eb d9                   jmp    ffffffff811b2bb8 <__bpf_tramp_exit+0x18>
ffffffff811b2bdf:       90                      nop

^^^ ffffffff811b2ba0 + 63 = ffffffff811b2bdf -- this is the end of __bpf_tramp_exit

ffffffff811b2be0:       e8 3b 9c e9 ff          call   ffffffff8104c820 <__fentry__>
ffffffff811b2be5:       b8 f4 fd ff ff          mov    $0xfffffdf4,%eax
ffffffff811b2bea:       c3                      ret
ffffffff811b2beb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
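
A minimal kernel-side sketch of the boundary check being asked for here,
using kallsyms_lookup_size_offset() (the helper name below is made up for
illustration). As later replies in this thread show, kallsyms currently
folds an overridden weak function into the preceding symbol's size, so
this check alone would not be enough:

#include <linux/kallsyms.h>
#include <linux/types.h>

/* true if ip falls inside the size kallsyms reports for its symbol */
static bool ip_is_inside_its_symbol(unsigned long ip)
{
	unsigned long size, offset;

	if (!kallsyms_lookup_size_offset(ip, &size, &offset))
		return false;

	return offset < size;
}
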


>
> In fact, recordmcount probably didn't even get called. If you see this on
> x86 with gcc version greater than 8 (which I do), recordmcount is not even
> used. gcc creates this section internally instead.
>
> >
> > > I made that weak function "notrace" and the __bpf_tramp_exit disappeared
> > > from the available_filter_functions list.
> >
> > That's a hack. We cannot rely on such hacks for all weak functions.
>
> Then don't do anything. The only thing this bug causes is perhaps some
> confusion, because functions before weak functions that are overridden will
> be listed incorrectly in the available_filter_functions file. And that's
> because of the way it is created with respect to kallsyms.
>
> If you enable __bpf_tramp_exit, it will not do anything to that function.
> What it will do is enable the location inside of the weak function that no
> longer has its symbol shown.
>
> One solution is to simply get the end of the function that is provided by
> kallsyms to make sure the fentry call location is inside the function, and
> if it is not, then not show that function in available_filter_functions but
> instead show something like "** unnamed function **" or whatever.
>
> I could write a patch to do that when I get the time. But because the only
> thing this causes is some confusion among users, and it does not cause any
> functional issue, it is low priority.
>
> -- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 23:32                 ` Andrii Nakryiko
@ 2022-04-28 23:53                   ` Steven Rostedt
  2022-04-29  0:09                     ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2022-04-28 23:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Masami Hiramatsu, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu, 28 Apr 2022 16:32:20 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > The job of recordmcount is to create a section of all the locations that
> > call fentry. That is EXACTLY what it did. No bug there! It did its job.  
> 
> But that __fentry__ call is not part of __bpf_tramp_exit, actually.
> Whether to call it a bug or limitation is secondary. It marks
> __bpf_tramp_exit as attachable through kprobe/ftrace while it really
> isn't.

I'm confused by what you mean by "marks __bpf_tramp_exit as attachable"?
What does? Where does it get that information? Does it read
available_filter_functions?

recordmcount isn't responsible for any of that, you are thinking of
kallsyms. Specifically *printf("%ps"). Because that's where the name comes
from. Anytime you print an address with "%ps" on a weak function that has
been overridden, it will give you the symbol before it. I guess you can
call it a bug in the "%ps" logic.


> 
> Below you are saying there is only user confusion. It's not just
> confusion. You'll get an error when you try to attach to
> __bpf_tramp_exit because __bpf_tramp_exit doesn't really have
> __fentry__ preamble and thus the kernel itself will reject it as a
> target. So when you build a generic tracing tool that fetches all the
> attachable kprobes, filters out all the blacklisted ones, you still
> end up with kprobe targets that are not attachable. It's definitely
> more than an inconvenience which I experienced first hand.
> 
> Can recordmcount or whoever does this be taught to use proper FUNC
> symbol size to figure out boundaries of the function?

kallsyms needs to do it. All recordmcount does is to give the locations of
the calls to fentry. It only gives addresses and does not give any symbol
information. Stop blaming recordmcount!

> 
> $ readelf -s ~/linux-build/default/vmlinux | rg __bpf_tramp_exit
> 129408: ffffffff811b2ba0    63 FUNC    GLOBAL DEFAULT    1 __bpf_tramp_exit
> 
> So only the first 63 bytes of instruction after __bpf_tramp_exit
> should be taken into account. Everything else doesn't belong to
> __bpf_tramp_exit. So even though objdump pretends that call __fentry__
> is part of __bpf_tramp_exit, it's not.
> 
> ffffffff811b2ba0 <__bpf_tramp_exit>:
> ffffffff811b2ba0:       53                      push   %rbx
> ffffffff811b2ba1:       48 89 fb                mov    %rdi,%rbx
> ffffffff811b2ba4:       e8 97 d2 f2 ff          call   ffffffff810dfe40 <__rcu_read_lock>
> ffffffff811b2ba9:       48 8b 83 e0 00 00 00    mov    0xe0(%rbx),%rax
> ffffffff811b2bb0:       a8 03                   test   $0x3,%al
> ffffffff811b2bb2:       75 0a                   jne    ffffffff811b2bbe <__bpf_tramp_exit+0x1e>
> ffffffff811b2bb4:       65 48 ff 08             decq   %gs:(%rax)
> ffffffff811b2bb8:       5b                      pop    %rbx
> ffffffff811b2bb9:       e9 d2 0e f3 ff          jmp    ffffffff810e3a90 <__rcu_read_unlock>
> ffffffff811b2bbe:       48 8b 83 e8 00 00 00    mov    0xe8(%rbx),%rax
> ffffffff811b2bc5:       f0 48 83 28 01          lock subq $0x1,(%rax)
> ffffffff811b2bca:       75 ec                   jne    ffffffff811b2bb8 <__bpf_tramp_exit+0x18>
> ffffffff811b2bcc:       48 8b 83 e8 00 00 00    mov    0xe8(%rbx),%rax
> ffffffff811b2bd3:       48 8d bb e0 00 00 00    lea    0xe0(%rbx),%rdi
> ffffffff811b2bda:       ff 50 08                call   *0x8(%rax)
> ffffffff811b2bdd:       eb d9                   jmp    ffffffff811b2bb8 <__bpf_tramp_exit+0x18>
> ffffffff811b2bdf:       90                      nop
> 
> ^^^ ffffffff811b2ba0 + 63 = ffffffff811b2bdf -- this is the end of __bpf_tramp_exit
> 
> ffffffff811b2be0:       e8 3b 9c e9 ff          call   ffffffff8104c820 <__fentry__>
> ffffffff811b2be5:       b8 f4 fd ff ff          mov    $0xfffffdf4,%eax
> ffffffff811b2bea:       c3                      ret
> ffffffff811b2beb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> 
> 


> > One solution is to simply get the end of the function that is provided by
> > kallsyms to make sure the fentry call location is inside the function, and
> > if it is not, then not show that function in available_filter_functions but
> > instead show something like "** unnamed function **" or whatever.

Do the above. The names in available_filter_functions are derived from
kallsyms. There's a way to also ask kallsyms to give you the end pointer
of the function address. The only thing that available_filter_functions
does is to print the location found by recordmcount with a "%ps".

If you don't want it to show up in available_filter_functions, then you
need to open code the %ps onto kallsyms lookup and then compare the
function with the end (if it is found). Or fix %ps.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-28 23:53                   ` Steven Rostedt
@ 2022-04-29  0:09                     ` Steven Rostedt
  2022-04-29  0:31                       ` Steven Rostedt
  0 siblings, 1 reply; 45+ messages in thread
From: Steven Rostedt @ 2022-04-29  0:09 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Masami Hiramatsu, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu, 28 Apr 2022 19:53:03 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> > But that __fentry__ call is not part of __bpf_tramp_exit, actually.
> > Whether to call it a bug or limitation is secondary. It marks
> > __bpf_tramp_exit as attachable through kprobe/ftrace while it really
> > isn't.  
> 
> I'm confused by what you mean by "marks __bpf_tramp_exit as attachable"?
> What does? Where does it get that information? Does it read
> available_filter_functions?

OK, I think I see the issue you have. The functions shown in
available_filter_functions are printed with a simple "%ps" to show the
function name:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/trace/ftrace.c#n3692

And the code that does the actual matching uses kallsyms_lookup()

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/trace/ftrace.c#n4017

Which appears not to match the function for the address; you can't pass in
__bpf_tramp_exit because it won't match the symbol returned by
kallsyms_lookup.

This does indeed look like a bug in %ps.

But in the mean time, I could open code %ps and see if that fixes it. I'll
give it try.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC bpf-next 4/4] selftests/bpf: Add attach bench test
  2022-04-29  0:09                     ` Steven Rostedt
@ 2022-04-29  0:31                       ` Steven Rostedt
  0 siblings, 0 replies; 45+ messages in thread
From: Steven Rostedt @ 2022-04-29  0:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Masami Hiramatsu, Jiri Olsa,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, lkml, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh

On Thu, 28 Apr 2022 20:09:45 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> OK, I think I see the issue you have. The functions shown in
> available_filter_functions are printed with a simple "%ps" to show the
> function name:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/trace/ftrace.c#n3692
> 
> And the code that does the actual matching uses kallsyms_lookup()
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/trace/ftrace.c#n4017
> 
> Which appears not to match the function for the address; you can't pass in
> __bpf_tramp_exit because it won't match the symbol returned by
> kallsyms_lookup.

Never mind, in testing this I had marked the weak function as notrace,
which was the reason I couldn't add it to the set_ftrace_notrace.

After removing the notrace, kallsyms_lookup() doesn't make a difference. It
appears that kallsyms will include overridden weak functions into the size
of the function before it. I tried:

	ret = kallsyms_lookup(rec->ip, &size, &offset, &modname, str);
	if (!ret || offset > size) {
		seq_printf(m, "no function at %lx", rec->ip);
	} else {
		seq_printf(m, "%s", str);
		if (modname)
			seq_printf(m, " [%s]", modname);
	}

And it made no difference.

> 
> This does indeed look like a bug in %ps.
> 

Yes, this does appear to be an issue with kallsyms in general.

-- Steve

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2022-04-29  0:32 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-07 12:52 [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Jiri Olsa
2022-04-07 12:52 ` [RFC bpf-next 1/4] kallsyms: Add kallsyms_lookup_names function Jiri Olsa
2022-04-08  0:57   ` Masami Hiramatsu
2022-04-13  7:27     ` Jiri Olsa
2022-04-08 23:19   ` Alexei Starovoitov
2022-04-09 20:24     ` Jiri Olsa
2022-04-12 20:46     ` Jiri Olsa
2022-04-15  0:47       ` Masami Hiramatsu
2022-04-15 22:39         ` Jiri Olsa
2022-04-11 22:15   ` Andrii Nakryiko
2022-04-12 20:28     ` Jiri Olsa
2022-04-07 12:52 ` [RFC bpf-next 2/4] fprobe: Resolve symbols with kallsyms_lookup_names Jiri Olsa
2022-04-08  0:58   ` Masami Hiramatsu
2022-04-07 12:52 ` [RFC bpf-next 3/4] bpf: Resolve symbols with kallsyms_lookup_names for kprobe multi link Jiri Olsa
2022-04-08 23:26   ` Alexei Starovoitov
2022-04-09 20:24     ` Jiri Olsa
2022-04-11 22:15     ` Andrii Nakryiko
2022-04-11 22:15   ` Andrii Nakryiko
2022-04-12 18:42     ` Jiri Olsa
2022-04-07 12:52 ` [RFC bpf-next 4/4] selftests/bpf: Add attach bench test Jiri Olsa
2022-04-11 22:15   ` Andrii Nakryiko
2022-04-12  0:49     ` Masami Hiramatsu
2022-04-12 22:51       ` Andrii Nakryiko
2022-04-16 14:21         ` Masami Hiramatsu
2022-04-18 17:33           ` Steven Rostedt
2022-04-28 13:58           ` Steven Rostedt
2022-04-28 13:59             ` Steven Rostedt
2022-04-28 18:59             ` Alexei Starovoitov
2022-04-28 20:05               ` Steven Rostedt
2022-04-28 23:32                 ` Andrii Nakryiko
2022-04-28 23:53                   ` Steven Rostedt
2022-04-29  0:09                     ` Steven Rostedt
2022-04-29  0:31                       ` Steven Rostedt
2022-04-13 16:44       ` Steven Rostedt
2022-04-13 16:45         ` Andrii Nakryiko
2022-04-13 16:59           ` Steven Rostedt
2022-04-13 19:02             ` Andrii Nakryiko
2022-04-12 15:44     ` Jiri Olsa
2022-04-12 22:59       ` Andrii Nakryiko
2022-04-08 23:29 ` [RFC bpf-next 0/4] bpf: Speed up symbol resolving in kprobe multi link Alexei Starovoitov
2022-04-09 20:24   ` Jiri Olsa
2022-04-11 22:15     ` Andrii Nakryiko
2022-04-11 22:18       ` Alexei Starovoitov
2022-04-11 22:21         ` Andrii Nakryiko
2022-04-12 15:46           ` Jiri Olsa
