All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support
@ 2022-07-14  7:07 Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 1/5] libbpf: generalize virtual __kconfig externs and use it for USDT Andrii Nakryiko
                   ` (6 more replies)
  0 siblings, 7 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team

Add SEC("ksyscall")/SEC("kretsyscall") sections and corresponding
bpf_program__attach_ksyscall() API that simplifies tracing kernel syscalls
through kprobe mechanism. Kprobing syscalls isn't trivial due to varying
syscall handler names in the kernel and various ways syscall argument are
passed, depending on kernel architecture and configuration. SEC("ksyscall")
allows user to not care about such details and just get access to syscall
input arguments, while libbpf takes care of necessary feature detection logic.

There are still more quirks that are not straightforward to hide completely
(see comments about mmap(), clone() and compat syscalls), so in such more
advanced scenarios user might need to fall back to plain SEC("kprobe")
approach, but for absolute majority of users SEC("ksyscall") is a big
improvement.

As part of this patch set libbpf adds two more virtual __kconfig externs, in
addition to existing LINUX_KERNEL_VERSION: LINUX_HAS_BPF_COOKIE and
LINUX_HAS_SYSCALL_WRAPPER, which let's libbpf-provided BPF-side code minimize
external dependencies and assumptions and let's user-space part of libbpf to
perform all the feature detection logic. This benefits USDT support code,
which now doesn't depend on BPF CO-RE for its functionality.

v1->v2:
  - normalize extern variable-related warn and debug message formats (Alan);
rfc->v1:
  - drop dependency on kallsyms and speed up SYSCALL_WRAPPER detection (Alexei);
  - drop dependency on /proc/config.gz in bpf_tracing.h (Yaniv);
  - add doc comment and ephasize mmap(), clone() and compat quirks that are
    not supported (Ilya);
  - use mechanism similar to LINUX_KERNEL_VERSION to also improve USDT code.

Andrii Nakryiko (5):
  libbpf: generalize virtual __kconfig externs and use it for USDT
  selftests/bpf: add test of __weak unknown virtual __kconfig extern
  libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL
  libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
  selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests

 tools/lib/bpf/bpf_tracing.h                   |  51 +++--
 tools/lib/bpf/libbpf.c                        | 214 ++++++++++++++----
 tools/lib/bpf/libbpf.h                        |  46 ++++
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_internal.h               |   2 +
 tools/lib/bpf/usdt.bpf.h                      |  16 +-
 .../selftests/bpf/prog_tests/core_extern.c    |  17 +-
 .../selftests/bpf/progs/bpf_syscall_macro.c   |   6 +-
 .../selftests/bpf/progs/test_attach_probe.c   |  15 +-
 .../selftests/bpf/progs/test_core_extern.c    |   3 +
 .../selftests/bpf/progs/test_probe_user.c     |  27 +--
 11 files changed, 289 insertions(+), 109 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v2 bpf-next 1/5] libbpf: generalize virtual __kconfig externs and use it for USDT
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
@ 2022-07-14  7:07 ` Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 2/5] selftests/bpf: add test of __weak unknown virtual __kconfig extern Andrii Nakryiko
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team, Alan Maguire

Libbpf supports single virtual __kconfig extern currently: LINUX_KERNEL_VERSION.
LINUX_KERNEL_VERSION isn't coming from /proc/kconfig.gz and is intead
customly filled out by libbpf.

This patch generalizes this approach to support more such virtual
__kconfig externs. One such extern added in this patch is
LINUX_HAS_BPF_COOKIE which is used for BPF-side USDT supporting code in
usdt.bpf.h instead of using CO-RE-based enum detection approach for
detecting bpf_get_attach_cookie() BPF helper. This allows to remove
otherwise not needed CO-RE dependency and keeps user-space and BPF-side
parts of libbpf's USDT support strictly in sync in terms of their
feature detection.

We'll use similar approach for syscall wrapper detection for
BPF_KSYSCALL() BPF-side macro in follow up patch.

Generally, currently libbpf reserves CONFIG_ prefix for Kconfig values
and LINUX_ for virtual libbpf-backed externs. In the future we might
extend the set of prefixes that are supported. This can be done without
any breaking changes, as currently any __kconfig extern with
unrecognized name is rejected.

For LINUX_xxx externs we support the normal "weak rule": if libbpf
doesn't recognize given LINUX_xxx extern but such extern is marked as
__weak, it is not rejected and defaults to zero.  This follows
CONFIG_xxx handling logic and will allow BPF applications to
opportunistically use newer libbpf virtual externs without breaking on
older libbpf versions unnecessarily.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c   | 95 +++++++++++++++++++++++++++-------------
 tools/lib/bpf/usdt.bpf.h | 16 +------
 2 files changed, 66 insertions(+), 45 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 68da1aca406c..ee3176859e76 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1694,7 +1694,7 @@ static int set_kcfg_value_tri(struct extern_desc *ext, void *ext_val,
 	switch (ext->kcfg.type) {
 	case KCFG_BOOL:
 		if (value == 'm') {
-			pr_warn("extern (kcfg) %s=%c should be tristate or char\n",
+			pr_warn("extern (kcfg) '%s': value '%c' implies tristate or char type\n",
 				ext->name, value);
 			return -EINVAL;
 		}
@@ -1715,7 +1715,7 @@ static int set_kcfg_value_tri(struct extern_desc *ext, void *ext_val,
 	case KCFG_INT:
 	case KCFG_CHAR_ARR:
 	default:
-		pr_warn("extern (kcfg) %s=%c should be bool, tristate, or char\n",
+		pr_warn("extern (kcfg) '%s': value '%c' implies bool, tristate, or char type\n",
 			ext->name, value);
 		return -EINVAL;
 	}
@@ -1729,7 +1729,8 @@ static int set_kcfg_value_str(struct extern_desc *ext, char *ext_val,
 	size_t len;
 
 	if (ext->kcfg.type != KCFG_CHAR_ARR) {
-		pr_warn("extern (kcfg) %s=%s should be char array\n", ext->name, value);
+		pr_warn("extern (kcfg) '%s': value '%s' implies char array type\n",
+			ext->name, value);
 		return -EINVAL;
 	}
 
@@ -1743,7 +1744,7 @@ static int set_kcfg_value_str(struct extern_desc *ext, char *ext_val,
 	/* strip quotes */
 	len -= 2;
 	if (len >= ext->kcfg.sz) {
-		pr_warn("extern (kcfg) '%s': long string config %s of (%zu bytes) truncated to %d bytes\n",
+		pr_warn("extern (kcfg) '%s': long string '%s' of (%zu bytes) truncated to %d bytes\n",
 			ext->name, value, len, ext->kcfg.sz - 1);
 		len = ext->kcfg.sz - 1;
 	}
@@ -1800,13 +1801,20 @@ static bool is_kcfg_value_in_range(const struct extern_desc *ext, __u64 v)
 static int set_kcfg_value_num(struct extern_desc *ext, void *ext_val,
 			      __u64 value)
 {
-	if (ext->kcfg.type != KCFG_INT && ext->kcfg.type != KCFG_CHAR) {
-		pr_warn("extern (kcfg) %s=%llu should be integer\n",
+	if (ext->kcfg.type != KCFG_INT && ext->kcfg.type != KCFG_CHAR &&
+	    ext->kcfg.type != KCFG_BOOL) {
+		pr_warn("extern (kcfg) '%s': value '%llu' implies integer, char, or boolean type\n",
 			ext->name, (unsigned long long)value);
 		return -EINVAL;
 	}
+	if (ext->kcfg.type == KCFG_BOOL && value > 1) {
+		pr_warn("extern (kcfg) '%s': value '%llu' isn't boolean compatible\n",
+			ext->name, (unsigned long long)value);
+		return -EINVAL;
+
+	}
 	if (!is_kcfg_value_in_range(ext, value)) {
-		pr_warn("extern (kcfg) %s=%llu value doesn't fit in %d bytes\n",
+		pr_warn("extern (kcfg) '%s': value '%llu' doesn't fit in %d bytes\n",
 			ext->name, (unsigned long long)value, ext->kcfg.sz);
 		return -ERANGE;
 	}
@@ -1870,16 +1878,19 @@ static int bpf_object__process_kconfig_line(struct bpf_object *obj,
 		/* assume integer */
 		err = parse_u64(value, &num);
 		if (err) {
-			pr_warn("extern (kcfg) %s=%s should be integer\n",
-				ext->name, value);
+			pr_warn("extern (kcfg) '%s': value '%s' isn't a valid integer\n", ext->name, value);
 			return err;
 		}
+		if (ext->kcfg.type != KCFG_INT && ext->kcfg.type != KCFG_CHAR) {
+			pr_warn("extern (kcfg) '%s': value '%s' implies integer type\n", ext->name, value);
+			return -EINVAL;
+		}
 		err = set_kcfg_value_num(ext, ext_val, num);
 		break;
 	}
 	if (err)
 		return err;
-	pr_debug("extern (kcfg) %s=%s\n", ext->name, value);
+	pr_debug("extern (kcfg) '%s': set to %s\n", ext->name, value);
 	return 0;
 }
 
@@ -3687,7 +3698,7 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 			ext->kcfg.type = find_kcfg_type(obj->btf, t->type,
 						        &ext->kcfg.is_signed);
 			if (ext->kcfg.type == KCFG_UNKNOWN) {
-				pr_warn("extern (kcfg) '%s' type is unsupported\n", ext_name);
+				pr_warn("extern (kcfg) '%s': type is unsupported\n", ext_name);
 				return -ENOTSUP;
 			}
 		} else if (strcmp(sec_name, KSYMS_SEC) == 0) {
@@ -7287,14 +7298,14 @@ static int kallsyms_cb(unsigned long long sym_addr, char sym_type,
 		return 0;
 
 	if (ext->is_set && ext->ksym.addr != sym_addr) {
-		pr_warn("extern (ksym) '%s' resolution is ambiguous: 0x%llx or 0x%llx\n",
+		pr_warn("extern (ksym) '%s': resolution is ambiguous: 0x%llx or 0x%llx\n",
 			sym_name, ext->ksym.addr, sym_addr);
 		return -EINVAL;
 	}
 	if (!ext->is_set) {
 		ext->is_set = true;
 		ext->ksym.addr = sym_addr;
-		pr_debug("extern (ksym) %s=0x%llx\n", sym_name, sym_addr);
+		pr_debug("extern (ksym) '%s': set to 0x%llx\n", sym_name, sym_addr);
 	}
 	return 0;
 }
@@ -7498,28 +7509,50 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 	for (i = 0; i < obj->nr_extern; i++) {
 		ext = &obj->externs[i];
 
-		if (ext->type == EXT_KCFG &&
-		    strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
-			void *ext_val = kcfg_data + ext->kcfg.data_off;
-			__u32 kver = get_kernel_version();
+		if (ext->type == EXT_KSYM) {
+			if (ext->ksym.type_id)
+				need_vmlinux_btf = true;
+			else
+				need_kallsyms = true;
+			continue;
+		} else if (ext->type == EXT_KCFG) {
+			void *ext_ptr = kcfg_data + ext->kcfg.data_off;
+			__u64 value = 0;
 
-			if (!kver) {
-				pr_warn("failed to get kernel version\n");
+			/* Kconfig externs need actual /proc/config.gz */
+			if (str_has_pfx(ext->name, "CONFIG_")) {
+				need_config = true;
+				continue;
+			}
+
+			/* Virtual kcfg externs are customly handled by libbpf */
+			if (strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
+				value = get_kernel_version();
+				if (!value) {
+					pr_warn("extern (kcfg) '%s': failed to get kernel version\n", ext->name);
+					return -EINVAL;
+				}
+			} else if (strcmp(ext->name, "LINUX_HAS_BPF_COOKIE") == 0) {
+				value = kernel_supports(obj, FEAT_BPF_COOKIE);
+			} else if (!str_has_pfx(ext->name, "LINUX_") || !ext->is_weak) {
+				/* Currently libbpf supports only CONFIG_ and LINUX_ prefixed
+				 * __kconfig externs, where LINUX_ ones are virtual and filled out
+				 * customly by libbpf (their values don't come from Kconfig).
+				 * If LINUX_xxx variable is not recognized by libbpf, but is marked
+				 * __weak, it defaults to zero value, just like for CONFIG_xxx
+				 * externs.
+				 */
+				pr_warn("extern (kcfg) '%s': unrecognized virtual extern\n", ext->name);
 				return -EINVAL;
 			}
-			err = set_kcfg_value_num(ext, ext_val, kver);
+
+			err = set_kcfg_value_num(ext, ext_ptr, value);
 			if (err)
 				return err;
-			pr_debug("extern (kcfg) %s=0x%x\n", ext->name, kver);
-		} else if (ext->type == EXT_KCFG && str_has_pfx(ext->name, "CONFIG_")) {
-			need_config = true;
-		} else if (ext->type == EXT_KSYM) {
-			if (ext->ksym.type_id)
-				need_vmlinux_btf = true;
-			else
-				need_kallsyms = true;
+			pr_debug("extern (kcfg) '%s': set to 0x%llx\n",
+				 ext->name, (long long)value);
 		} else {
-			pr_warn("unrecognized extern '%s'\n", ext->name);
+			pr_warn("extern '%s': unrecognized extern kind\n", ext->name);
 			return -EINVAL;
 		}
 	}
@@ -7555,10 +7588,10 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 		ext = &obj->externs[i];
 
 		if (!ext->is_set && !ext->is_weak) {
-			pr_warn("extern %s (strong) not resolved\n", ext->name);
+			pr_warn("extern '%s' (strong): not resolved\n", ext->name);
 			return -ESRCH;
 		} else if (!ext->is_set) {
-			pr_debug("extern %s (weak) not resolved, defaulting to zero\n",
+			pr_debug("extern '%s' (weak): not resolved, defaulting to zero\n",
 				 ext->name);
 		}
 	}
diff --git a/tools/lib/bpf/usdt.bpf.h b/tools/lib/bpf/usdt.bpf.h
index 4181fddb3687..4f2adc0bd6ca 100644
--- a/tools/lib/bpf/usdt.bpf.h
+++ b/tools/lib/bpf/usdt.bpf.h
@@ -6,7 +6,6 @@
 #include <linux/errno.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
-#include <bpf/bpf_core_read.h>
 
 /* Below types and maps are internal implementation details of libbpf's USDT
  * support and are subjects to change. Also, bpf_usdt_xxx() API helpers should
@@ -30,14 +29,6 @@
 #ifndef BPF_USDT_MAX_IP_CNT
 #define BPF_USDT_MAX_IP_CNT (4 * BPF_USDT_MAX_SPEC_CNT)
 #endif
-/* We use BPF CO-RE to detect support for BPF cookie from BPF side. This is
- * the only dependency on CO-RE, so if it's undesirable, user can override
- * BPF_USDT_HAS_BPF_COOKIE to specify whether to BPF cookie is supported or not.
- */
-#ifndef BPF_USDT_HAS_BPF_COOKIE
-#define BPF_USDT_HAS_BPF_COOKIE \
-	bpf_core_enum_value_exists(enum bpf_func_id___usdt, BPF_FUNC_get_attach_cookie___usdt)
-#endif
 
 enum __bpf_usdt_arg_type {
 	BPF_USDT_ARG_CONST,
@@ -83,15 +74,12 @@ struct {
 	__type(value, __u32);
 } __bpf_usdt_ip_to_spec_id SEC(".maps") __weak;
 
-/* don't rely on user's BPF code to have latest definition of bpf_func_id */
-enum bpf_func_id___usdt {
-	BPF_FUNC_get_attach_cookie___usdt = 0xBAD, /* value doesn't matter */
-};
+extern const _Bool LINUX_HAS_BPF_COOKIE __kconfig;
 
 static __always_inline
 int __bpf_usdt_spec_id(struct pt_regs *ctx)
 {
-	if (!BPF_USDT_HAS_BPF_COOKIE) {
+	if (!LINUX_HAS_BPF_COOKIE) {
 		long ip = PT_REGS_IP(ctx);
 		int *spec_id_ptr;
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 bpf-next 2/5] selftests/bpf: add test of __weak unknown virtual __kconfig extern
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 1/5] libbpf: generalize virtual __kconfig externs and use it for USDT Andrii Nakryiko
@ 2022-07-14  7:07 ` Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 3/5] libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL Andrii Nakryiko
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team, Alan Maguire

Exercise libbpf's logic for unknown __weak virtual __kconfig externs.
USDT selftests are already excercising non-weak known virtual extern
already (LINUX_HAS_BPF_COOKIE), so no need to add explicit tests for it.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/prog_tests/core_extern.c      | 17 +++++++----------
 .../selftests/bpf/progs/test_core_extern.c      |  3 +++
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/core_extern.c b/tools/testing/selftests/bpf/prog_tests/core_extern.c
index 1931a158510e..63a51e9f3630 100644
--- a/tools/testing/selftests/bpf/prog_tests/core_extern.c
+++ b/tools/testing/selftests/bpf/prog_tests/core_extern.c
@@ -39,6 +39,7 @@ static struct test_case {
 		       "CONFIG_STR=\"abracad\"\n"
 		       "CONFIG_MISSING=0",
 		.data = {
+			.unkn_virt_val = 0,
 			.bpf_syscall = false,
 			.tristate_val = TRI_MODULE,
 			.bool_val = true,
@@ -121,7 +122,7 @@ static struct test_case {
 void test_core_extern(void)
 {
 	const uint32_t kern_ver = get_kernel_version();
-	int err, duration = 0, i, j;
+	int err, i, j;
 	struct test_core_extern *skel = NULL;
 	uint64_t *got, *exp;
 	int n = sizeof(*skel->data) / sizeof(uint64_t);
@@ -136,19 +137,17 @@ void test_core_extern(void)
 			continue;
 
 		skel = test_core_extern__open_opts(&opts);
-		if (CHECK(!skel, "skel_open", "skeleton open failed\n"))
+		if (!ASSERT_OK_PTR(skel, "skel_open"))
 			goto cleanup;
 		err = test_core_extern__load(skel);
 		if (t->fails) {
-			CHECK(!err, "skel_load",
-			      "shouldn't succeed open/load of skeleton\n");
+			ASSERT_ERR(err, "skel_load_should_fail");
 			goto cleanup;
-		} else if (CHECK(err, "skel_load",
-				 "failed to open/load skeleton\n")) {
+		} else if (!ASSERT_OK(err, "skel_load")) {
 			goto cleanup;
 		}
 		err = test_core_extern__attach(skel);
-		if (CHECK(err, "attach_raw_tp", "failed attach: %d\n", err))
+		if (!ASSERT_OK(err, "attach_raw_tp"))
 			goto cleanup;
 
 		usleep(1);
@@ -158,9 +157,7 @@ void test_core_extern(void)
 		got = (uint64_t *)skel->data;
 		exp = (uint64_t *)&t->data;
 		for (j = 0; j < n; j++) {
-			CHECK(got[j] != exp[j], "check_res",
-			      "result #%d: expected %llx, but got %llx\n",
-			       j, (__u64)exp[j], (__u64)got[j]);
+			ASSERT_EQ(got[j], exp[j], "result");
 		}
 cleanup:
 		test_core_extern__destroy(skel);
diff --git a/tools/testing/selftests/bpf/progs/test_core_extern.c b/tools/testing/selftests/bpf/progs/test_core_extern.c
index 3ac3603ad53d..a3c7c1042f35 100644
--- a/tools/testing/selftests/bpf/progs/test_core_extern.c
+++ b/tools/testing/selftests/bpf/progs/test_core_extern.c
@@ -11,6 +11,7 @@
 static int (*bpf_missing_helper)(const void *arg1, int arg2) = (void *) 999;
 
 extern int LINUX_KERNEL_VERSION __kconfig;
+extern int LINUX_UNKNOWN_VIRTUAL_EXTERN __kconfig __weak;
 extern bool CONFIG_BPF_SYSCALL __kconfig; /* strong */
 extern enum libbpf_tristate CONFIG_TRISTATE __kconfig __weak;
 extern bool CONFIG_BOOL __kconfig __weak;
@@ -22,6 +23,7 @@ extern const char CONFIG_STR[8] __kconfig __weak;
 extern uint64_t CONFIG_MISSING __kconfig __weak;
 
 uint64_t kern_ver = -1;
+uint64_t unkn_virt_val = -1;
 uint64_t bpf_syscall = -1;
 uint64_t tristate_val = -1;
 uint64_t bool_val = -1;
@@ -38,6 +40,7 @@ int handle_sys_enter(struct pt_regs *ctx)
 	int i;
 
 	kern_ver = LINUX_KERNEL_VERSION;
+	unkn_virt_val = LINUX_UNKNOWN_VIRTUAL_EXTERN;
 	bpf_syscall = CONFIG_BPF_SYSCALL;
 	tristate_val = CONFIG_TRISTATE;
 	bool_val = CONFIG_BOOL;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 bpf-next 3/5] libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 1/5] libbpf: generalize virtual __kconfig externs and use it for USDT Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 2/5] selftests/bpf: add test of __weak unknown virtual __kconfig extern Andrii Nakryiko
@ 2022-07-14  7:07 ` Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 4/5] libbpf: add ksyscall/kretsyscall sections support for syscall kprobes Andrii Nakryiko
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team, Alan Maguire

Improve BPF_KPROBE_SYSCALL (and rename it to shorter BPF_KSYSCALL to
match libbpf's SEC("ksyscall") section name, added in next patch) to use
__kconfig variable to determine how to properly fetch syscall arguments.

Instead of relying on hard-coded knowledge of whether kernel's
architecture uses syscall wrapper or not (which only reflects the latest
kernel versions, but is not necessarily true for older kernels and won't
necessarily hold for later kernel versions on some particular host
architecture), determine this at runtime by attempting to create
perf_event (with fallback to kprobe event creation through tracefs on
legacy kernels, just like kprobe attachment code is doing) for kernel
function that would correspond to bpf() syscall on a system that has
CONFIG_ARCH_HAS_SYSCALL_WRAPPER set (e.g., for x86-64 it would try
'__x64_sys_bpf').

If host kernel uses syscall wrapper, syscall kernel function's first
argument is a pointer to struct pt_regs that then contains syscall
arguments. In such case we need to use bpf_probe_read_kernel() to fetch
actual arguments (which we do through BPF_CORE_READ() macro) from inner
pt_regs.

But if the kernel doesn't use syscall wrapper approach, input
arguments can be read from struct pt_regs directly with no probe reading.

All this feature detection is done without requiring /proc/config.gz
existence and parsing, and BPF-side helper code uses newly added
LINUX_HAS_SYSCALL_WRAPPER virtual __kconfig extern to keep in sync with
user-side feature detection of libbpf.

BPF_KSYSCALL() macro can be used both with SEC("kprobe") programs that
define syscall function explicitly (e.g., SEC("kprobe/__x64_sys_bpf"))
and SEC("ksyscall") program added in the next patch (which are the same
kprobe program with added benefit of libbpf determining correct kernel
function name automatically).

Kretprobe and kretsyscall (added in next patch) programs don't need
BPF_KSYSCALL as they don't provide access to input arguments. Normal
BPF_KRETPROBE is completely sufficient and is recommended.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/bpf_tracing.h | 51 +++++++++++++++++++++++++++----------
 tools/lib/bpf/libbpf.c      |  2 ++
 2 files changed, 40 insertions(+), 13 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 11f9096407fc..f4d3e1e2abe2 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -2,6 +2,8 @@
 #ifndef __BPF_TRACING_H__
 #define __BPF_TRACING_H__
 
+#include <bpf/bpf_helpers.h>
+
 /* Scan the ARCH passed in from ARCH env variable (see Makefile) */
 #if defined(__TARGET_ARCH_x86)
 	#define bpf_target_x86
@@ -140,7 +142,7 @@ struct pt_regs___s390 {
 #define __PT_RC_REG gprs[2]
 #define __PT_SP_REG gprs[15]
 #define __PT_IP_REG psw.addr
-#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma("GCC error \"use PT_REGS_PARM1_CORE_SYSCALL() instead\""); 0l; })
+#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x)
 #define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___s390 *)(x), orig_gpr2)
 
 #elif defined(bpf_target_arm)
@@ -174,7 +176,7 @@ struct pt_regs___arm64 {
 #define __PT_RC_REG regs[0]
 #define __PT_SP_REG sp
 #define __PT_IP_REG pc
-#define PT_REGS_PARM1_SYSCALL(x) ({ _Pragma("GCC error \"use PT_REGS_PARM1_CORE_SYSCALL() instead\""); 0l; })
+#define PT_REGS_PARM1_SYSCALL(x) PT_REGS_PARM1_CORE_SYSCALL(x)
 #define PT_REGS_PARM1_CORE_SYSCALL(x) BPF_CORE_READ((const struct pt_regs___arm64 *)(x), orig_x0)
 
 #elif defined(bpf_target_mips)
@@ -493,39 +495,62 @@ typeof(name(0)) name(struct pt_regs *ctx)				    \
 }									    \
 static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 
+/* If kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER, read pt_regs directly */
 #define ___bpf_syscall_args0()           ctx
-#define ___bpf_syscall_args1(x)          ___bpf_syscall_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs)
-#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs)
-#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs)
-#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs)
-#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs)
+#define ___bpf_syscall_args1(x)          ___bpf_syscall_args0(), (void *)PT_REGS_PARM1_SYSCALL(regs)
+#define ___bpf_syscall_args2(x, args...) ___bpf_syscall_args1(args), (void *)PT_REGS_PARM2_SYSCALL(regs)
+#define ___bpf_syscall_args3(x, args...) ___bpf_syscall_args2(args), (void *)PT_REGS_PARM3_SYSCALL(regs)
+#define ___bpf_syscall_args4(x, args...) ___bpf_syscall_args3(args), (void *)PT_REGS_PARM4_SYSCALL(regs)
+#define ___bpf_syscall_args5(x, args...) ___bpf_syscall_args4(args), (void *)PT_REGS_PARM5_SYSCALL(regs)
 #define ___bpf_syscall_args(args...)     ___bpf_apply(___bpf_syscall_args, ___bpf_narg(args))(args)
 
+/* If kernel doesn't have CONFIG_ARCH_HAS_SYSCALL_WRAPPER, we have to BPF_CORE_READ from pt_regs */
+#define ___bpf_syswrap_args0()           ctx
+#define ___bpf_syswrap_args1(x)          ___bpf_syswrap_args0(), (void *)PT_REGS_PARM1_CORE_SYSCALL(regs)
+#define ___bpf_syswrap_args2(x, args...) ___bpf_syswrap_args1(args), (void *)PT_REGS_PARM2_CORE_SYSCALL(regs)
+#define ___bpf_syswrap_args3(x, args...) ___bpf_syswrap_args2(args), (void *)PT_REGS_PARM3_CORE_SYSCALL(regs)
+#define ___bpf_syswrap_args4(x, args...) ___bpf_syswrap_args3(args), (void *)PT_REGS_PARM4_CORE_SYSCALL(regs)
+#define ___bpf_syswrap_args5(x, args...) ___bpf_syswrap_args4(args), (void *)PT_REGS_PARM5_CORE_SYSCALL(regs)
+#define ___bpf_syswrap_args(args...)     ___bpf_apply(___bpf_syswrap_args, ___bpf_narg(args))(args)
+
 /*
- * BPF_KPROBE_SYSCALL is a variant of BPF_KPROBE, which is intended for
+ * BPF_KSYSCALL is a variant of BPF_KPROBE, which is intended for
  * tracing syscall functions, like __x64_sys_close. It hides the underlying
  * platform-specific low-level way of getting syscall input arguments from
  * struct pt_regs, and provides a familiar typed and named function arguments
  * syntax and semantics of accessing syscall input parameters.
  *
- * Original struct pt_regs* context is preserved as 'ctx' argument. This might
+ * Original struct pt_regs * context is preserved as 'ctx' argument. This might
  * be necessary when using BPF helpers like bpf_perf_event_output().
  *
- * This macro relies on BPF CO-RE support.
+ * At the moment BPF_KSYSCALL does not handle all the calling convention
+ * quirks for mmap(), clone() and compat syscalls transparrently. This may or
+ * may not change in the future. User needs to take extra measures to handle
+ * such quirks explicitly, if necessary.
+ *
+ * This macro relies on BPF CO-RE support and virtual __kconfig externs.
  */
-#define BPF_KPROBE_SYSCALL(name, args...)				    \
+#define BPF_KSYSCALL(name, args...)					    \
 name(struct pt_regs *ctx);						    \
+extern _Bool LINUX_HAS_SYSCALL_WRAPPER __kconfig;			    \
 static __attribute__((always_inline)) typeof(name(0))			    \
 ____##name(struct pt_regs *ctx, ##args);				    \
 typeof(name(0)) name(struct pt_regs *ctx)				    \
 {									    \
-	struct pt_regs *regs = PT_REGS_SYSCALL_REGS(ctx);		    \
+	struct pt_regs *regs = LINUX_HAS_SYSCALL_WRAPPER		    \
+			       ? (struct pt_regs *)PT_REGS_PARM1(ctx)	    \
+			       : ctx;					    \
 	_Pragma("GCC diagnostic push")					    \
 	_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")		    \
-	return ____##name(___bpf_syscall_args(args));			    \
+	if (LINUX_HAS_SYSCALL_WRAPPER)					    \
+		return ____##name(___bpf_syswrap_args(args));		    \
+	else								    \
+		return ____##name(___bpf_syscall_args(args));		    \
 	_Pragma("GCC diagnostic pop")					    \
 }									    \
 static __attribute__((always_inline)) typeof(name(0))			    \
 ____##name(struct pt_regs *ctx, ##args)
 
+#define BPF_KPROBE_SYSCALL BPF_KSYSCALL
+
 #endif
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index ee3176859e76..eb35a20a33f6 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -7534,6 +7534,8 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 				}
 			} else if (strcmp(ext->name, "LINUX_HAS_BPF_COOKIE") == 0) {
 				value = kernel_supports(obj, FEAT_BPF_COOKIE);
+			} else if (strcmp(ext->name, "LINUX_HAS_SYSCALL_WRAPPER") == 0) {
+				value = kernel_supports(obj, FEAT_SYSCALL_WRAPPER);
 			} else if (!str_has_pfx(ext->name, "LINUX_") || !ext->is_weak) {
 				/* Currently libbpf supports only CONFIG_ and LINUX_ prefixed
 				 * __kconfig externs, where LINUX_ ones are virtual and filled out
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 bpf-next 4/5] libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2022-07-14  7:07 ` [PATCH v2 bpf-next 3/5] libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL Andrii Nakryiko
@ 2022-07-14  7:07 ` Andrii Nakryiko
  2022-07-14  7:07 ` [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests Andrii Nakryiko
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team, Alan Maguire

Add SEC("ksyscall")/SEC("ksyscall/<syscall_name>") and corresponding
kretsyscall variants (for return kprobes) to allow users to kprobe
syscall functions in kernel. These special sections allow to ignore
complexities and differences between kernel versions and host
architectures when it comes to syscall wrapper and corresponding
__<arch>_sys_<syscall> vs __se_sys_<syscall> differences, depending on
whether host kernel has CONFIG_ARCH_HAS_SYSCALL_WRAPPER (though libbpf
itself doesn't rely on /proc/config.gz for detecting this, see
BPF_KSYSCALL patch for how it's done internally).

Combined with the use of BPF_KSYSCALL() macro, this allows to just
specify intended syscall name and expected input arguments and leave
dealing with all the variations to libbpf.

In addition to SEC("ksyscall+") and SEC("kretsyscall+") add
bpf_program__attach_ksyscall() API which allows to specify syscall name
at runtime and provide associated BPF cookie value.

At the moment SEC("ksyscall") and bpf_program__attach_ksyscall() do not
handle all the calling convention quirks for mmap(), clone() and compat
syscalls. It also only attaches to "native" syscall interfaces. If host
system supports compat syscalls or defines 32-bit syscalls in 64-bit
kernel, such syscall interfaces won't be attached to by libbpf.

These limitations may or may not change in the future. Therefore it is
recommended to use SEC("kprobe") for these syscalls or if working with
compat and 32-bit interfaces is required.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c          | 117 +++++++++++++++++++++++++++++---
 tools/lib/bpf/libbpf.h          |  46 +++++++++++++
 tools/lib/bpf/libbpf.map        |   1 +
 tools/lib/bpf/libbpf_internal.h |   2 +
 4 files changed, 157 insertions(+), 9 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index eb35a20a33f6..ba96760936f8 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4670,6 +4670,8 @@ static int probe_kern_btf_enum64(void)
 					     strs, sizeof(strs)));
 }
 
+static int probe_kern_syscall_wrapper(void);
+
 enum kern_feature_result {
 	FEAT_UNKNOWN = 0,
 	FEAT_SUPPORTED = 1,
@@ -4738,6 +4740,9 @@ static struct kern_feature_desc {
 	[FEAT_BTF_ENUM64] = {
 		"BTF_KIND_ENUM64 support", probe_kern_btf_enum64,
 	},
+	[FEAT_SYSCALL_WRAPPER] = {
+		"Kernel using syscall wrapper", probe_kern_syscall_wrapper,
+	},
 };
 
 bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
@@ -8421,6 +8426,7 @@ int bpf_program__set_log_buf(struct bpf_program *prog, char *log_buf, size_t log
 
 static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf_link **link);
+static int attach_ksyscall(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_usdt(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
 static int attach_raw_tp(const struct bpf_program *prog, long cookie, struct bpf_link **link);
@@ -8441,6 +8447,8 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("uretprobe.s+",		KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
 	SEC_DEF("kprobe.multi+",	KPROBE,	BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi),
 	SEC_DEF("kretprobe.multi+",	KPROBE,	BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi),
+	SEC_DEF("ksyscall+",		KPROBE,	0, SEC_NONE, attach_ksyscall),
+	SEC_DEF("kretsyscall+",		KPROBE, 0, SEC_NONE, attach_ksyscall),
 	SEC_DEF("usdt+",		KPROBE,	0, SEC_NONE, attach_usdt),
 	SEC_DEF("tc",			SCHED_CLS, 0, SEC_NONE),
 	SEC_DEF("classifier",		SCHED_CLS, 0, SEC_NONE),
@@ -9797,7 +9805,7 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name,
 {
 	struct perf_event_attr attr = {};
 	char errmsg[STRERR_BUFSIZE];
-	int type, pfd, err;
+	int type, pfd;
 
 	if (ref_ctr_off >= (1ULL << PERF_UPROBE_REF_CTR_OFFSET_BITS))
 		return -EINVAL;
@@ -9833,14 +9841,7 @@ static int perf_event_open_probe(bool uprobe, bool retprobe, const char *name,
 		      pid < 0 ? -1 : pid /* pid */,
 		      pid == -1 ? 0 : -1 /* cpu */,
 		      -1 /* group_fd */, PERF_FLAG_FD_CLOEXEC);
-	if (pfd < 0) {
-		err = -errno;
-		pr_warn("%s perf_event_open() failed: %s\n",
-			uprobe ? "uprobe" : "kprobe",
-			libbpf_strerror_r(err, errmsg, sizeof(errmsg)));
-		return err;
-	}
-	return pfd;
+	return pfd >= 0 ? pfd : -errno;
 }
 
 static int append_to_file(const char *file, const char *fmt, ...)
@@ -9945,6 +9946,60 @@ static int perf_event_kprobe_open_legacy(const char *probe_name, bool retprobe,
 	return err;
 }
 
+static const char *arch_specific_syscall_pfx(void)
+{
+#if defined(__x86_64__)
+	return "x64";
+#elif defined(__i386__)
+	return "ia32";
+#elif defined(__s390x__)
+	return "s390x";
+#elif defined(__s390__)
+	return "s390";
+#elif defined(__arm__)
+	return "arm";
+#elif defined(__aarch64__)
+	return "arm64";
+#elif defined(__mips__)
+	return "mips";
+#elif defined(__riscv)
+	return "riscv";
+#else
+	return NULL;
+#endif
+}
+
+static int probe_kern_syscall_wrapper(void)
+{
+	char syscall_name[64];
+	const char *ksys_pfx;
+
+	ksys_pfx = arch_specific_syscall_pfx();
+	if (!ksys_pfx)
+		return 0;
+
+	snprintf(syscall_name, sizeof(syscall_name), "__%s_sys_bpf", ksys_pfx);
+
+	if (determine_kprobe_perf_type() >= 0) {
+		int pfd;
+
+		pfd = perf_event_open_probe(false, false, syscall_name, 0, getpid(), 0);
+		if (pfd >= 0)
+			close(pfd);
+
+		return pfd >= 0 ? 1 : 0;
+	} else { /* legacy mode */
+		char probe_name[128];
+
+		gen_kprobe_legacy_event_name(probe_name, sizeof(probe_name), syscall_name, 0);
+		if (add_kprobe_event_legacy(probe_name, false, syscall_name, 0) < 0)
+			return 0;
+
+		(void)remove_kprobe_event_legacy(probe_name, false);
+		return 1;
+	}
+}
+
 struct bpf_link *
 bpf_program__attach_kprobe_opts(const struct bpf_program *prog,
 				const char *func_name,
@@ -10030,6 +10085,29 @@ struct bpf_link *bpf_program__attach_kprobe(const struct bpf_program *prog,
 	return bpf_program__attach_kprobe_opts(prog, func_name, &opts);
 }
 
+struct bpf_link *bpf_program__attach_ksyscall(const struct bpf_program *prog,
+					      const char *syscall_name,
+					      const struct bpf_ksyscall_opts *opts)
+{
+	LIBBPF_OPTS(bpf_kprobe_opts, kprobe_opts);
+	char func_name[128];
+
+	if (!OPTS_VALID(opts, bpf_ksyscall_opts))
+		return libbpf_err_ptr(-EINVAL);
+
+	if (kernel_supports(prog->obj, FEAT_SYSCALL_WRAPPER)) {
+		snprintf(func_name, sizeof(func_name), "__%s_sys_%s",
+			 arch_specific_syscall_pfx(), syscall_name);
+	} else {
+		snprintf(func_name, sizeof(func_name), "__se_sys_%s", syscall_name);
+	}
+
+	kprobe_opts.retprobe = OPTS_GET(opts, retprobe, false);
+	kprobe_opts.bpf_cookie = OPTS_GET(opts, bpf_cookie, 0);
+
+	return bpf_program__attach_kprobe_opts(prog, func_name, &kprobe_opts);
+}
+
 /* Adapted from perf/util/string.c */
 static bool glob_match(const char *str, const char *pat)
 {
@@ -10200,6 +10278,27 @@ static int attach_kprobe(const struct bpf_program *prog, long cookie, struct bpf
 	return libbpf_get_error(*link);
 }
 
+static int attach_ksyscall(const struct bpf_program *prog, long cookie, struct bpf_link **link)
+{
+	LIBBPF_OPTS(bpf_ksyscall_opts, opts);
+	const char *syscall_name;
+
+	*link = NULL;
+
+	/* no auto-attach for SEC("ksyscall") and SEC("kretsyscall") */
+	if (strcmp(prog->sec_name, "ksyscall") == 0 || strcmp(prog->sec_name, "kretsyscall") == 0)
+		return 0;
+
+	opts.retprobe = str_has_pfx(prog->sec_name, "kretsyscall/");
+	if (opts.retprobe)
+		syscall_name = prog->sec_name + sizeof("kretsyscall/") - 1;
+	else
+		syscall_name = prog->sec_name + sizeof("ksyscall/") - 1;
+
+	*link = bpf_program__attach_ksyscall(prog, syscall_name, &opts);
+	return *link ? 0 : -errno;
+}
+
 static int attach_kprobe_multi(const struct bpf_program *prog, long cookie, struct bpf_link **link)
 {
 	LIBBPF_OPTS(bpf_kprobe_multi_opts, opts);
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index e4d5353f757b..d10234aaa8df 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -457,6 +457,52 @@ bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
 				      const char *pattern,
 				      const struct bpf_kprobe_multi_opts *opts);
 
+struct bpf_ksyscall_opts {
+	/* size of this struct, for forward/backward compatiblity */
+	size_t sz;
+	/* custom user-provided value fetchable through bpf_get_attach_cookie() */
+	__u64 bpf_cookie;
+	/* attach as return probe? */
+	bool retprobe;
+	size_t :0;
+};
+#define bpf_ksyscall_opts__last_field retprobe
+
+/**
+ * @brief **bpf_program__attach_ksyscall()** attaches a BPF program
+ * to kernel syscall handler of a specified syscall. Optionally it's possible
+ * to request to install retprobe that will be triggered at syscall exit. It's
+ * also possible to associate BPF cookie (though options).
+ *
+ * Libbpf automatically will determine correct full kernel function name,
+ * which depending on system architecture and kernel version/configuration
+ * could be of the form __<arch>_sys_<syscall> or __se_sys_<syscall>, and will
+ * attach specified program using kprobe/kretprobe mechanism.
+ *
+ * **bpf_program__attach_ksyscall()** is an API counterpart of declarative
+ * **SEC("ksyscall/<syscall>")** annotation of BPF programs.
+ *
+ * At the moment **SEC("ksyscall")** and **bpf_program__attach_ksyscall()** do
+ * not handle all the calling convention quirks for mmap(), clone() and compat
+ * syscalls. It also only attaches to "native" syscall interfaces. If host
+ * system supports compat syscalls or defines 32-bit syscalls in 64-bit
+ * kernel, such syscall interfaces won't be attached to by libbpf.
+ *
+ * These limitations may or may not change in the future. Therefore it is
+ * recommended to use SEC("kprobe") for these syscalls or if working with
+ * compat and 32-bit interfaces is required.
+ *
+ * @param prog BPF program to attach
+ * @param syscall_name Symbolic name of the syscall (e.g., "bpf")
+ * @param opts Additional options (see **struct bpf_ksyscall_opts**)
+ * @return Reference to the newly created BPF link; or NULL is returned on
+ * error, error code is stored in errno
+ */
+LIBBPF_API struct bpf_link *
+bpf_program__attach_ksyscall(const struct bpf_program *prog,
+			     const char *syscall_name,
+			     const struct bpf_ksyscall_opts *opts);
+
 struct bpf_uprobe_opts {
 	/* size of this struct, for forward/backward compatiblity */
 	size_t sz;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 94b589ecfeaa..63c059a02a0d 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -356,6 +356,7 @@ LIBBPF_0.8.0 {
 LIBBPF_1.0.0 {
 	global:
 		bpf_prog_query_opts;
+		bpf_program__attach_ksyscall;
 		btf__add_enum64;
 		btf__add_enum64_value;
 		libbpf_bpf_attach_type_str;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 9cd7829cbe41..f01dbab49da9 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -352,6 +352,8 @@ enum kern_feature_id {
 	FEAT_BPF_COOKIE,
 	/* BTF_KIND_ENUM64 support and BTF_KIND_ENUM kflag support */
 	FEAT_BTF_ENUM64,
+	/* Kernel uses syscall wrapper (CONFIG_ARCH_HAS_SYSCALL_WRAPPER) */
+	FEAT_SYSCALL_WRAPPER,
 	__FEAT_CNT,
 };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2022-07-14  7:07 ` [PATCH v2 bpf-next 4/5] libbpf: add ksyscall/kretsyscall sections support for syscall kprobes Andrii Nakryiko
@ 2022-07-14  7:07 ` Andrii Nakryiko
  2022-07-15 23:28   ` Ilya Leoshkevich
  2022-07-14 18:31 ` [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support sdf
  2022-07-19 16:50 ` patchwork-bot+netdevbpf
  6 siblings, 1 reply; 10+ messages in thread
From: Andrii Nakryiko @ 2022-07-14  7:07 UTC (permalink / raw)
  To: bpf, ast, daniel; +Cc: andrii, kernel-team, Alan Maguire

Convert few selftest that used plain SEC("kprobe") with arch-specific
syscall wrapper prefix to ksyscall/kretsyscall and corresponding
BPF_KSYSCALL macro. test_probe_user.c is especially benefiting from this
simplification.

Tested-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/progs/bpf_syscall_macro.c   |  6 ++---
 .../selftests/bpf/progs/test_attach_probe.c   | 15 +++++------
 .../selftests/bpf/progs/test_probe_user.c     | 27 +++++--------------
 3 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
index 05838ed9b89c..e1e11897e99b 100644
--- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
+++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
@@ -64,9 +64,9 @@ int BPF_KPROBE(handle_sys_prctl)
 	return 0;
 }
 
-SEC("kprobe/" SYS_PREFIX "sys_prctl")
-int BPF_KPROBE_SYSCALL(prctl_enter, int option, unsigned long arg2,
-		       unsigned long arg3, unsigned long arg4, unsigned long arg5)
+SEC("ksyscall/prctl")
+int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2,
+		 unsigned long arg3, unsigned long arg4, unsigned long arg5)
 {
 	pid_t pid = bpf_get_current_pid_tgid() >> 32;
 
diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
index f1c88ad368ef..a1e45fec8938 100644
--- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
+++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
@@ -1,11 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2017 Facebook
 
-#include <linux/ptrace.h>
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
-#include <stdbool.h>
+#include <bpf/bpf_core_read.h>
 #include "bpf_misc.h"
 
 int kprobe_res = 0;
@@ -31,8 +30,8 @@ int handle_kprobe(struct pt_regs *ctx)
 	return 0;
 }
 
-SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
-int BPF_KPROBE(handle_kprobe_auto)
+SEC("ksyscall/nanosleep")
+int BPF_KSYSCALL(handle_kprobe_auto, struct __kernel_timespec *req, struct __kernel_timespec *rem)
 {
 	kprobe2_res = 11;
 	return 0;
@@ -56,11 +55,11 @@ int handle_kretprobe(struct pt_regs *ctx)
 	return 0;
 }
 
-SEC("kretprobe/" SYS_PREFIX "sys_nanosleep")
-int BPF_KRETPROBE(handle_kretprobe_auto)
+SEC("kretsyscall/nanosleep")
+int BPF_KRETPROBE(handle_kretprobe_auto, int ret)
 {
 	kretprobe2_res = 22;
-	return 0;
+	return ret;
 }
 
 SEC("uprobe")
diff --git a/tools/testing/selftests/bpf/progs/test_probe_user.c b/tools/testing/selftests/bpf/progs/test_probe_user.c
index 702578a5e496..8e1495008e4d 100644
--- a/tools/testing/selftests/bpf/progs/test_probe_user.c
+++ b/tools/testing/selftests/bpf/progs/test_probe_user.c
@@ -1,35 +1,20 @@
 // SPDX-License-Identifier: GPL-2.0
-
-#include <linux/ptrace.h>
-#include <linux/bpf.h>
-
-#include <netinet/in.h>
-
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
 #include "bpf_misc.h"
 
 static struct sockaddr_in old;
 
-SEC("kprobe/" SYS_PREFIX "sys_connect")
-int BPF_KPROBE(handle_sys_connect)
+SEC("ksyscall/connect")
+int BPF_KSYSCALL(handle_sys_connect, int fd, struct sockaddr_in *uservaddr, int addrlen)
 {
-#if SYSCALL_WRAPPER == 1
-	struct pt_regs *real_regs;
-#endif
 	struct sockaddr_in new;
-	void *ptr;
-
-#if SYSCALL_WRAPPER == 0
-	ptr = (void *)PT_REGS_PARM2(ctx);
-#else
-	real_regs = (struct pt_regs *)PT_REGS_PARM1(ctx);
-	bpf_probe_read_kernel(&ptr, sizeof(ptr), &PT_REGS_PARM2(real_regs));
-#endif
 
-	bpf_probe_read_user(&old, sizeof(old), ptr);
+	bpf_probe_read_user(&old, sizeof(old), uservaddr);
 	__builtin_memset(&new, 0xab, sizeof(new));
-	bpf_probe_write_user(ptr, &new, sizeof(new));
+	bpf_probe_write_user(uservaddr, &new, sizeof(new));
 
 	return 0;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2022-07-14  7:07 ` [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests Andrii Nakryiko
@ 2022-07-14 18:31 ` sdf
  2022-07-19 16:50 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 10+ messages in thread
From: sdf @ 2022-07-14 18:31 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team

On 07/14, Andrii Nakryiko wrote:
> Add SEC("ksyscall")/SEC("kretsyscall") sections and corresponding
> bpf_program__attach_ksyscall() API that simplifies tracing kernel syscalls
> through kprobe mechanism. Kprobing syscalls isn't trivial due to varying
> syscall handler names in the kernel and various ways syscall argument are
> passed, depending on kernel architecture and configuration.  
> SEC("ksyscall")
> allows user to not care about such details and just get access to syscall
> input arguments, while libbpf takes care of necessary feature detection  
> logic.

> There are still more quirks that are not straightforward to hide  
> completely
> (see comments about mmap(), clone() and compat syscalls), so in such more
> advanced scenarios user might need to fall back to plain SEC("kprobe")
> approach, but for absolute majority of users SEC("ksyscall") is a big
> improvement.

> As part of this patch set libbpf adds two more virtual __kconfig externs,  
> in
> addition to existing LINUX_KERNEL_VERSION: LINUX_HAS_BPF_COOKIE and
> LINUX_HAS_SYSCALL_WRAPPER, which let's libbpf-provided BPF-side code  
> minimize
> external dependencies and assumptions and let's user-space part of libbpf  
> to
> perform all the feature detection logic. This benefits USDT support code,
> which now doesn't depend on BPF CO-RE for its functionality.

> v1->v2:
>    - normalize extern variable-related warn and debug message formats  
> (Alan);

For the series:

Reviewed-by: Stanislav Fomichev <sdf@google.com>

> rfc->v1:
>    - drop dependency on kallsyms and speed up SYSCALL_WRAPPER detection  
> (Alexei);
>    - drop dependency on /proc/config.gz in bpf_tracing.h (Yaniv);
>    - add doc comment and ephasize mmap(), clone() and compat quirks that  
> are
>      not supported (Ilya);
>    - use mechanism similar to LINUX_KERNEL_VERSION to also improve USDT  
> code.

> Andrii Nakryiko (5):
>    libbpf: generalize virtual __kconfig externs and use it for USDT
>    selftests/bpf: add test of __weak unknown virtual __kconfig extern
>    libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL
>    libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
>    selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests

>   tools/lib/bpf/bpf_tracing.h                   |  51 +++--
>   tools/lib/bpf/libbpf.c                        | 214 ++++++++++++++----
>   tools/lib/bpf/libbpf.h                        |  46 ++++
>   tools/lib/bpf/libbpf.map                      |   1 +
>   tools/lib/bpf/libbpf_internal.h               |   2 +
>   tools/lib/bpf/usdt.bpf.h                      |  16 +-
>   .../selftests/bpf/prog_tests/core_extern.c    |  17 +-
>   .../selftests/bpf/progs/bpf_syscall_macro.c   |   6 +-
>   .../selftests/bpf/progs/test_attach_probe.c   |  15 +-
>   .../selftests/bpf/progs/test_core_extern.c    |   3 +
>   .../selftests/bpf/progs/test_probe_user.c     |  27 +--
>   11 files changed, 289 insertions(+), 109 deletions(-)

> --
> 2.30.2


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
  2022-07-14  7:07 ` [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests Andrii Nakryiko
@ 2022-07-15 23:28   ` Ilya Leoshkevich
  2022-07-15 23:54     ` Ilya Leoshkevich
  0 siblings, 1 reply; 10+ messages in thread
From: Ilya Leoshkevich @ 2022-07-15 23:28 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, ast, daniel; +Cc: kernel-team, Alan Maguire

On Thu, 2022-07-14 at 00:07 -0700, Andrii Nakryiko wrote:
> Convert few selftest that used plain SEC("kprobe") with arch-specific
> syscall wrapper prefix to ksyscall/kretsyscall and corresponding
> BPF_KSYSCALL macro. test_probe_user.c is especially benefiting from
> this
> simplification.
> 
> Tested-by: Alan Maguire <alan.maguire@oracle.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  .../selftests/bpf/progs/bpf_syscall_macro.c   |  6 ++---
>  .../selftests/bpf/progs/test_attach_probe.c   | 15 +++++------
>  .../selftests/bpf/progs/test_probe_user.c     | 27 +++++------------
> --
>  3 files changed, 16 insertions(+), 32 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> index 05838ed9b89c..e1e11897e99b 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> +++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> @@ -64,9 +64,9 @@ int BPF_KPROBE(handle_sys_prctl)
>         return 0;
>  }
>  
> -SEC("kprobe/" SYS_PREFIX "sys_prctl")
> -int BPF_KPROBE_SYSCALL(prctl_enter, int option, unsigned long arg2,
> -                      unsigned long arg3, unsigned long arg4,
> unsigned long arg5)
> +SEC("ksyscall/prctl")
> +int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2,
> +                unsigned long arg3, unsigned long arg4, unsigned
> long arg5)
>  {
>         pid_t pid = bpf_get_current_pid_tgid() >> 32;
>  
> diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c
> b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> index f1c88ad368ef..a1e45fec8938 100644
> --- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
> +++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> @@ -1,11 +1,10 @@
>  // SPDX-License-Identifier: GPL-2.0
>  // Copyright (c) 2017 Facebook
>  
> -#include <linux/ptrace.h>
> -#include <linux/bpf.h>
> +#include "vmlinux.h"
>  #include <bpf/bpf_helpers.h>
>  #include <bpf/bpf_tracing.h>
> -#include <stdbool.h>
> +#include <bpf/bpf_core_read.h>
>  #include "bpf_misc.h"
>  
>  int kprobe_res = 0;
> @@ -31,8 +30,8 @@ int handle_kprobe(struct pt_regs *ctx)
>         return 0;
>  }
>  
> -SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
> -int BPF_KPROBE(handle_kprobe_auto)
> +SEC("ksyscall/nanosleep")
> +int BPF_KSYSCALL(handle_kprobe_auto, struct __kernel_timespec *req,
> struct __kernel_timespec *rem)
>  {
>         kprobe2_res = 11;
>         return 0;
> @@ -56,11 +55,11 @@ int handle_kretprobe(struct pt_regs *ctx)
>         return 0;
>  }
>  
> -SEC("kretprobe/" SYS_PREFIX "sys_nanosleep")
> -int BPF_KRETPROBE(handle_kretprobe_auto)
> +SEC("kretsyscall/nanosleep")
> +int BPF_KRETPROBE(handle_kretprobe_auto, int ret)
>  {
>         kretprobe2_res = 22;
> -       return 0;
> +       return ret;
>  }
>  
>  SEC("uprobe")
> diff --git a/tools/testing/selftests/bpf/progs/test_probe_user.c
> b/tools/testing/selftests/bpf/progs/test_probe_user.c
> index 702578a5e496..8e1495008e4d 100644
> --- a/tools/testing/selftests/bpf/progs/test_probe_user.c
> +++ b/tools/testing/selftests/bpf/progs/test_probe_user.c
> @@ -1,35 +1,20 @@
>  // SPDX-License-Identifier: GPL-2.0
> -
> -#include <linux/ptrace.h>
> -#include <linux/bpf.h>
> -
> -#include <netinet/in.h>
> -
> +#include "vmlinux.h"
>  #include <bpf/bpf_helpers.h>
>  #include <bpf/bpf_tracing.h>
> +#include <bpf/bpf_core_read.h>
>  #include "bpf_misc.h"
>  
>  static struct sockaddr_in old;
>  
> -SEC("kprobe/" SYS_PREFIX "sys_connect")
> -int BPF_KPROBE(handle_sys_connect)
> +SEC("ksyscall/connect")
> +int BPF_KSYSCALL(handle_sys_connect, int fd, struct sockaddr_in
> *uservaddr, int addrlen)
>  {
> -#if SYSCALL_WRAPPER == 1
> -       struct pt_regs *real_regs;
> -#endif
>         struct sockaddr_in new;
> -       void *ptr;
> -
> -#if SYSCALL_WRAPPER == 0
> -       ptr = (void *)PT_REGS_PARM2(ctx);
> -#else
> -       real_regs = (struct pt_regs *)PT_REGS_PARM1(ctx);
> -       bpf_probe_read_kernel(&ptr, sizeof(ptr),
> &PT_REGS_PARM2(real_regs));
> -#endif
>  
> -       bpf_probe_read_user(&old, sizeof(old), ptr);
> +       bpf_probe_read_user(&old, sizeof(old), uservaddr);
>         __builtin_memset(&new, 0xab, sizeof(new));
> -       bpf_probe_write_user(ptr, &new, sizeof(new));
> +       bpf_probe_write_user(uservaddr, &new, sizeof(new));
>  
>         return 0;
>  }

Hi,

The first two tests succeed, but test_probe_user fails on s390x with:

    serial_test_probe_user:FAIL:check_kprobe_res
    wrong kprobe res from probe read: 0.0.0.0:0

I'm not sure what is causing this, but at least the loaded BPF code
looks sane:

# bpftool prog dump xlated id 81
int handle_sys_connect(struct pt_regs * ctx):
   0: (bf) r6 = r1                      ; kernel pt_regs
   1: (18) r1 = map[id:33][0]+0
   3: (71) r1 = *(u8 *)(r1 +0)
   4: (79) r6 = *(u64 *)(r6 +40)        ; kernel gpr2
                                        ; PT_REGS_PARM1
   5: (b7) r1 = 152
   6: (bf) r3 = r6
   7: (0f) r3 += r1                     ; user orig_gpr2
                                        ; PT_REGS_PARM1_CORE_SYSCALL
                                        ; fd
   8: (bf) r1 = r10
   9: (07) r1 += -16
  10: (b4) w2 = 8
  11: (bc) w2 = w2
  12: (85) call bpf_probe_read_kernel#-76640 ; (&tmp, 8, &fd)
  13: (b7) r1 = 48
  14: (bf) r3 = r6
  15: (0f) r3 += r1                     ; user gpr3
                                        ; __PT_PARM2_REG
                                        ; uservaddr
  16: (bf) r1 = r10
  17: (07) r1 += -16
  18: (b4) w2 = 8
  19: (bc) w2 = w2
  20: (85) call bpf_probe_read_kernel#-76640 (&tmp, 8, &uservaddr)
  21: (b7) r1 = 56
  22: (0f) r6 += r1                     ; user gpr4 
                                        ; __PT_PARM3_REG
                                        ; addrlen
  23: (79) r7 = *(u64 *)(r10 -16)       ; uservaddr
  24: (bf) r1 = r10
  25: (07) r1 += -16
  26: (b4) w2 = 8
  27: (bc) w2 = w2
  28: (bf) r3 = r6
  29: (85) call bpf_probe_read_kernel#-76640 (&tmp, 8, &addrlen)
  30: (18) r1 = map[id:32][0]+0         ; &old
  32: (b4) w2 = 16
  33: (bc) w2 = w2
  34: (bf) r3 = r7
  35: (85) call bpf_probe_read_user#-76928 (&old, 16, uservaddr)
  36: (18) r1 = 0xabababababababab
  38: (7b) *(u64 *)(r10 -16) = r1       ; memset(&new, 0xab, 16)
  39: (7b) *(u64 *)(r10 -8) = r1
  40: (bf) r2 = r10
  41: (07) r2 += -16
  42: (bf) r1 = r7
  43: (b4) w3 = 16
  44: (bc) w3 = w3
  45: (85) call bpf_probe_write_user#-76352 (uservaddr, &new, 16)
  46: (b4) w0 = 0
  47: (bc) w0 = w0
  48: (95) exit

Best regards,
Ilya

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
  2022-07-15 23:28   ` Ilya Leoshkevich
@ 2022-07-15 23:54     ` Ilya Leoshkevich
  0 siblings, 0 replies; 10+ messages in thread
From: Ilya Leoshkevich @ 2022-07-15 23:54 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, ast, daniel; +Cc: kernel-team, Alan Maguire

On Sat, 2022-07-16 at 01:28 +0200, Ilya Leoshkevich wrote:
> On Thu, 2022-07-14 at 00:07 -0700, Andrii Nakryiko wrote:
> > Convert few selftest that used plain SEC("kprobe") with arch-
> > specific
> > syscall wrapper prefix to ksyscall/kretsyscall and corresponding
> > BPF_KSYSCALL macro. test_probe_user.c is especially benefiting from
> > this
> > simplification.
> > 
> > Tested-by: Alan Maguire <alan.maguire@oracle.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  .../selftests/bpf/progs/bpf_syscall_macro.c   |  6 ++---
> >  .../selftests/bpf/progs/test_attach_probe.c   | 15 +++++------
> >  .../selftests/bpf/progs/test_probe_user.c     | 27 +++++----------
> > --
> > --
> >  3 files changed, 16 insertions(+), 32 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> > b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> > index 05838ed9b89c..e1e11897e99b 100644
> > --- a/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> > +++ b/tools/testing/selftests/bpf/progs/bpf_syscall_macro.c
> > @@ -64,9 +64,9 @@ int BPF_KPROBE(handle_sys_prctl)
> >         return 0;
> >  }
> >  
> > -SEC("kprobe/" SYS_PREFIX "sys_prctl")
> > -int BPF_KPROBE_SYSCALL(prctl_enter, int option, unsigned long
> > arg2,
> > -                      unsigned long arg3, unsigned long arg4,
> > unsigned long arg5)
> > +SEC("ksyscall/prctl")
> > +int BPF_KSYSCALL(prctl_enter, int option, unsigned long arg2,
> > +                unsigned long arg3, unsigned long arg4, unsigned
> > long arg5)
> >  {
> >         pid_t pid = bpf_get_current_pid_tgid() >> 32;
> >  
> > diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c
> > b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> > index f1c88ad368ef..a1e45fec8938 100644
> > --- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
> > +++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
> > @@ -1,11 +1,10 @@
> >  // SPDX-License-Identifier: GPL-2.0
> >  // Copyright (c) 2017 Facebook
> >  
> > -#include <linux/ptrace.h>
> > -#include <linux/bpf.h>
> > +#include "vmlinux.h"
> >  #include <bpf/bpf_helpers.h>
> >  #include <bpf/bpf_tracing.h>
> > -#include <stdbool.h>
> > +#include <bpf/bpf_core_read.h>
> >  #include "bpf_misc.h"
> >  
> >  int kprobe_res = 0;
> > @@ -31,8 +30,8 @@ int handle_kprobe(struct pt_regs *ctx)
> >         return 0;
> >  }
> >  
> > -SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
> > -int BPF_KPROBE(handle_kprobe_auto)
> > +SEC("ksyscall/nanosleep")
> > +int BPF_KSYSCALL(handle_kprobe_auto, struct __kernel_timespec
> > *req,
> > struct __kernel_timespec *rem)
> >  {
> >         kprobe2_res = 11;
> >         return 0;
> > @@ -56,11 +55,11 @@ int handle_kretprobe(struct pt_regs *ctx)
> >         return 0;
> >  }
> >  
> > -SEC("kretprobe/" SYS_PREFIX "sys_nanosleep")
> > -int BPF_KRETPROBE(handle_kretprobe_auto)
> > +SEC("kretsyscall/nanosleep")
> > +int BPF_KRETPROBE(handle_kretprobe_auto, int ret)
> >  {
> >         kretprobe2_res = 22;
> > -       return 0;
> > +       return ret;
> >  }
> >  
> >  SEC("uprobe")
> > diff --git a/tools/testing/selftests/bpf/progs/test_probe_user.c
> > b/tools/testing/selftests/bpf/progs/test_probe_user.c
> > index 702578a5e496..8e1495008e4d 100644
> > --- a/tools/testing/selftests/bpf/progs/test_probe_user.c
> > +++ b/tools/testing/selftests/bpf/progs/test_probe_user.c
> > @@ -1,35 +1,20 @@
> >  // SPDX-License-Identifier: GPL-2.0
> > -
> > -#include <linux/ptrace.h>
> > -#include <linux/bpf.h>
> > -
> > -#include <netinet/in.h>
> > -
> > +#include "vmlinux.h"
> >  #include <bpf/bpf_helpers.h>
> >  #include <bpf/bpf_tracing.h>
> > +#include <bpf/bpf_core_read.h>
> >  #include "bpf_misc.h"
> >  
> >  static struct sockaddr_in old;
> >  
> > -SEC("kprobe/" SYS_PREFIX "sys_connect")
> > -int BPF_KPROBE(handle_sys_connect)
> > +SEC("ksyscall/connect")
> > +int BPF_KSYSCALL(handle_sys_connect, int fd, struct sockaddr_in
> > *uservaddr, int addrlen)
> >  {
> > -#if SYSCALL_WRAPPER == 1
> > -       struct pt_regs *real_regs;
> > -#endif
> >         struct sockaddr_in new;
> > -       void *ptr;
> > -
> > -#if SYSCALL_WRAPPER == 0
> > -       ptr = (void *)PT_REGS_PARM2(ctx);
> > -#else
> > -       real_regs = (struct pt_regs *)PT_REGS_PARM1(ctx);
> > -       bpf_probe_read_kernel(&ptr, sizeof(ptr),
> > &PT_REGS_PARM2(real_regs));
> > -#endif
> >  
> > -       bpf_probe_read_user(&old, sizeof(old), ptr);
> > +       bpf_probe_read_user(&old, sizeof(old), uservaddr);
> >         __builtin_memset(&new, 0xab, sizeof(new));
> > -       bpf_probe_write_user(ptr, &new, sizeof(new));
> > +       bpf_probe_write_user(uservaddr, &new, sizeof(new));
> >  
> >         return 0;
> >  }
> 
> Hi,
> 
> The first two tests succeed, but test_probe_user fails on s390x with:
> 
>     serial_test_probe_user:FAIL:check_kprobe_res
>     wrong kprobe res from probe read: 0.0.0.0:0
> 
> I'm not sure what is causing this, but at least the loaded BPF code
> looks sane:
> 
> # bpftool prog dump xlated id 81
> int handle_sys_connect(struct pt_regs * ctx):
>    0: (bf) r6 = r1                      ; kernel pt_regs
>    1: (18) r1 = map[id:33][0]+0
>    3: (71) r1 = *(u8 *)(r1 +0)
>    4: (79) r6 = *(u64 *)(r6 +40)        ; kernel gpr2
>                                         ; PT_REGS_PARM1
>    5: (b7) r1 = 152
>    6: (bf) r3 = r6
>    7: (0f) r3 += r1                     ; user orig_gpr2
>                                         ; PT_REGS_PARM1_CORE_SYSCALL
>                                         ; fd
>    8: (bf) r1 = r10
>    9: (07) r1 += -16
>   10: (b4) w2 = 8
>   11: (bc) w2 = w2
>   12: (85) call bpf_probe_read_kernel#-76640 ; (&tmp, 8, &fd)
>   13: (b7) r1 = 48
>   14: (bf) r3 = r6
>   15: (0f) r3 += r1                     ; user gpr3
>                                         ; __PT_PARM2_REG
>                                         ; uservaddr
>   16: (bf) r1 = r10
>   17: (07) r1 += -16
>   18: (b4) w2 = 8
>   19: (bc) w2 = w2
>   20: (85) call bpf_probe_read_kernel#-76640 (&tmp, 8, &uservaddr)
>   21: (b7) r1 = 56
>   22: (0f) r6 += r1                     ; user gpr4 
>                                         ; __PT_PARM3_REG
>                                         ; addrlen
>   23: (79) r7 = *(u64 *)(r10 -16)       ; uservaddr
>   24: (bf) r1 = r10
>   25: (07) r1 += -16
>   26: (b4) w2 = 8
>   27: (bc) w2 = w2
>   28: (bf) r3 = r6
>   29: (85) call bpf_probe_read_kernel#-76640 (&tmp, 8, &addrlen)
>   30: (18) r1 = map[id:32][0]+0         ; &old
>   32: (b4) w2 = 16
>   33: (bc) w2 = w2
>   34: (bf) r3 = r7
>   35: (85) call bpf_probe_read_user#-76928 (&old, 16, uservaddr)
>   36: (18) r1 = 0xabababababababab
>   38: (7b) *(u64 *)(r10 -16) = r1       ; memset(&new, 0xab, 16)
>   39: (7b) *(u64 *)(r10 -8) = r1
>   40: (bf) r2 = r10
>   41: (07) r2 += -16
>   42: (bf) r1 = r7
>   43: (b4) w3 = 16
>   44: (bc) w3 = w3
>   45: (85) call bpf_probe_write_user#-76352 (uservaddr, &new, 16)
>   46: (b4) w0 = 0
>   47: (bc) w0 = w0
>   48: (95) exit
> 
> Best regards,
> Ilya

I haven't noticed that this test was already in the s390x blacklist, so
the failure has nothing to do with this series.

The problem is that the BPF code is not called at all, since s390x uses
socketcall multiplexer. I addressed a similar issue in samples/bpf a
while ago:

commit f55f4c349a03d820c27145bdf457013b42e4b487
Author: Ilya Leoshkevich <iii@linux.ibm.com>
Date:   Tue Sep 15 13:55:19 2020 +0200

    samples/bpf: Fix test_map_in_map on s390
    
    s390 uses socketcall multiplexer instead of individual socket
    syscalls.
    Therefore, "kprobe/" SYSCALL(sys_connect) does not trigger and
    test_map_in_map fails. Fix by using "kprobe/__sys_connect" instead.

Would it make sense to add two probes here: one for connect() and
another for socketcall(SYS_CONNECT)?

I also think that this deserves a mention in the list of quirks.

Best regards,
Ilya

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support
  2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2022-07-14 18:31 ` [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support sdf
@ 2022-07-19 16:50 ` patchwork-bot+netdevbpf
  6 siblings, 0 replies; 10+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-07-19 16:50 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, ast, daniel, kernel-team

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Thu, 14 Jul 2022 00:07:50 -0700 you wrote:
> Add SEC("ksyscall")/SEC("kretsyscall") sections and corresponding
> bpf_program__attach_ksyscall() API that simplifies tracing kernel syscalls
> through kprobe mechanism. Kprobing syscalls isn't trivial due to varying
> syscall handler names in the kernel and various ways syscall argument are
> passed, depending on kernel architecture and configuration. SEC("ksyscall")
> allows user to not care about such details and just get access to syscall
> input arguments, while libbpf takes care of necessary feature detection logic.
> 
> [...]

Here is the summary with links:
  - [v2,bpf-next,1/5] libbpf: generalize virtual __kconfig externs and use it for USDT
    https://git.kernel.org/bpf/bpf-next/c/55d00c37ebc3
  - [v2,bpf-next,2/5] selftests/bpf: add test of __weak unknown virtual __kconfig extern
    https://git.kernel.org/bpf/bpf-next/c/ce6dc74a0a4a
  - [v2,bpf-next,3/5] libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL
    https://git.kernel.org/bpf/bpf-next/c/6f5d467d55f0
  - [v2,bpf-next,4/5] libbpf: add ksyscall/kretsyscall sections support for syscall kprobes
    https://git.kernel.org/bpf/bpf-next/c/708ac5bea0ce
  - [v2,bpf-next,5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests
    https://git.kernel.org/bpf/bpf-next/c/d814ed62d3d2

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-07-19 16:50 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-14  7:07 [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support Andrii Nakryiko
2022-07-14  7:07 ` [PATCH v2 bpf-next 1/5] libbpf: generalize virtual __kconfig externs and use it for USDT Andrii Nakryiko
2022-07-14  7:07 ` [PATCH v2 bpf-next 2/5] selftests/bpf: add test of __weak unknown virtual __kconfig extern Andrii Nakryiko
2022-07-14  7:07 ` [PATCH v2 bpf-next 3/5] libbpf: improve BPF_KPROBE_SYSCALL macro and rename it to BPF_KSYSCALL Andrii Nakryiko
2022-07-14  7:07 ` [PATCH v2 bpf-next 4/5] libbpf: add ksyscall/kretsyscall sections support for syscall kprobes Andrii Nakryiko
2022-07-14  7:07 ` [PATCH v2 bpf-next 5/5] selftests/bpf: use BPF_KSYSCALL and SEC("ksyscall") in selftests Andrii Nakryiko
2022-07-15 23:28   ` Ilya Leoshkevich
2022-07-15 23:54     ` Ilya Leoshkevich
2022-07-14 18:31 ` [PATCH v2 bpf-next 0/5] Add SEC("ksyscall") support sdf
2022-07-19 16:50 ` patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.