* [PATCH bpf-next v3 0/5] sleepable uprobe support
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

This series implements support for sleepable uprobe programs.
Key work is in patches 2 and 3, the rest is plumbing and tests.

The main observation is that the obstacle to sleepable uprobe programs
is not the uprobe infrastructure, which already runs in a user-like
context, but rather the rcu usage around bpf_prog_array.

Details are in patch 2 but the tl;dr is that we chain tasks_trace and normal rcu
grace periods when releasing the array, to accommodate users of either rcu type.
This introduces latency for non-sleepable users (kprobe, tp) but that's deemed
acceptable, given recent benchmarks by Andrii [1]. We're a couple of orders of
magnitude under the rate of bpf_prog_array churn that would raise flags (~1M/s per Paul).

  [1]: https://lore.kernel.org/bpf/CAEf4BzbpjN6ca7D9KOTiFPOoBYkciYvTz0UJNp5c-_3ptm=Mrg@mail.gmail.com/
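
A minimal sketch of the chained release, mirroring the real code in
patch 2 (both grace periods are waited out back to back on the free
path; see bpf_prog_array_free_sleepable there):

	static void __bpf_prog_array_free_sleepable_cb(struct rcu_head *rcu)
	{
		struct bpf_prog_array *progs;

		progs = container_of(rcu, struct bpf_prog_array, rcu);
		/* second: a normal rcu grace period before the kfree */
		kfree_rcu(progs, rcu);
	}

	/* first: a tasks_trace grace period */
	call_rcu_tasks_trace(&progs->rcu, __bpf_prog_array_free_sleepable_cb);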

v2 -> v3:
 * Inline uprobe_call_bpf into trace_uprobe.c; it's just a bpf_prog_run_array_sleepable call now.
 * Do not disable preemption for uprobe non-sleepable programs.
 * Add acks.

v1 -> v2:
 * Fix lockdep annotations in bpf_prog_run_array_sleepable
 * Chain rcu grace periods only for perf_event-attached programs. This limits
   the additional latency on the free path to use cases where we know it won't
   be a problem.
 * Add tests calling helpers only available in sleepable programs.
 * Remove kprobe.s support from libbpf.

Delyan Kratunov (5):
  bpf: move bpf_prog to bpf.h
  bpf: implement sleepable uprobes by chaining gps
  bpf: allow sleepable uprobe programs to attach
  libbpf: add support for sleepable uprobe programs
  selftests/bpf: add tests for sleepable (uk)probes

 include/linux/bpf.h                           | 89 +++++++++++++++++++
 include/linux/filter.h                        | 34 -------
 kernel/bpf/core.c                             | 15 ++++
 kernel/bpf/verifier.c                         |  4 +-
 kernel/events/core.c                          | 16 ++--
 kernel/trace/bpf_trace.c                      |  4 +-
 kernel/trace/trace_uprobe.c                   |  5 +-
 tools/lib/bpf/libbpf.c                        |  5 +-
 .../selftests/bpf/prog_tests/attach_probe.c   | 49 +++++++++-
 .../selftests/bpf/progs/test_attach_probe.c   | 60 +++++++++++++
 10 files changed, 232 insertions(+), 49 deletions(-)

--
2.35.3

* [PATCH bpf-next v3 3/5] bpf: allow sleepable uprobe programs to attach
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

uprobe and kprobe programs have the same program type, KPROBE, which is
currently not allowed to load sleepable programs.

To avoid adding a new UPROBE program type, allow sleepable KPROBE
programs to load and defer the is-it-actually-a-uprobe-program check
to attachment time, where there's already validation of the
corresponding perf_event.

A corollary of this patch is that you can now load a sleepable kprobe
program but cannot attach it.
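
A hypothetical user-space illustration of that corollary (object and
program names are made up; the flow matches the selftest added later
in the series):

	bpf_program__set_flags(prog, BPF_F_SLEEPABLE);
	err = bpf_object__load(obj);      /* ok: verifier allows sleepable KPROBE */
	link = bpf_program__attach(prog); /* fails unless the underlying
					   * perf_event is a uprobe */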

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 kernel/bpf/verifier.c |  4 ++--
 kernel/events/core.c  | 16 ++++++++++------
 2 files changed, 12 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 05c1b6656824..ed446ce5ad79 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14583,8 +14583,8 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 	}
 
 	if (prog->aux->sleepable && prog->type != BPF_PROG_TYPE_TRACING &&
-	    prog->type != BPF_PROG_TYPE_LSM) {
-		verbose(env, "Only fentry/fexit/fmod_ret and lsm programs can be sleepable\n");
+	    prog->type != BPF_PROG_TYPE_LSM && prog->type != BPF_PROG_TYPE_KPROBE) {
+		verbose(env, "Only fentry/fexit/fmod_ret, lsm, and kprobe/uprobe programs can be sleepable\n");
 		return -EINVAL;
 	}
 
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 23bb19716ad3..bd09c4152a3b 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -10069,26 +10069,30 @@ static inline bool perf_event_is_tracing(struct perf_event *event)
 int perf_event_set_bpf_prog(struct perf_event *event, struct bpf_prog *prog,
 			    u64 bpf_cookie)
 {
-	bool is_kprobe, is_tracepoint, is_syscall_tp;
+	bool is_kprobe, is_uprobe, is_tracepoint, is_syscall_tp;
 
 	if (!perf_event_is_tracing(event))
 		return perf_event_set_bpf_handler(event, prog, bpf_cookie);
 
-	is_kprobe = event->tp_event->flags & TRACE_EVENT_FL_UKPROBE;
+	is_kprobe = event->tp_event->flags & TRACE_EVENT_FL_KPROBE;
+	is_uprobe = event->tp_event->flags & TRACE_EVENT_FL_UPROBE;
 	is_tracepoint = event->tp_event->flags & TRACE_EVENT_FL_TRACEPOINT;
 	is_syscall_tp = is_syscall_trace_event(event->tp_event);
-	if (!is_kprobe && !is_tracepoint && !is_syscall_tp)
+	if (!is_kprobe && !is_uprobe && !is_tracepoint && !is_syscall_tp)
 		/* bpf programs can only be attached to u/kprobe or tracepoint */
 		return -EINVAL;
 
-	if ((is_kprobe && prog->type != BPF_PROG_TYPE_KPROBE) ||
+	if (((is_kprobe || is_uprobe) && prog->type != BPF_PROG_TYPE_KPROBE) ||
 	    (is_tracepoint && prog->type != BPF_PROG_TYPE_TRACEPOINT) ||
 	    (is_syscall_tp && prog->type != BPF_PROG_TYPE_TRACEPOINT))
 		return -EINVAL;
 
+	if (prog->type == BPF_PROG_TYPE_KPROBE && prog->aux->sleepable && !is_uprobe)
+		/* only uprobe programs are allowed to be sleepable */
+		return -EINVAL;
+
 	/* Kprobe override only works for kprobes, not uprobes. */
-	if (prog->kprobe_override &&
-	    !(event->tp_event->flags & TRACE_EVENT_FL_KPROBE))
+	if (prog->kprobe_override && !is_kprobe)
 		return -EINVAL;
 
 	if (is_tracepoint || is_syscall_tp) {
-- 
2.35.3

* [PATCH bpf-next v3 1/5] bpf: move bpf_prog to bpf.h
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

In order to add a version of bpf_prog_run_array which accesses the
bpf_prog->aux member, bpf_prog needs to be more than a forward
declaration inside bpf.h.

Given that filter.h already includes bpf.h, this merely reorders
the type declarations for filter.h users. bpf.h users now have access to
bpf_prog internals.
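
For illustration, the kind of access the next patch needs (a sketch
with a hypothetical helper name):

	static inline bool prog_is_sleepable(const struct bpf_prog *prog)
	{
		/* prog->aux can only be dereferenced with the full
		 * definition of struct bpf_prog in scope
		 */
		return prog->aux->sleepable;
	}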

Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 include/linux/bpf.h    | 36 ++++++++++++++++++++++++++++++++++++
 include/linux/filter.h | 34 ----------------------------------
 2 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5061ccd8b2dc..b67893b47da4 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -5,6 +5,7 @@
 #define _LINUX_BPF_H 1
 
 #include <uapi/linux/bpf.h>
+#include <uapi/linux/filter.h>
 
 #include <linux/workqueue.h>
 #include <linux/file.h>
@@ -22,6 +23,7 @@
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
 #include <linux/percpu-refcount.h>
+#include <linux/stddef.h>
 #include <linux/bpfptr.h>
 #include <linux/btf.h>
 
@@ -1072,6 +1074,40 @@ struct bpf_prog_aux {
 	};
 };
 
+struct bpf_prog {
+	u16			pages;		/* Number of allocated pages */
+	u16			jited:1,	/* Is our filter JIT'ed? */
+				jit_requested:1,/* archs need to JIT the prog */
+				gpl_compatible:1, /* Is filter GPL compatible? */
+				cb_access:1,	/* Is control block accessed? */
+				dst_needed:1,	/* Do we need dst entry? */
+				blinding_requested:1, /* needs constant blinding */
+				blinded:1,	/* Was blinded */
+				is_func:1,	/* program is a bpf function */
+				kprobe_override:1, /* Do we override a kprobe? */
+				has_callchain_buf:1, /* callchain buffer allocated? */
+				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
+				call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
+				call_get_func_ip:1, /* Do we call get_func_ip() */
+				tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */
+	enum bpf_prog_type	type;		/* Type of BPF program */
+	enum bpf_attach_type	expected_attach_type; /* For some prog types */
+	u32			len;		/* Number of filter blocks */
+	u32			jited_len;	/* Size of jited insns in bytes */
+	u8			tag[BPF_TAG_SIZE];
+	struct bpf_prog_stats __percpu *stats;
+	int __percpu		*active;
+	unsigned int		(*bpf_func)(const void *ctx,
+					    const struct bpf_insn *insn);
+	struct bpf_prog_aux	*aux;		/* Auxiliary fields */
+	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
+	/* Instructions for interpreter */
+	union {
+		DECLARE_FLEX_ARRAY(struct sock_filter, insns);
+		DECLARE_FLEX_ARRAY(struct bpf_insn, insnsi);
+	};
+};
+
 struct bpf_array_aux {
 	/* Programs with direct jumps into programs part of this array. */
 	struct list_head poke_progs;
diff --git a/include/linux/filter.h b/include/linux/filter.h
index ed0c0ff42ad5..d0cbb31b1b4d 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -559,40 +559,6 @@ struct bpf_prog_stats {
 	struct u64_stats_sync syncp;
 } __aligned(2 * sizeof(u64));
 
-struct bpf_prog {
-	u16			pages;		/* Number of allocated pages */
-	u16			jited:1,	/* Is our filter JIT'ed? */
-				jit_requested:1,/* archs need to JIT the prog */
-				gpl_compatible:1, /* Is filter GPL compatible? */
-				cb_access:1,	/* Is control block accessed? */
-				dst_needed:1,	/* Do we need dst entry? */
-				blinding_requested:1, /* needs constant blinding */
-				blinded:1,	/* Was blinded */
-				is_func:1,	/* program is a bpf function */
-				kprobe_override:1, /* Do we override a kprobe? */
-				has_callchain_buf:1, /* callchain buffer allocated? */
-				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
-				call_get_stack:1, /* Do we call bpf_get_stack() or bpf_get_stackid() */
-				call_get_func_ip:1, /* Do we call get_func_ip() */
-				tstamp_type_access:1; /* Accessed __sk_buff->tstamp_type */
-	enum bpf_prog_type	type;		/* Type of BPF program */
-	enum bpf_attach_type	expected_attach_type; /* For some prog types */
-	u32			len;		/* Number of filter blocks */
-	u32			jited_len;	/* Size of jited insns in bytes */
-	u8			tag[BPF_TAG_SIZE];
-	struct bpf_prog_stats __percpu *stats;
-	int __percpu		*active;
-	unsigned int		(*bpf_func)(const void *ctx,
-					    const struct bpf_insn *insn);
-	struct bpf_prog_aux	*aux;		/* Auxiliary fields */
-	struct sock_fprog_kern	*orig_prog;	/* Original BPF program */
-	/* Instructions for interpreter */
-	union {
-		DECLARE_FLEX_ARRAY(struct sock_filter, insns);
-		DECLARE_FLEX_ARRAY(struct bpf_insn, insnsi);
-	};
-};
-
 struct sk_filter {
 	refcount_t	refcnt;
 	struct rcu_head	rcu;
-- 
2.35.3

* [PATCH bpf-next v3 4/5] libbpf: add support for sleepable uprobe programs
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

Add section mappings for u(ret)probe.s programs.
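
With these mappings, a sleepable uprobe program is declared like its
non-sleepable counterpart, just with the .s suffix (binary path and
function name below are placeholders):

	SEC("uprobe.s//proc/self/exe:some_func")
	int handle_sleepable(struct pt_regs *ctx)
	{
		/* free to call sleepable-only helpers, e.g. bpf_copy_from_user() */
		return 0;
	}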

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 tools/lib/bpf/libbpf.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 4867a930628b..54ee6e422f60 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -9005,8 +9005,10 @@ static const struct bpf_sec_def section_defs[] = {
 	SEC_DEF("sk_reuseport",		SK_REUSEPORT, BPF_SK_REUSEPORT_SELECT, SEC_ATTACHABLE | SEC_SLOPPY_PFX),
 	SEC_DEF("kprobe+",		KPROBE,	0, SEC_NONE, attach_kprobe),
 	SEC_DEF("uprobe+",		KPROBE,	0, SEC_NONE, attach_uprobe),
+	SEC_DEF("uprobe.s+",		KPROBE,	0, SEC_SLEEPABLE, attach_uprobe),
 	SEC_DEF("kretprobe+",		KPROBE, 0, SEC_NONE, attach_kprobe),
 	SEC_DEF("uretprobe+",		KPROBE, 0, SEC_NONE, attach_uprobe),
+	SEC_DEF("uretprobe.s+",		KPROBE, 0, SEC_SLEEPABLE, attach_uprobe),
 	SEC_DEF("kprobe.multi+",	KPROBE,	BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi),
 	SEC_DEF("kretprobe.multi+",	KPROBE,	BPF_TRACE_KPROBE_MULTI, SEC_NONE, attach_kprobe_multi),
 	SEC_DEF("usdt+",		KPROBE,	0, SEC_NONE, attach_usdt),
@@ -11282,7 +11284,8 @@ static int attach_uprobe(const struct bpf_program *prog, long cookie, struct bpf
 		break;
 	case 3:
 	case 4:
-		opts.retprobe = strcmp(probe_type, "uretprobe") == 0;
+		opts.retprobe = strcmp(probe_type, "uretprobe") == 0 ||
+				strcmp(probe_type, "uretprobe.s") == 0;
 		if (opts.retprobe && offset != 0) {
 			pr_warn("prog '%s': uretprobes do not support offset specification\n",
 				prog->name);
-- 
2.35.3

* [PATCH bpf-next v3 2/5] bpf: implement sleepable uprobes by chaining gps
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

uprobes work by raising a trap, setting a task flag from within the
interrupt handler, and processing the actual work for the uprobe on the
way back to userspace. As a result, uprobe handlers already execute in a
might_fault/might_sleep context. The primary obstacle to sleepable bpf
uprobe programs is therefore on the bpf side.

Namely, the bpf_prog_array attached to the uprobe is protected by normal
rcu. In order for uprobe bpf programs to become sleepable, it has to be
protected by the tasks_trace rcu flavor instead (and kfree() called after
a corresponding grace period).

Therefore, the free path for bpf_prog_array now chains a tasks_trace and
normal grace periods one after the other.

Users who iterate under a tasks_trace read section are safe, as are
users who iterate under normal rcu read sections (from non-sleepable
locations).

The downside is that the tasks_trace latency affects all perf_event-attached
bpf programs (and not just uprobe ones). This is deemed safe given the
possible attach rates for kprobe/uprobe/tp programs.

Separately, non-sleepable programs need access to dynamically sized
rcu-protected maps, so bpf_prog_run_array_sleepable now conditionally takes
an rcu read section, in addition to the overarching tasks_trace section.
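
In other words, the free path now waits out both reader types in
sequence; a sketch of the pairing this relies on (summary, not new
code):

	sleepable reader:      rcu_read_lock_trace() ... rcu_read_unlock_trace()
	non-sleepable reader:  rcu_read_lock()       ... rcu_read_unlock()

	free path:  call_rcu_tasks_trace()  /* waits out sleepable readers */
	              -> kfree_rcu()        /* then non-sleepable readers  */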

Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 include/linux/bpf.h         | 53 +++++++++++++++++++++++++++++++++++++
 kernel/bpf/core.c           | 15 +++++++++++
 kernel/trace/bpf_trace.c    |  4 +--
 kernel/trace/trace_uprobe.c |  5 ++--
 4 files changed, 72 insertions(+), 5 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index b67893b47da4..77ba90d654e7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -26,6 +26,7 @@
 #include <linux/stddef.h>
 #include <linux/bpfptr.h>
 #include <linux/btf.h>
+#include <linux/rcupdate_trace.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -1360,6 +1361,8 @@ extern struct bpf_empty_prog_array bpf_empty_prog_array;
 
 struct bpf_prog_array *bpf_prog_array_alloc(u32 prog_cnt, gfp_t flags);
 void bpf_prog_array_free(struct bpf_prog_array *progs);
+/* Use when traversal over the bpf_prog_array uses tasks_trace rcu */
+void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs);
 int bpf_prog_array_length(struct bpf_prog_array *progs);
 bool bpf_prog_array_is_empty(struct bpf_prog_array *array);
 int bpf_prog_array_copy_to_user(struct bpf_prog_array *progs,
@@ -1451,6 +1454,56 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
 	return ret;
 }
 
+/**
+ * Notes on RCU design for bpf_prog_arrays containing sleepable programs:
+ *
+ * We use the tasks_trace rcu flavor read section to protect the bpf_prog_array
+ * overall. As a result, we must use the bpf_prog_array_free_sleepable
+ * in order to use the tasks_trace rcu grace period.
+ *
+ * When a non-sleepable program is inside the array, we take the rcu read
+ * section and disable preemption for that program alone, so it can access
+ * rcu-protected dynamically sized maps.
+ */
+static __always_inline u32
+bpf_prog_run_array_sleepable(const struct bpf_prog_array __rcu *array_rcu,
+			     const void *ctx, bpf_prog_run_fn run_prog)
+{
+	const struct bpf_prog_array_item *item;
+	const struct bpf_prog *prog;
+	const struct bpf_prog_array *array;
+	struct bpf_run_ctx *old_run_ctx;
+	struct bpf_trace_run_ctx run_ctx;
+	u32 ret = 1;
+
+	might_fault();
+
+	rcu_read_lock_trace();
+	migrate_disable();
+
+	array = rcu_dereference_check(array_rcu, rcu_read_lock_trace_held());
+	if (unlikely(!array))
+		goto out;
+	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
+	item = &array->items[0];
+	while ((prog = READ_ONCE(item->prog))) {
+		if (!prog->aux->sleepable)
+			rcu_read_lock();
+
+		run_ctx.bpf_cookie = item->bpf_cookie;
+		ret &= run_prog(prog, ctx);
+		item++;
+
+		if (!prog->aux->sleepable)
+			rcu_read_unlock();
+	}
+	bpf_reset_run_ctx(old_run_ctx);
+out:
+	migrate_enable();
+	rcu_read_unlock_trace();
+	return ret;
+}
+
 #ifdef CONFIG_BPF_SYSCALL
 DECLARE_PER_CPU(int, bpf_prog_active);
 extern struct mutex bpf_stats_enabled_mutex;
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 76f68d0a7ae8..9c2175b06b38 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2268,6 +2268,21 @@ void bpf_prog_array_free(struct bpf_prog_array *progs)
 	kfree_rcu(progs, rcu);
 }
 
+static void __bpf_prog_array_free_sleepable_cb(struct rcu_head *rcu)
+{
+	struct bpf_prog_array *progs;
+
+	progs = container_of(rcu, struct bpf_prog_array, rcu);
+	kfree_rcu(progs, rcu);
+}
+
+void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs)
+{
+	if (!progs || progs == &bpf_empty_prog_array.hdr)
+		return;
+	call_rcu_tasks_trace(&progs->rcu, __bpf_prog_array_free_sleepable_cb);
+}
+
 int bpf_prog_array_length(struct bpf_prog_array *array)
 {
 	struct bpf_prog_array_item *item;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 7141ca8a1c2d..f74c53dba64e 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1934,7 +1934,7 @@ int perf_event_attach_bpf_prog(struct perf_event *event,
 	event->prog = prog;
 	event->bpf_cookie = bpf_cookie;
 	rcu_assign_pointer(event->tp_event->prog_array, new_array);
-	bpf_prog_array_free(old_array);
+	bpf_prog_array_free_sleepable(old_array);
 
 unlock:
 	mutex_unlock(&bpf_event_mutex);
@@ -1960,7 +1960,7 @@ void perf_event_detach_bpf_prog(struct perf_event *event)
 		bpf_prog_array_delete_safe(old_array, event->prog);
 	} else {
 		rcu_assign_pointer(event->tp_event->prog_array, new_array);
-		bpf_prog_array_free(old_array);
+		bpf_prog_array_free_sleepable(old_array);
 	}
 
 	bpf_prog_put(event->prog);
diff --git a/kernel/trace/trace_uprobe.c b/kernel/trace/trace_uprobe.c
index 9711589273cd..0282c119b1b2 100644
--- a/kernel/trace/trace_uprobe.c
+++ b/kernel/trace/trace_uprobe.c
@@ -16,6 +16,7 @@
 #include <linux/namei.h>
 #include <linux/string.h>
 #include <linux/rculist.h>
+#include <linux/filter.h>
 
 #include "trace_dynevent.h"
 #include "trace_probe.h"
@@ -1346,9 +1347,7 @@ static void __uprobe_perf_func(struct trace_uprobe *tu,
 	if (bpf_prog_array_valid(call)) {
 		u32 ret;
 
-		preempt_disable();
-		ret = trace_call_bpf(call, regs);
-		preempt_enable();
+		ret = bpf_prog_run_array_sleepable(call->prog_array, regs, bpf_prog_run);
 		if (!ret)
 			return;
 	}
-- 
2.35.3

* [PATCH bpf-next v3 5/5] selftests/bpf: add tests for sleepable (uk)probes
From: Delyan Kratunov @ 2022-05-13  1:22 UTC
  To: daniel, ast, andrii, bpf

Add tests that ensure sleepable uprobe programs work correctly.
Add tests that ensure sleepable kprobe programs cannot attach.

Also add tests that attach both sleepable and non-sleepable
uprobe programs to the same location (i.e. same bpf_prog_array).
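
The sleepable programs exercise helpers that are only available in
sleepable context; the core of the check (mirroring
verify_sleepable_user_copy() in the prog below) is:

	char data[9];

	bpf_copy_from_user(data, sizeof(data), user_ptr); /* sleepable-only helper */
	return bpf_strncmp(data, sizeof(data), "test_data") == 0;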

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 .../selftests/bpf/prog_tests/attach_probe.c   | 49 ++++++++++++++-
 .../selftests/bpf/progs/test_attach_probe.c   | 60 +++++++++++++++++++
 2 files changed, 108 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/attach_probe.c b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
index 08c0601b3e84..0b899d2d8ea7 100644
--- a/tools/testing/selftests/bpf/prog_tests/attach_probe.c
+++ b/tools/testing/selftests/bpf/prog_tests/attach_probe.c
@@ -17,6 +17,14 @@ static void trigger_func2(void)
 	asm volatile ("");
 }
 
+/* attach point for byname sleepable uprobe */
+static void trigger_func3(void)
+{
+	asm volatile ("");
+}
+
+static char test_data[] = "test_data";
+
 void test_attach_probe(void)
 {
 	DECLARE_LIBBPF_OPTS(bpf_uprobe_opts, uprobe_opts);
@@ -49,9 +57,17 @@ void test_attach_probe(void)
 	if (!ASSERT_GE(ref_ctr_offset, 0, "ref_ctr_offset"))
 		return;
 
-	skel = test_attach_probe__open_and_load();
+	skel = test_attach_probe__open();
 	if (!ASSERT_OK_PTR(skel, "skel_open"))
 		return;
+
+	/* sleepable kprobe test case needs flags set before loading */
+	if (!ASSERT_OK(bpf_program__set_flags(skel->progs.handle_kprobe_sleepable,
+		BPF_F_SLEEPABLE), "kprobe_sleepable_flags"))
+		goto cleanup;
+
+	if (!ASSERT_OK(test_attach_probe__load(skel), "skel_load"))
+		goto cleanup;
 	if (!ASSERT_OK_PTR(skel->bss, "check_bss"))
 		goto cleanup;
 
@@ -151,6 +167,30 @@ void test_attach_probe(void)
 	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname2, "attach_uretprobe_byname2"))
 		goto cleanup;
 
+	/* sleepable kprobes should not attach successfully */
+	skel->links.handle_kprobe_sleepable = bpf_program__attach(skel->progs.handle_kprobe_sleepable);
+	if (!ASSERT_ERR_PTR(skel->links.handle_kprobe_sleepable, "attach_kprobe_sleepable"))
+		goto cleanup;
+
+	/* test sleepable uprobe and uretprobe variants */
+	skel->links.handle_uprobe_byname3_sleepable = bpf_program__attach(skel->progs.handle_uprobe_byname3_sleepable);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname3_sleepable, "attach_uprobe_byname3_sleepable"))
+		goto cleanup;
+
+	skel->links.handle_uprobe_byname3 = bpf_program__attach(skel->progs.handle_uprobe_byname3);
+	if (!ASSERT_OK_PTR(skel->links.handle_uprobe_byname3, "attach_uprobe_byname3"))
+		goto cleanup;
+
+	skel->links.handle_uretprobe_byname3_sleepable = bpf_program__attach(skel->progs.handle_uretprobe_byname3_sleepable);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname3_sleepable, "attach_uretprobe_byname3_sleepable"))
+		goto cleanup;
+
+	skel->links.handle_uretprobe_byname3 = bpf_program__attach(skel->progs.handle_uretprobe_byname3);
+	if (!ASSERT_OK_PTR(skel->links.handle_uretprobe_byname3, "attach_uretprobe_byname3"))
+		goto cleanup;
+
+	skel->bss->user_ptr = test_data;
+
 	/* trigger & validate kprobe && kretprobe */
 	usleep(1);
 
@@ -164,6 +204,9 @@ void test_attach_probe(void)
 	/* trigger & validate uprobe attached by name */
 	trigger_func2();
 
+	/* trigger & validate sleepable uprobe attached by name */
+	trigger_func3();
+
 	ASSERT_EQ(skel->bss->kprobe_res, 1, "check_kprobe_res");
 	ASSERT_EQ(skel->bss->kprobe2_res, 11, "check_kprobe_auto_res");
 	ASSERT_EQ(skel->bss->kretprobe_res, 2, "check_kretprobe_res");
@@ -174,6 +217,10 @@ void test_attach_probe(void)
 	ASSERT_EQ(skel->bss->uretprobe_byname_res, 6, "check_uretprobe_byname_res");
 	ASSERT_EQ(skel->bss->uprobe_byname2_res, 7, "check_uprobe_byname2_res");
 	ASSERT_EQ(skel->bss->uretprobe_byname2_res, 8, "check_uretprobe_byname2_res");
+	ASSERT_EQ(skel->bss->uprobe_byname3_sleepable_res, 9, "check_uprobe_byname3_sleepable_res");
+	ASSERT_EQ(skel->bss->uprobe_byname3_res, 10, "check_uprobe_byname3_res");
+	ASSERT_EQ(skel->bss->uretprobe_byname3_sleepable_res, 11, "check_uretprobe_byname3_sleepable_res");
+	ASSERT_EQ(skel->bss->uretprobe_byname3_res, 12, "check_uretprobe_byname3_res");
 
 cleanup:
 	test_attach_probe__destroy(skel);
diff --git a/tools/testing/selftests/bpf/progs/test_attach_probe.c b/tools/testing/selftests/bpf/progs/test_attach_probe.c
index ce9acf4db8d2..75d2b0492122 100644
--- a/tools/testing/selftests/bpf/progs/test_attach_probe.c
+++ b/tools/testing/selftests/bpf/progs/test_attach_probe.c
@@ -5,6 +5,7 @@
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
+#include <stdbool.h>
 #include "bpf_misc.h"
 
 int kprobe_res = 0;
@@ -17,6 +18,11 @@ int uprobe_byname_res = 0;
 int uretprobe_byname_res = 0;
 int uprobe_byname2_res = 0;
 int uretprobe_byname2_res = 0;
+int uprobe_byname3_sleepable_res = 0;
+int uprobe_byname3_res = 0;
+int uretprobe_byname3_sleepable_res = 0;
+int uretprobe_byname3_res = 0;
+void *user_ptr = 0;
 
 SEC("kprobe")
 int handle_kprobe(struct pt_regs *ctx)
@@ -32,6 +38,17 @@ int BPF_KPROBE(handle_kprobe_auto)
 	return 0;
 }
 
+/**
+ * This program will be manually made sleepable on the userspace side
+ * and should thus be unattachable.
+ */
+SEC("kprobe/" SYS_PREFIX "sys_nanosleep")
+int handle_kprobe_sleepable(struct pt_regs *ctx)
+{
+	kprobe_res = 2;
+	return 0;
+}
+
 SEC("kretprobe")
 int handle_kretprobe(struct pt_regs *ctx)
 {
@@ -93,4 +110,47 @@ int handle_uretprobe_byname2(struct pt_regs *ctx)
 	return 0;
 }
 
+static inline bool verify_sleepable_user_copy(void)
+{
+	char data[9];
+
+	bpf_copy_from_user(data, sizeof(data), user_ptr);
+	return bpf_strncmp(data, sizeof(data), "test_data") == 0;
+}
+
+SEC("uprobe.s//proc/self/exe:trigger_func3")
+int handle_uprobe_byname3_sleepable(struct pt_regs *ctx)
+{
+	if (verify_sleepable_user_copy())
+		uprobe_byname3_sleepable_res = 9;
+	return 0;
+}
+
+/**
+ * same target as the uprobe.s above to force sleepable and non-sleepable
+ * programs in the same bpf_prog_array
+ */
+SEC("uprobe//proc/self/exe:trigger_func3")
+int handle_uprobe_byname3(struct pt_regs *ctx)
+{
+	uprobe_byname3_res = 10;
+	return 0;
+}
+
+SEC("uretprobe.s//proc/self/exe:trigger_func3")
+int handle_uretprobe_byname3_sleepable(struct pt_regs *ctx)
+{
+	if (verify_sleepable_user_copy())
+		uretprobe_byname3_sleepable_res = 11;
+	return 0;
+}
+
+SEC("uretprobe//proc/self/exe:trigger_func3")
+int handle_uretprobe_byname3(struct pt_regs *ctx)
+{
+	uretprobe_byname3_res = 12;
+	return 0;
+}
+
+
 char _license[] SEC("license") = "GPL";
-- 
2.35.3

* Re: [PATCH bpf-next v3 2/5] bpf: implement sleepable uprobes by chaining gps
From: Daniel Borkmann @ 2022-05-13 16:00 UTC
  To: Delyan Kratunov, ast, andrii, bpf

On 5/13/22 3:22 AM, Delyan Kratunov wrote:
[...]
>   struct bpf_prog_array *bpf_prog_array_alloc(u32 prog_cnt, gfp_t flags);
>   void bpf_prog_array_free(struct bpf_prog_array *progs);
> +/* Use when traversal over the bpf_prog_array uses tasks_trace rcu */
> +void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs);
>   int bpf_prog_array_length(struct bpf_prog_array *progs);
>   bool bpf_prog_array_is_empty(struct bpf_prog_array *array);
>   int bpf_prog_array_copy_to_user(struct bpf_prog_array *progs,
> @@ -1451,6 +1454,56 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
>   	return ret;
>   }
>   
> +/**
> + * Notes on RCU design for bpf_prog_arrays containing sleepable programs:
> + *
> + * We use the tasks_trace rcu flavor read section to protect the bpf_prog_array
> + * overall. As a result, we must use the bpf_prog_array_free_sleepable
> + * in order to use the tasks_trace rcu grace period.
> + *
> + * When a non-sleepable program is inside the array, we take the rcu read
> + * section and disable preemption for that program alone, so it can access
> + * rcu-protected dynamically sized maps.
> + */

Btw, there are a number of kdoc warnings around your series, pls make sure to
fix or use 'regular' comment:

https://patchwork.hopto.org/static/nipa/641204/12848281/kdoc/stderr
https://patchwork.hopto.org/static/nipa/641204/12848282/kdoc/stderr

I presume otherwise kbuild bot will yell at some point.
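
For example, for the design-notes block quoted above, dropping one
asterisk turns it into a regular comment, which the kernel-doc parser
ignores:

	-/**
	+/*
	  * Notes on RCU design for bpf_prog_arrays containing sleepable programs: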

Thanks,
Daniel

* Re: [PATCH bpf-next v3 2/5] bpf: implement sleepable uprobes by chaining gps
From: Delyan Kratunov @ 2022-05-13 17:05 UTC
  To: daniel, ast, andrii, bpf

On Fri, 2022-05-13 at 18:00 +0200, Daniel Borkmann wrote:
> On 5/13/22 3:22 AM, Delyan Kratunov wrote:
> [...]
> >   struct bpf_prog_array *bpf_prog_array_alloc(u32 prog_cnt, gfp_t flags);
> >   void bpf_prog_array_free(struct bpf_prog_array *progs);
> > +/* Use when traversal over the bpf_prog_array uses tasks_trace rcu */
> > +void bpf_prog_array_free_sleepable(struct bpf_prog_array *progs);
> >   int bpf_prog_array_length(struct bpf_prog_array *progs);
> >   bool bpf_prog_array_is_empty(struct bpf_prog_array *array);
> >   int bpf_prog_array_copy_to_user(struct bpf_prog_array *progs,
> > @@ -1451,6 +1454,56 @@ bpf_prog_run_array(const struct bpf_prog_array *array,
> >   	return ret;
> >   }
> >   
> > +/**
> > + * Notes on RCU design for bpf_prog_arrays containing sleepable programs:
> > + *
> > + * We use the tasks_trace rcu flavor read section to protect the bpf_prog_array
> > + * overall. As a result, we must use the bpf_prog_array_free_sleepable
> > + * in order to use the tasks_trace rcu grace period.
> > + *
> > + * When a non-sleepable program is inside the array, we take the rcu read
> > + * section and disable preemption for that program alone, so it can access
> > + * rcu-protected dynamically sized maps.
> > + */
> 
> Btw, there are a number of kdoc warnings around your series, pls make sure to
> fix or use 'regular' comment:
> 
> https://patchwork.hopto.org/static/nipa/641204/12848281/kdoc/stderr 
> https://patchwork.hopto.org/static/nipa/641204/12848282/kdoc/stderr 

Yeah, I just saw these too; I'll take care of them before the next reroll. Let's see
if there are any other high-level comments first.

-- Delyan
