* [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock
@ 2022-12-15 21:44 Jiri Olsa
  2022-12-15 21:44 ` [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare Jiri Olsa
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Jiri Olsa @ 2022-12-15 21:44 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Florent Revest

hi,
In the last revision Andrii suggested we could have the buffer
provided by bpf_bprintf_prepare [1]. It takes a few more changes,
but it looks like a more compact solution.

v3 changes:
  - added struct to hold return data in bpf_bprintf_prepare
  - fixed bug in bpf_bprintf_cleanup
  - adjusted printk helpers to use the new bpf_bprintf_prepare
    data argument

thanks,
jirka


[1] https://lore.kernel.org/bpf/Y5pgxd9+G2wHROlp@krava/T/#m4b256e9138cdb37cd4477571f32e47a960aad317
---
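
For orientation, the end state of the series is that bpf_bprintf_prepare
hands out its per-CPU buffers through a single struct; a minimal sketch of
the final shape built up across the three patches below (the inline
comments are illustrative, not taken from the patches):

	struct bpf_bprintf_data {
		u32 *bin_args;		/* sanitized args for bstr_printf */
		char *buf;		/* per-CPU output buffer for printk helpers */
		bool get_bin_args;	/* caller asks for bin_args */
		bool get_buf;		/* caller asks for buf */
	};
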
Jiri Olsa (3):
      bpf: Add struct for bin_args arg in bpf_bprintf_prepare
      bpf: Do cleanup in bpf_bprintf_cleanup only when needed
      bpf: Remove trace_printk_lock

 include/linux/bpf.h      | 12 ++++++++++--
 kernel/bpf/helpers.c     | 67 +++++++++++++++++++++++++++++++++++++++----------------------------
 kernel/bpf/verifier.c    |  3 ++-
 kernel/trace/bpf_trace.c | 56 +++++++++++++++++++++++++++-----------------------------
 4 files changed, 78 insertions(+), 60 deletions(-)


* [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare
  2022-12-15 21:44 [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock Jiri Olsa
@ 2022-12-15 21:44 ` Jiri Olsa
  2022-12-17  0:25   ` Yonghong Song
  2022-12-15 21:44 ` [PATCHv3 bpf-next 2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed Jiri Olsa
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2022-12-15 21:44 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Florent Revest

Adding struct bpf_bprintf_data to hold the bin_args argument
for the bpf_bprintf_prepare function.

In the following changes we will add another return argument to
bpf_bprintf_prepare and pass the struct to bpf_bprintf_cleanup
for proper cleanup.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
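
For reference, the resulting calling convention looks roughly like this
(a minimal sketch mirroring the bpf_snprintf call site changed below;
the local names out/out_size/raw_args are illustrative):

	struct bpf_bprintf_data data = {
		.get_bin_args	= true,
	};
	int err;

	/* verify fmt and, since get_bin_args is set, fill data.bin_args
	 * with the sanitized binary representation of the arguments
	 */
	err = bpf_bprintf_prepare(fmt, fmt_size, raw_args, num_args, &data);
	if (err < 0)
		return err;

	err = bstr_printf(out, out_size, fmt, data.bin_args);

	/* takes no argument yet at this point of the series */
	bpf_bprintf_cleanup();
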
 include/linux/bpf.h      |  7 ++++++-
 kernel/bpf/helpers.c     | 24 +++++++++++++-----------
 kernel/bpf/verifier.c    |  3 ++-
 kernel/trace/bpf_trace.c | 34 ++++++++++++++++++++--------------
 4 files changed, 41 insertions(+), 27 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 3de24cfb7a3d..cc390ba32e70 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2796,8 +2796,13 @@ bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
 
+struct bpf_bprintf_data {
+	u32 *bin_args;
+	bool get_bin_args;
+};
+
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
-			u32 **bin_buf, u32 num_args);
+			u32 num_args, struct bpf_bprintf_data *data);
 void bpf_bprintf_cleanup(void);
 
 /* the implementation of the opaque uapi struct bpf_dynptr */
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index af30c6cbd65d..7dbf6bb72cad 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -798,16 +798,16 @@ void bpf_bprintf_cleanup(void)
  * Returns a negative value if fmt is an invalid format string or 0 otherwise.
  *
  * This can be used in two ways:
- * - Format string verification only: when bin_args is NULL
+ * - Format string verification only: when data->get_bin_args is false
  * - Arguments preparation: in addition to the above verification, it writes in
- *   bin_args a binary representation of arguments usable by bstr_printf where
- *   pointers from BPF have been sanitized.
+ *   data->bin_args a binary representation of arguments usable by bstr_printf
+ *   where pointers from BPF have been sanitized.
  *
  * In argument preparation mode, if 0 is returned, safe temporary buffers are
  * allocated and bpf_bprintf_cleanup should be called to free them after use.
  */
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
-			u32 **bin_args, u32 num_args)
+			u32 num_args, struct bpf_bprintf_data *data)
 {
 	char *unsafe_ptr = NULL, *tmp_buf = NULL, *tmp_buf_end, *fmt_end;
 	size_t sizeof_cur_arg, sizeof_cur_ip;
@@ -820,12 +820,12 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 		return -EINVAL;
 	fmt_size = fmt_end - fmt;
 
-	if (bin_args) {
+	if (data->get_bin_args) {
 		if (num_args && try_get_fmt_tmp_buf(&tmp_buf))
 			return -EBUSY;
 
 		tmp_buf_end = tmp_buf + MAX_BPRINTF_BUF_LEN;
-		*bin_args = (u32 *)tmp_buf;
+		data->bin_args = (u32 *)tmp_buf;
 	}
 
 	for (i = 0; i < fmt_size; i++) {
@@ -1026,24 +1026,26 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 }
 
 BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
-	   const void *, data, u32, data_len)
+	   const void *, args, u32, data_len)
 {
+	struct bpf_bprintf_data data = {
+		.get_bin_args	= true,
+	};
 	int err, num_args;
-	u32 *bin_args;
 
 	if (data_len % 8 || data_len > MAX_BPRINTF_VARARGS * 8 ||
-	    (data_len && !data))
+	    (data_len && !args))
 		return -EINVAL;
 	num_args = data_len / 8;
 
 	/* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
 	 * can safely give an unbounded size.
 	 */
-	err = bpf_bprintf_prepare(fmt, UINT_MAX, data, &bin_args, num_args);
+	err = bpf_bprintf_prepare(fmt, UINT_MAX, args, num_args, &data);
 	if (err < 0)
 		return err;
 
-	err = bstr_printf(str, str_size, fmt, bin_args);
+	err = bstr_printf(str, str_size, fmt, data.bin_args);
 
 	bpf_bprintf_cleanup();
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a5255a0dcbb6..faa358b3d5d7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7612,6 +7612,7 @@ static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
 	struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
 	struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
 	struct bpf_map *fmt_map = fmt_reg->map_ptr;
+	struct bpf_bprintf_data data = {};
 	int err, fmt_map_off, num_args;
 	u64 fmt_addr;
 	char *fmt;
@@ -7636,7 +7637,7 @@ static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
 	/* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
 	 * can focus on validating the format specifiers.
 	 */
-	err = bpf_bprintf_prepare(fmt, UINT_MAX, NULL, NULL, num_args);
+	err = bpf_bprintf_prepare(fmt, UINT_MAX, NULL, num_args, &data);
 	if (err < 0)
 		verbose(env, "Invalid format string\n");
 
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3bbd3f0c810c..3e849c3a7cc8 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -378,18 +378,20 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
 	   u64, arg2, u64, arg3)
 {
 	u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
-	u32 *bin_args;
+	struct bpf_bprintf_data data = {
+		.get_bin_args	= true,
+	};
 	static char buf[BPF_TRACE_PRINTK_SIZE];
 	unsigned long flags;
 	int ret;
 
-	ret = bpf_bprintf_prepare(fmt, fmt_size, args, &bin_args,
-				  MAX_TRACE_PRINTK_VARARGS);
+	ret = bpf_bprintf_prepare(fmt, fmt_size, args,
+				  MAX_TRACE_PRINTK_VARARGS, &data);
 	if (ret < 0)
 		return ret;
 
 	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
+	ret = bstr_printf(buf, sizeof(buf), fmt, data.bin_args);
 
 	trace_bpf_trace_printk(buf);
 	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
@@ -427,25 +429,27 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
 	return &bpf_trace_printk_proto;
 }
 
-BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, data,
+BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, args,
 	   u32, data_len)
 {
+	struct bpf_bprintf_data data = {
+		.get_bin_args	= true,
+	};
 	static char buf[BPF_TRACE_PRINTK_SIZE];
 	unsigned long flags;
 	int ret, num_args;
-	u32 *bin_args;
 
 	if (data_len & 7 || data_len > MAX_BPRINTF_VARARGS * 8 ||
-	    (data_len && !data))
+	    (data_len && !args))
 		return -EINVAL;
 	num_args = data_len / 8;
 
-	ret = bpf_bprintf_prepare(fmt, fmt_size, data, &bin_args, num_args);
+	ret = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
 	if (ret < 0)
 		return ret;
 
 	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, bin_args);
+	ret = bstr_printf(buf, sizeof(buf), fmt, data.bin_args);
 
 	trace_bpf_trace_printk(buf);
 	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
@@ -472,21 +476,23 @@ const struct bpf_func_proto *bpf_get_trace_vprintk_proto(void)
 }
 
 BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
-	   const void *, data, u32, data_len)
+	   const void *, args, u32, data_len)
 {
+	struct bpf_bprintf_data data = {
+		.get_bin_args	= true,
+	};
 	int err, num_args;
-	u32 *bin_args;
 
 	if (data_len & 7 || data_len > MAX_BPRINTF_VARARGS * 8 ||
-	    (data_len && !data))
+	    (data_len && !args))
 		return -EINVAL;
 	num_args = data_len / 8;
 
-	err = bpf_bprintf_prepare(fmt, fmt_size, data, &bin_args, num_args);
+	err = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
 	if (err < 0)
 		return err;
 
-	seq_bprintf(m, fmt, bin_args);
+	seq_bprintf(m, fmt, data.bin_args);
 
 	bpf_bprintf_cleanup();
 
-- 
2.38.1



* [PATCHv3 bpf-next 2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed
  2022-12-15 21:44 [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock Jiri Olsa
  2022-12-15 21:44 ` [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare Jiri Olsa
@ 2022-12-15 21:44 ` Jiri Olsa
  2022-12-17  0:25   ` Yonghong Song
  2022-12-15 21:44 ` [PATCHv3 bpf-next 3/3] bpf: Remove trace_printk_lock Jiri Olsa
  2022-12-19 21:10 ` [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock patchwork-bot+netdevbpf
  3 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2022-12-15 21:44 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Florent Revest

Currently we always clean up (decrement) the bpf_bprintf_nest_level
variable in bpf_bprintf_cleanup if it's > 0.

There's a possible scenario where this could cause a problem:
when bpf_bprintf_prepare does not get a bin_args buffer (because
num_args is 0), the following bpf_bprintf_cleanup call still
decrements the bpf_bprintf_nest_level variable, like:

  in task context:
    bpf_bprintf_prepare(num_args != 0) increments 'bpf_bprintf_nest_level = 1'
    -> first irq :
       bpf_bprintf_prepare(num_args == 0)
       bpf_bprintf_cleanup decrements 'bpf_bprintf_nest_level = 0'
    -> second irq:
       bpf_bprintf_prepare(num_args != 0) bpf_bprintf_nest_level = 1
       gets same buffer as task context above

Adding a check to bpf_bprintf_cleanup and doing the real cleanup
only if we got bin_args data in the first place.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
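
In other words (a minimal sketch, local names illustrative): a prepare
call that never took a per-CPU buffer is now paired with a cleanup call
that does nothing, instead of one that decrements a nest level it does
not own:

	struct bpf_bprintf_data data = {
		.get_bin_args	= true,
	};
	int err;

	/* num_args == 0: no per-CPU buffer is grabbed, data.bin_args stays NULL */
	err = bpf_bprintf_prepare(fmt, fmt_size, NULL, 0, &data);
	if (err < 0)
		return err;

	/* data.bin_args is NULL, so this returns early and leaves
	 * bpf_bprintf_nest_level untouched
	 */
	bpf_bprintf_cleanup(&data);
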
 include/linux/bpf.h      |  2 +-
 kernel/bpf/helpers.c     | 16 +++++++++-------
 kernel/trace/bpf_trace.c |  6 +++---
 3 files changed, 13 insertions(+), 11 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cc390ba32e70..656879385fbf 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2803,7 +2803,7 @@ struct bpf_bprintf_data {
 
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 			u32 num_args, struct bpf_bprintf_data *data);
-void bpf_bprintf_cleanup(void);
+void bpf_bprintf_cleanup(struct bpf_bprintf_data *data);
 
 /* the implementation of the opaque uapi struct bpf_dynptr */
 struct bpf_dynptr_kern {
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 7dbf6bb72cad..9cca02e13f2e 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -784,12 +784,14 @@ static int try_get_fmt_tmp_buf(char **tmp_buf)
 	return 0;
 }
 
-void bpf_bprintf_cleanup(void)
+void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
 {
-	if (this_cpu_read(bpf_bprintf_nest_level)) {
-		this_cpu_dec(bpf_bprintf_nest_level);
-		preempt_enable();
-	}
+	if (!data->bin_args)
+		return;
+	if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
+		return;
+	this_cpu_dec(bpf_bprintf_nest_level);
+	preempt_enable();
 }
 
 /*
@@ -1021,7 +1023,7 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 	err = 0;
 out:
 	if (err)
-		bpf_bprintf_cleanup();
+		bpf_bprintf_cleanup(data);
 	return err;
 }
 
@@ -1047,7 +1049,7 @@ BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
 
 	err = bstr_printf(str, str_size, fmt, data.bin_args);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return err + 1;
 }
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 3e849c3a7cc8..2129f7c68bb5 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -396,7 +396,7 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
 	trace_bpf_trace_printk(buf);
 	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return ret;
 }
@@ -454,7 +454,7 @@ BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, args,
 	trace_bpf_trace_printk(buf);
 	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return ret;
 }
@@ -494,7 +494,7 @@ BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
 
 	seq_bprintf(m, fmt, data.bin_args);
 
-	bpf_bprintf_cleanup();
+	bpf_bprintf_cleanup(&data);
 
 	return seq_has_overflowed(m) ? -EOVERFLOW : 0;
 }
-- 
2.38.1



* [PATCHv3 bpf-next 3/3] bpf: Remove trace_printk_lock
  2022-12-15 21:44 [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock Jiri Olsa
  2022-12-15 21:44 ` [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare Jiri Olsa
  2022-12-15 21:44 ` [PATCHv3 bpf-next 2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed Jiri Olsa
@ 2022-12-15 21:44 ` Jiri Olsa
  2022-12-17  0:28   ` Yonghong Song
  2022-12-19 21:10 ` [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock patchwork-bot+netdevbpf
  3 siblings, 1 reply; 8+ messages in thread
From: Jiri Olsa @ 2022-12-15 21:44 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Hao Sun, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
	Florent Revest

Both the bpf_trace_printk and bpf_trace_vprintk helpers use a static
buffer guarded by the trace_printk_lock spin lock.

The spin lock contention causes issues with bpf programs attached to
the contention_begin tracepoint [1] [2].

Andrii suggested we could get rid of the contention by using a trylock,
but we can actually get rid of the spinlock completely by using per-cpu
buffers, the same way as for bin_args in the bpf_bprintf_prepare
function.

Adding a new return 'buf' argument to struct bpf_bprintf_data and making
bpf_bprintf_prepare also return the buffer for the printk helpers.

[1] https://lore.kernel.org/bpf/CACkBjsakT_yWxnSWr4r-0TpPvbKm9-OBmVUhJb7hV3hY8fdCkw@mail.gmail.com/
[2] https://lore.kernel.org/bpf/CACkBjsaCsTovQHFfkqJKto6S4Z8d02ud1D7MPESrHa1cVNNTrw@mail.gmail.com/

Reported-by: Hao Sun <sunhao.th@gmail.com>
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
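
The printk helpers can then ask for both the binary args and the output
buffer in one call and drop the lock entirely; roughly (a minimal sketch
following the bpf_trace_printk change below):

	struct bpf_bprintf_data data = {
		.get_bin_args	= true,
		.get_buf	= true,
	};
	int ret;

	ret = bpf_bprintf_prepare(fmt, fmt_size, args, num_args, &data);
	if (ret < 0)
		return ret;

	/* data.buf is a per-CPU buffer owned until cleanup, no spin lock needed */
	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
	trace_bpf_trace_printk(data.buf);

	bpf_bprintf_cleanup(&data);
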
 include/linux/bpf.h      |  3 +++
 kernel/bpf/helpers.c     | 31 +++++++++++++++++++------------
 kernel/trace/bpf_trace.c | 20 ++++++--------------
 3 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 656879385fbf..5fec2d1be6d7 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2795,10 +2795,13 @@ struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
 #define MAX_BPRINTF_VARARGS		12
+#define MAX_BPRINTF_BUF			1024
 
 struct bpf_bprintf_data {
 	u32 *bin_args;
+	char *buf;
 	bool get_bin_args;
+	bool get_buf;
 };
 
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 9cca02e13f2e..23aa8cf8fd1a 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -756,19 +756,20 @@ static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
 /* Per-cpu temp buffers used by printf-like helpers to store the bprintf binary
  * arguments representation.
  */
-#define MAX_BPRINTF_BUF_LEN	512
+#define MAX_BPRINTF_BIN_ARGS	512
 
 /* Support executing three nested bprintf helper calls on a given CPU */
 #define MAX_BPRINTF_NEST_LEVEL	3
 struct bpf_bprintf_buffers {
-	char tmp_bufs[MAX_BPRINTF_NEST_LEVEL][MAX_BPRINTF_BUF_LEN];
+	char bin_args[MAX_BPRINTF_BIN_ARGS];
+	char buf[MAX_BPRINTF_BUF];
 };
-static DEFINE_PER_CPU(struct bpf_bprintf_buffers, bpf_bprintf_bufs);
+
+static DEFINE_PER_CPU(struct bpf_bprintf_buffers[MAX_BPRINTF_NEST_LEVEL], bpf_bprintf_bufs);
 static DEFINE_PER_CPU(int, bpf_bprintf_nest_level);
 
-static int try_get_fmt_tmp_buf(char **tmp_buf)
+static int try_get_buffers(struct bpf_bprintf_buffers **bufs)
 {
-	struct bpf_bprintf_buffers *bufs;
 	int nest_level;
 
 	preempt_disable();
@@ -778,15 +779,14 @@ static int try_get_fmt_tmp_buf(char **tmp_buf)
 		preempt_enable();
 		return -EBUSY;
 	}
-	bufs = this_cpu_ptr(&bpf_bprintf_bufs);
-	*tmp_buf = bufs->tmp_bufs[nest_level - 1];
+	*bufs = this_cpu_ptr(&bpf_bprintf_bufs[nest_level - 1]);
 
 	return 0;
 }
 
 void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
 {
-	if (!data->bin_args)
+	if (!data->bin_args && !data->buf)
 		return;
 	if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
 		return;
@@ -811,7 +811,9 @@ void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
 int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 			u32 num_args, struct bpf_bprintf_data *data)
 {
+	bool get_buffers = (data->get_bin_args && num_args) || data->get_buf;
 	char *unsafe_ptr = NULL, *tmp_buf = NULL, *tmp_buf_end, *fmt_end;
+	struct bpf_bprintf_buffers *buffers = NULL;
 	size_t sizeof_cur_arg, sizeof_cur_ip;
 	int err, i, num_spec = 0;
 	u64 cur_arg;
@@ -822,14 +824,19 @@ int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
 		return -EINVAL;
 	fmt_size = fmt_end - fmt;
 
-	if (data->get_bin_args) {
-		if (num_args && try_get_fmt_tmp_buf(&tmp_buf))
-			return -EBUSY;
+	if (get_buffers && try_get_buffers(&buffers))
+		return -EBUSY;
 
-		tmp_buf_end = tmp_buf + MAX_BPRINTF_BUF_LEN;
+	if (data->get_bin_args) {
+		if (num_args)
+			tmp_buf = buffers->bin_args;
+		tmp_buf_end = tmp_buf + MAX_BPRINTF_BIN_ARGS;
 		data->bin_args = (u32 *)tmp_buf;
 	}
 
+	if (data->get_buf)
+		data->buf = buffers->buf;
+
 	for (i = 0; i < fmt_size; i++) {
 		if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
 			err = -EINVAL;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 2129f7c68bb5..23ce498bca97 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -369,8 +369,6 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
 	return &bpf_probe_write_user_proto;
 }
 
-static DEFINE_RAW_SPINLOCK(trace_printk_lock);
-
 #define MAX_TRACE_PRINTK_VARARGS	3
 #define BPF_TRACE_PRINTK_SIZE		1024
 
@@ -380,9 +378,8 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
 	u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
 	struct bpf_bprintf_data data = {
 		.get_bin_args	= true,
+		.get_buf	= true,
 	};
-	static char buf[BPF_TRACE_PRINTK_SIZE];
-	unsigned long flags;
 	int ret;
 
 	ret = bpf_bprintf_prepare(fmt, fmt_size, args,
@@ -390,11 +387,9 @@ BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
 	if (ret < 0)
 		return ret;
 
-	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, data.bin_args);
+	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
 
-	trace_bpf_trace_printk(buf);
-	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+	trace_bpf_trace_printk(data.buf);
 
 	bpf_bprintf_cleanup(&data);
 
@@ -434,9 +429,8 @@ BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, args,
 {
 	struct bpf_bprintf_data data = {
 		.get_bin_args	= true,
+		.get_buf	= true,
 	};
-	static char buf[BPF_TRACE_PRINTK_SIZE];
-	unsigned long flags;
 	int ret, num_args;
 
 	if (data_len & 7 || data_len > MAX_BPRINTF_VARARGS * 8 ||
@@ -448,11 +442,9 @@ BPF_CALL_4(bpf_trace_vprintk, char *, fmt, u32, fmt_size, const void *, args,
 	if (ret < 0)
 		return ret;
 
-	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	ret = bstr_printf(buf, sizeof(buf), fmt, data.bin_args);
+	ret = bstr_printf(data.buf, MAX_BPRINTF_BUF, fmt, data.bin_args);
 
-	trace_bpf_trace_printk(buf);
-	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+	trace_bpf_trace_printk(data.buf);
 
 	bpf_bprintf_cleanup(&data);
 
-- 
2.38.1



* Re: [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare
  2022-12-15 21:44 ` [PATCHv3 bpf-next 1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare Jiri Olsa
@ 2022-12-17  0:25   ` Yonghong Song
  0 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2022-12-17  0:25 UTC
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Florent Revest



On 12/15/22 1:44 PM, Jiri Olsa wrote:
> Adding struct bpf_bprintf_data to hold the bin_args argument
> for the bpf_bprintf_prepare function.
> 
> In the following changes we will add another return argument to
> bpf_bprintf_prepare and pass the struct to bpf_bprintf_cleanup
> for proper cleanup.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

Acked-by: Yonghong Song <yhs@fb.com>


* Re: [PATCHv3 bpf-next 2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed
  2022-12-15 21:44 ` [PATCHv3 bpf-next 2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed Jiri Olsa
@ 2022-12-17  0:25   ` Yonghong Song
  0 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2022-12-17  0:25 UTC
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: bpf, Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Stanislav Fomichev, Hao Luo, Florent Revest



On 12/15/22 1:44 PM, Jiri Olsa wrote:
> Currently we always clean up (decrement) the bpf_bprintf_nest_level
> variable in bpf_bprintf_cleanup if it's > 0.
> 
> There's a possible scenario where this could cause a problem:
> when bpf_bprintf_prepare does not get a bin_args buffer (because
> num_args is 0), the following bpf_bprintf_cleanup call still
> decrements the bpf_bprintf_nest_level variable, like:
> 
>    in task context:
>      bpf_bprintf_prepare(num_args != 0) increments 'bpf_bprintf_nest_level = 1'
>      -> first irq :
>         bpf_bprintf_prepare(num_args == 0)
>         bpf_bprintf_cleanup decrements 'bpf_bprintf_nest_level = 0'
>      -> second irq:
>         bpf_bprintf_prepare(num_args != 0) bpf_bprintf_nest_level = 1
>         gets same buffer as task context above
> 
> Adding a check to bpf_bprintf_cleanup and doing the real cleanup
> only if we got bin_args data in the first place.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

Acked-by: Yonghong Song <yhs@fb.com>


* Re: [PATCHv3 bpf-next 3/3] bpf: Remove trace_printk_lock
  2022-12-15 21:44 ` [PATCHv3 bpf-next 3/3] bpf: Remove trace_printk_lock Jiri Olsa
@ 2022-12-17  0:28   ` Yonghong Song
  0 siblings, 0 replies; 8+ messages in thread
From: Yonghong Song @ 2022-12-17  0:28 UTC
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko
  Cc: Hao Sun, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Stanislav Fomichev, Hao Luo,
	Florent Revest



On 12/15/22 1:44 PM, Jiri Olsa wrote:
> Both the bpf_trace_printk and bpf_trace_vprintk helpers use a static
> buffer guarded by the trace_printk_lock spin lock.
> 
> The spin lock contention causes issues with bpf programs attached to
> the contention_begin tracepoint [1] [2].
> 
> Andrii suggested we could get rid of the contention by using a trylock,
> but we can actually get rid of the spinlock completely by using per-cpu
> buffers, the same way as for bin_args in the bpf_bprintf_prepare
> function.
> 
> Adding a new return 'buf' argument to struct bpf_bprintf_data and making
> bpf_bprintf_prepare also return the buffer for the printk helpers.
> 
> [1] https://lore.kernel.org/bpf/CACkBjsakT_yWxnSWr4r-0TpPvbKm9-OBmVUhJb7hV3hY8fdCkw@mail.gmail.com/
> [2] https://lore.kernel.org/bpf/CACkBjsaCsTovQHFfkqJKto6S4Z8d02ud1D7MPESrHa1cVNNTrw@mail.gmail.com/
> 
> Reported-by: Hao Sun <sunhao.th@gmail.com>
> Suggested-by: Andrii Nakryiko <andrii@kernel.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

Ack with a small nit below.

Acked-by: Yonghong Song <yhs@fb.com>

> ---
>   include/linux/bpf.h      |  3 +++
>   kernel/bpf/helpers.c     | 31 +++++++++++++++++++------------
>   kernel/trace/bpf_trace.c | 20 ++++++--------------
>   3 files changed, 28 insertions(+), 26 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 656879385fbf..5fec2d1be6d7 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -2795,10 +2795,13 @@ struct btf_id_set;
>   bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
>   
>   #define MAX_BPRINTF_VARARGS		12
> +#define MAX_BPRINTF_BUF			1024
>   
>   struct bpf_bprintf_data {
>   	u32 *bin_args;
> +	char *buf;
>   	bool get_bin_args;
> +	bool get_buf;
>   };
>   
>   int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 9cca02e13f2e..23aa8cf8fd1a 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -756,19 +756,20 @@ static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
>   /* Per-cpu temp buffers used by printf-like helpers to store the bprintf binary
>    * arguments representation.
>    */
> -#define MAX_BPRINTF_BUF_LEN	512
> +#define MAX_BPRINTF_BIN_ARGS	512
>   
>   /* Support executing three nested bprintf helper calls on a given CPU */
>   #define MAX_BPRINTF_NEST_LEVEL	3
>   struct bpf_bprintf_buffers {
> -	char tmp_bufs[MAX_BPRINTF_NEST_LEVEL][MAX_BPRINTF_BUF_LEN];
> +	char bin_args[MAX_BPRINTF_BIN_ARGS];
> +	char buf[MAX_BPRINTF_BUF];
>   };
> -static DEFINE_PER_CPU(struct bpf_bprintf_buffers, bpf_bprintf_bufs);
> +
> +static DEFINE_PER_CPU(struct bpf_bprintf_buffers[MAX_BPRINTF_NEST_LEVEL], bpf_bprintf_bufs);
>   static DEFINE_PER_CPU(int, bpf_bprintf_nest_level);
>   
> -static int try_get_fmt_tmp_buf(char **tmp_buf)
> +static int try_get_buffers(struct bpf_bprintf_buffers **bufs)
>   {
> -	struct bpf_bprintf_buffers *bufs;
>   	int nest_level;
>   
>   	preempt_disable();
> @@ -778,15 +779,14 @@ static int try_get_fmt_tmp_buf(char **tmp_buf)
>   		preempt_enable();
>   		return -EBUSY;
>   	}
> -	bufs = this_cpu_ptr(&bpf_bprintf_bufs);
> -	*tmp_buf = bufs->tmp_bufs[nest_level - 1];
> +	*bufs = this_cpu_ptr(&bpf_bprintf_bufs[nest_level - 1]);
>   
>   	return 0;
>   }
>   
>   void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
>   {
> -	if (!data->bin_args)
> +	if (!data->bin_args && !data->buf)
>   		return;
>   	if (WARN_ON_ONCE(this_cpu_read(bpf_bprintf_nest_level) == 0))
>   		return;
> @@ -811,7 +811,9 @@ void bpf_bprintf_cleanup(struct bpf_bprintf_data *data)
>   int bpf_bprintf_prepare(char *fmt, u32 fmt_size, const u64 *raw_args,
>   			u32 num_args, struct bpf_bprintf_data *data)
>   {
> +	bool get_buffers = (data->get_bin_args && num_args) || data->get_buf;

We might waste some memory if num_args is 0 here. This is an unlikely
case and it is not worthwhile to optimize for it, so the current
implementation sounds good to me.

>   	char *unsafe_ptr = NULL, *tmp_buf = NULL, *tmp_buf_end, *fmt_end;
> +	struct bpf_bprintf_buffers *buffers = NULL;
>   	size_t sizeof_cur_arg, sizeof_cur_ip;
>   	int err, i, num_spec = 0;
>   	u64 cur_arg;
[...]


* Re: [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock
  2022-12-15 21:44 [PATCHv3 bpf-next 0/3] bpf: Get rid of trace_printk_lock Jiri Olsa
                   ` (2 preceding siblings ...)
  2022-12-15 21:44 ` [PATCHv3 bpf-next 3/3] bpf: Remove trace_printk_lock Jiri Olsa
@ 2022-12-19 21:10 ` patchwork-bot+netdevbpf
  3 siblings, 0 replies; 8+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-12-19 21:10 UTC
  To: Jiri Olsa
  Cc: ast, daniel, andrii, bpf, kafai, songliubraving, yhs,
	john.fastabend, kpsingh, sdf, haoluo, revest

Hello:

This series was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Thu, 15 Dec 2022 22:44:27 +0100 you wrote:
> hi,
> In the last revision Andrii suggested we could have the buffer
> provided by bpf_bprintf_prepare [1]. It takes a few more changes,
> but it looks like a more compact solution.
> 
> v3 changes:
>   - added struct to hold return data in bpf_bprintf_prepare
>   - fixed bug in bpf_bprintf_cleanup
>   - adjusted printk helpers to use the new bpf_bprintf_prepare
>     data argument
> 
> [...]

Here is the summary with links:
  - [PATCHv3,bpf-next,1/3] bpf: Add struct for bin_args arg in bpf_bprintf_prepare
    https://git.kernel.org/bpf/bpf-next/c/78aa1cc94043
  - [PATCHv3,bpf-next,2/3] bpf: Do cleanup in bpf_bprintf_cleanup only when needed
    https://git.kernel.org/bpf/bpf-next/c/f19a4050455a
  - [PATCHv3,bpf-next,3/3] bpf: Remove trace_printk_lock
    https://git.kernel.org/bpf/bpf-next/c/e2bb9e01d589

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



