BPF Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper
@ 2021-03-24  2:22 Florent Revest
  2021-03-24  2:22 ` [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf Florent Revest
                   ` (5 more replies)
  0 siblings, 6 replies; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

We have a usecase where we want to audit symbol names (if available) in
callback registration hooks. (ex: fentry/nf_register_net_hook)

A few months back, I proposed a bpf_kallsyms_lookup series but it was
decided in the reviews that a more generic helper, bpf_snprintf, would
be more useful.

This series implements the helper according to the feedback received in
https://lore.kernel.org/bpf/20201126165748.1748417-1-revest@google.com/T/#u

- A new arg type guarantees the NULL-termination of string arguments and
  lets us pass format strings in only one arg
- A new helper is implemented using that guarantee. Because the format
  string is known at verification time, the format string validation is
  done by the verifier
- To implement a series of tests for bpf_snprintf, the logic for
  marshalling variadic args in a fixed-size array is reworked as per:
https://lore.kernel.org/bpf/20210310015455.1095207-1-revest@chromium.org/T/#u

---
Changes in v2:
- Extracted the format validation/argument sanitization in a generic way
  for all printf-like helpers.
- bpf_snprintf's str_size can now be 0
- bpf_snprintf is now exposed to all BPF program types
- We now preempt_disable when using a per-cpu temporary buffer
- Addressed a few cosmetic changes

Florent Revest (6):
  bpf: Factorize bpf_trace_printk and bpf_seq_printf
  bpf: Add a ARG_PTR_TO_CONST_STR argument type
  bpf: Add a bpf_snprintf helper
  libbpf: Initialize the bpf_seq_printf parameters array field by field
  libbpf: Introduce a BPF_SNPRINTF helper macro
  selftests/bpf: Add a series of tests for bpf_snprintf

 include/linux/bpf.h                           |   7 +
 include/uapi/linux/bpf.h                      |  28 +
 kernel/bpf/helpers.c                          |   2 +
 kernel/bpf/verifier.c                         |  79 +++
 kernel/trace/bpf_trace.c                      | 581 +++++++++---------
 tools/include/uapi/linux/bpf.h                |  28 +
 tools/lib/bpf/bpf_tracing.h                   |  44 +-
 .../selftests/bpf/prog_tests/snprintf.c       |  65 ++
 .../selftests/bpf/progs/test_snprintf.c       |  59 ++
 9 files changed, 604 insertions(+), 289 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 21:53   ` Andrii Nakryiko
  2021-03-24  2:22 ` [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type Florent Revest
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

Two helpers (trace_printk and seq_printf) have very similar
implementations of format string parsing and a third one is coming
(snprintf). To avoid code duplication and make the code easier to
maintain, this moves the operations associated with format string
parsing (validation and argument sanitization) into one generic
function.

Unfortunately, the implementation of the two existing helpers already
drifted quite a bit and unifying them entailed a lot of changes:

- bpf_trace_printk always expected fmt[fmt_size] to be the terminating
  NULL character, this is no longer true, the first 0 is terminating.
- bpf_trace_printk now supports %% (which produces the percentage char).
- bpf_trace_printk now skips width formating fields.
- bpf_trace_printk now supports the X modifier (capital hexadecimal).
- bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
- argument casting on 32 bit has been simplified into one macro and
  using an enum instead of obscure int increments.

- bpf_seq_printf now uses bpf_trace_copy_string instead of
  strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
- bpf_seq_printf now prints longs correctly on 32 bit architectures.

- both were changed to use a global per-cpu tmp buffer instead of one
  stack buffer for trace_printk and 6 small buffers for seq_printf.
- to avoid per-cpu buffer usage conflict, these helpers disable
  preemption while the per-cpu buffer is in use.
- both helpers now support the %ps and %pS specifiers to print symbols.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 kernel/trace/bpf_trace.c | 529 ++++++++++++++++++---------------------
 1 file changed, 244 insertions(+), 285 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0d23755c2747..0fdca94a3c9c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -372,7 +372,7 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
 	return &bpf_probe_write_user_proto;
 }
 
-static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
+static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
 		size_t bufsz)
 {
 	void __user *user_ptr = (__force void __user *)unsafe_ptr;
@@ -382,178 +382,284 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
 	switch (fmt_ptype) {
 	case 's':
 #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
-		if ((unsigned long)unsafe_ptr < TASK_SIZE) {
-			strncpy_from_user_nofault(buf, user_ptr, bufsz);
-			break;
-		}
+		if ((unsigned long)unsafe_ptr < TASK_SIZE)
+			return strncpy_from_user_nofault(buf, user_ptr, bufsz);
 		fallthrough;
 #endif
 	case 'k':
-		strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
-		break;
+		return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
 	case 'u':
-		strncpy_from_user_nofault(buf, user_ptr, bufsz);
-		break;
+		return strncpy_from_user_nofault(buf, user_ptr, bufsz);
 	}
+
+	return -EINVAL;
 }
 
 static DEFINE_RAW_SPINLOCK(trace_printk_lock);
 
-#define BPF_TRACE_PRINTK_SIZE   1024
+enum bpf_printf_mod_type {
+	BPF_PRINTF_INT,
+	BPF_PRINTF_LONG,
+	BPF_PRINTF_LONG_LONG,
+};
 
-static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
-{
-	static char buf[BPF_TRACE_PRINTK_SIZE];
-	unsigned long flags;
-	va_list ap;
-	int ret;
+/* Horrid workaround for getting va_list handling working with different
+ * argument type combinations generically for 32 and 64 bit archs.
+ */
+#define BPF_CAST_FMT_ARG(arg_nb, args, mod)				\
+	((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||			\
+	 (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))	\
+	  ? args[arg_nb]						\
+	  : ((mod[arg_nb] == BPF_PRINTF_LONG ||				\
+	     (mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))	\
+	      ? (long)args[arg_nb]					\
+	      : (u32)args[arg_nb]))
+
+/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
+ */
+#define MAX_PRINTF_BUF_LEN	512
 
-	raw_spin_lock_irqsave(&trace_printk_lock, flags);
-	va_start(ap, fmt);
-	ret = vsnprintf(buf, sizeof(buf), fmt, ap);
-	va_end(ap);
-	/* vsnprintf() will not append null for zero-length strings */
-	if (ret == 0)
-		buf[0] = '\0';
-	trace_bpf_trace_printk(buf);
-	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+struct bpf_printf_buf {
+	char tmp_buf[MAX_PRINTF_BUF_LEN];
+};
+static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
+static DEFINE_PER_CPU(int, bpf_printf_buf_used);
 
-	return ret;
+static void bpf_printf_postamble(void)
+{
+	if (this_cpu_read(bpf_printf_buf_used)) {
+		this_cpu_dec(bpf_printf_buf_used);
+		preempt_enable();
+	}
 }
 
 /*
- * Only limited trace_printk() conversion specifiers allowed:
- * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s
+ * bpf_parse_fmt_str - Generic pass on format strings for printf-like helpers
+ *
+ * Returns a negative value if fmt is an invalid format string or 0 otherwise.
+ *
+ * This can be used in two ways:
+ * - Format string verification only: when final_args and mod are NULL
+ * - Arguments preparation: in addition to the above verification, it writes in
+ *   final_args a copy of raw_args where pointers from BPF have been sanitized
+ *   into pointers safe to use by snprintf. This also writes in the mod array
+ *   the size requirement of each argument, usable by BPF_CAST_FMT_ARG for ex.
+ *
+ * In argument preparation mode, if 0 is returned, safe temporary buffers are
+ * allocated and bpf_printf_postamble should be called to free them after use.
  */
-BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
-	   u64, arg2, u64, arg3)
-{
-	int i, mod[3] = {}, fmt_cnt = 0;
-	char buf[64], fmt_ptype;
-	void *unsafe_ptr = NULL;
-	bool str_seen = false;
-
-	/*
-	 * bpf_check()->check_func_arg()->check_stack_boundary()
-	 * guarantees that fmt points to bpf program stack,
-	 * fmt_size bytes of it were initialized and fmt_size > 0
-	 */
-	if (fmt[--fmt_size] != 0)
-		return -EINVAL;
-
-	/* check format string for allowed specifiers */
-	for (i = 0; i < fmt_size; i++) {
-		if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i]))
-			return -EINVAL;
+int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
+			u64 *final_args, enum bpf_printf_mod_type *mod,
+			u32 num_args)
+{
+	struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
+	int err, i, fmt_cnt = 0, copy_size, used;
+	char *unsafe_ptr = NULL, *tmp_buf = NULL;
+	bool prepare_args = final_args && mod;
+	enum bpf_printf_mod_type current_mod;
+	size_t tmp_buf_len;
+	u64 current_arg;
+	char fmt_ptype;
+
+	for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
+		if ((!isprint(fmt[i]) && !isspace(fmt[i])) ||
+		    !isascii(fmt[i])) {
+			err = -EINVAL;
+			goto out;
+		}
 
 		if (fmt[i] != '%')
 			continue;
 
-		if (fmt_cnt >= 3)
-			return -EINVAL;
+		if (fmt[i + 1] == '%') {
+			i++;
+			continue;
+		}
+
+		if (fmt_cnt >= num_args) {
+			err = -EINVAL;
+			goto out;
+		}
 
 		/* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
 		i++;
-		if (fmt[i] == 'l') {
-			mod[fmt_cnt]++;
+
+		/* skip optional "[0 +-][num]" width formating field */
+		while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
+		       fmt[i] == ' ')
+			i++;
+		if (fmt[i] >= '1' && fmt[i] <= '9') {
 			i++;
-		} else if (fmt[i] == 'p') {
-			mod[fmt_cnt]++;
-			if ((fmt[i + 1] == 'k' ||
-			     fmt[i + 1] == 'u') &&
+			while (fmt[i] >= '0' && fmt[i] <= '9')
+				i++;
+		}
+
+		if (fmt[i] == 'p') {
+			current_mod = BPF_PRINTF_LONG;
+
+			if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') &&
 			    fmt[i + 2] == 's') {
 				fmt_ptype = fmt[i + 1];
 				i += 2;
 				goto fmt_str;
 			}
 
-			if (fmt[i + 1] == 'B') {
-				i++;
+			if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
+			    ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
+			    fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
+			    fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
+				/* just kernel pointers */
+				if (prepare_args)
+					current_arg = raw_args[fmt_cnt];
 				goto fmt_next;
 			}
 
-			/* disallow any further format extensions */
-			if (fmt[i + 1] != 0 &&
-			    !isspace(fmt[i + 1]) &&
-			    !ispunct(fmt[i + 1]))
-				return -EINVAL;
+			/* only support "%pI4", "%pi4", "%pI6" and "%pi6". */
+			if ((fmt[i + 1] != 'i' && fmt[i + 1] != 'I') ||
+			    (fmt[i + 2] != '4' && fmt[i + 2] != '6')) {
+				err = -EINVAL;
+				goto out;
+			}
+
+			i += 2;
+			if (!prepare_args)
+				goto fmt_next;
+
+			if (!tmp_buf) {
+				used = this_cpu_inc_return(bpf_printf_buf_used);
+				if (WARN_ON_ONCE(used > 1)) {
+					this_cpu_dec(bpf_printf_buf_used);
+					return -EBUSY;
+				}
+				preempt_disable();
+				tmp_buf = bufs->tmp_buf;
+				tmp_buf_len = MAX_PRINTF_BUF_LEN;
+			}
+
+			copy_size = (fmt[i + 2] == '4') ? 4 : 16;
+			if (tmp_buf_len < copy_size) {
+				err = -ENOSPC;
+				goto out;
+			}
+
+			unsafe_ptr = (char *)(long)raw_args[fmt_cnt];
+			err = copy_from_kernel_nofault(tmp_buf, unsafe_ptr,
+						       copy_size);
+			if (err < 0)
+				memset(tmp_buf, 0, copy_size);
+			current_arg = (u64)(long)tmp_buf;
+			tmp_buf += copy_size;
+			tmp_buf_len -= copy_size;
 
 			goto fmt_next;
 		} else if (fmt[i] == 's') {
-			mod[fmt_cnt]++;
+			current_mod = BPF_PRINTF_LONG;
 			fmt_ptype = fmt[i];
 fmt_str:
-			if (str_seen)
-				/* allow only one '%s' per fmt string */
-				return -EINVAL;
-			str_seen = true;
-
 			if (fmt[i + 1] != 0 &&
 			    !isspace(fmt[i + 1]) &&
-			    !ispunct(fmt[i + 1]))
-				return -EINVAL;
+			    !ispunct(fmt[i + 1])) {
+				err = -EINVAL;
+				goto out;
+			}
 
-			switch (fmt_cnt) {
-			case 0:
-				unsafe_ptr = (void *)(long)arg1;
-				arg1 = (long)buf;
-				break;
-			case 1:
-				unsafe_ptr = (void *)(long)arg2;
-				arg2 = (long)buf;
-				break;
-			case 2:
-				unsafe_ptr = (void *)(long)arg3;
-				arg3 = (long)buf;
-				break;
+			if (!prepare_args)
+				goto fmt_next;
+
+			if (!tmp_buf) {
+				used = this_cpu_inc_return(bpf_printf_buf_used);
+				if (WARN_ON_ONCE(used > 1)) {
+					this_cpu_dec(bpf_printf_buf_used);
+					return -EBUSY;
+				}
+				preempt_disable();
+				tmp_buf = bufs->tmp_buf;
+				tmp_buf_len = MAX_PRINTF_BUF_LEN;
+			}
+
+			if (!tmp_buf_len) {
+				err = -ENOSPC;
+				goto out;
 			}
 
-			bpf_trace_copy_string(buf, unsafe_ptr, fmt_ptype,
-					sizeof(buf));
+			unsafe_ptr = (char *)(long)raw_args[fmt_cnt];
+			err = bpf_trace_copy_string(tmp_buf, unsafe_ptr,
+						    fmt_ptype, tmp_buf_len);
+			if (err < 0) {
+				tmp_buf[0] = '\0';
+				err = 1;
+			}
+
+			current_arg = (u64)(long)tmp_buf;
+			tmp_buf += err;
+			tmp_buf_len -= err;
+
 			goto fmt_next;
 		}
 
+		current_mod = BPF_PRINTF_INT;
+
 		if (fmt[i] == 'l') {
-			mod[fmt_cnt]++;
+			current_mod = BPF_PRINTF_LONG;
+			i++;
+		}
+		if (fmt[i] == 'l') {
+			current_mod = BPF_PRINTF_LONG_LONG;
 			i++;
 		}
 
-		if (fmt[i] != 'i' && fmt[i] != 'd' &&
-		    fmt[i] != 'u' && fmt[i] != 'x')
-			return -EINVAL;
+		if (fmt[i] != 'i' && fmt[i] != 'd' && fmt[i] != 'u' &&
+		    fmt[i] != 'x' && fmt[i] != 'X') {
+			err = -EINVAL;
+			goto out;
+		}
+
+		if (prepare_args)
+			current_arg = raw_args[fmt_cnt];
 fmt_next:
+		if (prepare_args) {
+			mod[fmt_cnt] = current_mod;
+			final_args[fmt_cnt] = current_arg;
+		}
 		fmt_cnt++;
 	}
 
-/* Horrid workaround for getting va_list handling working with different
- * argument type combinations generically for 32 and 64 bit archs.
- */
-#define __BPF_TP_EMIT()	__BPF_ARG3_TP()
-#define __BPF_TP(...)							\
-	bpf_do_trace_printk(fmt, ##__VA_ARGS__)
-
-#define __BPF_ARG1_TP(...)						\
-	((mod[0] == 2 || (mod[0] == 1 && __BITS_PER_LONG == 64))	\
-	  ? __BPF_TP(arg1, ##__VA_ARGS__)				\
-	  : ((mod[0] == 1 || (mod[0] == 0 && __BITS_PER_LONG == 32))	\
-	      ? __BPF_TP((long)arg1, ##__VA_ARGS__)			\
-	      : __BPF_TP((u32)arg1, ##__VA_ARGS__)))
-
-#define __BPF_ARG2_TP(...)						\
-	((mod[1] == 2 || (mod[1] == 1 && __BITS_PER_LONG == 64))	\
-	  ? __BPF_ARG1_TP(arg2, ##__VA_ARGS__)				\
-	  : ((mod[1] == 1 || (mod[1] == 0 && __BITS_PER_LONG == 32))	\
-	      ? __BPF_ARG1_TP((long)arg2, ##__VA_ARGS__)		\
-	      : __BPF_ARG1_TP((u32)arg2, ##__VA_ARGS__)))
-
-#define __BPF_ARG3_TP(...)						\
-	((mod[2] == 2 || (mod[2] == 1 && __BITS_PER_LONG == 64))	\
-	  ? __BPF_ARG2_TP(arg3, ##__VA_ARGS__)				\
-	  : ((mod[2] == 1 || (mod[2] == 0 && __BITS_PER_LONG == 32))	\
-	      ? __BPF_ARG2_TP((long)arg3, ##__VA_ARGS__)		\
-	      : __BPF_ARG2_TP((u32)arg3, ##__VA_ARGS__)))
-
-	return __BPF_TP_EMIT();
+	err = 0;
+out:
+	bpf_printf_postamble();
+	return err;
+}
+
+#define MAX_TRACE_PRINTK_VARARGS	3
+#define BPF_TRACE_PRINTK_SIZE		1024
+
+BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
+	   u64, arg2, u64, arg3)
+{
+	u64 args[MAX_TRACE_PRINTK_VARARGS] = { arg1, arg2, arg3 };
+	enum bpf_printf_mod_type mod[MAX_TRACE_PRINTK_VARARGS];
+	static char buf[BPF_TRACE_PRINTK_SIZE];
+	unsigned long flags;
+	int ret;
+
+	ret = bpf_printf_preamble(fmt, fmt_size, args, args, mod,
+				  MAX_TRACE_PRINTK_VARARGS);
+	if (ret < 0)
+		return ret;
+
+	ret = snprintf(buf, sizeof(buf), fmt, BPF_CAST_FMT_ARG(0, args, mod),
+		BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod));
+	/* snprintf() will not append null for zero-length strings */
+	if (ret == 0)
+		buf[0] = '\0';
+
+	raw_spin_lock_irqsave(&trace_printk_lock, flags);
+	trace_bpf_trace_printk(buf);
+	raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
+
+	bpf_printf_postamble();
+
+	return ret;
 }
 
 static const struct bpf_func_proto bpf_trace_printk_proto = {
@@ -581,184 +687,37 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
 }
 
 #define MAX_SEQ_PRINTF_VARARGS		12
-#define MAX_SEQ_PRINTF_MAX_MEMCPY	6
-#define MAX_SEQ_PRINTF_STR_LEN		128
-
-struct bpf_seq_printf_buf {
-	char buf[MAX_SEQ_PRINTF_MAX_MEMCPY][MAX_SEQ_PRINTF_STR_LEN];
-};
-static DEFINE_PER_CPU(struct bpf_seq_printf_buf, bpf_seq_printf_buf);
-static DEFINE_PER_CPU(int, bpf_seq_printf_buf_used);
 
 BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
 	   const void *, data, u32, data_len)
 {
-	int err = -EINVAL, fmt_cnt = 0, memcpy_cnt = 0;
-	int i, buf_used, copy_size, num_args;
-	u64 params[MAX_SEQ_PRINTF_VARARGS];
-	struct bpf_seq_printf_buf *bufs;
-	const u64 *args = data;
-
-	buf_used = this_cpu_inc_return(bpf_seq_printf_buf_used);
-	if (WARN_ON_ONCE(buf_used > 1)) {
-		err = -EBUSY;
-		goto out;
-	}
-
-	bufs = this_cpu_ptr(&bpf_seq_printf_buf);
-
-	/*
-	 * bpf_check()->check_func_arg()->check_stack_boundary()
-	 * guarantees that fmt points to bpf program stack,
-	 * fmt_size bytes of it were initialized and fmt_size > 0
-	 */
-	if (fmt[--fmt_size] != 0)
-		goto out;
-
-	if (data_len & 7)
-		goto out;
-
-	for (i = 0; i < fmt_size; i++) {
-		if (fmt[i] == '%') {
-			if (fmt[i + 1] == '%')
-				i++;
-			else if (!data || !data_len)
-				goto out;
-		}
-	}
+	enum bpf_printf_mod_type mod[MAX_SEQ_PRINTF_VARARGS];
+	u64 args[MAX_SEQ_PRINTF_VARARGS];
+	int err, num_args;
 
+	if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
+	    (data_len && !data))
+		return -EINVAL;
 	num_args = data_len / 8;
 
-	/* check format string for allowed specifiers */
-	for (i = 0; i < fmt_size; i++) {
-		/* only printable ascii for now. */
-		if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i])) {
-			err = -EINVAL;
-			goto out;
-		}
-
-		if (fmt[i] != '%')
-			continue;
-
-		if (fmt[i + 1] == '%') {
-			i++;
-			continue;
-		}
-
-		if (fmt_cnt >= MAX_SEQ_PRINTF_VARARGS) {
-			err = -E2BIG;
-			goto out;
-		}
-
-		if (fmt_cnt >= num_args) {
-			err = -EINVAL;
-			goto out;
-		}
-
-		/* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
-		i++;
-
-		/* skip optional "[0 +-][num]" width formating field */
-		while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
-		       fmt[i] == ' ')
-			i++;
-		if (fmt[i] >= '1' && fmt[i] <= '9') {
-			i++;
-			while (fmt[i] >= '0' && fmt[i] <= '9')
-				i++;
-		}
-
-		if (fmt[i] == 's') {
-			void *unsafe_ptr;
-
-			/* try our best to copy */
-			if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
-				err = -E2BIG;
-				goto out;
-			}
-
-			unsafe_ptr = (void *)(long)args[fmt_cnt];
-			err = strncpy_from_kernel_nofault(bufs->buf[memcpy_cnt],
-					unsafe_ptr, MAX_SEQ_PRINTF_STR_LEN);
-			if (err < 0)
-				bufs->buf[memcpy_cnt][0] = '\0';
-			params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt];
-
-			fmt_cnt++;
-			memcpy_cnt++;
-			continue;
-		}
-
-		if (fmt[i] == 'p') {
-			if (fmt[i + 1] == 0 ||
-			    fmt[i + 1] == 'K' ||
-			    fmt[i + 1] == 'x' ||
-			    fmt[i + 1] == 'B') {
-				/* just kernel pointers */
-				params[fmt_cnt] = args[fmt_cnt];
-				fmt_cnt++;
-				continue;
-			}
-
-			/* only support "%pI4", "%pi4", "%pI6" and "%pi6". */
-			if (fmt[i + 1] != 'i' && fmt[i + 1] != 'I') {
-				err = -EINVAL;
-				goto out;
-			}
-			if (fmt[i + 2] != '4' && fmt[i + 2] != '6') {
-				err = -EINVAL;
-				goto out;
-			}
-
-			if (memcpy_cnt >= MAX_SEQ_PRINTF_MAX_MEMCPY) {
-				err = -E2BIG;
-				goto out;
-			}
-
-
-			copy_size = (fmt[i + 2] == '4') ? 4 : 16;
-
-			err = copy_from_kernel_nofault(bufs->buf[memcpy_cnt],
-						(void *) (long) args[fmt_cnt],
-						copy_size);
-			if (err < 0)
-				memset(bufs->buf[memcpy_cnt], 0, copy_size);
-			params[fmt_cnt] = (u64)(long)bufs->buf[memcpy_cnt];
-
-			i += 2;
-			fmt_cnt++;
-			memcpy_cnt++;
-			continue;
-		}
-
-		if (fmt[i] == 'l') {
-			i++;
-			if (fmt[i] == 'l')
-				i++;
-		}
-
-		if (fmt[i] != 'i' && fmt[i] != 'd' &&
-		    fmt[i] != 'u' && fmt[i] != 'x' &&
-		    fmt[i] != 'X') {
-			err = -EINVAL;
-			goto out;
-		}
-
-		params[fmt_cnt] = args[fmt_cnt];
-		fmt_cnt++;
-	}
+	err = bpf_printf_preamble(fmt, fmt_size, data, args, mod, num_args);
+	if (err < 0)
+		return err;
 
 	/* Maximumly we can have MAX_SEQ_PRINTF_VARARGS parameter, just give
 	 * all of them to seq_printf().
 	 */
-	seq_printf(m, fmt, params[0], params[1], params[2], params[3],
-		   params[4], params[5], params[6], params[7], params[8],
-		   params[9], params[10], params[11]);
+	seq_printf(m, fmt, BPF_CAST_FMT_ARG(0, args, mod),
+		BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
+		BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
+		BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
+		BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
+		BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
+		BPF_CAST_FMT_ARG(11, args, mod));
 
-	err = seq_has_overflowed(m) ? -EOVERFLOW : 0;
-out:
-	this_cpu_dec(bpf_seq_printf_buf_used);
-	return err;
+	bpf_printf_postamble();
+
+	return seq_has_overflowed(m) ? -EOVERFLOW : 0;
 }
 
 BTF_ID_LIST_SINGLE(btf_seq_file_ids, struct, seq_file)
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
  2021-03-24  2:22 ` [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 22:23   ` Andrii Nakryiko
  2021-03-24  2:22 ` [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper Florent Revest
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

This type provides the guarantee that an argument is going to be a const
pointer to somewhere in a read-only map value. It also checks that this
pointer is followed by a zero character before the end of the map value.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/bpf.h   |  1 +
 kernel/bpf/verifier.c | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a25730eaa148..7b5319d75b3e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -308,6 +308,7 @@ enum bpf_arg_type {
 	ARG_PTR_TO_PERCPU_BTF_ID,	/* pointer to in-kernel percpu type */
 	ARG_PTR_TO_FUNC,	/* pointer to a bpf program function */
 	ARG_PTR_TO_STACK_OR_NULL,	/* pointer to stack or NULL */
+	ARG_PTR_TO_CONST_STR,	/* pointer to a null terminated read-only string */
 	__BPF_ARG_TYPE_MAX,
 };
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index e26c5170c953..9e03608725b4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4601,6 +4601,7 @@ static const struct bpf_reg_types spin_lock_types = { .types = { PTR_TO_MAP_VALU
 static const struct bpf_reg_types percpu_btf_ptr_types = { .types = { PTR_TO_PERCPU_BTF_ID } };
 static const struct bpf_reg_types func_ptr_types = { .types = { PTR_TO_FUNC } };
 static const struct bpf_reg_types stack_ptr_types = { .types = { PTR_TO_STACK } };
+static const struct bpf_reg_types const_str_ptr_types = { .types = { PTR_TO_MAP_VALUE } };
 
 static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 	[ARG_PTR_TO_MAP_KEY]		= &map_key_value_types,
@@ -4631,6 +4632,7 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 	[ARG_PTR_TO_PERCPU_BTF_ID]	= &percpu_btf_ptr_types,
 	[ARG_PTR_TO_FUNC]		= &func_ptr_types,
 	[ARG_PTR_TO_STACK_OR_NULL]	= &stack_ptr_types,
+	[ARG_PTR_TO_CONST_STR]		= &const_str_ptr_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -4881,6 +4883,42 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		if (err)
 			return err;
 		err = check_ptr_alignment(env, reg, 0, size, true);
+	} else if (arg_type == ARG_PTR_TO_CONST_STR) {
+		struct bpf_map *map = reg->map_ptr;
+		int map_off;
+		u64 map_addr;
+		char *str_ptr;
+
+		if (reg->type != PTR_TO_MAP_VALUE || !map ||
+		    !bpf_map_is_rdonly(map)) {
+			verbose(env, "R%d does not point to a readonly map'\n", regno);
+			return -EACCES;
+		}
+
+		if (!tnum_is_const(reg->var_off)) {
+			verbose(env, "R%d is not a constant address'\n", regno);
+			return -EACCES;
+		}
+
+		if (!map->ops->map_direct_value_addr) {
+			verbose(env, "no direct value access support for this map type\n");
+			return -EACCES;
+		}
+
+		err = check_map_access(env, regno, reg->off,
+				       map->value_size - reg->off, false);
+		if (err)
+			return err;
+
+		map_off = reg->off + reg->var_off.value;
+		err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
+		if (err)
+			return err;
+
+		str_ptr = (char *)(long)(map_addr);
+		if (!strnchr(str_ptr + map_off,
+			     map->value_size - reg->off - map_off, 0))
+			verbose(env, "string is not zero-terminated\n");
 	}
 
 	return err;
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
  2021-03-24  2:22 ` [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf Florent Revest
  2021-03-24  2:22 ` [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 22:55   ` Andrii Nakryiko
  2021-03-24  2:22 ` [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field Florent Revest
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

The implementation takes inspiration from the existing bpf_trace_printk
helper but there are a few differences:

To allow for a large number of format-specifiers, parameters are
provided in an array, like in bpf_seq_printf.

Because the output string takes two arguments and the array of
parameters also takes two arguments, the format string needs to fit in
one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
NULL-terminated read-only map, we don't need a format string length arg.

Because the format-string is known at verification time, we also move
most of the format string validation, currently done in formatting
helper calls, into the verifier logic. This makes debugging easier and
also slightly improves the runtime performance.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 include/linux/bpf.h            |  6 ++++
 include/uapi/linux/bpf.h       | 28 ++++++++++++++++++
 kernel/bpf/helpers.c           |  2 ++
 kernel/bpf/verifier.c          | 41 +++++++++++++++++++++++++++
 kernel/trace/bpf_trace.c       | 52 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 28 ++++++++++++++++++
 6 files changed, 157 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7b5319d75b3e..f3d9c8fa60b3 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
 extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
 extern const struct bpf_func_proto bpf_copy_from_user_proto;
 extern const struct bpf_func_proto bpf_snprintf_btf_proto;
+extern const struct bpf_func_proto bpf_snprintf_proto;
 extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
 extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
@@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
 struct btf_id_set;
 bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
 
+enum bpf_printf_mod_type;
+int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
+			u64 *final_args, enum bpf_printf_mod_type *mod,
+			u32 num_args);
+
 #endif /* _LINUX_BPF_H */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2d3036e292a9..86af61e912c6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4660,6 +4660,33 @@ union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ *	Description
+ *		Outputs a string into the **str** buffer of size **str_size**
+ *		based on a format string stored in a read-only map pointed by
+ *		**fmt**.
+ *
+ *		Each format specifier in **fmt** corresponds to one u64 element
+ *		in the **data** array. For strings and pointers where pointees
+ *		are accessed, only the pointer values are stored in the *data*
+ *		array. The *data_len* is the size of *data* in bytes.
+ *
+ *		Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ *		memory. Reading kernel memory may fail due to either invalid
+ *		address or valid address but requiring a major memory fault. If
+ *		reading kernel memory fails, the string for **%s** will be an
+ *		empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ *		Not returning error to bpf program is consistent with what
+ *		**bpf_trace_printk**\ () does for now.
+ *
+ *	Return
+ *		The strictly positive length of the formatted string, including
+ *		the trailing zero character. If the return value is greater than
+ *		**str_size**, **str** contains a truncated string, guaranteed to
+ *		be zero-terminated.
+ *
+ *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4827,6 +4854,7 @@ union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(snprintf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 074800226327..12f4cfb04fe7 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -750,6 +750,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 		return &bpf_probe_read_kernel_str_proto;
 	case BPF_FUNC_snprintf_btf:
 		return &bpf_snprintf_btf_proto;
+	case BPF_FUNC_snprintf:
+		return &bpf_snprintf_proto;
 	default:
 		return NULL;
 	}
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9e03608725b4..a89599dc51c9 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5729,6 +5729,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
 	return state->acquired_refs ? -EINVAL : 0;
 }
 
+static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
+				   struct bpf_reg_state *regs)
+{
+	struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
+	struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
+	struct bpf_map *fmt_map = fmt_reg->map_ptr;
+	int err, fmt_map_off, num_args;
+	u64 fmt_addr;
+	char *fmt;
+
+	/* data must be an array of u64 so data_len must be a multiple of 8 */
+	if (data_len_reg->var_off.value & 7)
+		return -EINVAL;
+	num_args = data_len_reg->var_off.value / 8;
+
+	/* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
+	 * and map_direct_value_addr is set.
+	 */
+	fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
+	err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
+						  fmt_map_off);
+	if (err)
+		return err;
+	fmt = (char *)fmt_addr + fmt_map_off;
+
+	/* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
+	 * can focus on validating the format specifiers.
+	 */
+	err = bpf_printf_preamble(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
+	if (err < 0)
+		verbose(env, "Invalid format string\n");
+
+	return err;
+}
+
 static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
 			     int *insn_idx_p)
 {
@@ -5843,6 +5878,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 			return -EINVAL;
 	}
 
+	if (func_id == BPF_FUNC_snprintf) {
+		err = check_bpf_snprintf_call(env, regs);
+		if (err < 0)
+			return err;
+	}
+
 	/* reset caller saved regs */
 	for (i = 0; i < CALLER_SAVED_REGS; i++) {
 		mark_reg_not_init(env, regs, caller_saved[i]);
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 0fdca94a3c9c..15cbc8b63206 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1230,6 +1230,56 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = {
 	.arg5_type	= ARG_ANYTHING,
 };
 
+#define MAX_SNPRINTF_VARARGS		12
+
+BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
+	   const void *, data, u32, data_len)
+{
+	enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
+	u64 args[MAX_SNPRINTF_VARARGS];
+	int err, num_args;
+
+	if (data_len & 7 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
+	    (data_len && !data))
+		return -EINVAL;
+	num_args = data_len / 8;
+
+	/* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
+	 * can safely give an unbounded size.
+	 */
+	err = bpf_printf_preamble(fmt, UINT_MAX, data, args, mod, num_args);
+	if (err < 0)
+		return err;
+
+	/* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
+	 * all of them to snprintf().
+	 */
+	err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
+		BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
+		BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
+		BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
+		BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
+		BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
+		BPF_CAST_FMT_ARG(11, args, mod));
+	if (str_size)
+		str[str_size - 1] = '\0';
+
+	bpf_printf_postamble();
+
+	return err + 1;
+}
+
+const struct bpf_func_proto bpf_snprintf_proto = {
+	.func		= bpf_snprintf,
+	.gpl_only	= true,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_PTR_TO_MEM,
+	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
+	.arg3_type	= ARG_PTR_TO_CONST_STR,
+	.arg4_type	= ARG_PTR_TO_MEM,
+	.arg5_type	= ARG_CONST_SIZE_OR_ZERO,
+};
+
 const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1332,6 +1382,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_task_storage_delete_proto;
 	case BPF_FUNC_for_each_map_elem:
 		return &bpf_for_each_map_elem_proto;
+	case BPF_FUNC_snprintf:
+		return &bpf_snprintf_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2d3036e292a9..86af61e912c6 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4660,6 +4660,33 @@ union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
+ *	Description
+ *		Outputs a string into the **str** buffer of size **str_size**
+ *		based on a format string stored in a read-only map pointed by
+ *		**fmt**.
+ *
+ *		Each format specifier in **fmt** corresponds to one u64 element
+ *		in the **data** array. For strings and pointers where pointees
+ *		are accessed, only the pointer values are stored in the *data*
+ *		array. The *data_len* is the size of *data* in bytes.
+ *
+ *		Formats **%s** and **%p{i,I}{4,6}** require to read kernel
+ *		memory. Reading kernel memory may fail due to either invalid
+ *		address or valid address but requiring a major memory fault. If
+ *		reading kernel memory fails, the string for **%s** will be an
+ *		empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
+ *		Not returning error to bpf program is consistent with what
+ *		**bpf_trace_printk**\ () does for now.
+ *
+ *	Return
+ *		The strictly positive length of the formatted string, including
+ *		the trailing zero character. If the return value is greater than
+ *		**str_size**, **str** contains a truncated string, guaranteed to
+ *		be zero-terminated.
+ *
+ *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4827,6 +4854,7 @@ union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(snprintf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
                   ` (2 preceding siblings ...)
  2021-03-24  2:22 ` [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 23:01   ` Andrii Nakryiko
  2021-03-24  2:22 ` [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro Florent Revest
  2021-03-24  2:22 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf Florent Revest
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

When initializing the __param array with a one liner, if all args are
const, the initial array value will be placed in the rodata section but
because libbpf does not support relocation in the rodata section, any
pointer in this array will stay NULL.

Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/lib/bpf/bpf_tracing.h | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index f9ef37707888..d9a4c3f77ff4 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -413,6 +413,22 @@ typeof(name(0)) name(struct pt_regs *ctx)				    \
 }									    \
 static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 
+#define ___bpf_fill0(arr, p, x)
+#define ___bpf_fill1(arr, p, x) arr[p] = x
+#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
+#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
+#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
+#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
+#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
+#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
+#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
+#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
+#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
+#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
+#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
+#define ___bpf_fill(arr, args...) \
+	___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
+
 /*
  * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
  * in a structure.
@@ -421,12 +437,14 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 	({								    \
 		_Pragma("GCC diagnostic push")				    \
 		_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	    \
+		unsigned long long ___param[___bpf_narg(args)];		    \
 		static const char ___fmt[] = fmt;			    \
-		unsigned long long ___param[] = { args };		    \
+		int __ret;						    \
+		___bpf_fill(___param, args);				    \
 		_Pragma("GCC diagnostic pop")				    \
-		int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),    \
-					    ___param, sizeof(___param));    \
-		___ret;							    \
+		__ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),	    \
+				       ___param, sizeof(___param));	    \
+		__ret;							    \
 	})
 
 #endif
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
                   ` (3 preceding siblings ...)
  2021-03-24  2:22 ` [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 23:02   ` Andrii Nakryiko
  2021-03-24  2:22 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf Florent Revest
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
array of u64, making it more natural to call the bpf_snprintf helper.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/lib/bpf/bpf_tracing.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index d9a4c3f77ff4..e5c6ede6060b 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
 		__ret;							    \
 	})
 
+/*
+ * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of
+ * an array of u64.
+ */
+#define BPF_SNPRINTF(out, out_size, fmt, args...)			    \
+	({								    \
+		_Pragma("GCC diagnostic push")				    \
+		_Pragma("GCC diagnostic ignored \"-Wint-conversion\"")	    \
+		unsigned long long ___param[___bpf_narg(args)];		    \
+		static const char ___fmt[] = fmt;			    \
+		int __ret;						    \
+		___bpf_fill(___param, args);				    \
+		_Pragma("GCC diagnostic pop")				    \
+		__ret = bpf_snprintf(out, out_size, ___fmt,		    \
+				     ___param, sizeof(___param));	    \
+		__ret;							    \
+	})
+
 #endif
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf
  2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
                   ` (4 preceding siblings ...)
  2021-03-24  2:22 ` [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro Florent Revest
@ 2021-03-24  2:22 ` Florent Revest
  2021-03-26 23:05   ` Andrii Nakryiko
  5 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-03-24  2:22 UTC (permalink / raw)
  To: bpf
  Cc: ast, daniel, andrii, yhs, kpsingh, jackmanb, linux-kernel,
	Florent Revest

This exercises most of the format specifiers when things go well.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 .../selftests/bpf/prog_tests/snprintf.c       | 65 +++++++++++++++++++
 .../selftests/bpf/progs/test_snprintf.c       | 59 +++++++++++++++++
 2 files changed, 124 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c

diff --git a/tools/testing/selftests/bpf/prog_tests/snprintf.c b/tools/testing/selftests/bpf/prog_tests/snprintf.c
new file mode 100644
index 000000000000..948a05e6b2cb
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/snprintf.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include <test_progs.h>
+#include "test_snprintf.skel.h"
+
+#define EXP_NUM_OUT  "-8 9 96 -424242 1337 DABBAD00"
+#define EXP_NUM_RET  sizeof(EXP_NUM_OUT)
+
+#define EXP_IP_OUT   "127.000.000.001 0000:0000:0000:0000:0000:0000:0000:0001"
+#define EXP_IP_RET   sizeof(EXP_IP_OUT)
+
+/* The third specifier, %pB, depends on compiler inlining so don't check it */
+#define EXP_SYM_OUT  "schedule schedule+0x0/"
+#define MIN_SYM_RET  sizeof(EXP_SYM_OUT)
+
+/* The third specifier, %p, is a hashed pointer which changes on every reboot */
+#define EXP_ADDR_OUT "0000000000000000 ffff00000add4e55 "
+#define EXP_ADDR_RET sizeof(EXP_ADDR_OUT "unknownhashedptr")
+
+#define EXP_STR_OUT  "str1 longstr"
+#define EXP_STR_RET  sizeof(EXP_STR_OUT)
+
+#define EXP_OVER_OUT "%over"
+#define EXP_OVER_RET 10
+
+void test_snprintf(void)
+{
+	char exp_addr_out[] = EXP_ADDR_OUT;
+	char exp_sym_out[]  = EXP_SYM_OUT;
+	struct test_snprintf *skel;
+
+	skel = test_snprintf__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel_open"))
+		return;
+
+	if (!ASSERT_OK(test_snprintf__attach(skel), "skel_attach"))
+		goto cleanup;
+
+	/* trigger tracepoint */
+	usleep(1);
+
+	ASSERT_STREQ(skel->bss->num_out, EXP_NUM_OUT, "num_out");
+	ASSERT_EQ(skel->bss->num_ret, EXP_NUM_RET, "num_ret");
+
+	ASSERT_STREQ(skel->bss->ip_out, EXP_IP_OUT, "ip_out");
+	ASSERT_EQ(skel->bss->ip_ret, EXP_IP_RET, "ip_ret");
+
+	ASSERT_OK(memcmp(skel->bss->sym_out, exp_sym_out,
+			 sizeof(exp_sym_out) - 1), "sym_out");
+	ASSERT_LT(MIN_SYM_RET, skel->bss->sym_ret, "sym_ret");
+
+	ASSERT_OK(memcmp(skel->bss->addr_out, exp_addr_out,
+			 sizeof(exp_addr_out) - 1), "addr_out");
+	ASSERT_EQ(skel->bss->addr_ret, EXP_ADDR_RET, "addr_ret");
+
+	ASSERT_STREQ(skel->bss->str_out, EXP_STR_OUT, "str_out");
+	ASSERT_EQ(skel->bss->str_ret, EXP_STR_RET, "str_ret");
+
+	ASSERT_STREQ(skel->bss->over_out, EXP_OVER_OUT, "over_out");
+	ASSERT_EQ(skel->bss->over_ret, EXP_OVER_RET, "over_ret");
+
+cleanup:
+	test_snprintf__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_snprintf.c b/tools/testing/selftests/bpf/progs/test_snprintf.c
new file mode 100644
index 000000000000..e18709055fad
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_snprintf.c
@@ -0,0 +1,59 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Google LLC. */
+
+#include <linux/bpf.h>
+#include <bpf/bpf_endian.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+
+char num_out[64] = {};
+long num_ret = 0;
+
+char ip_out[64] = {};
+long ip_ret = 0;
+
+char sym_out[64] = {};
+long sym_ret = 0;
+
+char addr_out[64] = {};
+long addr_ret = 0;
+
+char str_out[64] = {};
+long str_ret = 0;
+
+char over_out[6] = {};
+long over_ret = 0;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+	/* Convenient values to pretty-print */
+	const __u8 ex_ipv4[] = {127, 0, 0, 1};
+	const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
+	const char str1[] = "str1";
+	const char longstr[] = "longstr";
+	extern const void schedule __ksym;
+
+	/* Integer types */
+	num_ret  = BPF_SNPRINTF(num_out, sizeof(num_out),
+				"%d %u %x %li %llu %lX",
+				-8, 9, 150, -424242, 1337, 0xDABBAD00);
+	/* IP addresses */
+	ip_ret   = BPF_SNPRINTF(ip_out, sizeof(ip_out), "%pi4 %pI6",
+				&ex_ipv4, &ex_ipv6);
+	/* Symbol lookup formatting */
+	sym_ret  = BPF_SNPRINTF(sym_out,  sizeof(sym_out), "%ps %pS %pB",
+				&schedule, &schedule, &schedule);
+	/* Kernel pointers */
+	addr_ret = BPF_SNPRINTF(addr_out, sizeof(addr_out), "%pK %px %p",
+				0, 0xFFFF00000ADD4E55, 0xFFFF00000ADD4E55);
+	/* Strings embedding */
+	str_ret  = BPF_SNPRINTF(str_out, sizeof(str_out), "%s %+05s",
+				str1, longstr);
+	/* Overflow */
+	over_ret = BPF_SNPRINTF(over_out, sizeof(over_out), "%%overflow");
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.31.0.291.g576ba9dcdaf-goog


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-03-24  2:22 ` [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf Florent Revest
@ 2021-03-26 21:53   ` Andrii Nakryiko
  2021-03-26 22:51     ` Andrii Nakryiko
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 21:53 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> Two helpers (trace_printk and seq_printf) have very similar
> implementations of format string parsing and a third one is coming
> (snprintf). To avoid code duplication and make the code easier to
> maintain, this moves the operations associated with format string
> parsing (validation and argument sanitization) into one generic
> function.
>
> Unfortunately, the implementation of the two existing helpers already
> drifted quite a bit and unifying them entailed a lot of changes:

"Unfortunately" as in a lot of extra work for you? I think overall
though it was very fortunate that you ended up doing it, all
implementations are more feature-complete and saner now, no? Thanks a
lot for your hard work!

>
> - bpf_trace_printk always expected fmt[fmt_size] to be the terminating
>   NULL character, this is no longer true, the first 0 is terminating.

You mean if you had bpf_trace_printk("bla bla\0some more bla\0", 24)
it would emit that zero character? If yes, I don't think it was a sane
behavior anyways.

> - bpf_trace_printk now supports %% (which produces the percentage char).
> - bpf_trace_printk now skips width formating fields.
> - bpf_trace_printk now supports the X modifier (capital hexadecimal).
> - bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
> - argument casting on 32 bit has been simplified into one macro and
>   using an enum instead of obscure int increments.
>
> - bpf_seq_printf now uses bpf_trace_copy_string instead of
>   strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
> - bpf_seq_printf now prints longs correctly on 32 bit architectures.
>
> - both were changed to use a global per-cpu tmp buffer instead of one
>   stack buffer for trace_printk and 6 small buffers for seq_printf.
> - to avoid per-cpu buffer usage conflict, these helpers disable
>   preemption while the per-cpu buffer is in use.
> - both helpers now support the %ps and %pS specifiers to print symbols.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---

This is great, you already saved some lines of code! I suspect I'll
have some complaints about mods (it feels like this preample should
provide extra information about which arguments have to be read from
kernel/user memory, but I'll see next patches first.

See my comments below (I deliberately didn't trim most of the code for
easier jumping around), but it's great overall, thanks!

>  kernel/trace/bpf_trace.c | 529 ++++++++++++++++++---------------------
>  1 file changed, 244 insertions(+), 285 deletions(-)
>
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 0d23755c2747..0fdca94a3c9c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -372,7 +372,7 @@ static const struct bpf_func_proto *bpf_get_probe_write_proto(void)
>         return &bpf_probe_write_user_proto;
>  }
>
> -static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
> +static int bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
>                 size_t bufsz)
>  {
>         void __user *user_ptr = (__force void __user *)unsafe_ptr;
> @@ -382,178 +382,284 @@ static void bpf_trace_copy_string(char *buf, void *unsafe_ptr, char fmt_ptype,
>         switch (fmt_ptype) {
>         case 's':
>  #ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
> -               if ((unsigned long)unsafe_ptr < TASK_SIZE) {
> -                       strncpy_from_user_nofault(buf, user_ptr, bufsz);
> -                       break;
> -               }
> +               if ((unsigned long)unsafe_ptr < TASK_SIZE)
> +                       return strncpy_from_user_nofault(buf, user_ptr, bufsz);
>                 fallthrough;
>  #endif
>         case 'k':
> -               strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
> -               break;
> +               return strncpy_from_kernel_nofault(buf, unsafe_ptr, bufsz);
>         case 'u':
> -               strncpy_from_user_nofault(buf, user_ptr, bufsz);
> -               break;
> +               return strncpy_from_user_nofault(buf, user_ptr, bufsz);
>         }
> +
> +       return -EINVAL;
>  }
>
>  static DEFINE_RAW_SPINLOCK(trace_printk_lock);
>
> -#define BPF_TRACE_PRINTK_SIZE   1024
> +enum bpf_printf_mod_type {
> +       BPF_PRINTF_INT,
> +       BPF_PRINTF_LONG,
> +       BPF_PRINTF_LONG_LONG,
> +};
>
> -static __printf(1, 0) int bpf_do_trace_printk(const char *fmt, ...)
> -{
> -       static char buf[BPF_TRACE_PRINTK_SIZE];
> -       unsigned long flags;
> -       va_list ap;
> -       int ret;
> +/* Horrid workaround for getting va_list handling working with different
> + * argument type combinations generically for 32 and 64 bit archs.
> + */
> +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                            \
> +       ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                        \
> +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))     \
> +         ? args[arg_nb]                                                \
> +         : ((mod[arg_nb] == BPF_PRINTF_LONG ||                         \
> +            (mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \

is this right? INT is always 32-bit, it's only LONG that differs.
Shouldn't the rule be

(LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
(INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]

Does (long) cast do anything fancy when casting from u64? Sorry, maybe
I'm confused.


> +             ? (long)args[arg_nb]                                      \
> +             : (u32)args[arg_nb]))
> +
> +/* Per-cpu temp buffers which can be used by printf-like helpers for %s or %p
> + */
> +#define MAX_PRINTF_BUF_LEN     512
>
> -       raw_spin_lock_irqsave(&trace_printk_lock, flags);
> -       va_start(ap, fmt);
> -       ret = vsnprintf(buf, sizeof(buf), fmt, ap);
> -       va_end(ap);
> -       /* vsnprintf() will not append null for zero-length strings */
> -       if (ret == 0)
> -               buf[0] = '\0';
> -       trace_bpf_trace_printk(buf);
> -       raw_spin_unlock_irqrestore(&trace_printk_lock, flags);
> +struct bpf_printf_buf {
> +       char tmp_buf[MAX_PRINTF_BUF_LEN];
> +};
> +static DEFINE_PER_CPU(struct bpf_printf_buf, bpf_printf_buf);
> +static DEFINE_PER_CPU(int, bpf_printf_buf_used);
>
> -       return ret;
> +static void bpf_printf_postamble(void)
> +{
> +       if (this_cpu_read(bpf_printf_buf_used)) {
> +               this_cpu_dec(bpf_printf_buf_used);
> +               preempt_enable();
> +       }
>  }
>
>  /*
> - * Only limited trace_printk() conversion specifiers allowed:
> - * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pB %pks %pus %s
> + * bpf_parse_fmt_str - Generic pass on format strings for printf-like helpers
> + *
> + * Returns a negative value if fmt is an invalid format string or 0 otherwise.
> + *
> + * This can be used in two ways:
> + * - Format string verification only: when final_args and mod are NULL
> + * - Arguments preparation: in addition to the above verification, it writes in
> + *   final_args a copy of raw_args where pointers from BPF have been sanitized
> + *   into pointers safe to use by snprintf. This also writes in the mod array
> + *   the size requirement of each argument, usable by BPF_CAST_FMT_ARG for ex.
> + *
> + * In argument preparation mode, if 0 is returned, safe temporary buffers are
> + * allocated and bpf_printf_postamble should be called to free them after use.
>   */
> -BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
> -          u64, arg2, u64, arg3)
> -{
> -       int i, mod[3] = {}, fmt_cnt = 0;
> -       char buf[64], fmt_ptype;
> -       void *unsafe_ptr = NULL;
> -       bool str_seen = false;
> -
> -       /*
> -        * bpf_check()->check_func_arg()->check_stack_boundary()
> -        * guarantees that fmt points to bpf program stack,
> -        * fmt_size bytes of it were initialized and fmt_size > 0
> -        */
> -       if (fmt[--fmt_size] != 0)
> -               return -EINVAL;
> -
> -       /* check format string for allowed specifiers */
> -       for (i = 0; i < fmt_size; i++) {
> -               if ((!isprint(fmt[i]) && !isspace(fmt[i])) || !isascii(fmt[i]))
> -                       return -EINVAL;
> +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> +                       u32 num_args)
> +{
> +       struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> +       int err, i, fmt_cnt = 0, copy_size, used;
> +       char *unsafe_ptr = NULL, *tmp_buf = NULL;
> +       bool prepare_args = final_args && mod;

probably better to enforce that both or none are specified, otherwise
return error

> +       enum bpf_printf_mod_type current_mod;
> +       size_t tmp_buf_len;
> +       u64 current_arg;
> +       char fmt_ptype;
> +
> +       for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {

Can we say that if the last character is not '\0' then it's a bad
format string and return -EINVAL? And if \0 is inside the format
string, then it's also a bad format string? I wonder what others think
about this?... I think sanity should prevail.

> +               if ((!isprint(fmt[i]) && !isspace(fmt[i])) ||
> +                   !isascii(fmt[i])) {

&& always binds tighter than ||, so you can omit extra (). I'd put
this on a single line as well, but that's a total nit.

> +                       err = -EINVAL;
> +                       goto out;
> +               }
>
>                 if (fmt[i] != '%')
>                         continue;
>
> -               if (fmt_cnt >= 3)
> -                       return -EINVAL;
> +               if (fmt[i + 1] == '%') {
> +                       i++;
> +                       continue;
> +               }
> +
> +               if (fmt_cnt >= num_args) {
> +                       err = -EINVAL;
> +                       goto out;
> +               }
>
>                 /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
>                 i++;
> -               if (fmt[i] == 'l') {
> -                       mod[fmt_cnt]++;
> +
> +               /* skip optional "[0 +-][num]" width formating field */

typo: formatting

> +               while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> +                      fmt[i] == ' ')
> +                       i++;
> +               if (fmt[i] >= '1' && fmt[i] <= '9') {
>                         i++;

Are we worried about integer overflow here? %123123123123123d
hopefully won't crash anything, right?

> -               } else if (fmt[i] == 'p') {
> -                       mod[fmt_cnt]++;
> -                       if ((fmt[i + 1] == 'k' ||
> -                            fmt[i + 1] == 'u') &&
> +                       while (fmt[i] >= '0' && fmt[i] <= '9')
> +                               i++;

whoa, fmt_size shouldn't be ignored

> +               }
> +

and here if we exhausted all format string but haven't gotten to
format specified, we should -EINVAL

if (i >= fmt_size) return -EINVAL?

> +               if (fmt[i] == 'p') {
> +                       current_mod = BPF_PRINTF_LONG;
> +
> +                       if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') &&
>                             fmt[i + 2] == 's') {

right, if i + 2 is ok to access? always be remembering about fmt_size

>                                 fmt_ptype = fmt[i + 1];
>                                 i += 2;
>                                 goto fmt_str;
>                         }
>
> -                       if (fmt[i + 1] == 'B') {
> -                               i++;
> +                       if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
> +                           ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
> +                           fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
> +                           fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
> +                               /* just kernel pointers */
> +                               if (prepare_args)
> +                                       current_arg = raw_args[fmt_cnt];

fmt_cnt is not the best name, imo. arg_cnt makes more sense

>                                 goto fmt_next;
>                         }
>
> -                       /* disallow any further format extensions */
> -                       if (fmt[i + 1] != 0 &&
> -                           !isspace(fmt[i + 1]) &&
> -                           !ispunct(fmt[i + 1]))
> -                               return -EINVAL;
> +                       /* only support "%pI4", "%pi4", "%pI6" and "%pi6". */
> +                       if ((fmt[i + 1] != 'i' && fmt[i + 1] != 'I') ||
> +                           (fmt[i + 2] != '4' && fmt[i + 2] != '6')) {
> +                               err = -EINVAL;
> +                               goto out;
> +                       }
> +
> +                       i += 2;
> +                       if (!prepare_args)
> +                               goto fmt_next;
> +
> +                       if (!tmp_buf) {
> +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> +                               if (WARN_ON_ONCE(used > 1)) {
> +                                       this_cpu_dec(bpf_printf_buf_used);
> +                                       return -EBUSY;
> +                               }
> +                               preempt_disable();

shouldn't we preempt_disable before we got bpf_printf_buf_used? if we
get preempted after incrementing counter, buffer will be unusable for
a while, potentially, right?

> +                               tmp_buf = bufs->tmp_buf;
> +                               tmp_buf_len = MAX_PRINTF_BUF_LEN;
> +                       }
> +
> +                       copy_size = (fmt[i + 2] == '4') ? 4 : 16;
> +                       if (tmp_buf_len < copy_size) {
> +                               err = -ENOSPC;
> +                               goto out;
> +                       }
> +
> +                       unsafe_ptr = (char *)(long)raw_args[fmt_cnt];
> +                       err = copy_from_kernel_nofault(tmp_buf, unsafe_ptr,
> +                                                      copy_size);
> +                       if (err < 0)
> +                               memset(tmp_buf, 0, copy_size);
> +                       current_arg = (u64)(long)tmp_buf;
> +                       tmp_buf += copy_size;
> +                       tmp_buf_len -= copy_size;
>
>                         goto fmt_next;
>                 } else if (fmt[i] == 's') {
> -                       mod[fmt_cnt]++;
> +                       current_mod = BPF_PRINTF_LONG;
>                         fmt_ptype = fmt[i];
>  fmt_str:
> -                       if (str_seen)
> -                               /* allow only one '%s' per fmt string */
> -                               return -EINVAL;
> -                       str_seen = true;
> -
>                         if (fmt[i + 1] != 0 &&
>                             !isspace(fmt[i + 1]) &&
> -                           !ispunct(fmt[i + 1]))
> -                               return -EINVAL;
> +                           !ispunct(fmt[i + 1])) {
> +                               err = -EINVAL;
> +                               goto out;
> +                       }
>
> -                       switch (fmt_cnt) {
> -                       case 0:
> -                               unsafe_ptr = (void *)(long)arg1;
> -                               arg1 = (long)buf;
> -                               break;
> -                       case 1:
> -                               unsafe_ptr = (void *)(long)arg2;
> -                               arg2 = (long)buf;
> -                               break;
> -                       case 2:
> -                               unsafe_ptr = (void *)(long)arg3;
> -                               arg3 = (long)buf;
> -                               break;
> +                       if (!prepare_args)
> +                               goto fmt_next;
> +
> +                       if (!tmp_buf) {
> +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> +                               if (WARN_ON_ONCE(used > 1)) {
> +                                       this_cpu_dec(bpf_printf_buf_used);
> +                                       return -EBUSY;
> +                               }
> +                               preempt_disable();
> +                               tmp_buf = bufs->tmp_buf;
> +                               tmp_buf_len = MAX_PRINTF_BUF_LEN;
> +                       }

how about helper used like this:

if (try_get_fmt_tmp_buf(&tmp_buf, &tmp_buf_len))
    return -EBUSY;

which will do nothing if tmp_buf != NULL?

> +
> +                       if (!tmp_buf_len) {
> +                               err = -ENOSPC;
> +                               goto out;
>                         }
>
> -                       bpf_trace_copy_string(buf, unsafe_ptr, fmt_ptype,
> -                                       sizeof(buf));
> +                       unsafe_ptr = (char *)(long)raw_args[fmt_cnt];
> +                       err = bpf_trace_copy_string(tmp_buf, unsafe_ptr,
> +                                                   fmt_ptype, tmp_buf_len);
> +                       if (err < 0) {
> +                               tmp_buf[0] = '\0';
> +                               err = 1;
> +                       }
> +
> +                       current_arg = (u64)(long)tmp_buf;
> +                       tmp_buf += err;
> +                       tmp_buf_len -= err;
> +
>                         goto fmt_next;
>                 }
>
> +               current_mod = BPF_PRINTF_INT;
> +
>                 if (fmt[i] == 'l') {
> -                       mod[fmt_cnt]++;
> +                       current_mod = BPF_PRINTF_LONG;
> +                       i++;
> +               }
> +               if (fmt[i] == 'l') {
> +                       current_mod = BPF_PRINTF_LONG_LONG;
>                         i++;
>                 }
>
> -               if (fmt[i] != 'i' && fmt[i] != 'd' &&
> -                   fmt[i] != 'u' && fmt[i] != 'x')
> -                       return -EINVAL;
> +               if (fmt[i] != 'i' && fmt[i] != 'd' && fmt[i] != 'u' &&
> +                   fmt[i] != 'x' && fmt[i] != 'X') {
> +                       err = -EINVAL;
> +                       goto out;
> +               }
> +
> +               if (prepare_args)
> +                       current_arg = raw_args[fmt_cnt];
>  fmt_next:
> +               if (prepare_args) {

I'd ditch prepare_args variable and just check final_args (and that
check to ensure both mods and final_args are specified I suggested
above)

> +                       mod[fmt_cnt] = current_mod;
> +                       final_args[fmt_cnt] = current_arg;
> +               }
>                 fmt_cnt++;
>         }

[...]

> -
> -       return __BPF_TP_EMIT();
> +       err = 0;
> +out:
> +       bpf_printf_postamble();

naming is hard, but preamble and postamble reads way too fancy :)
bpf_printf_prepare() and bpf_printf_cleanup() or something like that
is a bit more to the point, no?


> +       return err;
> +}
> +

[...]

>
>  static const struct bpf_func_proto bpf_trace_printk_proto = {
> @@ -581,184 +687,37 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void)
>  }
>
>  #define MAX_SEQ_PRINTF_VARARGS         12
> -#define MAX_SEQ_PRINTF_MAX_MEMCPY      6
> -#define MAX_SEQ_PRINTF_STR_LEN         128
> -
> -struct bpf_seq_printf_buf {
> -       char buf[MAX_SEQ_PRINTF_MAX_MEMCPY][MAX_SEQ_PRINTF_STR_LEN];
> -};
> -static DEFINE_PER_CPU(struct bpf_seq_printf_buf, bpf_seq_printf_buf);
> -static DEFINE_PER_CPU(int, bpf_seq_printf_buf_used);
>
>  BPF_CALL_5(bpf_seq_printf, struct seq_file *, m, char *, fmt, u32, fmt_size,
>            const void *, data, u32, data_len)

[...]

> +       enum bpf_printf_mod_type mod[MAX_SEQ_PRINTF_VARARGS];
> +       u64 args[MAX_SEQ_PRINTF_VARARGS];
> +       int err, num_args;
>
> +       if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
> +           (data_len && !data))

data && !data_len is also an error, no?

> +               return -EINVAL;
>         num_args = data_len / 8;
>

[...]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type
  2021-03-24  2:22 ` [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type Florent Revest
@ 2021-03-26 22:23   ` Andrii Nakryiko
  2021-04-06 15:38     ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 22:23 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> This type provides the guarantee that an argument is going to be a const
> pointer to somewhere in a read-only map value. It also checks that this
> pointer is followed by a zero character before the end of the map value.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/bpf.h   |  1 +
>  kernel/bpf/verifier.c | 38 ++++++++++++++++++++++++++++++++++++++
>  2 files changed, 39 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a25730eaa148..7b5319d75b3e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -308,6 +308,7 @@ enum bpf_arg_type {
>         ARG_PTR_TO_PERCPU_BTF_ID,       /* pointer to in-kernel percpu type */
>         ARG_PTR_TO_FUNC,        /* pointer to a bpf program function */
>         ARG_PTR_TO_STACK_OR_NULL,       /* pointer to stack or NULL */
> +       ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
>         __BPF_ARG_TYPE_MAX,
>  };
>

[...]

> +
> +               map_off = reg->off + reg->var_off.value;
> +               err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
> +               if (err)
> +                       return err;
> +
> +               str_ptr = (char *)(long)(map_addr);
> +               if (!strnchr(str_ptr + map_off,
> +                            map->value_size - reg->off - map_off, 0))

you are double subtracting reg->off here. isn't map->value_size -
map_off what you want?

> +                       verbose(env, "string is not zero-terminated\n");

I'd prefer `return -EINVAL;`, but at least set err, otherwise what's the point?

>         }
>
>         return err;
> --
> 2.31.0.291.g576ba9dcdaf-goog
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-03-26 21:53   ` Andrii Nakryiko
@ 2021-03-26 22:51     ` Andrii Nakryiko
  2021-04-06 15:35       ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 22:51 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> >
> > Two helpers (trace_printk and seq_printf) have very similar
> > implementations of format string parsing and a third one is coming
> > (snprintf). To avoid code duplication and make the code easier to
> > maintain, this moves the operations associated with format string
> > parsing (validation and argument sanitization) into one generic
> > function.
> >
> > Unfortunately, the implementation of the two existing helpers already
> > drifted quite a bit and unifying them entailed a lot of changes:
>
> "Unfortunately" as in a lot of extra work for you? I think overall
> though it was very fortunate that you ended up doing it, all
> implementations are more feature-complete and saner now, no? Thanks a
> lot for your hard work!
>
> >
> > - bpf_trace_printk always expected fmt[fmt_size] to be the terminating
> >   NULL character, this is no longer true, the first 0 is terminating.
>
> You mean if you had bpf_trace_printk("bla bla\0some more bla\0", 24)
> it would emit that zero character? If yes, I don't think it was a sane
> behavior anyways.
>
> > - bpf_trace_printk now supports %% (which produces the percentage char).
> > - bpf_trace_printk now skips width formating fields.
> > - bpf_trace_printk now supports the X modifier (capital hexadecimal).
> > - bpf_trace_printk now supports %pK, %px, %pB, %pi4, %pI4, %pi6 and %pI6
> > - argument casting on 32 bit has been simplified into one macro and
> >   using an enum instead of obscure int increments.
> >
> > - bpf_seq_printf now uses bpf_trace_copy_string instead of
> >   strncpy_from_kernel_nofault and handles the %pks %pus specifiers.
> > - bpf_seq_printf now prints longs correctly on 32 bit architectures.
> >
> > - both were changed to use a global per-cpu tmp buffer instead of one
> >   stack buffer for trace_printk and 6 small buffers for seq_printf.
> > - to avoid per-cpu buffer usage conflict, these helpers disable
> >   preemption while the per-cpu buffer is in use.
> > - both helpers now support the %ps and %pS specifiers to print symbols.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
>
> This is great, you already saved some lines of code! I suspect I'll
> have some complaints about mods (it feels like this preample should
> provide extra information about which arguments have to be read from
> kernel/user memory, but I'll see next patches first.

Disregard the last part (at least for now). I had a mental model that
it should be possible to parse a format string once and then remember
"instructions" (i.e., arg1 is long, arg2 is string, and so on). But
that's too complicated, so I think re-parsing the format string is
much simpler.

>
> See my comments below (I deliberately didn't trim most of the code for
> easier jumping around), but it's great overall, thanks!
>
> >  kernel/trace/bpf_trace.c | 529 ++++++++++++++++++---------------------
> >  1 file changed, 244 insertions(+), 285 deletions(-)
> >

[...]

> > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > +                       u32 num_args)
> > +{
> > +       struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > +       int err, i, fmt_cnt = 0, copy_size, used;
> > +       char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > +       bool prepare_args = final_args && mod;
>
> probably better to enforce that both or none are specified, otherwise
> return error

it's actually three of them: raw_args, mod, and num_args, right? All
three are either NULL or non-NULL.

>
> > +       enum bpf_printf_mod_type current_mod;
> > +       size_t tmp_buf_len;
> > +       u64 current_arg;
> > +       char fmt_ptype;
> > +
> > +       for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
>

[...]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper
  2021-03-24  2:22 ` [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper Florent Revest
@ 2021-03-26 22:55   ` Andrii Nakryiko
  2021-04-06 16:06     ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 22:55 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> The implementation takes inspiration from the existing bpf_trace_printk
> helper but there are a few differences:
>
> To allow for a large number of format-specifiers, parameters are
> provided in an array, like in bpf_seq_printf.
>
> Because the output string takes two arguments and the array of
> parameters also takes two arguments, the format string needs to fit in
> one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
> NULL-terminated read-only map, we don't need a format string length arg.
>
> Because the format-string is known at verification time, we also move
> most of the format string validation, currently done in formatting
> helper calls, into the verifier logic. This makes debugging easier and
> also slightly improves the runtime performance.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  include/linux/bpf.h            |  6 ++++
>  include/uapi/linux/bpf.h       | 28 ++++++++++++++++++
>  kernel/bpf/helpers.c           |  2 ++
>  kernel/bpf/verifier.c          | 41 +++++++++++++++++++++++++++
>  kernel/trace/bpf_trace.c       | 52 ++++++++++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h | 28 ++++++++++++++++++
>  6 files changed, 157 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 7b5319d75b3e..f3d9c8fa60b3 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
>  extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
>  extern const struct bpf_func_proto bpf_copy_from_user_proto;
>  extern const struct bpf_func_proto bpf_snprintf_btf_proto;
> +extern const struct bpf_func_proto bpf_snprintf_proto;
>  extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
>  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
>  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> @@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
>  struct btf_id_set;
>  bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
>
> +enum bpf_printf_mod_type;
> +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> +                       u32 num_args);
> +
>  #endif /* _LINUX_BPF_H */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2d3036e292a9..86af61e912c6 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4660,6 +4660,33 @@ union bpf_attr {
>   *     Return
>   *             The number of traversed map elements for success, **-EINVAL** for
>   *             invalid **flags**.
> + *
> + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
> + *     Description
> + *             Outputs a string into the **str** buffer of size **str_size**
> + *             based on a format string stored in a read-only map pointed by
> + *             **fmt**.
> + *
> + *             Each format specifier in **fmt** corresponds to one u64 element
> + *             in the **data** array. For strings and pointers where pointees
> + *             are accessed, only the pointer values are stored in the *data*
> + *             array. The *data_len* is the size of *data* in bytes.
> + *
> + *             Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> + *             memory. Reading kernel memory may fail due to either invalid
> + *             address or valid address but requiring a major memory fault. If
> + *             reading kernel memory fails, the string for **%s** will be an
> + *             empty string, and the ip address for **%p{i,I}{4,6}** will be 0.

would it make sense for sleepable programs to allow memory fault when
reading memory?

> + *             Not returning error to bpf program is consistent with what
> + *             **bpf_trace_printk**\ () does for now.
> + *
> + *     Return
> + *             The strictly positive length of the formatted string, including
> + *             the trailing zero character. If the return value is greater than
> + *             **str_size**, **str** contains a truncated string, guaranteed to
> + *             be zero-terminated.

Except when str_size == 0.

> + *
> + *             Or **-EBUSY** if the per-CPU memory copy buffer is busy.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -4827,6 +4854,7 @@ union bpf_attr {
>         FN(sock_from_file),             \
>         FN(check_mtu),                  \
>         FN(for_each_map_elem),          \
> +       FN(snprintf),                   \
>         /* */
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 074800226327..12f4cfb04fe7 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -750,6 +750,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
>                 return &bpf_probe_read_kernel_str_proto;
>         case BPF_FUNC_snprintf_btf:
>                 return &bpf_snprintf_btf_proto;
> +       case BPF_FUNC_snprintf:
> +               return &bpf_snprintf_proto;
>         default:
>                 return NULL;
>         }
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 9e03608725b4..a89599dc51c9 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5729,6 +5729,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
>         return state->acquired_refs ? -EINVAL : 0;
>  }
>
> +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> +                                  struct bpf_reg_state *regs)
> +{
> +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> +       int err, fmt_map_off, num_args;
> +       u64 fmt_addr;
> +       char *fmt;
> +
> +       /* data must be an array of u64 so data_len must be a multiple of 8 */
> +       if (data_len_reg->var_off.value & 7)

`% 8` is not cool anymore... :)

> +               return -EINVAL;
> +       num_args = data_len_reg->var_off.value / 8;
> +
> +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> +        * and map_direct_value_addr is set.
> +        */
> +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> +                                                 fmt_map_off);
> +       if (err)
> +               return err;
> +       fmt = (char *)fmt_addr + fmt_map_off;
> +
> +       /* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
> +        * can focus on validating the format specifiers.
> +        */
> +       err = bpf_printf_preamble(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
> +       if (err < 0)
> +               verbose(env, "Invalid format string\n");
> +
> +       return err;
> +}
> +
>  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
>                              int *insn_idx_p)
>  {
> @@ -5843,6 +5878,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                         return -EINVAL;
>         }
>
> +       if (func_id == BPF_FUNC_snprintf) {
> +               err = check_bpf_snprintf_call(env, regs);
> +               if (err < 0)
> +                       return err;
> +       }
> +
>         /* reset caller saved regs */
>         for (i = 0; i < CALLER_SAVED_REGS; i++) {
>                 mark_reg_not_init(env, regs, caller_saved[i]);
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 0fdca94a3c9c..15cbc8b63206 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1230,6 +1230,56 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = {
>         .arg5_type      = ARG_ANYTHING,
>  };
>
> +#define MAX_SNPRINTF_VARARGS           12
> +
> +BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
> +          const void *, data, u32, data_len)
> +{
> +       enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
> +       u64 args[MAX_SNPRINTF_VARARGS];
> +       int err, num_args;
> +
> +       if (data_len & 7 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
> +           (data_len && !data))

see previous patches, data_len > 0 should be iff data != NULL, I think

> +               return -EINVAL;
> +       num_args = data_len / 8;
> +
> +       /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
> +        * can safely give an unbounded size.
> +        */
> +       err = bpf_printf_preamble(fmt, UINT_MAX, data, args, mod, num_args);
> +       if (err < 0)
> +               return err;
> +
> +       /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> +        * all of them to snprintf().
> +        */
> +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> +               BPF_CAST_FMT_ARG(11, args, mod));
> +       if (str_size)
> +               str[str_size - 1] = '\0';

hm... what if err < str_size ?

> +
> +       bpf_printf_postamble();
> +
> +       return err + 1;
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field
  2021-03-24  2:22 ` [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field Florent Revest
@ 2021-03-26 23:01   ` Andrii Nakryiko
  2021-04-06 15:42     ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 23:01 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> When initializing the __param array with a one liner, if all args are
> const, the initial array value will be placed in the rodata section but
> because libbpf does not support relocation in the rodata section, any
> pointer in this array will stay NULL.
>
> Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  tools/lib/bpf/bpf_tracing.h | 26 ++++++++++++++++++++++----
>  1 file changed, 22 insertions(+), 4 deletions(-)
>
> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> index f9ef37707888..d9a4c3f77ff4 100644
> --- a/tools/lib/bpf/bpf_tracing.h
> +++ b/tools/lib/bpf/bpf_tracing.h
> @@ -413,6 +413,22 @@ typeof(name(0)) name(struct pt_regs *ctx)                              \
>  }                                                                          \
>  static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
>
> +#define ___bpf_fill0(arr, p, x)

can you please double-check that no-argument BPF_SEQ_PRINTF won't
generate a warning about spurious ';'? Maybe it's better to have zero
case as `do {} while(0);` ?

> +#define ___bpf_fill1(arr, p, x) arr[p] = x
> +#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
> +#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
> +#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
> +#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
> +#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
> +#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
> +#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
> +#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
> +#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
> +#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
> +#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
> +#define ___bpf_fill(arr, args...) \
> +       ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)

cool. this is regular enough to easily comprehend :)

> +
>  /*
>   * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
>   * in a structure.
> @@ -421,12 +437,14 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
>         ({                                                                  \
>                 _Pragma("GCC diagnostic push")                              \
>                 _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
> +               unsigned long long ___param[___bpf_narg(args)];             \
>                 static const char ___fmt[] = fmt;                           \
> -               unsigned long long ___param[] = { args };                   \
> +               int __ret;                                                  \
> +               ___bpf_fill(___param, args);                                \
>                 _Pragma("GCC diagnostic pop")                               \

Let's clean this up a little bit;
1. static const char ___fmt should be the very first
2. _Pragma scope should be minimal necessary, which includes only
___bpf_fill, right?
3. Empty line after int __ret; and let's keep three underscores for consistency.


> -               int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),    \
> -                                           ___param, sizeof(___param));    \
> -               ___ret;                                                     \
> +               __ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),         \
> +                                      ___param, sizeof(___param));         \
> +               __ret;                                                      \

but actually you don't need __ret at all, just bpf_seq_printf() here, right?


>         })
>
>  #endif
> --
> 2.31.0.291.g576ba9dcdaf-goog
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro
  2021-03-24  2:22 ` [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro Florent Revest
@ 2021-03-26 23:02   ` Andrii Nakryiko
  0 siblings, 0 replies; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 23:02 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> Similarly to BPF_SEQ_PRINTF, this macro turns variadic arguments into an
> array of u64, making it more natural to call the bpf_snprintf helper.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---
>  tools/lib/bpf/bpf_tracing.h | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> index d9a4c3f77ff4..e5c6ede6060b 100644
> --- a/tools/lib/bpf/bpf_tracing.h
> +++ b/tools/lib/bpf/bpf_tracing.h
> @@ -447,4 +447,22 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
>                 __ret;                                                      \
>         })
>
> +/*
> + * BPF_SNPRINTF wraps the bpf_snprintf helper with variadic arguments instead of
> + * an array of u64.
> + */
> +#define BPF_SNPRINTF(out, out_size, fmt, args...)                          \
> +       ({                                                                  \

Same feedback as the previous patch, but let's also reduce the
nestedness level, those ({ }) can be shifted one tab left, right?
Please do the same for the previous patch as well. Thanks!

> +               _Pragma("GCC diagnostic push")                              \
> +               _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
> +               unsigned long long ___param[___bpf_narg(args)];             \
> +               static const char ___fmt[] = fmt;                           \
> +               int __ret;                                                  \
> +               ___bpf_fill(___param, args);                                \
> +               _Pragma("GCC diagnostic pop")                               \
> +               __ret = bpf_snprintf(out, out_size, ___fmt,                 \
> +                                    ___param, sizeof(___param));           \
> +               __ret;                                                      \
> +       })
> +
>  #endif
> --
> 2.31.0.291.g576ba9dcdaf-goog
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf
  2021-03-24  2:22 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf Florent Revest
@ 2021-03-26 23:05   ` Andrii Nakryiko
  2021-04-06 15:40     ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-03-26 23:05 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
>
> This exercises most of the format specifiers when things go well.
>
> Signed-off-by: Florent Revest <revest@chromium.org>
> ---

Looks good. Please add a no-argument test case as well.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  .../selftests/bpf/prog_tests/snprintf.c       | 65 +++++++++++++++++++
>  .../selftests/bpf/progs/test_snprintf.c       | 59 +++++++++++++++++
>  2 files changed, 124 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
>

[...]

> +
> +SEC("raw_tp/sys_enter")
> +int handler(const void *ctx)
> +{
> +       /* Convenient values to pretty-print */
> +       const __u8 ex_ipv4[] = {127, 0, 0, 1};
> +       const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
> +       const char str1[] = "str1";
> +       const char longstr[] = "longstr";
> +       extern const void schedule __ksym;

oh, fancy. I'd move it out of this function into global space, though,
to make it more apparent. I almost missed that it's a special one.

> +
> +       /* Integer types */
> +       num_ret  = BPF_SNPRINTF(num_out, sizeof(num_out),
> +                               "%d %u %x %li %llu %lX",
> +                               -8, 9, 150, -424242, 1337, 0xDABBAD00);
> +       /* IP addresses */
> +       ip_ret   = BPF_SNPRINTF(ip_out, sizeof(ip_out), "%pi4 %pI6",
> +                               &ex_ipv4, &ex_ipv6);
> +       /* Symbol lookup formatting */
> +       sym_ret  = BPF_SNPRINTF(sym_out,  sizeof(sym_out), "%ps %pS %pB",
> +                               &schedule, &schedule, &schedule);
> +       /* Kernel pointers */
> +       addr_ret = BPF_SNPRINTF(addr_out, sizeof(addr_out), "%pK %px %p",
> +                               0, 0xFFFF00000ADD4E55, 0xFFFF00000ADD4E55);
> +       /* Strings embedding */
> +       str_ret  = BPF_SNPRINTF(str_out, sizeof(str_out), "%s %+05s",
> +                               str1, longstr);
> +       /* Overflow */
> +       over_ret = BPF_SNPRINTF(over_out, sizeof(over_out), "%%overflow");
> +
> +       return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> --
> 2.31.0.291.g576ba9dcdaf-goog
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-03-26 22:51     ` Andrii Nakryiko
@ 2021-04-06 15:35       ` Florent Revest
  2021-04-07 21:53         ` Andrii Nakryiko
  0 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-04-06 15:35 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

[Sorry for the late replies, I'm just back from a long easter break :)]

On Fri, Mar 26, 2021 at 11:51 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > > Unfortunately, the implementation of the two existing helpers already
> > > drifted quite a bit and unifying them entailed a lot of changes:
> >
> > "Unfortunately" as in a lot of extra work for you? I think overall
> > though it was very fortunate that you ended up doing it, all
> > implementations are more feature-complete and saner now, no? Thanks a
> > lot for your hard work!

Ahah, "unfortunately" a bit of extra work for me, indeed. But I find
this kind of refactoring patches even harder to review than to write
so thank you too!

> > > - bpf_trace_printk always expected fmt[fmt_size] to be the terminating
> > >   NULL character, this is no longer true, the first 0 is terminating.
> >
> > You mean if you had bpf_trace_printk("bla bla\0some more bla\0", 24)
> > it would emit that zero character? If yes, I don't think it was a sane
> > behavior anyways.

The call to snprintf in bpf_do_trace_printk would eventually ignore
"some more bla" but the parsing done in bpf_trace_printk would indeed
read the whole string.

> > This is great, you already saved some lines of code! I suspect I'll
> > have some complaints about mods (it feels like this preample should
> > provide extra information about which arguments have to be read from
> > kernel/user memory, but I'll see next patches first.
>
> Disregard the last part (at least for now). I had a mental model that
> it should be possible to parse a format string once and then remember
> "instructions" (i.e., arg1 is long, arg2 is string, and so on). But
> that's too complicated, so I think re-parsing the format string is
> much simpler.

I also wanted to do that originally but realized it would keep a lot
of the complexity in the helpers themselves and not really move the
needle.

> > > +/* Horrid workaround for getting va_list handling working with different
> > > + * argument type combinations generically for 32 and 64 bit archs.
> > > + */
> > > +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                            \
> > > +       ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                        \
> > > +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))     \
> > > +         ? args[arg_nb]                                                \
> > > +         : ((mod[arg_nb] == BPF_PRINTF_LONG ||                         \
> > > +            (mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \
> >
> > is this right? INT is always 32-bit, it's only LONG that differs.
> > Shouldn't the rule be
> >
> > (LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
> > (INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]
> >
> > Does (long) cast do anything fancy when casting from u64? Sorry, maybe
> > I'm confused.

To be honest, I am also confused by that logic... :p My patch tries to
conserve exactly the same logic as "88a5c690b6 bpf: fix
bpf_trace_printk on 32 bit archs" because I was also afraid of missing
something and could not test it on 32 bit arches. From that commit
description, it is unclear to me what "u32 and long are passed
differently to u64, since the result of C conditional operators
follows the "usual arithmetic conversions" rules" means. Maybe Daniel
can comment on this ?

> > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > > +                       u32 num_args)
> > > +{
> > > +       struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > > +       int err, i, fmt_cnt = 0, copy_size, used;
> > > +       char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > > +       bool prepare_args = final_args && mod;
> >
> > probably better to enforce that both or none are specified, otherwise
> > return error

Fair :)

> it's actually three of them: raw_args, mod, and num_args, right? All
> three are either NULL or non-NULL.

It is a bit tricky to see from that patch but in "3/6 bpf: Add a
bpf_snprintf helper" the verifier code calls this function with
num_args != 0 to check whether the number of arguments is correct
without actually converting anything.

Also when the helper gets called, raw_args can come from the BPF
program and be NULL but in that case we will also have num_args = 0
guaranteed by the helper so the loop will bail out if it encounters a
format specifier.

> > > +       enum bpf_printf_mod_type current_mod;
> > > +       size_t tmp_buf_len;
> > > +       u64 current_arg;
> > > +       char fmt_ptype;
> > > +
> > > +       for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
> >
> > Can we say that if the last character is not '\0' then it's a bad
> > format string and return -EINVAL? And if \0 is inside the format
> > string, then it's also a bad format string? I wonder what others think
> > about this?... I think sanity should prevail.

Overall, there are two situations:
- bpf_seq_printf, bpf_trace_printk: we have a pointer and size but we
are not guaranteed zero-termination
- bpf_snprintf: we have a pointer, no size but it's guaranteed to be
zero-terminated (by ARG_PTR_TO_CONST_STR)

Currently, in the bpf_snprintf helper, I set fmt_size to UINT_MAX and
the terminating condition will be fmt[i] == '\0'.
As you pointed out a bit further, I got a bit carried away with the
refactoring and dropped the zero-termination checks for the existing
helpers !

So I see two possibilities:
- either we check fmt[last] == '\0', add a bail out condition in the
loop if we encounter another `\0` and set fmt_size to sprintf(fmt) in
the bpf_snprintf verifier and helper code.
- or we unconditionally call strnlen(fmt, fmt_size) in
bpf_printf_preamble. If no 0 is found, we return an error, if there is
one we treat it as the NULL terminating character.

> > > +               if ((!isprint(fmt[i]) && !isspace(fmt[i])) ||
> > > +                   !isascii(fmt[i])) {
> >
> > && always binds tighter than ||, so you can omit extra (). I'd put
> > this on a single line as well, but that's a total nit.

Neat! :)

> > > +                       err = -EINVAL;
> > > +                       goto out;
> > > +               }
> > >
> > >                 if (fmt[i] != '%')
> > >                         continue;
> > >
> > > -               if (fmt_cnt >= 3)
> > > -                       return -EINVAL;
> > > +               if (fmt[i + 1] == '%') {
> > > +                       i++;
> > > +                       continue;
> > > +               }
> > > +
> > > +               if (fmt_cnt >= num_args) {
> > > +                       err = -EINVAL;
> > > +                       goto out;
> > > +               }
> > >
> > >                 /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
> > >                 i++;
> > > -               if (fmt[i] == 'l') {
> > > -                       mod[fmt_cnt]++;
> > > +
> > > +               /* skip optional "[0 +-][num]" width formating field */
> >
> > typo: formatting

Fixed

> > > +               while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> > > +                      fmt[i] == ' ')
> > > +                       i++;
> > > +               if (fmt[i] >= '1' && fmt[i] <= '9') {
> > >                         i++;
> >
> > Are we worried about integer overflow here? %123123123123123d
> > hopefully won't crash anything, right?

I expect that this should be handled gracefully by the subsequent call
to snprintf(). Our parsing logic does not guarantee that the format
string is 100% legit but it guarantees that it's safe to call
vsnprintf with arguments coming from BPF. If the output buffer is too
small to hold the output, the output will be truncated.

Note that this is already how bpf_seq_printf already works.

> > > -               } else if (fmt[i] == 'p') {
> > > -                       mod[fmt_cnt]++;
> > > -                       if ((fmt[i + 1] == 'k' ||
> > > -                            fmt[i + 1] == 'u') &&
> > > +                       while (fmt[i] >= '0' && fmt[i] <= '9')
> > > +                               i++;
> >
> > whoa, fmt_size shouldn't be ignored

Oh no, I'll attach the stone of shame! It all made sense with
bpf_snprintf() in mind because, there, we are guaranteed to have a
NULL terminated string already but in an excess of refactoring
enthusiasm I dropped the zero-termination check for the other helpers.

But if we implement either of the options discussed above, then we do
not need to constantly check fmt_size.

> > > +               }
> > > +
> >
> > and here if we exhausted all format string but haven't gotten to
> > format specified, we should -EINVAL
> >
> > if (i >= fmt_size) return -EINVAL?

Same comment as above, if we are already guaranteed zero-termination
by a prior check, we don't need that.

> > > +               if (fmt[i] == 'p') {
> > > +                       current_mod = BPF_PRINTF_LONG;
> > > +
> > > +                       if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') &&
> > >                             fmt[i + 2] == 's') {
> >
> > right, if i + 2 is ok to access? always be remembering about fmt_size

Same.

> > >                                 fmt_ptype = fmt[i + 1];
> > >                                 i += 2;
> > >                                 goto fmt_str;
> > >                         }
> > >
> > > -                       if (fmt[i + 1] == 'B') {
> > > -                               i++;
> > > +                       if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
> > > +                           ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
> > > +                           fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
> > > +                           fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
> > > +                               /* just kernel pointers */
> > > +                               if (prepare_args)
> > > +                                       current_arg = raw_args[fmt_cnt];
> >
> > fmt_cnt is not the best name, imo. arg_cnt makes more sense

Mh, we already have "num_args" that can make it confusing. The way I see it:
- the number of format specifiers is the number of %d %s... in the format string
- the number of arguments is the number of values given in the raw_args array.

Potentially, the number of arguments can be higher than the number of
format specifiers, for example printf("%d\n", i, j); so calling them
differently sorta makes sense.
But to be honest I don't have a strong opinion about this and this is
mainly just a remaining from the current bpf_seq_printf
implementation.

> > > +                       if (!tmp_buf) {
> > > +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> > > +                               if (WARN_ON_ONCE(used > 1)) {
> > > +                                       this_cpu_dec(bpf_printf_buf_used);
> > > +                                       return -EBUSY;
> > > +                               }
> > > +                               preempt_disable();
> >
> > shouldn't we preempt_disable before we got bpf_printf_buf_used? if we
> > get preempted after incrementing counter, buffer will be unusable for
> > a while, potentially, right?

Good catch :)

> > > +                       if (!tmp_buf) {
> > > +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> > > +                               if (WARN_ON_ONCE(used > 1)) {
> > > +                                       this_cpu_dec(bpf_printf_buf_used);
> > > +                                       return -EBUSY;
> > > +                               }
> > > +                               preempt_disable();
> > > +                               tmp_buf = bufs->tmp_buf;
> > > +                               tmp_buf_len = MAX_PRINTF_BUF_LEN;
> > > +                       }
> >
> > how about helper used like this:
> >
> > if (try_get_fmt_tmp_buf(&tmp_buf, &tmp_buf_len))
> >    return -EBUSY;
> >
> > which will do nothing if tmp_buf != NULL?

Yep, I quite like that. :)

> > >  fmt_next:
> > > +               if (prepare_args) {
> >
> > I'd ditch prepare_args variable and just check final_args (and that
> > check to ensure both mods and final_args are specified I suggested
> > above)

Agreed.

> > > +                       mod[fmt_cnt] = current_mod;
> > > +                       final_args[fmt_cnt] = current_arg;
> > > +               }
> > >                 fmt_cnt++;
> > >         }
> >
> > [...]
> >
> > > -
> > > -       return __BPF_TP_EMIT();
> > > +       err = 0;
> > > +out:
> > > +       bpf_printf_postamble();
> >
> > naming is hard, but preamble and postamble reads way too fancy :)
> > bpf_printf_prepare() and bpf_printf_cleanup() or something like that
> > is a bit more to the point, no?

Haha, you're totally right.

> > > +       if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
> > > +           (data_len && !data))
> >
> > data && !data_len is also an error, no?

Isn't that checked by the verifier ?

I don't mind adding an explicit check for it (data_len ^ data or two
clearer conditions ?) but I think that even if this were to happen,
this would not be a problem: if we encounter a format specifier,
num_args will be zero so bpf_printf_preamble will bail out before it
tries to access data.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type
  2021-03-26 22:23   ` Andrii Nakryiko
@ 2021-04-06 15:38     ` Florent Revest
  0 siblings, 0 replies; 23+ messages in thread
From: Florent Revest @ 2021-04-06 15:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Fri, Mar 26, 2021 at 11:23 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > +
> > +               map_off = reg->off + reg->var_off.value;
> > +               err = map->ops->map_direct_value_addr(map, &map_addr, map_off);
> > +               if (err)
> > +                       return err;
> > +
> > +               str_ptr = (char *)(long)(map_addr);
> > +               if (!strnchr(str_ptr + map_off,
> > +                            map->value_size - reg->off - map_off, 0))
>
> you are double subtracting reg->off here. isn't map->value_size -
> map_off what you want?

Good catch!

> > +                       verbose(env, "string is not zero-terminated\n");
>
> I'd prefer `return -EINVAL;`, but at least set err, otherwise what's the point?

Ah yeah, absolutely.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf
  2021-03-26 23:05   ` Andrii Nakryiko
@ 2021-04-06 15:40     ` Florent Revest
  0 siblings, 0 replies; 23+ messages in thread
From: Florent Revest @ 2021-04-06 15:40 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Sat, Mar 27, 2021 at 12:05 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> >
> > This exercises most of the format specifiers when things go well.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
>
> Looks good. Please add a no-argument test case as well.

Agreed

> Acked-by: Andrii Nakryiko <andrii@kernel.org>
>
> >  .../selftests/bpf/prog_tests/snprintf.c       | 65 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_snprintf.c       | 59 +++++++++++++++++
> >  2 files changed, 124 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/snprintf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_snprintf.c
> >
>
> [...]
>
> > +
> > +SEC("raw_tp/sys_enter")
> > +int handler(const void *ctx)
> > +{
> > +       /* Convenient values to pretty-print */
> > +       const __u8 ex_ipv4[] = {127, 0, 0, 1};
> > +       const __u8 ex_ipv6[] = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1};
> > +       const char str1[] = "str1";
> > +       const char longstr[] = "longstr";
> > +       extern const void schedule __ksym;
>
> oh, fancy. I'd move it out of this function into global space, though,
> to make it more apparent. I almost missed that it's a special one.

Just schedule? Alright.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field
  2021-03-26 23:01   ` Andrii Nakryiko
@ 2021-04-06 15:42     ` Florent Revest
  0 siblings, 0 replies; 23+ messages in thread
From: Florent Revest @ 2021-04-06 15:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Sat, Mar 27, 2021 at 12:01 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> >
> > When initializing the __param array with a one liner, if all args are
> > const, the initial array value will be placed in the rodata section but
> > because libbpf does not support relocation in the rodata section, any
> > pointer in this array will stay NULL.
> >
> > Fixes: c09add2fbc5a ("tools/libbpf: Add bpf_iter support")
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
> >  tools/lib/bpf/bpf_tracing.h | 26 ++++++++++++++++++++++----
> >  1 file changed, 22 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> > index f9ef37707888..d9a4c3f77ff4 100644
> > --- a/tools/lib/bpf/bpf_tracing.h
> > +++ b/tools/lib/bpf/bpf_tracing.h
> > @@ -413,6 +413,22 @@ typeof(name(0)) name(struct pt_regs *ctx)                              \
> >  }                                                                          \
> >  static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
> >
> > +#define ___bpf_fill0(arr, p, x)
>
> can you please double-check that no-argument BPF_SEQ_PRINTF won't
> generate a warning about spurious ';'? Maybe it's better to have zero
> case as `do {} while(0);` ?
>
> > +#define ___bpf_fill1(arr, p, x) arr[p] = x
> > +#define ___bpf_fill2(arr, p, x, args...) arr[p] = x; ___bpf_fill1(arr, p + 1, args)
> > +#define ___bpf_fill3(arr, p, x, args...) arr[p] = x; ___bpf_fill2(arr, p + 1, args)
> > +#define ___bpf_fill4(arr, p, x, args...) arr[p] = x; ___bpf_fill3(arr, p + 1, args)
> > +#define ___bpf_fill5(arr, p, x, args...) arr[p] = x; ___bpf_fill4(arr, p + 1, args)
> > +#define ___bpf_fill6(arr, p, x, args...) arr[p] = x; ___bpf_fill5(arr, p + 1, args)
> > +#define ___bpf_fill7(arr, p, x, args...) arr[p] = x; ___bpf_fill6(arr, p + 1, args)
> > +#define ___bpf_fill8(arr, p, x, args...) arr[p] = x; ___bpf_fill7(arr, p + 1, args)
> > +#define ___bpf_fill9(arr, p, x, args...) arr[p] = x; ___bpf_fill8(arr, p + 1, args)
> > +#define ___bpf_fill10(arr, p, x, args...) arr[p] = x; ___bpf_fill9(arr, p + 1, args)
> > +#define ___bpf_fill11(arr, p, x, args...) arr[p] = x; ___bpf_fill10(arr, p + 1, args)
> > +#define ___bpf_fill12(arr, p, x, args...) arr[p] = x; ___bpf_fill11(arr, p + 1, args)
> > +#define ___bpf_fill(arr, args...) \
> > +       ___bpf_apply(___bpf_fill, ___bpf_narg(args))(arr, 0, args)
>
> cool. this is regular enough to easily comprehend :)
>
> > +
> >  /*
> >   * BPF_SEQ_PRINTF to wrap bpf_seq_printf to-be-printed values
> >   * in a structure.
> > @@ -421,12 +437,14 @@ static __always_inline typeof(name(0)) ____##name(struct pt_regs *ctx, ##args)
> >         ({                                                                  \
> >                 _Pragma("GCC diagnostic push")                              \
> >                 _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
> > +               unsigned long long ___param[___bpf_narg(args)];             \
> >                 static const char ___fmt[] = fmt;                           \
> > -               unsigned long long ___param[] = { args };                   \
> > +               int __ret;                                                  \
> > +               ___bpf_fill(___param, args);                                \
> >                 _Pragma("GCC diagnostic pop")                               \
>
> Let's clean this up a little bit;
> 1. static const char ___fmt should be the very first
> 2. _Pragma scope should be minimal necessary, which includes only
> ___bpf_fill, right?
> 3. Empty line after int __ret; and let's keep three underscores for consistency.
>
>
> > -               int ___ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),    \
> > -                                           ___param, sizeof(___param));    \
> > -               ___ret;                                                     \
> > +               __ret = bpf_seq_printf(seq, ___fmt, sizeof(___fmt),         \
> > +                                      ___param, sizeof(___param));         \
> > +               __ret;                                                      \
>
> but actually you don't need __ret at all, just bpf_seq_printf() here, right?

Agreed with everything and also the indentation comment in 5/6, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper
  2021-03-26 22:55   ` Andrii Nakryiko
@ 2021-04-06 16:06     ` Florent Revest
  2021-04-07 22:03       ` Andrii Nakryiko
  0 siblings, 1 reply; 23+ messages in thread
From: Florent Revest @ 2021-04-06 16:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Fri, Mar 26, 2021 at 11:55 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > The implementation takes inspiration from the existing bpf_trace_printk
> > helper but there are a few differences:
> >
> > To allow for a large number of format-specifiers, parameters are
> > provided in an array, like in bpf_seq_printf.
> >
> > Because the output string takes two arguments and the array of
> > parameters also takes two arguments, the format string needs to fit in
> > one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
> > NULL-terminated read-only map, we don't need a format string length arg.
> >
> > Because the format-string is known at verification time, we also move
> > most of the format string validation, currently done in formatting
> > helper calls, into the verifier logic. This makes debugging easier and
> > also slightly improves the runtime performance.
> >
> > Signed-off-by: Florent Revest <revest@chromium.org>
> > ---
> >  include/linux/bpf.h            |  6 ++++
> >  include/uapi/linux/bpf.h       | 28 ++++++++++++++++++
> >  kernel/bpf/helpers.c           |  2 ++
> >  kernel/bpf/verifier.c          | 41 +++++++++++++++++++++++++++
> >  kernel/trace/bpf_trace.c       | 52 ++++++++++++++++++++++++++++++++++
> >  tools/include/uapi/linux/bpf.h | 28 ++++++++++++++++++
> >  6 files changed, 157 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 7b5319d75b3e..f3d9c8fa60b3 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
> >  extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
> >  extern const struct bpf_func_proto bpf_copy_from_user_proto;
> >  extern const struct bpf_func_proto bpf_snprintf_btf_proto;
> > +extern const struct bpf_func_proto bpf_snprintf_proto;
> >  extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> >  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> > @@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> >  struct btf_id_set;
> >  bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
> >
> > +enum bpf_printf_mod_type;
> > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > +                       u32 num_args);
> > +
> >  #endif /* _LINUX_BPF_H */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 2d3036e292a9..86af61e912c6 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4660,6 +4660,33 @@ union bpf_attr {
> >   *     Return
> >   *             The number of traversed map elements for success, **-EINVAL** for
> >   *             invalid **flags**.
> > + *
> > + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
> > + *     Description
> > + *             Outputs a string into the **str** buffer of size **str_size**
> > + *             based on a format string stored in a read-only map pointed by
> > + *             **fmt**.
> > + *
> > + *             Each format specifier in **fmt** corresponds to one u64 element
> > + *             in the **data** array. For strings and pointers where pointees
> > + *             are accessed, only the pointer values are stored in the *data*
> > + *             array. The *data_len* is the size of *data* in bytes.
> > + *
> > + *             Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> > + *             memory. Reading kernel memory may fail due to either invalid
> > + *             address or valid address but requiring a major memory fault. If
> > + *             reading kernel memory fails, the string for **%s** will be an
> > + *             empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
>
> would it make sense for sleepable programs to allow memory fault when
> reading memory?

Probably yes. How would you do that ? I'm guessing that in
bpf_trace_copy_string you would call either strncpy_from_X_nofault or
strncpy_from_X depending on a condition but I'm not sure which one.

> > + *             Not returning error to bpf program is consistent with what
> > + *             **bpf_trace_printk**\ () does for now.
> > + *
> > + *     Return
> > + *             The strictly positive length of the formatted string, including
> > + *             the trailing zero character. If the return value is greater than
> > + *             **str_size**, **str** contains a truncated string, guaranteed to
> > + *             be zero-terminated.
>
> Except when str_size == 0.

Right

> > + *
> > + *             Or **-EBUSY** if the per-CPU memory copy buffer is busy.
> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -4827,6 +4854,7 @@ union bpf_attr {
> >         FN(sock_from_file),             \
> >         FN(check_mtu),                  \
> >         FN(for_each_map_elem),          \
> > +       FN(snprintf),                   \
> >         /* */
> >
> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> > index 074800226327..12f4cfb04fe7 100644
> > --- a/kernel/bpf/helpers.c
> > +++ b/kernel/bpf/helpers.c
> > @@ -750,6 +750,8 @@ bpf_base_func_proto(enum bpf_func_id func_id)
> >                 return &bpf_probe_read_kernel_str_proto;
> >         case BPF_FUNC_snprintf_btf:
> >                 return &bpf_snprintf_btf_proto;
> > +       case BPF_FUNC_snprintf:
> > +               return &bpf_snprintf_proto;
> >         default:
> >                 return NULL;
> >         }
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 9e03608725b4..a89599dc51c9 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5729,6 +5729,41 @@ static int check_reference_leak(struct bpf_verifier_env *env)
> >         return state->acquired_refs ? -EINVAL : 0;
> >  }
> >
> > +static int check_bpf_snprintf_call(struct bpf_verifier_env *env,
> > +                                  struct bpf_reg_state *regs)
> > +{
> > +       struct bpf_reg_state *fmt_reg = &regs[BPF_REG_3];
> > +       struct bpf_reg_state *data_len_reg = &regs[BPF_REG_5];
> > +       struct bpf_map *fmt_map = fmt_reg->map_ptr;
> > +       int err, fmt_map_off, num_args;
> > +       u64 fmt_addr;
> > +       char *fmt;
> > +
> > +       /* data must be an array of u64 so data_len must be a multiple of 8 */
> > +       if (data_len_reg->var_off.value & 7)
>
> `% 8` is not cool anymore... :)

Haha, this is a leftover from bpf_seq_printf but I agree % 8 is nicer.

> > +               return -EINVAL;
> > +       num_args = data_len_reg->var_off.value / 8;
> > +
> > +       /* fmt being ARG_PTR_TO_CONST_STR guarantees that var_off is const
> > +        * and map_direct_value_addr is set.
> > +        */
> > +       fmt_map_off = fmt_reg->off + fmt_reg->var_off.value;
> > +       err = fmt_map->ops->map_direct_value_addr(fmt_map, &fmt_addr,
> > +                                                 fmt_map_off);
> > +       if (err)
> > +               return err;
> > +       fmt = (char *)fmt_addr + fmt_map_off;
> > +
> > +       /* We are also guaranteed that fmt+fmt_map_off is NULL terminated, we
> > +        * can focus on validating the format specifiers.
> > +        */
> > +       err = bpf_printf_preamble(fmt, UINT_MAX, NULL, NULL, NULL, num_args);
> > +       if (err < 0)
> > +               verbose(env, "Invalid format string\n");
> > +
> > +       return err;
> > +}
> > +
> >  static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
> >                              int *insn_idx_p)
> >  {
> > @@ -5843,6 +5878,12 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >                         return -EINVAL;
> >         }
> >
> > +       if (func_id == BPF_FUNC_snprintf) {
> > +               err = check_bpf_snprintf_call(env, regs);
> > +               if (err < 0)
> > +                       return err;
> > +       }
> > +
> >         /* reset caller saved regs */
> >         for (i = 0; i < CALLER_SAVED_REGS; i++) {
> >                 mark_reg_not_init(env, regs, caller_saved[i]);
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index 0fdca94a3c9c..15cbc8b63206 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1230,6 +1230,56 @@ const struct bpf_func_proto bpf_snprintf_btf_proto = {
> >         .arg5_type      = ARG_ANYTHING,
> >  };
> >
> > +#define MAX_SNPRINTF_VARARGS           12
> > +
> > +BPF_CALL_5(bpf_snprintf, char *, str, u32, str_size, char *, fmt,
> > +          const void *, data, u32, data_len)
> > +{
> > +       enum bpf_printf_mod_type mod[MAX_SNPRINTF_VARARGS];
> > +       u64 args[MAX_SNPRINTF_VARARGS];
> > +       int err, num_args;
> > +
> > +       if (data_len & 7 || data_len > MAX_SNPRINTF_VARARGS * 8 ||
> > +           (data_len && !data))
>
> see previous patches, data_len > 0 should be iff data != NULL, I think

Commented there.

> > +               return -EINVAL;
> > +       num_args = data_len / 8;
> > +
> > +       /* ARG_PTR_TO_CONST_STR guarantees that fmt is zero-terminated so we
> > +        * can safely give an unbounded size.
> > +        */
> > +       err = bpf_printf_preamble(fmt, UINT_MAX, data, args, mod, num_args);
> > +       if (err < 0)
> > +               return err;
> > +
> > +       /* Maximumly we can have MAX_SNPRINTF_VARARGS parameters, just give
> > +        * all of them to snprintf().
> > +        */
> > +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> > +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> > +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> > +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> > +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> > +               BPF_CAST_FMT_ARG(11, args, mod));
> > +       if (str_size)
> > +               str[str_size - 1] = '\0';
>
> hm... what if err < str_size ?

Then there would be two zeroes, one set by snprintf in the middle and
one set by us at the end. :| I was a bit lazy there, I agree it would
be nicer if we'd do if (err >= str_size) instead.

Also makes me wonder what if str == NULL and str_size != 0. I just
assumed that the verifier would prevent that from happening but
discussions in the other patches make me unsure now.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-04-06 15:35       ` Florent Revest
@ 2021-04-07 21:53         ` Andrii Nakryiko
  2021-04-08 22:52           ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-04-07 21:53 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Apr 6, 2021 at 8:35 AM Florent Revest <revest@chromium.org> wrote:
>
> [Sorry for the late replies, I'm just back from a long easter break :)]
>
> On Fri, Mar 26, 2021 at 11:51 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > > > Unfortunately, the implementation of the two existing helpers already
> > > > drifted quite a bit and unifying them entailed a lot of changes:
> > >
> > > "Unfortunately" as in a lot of extra work for you? I think overall
> > > though it was very fortunate that you ended up doing it, all
> > > implementations are more feature-complete and saner now, no? Thanks a
> > > lot for your hard work!
>
> Ahah, "unfortunately" a bit of extra work for me, indeed. But I find
> this kind of refactoring patches even harder to review than to write
> so thank you too!
>
> > > > - bpf_trace_printk always expected fmt[fmt_size] to be the terminating
> > > >   NULL character, this is no longer true, the first 0 is terminating.
> > >
> > > You mean if you had bpf_trace_printk("bla bla\0some more bla\0", 24)
> > > it would emit that zero character? If yes, I don't think it was a sane
> > > behavior anyways.
>
> The call to snprintf in bpf_do_trace_printk would eventually ignore
> "some more bla" but the parsing done in bpf_trace_printk would indeed
> read the whole string.
>
> > > This is great, you already saved some lines of code! I suspect I'll
> > > have some complaints about mods (it feels like this preample should
> > > provide extra information about which arguments have to be read from
> > > kernel/user memory, but I'll see next patches first.
> >
> > Disregard the last part (at least for now). I had a mental model that
> > it should be possible to parse a format string once and then remember
> > "instructions" (i.e., arg1 is long, arg2 is string, and so on). But
> > that's too complicated, so I think re-parsing the format string is
> > much simpler.
>
> I also wanted to do that originally but realized it would keep a lot
> of the complexity in the helpers themselves and not really move the
> needle.
>
> > > > +/* Horrid workaround for getting va_list handling working with different
> > > > + * argument type combinations generically for 32 and 64 bit archs.
> > > > + */
> > > > +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                            \
> > > > +       ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                        \
> > > > +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))     \
> > > > +         ? args[arg_nb]                                                \
> > > > +         : ((mod[arg_nb] == BPF_PRINTF_LONG ||                         \
> > > > +            (mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \
> > >
> > > is this right? INT is always 32-bit, it's only LONG that differs.
> > > Shouldn't the rule be
> > >
> > > (LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
> > > (INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]
> > >
> > > Does (long) cast do anything fancy when casting from u64? Sorry, maybe
> > > I'm confused.
>
> To be honest, I am also confused by that logic... :p My patch tries to
> conserve exactly the same logic as "88a5c690b6 bpf: fix
> bpf_trace_printk on 32 bit archs" because I was also afraid of missing
> something and could not test it on 32 bit arches. From that commit
> description, it is unclear to me what "u32 and long are passed
> differently to u64, since the result of C conditional operators
> follows the "usual arithmetic conversions" rules" means. Maybe Daniel
> can comment on this ?

Yeah, no idea. Seems like the code above should work fine for 32 and
64 bitness and both little- and big-endianness.

>
> > > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > > > +                       u32 num_args)
> > > > +{
> > > > +       struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > > > +       int err, i, fmt_cnt = 0, copy_size, used;
> > > > +       char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > > > +       bool prepare_args = final_args && mod;
> > >
> > > probably better to enforce that both or none are specified, otherwise
> > > return error
>
> Fair :)
>
> > it's actually three of them: raw_args, mod, and num_args, right? All
> > three are either NULL or non-NULL.
>
> It is a bit tricky to see from that patch but in "3/6 bpf: Add a
> bpf_snprintf helper" the verifier code calls this function with
> num_args != 0 to check whether the number of arguments is correct
> without actually converting anything.
>
> Also when the helper gets called, raw_args can come from the BPF
> program and be NULL but in that case we will also have num_args = 0
> guaranteed by the helper so the loop will bail out if it encounters a
> format specifier.

ok, but at least final_args and mod are locked together, so should be
enforced to be either null or not, right?

>
> > > > +       enum bpf_printf_mod_type current_mod;
> > > > +       size_t tmp_buf_len;
> > > > +       u64 current_arg;
> > > > +       char fmt_ptype;
> > > > +
> > > > +       for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
> > >
> > > Can we say that if the last character is not '\0' then it's a bad
> > > format string and return -EINVAL? And if \0 is inside the format
> > > string, then it's also a bad format string? I wonder what others think
> > > about this?... I think sanity should prevail.
>
> Overall, there are two situations:
> - bpf_seq_printf, bpf_trace_printk: we have a pointer and size but we
> are not guaranteed zero-termination
> - bpf_snprintf: we have a pointer, no size but it's guaranteed to be
> zero-terminated (by ARG_PTR_TO_CONST_STR)
>
> Currently, in the bpf_snprintf helper, I set fmt_size to UINT_MAX and
> the terminating condition will be fmt[i] == '\0'.
> As you pointed out a bit further, I got a bit carried away with the
> refactoring and dropped the zero-termination checks for the existing
> helpers !
>
> So I see two possibilities:
> - either we check fmt[last] == '\0', add a bail out condition in the
> loop if we encounter another `\0` and set fmt_size to sprintf(fmt) in
> the bpf_snprintf verifier and helper code.
> - or we unconditionally call strnlen(fmt, fmt_size) in
> bpf_printf_preamble. If no 0 is found, we return an error, if there is
> one we treat it as the NULL terminating character.

I was thinking about the second one. It is clearly acceptable on BPF
verifier side, though one might argue that we are doing extra work on
the BPF helper side. I don't think it matters in practice, so I'll be
fine with that, if that makes code cleaner and simpler.

>
> > > > +               if ((!isprint(fmt[i]) && !isspace(fmt[i])) ||
> > > > +                   !isascii(fmt[i])) {
> > >
> > > && always binds tighter than ||, so you can omit extra (). I'd put
> > > this on a single line as well, but that's a total nit.
>
> Neat! :)

I just got a compilation warning in a similar situation yesterday when
I dropped unnecessary parentheses, so some versions of compilers might
think it is not a good practice. Just keep that in mind. I don't think
I care enough.

>
> > > > +                       err = -EINVAL;
> > > > +                       goto out;
> > > > +               }
> > > >
> > > >                 if (fmt[i] != '%')
> > > >                         continue;
> > > >
> > > > -               if (fmt_cnt >= 3)
> > > > -                       return -EINVAL;
> > > > +               if (fmt[i + 1] == '%') {
> > > > +                       i++;
> > > > +                       continue;
> > > > +               }
> > > > +
> > > > +               if (fmt_cnt >= num_args) {
> > > > +                       err = -EINVAL;
> > > > +                       goto out;
> > > > +               }
> > > >
> > > >                 /* fmt[i] != 0 && fmt[last] == 0, so we can access fmt[i + 1] */
> > > >                 i++;
> > > > -               if (fmt[i] == 'l') {
> > > > -                       mod[fmt_cnt]++;
> > > > +
> > > > +               /* skip optional "[0 +-][num]" width formating field */
> > >
> > > typo: formatting
>
> Fixed
>
> > > > +               while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> > > > +                      fmt[i] == ' ')
> > > > +                       i++;
> > > > +               if (fmt[i] >= '1' && fmt[i] <= '9') {
> > > >                         i++;
> > >
> > > Are we worried about integer overflow here? %123123123123123d
> > > hopefully won't crash anything, right?
>
> I expect that this should be handled gracefully by the subsequent call
> to snprintf(). Our parsing logic does not guarantee that the format
> string is 100% legit but it guarantees that it's safe to call
> vsnprintf with arguments coming from BPF. If the output buffer is too
> small to hold the output, the output will be truncated.
>
> Note that this is already how bpf_seq_printf already works.

Ok, but let's not hope and add the test for this.

>
> > > > -               } else if (fmt[i] == 'p') {
> > > > -                       mod[fmt_cnt]++;
> > > > -                       if ((fmt[i + 1] == 'k' ||
> > > > -                            fmt[i + 1] == 'u') &&
> > > > +                       while (fmt[i] >= '0' && fmt[i] <= '9')
> > > > +                               i++;
> > >
> > > whoa, fmt_size shouldn't be ignored
>
> Oh no, I'll attach the stone of shame! It all made sense with
> bpf_snprintf() in mind because, there, we are guaranteed to have a
> NULL terminated string already but in an excess of refactoring
> enthusiasm I dropped the zero-termination check for the other helpers.
>
> But if we implement either of the options discussed above, then we do
> not need to constantly check fmt_size.

let's see when we get to the next version ;) I don't remember code
enough by now, but I'll keep that in mind for the next revision
anyways

>
> > > > +               }
> > > > +
> > >
> > > and here if we exhausted all format string but haven't gotten to
> > > format specified, we should -EINVAL
> > >
> > > if (i >= fmt_size) return -EINVAL?
>
> Same comment as above, if we are already guaranteed zero-termination
> by a prior check, we don't need that.
>
> > > > +               if (fmt[i] == 'p') {
> > > > +                       current_mod = BPF_PRINTF_LONG;
> > > > +
> > > > +                       if ((fmt[i + 1] == 'k' || fmt[i + 1] == 'u') &&
> > > >                             fmt[i + 2] == 's') {
> > >
> > > right, if i + 2 is ok to access? always be remembering about fmt_size
>
> Same.
>
> > > >                                 fmt_ptype = fmt[i + 1];
> > > >                                 i += 2;
> > > >                                 goto fmt_str;
> > > >                         }
> > > >
> > > > -                       if (fmt[i + 1] == 'B') {
> > > > -                               i++;
> > > > +                       if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
> > > > +                           ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
> > > > +                           fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
> > > > +                           fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
> > > > +                               /* just kernel pointers */
> > > > +                               if (prepare_args)
> > > > +                                       current_arg = raw_args[fmt_cnt];
> > >
> > > fmt_cnt is not the best name, imo. arg_cnt makes more sense
>
> Mh, we already have "num_args" that can make it confusing. The way I see it:
> - the number of format specifiers is the number of %d %s... in the format string
> - the number of arguments is the number of values given in the raw_args array.
>

Well, if you read "fmt_cnt" as "number of formatters" then yeah, I
suppose it's fine. Never mind. Just fmt_cnt and fmt_size refers to
slightly different "fmt"s, which confused me for a bit, but that's ok.
You use different naming conventions, which is inconsistent, so maybe
adjust that for purists (i.e., if you have num_args, then you should
have num_fmts; or, alternatively, arg_cnt and fmt_cnt). But I'm just
nitpicking, obviously.

> Potentially, the number of arguments can be higher than the number of
> format specifiers, for example printf("%d\n", i, j); so calling them
> differently sorta makes sense.
> But to be honest I don't have a strong opinion about this and this is
> mainly just a remaining from the current bpf_seq_printf
> implementation.

Yep. I think it's ok to allow num_args > fmt_cnt.

>
> > > > +                       if (!tmp_buf) {
> > > > +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> > > > +                               if (WARN_ON_ONCE(used > 1)) {
> > > > +                                       this_cpu_dec(bpf_printf_buf_used);
> > > > +                                       return -EBUSY;
> > > > +                               }
> > > > +                               preempt_disable();
> > >
> > > shouldn't we preempt_disable before we got bpf_printf_buf_used? if we
> > > get preempted after incrementing counter, buffer will be unusable for
> > > a while, potentially, right?
>
> Good catch :)
>
> > > > +                       if (!tmp_buf) {
> > > > +                               used = this_cpu_inc_return(bpf_printf_buf_used);
> > > > +                               if (WARN_ON_ONCE(used > 1)) {
> > > > +                                       this_cpu_dec(bpf_printf_buf_used);
> > > > +                                       return -EBUSY;
> > > > +                               }
> > > > +                               preempt_disable();
> > > > +                               tmp_buf = bufs->tmp_buf;
> > > > +                               tmp_buf_len = MAX_PRINTF_BUF_LEN;
> > > > +                       }
> > >
> > > how about helper used like this:
> > >
> > > if (try_get_fmt_tmp_buf(&tmp_buf, &tmp_buf_len))
> > >    return -EBUSY;
> > >
> > > which will do nothing if tmp_buf != NULL?
>
> Yep, I quite like that. :)
>
> > > >  fmt_next:
> > > > +               if (prepare_args) {
> > >
> > > I'd ditch prepare_args variable and just check final_args (and that
> > > check to ensure both mods and final_args are specified I suggested
> > > above)
>
> Agreed.
>
> > > > +                       mod[fmt_cnt] = current_mod;
> > > > +                       final_args[fmt_cnt] = current_arg;
> > > > +               }
> > > >                 fmt_cnt++;
> > > >         }
> > >
> > > [...]
> > >
> > > > -
> > > > -       return __BPF_TP_EMIT();
> > > > +       err = 0;
> > > > +out:
> > > > +       bpf_printf_postamble();
> > >
> > > naming is hard, but preamble and postamble reads way too fancy :)
> > > bpf_printf_prepare() and bpf_printf_cleanup() or something like that
> > > is a bit more to the point, no?
>
> Haha, you're totally right.
>
> > > > +       if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
> > > > +           (data_len && !data))
> > >
> > > data && !data_len is also an error, no?
>
> Isn't that checked by the verifier ?

data_len is ARG_CONST_SIZE_OR_ZERO, so data_len == 0 is allowed by
verifier. But it's probably no harm either to allow data != NULL and
data_len = 0. Might simplify some more dynamic use of snprintf(),
actually.

>
> I don't mind adding an explicit check for it (data_len ^ data or two
> clearer conditions ?) but I think that even if this were to happen,
> this would not be a problem: if we encounter a format specifier,
> num_args will be zero so bpf_printf_preamble will bail out before it
> tries to access data.

agree

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper
  2021-04-06 16:06     ` Florent Revest
@ 2021-04-07 22:03       ` Andrii Nakryiko
  2021-04-08 22:43         ` Florent Revest
  0 siblings, 1 reply; 23+ messages in thread
From: Andrii Nakryiko @ 2021-04-07 22:03 UTC (permalink / raw)
  To: Florent Revest
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Tue, Apr 6, 2021 at 9:06 AM Florent Revest <revest@chromium.org> wrote:
>
> On Fri, Mar 26, 2021 at 11:55 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > > The implementation takes inspiration from the existing bpf_trace_printk
> > > helper but there are a few differences:
> > >
> > > To allow for a large number of format-specifiers, parameters are
> > > provided in an array, like in bpf_seq_printf.
> > >
> > > Because the output string takes two arguments and the array of
> > > parameters also takes two arguments, the format string needs to fit in
> > > one argument. But because ARG_PTR_TO_CONST_STR guarantees to point to a
> > > NULL-terminated read-only map, we don't need a format string length arg.
> > >
> > > Because the format-string is known at verification time, we also move
> > > most of the format string validation, currently done in formatting
> > > helper calls, into the verifier logic. This makes debugging easier and
> > > also slightly improves the runtime performance.
> > >
> > > Signed-off-by: Florent Revest <revest@chromium.org>
> > > ---
> > >  include/linux/bpf.h            |  6 ++++
> > >  include/uapi/linux/bpf.h       | 28 ++++++++++++++++++
> > >  kernel/bpf/helpers.c           |  2 ++
> > >  kernel/bpf/verifier.c          | 41 +++++++++++++++++++++++++++
> > >  kernel/trace/bpf_trace.c       | 52 ++++++++++++++++++++++++++++++++++
> > >  tools/include/uapi/linux/bpf.h | 28 ++++++++++++++++++
> > >  6 files changed, 157 insertions(+)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 7b5319d75b3e..f3d9c8fa60b3 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -1893,6 +1893,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto;
> > >  extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto;
> > >  extern const struct bpf_func_proto bpf_copy_from_user_proto;
> > >  extern const struct bpf_func_proto bpf_snprintf_btf_proto;
> > > +extern const struct bpf_func_proto bpf_snprintf_proto;
> > >  extern const struct bpf_func_proto bpf_per_cpu_ptr_proto;
> > >  extern const struct bpf_func_proto bpf_this_cpu_ptr_proto;
> > >  extern const struct bpf_func_proto bpf_ktime_get_coarse_ns_proto;
> > > @@ -2018,4 +2019,9 @@ int bpf_arch_text_poke(void *ip, enum bpf_text_poke_type t,
> > >  struct btf_id_set;
> > >  bool btf_id_set_contains(const struct btf_id_set *set, u32 id);
> > >
> > > +enum bpf_printf_mod_type;
> > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > > +                       u32 num_args);
> > > +
> > >  #endif /* _LINUX_BPF_H */
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index 2d3036e292a9..86af61e912c6 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -4660,6 +4660,33 @@ union bpf_attr {
> > >   *     Return
> > >   *             The number of traversed map elements for success, **-EINVAL** for
> > >   *             invalid **flags**.
> > > + *
> > > + * long bpf_snprintf(char *str, u32 str_size, const char *fmt, u64 *data, u32 data_len)
> > > + *     Description
> > > + *             Outputs a string into the **str** buffer of size **str_size**
> > > + *             based on a format string stored in a read-only map pointed by
> > > + *             **fmt**.
> > > + *
> > > + *             Each format specifier in **fmt** corresponds to one u64 element
> > > + *             in the **data** array. For strings and pointers where pointees
> > > + *             are accessed, only the pointer values are stored in the *data*
> > > + *             array. The *data_len* is the size of *data* in bytes.
> > > + *
> > > + *             Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> > > + *             memory. Reading kernel memory may fail due to either invalid
> > > + *             address or valid address but requiring a major memory fault. If
> > > + *             reading kernel memory fails, the string for **%s** will be an
> > > + *             empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
> >
> > would it make sense for sleepable programs to allow memory fault when
> > reading memory?
>
> Probably yes. How would you do that ? I'm guessing that in
> bpf_trace_copy_string you would call either strncpy_from_X_nofault or
> strncpy_from_X depending on a condition but I'm not sure which one.

So you'd have different bpf_snprintf_proto definitions for sleepable
and non-sleepable programs. And each implementation would call
bpf_printf_prepare() with a flag specifying which copy_string variant
to use (sleepable or not). So for BPF users it would be the same
bpf_snprintf() helper, but it would transparently be doing different
things depending on which BPF program it is being called from. That's
how we do bpf_get_stack(), for example, see
bpf_get_stack_proto_pe/bpf_get_stack_proto_raw_tp/bpf_get_stack_proto_tp.

But consider that for a follow up, no need to address right now.

>
> > > + *             Not returning error to bpf program is consistent with what
> > > + *             **bpf_trace_printk**\ () does for now.
> > > + *
> > > + *     Return
> > > + *             The strictly positive length of the formatted string, including
> > > + *             the trailing zero character. If the return value is greater than
> > > + *             **str_size**, **str** contains a truncated string, guaranteed to
> > > + *             be zero-terminated.
> >
> > Except when str_size == 0.
>
> Right
>

So I assume you'll adjust the comment? I always find it confusing when
zero case is allowed but it is not specified what's the behavior is.

> > > + *
> > > + *             Or **-EBUSY** if the per-CPU memory copy buffer is busy.
> > >   */

[...]

> > > +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > > +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> > > +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> > > +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> > > +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> > > +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> > > +               BPF_CAST_FMT_ARG(11, args, mod));
> > > +       if (str_size)
> > > +               str[str_size - 1] = '\0';
> >
> > hm... what if err < str_size ?
>
> Then there would be two zeroes, one set by snprintf in the middle and
> one set by us at the end. :| I was a bit lazy there, I agree it would
> be nicer if we'd do if (err >= str_size) instead.
>

snprintf() seems to be always zero-terminating the string if str_size
> 0, and does nothing if str_size == 0, which is exactly what you
want, so you can just drop that zero termination logic.

> Also makes me wonder what if str == NULL and str_size != 0. I just
> assumed that the verifier would prevent that from happening but
> discussions in the other patches make me unsure now.


ARG_CONST_SIZE_OR_ZERO should make sure that ARG_PTR_TO_MEM before
that is a valid initialized memory. But please double-check, of
course.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper
  2021-04-07 22:03       ` Andrii Nakryiko
@ 2021-04-08 22:43         ` Florent Revest
  0 siblings, 0 replies; 23+ messages in thread
From: Florent Revest @ 2021-04-08 22:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Thu, Apr 8, 2021 at 12:03 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Apr 6, 2021 at 9:06 AM Florent Revest <revest@chromium.org> wrote:
> > On Fri, Mar 26, 2021 at 11:55 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > > > + *             Formats **%s** and **%p{i,I}{4,6}** require to read kernel
> > > > + *             memory. Reading kernel memory may fail due to either invalid
> > > > + *             address or valid address but requiring a major memory fault. If
> > > > + *             reading kernel memory fails, the string for **%s** will be an
> > > > + *             empty string, and the ip address for **%p{i,I}{4,6}** will be 0.
> > >
> > > would it make sense for sleepable programs to allow memory fault when
> > > reading memory?
> >
> > Probably yes. How would you do that ? I'm guessing that in
> > bpf_trace_copy_string you would call either strncpy_from_X_nofault or
> > strncpy_from_X depending on a condition but I'm not sure which one.
>
> So you'd have different bpf_snprintf_proto definitions for sleepable
> and non-sleepable programs. And each implementation would call
> bpf_printf_prepare() with a flag specifying which copy_string variant
> to use (sleepable or not). So for BPF users it would be the same
> bpf_snprintf() helper, but it would transparently be doing different
> things depending on which BPF program it is being called from. That's
> how we do bpf_get_stack(), for example, see
> bpf_get_stack_proto_pe/bpf_get_stack_proto_raw_tp/bpf_get_stack_proto_tp.
>
> But consider that for a follow up, no need to address right now.

Ok let's keep this separate.

> >
> > > > + *             Not returning error to bpf program is consistent with what
> > > > + *             **bpf_trace_printk**\ () does for now.
> > > > + *
> > > > + *     Return
> > > > + *             The strictly positive length of the formatted string, including
> > > > + *             the trailing zero character. If the return value is greater than
> > > > + *             **str_size**, **str** contains a truncated string, guaranteed to
> > > > + *             be zero-terminated.
> > >
> > > Except when str_size == 0.
> >
> > Right
> >
>
> So I assume you'll adjust the comment? I always find it confusing when
> zero case is allowed but it is not specified what's the behavior is.

Yes, sorry it wasn't clear :) I agree it's worth being explicit.

> > > > +       err = snprintf(str, str_size, fmt, BPF_CAST_FMT_ARG(0, args, mod),
> > > > +               BPF_CAST_FMT_ARG(1, args, mod), BPF_CAST_FMT_ARG(2, args, mod),
> > > > +               BPF_CAST_FMT_ARG(3, args, mod), BPF_CAST_FMT_ARG(4, args, mod),
> > > > +               BPF_CAST_FMT_ARG(5, args, mod), BPF_CAST_FMT_ARG(6, args, mod),
> > > > +               BPF_CAST_FMT_ARG(7, args, mod), BPF_CAST_FMT_ARG(8, args, mod),
> > > > +               BPF_CAST_FMT_ARG(9, args, mod), BPF_CAST_FMT_ARG(10, args, mod),
> > > > +               BPF_CAST_FMT_ARG(11, args, mod));
> > > > +       if (str_size)
> > > > +               str[str_size - 1] = '\0';
> > >
> > > hm... what if err < str_size ?
> >
> > Then there would be two zeroes, one set by snprintf in the middle and
> > one set by us at the end. :| I was a bit lazy there, I agree it would
> > be nicer if we'd do if (err >= str_size) instead.
> >
>
> snprintf() seems to be always zero-terminating the string if str_size
> > 0, and does nothing if str_size == 0, which is exactly what you
> want, so you can just drop that zero termination logic.

Oh, that's right! I was confused by snprintf's documentation "the
resulting string is truncated" but as I read the vsnprintf
implementation I see this is indeed always zero-terminated. Great :)

> > Also makes me wonder what if str == NULL and str_size != 0. I just
> > assumed that the verifier would prevent that from happening but
> > discussions in the other patches make me unsure now.
>
>
> ARG_CONST_SIZE_OR_ZERO should make sure that ARG_PTR_TO_MEM before
> that is a valid initialized memory. But please double-check, of
> course.

Will do.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf
  2021-04-07 21:53         ` Andrii Nakryiko
@ 2021-04-08 22:52           ` Florent Revest
  0 siblings, 0 replies; 23+ messages in thread
From: Florent Revest @ 2021-04-08 22:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Yonghong Song, KP Singh, Brendan Jackman, open list

On Wed, Apr 7, 2021 at 11:54 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Apr 6, 2021 at 8:35 AM Florent Revest <revest@chromium.org> wrote:
> > On Fri, Mar 26, 2021 at 11:51 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > > On Fri, Mar 26, 2021 at 2:53 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > > On Tue, Mar 23, 2021 at 7:23 PM Florent Revest <revest@chromium.org> wrote:
> > > > > +/* Horrid workaround for getting va_list handling working with different
> > > > > + * argument type combinations generically for 32 and 64 bit archs.
> > > > > + */
> > > > > +#define BPF_CAST_FMT_ARG(arg_nb, args, mod)                            \
> > > > > +       ((mod[arg_nb] == BPF_PRINTF_LONG_LONG ||                        \
> > > > > +        (mod[arg_nb] == BPF_PRINTF_LONG && __BITS_PER_LONG == 64))     \
> > > > > +         ? args[arg_nb]                                                \
> > > > > +         : ((mod[arg_nb] == BPF_PRINTF_LONG ||                         \
> > > > > +            (mod[arg_nb] == BPF_PRINTF_INT && __BITS_PER_LONG == 32))  \
> > > >
> > > > is this right? INT is always 32-bit, it's only LONG that differs.
> > > > Shouldn't the rule be
> > > >
> > > > (LONG_LONG || LONG && __BITS_PER_LONG) -> (__u64)args[args_nb]
> > > > (INT || LONG && __BITS_PER_LONG == 32) -> (__u32)args[args_nb]
> > > >
> > > > Does (long) cast do anything fancy when casting from u64? Sorry, maybe
> > > > I'm confused.
> >
> > To be honest, I am also confused by that logic... :p My patch tries to
> > conserve exactly the same logic as "88a5c690b6 bpf: fix
> > bpf_trace_printk on 32 bit archs" because I was also afraid of missing
> > something and could not test it on 32 bit arches. From that commit
> > description, it is unclear to me what "u32 and long are passed
> > differently to u64, since the result of C conditional operators
> > follows the "usual arithmetic conversions" rules" means. Maybe Daniel
> > can comment on this ?
>
> Yeah, no idea. Seems like the code above should work fine for 32 and
> 64 bitness and both little- and big-endianness.

Yeah, looks good to me as well. I'll use it in v3.

> > > > > +int bpf_printf_preamble(char *fmt, u32 fmt_size, const u64 *raw_args,
> > > > > +                       u64 *final_args, enum bpf_printf_mod_type *mod,
> > > > > +                       u32 num_args)
> > > > > +{
> > > > > +       struct bpf_printf_buf *bufs = this_cpu_ptr(&bpf_printf_buf);
> > > > > +       int err, i, fmt_cnt = 0, copy_size, used;
> > > > > +       char *unsafe_ptr = NULL, *tmp_buf = NULL;
> > > > > +       bool prepare_args = final_args && mod;
> > > >
> > > > probably better to enforce that both or none are specified, otherwise
> > > > return error
> >
> > Fair :)
> >
> > > it's actually three of them: raw_args, mod, and num_args, right? All
> > > three are either NULL or non-NULL.
> >
> > It is a bit tricky to see from that patch but in "3/6 bpf: Add a
> > bpf_snprintf helper" the verifier code calls this function with
> > num_args != 0 to check whether the number of arguments is correct
> > without actually converting anything.
> >
> > Also when the helper gets called, raw_args can come from the BPF
> > program and be NULL but in that case we will also have num_args = 0
> > guaranteed by the helper so the loop will bail out if it encounters a
> > format specifier.
>
> ok, but at least final_args and mod are locked together, so should be
> enforced to be either null or not, right?

Yes :) will do.

> > > > > +       enum bpf_printf_mod_type current_mod;
> > > > > +       size_t tmp_buf_len;
> > > > > +       u64 current_arg;
> > > > > +       char fmt_ptype;
> > > > > +
> > > > > +       for (i = 0; i < fmt_size && fmt[i] != '\0'; i++) {
> > > >
> > > > Can we say that if the last character is not '\0' then it's a bad
> > > > format string and return -EINVAL? And if \0 is inside the format
> > > > string, then it's also a bad format string? I wonder what others think
> > > > about this?... I think sanity should prevail.
> >
> > Overall, there are two situations:
> > - bpf_seq_printf, bpf_trace_printk: we have a pointer and size but we
> > are not guaranteed zero-termination
> > - bpf_snprintf: we have a pointer, no size but it's guaranteed to be
> > zero-terminated (by ARG_PTR_TO_CONST_STR)
> >
> > Currently, in the bpf_snprintf helper, I set fmt_size to UINT_MAX and
> > the terminating condition will be fmt[i] == '\0'.
> > As you pointed out a bit further, I got a bit carried away with the
> > refactoring and dropped the zero-termination checks for the existing
> > helpers !
> >
> > So I see two possibilities:
> > - either we check fmt[last] == '\0', add a bail out condition in the
> > loop if we encounter another `\0` and set fmt_size to sprintf(fmt) in
> > the bpf_snprintf verifier and helper code.
> > - or we unconditionally call strnlen(fmt, fmt_size) in
> > bpf_printf_preamble. If no 0 is found, we return an error, if there is
> > one we treat it as the NULL terminating character.
>
> I was thinking about the second one. It is clearly acceptable on BPF
> verifier side, though one might argue that we are doing extra work on
> the BPF helper side. I don't think it matters in practice, so I'll be
> fine with that, if that makes code cleaner and simpler.

I also prefer that option, yes.

> > > > > +               if ((!isprint(fmt[i]) && !isspace(fmt[i])) ||
> > > > > +                   !isascii(fmt[i])) {
> > > >
> > > > && always binds tighter than ||, so you can omit extra (). I'd put
> > > > this on a single line as well, but that's a total nit.
> >
> > Neat! :)
>
> I just got a compilation warning in a similar situation yesterday when
> I dropped unnecessary parentheses, so some versions of compilers might
> think it is not a good practice. Just keep that in mind. I don't think
> I care enough.

Yes, I noticed the same compilation warning and it bothers me. I'll
keep the parentheses but make it one line.

> > > > > +               while (fmt[i] == '0' || fmt[i] == '+'  || fmt[i] == '-' ||
> > > > > +                      fmt[i] == ' ')
> > > > > +                       i++;
> > > > > +               if (fmt[i] >= '1' && fmt[i] <= '9') {
> > > > >                         i++;
> > > >
> > > > Are we worried about integer overflow here? %123123123123123d
> > > > hopefully won't crash anything, right?
> >
> > I expect that this should be handled gracefully by the subsequent call
> > to snprintf(). Our parsing logic does not guarantee that the format
> > string is 100% legit but it guarantees that it's safe to call
> > vsnprintf with arguments coming from BPF. If the output buffer is too
> > small to hold the output, the output will be truncated.
> >
> > Note that this is already how bpf_seq_printf already works.
>
> Ok, but let's not hope and add the test for this.

Ok

> > > > > -               } else if (fmt[i] == 'p') {
> > > > > -                       mod[fmt_cnt]++;
> > > > > -                       if ((fmt[i + 1] == 'k' ||
> > > > > -                            fmt[i + 1] == 'u') &&
> > > > > +                       while (fmt[i] >= '0' && fmt[i] <= '9')
> > > > > +                               i++;
> > > >
> > > > whoa, fmt_size shouldn't be ignored
> >
> > Oh no, I'll attach the stone of shame! It all made sense with
> > bpf_snprintf() in mind because, there, we are guaranteed to have a
> > NULL terminated string already but in an excess of refactoring
> > enthusiasm I dropped the zero-termination check for the other helpers.
> >
> > But if we implement either of the options discussed above, then we do
> > not need to constantly check fmt_size.
>
> let's see when we get to the next version ;) I don't remember code
> enough by now, but I'll keep that in mind for the next revision
> anyways

Sure, I'll send v3 asap.

> > > > >                                 fmt_ptype = fmt[i + 1];
> > > > >                                 i += 2;
> > > > >                                 goto fmt_str;
> > > > >                         }
> > > > >
> > > > > -                       if (fmt[i + 1] == 'B') {
> > > > > -                               i++;
> > > > > +                       if (fmt[i + 1] == 0 || isspace(fmt[i + 1]) ||
> > > > > +                           ispunct(fmt[i + 1]) || fmt[i + 1] == 'K' ||
> > > > > +                           fmt[i + 1] == 'x' || fmt[i + 1] == 'B' ||
> > > > > +                           fmt[i + 1] == 's' || fmt[i + 1] == 'S') {
> > > > > +                               /* just kernel pointers */
> > > > > +                               if (prepare_args)
> > > > > +                                       current_arg = raw_args[fmt_cnt];
> > > >
> > > > fmt_cnt is not the best name, imo. arg_cnt makes more sense
> >
> > Mh, we already have "num_args" that can make it confusing. The way I see it:
> > - the number of format specifiers is the number of %d %s... in the format string
> > - the number of arguments is the number of values given in the raw_args array.
> >
>
> Well, if you read "fmt_cnt" as "number of formatters" then yeah, I
> suppose it's fine. Never mind. Just fmt_cnt and fmt_size refers to
> slightly different "fmt"s, which confused me for a bit, but that's ok.
> You use different naming conventions, which is inconsistent, so maybe
> adjust that for purists (i.e., if you have num_args, then you should
> have num_fmts; or, alternatively, arg_cnt and fmt_cnt). But I'm just
> nitpicking, obviously.

I agree, I'll see if I can clean this up a bit.

> > > > > +       if (data_len & 7 || data_len > MAX_SEQ_PRINTF_VARARGS * 8 ||
> > > > > +           (data_len && !data))
> > > >
> > > > data && !data_len is also an error, no?
> >
> > Isn't that checked by the verifier ?
>
> data_len is ARG_CONST_SIZE_OR_ZERO, so data_len == 0 is allowed by
> verifier. But it's probably no harm either to allow data != NULL and
> data_len = 0. Might simplify some more dynamic use of snprintf(),
> actually.

Agree

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, back to index

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-24  2:22 [PATCH bpf-next v2 0/6] Add a snprintf eBPF helper Florent Revest
2021-03-24  2:22 ` [PATCH bpf-next v2 1/6] bpf: Factorize bpf_trace_printk and bpf_seq_printf Florent Revest
2021-03-26 21:53   ` Andrii Nakryiko
2021-03-26 22:51     ` Andrii Nakryiko
2021-04-06 15:35       ` Florent Revest
2021-04-07 21:53         ` Andrii Nakryiko
2021-04-08 22:52           ` Florent Revest
2021-03-24  2:22 ` [PATCH bpf-next v2 2/6] bpf: Add a ARG_PTR_TO_CONST_STR argument type Florent Revest
2021-03-26 22:23   ` Andrii Nakryiko
2021-04-06 15:38     ` Florent Revest
2021-03-24  2:22 ` [PATCH bpf-next v2 3/6] bpf: Add a bpf_snprintf helper Florent Revest
2021-03-26 22:55   ` Andrii Nakryiko
2021-04-06 16:06     ` Florent Revest
2021-04-07 22:03       ` Andrii Nakryiko
2021-04-08 22:43         ` Florent Revest
2021-03-24  2:22 ` [PATCH bpf-next v2 4/6] libbpf: Initialize the bpf_seq_printf parameters array field by field Florent Revest
2021-03-26 23:01   ` Andrii Nakryiko
2021-04-06 15:42     ` Florent Revest
2021-03-24  2:22 ` [PATCH bpf-next v2 5/6] libbpf: Introduce a BPF_SNPRINTF helper macro Florent Revest
2021-03-26 23:02   ` Andrii Nakryiko
2021-03-24  2:22 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add a series of tests for bpf_snprintf Florent Revest
2021-03-26 23:05   ` Andrii Nakryiko
2021-04-06 15:40     ` Florent Revest

BPF Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/bpf/0 bpf/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 bpf bpf/ https://lore.kernel.org/bpf \
		bpf@vger.kernel.org
	public-inbox-index bpf

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.bpf


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git