* [PATCH bpf-next v1 00/13] Fixes for dynptr
@ 2022-10-18 13:59 Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
                   ` (13 more replies)
  0 siblings, 14 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

This set fixes multiple issues in the dynptr code discovered during code
review.

 - Missing dynptr stack slot liveness propagation
 - Missing checks for PTR_TO_STACK variable offset
 - Incomplete destruction of dynptr stack slots on writes
 - Modification of dynptr struct through callback argument
   with reg->type == PTR_TO_DYNPTR

These can be abused to perform arbitrary kernel memory reads/writes by
replacing dynptr contents.

The first three cases are now unreachable from unprivileged BPF since
the commit 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF") which
has been applied to released stable kernels v6.0.1 and v5.19.15.

The changes are fairly intrusive and non-trivial, so in-depth review is
warranted: they rework the code before making the fixes to it, but for
the better (IMO).

Please see the individual commit logs for the details.

Kumar Kartikeya Dwivedi (13):
  bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  bpf: Rework process_dynptr_func
  bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  bpf: Rework check_func_arg_reg_off
  bpf: Fix state pruning for STACK_DYNPTR stack slots
  bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  bpf: Fix partial dynptr stack slot reads/writes
  bpf: Use memmove for bpf_dynptr_{read,write}
  selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  selftests/bpf: Add dynptr pruning tests
  selftests/bpf: Add dynptr var_off tests
  selftests/bpf: Add dynptr partial slot overwrite tests
  selftests/bpf: Add dynptr helper tests

 include/linux/bpf.h                           |  10 +-
 include/linux/bpf_verifier.h                  |   8 +-
 include/uapi/linux/bpf.h                      |   8 +-
 kernel/bpf/btf.c                              |  22 +-
 kernel/bpf/helpers.c                          |  22 +-
 kernel/bpf/verifier.c                         | 574 ++++++++++++++----
 scripts/bpf_doc.py                            |   1 +
 tools/include/uapi/linux/bpf.h                |   8 +-
 .../testing/selftests/bpf/prog_tests/dynptr.c |   9 +-
 .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  12 +-
 .../testing/selftests/bpf/progs/dynptr_fail.c |  35 ++
 .../selftests/bpf/progs/dynptr_success.c      |  20 +
 .../bpf/progs/test_kfunc_dynptr_param.c       |  12 -
 .../selftests/bpf/progs/user_ringbuf_fail.c   |  35 ++
 tools/testing/selftests/bpf/verifier/dynptr.c | 182 ++++++
 .../testing/selftests/bpf/verifier/ringbuf.c  |   2 +-
 17 files changed, 780 insertions(+), 185 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/verifier/dynptr.c

-- 
2.38.0



* [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 19:45   ` David Vernet
  2022-10-19 22:59   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
the underlying register type is subjected to more special checks to
determine the type of object represented by the pointer and its state
consistency.

Move dynptr checks to their own 'process_dynptr_func' function so that
it is consistent and in line with existing code. This also makes it easier
to reuse this code for kfunc handling.

To this end, remove the dependency on the bpf_call_arg_meta parameter by
instead taking uninit_dynptr_regno by pointer. It only needs to be a
valid pointer when arg_type has MEM_UNINIT set.
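
As a rough illustration of that contract (not the kernel code), the sketch
below models an out-parameter that is only dereferenced when the "uninit"
flag is set, so a caller that never passes the flag can hand in NULL. The
names toy_process_dynptr and TOY_MEM_UNINIT are hypothetical, and the NULL
check is a defensive addition that the patch itself does not have.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MEM_UNINIT; the real flag value differs. */
#define TOY_MEM_UNINIT (1u << 0)

/*
 * Toy model of the out-parameter contract: regno_out is only read or
 * written when TOY_MEM_UNINIT is set. A stored value of 0 doubles as
 * "not set yet", which works because argument registers are R1..R5.
 */
static int toy_process_dynptr(unsigned int arg_flags, int regno,
			      uint8_t *regno_out)
{
	assert(regno >= 1 && regno <= 5);

	if (!(arg_flags & TOY_MEM_UNINIT))
		return 0;		/* regno_out is never touched */
	if (!regno_out)
		return -EFAULT;		/* defensive check, an assumption */
	if (*regno_out)
		return -EFAULT;		/* a second uninit dynptr argument */

	*regno_out = (uint8_t)regno;
	return 0;
}
```

Calling toy_process_dynptr(0, 3, NULL) succeeds, while a second call with
TOY_MEM_UNINIT against an already-populated slot fails with -EFAULT.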

Then, reuse this consolidated function in kfunc dynptr handling too.
Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
been lifted.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h                  |   8 +-
 kernel/bpf/btf.c                              |  17 +--
 kernel/bpf/verifier.c                         | 115 ++++++++++--------
 .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
 .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
 5 files changed, 69 insertions(+), 88 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 9e1e6965f407..a33683e0618b 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
 			     u32 regno);
 int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 		   u32 regno, u32 mem_size);
-bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
-			      struct bpf_reg_state *reg);
-bool is_dynptr_type_expected(struct bpf_verifier_env *env,
-			     struct bpf_reg_state *reg,
-			     enum bpf_arg_type arg_type);
+int process_dynptr_func(struct bpf_verifier_env *env, int regno,
+			enum bpf_arg_type arg_type, int argno,
+			u8 *uninit_dynptr_regno);
 
 /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
 static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index eba603cec2c5..1827d889e08a 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 						return -EINVAL;
 					}
 
-					if (!is_dynptr_reg_valid_init(env, reg)) {
-						bpf_log(log,
-							"arg#%d pointer type %s %s must be valid and initialized\n",
-							i, btf_type_str(ref_t),
-							ref_tname);
+					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
 						return -EINVAL;
-					}
-
-					if (!is_dynptr_type_expected(env, reg,
-							ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
-						bpf_log(log,
-							"arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
-							i, btf_type_str(ref_t),
-							ref_tname);
-						return -EINVAL;
-					}
-
 					continue;
 				}
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 6f6d2d511c06..31c0c999448e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 	return true;
 }
 
-bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
-			      struct bpf_reg_state *reg)
+static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
 	int spi = get_spi(reg->off);
@@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
 	return true;
 }
 
-bool is_dynptr_type_expected(struct bpf_verifier_env *env,
-			     struct bpf_reg_state *reg,
-			     enum bpf_arg_type arg_type)
+static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+				    enum bpf_arg_type arg_type)
 {
 	struct bpf_func_state *state = func(env, reg);
 	enum bpf_dynptr_type dynptr_type;
@@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
 	return 0;
 }
 
+int process_dynptr_func(struct bpf_verifier_env *env, int regno,
+			enum bpf_arg_type arg_type, int argno,
+			u8 *uninit_dynptr_regno)
+{
+	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
+
+	/* We only need to check for initialized / uninitialized helper
+	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
+	 * assumption is that if it is, that a helper function
+	 * initialized the dynptr on behalf of the BPF program.
+	 */
+	if (base_type(reg->type) == PTR_TO_DYNPTR)
+		return 0;
+	if (arg_type & MEM_UNINIT) {
+		if (!is_dynptr_reg_valid_uninit(env, reg)) {
+			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
+			return -EINVAL;
+		}
+
+		/* We only support one dynptr being uninitialized at the moment,
+		 * which is sufficient for the helper functions we have right now.
+		 */
+		if (*uninit_dynptr_regno) {
+			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
+			return -EFAULT;
+		}
+
+		*uninit_dynptr_regno = regno;
+	} else {
+		if (!is_dynptr_reg_valid_init(env, reg)) {
+			verbose(env,
+				"Expected an initialized dynptr as arg #%d\n",
+				argno + 1);
+			return -EINVAL;
+		}
+
+		if (!is_dynptr_type_expected(env, reg, arg_type)) {
+			const char *err_extra = "";
+
+			switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
+			case DYNPTR_TYPE_LOCAL:
+				err_extra = "local";
+				break;
+			case DYNPTR_TYPE_RINGBUF:
+				err_extra = "ringbuf";
+				break;
+			default:
+				err_extra = "<unknown>";
+				break;
+			}
+			verbose(env,
+				"Expected a dynptr of type %s as arg #%d\n",
+				err_extra, argno + 1);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
 static bool arg_type_is_mem_size(enum bpf_arg_type type)
 {
 	return type == ARG_CONST_SIZE ||
@@ -6086,52 +6143,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		err = check_mem_size_reg(env, reg, regno, true, meta);
 		break;
 	case ARG_PTR_TO_DYNPTR:
-		/* We only need to check for initialized / uninitialized helper
-		 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
-		 * assumption is that if it is, that a helper function
-		 * initialized the dynptr on behalf of the BPF program.
-		 */
-		if (base_type(reg->type) == PTR_TO_DYNPTR)
-			break;
-		if (arg_type & MEM_UNINIT) {
-			if (!is_dynptr_reg_valid_uninit(env, reg)) {
-				verbose(env, "Dynptr has to be an uninitialized dynptr\n");
-				return -EINVAL;
-			}
-
-			/* We only support one dynptr being uninitialized at the moment,
-			 * which is sufficient for the helper functions we have right now.
-			 */
-			if (meta->uninit_dynptr_regno) {
-				verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
-				return -EFAULT;
-			}
-
-			meta->uninit_dynptr_regno = regno;
-		} else if (!is_dynptr_reg_valid_init(env, reg)) {
-			verbose(env,
-				"Expected an initialized dynptr as arg #%d\n",
-				arg + 1);
-			return -EINVAL;
-		} else if (!is_dynptr_type_expected(env, reg, arg_type)) {
-			const char *err_extra = "";
-
-			switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
-			case DYNPTR_TYPE_LOCAL:
-				err_extra = "local";
-				break;
-			case DYNPTR_TYPE_RINGBUF:
-				err_extra = "ringbuf";
-				break;
-			default:
-				err_extra = "<unknown>";
-				break;
-			}
-			verbose(env,
-				"Expected a dynptr of type %s as arg #%d\n",
-				err_extra, arg + 1);
-			return -EINVAL;
-		}
+		if (process_dynptr_func(env, regno, arg_type, arg, &meta->uninit_dynptr_regno))
+			return -EACCES;
 		break;
 	case ARG_CONST_ALLOC_SIZE_OR_ZERO:
 		if (!tnum_is_const(reg->var_off)) {
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
index c210657d4d0a..fc562e863e79 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
@@ -18,10 +18,7 @@ static struct {
 	const char *expected_verifier_err_msg;
 	int expected_runtime_err;
 } kfunc_dynptr_tests[] = {
-	{"dynptr_type_not_supp",
-	 "arg#0 pointer type STRUCT bpf_dynptr_kern points to unsupported dynamic pointer type", 0},
-	{"not_valid_dynptr",
-	 "arg#0 pointer type STRUCT bpf_dynptr_kern must be valid and initialized", 0},
+	{"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0},
 	{"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0},
 	{"dynptr_data_null", NULL, -EBADMSG},
 };
diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
index ce39d096bba3..f4a8250329b2 100644
--- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
@@ -32,18 +32,6 @@ int err, pid;
 
 char _license[] SEC("license") = "GPL";
 
-SEC("?lsm.s/bpf")
-int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
-	     unsigned int size)
-{
-	char write_data[64] = "hello there, world!!";
-	struct bpf_dynptr ptr;
-
-	bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
-
-	return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
-}
-
 SEC("?lsm.s/bpf")
 int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size)
 {
-- 
2.38.0



* [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 23:16   ` David Vernet
  2022-10-18 13:59 ` [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM Kumar Kartikeya Dwivedi
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the stack that is passed into the callback. To
reflect such a state, a special register type was created.

However, some checks were incorrectly bypassed when this feature was
added. First, an arg_type with the MEM_UNINIT flag, which initializes a
dynptr, must be rejected for this register type. Second, there are plans
to add dynptr helpers that operate on the dynptr itself and may change
its offset and other properties.

In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
to such helpers, but the current code simply returns 0.

The rejection for helpers that release the dynptr is already handled.

To fix this, we take a step back and rework existing code in a way
that will allow fitting in all classes of helpers and have a coherent
model for dealing with the variety of use cases in which dynptr is used.

First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
with a DYNPTR_TYPE_* constant that denotes the only type it accepts.

Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
fact. To make the distinction clear, use the MEM_RDONLY flag to indicate
that the helper only operates on the memory pointed to by the dynptr,
not the dynptr itself. In C parlance, it would be equivalent to taking
the dynptr as a pointer to const argument.
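
To make the C analogy concrete, here is a small sketch using a hypothetical
struct toy_dynptr (not the kernel's struct bpf_dynptr_kern). A function
taking a pointer to const can still write through the data pointer but
cannot change the view, while a function taking a plain pointer can change
the view itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of a dynptr: a data pointer plus a size. */
struct toy_dynptr {
	uint8_t *data;
	uint32_t size;
};

/*
 * Pointer to const: inside this function dp->data has type
 * 'uint8_t *const', so the bytes it points to are writable but
 * dp->data and dp->size are not.
 */
static int toy_write(const struct toy_dynptr *dp, uint32_t off, uint8_t val)
{
	if (off >= dp->size)
		return -1;
	dp->data[off] = val;	/* allowed: the pointee is not const */
	/* dp->size = 0; would be rejected by the compiler */
	return 0;
}

/* Plain pointer: the helper may mutate the view itself. */
static void toy_advance(struct toy_dynptr *dp, uint32_t n)
{
	assert(n <= dp->size);
	dp->data += n;
	dp->size -= n;
}
```

In this picture, toy_write corresponds to a helper whose argument carries
MEM_RDONLY, and toy_advance to one whose argument carries neither flag.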

When neither of these flags is present, the helper is allowed to
mutate both the dynptr itself and also the memory it points to.
Currently, the read only status of the memory is not tracked in the
dynptr, but it would be trivial to add this support inside dynptr state
of the register.

With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
better reflect its usage, it can no longer be passed to helpers that
initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
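
The resulting acceptance rules can be summarized in a toy model. This is
only a sketch of the flag logic described above: the enum, the TOY_* flag
values and toy_check_dynptr_arg are hypothetical names, and the real
verifier additionally inspects stack slot state and the dynptr type, which
is omitted here.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the two register types that carry a dynptr. */
enum toy_reg_type { TOY_PTR_TO_STACK, TOY_CONST_PTR_TO_DYNPTR };

/* Hypothetical flag bits; the kernel's MEM_* values differ. */
#define TOY_MEM_UNINIT (1u << 0)
#define TOY_MEM_RDONLY (1u << 1)

static_assert((TOY_MEM_UNINIT & TOY_MEM_RDONLY) == 0,
	      "flags must be distinct bits");

static int toy_check_dynptr_arg(enum toy_reg_type reg, unsigned int flags)
{
	const unsigned int both = TOY_MEM_UNINIT | TOY_MEM_RDONLY;

	/* The two flags are mutually exclusive: a helper definition bug. */
	if ((flags & both) == both)
		return -EFAULT;

	/* Initializing helpers need writable stack memory. */
	if (flags & TOY_MEM_UNINIT)
		return reg == TOY_PTR_TO_STACK ? 0 : -EINVAL;

	/* A const dynptr may only go to helpers that promise not to mutate it. */
	if (reg == TOY_CONST_PTR_TO_DYNPTR && !(flags & TOY_MEM_RDONLY))
		return -EINVAL;

	return 0;
}
```

Under this model a stack dynptr is accepted with either flag or with none,
whereas the callback-provided const dynptr is accepted only with
TOY_MEM_RDONLY.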

A note to reviewers is that in code that does mark_stack_slots_dynptr,
and unmark_stack_slots_dynptr, we implicitly rely on the fact that
PTR_TO_STACK reg is the only case that can reach that code path, as one
cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
both cases such helpers won't be setting that flag.

The next patch will add a couple of selftest cases to make sure this
doesn't break.

Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h                           |   4 +-
 include/uapi/linux/bpf.h                      |   8 +-
 kernel/bpf/btf.c                              |   7 +-
 kernel/bpf/helpers.c                          |  18 +-
 kernel/bpf/verifier.c                         | 203 ++++++++++++++----
 scripts/bpf_doc.py                            |   1 +
 tools/include/uapi/linux/bpf.h                |   8 +-
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  10 +-
 8 files changed, 185 insertions(+), 74 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9e7d46d16032..13c6ff2de540 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -656,7 +656,7 @@ enum bpf_reg_type {
 	PTR_TO_MEM,		 /* reg points to valid memory region */
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
-	PTR_TO_DYNPTR,		 /* reg points to a dynptr */
+	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
 	__BPF_REG_TYPE_MAX,
 
 	/* Extended reg_types. */
@@ -2689,7 +2689,7 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
 		     enum bpf_dynptr_type type, u32 offset, u32 size);
 void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
 int bpf_dynptr_check_size(u32 size);
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
+u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
 
 #ifdef CONFIG_BPF_LSM
 void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 17f61338f8f8..2b490bde85a6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5282,7 +5282,7 @@ union bpf_attr {
  *	Return
  *		Nothing. Always succeeds.
  *
- * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
  *	Description
  *		Read *len* bytes from *src* into *dst*, starting from *offset*
  *		into *src*.
@@ -5292,7 +5292,7 @@ union bpf_attr {
  *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
  *		*flags* is not 0.
  *
- * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
  *	Description
  *		Write *len* bytes from *src* into *dst*, starting from *offset*
  *		into *dst*.
@@ -5302,7 +5302,7 @@ union bpf_attr {
  *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
  *		is a read-only dynptr or if *flags* is not 0.
  *
- * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
  *	Description
  *		Get a pointer to the underlying dynptr data.
  *
@@ -5403,7 +5403,7 @@ union bpf_attr {
  *		Drain samples from the specified user ring buffer, and invoke
  *		the provided callback for each such sample:
  *
- *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
+ *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
  *
  *		If **callback_fn** returns 0, the helper will continue to try
  *		and drain the next sample, up to a maximum of
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 1827d889e08a..b6cd91c23a27 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6479,14 +6479,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				}
 
 				if (arg_dynptr) {
-					if (reg->type != PTR_TO_STACK) {
-						bpf_log(log, "arg#%d pointer type %s %s not to stack\n",
+					if (reg->type != PTR_TO_STACK &&
+					    reg->type != CONST_PTR_TO_DYNPTR) {
+						bpf_log(log, "arg#%d pointer type %s %s not to stack or dynptr\n",
 							i, btf_type_str(ref_t),
 							ref_tname);
 						return -EINVAL;
 					}
 
-					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
+					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR | MEM_RDONLY, i, NULL))
 						return -EINVAL;
 					continue;
 				}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index a6b04faed282..0a4017eb3616 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1398,7 +1398,7 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
 #define DYNPTR_SIZE_MASK	0xFFFFFF
 #define DYNPTR_RDONLY_BIT	BIT(31)
 
-static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
+static bool bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_RDONLY_BIT;
 }
@@ -1408,7 +1408,7 @@ static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_typ
 	ptr->size |= type << DYNPTR_TYPE_SHIFT;
 }
 
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
+u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
@@ -1432,7 +1432,7 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
 	memset(ptr, 0, sizeof(*ptr));
 }
 
-static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
+static int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
 {
 	u32 size = bpf_dynptr_get_size(ptr);
 
@@ -1477,7 +1477,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
 	.arg4_type	= ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,
 };
 
-BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src,
+BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern *, src,
 	   u32, offset, u64, flags)
 {
 	int err;
@@ -1500,12 +1500,12 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = {
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
 	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
-	.arg3_type	= ARG_PTR_TO_DYNPTR,
+	.arg3_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg4_type	= ARG_ANYTHING,
 	.arg5_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
+BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
 	   u32, len, u64, flags)
 {
 	int err;
@@ -1526,14 +1526,14 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {
 	.func		= bpf_dynptr_write,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg2_type	= ARG_ANYTHING,
 	.arg3_type	= ARG_PTR_TO_MEM | MEM_RDONLY,
 	.arg4_type	= ARG_CONST_SIZE_OR_ZERO,
 	.arg5_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
+BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
 {
 	int err;
 
@@ -1554,7 +1554,7 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
 	.func		= bpf_dynptr_data,
 	.gpl_only	= false,
 	.ret_type	= RET_PTR_TO_DYNPTR_MEM_OR_NULL,
-	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg2_type	= ARG_ANYTHING,
 	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 31c0c999448e..87d9cccd1623 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -563,7 +563,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
 		[PTR_TO_BUF]		= "buf",
 		[PTR_TO_FUNC]		= "func",
 		[PTR_TO_MAP_KEY]	= "map_key",
-		[PTR_TO_DYNPTR]		= "dynptr_ptr",
+		[CONST_PTR_TO_DYNPTR]	= "dynptr",
 	};
 
 	if (type & PTR_MAYBE_NULL) {
@@ -697,6 +697,27 @@ static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
 	return type == BPF_DYNPTR_TYPE_RINGBUF;
 }
 
+static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
+			       struct bpf_reg_state *reg2,
+			       enum bpf_dynptr_type type);
+
+static void __mark_reg_not_init(const struct bpf_verifier_env *env,
+				struct bpf_reg_state *reg);
+
+static void mark_dynptr_stack_regs(struct bpf_reg_state *sreg1,
+				   struct bpf_reg_state *sreg2,
+				   enum bpf_dynptr_type type)
+{
+	__mark_dynptr_regs(sreg1, sreg2, type);
+}
+
+static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
+			       enum bpf_dynptr_type type)
+{
+	__mark_dynptr_regs(reg1, NULL, type);
+}
+
+
 static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 				   enum bpf_arg_type arg_type, int insn_idx)
 {
@@ -718,9 +739,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	if (type == BPF_DYNPTR_TYPE_INVALID)
 		return -EINVAL;
 
-	state->stack[spi].spilled_ptr.dynptr.first_slot = true;
-	state->stack[spi].spilled_ptr.dynptr.type = type;
-	state->stack[spi - 1].spilled_ptr.dynptr.type = type;
+	mark_dynptr_stack_regs(&state->stack[spi].spilled_ptr,
+			       &state->stack[spi - 1].spilled_ptr, type);
 
 	if (dynptr_type_refcounted(type)) {
 		/* The id is used to track proper releasing */
@@ -728,8 +748,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 		if (id < 0)
 			return id;
 
-		state->stack[spi].spilled_ptr.id = id;
-		state->stack[spi - 1].spilled_ptr.id = id;
+		state->stack[spi].spilled_ptr.ref_obj_id = id;
+		state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
 	}
 
 	return 0;
@@ -751,25 +771,23 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	}
 
 	/* Invalidate any slices associated with this dynptr */
-	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
-		release_reference(env, state->stack[spi].spilled_ptr.id);
-		state->stack[spi].spilled_ptr.id = 0;
-		state->stack[spi - 1].spilled_ptr.id = 0;
-	}
-
-	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
-	state->stack[spi].spilled_ptr.dynptr.type = 0;
-	state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
+	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
+		WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
 
+	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
+	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
 	return 0;
 }
 
 static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
-	int i;
+	int spi, i;
 
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return false;
+
+	spi = get_spi(reg->off);
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return true;
 
@@ -785,9 +803,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
+	int spi;
 	int i;
 
+	/* This already represents first slot of initialized bpf_dynptr */
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return true;
+
+	spi = get_spi(reg->off);
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
 	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
 		return false;
@@ -806,15 +829,21 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
 {
 	struct bpf_func_state *state = func(env, reg);
 	enum bpf_dynptr_type dynptr_type;
-	int spi = get_spi(reg->off);
+	int spi;
 
+	/* Fold MEM_RDONLY, caller already checked it */
+	arg_type &= ~MEM_RDONLY;
 	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
 	if (arg_type == ARG_PTR_TO_DYNPTR)
 		return true;
 
 	dynptr_type = arg_to_dynptr_type(arg_type);
-
-	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
+	if (reg->type == CONST_PTR_TO_DYNPTR) {
+		return reg->dynptr.type == dynptr_type;
+	} else {
+		spi = get_spi(reg->off);
+		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
+	}
 }
 
 /* The reg state of a pointer or a bounded scalar was saved when
@@ -1317,9 +1346,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
 	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
 };
 
-static void __mark_reg_not_init(const struct bpf_verifier_env *env,
-				struct bpf_reg_state *reg);
-
 /* This helper doesn't clear reg->id */
 static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
 {
@@ -1382,6 +1408,25 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
 	__mark_reg_known_zero(regs + regno);
 }
 
+static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
+			       struct bpf_reg_state *reg2,
+			       enum bpf_dynptr_type type)
+{
+	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
+	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR.
+	 */
+	__mark_reg_known_zero(reg1);
+	reg1->type = CONST_PTR_TO_DYNPTR;
+	reg1->dynptr.type = type;
+	reg1->dynptr.first_slot = true;
+	if (!reg2)
+		return;
+	__mark_reg_known_zero(reg2);
+	reg2->type = CONST_PTR_TO_DYNPTR;
+	reg2->dynptr.type = type;
+	reg2->dynptr.first_slot = false;
+}
+
 static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
 {
 	if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
@@ -5571,19 +5616,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
 	return 0;
 }
 
+/* Implementation details:
+ *
+ * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
+ * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
+ *
+ * In both cases we deal with the first 8 bytes, but need to mark the next 8
+ * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
+ * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
+ *
+ * Mutability of bpf_dynptr is at two levels, one is at the level of struct
+ * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
+ * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
+ * mutate the view of the dynptr and also possibly destroy it. In the latter
+ * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
+ * memory that dynptr points to.
+ *
+ * The verifier will keep track of both levels of mutation (bpf_dynptr's in
+ * reg->type and the memory's in reg->dynptr.type), but there is no support for
+ * readonly dynptr view yet, hence only the first case is tracked and checked.
+ *
+ * This is consistent with how C applies the const modifier to a struct object,
+ * where the pointer itself inside bpf_dynptr becomes const but not what it
+ * points to.
+ *
+ * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
+ * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
+ */
 int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 			enum bpf_arg_type arg_type, int argno,
 			u8 *uninit_dynptr_regno)
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 
-	/* We only need to check for initialized / uninitialized helper
-	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
-	 * assumption is that if it is, that a helper function
-	 * initialized the dynptr on behalf of the BPF program.
+	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
+		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
+		return -EFAULT;
+	}
+
+	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
+	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
+	 *
+	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
+	 *		 constructing a mutable bpf_dynptr object.
+	 *
+	 *		 Currently, this is only possible with PTR_TO_STACK
+	 *		 pointing to a region of at least 16 bytes which doesn't
+	 *		 contain an existing bpf_dynptr.
+	 *
+	 *  MEM_RDONLY - Points to an initialized bpf_dynptr that will not be
+	 *		 mutated or destroyed. However, the memory it points to
+	 *		 may be mutated.
+	 *
+	 *  None       - Points to an initialized dynptr that can be mutated and
+	 *		 destroyed, including mutation of the memory it points
+	 *		 to.
 	 */
-	if (base_type(reg->type) == PTR_TO_DYNPTR)
-		return 0;
 	if (arg_type & MEM_UNINIT) {
 		if (!is_dynptr_reg_valid_uninit(env, reg)) {
 			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
@@ -5597,9 +5685,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
 			return -EFAULT;
 		}
-
 		*uninit_dynptr_regno = regno;
 	} else {
+		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
+		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
+			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
+			return -EINVAL;
+		}
+
 		if (!is_dynptr_reg_valid_init(env, reg)) {
 			verbose(env,
 				"Expected an initialized dynptr as arg #%d\n",
@@ -5607,6 +5700,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 			return -EINVAL;
 		}
 
+		arg_type &= ~MEM_RDONLY;
 		if (!is_dynptr_type_expected(env, reg, arg_type)) {
 			const char *err_extra = "";
 
@@ -5762,7 +5856,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
 static const struct bpf_reg_types dynptr_types = {
 	.types = {
 		PTR_TO_STACK,
-		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
+		CONST_PTR_TO_DYNPTR,
 	}
 };
 
@@ -5938,12 +6032,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
 }
 
-static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
+	int spi;
 
-	return state->stack[spi].spilled_ptr.id;
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return reg->ref_obj_id;
+	spi = get_spi(reg->off);
+	return state->stack[spi].spilled_ptr.ref_obj_id;
 }
 
 static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
@@ -6007,11 +6104,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 	if (arg_type_is_release(arg_type)) {
 		if (arg_type_is_dynptr(arg_type)) {
 			struct bpf_func_state *state = func(env, reg);
-			int spi = get_spi(reg->off);
+			int spi;
 
-			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
-			    !state->stack[spi].spilled_ptr.id) {
-				verbose(env, "arg %d is an unacquired reference\n", regno);
+			if (reg->type == PTR_TO_STACK) {
+				spi = get_spi(reg->off);
+				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
+				    !state->stack[spi].spilled_ptr.ref_obj_id) {
+					verbose(env, "arg %d is an unacquired reference\n", regno);
+					return -EINVAL;
+				}
+			} else {
+				verbose(env, "cannot release unowned const bpf_dynptr\n");
 				return -EINVAL;
 			}
 		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
@@ -6946,11 +7049,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
 {
 	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
 	 *			  callback_ctx, u64 flags);
-	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
+	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
 	 */
 	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
-	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
-	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
+	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
 	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
 
 	/* unused */
@@ -7328,6 +7430,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	regs = cur_regs(env);
 
+	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
+	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
+	 * is safe to do.
+	 */
 	if (meta.uninit_dynptr_regno) {
 		/* we write BPF_DW bits (8 bytes) at a time */
 		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
@@ -7346,6 +7452,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	if (meta.release_regno) {
 		err = -EINVAL;
+		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
+		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
+		 * is safe to do.
+		 */
 		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
 			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
 		else if (meta.ref_obj_id)
@@ -7428,11 +7538,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 					return -EFAULT;
 				}
 
-				if (base_type(reg->type) != PTR_TO_DYNPTR)
-					/* Find the id of the dynptr we're
-					 * tracking the reference of
-					 */
-					meta.ref_obj_id = stack_slot_get_id(env, reg);
+				/* Find the id of the dynptr we're
+				 * tracking the reference of
+				 */
+				meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
 				break;
 			}
 		}
diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
index c0e6690be82a..2865f2b22eca 100755
--- a/scripts/bpf_doc.py
+++ b/scripts/bpf_doc.py
@@ -750,6 +750,7 @@ class PrinterHelpers(Printer):
             'struct bpf_timer',
             'struct mptcp_sock',
             'struct bpf_dynptr',
+            'const struct bpf_dynptr',
             'struct iphdr',
             'struct ipv6hdr',
     }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 17f61338f8f8..2b490bde85a6 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5282,7 +5282,7 @@ union bpf_attr {
  *	Return
  *		Nothing. Always succeeds.
  *
- * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
  *	Description
  *		Read *len* bytes from *src* into *dst*, starting from *offset*
  *		into *src*.
@@ -5292,7 +5292,7 @@ union bpf_attr {
  *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
  *		*flags* is not 0.
  *
- * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
  *	Description
  *		Write *len* bytes from *src* into *dst*, starting from *offset*
  *		into *dst*.
@@ -5302,7 +5302,7 @@ union bpf_attr {
  *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
  *		is a read-only dynptr or if *flags* is not 0.
  *
- * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
  *	Description
  *		Get a pointer to the underlying dynptr data.
  *
@@ -5403,7 +5403,7 @@ union bpf_attr {
  *		Drain samples from the specified user ring buffer, and invoke
  *		the provided callback for each such sample:
  *
- *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
+ *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
  *
  *		If **callback_fn** returns 0, the helper will continue to try
  *		and drain the next sample, up to a maximum of
diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
index 02b18d018b36..39882580cb90 100644
--- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
@@ -668,13 +668,13 @@ static struct {
 	const char *expected_err_msg;
 } failure_tests[] = {
 	/* failure cases */
-	{"user_ringbuf_callback_bad_access1", "negative offset dynptr_ptr ptr"},
-	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr_ptr ptr"},
-	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr_ptr'"},
+	{"user_ringbuf_callback_bad_access1", "negative offset dynptr ptr"},
+	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr ptr"},
+	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr'"},
 	{"user_ringbuf_callback_null_context_write", "invalid mem access 'scalar'"},
 	{"user_ringbuf_callback_null_context_read", "invalid mem access 'scalar'"},
-	{"user_ringbuf_callback_discard_dynptr", "arg 1 is an unacquired reference"},
-	{"user_ringbuf_callback_submit_dynptr", "arg 1 is an unacquired reference"},
+	{"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
+	{"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
 };
 
-- 
2.38.0



* [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 21:38   ` sdf
  2022-11-07 22:35   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM and
RET_PTR_TO_ALLOC_MEM_OR_NULL. The former is confusingly named: it
implies that it carries MEM_ALLOC, while only the latter does. This
causes confusion during code review, leading to conclusions such as
that the return value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is
RET_PTR_TO_ALLOC_MEM | PTR_MAYBE_NULL) may be consumable by
bpf_ringbuf_{submit,discard}.

Rename it to RET_PTR_TO_MEM to make it clear that MEM_ALLOC needs to be
tacked on top of it.
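
To illustrate the naming problem, here is a minimal sketch of how the
modifier flags compose with the base return type. The TOY_* names and
bit values are made up for illustration and do not match the real
encodings in include/linux/bpf.h; only the composition mirrors the enum
touched by this patch.

```c
#include <assert.h>

/* Toy flag values chosen for illustration only; the kernel's real
 * encodings differ.
 */
enum {
	TOY_RET_PTR_TO_MEM = 1,		/* base return type */
	TOY_PTR_MAYBE_NULL = 1 << 8,	/* modifier: may be NULL */
	TOY_MEM_ALLOC      = 1 << 9,	/* modifier: ringbuf-reserved memory */
};

enum {
	TOY_RET_PTR_TO_ALLOC_MEM_OR_NULL =
		TOY_PTR_MAYBE_NULL | TOY_MEM_ALLOC | TOY_RET_PTR_TO_MEM,
	TOY_RET_PTR_TO_DYNPTR_MEM_OR_NULL =
		TOY_PTR_MAYBE_NULL | TOY_RET_PTR_TO_MEM,
};

/* Returns nonzero when the return type carries the MEM_ALLOC modifier. */
static int toy_carries_mem_alloc(int ret_type)
{
	return (ret_type & TOY_MEM_ALLOC) != 0;
}
```

In this model only the ALLOC_MEM_OR_NULL variant carries the modifier;
the base type and the DYNPTR_MEM_OR_NULL variant do not, which is what
the new name is meant to convey.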

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h   | 6 +++---
 kernel/bpf/verifier.c | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 13c6ff2de540..834276ba56c9 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -538,7 +538,7 @@ enum bpf_return_type {
 	RET_PTR_TO_SOCKET,		/* returns a pointer to a socket */
 	RET_PTR_TO_TCP_SOCK,		/* returns a pointer to a tcp_sock */
 	RET_PTR_TO_SOCK_COMMON,		/* returns a pointer to a sock_common */
-	RET_PTR_TO_ALLOC_MEM,		/* returns a pointer to dynamically allocated memory */
+	RET_PTR_TO_MEM,			/* returns a pointer to dynamically allocated memory */
 	RET_PTR_TO_MEM_OR_BTF_ID,	/* returns a pointer to a valid memory or a btf_id */
 	RET_PTR_TO_BTF_ID,		/* returns a pointer to a btf_id */
 	__BPF_RET_TYPE_MAX,
@@ -548,8 +548,8 @@ enum bpf_return_type {
 	RET_PTR_TO_SOCKET_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
 	RET_PTR_TO_TCP_SOCK_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
 	RET_PTR_TO_SOCK_COMMON_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
-	RET_PTR_TO_ALLOC_MEM_OR_NULL	= PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
-	RET_PTR_TO_DYNPTR_MEM_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
+	RET_PTR_TO_ALLOC_MEM_OR_NULL	= PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_MEM,
+	RET_PTR_TO_DYNPTR_MEM_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_MEM,
 	RET_PTR_TO_BTF_ID_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
 
 	/* This must be the last entry. Its purpose is to ensure the enum is
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 87d9cccd1623..a49b95c1af1b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7612,7 +7612,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag;
 		break;
-	case RET_PTR_TO_ALLOC_MEM:
+	case RET_PTR_TO_MEM:
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
 		regs[BPF_REG_0].mem_size = meta.mem_size;
-- 
2.38.0



* [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 21:55   ` sdf
  2022-11-07 23:17   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots Kumar Kartikeya Dwivedi
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

While check_func_arg_reg_off is the place which performs generic checks
needed by various candidates of reg->type, there is some handling for
special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
ARG_PTR_TO_ALLOC_MEM.

This commit aims to streamline these special cases and instead leave
other things up to argument type specific code to handle.

This is done primarily for two reasons. The first is that associating
reg->type back to its argument type leaves room for the list getting
out of sync when a new reg->type is supported by an arg_type.

The second is ARG_PTR_TO_ALLOC_MEM. The problem there is something we
already handle: whenever a release argument is expected, it should be
passed as the pointer that was received from the acquire function,
hence with zero fixed and variable offset.

There is nothing special about ARG_PTR_TO_ALLOC_MEM: its target
register type PTR_TO_MEM | MEM_ALLOC can already be passed with
non-zero offset to other helper functions, which makes sense.

Hence, lift the arg_type_is_release check for reg->off and cover all
possible register types, instead of duplicating the same kind of check
twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).

For the release argument, arg_type_is_dynptr is the special case, where
we reach the actual object being freed through the dynptr, so the
pointer must still be allowed to carry a fixed and variable offset;
process_dynptr_func will verify them later for the release argument
case as well.

Finally, since check_func_arg_reg_off is meant to be generic, move the
dynptr specific check into process_dynptr_func.
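
The release-argument rule described above can be sketched as a small
standalone model. All toy_* names are hypothetical and the per-type
rules for non-release arguments are deliberately not modelled; this is
only meant to show the shape of the decision, not the verifier code
itself.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_reg_type { TOY_PTR_TO_STACK, TOY_PTR_TO_MEM_ALLOC, TOY_PTR_TO_BTF_ID };

struct toy_arg {
	enum toy_reg_type type;
	int off;		/* fixed offset of the register */
	bool is_release;	/* argument type is a release argument */
	bool is_dynptr;		/* argument type is a dynptr argument */
};

/* Returns 0 when the offset is acceptable, -1 otherwise. */
static int toy_check_release_off(const struct toy_arg *a)
{
	/* Non-release arguments follow per-type rules, not modelled here. */
	if (!a->is_release)
		return 0;
	/* A dynptr on the stack points at the dynptr, not at the object
	 * being released, so a stack offset is expected.
	 */
	if (a->is_dynptr && a->type == TOY_PTR_TO_STACK)
		return 0;
	/* Every other release argument must be the pointer exactly as
	 * it was returned by the acquire function.
	 */
	return a->off ? -1 : 0;
}
```

Under these assumptions, a PTR_TO_MEM | MEM_ALLOC register with a
non-zero offset is rejected as a release argument, while a stack
pointer to a dynptr at some stack offset is let through to the later
dynptr specific checks.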

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c                         | 55 +++++++++++++++----
 .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
 2 files changed, 44 insertions(+), 13 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a49b95c1af1b..a8c277e51d63 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5654,6 +5654,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 		return -EFAULT;
 	}
 
+	/* CONST_PTR_TO_DYNPTR has fixed and variable offset as zero, ensured by
+	 * check_func_arg_reg_off, so this is only needed for PTR_TO_STACK.
+	 */
+	if (reg->off % BPF_REG_SIZE) {
+		verbose(env, "cannot pass in dynptr at an offset\n");
+		return -EINVAL;
+	}
+
 	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
 	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
 	 *
@@ -5672,6 +5680,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 	 *		 destroyed, including mutation of the memory it points
 	 *		 to.
 	 */
+
 	if (arg_type & MEM_UNINIT) {
 		if (!is_dynptr_reg_valid_uninit(env, reg)) {
 			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
@@ -5983,14 +5992,37 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	enum bpf_reg_type type = reg->type;
 	bool fixed_off_ok = false;
 
-	switch ((u32)type) {
-	/* Pointer types where reg offset is explicitly allowed: */
-	case PTR_TO_STACK:
-		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
-			verbose(env, "cannot pass in dynptr at an offset\n");
+	/* When a referenced register is passed to a release function, its
+	 * fixed offset must be 0.
+	 *
+	 * We will check arg_type_is_release reg has ref_obj_id when storing
+	 * meta->release_regno.
+	 */
+	if (arg_type_is_release(arg_type)) {
+		/* ARG_PTR_TO_DYNPTR is a bit special, as it may not directly
+		 * point to the object being released, but to dynptr pointing
+		 * to such object, which might be at some offset on the stack.
+		 *
+		 * In that case, we simply fall back to the default handling.
+		 */
+		if (arg_type_is_dynptr(arg_type) && type == PTR_TO_STACK)
+			goto check_type;
+		/* Going straight to check will catch this because fixed_off_ok
+		 * is false, but checking here allows us to give the user a
+		 * better error message.
+		 */
+		if (reg->off) {
+			verbose(env, "R%d must have zero offset when passed to release func\n",
+				regno);
 			return -EINVAL;
 		}
-		fallthrough;
+		goto check;
+	}
+check_type:
+	switch ((u32)type) {
+	/* Pointer types where both fixed and variable reg offset is explicitly
+	 * allowed: */
+	case PTR_TO_STACK:
 	case PTR_TO_PACKET:
 	case PTR_TO_PACKET_META:
 	case PTR_TO_MAP_KEY:
@@ -6001,12 +6033,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	case PTR_TO_BUF:
 	case PTR_TO_BUF | MEM_RDONLY:
 	case SCALAR_VALUE:
-		/* Some of the argument types nevertheless require a
-		 * zero register offset.
-		 */
-		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
-			return 0;
-		break;
+		return 0;
 	/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
 	 * fixed offset.
 	 */
@@ -6023,12 +6050,16 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 		/* For arg is release pointer, fixed_off_ok must be false, but
 		 * we already checked and rejected reg->off != 0 above, so set
 		 * to true to allow fixed offset for all other cases.
+		 *
+		 * var_off always must be 0 for PTR_TO_BTF_ID, hence we still
+		 * need to do checks instead of returning.
 		 */
 		fixed_off_ok = true;
 		break;
 	default:
 		break;
 	}
+check:
 	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
 }
 
diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
index b64d33e4833c..92e3f6a61a79 100644
--- a/tools/testing/selftests/bpf/verifier/ringbuf.c
+++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
@@ -28,7 +28,7 @@
 	},
 	.fixup_map_ringbuf = { 1 },
 	.result = REJECT,
-	.errstr = "dereference of modified alloc_mem ptr R1",
+	.errstr = "R1 must have zero offset when passed to release func",
 },
 {
 	"ringbuf: invalid reservation offset 2",
-- 
2.38.0



* [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (3 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-11-08 20:22   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR Kumar Kartikeya Dwivedi
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

The root of the problem is missing liveness marking for STACK_DYNPTR
slots. This leads to all kinds of problems inside stacksafe.

The verifier by default inside stacksafe ignores spilled_ptr in stack
slots which do not have REG_LIVE_READ marks. Since this is being checked
in the 'old' explored state, it must have already done clean_live_states
for this old bpf_func_state. Hence, it won't be receiving any more
liveness marks from to be explored insns (it has received REG_LIVE_DONE
marking from liveness point of view).

What this means is that the verifier considers it safe to not compare
the stack slot if it was never read by children states. While liveness
marks are usually propagated correctly following the parentage chain for
spilled registers (SCALAR_VALUE and PTR_* types), the same is not the
case for STACK_DYNPTR.

clean_live_states hence simply rewrites these stack slots to the type
STACK_INVALID since it sees no REG_LIVE_READ marks.

The end result is that we will never see STACK_DYNPTR slots in an
explored state. Even if the verifier was conservatively matching
!REG_LIVE_READ slots, the very next check, which continues the stacksafe
loop on seeing STACK_INVALID, would again prevent further checks.

Now as long as the verifier stores an explored state which we can
compare to when reaching a pruning point, we can abuse this bug to make
the verifier prune the search for obviously unsafe paths using
STACK_DYNPTR slots, thinking they are never used and hence safe.

Doing this in unprivileged mode is a bit challenging. add_new_state is
only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges)
or when jmps_processed difference is >= 2 and insn_processed difference
is >= 8. So coming up with the unprivileged case requires a little more
work, but it is still totally possible. The test case being discussed
below triggers the heuristic even in unprivileged mode.

However, it no longer works since commit
8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF").
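
As a rough sketch of the heuristic mentioned above (the struct and
function names are made up for illustration; only the two thresholds
and the BPF_F_TEST_STATE_FREQ override are taken from the description):

```c
#include <assert.h>
#include <stdbool.h>

struct toy_env {
	bool test_state_freq;	/* set when BPF_F_TEST_STATE_FREQ is used */
	unsigned int jmps_processed, prev_jmps_processed;
	unsigned int insn_processed, prev_insn_processed;
};

/* Decide whether a new explored state should be stored at this point. */
static bool toy_add_new_state(const struct toy_env *env)
{
	if (env->test_state_freq)
		return true;
	return env->jmps_processed - env->prev_jmps_processed >= 2 &&
	       env->insn_processed - env->prev_insn_processed >= 8;
}
```

A program that wants a state stored at a given pruning point without
the test flag therefore has to execute at least two jumps and eight
instructions since the previous stored state.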

Let's try to study the test step by step.

Consider the following program (C style BPF ASM):

0  r0 = 0;
1  r6 = &ringbuf_map;
3  r1 = r6;
4  r2 = 8;
5  r3 = 0;
6  r4 = r10;
7  r4 += -16;
8  call bpf_ringbuf_reserve_dynptr;
9  if r0 == 0 goto pc+1;
10 goto pc+1;
11 *(r10 - 16) = 0xeB9F;
12 r1 = r10;
13 r1 += -16;
14 r2 = 0;
15 call bpf_ringbuf_discard_dynptr;
16 r0 = 0;
17 exit;

We know that insn 12 will be a pruning point, hence if we force
add_new_state for it, it will first verify the following path as
safe in straight line exploration:
0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17

Then, when we arrive at insn 12 from the following path:
0 1 3 4 5 6 7 8 9 -> 11 (12)

We will find a state that has been verified as safe already at insn 12.
Since register state is same at this point, regsafe will pass. Next, in
stacksafe, the comparison for spi = 0 and spi = 1 (location of our
dynptr) is skipped on seeing !REG_LIVE_READ. The rest matches, so
stacksafe returns true. Next, refsafe is also true as reference state is
unchanged in both states.

The states are considered equivalent and search is pruned.

Hence, we are able to construct a dynptr with arbitrary contents and use
the dynptr API to operate on this arbitrary pointer and arbitrary size +
offset.
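
The pruning decision walked through above can be reproduced with a toy
model of the stack comparison. This is a simplified sketch under
assumed names (toy_slot, toy_stacksafe); it keeps one type per slot and
a single liveness bit, which is much coarser than the real stacksafe.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_slot_type { TOY_STACK_INVALID, TOY_STACK_MISC, TOY_STACK_DYNPTR };

struct toy_slot {
	enum toy_slot_type type;
	bool live_read;	/* stands in for REG_LIVE_READ on spilled_ptr */
};

/* 'old' is the already explored state, 'cur' the state being checked. */
static bool toy_stacksafe(const struct toy_slot *old,
			  const struct toy_slot *cur, int nr_slots)
{
	for (int spi = 0; spi < nr_slots; spi++) {
		/* Slots never read after the old state are not compared. */
		if (!old[spi].live_read)
			continue;
		/* Nothing was relied upon in an invalid old slot. */
		if (old[spi].type == TOY_STACK_INVALID)
			continue;
		if (old[spi].type != cur[spi].type)
			return false;
	}
	return true;
}
```

With two dynptr slots in the old state that lack read marks, a current
state in which one slot was overwritten still compares as safe in this
model. Once the read marks are present, the mismatch is detected and
the path is not pruned.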

To fix this, first define a mark_dynptr_read function that propagates
liveness marks whenever a valid initialized dynptr is accessed by dynptr
helpers. REG_LIVE_WRITTEN is marked whenever we initialize an
uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows
screening off mark_reg_read and not propagating marks upwards from that
point.

This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or
none, so clean_live_states either sets both slots to STACK_INVALID or
none of them. This is the invariant the checks inside stacksafe rely on.

Next, do a complete comparison of both stack slots whenever they have
STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and
also whether both form the same first_slot. Only then is the later path
safe.

Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 73 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a8c277e51d63..8f667180f70f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -752,6 +752,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 		state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
 	}
 
+	state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
+	state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
+
 	return 0;
 }
 
@@ -776,6 +779,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 
 	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
 	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
+
+	/* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot?
+	 *
+	 * While we don't allow reading STACK_INVALID, it is still possible to
+	 * do <8 byte writes marking some but not all slots as STACK_MISC. Then,
+	 * helpers or insns can do partial read of that part without failing,
+	 * but check_stack_range_initialized, check_stack_read_var_off, and
+	 * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of
+	 * the slot conservatively. Hence we need to screen off those liveness
+	 * marking walks.
+	 *
+	 * This was not a problem before because STACK_INVALID is only set by
+	 * default, or in clean_live_states after REG_LIVE_DONE, not randomly
+	 * during verifier state exploration. Hence, for this case parentage
+	 * chain will still be live, while earlier reg->parent was NULL, so we
+	 * need REG_LIVE_WRITTEN to screen off read marker propagation.
+	 */
+	state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
+	state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
+
 	return 0;
 }
 
@@ -2354,6 +2377,30 @@ static int mark_reg_read(struct bpf_verifier_env *env,
 	return 0;
 }
 
+static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	struct bpf_func_state *state = func(env, reg);
+	int spi, ret;
+
+	/* For CONST_PTR_TO_DYNPTR, it must have already been done by
+	 * check_reg_arg in check_helper_call and mark_btf_func_reg_size in
+	 * check_kfunc_call.
+	 */
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return 0;
+	spi = get_spi(reg->off);
+	/* Caller ensures dynptr is valid and initialized, which means spi is in
+	 * bounds and spi is the first dynptr slot. Simply mark stack slot as
+	 * read.
+	 */
+	ret = mark_reg_read(env, &state->stack[spi].spilled_ptr,
+			    state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64);
+	if (ret)
+		return ret;
+	return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr,
+			     state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64);
+}
+
 /* This function is supposed to be used by the following 32-bit optimization
  * code only. It returns TRUE if the source or destination register operates
  * on 64-bit, otherwise return FALSE.
@@ -5648,6 +5695,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 			u8 *uninit_dynptr_regno)
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
+	int err;
 
 	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
 		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
@@ -5729,6 +5777,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 				err_extra, argno + 1);
 			return -EINVAL;
 		}
+
+		err = mark_dynptr_read(env, reg);
+		if (err)
+			return err;
 	}
 	return 0;
 }
@@ -11793,6 +11845,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
 			 * return false to continue verification of this path
 			 */
 			return false;
+		/* Both are same slot_type, but STACK_DYNPTR requires more
+		 * checks before it can be considered safe.
+		 */
+		if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) {
+			/* If both are STACK_DYNPTR, type must be same */
+			if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type)
+				return false;
+			/* Both should also have first slot at same spi */
+			if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot)
+				return false;
+			/* ids should be same */
+			if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id)
+				return false;
+			if (old->stack[spi].spilled_ptr.ref_obj_id &&
+			    !check_ids(old->stack[spi].spilled_ptr.ref_obj_id,
+				       cur->stack[spi].spilled_ptr.ref_obj_id, idmap))
+				return false;
+			WARN_ON_ONCE(i % BPF_REG_SIZE);
+			i += BPF_REG_SIZE - 1;
+			continue;
+		}
 		if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1)
 			continue;
 		if (!is_spilled_reg(&old->stack[spi]))
-- 
2.38.0



* [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (4 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-19 18:52   ` Alexei Starovoitov
  2022-10-18 13:59 ` [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes Kumar Kartikeya Dwivedi
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Currently, the dynptr function is not checking the variable offset part
of PTR_TO_STACK that it needs to check. The fixed offset is considered
when computing the stack pointer index, but if the variable offset was
not a constant (such that it could not be accumulated in reg->off), we
will end up with a discrepancy where the runtime pointer does not point
to the actual stack slot we mark as STACK_DYNPTR.

It is impossible to precisely track dynptr state when variable offset is
not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
simply reject the case where reg->var_off is not constant. Then,
consider both reg->off and reg->var_off.value when computing the stack
pointer index.

A new helper dynptr_get_spi is introduced to hide these details, since
the dynptr needs to be located in multiple places outside the
process_dynptr_func checks. Hence, once we know it's a PTR_TO_STACK, we
need to enforce these checks in all places.
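
A standalone sketch of the slot index computation, following the shape
of dynptr_get_spi in this patch. The toy_* types are simplified
stand-ins for bpf_reg_state and tnum, and TOY_REG_SIZE stands in for
BPF_REG_SIZE; error reporting is reduced to returning -1.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_REG_SIZE 8	/* stands in for BPF_REG_SIZE */

struct toy_tnum { uint64_t value, mask; };	/* mask == 0 means constant */

struct toy_reg {
	int32_t off;			/* fixed offset from the frame pointer */
	struct toy_tnum var_off;	/* variable offset */
};

/* Returns the stack slot index of the dynptr's first slot, or -1. */
static int toy_dynptr_get_spi(const struct toy_reg *reg)
{
	int32_t off;
	int spi;

	if (reg->off % TOY_REG_SIZE)
		return -1;	/* misaligned fixed offset */
	if (reg->var_off.mask)
		return -1;	/* variable offset is not a constant */
	off = reg->off + (int32_t)reg->var_off.value;
	spi = (-off - 1) / TOY_REG_SIZE;
	/* A dynptr occupies slots spi and spi - 1, so spi 0 is too high. */
	return spi < 1 ? -1 : spi;
}
```

For example, a fixed offset of -16 gives (16 - 1) / 8 = 1, while -8
gives (8 - 1) / 8 = 0 and is rejected because there is no slot below it
for the second half of the dynptr.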

Note that it is disallowed for unprivileged users to have a non-constant
var_off, so this problem should only be possible to trigger from
programs having CAP_PERFMON. However, its effects can vary.

Without the fix, it is possible to replace the contents of the dynptr
arbitrarily by making verifier mark different stack slots than actual
location and then doing writes to the actual stack address of dynptr at
runtime.

Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c                         | 80 +++++++++++++++----
 .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
 .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
 3 files changed, 67 insertions(+), 21 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 8f667180f70f..0fd73f96c5e2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
 		verbose(env, "D");
 }
 
-static int get_spi(s32 off)
+static int __get_spi(s32 off)
 {
 	return (-off - 1) / BPF_REG_SIZE;
 }
 
+static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+{
+	int spi;
+
+	if (reg->off % BPF_REG_SIZE) {
+		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
+		return -EINVAL;
+	}
+
+	if (!tnum_is_const(reg->var_off)) {
+		verbose(env, "dynptr has to be at the constant offset\n");
+		return -EINVAL;
+	}
+
+	spi = __get_spi(reg->off + reg->var_off.value);
+	if (spi < 1) {
+		verbose(env, "cannot pass in dynptr at an offset=%d\n",
+			(int)(reg->off + reg->var_off.value));
+		return -EINVAL;
+	}
+	return spi;
+}
+
 static bool is_spi_bounds_valid(struct bpf_func_state *state, int spi, int nr_slots)
 {
 	int allocated_slots = state->allocated_stack / BPF_REG_SIZE;
@@ -725,7 +748,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	enum bpf_dynptr_type type;
 	int spi, i, id;
 
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
 
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return -EINVAL;
@@ -763,7 +788,9 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	struct bpf_func_state *state = func(env, reg);
 	int spi, i;
 
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
 
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return -EINVAL;
@@ -810,7 +837,11 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 	if (reg->type == CONST_PTR_TO_DYNPTR)
 		return false;
 
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return spi;
+
+	/* We will do check_mem_access to check and update stack bounds later */
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return true;
 
@@ -826,14 +857,15 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi;
-	int i;
+	int spi, i;
 
 	/* This already represents first slot of initialized bpf_dynptr */
 	if (reg->type == CONST_PTR_TO_DYNPTR)
 		return true;
 
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (spi < 0)
+		return false;
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
 	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
 		return false;
@@ -864,7 +896,9 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
 	if (reg->type == CONST_PTR_TO_DYNPTR) {
 		return reg->dynptr.type == dynptr_type;
 	} else {
-		spi = get_spi(reg->off);
+		spi = dynptr_get_spi(env, reg);
+		if (WARN_ON_ONCE(spi < 0))
+			return false;
 		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
 	}
 }
@@ -2388,7 +2422,9 @@ static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *
 	 */
 	if (reg->type == CONST_PTR_TO_DYNPTR)
 		return 0;
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (WARN_ON_ONCE(spi < 0))
+		return spi;
 	/* Caller ensures dynptr is valid and initialized, which means spi is in
 	 * bounds and spi is the first dynptr slot. Simply mark stack slot as
 	 * read.
@@ -5663,6 +5699,11 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
 	return 0;
 }
 
+static bool arg_type_is_release(enum bpf_arg_type type)
+{
+	return type & OBJ_RELEASE;
+}
+
 /* Implementation details:
  *
  * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
@@ -5710,6 +5751,13 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 		return -EINVAL;
 	}
 
+	/* Additional check for PTR_TO_STACK offset */
+	if (reg->type == PTR_TO_STACK) {
+		err = dynptr_get_spi(env, reg);
+		if (err < 0)
+			return err;
+	}
+
 	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
 	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
 	 *
@@ -5728,7 +5776,6 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 	 *		 destroyed, including mutation of the memory it points
 	 *		 to.
 	 */
-
 	if (arg_type & MEM_UNINIT) {
 		if (!is_dynptr_reg_valid_uninit(env, reg)) {
 			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
@@ -5791,11 +5838,6 @@ static bool arg_type_is_mem_size(enum bpf_arg_type type)
 	       type == ARG_CONST_SIZE_OR_ZERO;
 }
 
-static bool arg_type_is_release(enum bpf_arg_type type)
-{
-	return type & OBJ_RELEASE;
-}
-
 static bool arg_type_is_dynptr(enum bpf_arg_type type)
 {
 	return base_type(type) == ARG_PTR_TO_DYNPTR;
@@ -6122,7 +6164,9 @@ static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state
 
 	if (reg->type == CONST_PTR_TO_DYNPTR)
 		return reg->ref_obj_id;
-	spi = get_spi(reg->off);
+	spi = dynptr_get_spi(env, reg);
+	if (WARN_ON_ONCE(spi < 0))
+		return U32_MAX;
 	return state->stack[spi].spilled_ptr.ref_obj_id;
 }
 
@@ -6190,7 +6234,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			int spi;
 
 			if (reg->type == PTR_TO_STACK) {
-				spi = get_spi(reg->off);
+				spi = dynptr_get_spi(env, reg);
+				if (spi < 0)
+					return spi;
 				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
 				    !state->stack[spi].spilled_ptr.ref_obj_id) {
 					verbose(env, "arg %d is an unacquired reference\n", regno);
diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
index 8fc4e6c02bfd..947126d217bd 100644
--- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
@@ -27,16 +27,16 @@ static struct {
 	{"data_slice_missing_null_check1", "invalid mem access 'mem_or_null'"},
 	{"data_slice_missing_null_check2", "invalid mem access 'mem_or_null'"},
 	{"invalid_helper1", "invalid indirect read from stack"},
-	{"invalid_helper2", "Expected an initialized dynptr as arg #3"},
+	{"invalid_helper2", "cannot pass in dynptr at an offset=-8"},
 	{"invalid_write1", "Expected an initialized dynptr as arg #1"},
 	{"invalid_write2", "Expected an initialized dynptr as arg #3"},
-	{"invalid_write3", "Expected an initialized dynptr as arg #1"},
+	{"invalid_write3", "arg 1 is an unacquired reference"},
 	{"invalid_write4", "arg 1 is an unacquired reference"},
 	{"invalid_read1", "invalid read from stack"},
 	{"invalid_read2", "cannot pass in dynptr at an offset"},
 	{"invalid_read3", "invalid read from stack"},
 	{"invalid_read4", "invalid read from stack"},
-	{"invalid_offset", "invalid write to stack"},
+	{"invalid_offset", "cannot pass in dynptr at an offset=0"},
 	{"global", "type=map_value expected=fp"},
 	{"release_twice", "arg 1 is an unacquired reference"},
 	{"release_twice_callback", "arg 1 is an unacquired reference"},
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
index fc562e863e79..e4b970bc2d3f 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
@@ -18,7 +18,7 @@ static struct {
 	const char *expected_verifier_err_msg;
 	int expected_runtime_err;
 } kfunc_dynptr_tests[] = {
-	{"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0},
+	{"not_valid_dynptr", "cannot pass in dynptr at an offset=-8", 0},
 	{"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0},
 	{"dynptr_data_null", NULL, -EBADMSG},
 };
-- 
2.38.0



* [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (5 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-21 22:50   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Currently, while reads are disallowed for dynptr stack slots, writes are
not. Reads fail through both direct access and helpers, while writes
succeed in both cases, but have the effect of overwriting the slot_type.

While this is fine, handling for a few edge cases is missing. Firstly,
a user can overwrite the stack slots of a dynptr partially.

Consider the following layout:
spi: [d][d][?]
      2  1  0

First slot is at spi 2, second at spi 1.
Now, do a write of 1 to 8 bytes for spi 1.

This will essentially either write STACK_MISC for all slot_types or
STACK_MISC and STACK_ZERO (in case of a size < BPF_REG_SIZE partial write
of zeroes). The end result is that the slot is scrubbed.

Now, the layout is:
spi: [d][m][?]
      2  1  0

Suppose the user now initializes a dynptr at spi 1.
We get:
spi: [d][d][d]
      2  1  0

But this time, both spi 2 and spi 1 have first_slot = true.

Now, when spi 2 is passed to a dynptr helper, the verifier considers it
initialized, as it does not check whether the second slot has
first_slot == false. And spi 1 works as normal.

This effectively replaces the size + offset of the first dynptr, hence
allowing invalid OOB reads and writes.

Make a few changes to protect against this:
When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
STACK_DYNPTR type, mark both first and second slot (regardless of which
slot we touch) as STACK_INVALID. Reads are already prevented.
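
The "touch one slot, destroy both" rule can be modelled with a toy stack.
This is only a sketch under simplifying assumptions: struct slot_sketch,
the SLOT_* values and all function names are hypothetical, and the
liveness and register state that the verifier also resets are left out.

```c
#include <stdbool.h>

#define BPF_REG_SIZE 8

/* Hypothetical per-byte slot kinds, loosely modelled on slot_type. */
enum slot_kind { SLOT_INVALID, SLOT_MISC, SLOT_DYNPTR };

struct slot_sketch {
	enum slot_kind type[BPF_REG_SIZE];
	bool first_slot;
};

/* Mark spi (first slot) and spi - 1 (second slot) as one dynptr. */
static void mark_dynptr_sketch(struct slot_sketch *stack, int spi)
{
	for (int i = 0; i < BPF_REG_SIZE; i++) {
		stack[spi].type[i] = SLOT_DYNPTR;
		stack[spi - 1].type[i] = SLOT_DYNPTR;
	}
	stack[spi].first_slot = true;
	stack[spi - 1].first_slot = false;
}

/* A write touching either slot of a dynptr invalidates both slots. */
static void destroy_dynptr_sketch(struct slot_sketch *stack, int spi)
{
	if (stack[spi].type[0] != SLOT_DYNPTR)
		return;
	/* Reposition to the first slot if the second one was touched. */
	if (!stack[spi].first_slot)
		spi = spi + 1;
	for (int i = 0; i < BPF_REG_SIZE; i++) {
		stack[spi].type[i] = SLOT_INVALID;
		stack[spi - 1].type[i] = SLOT_INVALID;
	}
	stack[spi].first_slot = false;
	stack[spi - 1].first_slot = false;
}

/* Replays the layout from the text: dynptr at spi 2 and spi 1, then a
 * write touching written_spi. Returns true if no stale dynptr slot is left.
 */
static bool partial_write_scenario_sketch(int written_spi)
{
	struct slot_sketch stack[3] = {0};

	mark_dynptr_sketch(stack, 2);
	destroy_dynptr_sketch(stack, written_spi);
	return stack[2].type[0] == SLOT_INVALID &&
	       stack[1].type[0] == SLOT_INVALID;
}
```

In this model a write to spi 1 or spi 2 leaves no STACK_DYNPTR slot behind,
so spi 2 can no longer be passed off as an initialized dynptr, while a
write to the unrelated spi 0 leaves the dynptr intact.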

Second, prevent writing to stack memory from helpers if the range may
contain any STACK_DYNPTR slots. Reads are already prevented.

For helpers, we cannot allow the writes to destroy dynptrs because,
depending on its arguments, a helper may take uninit_mem and a dynptr at
the same time. This would mean that the helper may write to uninit_mem
before it reads the dynptr, which would be bad.

PTR_TO_MEM: [?????dd]

Depending on the code inside the helper, it may end up overwriting the
dynptr contents first and then read those as the dynptr argument.

Verifier would only simulate destruction when it does byte by byte
access simulation in check_helper_call for meta.access_size, and
fail to catch this case, as it happens after argument checks.
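
For illustration, the range check that rejects such helper writes can be
sketched as follows. This is a simplified model, not the verifier code:
the stack is a flat per-byte array instead of per-spi slot_type arrays,
the offset is assumed to be constant, and all names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical per-byte stack state. */
enum byte_kind { BYTE_INVALID, BYTE_MISC, BYTE_DYNPTR };

/* stack_bytes[slot] describes the byte at frame offset -(slot + 1);
 * 'allocated' is the number of tracked bytes. Returns true if a write of
 * access_size bytes starting at min_off may touch a dynptr byte.
 */
static bool range_may_hit_dynptr_sketch(const enum byte_kind *stack_bytes,
					int allocated, int min_off,
					int access_size)
{
	for (int i = min_off; i < min_off + access_size; i++) {
		int slot = -i - 1;

		/* Bytes outside the tracked stack cannot hold a dynptr. */
		if (slot < 0 || slot >= allocated)
			continue;
		if (stack_bytes[slot] == BYTE_DYNPTR)
			return true;
	}
	return false;
}

/* Example layout: a dynptr at spi 2 and spi 1 (offsets -24..-9) and
 * ordinary data at spi 0 (offsets -8..-1).
 */
static bool helper_write_demo_sketch(int min_off, int access_size)
{
	enum byte_kind stack_bytes[24];

	for (int slot = 0; slot < 24; slot++)
		stack_bytes[slot] = slot >= 8 ? BYTE_DYNPTR : BYTE_MISC;
	return range_may_hit_dynptr_sketch(stack_bytes, 24, min_off,
					   access_size);
}
```

Under this layout an 8-byte helper write at offset -8 is accepted, while a
16-byte write at offset -16 overlaps the second dynptr slot and would be
rejected before the helper runs.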

The same would need to be done for any other non-trivial objects created
on the stack in the future, such as bpf_list_head on stack, or
bpf_rb_root on stack.

A common misunderstanding in the current code is that MEM_UNINIT means
writes, but note that writes may also be performed even without
MEM_UNINIT in case of helpers, in that case the code after handling meta
&& meta->raw_mode will complain when it sees STACK_DYNPTR. So that
invalid read case also covers writes to potential STACK_DYNPTR slots.
The only loophole was in case of meta->raw_mode which simulated writes
through instructions which could overwrite them.

A future series sequenced after this will focus on the clean up of
helper access checks and bugs around that.

Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 76 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0fd73f96c5e2..89ae384ea6a7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -740,6 +740,8 @@ static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
 	__mark_dynptr_regs(reg1, NULL, type);
 }
 
+static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
+				       struct bpf_func_state *state, int spi);
 
 static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 				   enum bpf_arg_type arg_type, int insn_idx)
@@ -755,6 +757,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return -EINVAL;
 
+	destroy_stack_slots_dynptr(env, state, spi);
+	destroy_stack_slots_dynptr(env, state, spi - 1);
+
 	for (i = 0; i < BPF_REG_SIZE; i++) {
 		state->stack[spi].slot_type[i] = STACK_DYNPTR;
 		state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
@@ -829,6 +834,44 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	return 0;
 }
 
+static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
+				       struct bpf_func_state *state, int spi)
+{
+	int i;
+
+	/* We always ensure that STACK_DYNPTR is never set partially,
+	 * hence just checking for slot_type[0] is enough. This is
+	 * different for STACK_SPILL, where it may be only set for
+	 * 1 byte, so code has to use is_spilled_reg.
+	 */
+	if (state->stack[spi].slot_type[0] != STACK_DYNPTR)
+		return;
+	/* Reposition spi to first slot */
+	if (!state->stack[spi].spilled_ptr.dynptr.first_slot)
+		spi = spi + 1;
+
+	mark_stack_slot_scratched(env, spi);
+	mark_stack_slot_scratched(env, spi - 1);
+
+	/* Writing partially to one dynptr stack slot destroys both. */
+	for (i = 0; i < BPF_REG_SIZE; i++) {
+		state->stack[spi].slot_type[i] = STACK_INVALID;
+		state->stack[spi - 1].slot_type[i] = STACK_INVALID;
+	}
+
+	/* Do not release reference state, we are destroying dynptr on stack,
+	 * not using some helper to release it. Just reset register.
+	 */
+	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
+	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
+
+	/* Same reason as unmark_stack_slots_dynptr above */
+	state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
+	state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
+
+	return;
+}
+
 static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
@@ -3183,6 +3226,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
 			env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
 	}
 
+	destroy_stack_slots_dynptr(env, state, spi);
+
 	mark_stack_slot_scratched(env, spi);
 	if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) &&
 	    !register_is_null(reg) && env->bpf_capable) {
@@ -3296,6 +3341,13 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
 	if (err)
 		return err;
 
+	for (i = min_off; i < max_off; i++) {
+		int slot, spi;
+
+		slot = -i - 1;
+		spi = slot / BPF_REG_SIZE;
+		destroy_stack_slots_dynptr(env, state, spi);
+	}
 
 	/* Variable offset writes destroy any spilled pointers in range. */
 	for (i = min_off; i < max_off; i++) {
@@ -5257,6 +5309,30 @@ static int check_stack_range_initialized(
 	}
 
 	if (meta && meta->raw_mode) {
+		/* Ensure we won't be overwriting dynptrs when simulating byte
+		 * by byte access in check_helper_call using meta.access_size.
+		 * This would be a problem if we have a helper in the future
+		 * which takes:
+		 *
+		 *	helper(uninit_mem, len, dynptr)
+		 *
+		 * Now, uninit_mem may overlap with dynptr pointer. Hence, it
+		 * may end up writing to dynptr itself when touching memory from
+		 * arg 1. This can be relaxed on a case by case basis for known
+		 * safe cases, but reject due to the possibility of aliasing by
+		 * default.
+		 */
+		for (i = min_off; i < max_off + access_size; i++) {
+			slot = -i - 1;
+			spi = slot / BPF_REG_SIZE;
+			/* raw_mode may write past allocated_stack */
+			if (state->allocated_stack <= slot)
+				continue;
+			if (state->stack[spi].slot_type[slot % BPF_REG_SIZE] == STACK_DYNPTR) {
+				verbose(env, "potential write to dynptr at off=%d disallowed\n", i);
+				return -EACCES;
+			}
+		}
 		meta->access_size = access_size;
 		meta->regno = regno;
 		return 0;
-- 
2.38.0



* [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write}
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (6 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-21 18:12   ` Joanne Koong
  2022-10-18 13:59 ` [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

The destination buffer memory may overlap with the memory the dynptr
points to. Hence, we must use memmove to copy correctly from the dynptr
to the destination buffer, or from the source buffer to the dynptr.

This actually isn't a problem right now, as memcpy implementation falls
back to memmove on detecting overlap and warns about it, but we
shouldn't be relying on that.
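
As a small userspace illustration of why overlap matters (not kernel
code; the function names are hypothetical), memmove is specified to
handle overlapping source and destination, whereas the C standard leaves
memcpy undefined in that case:

```c
#include <stddef.h>
#include <string.h>

/* Copy len bytes from buf + src_off to buf + dst_off. The two regions
 * live in the same buffer and may overlap, so memmove is required.
 */
static void shift_within_buffer_sketch(char *buf, size_t dst_off,
				       size_t src_off, size_t len)
{
	memmove(buf + dst_off, buf + src_off, len);
}

/* Shift "abcde" two bytes to the right inside "abcdefg"; returns 1 if
 * the result is "ababcde", i.e. the source was not clobbered mid-copy.
 */
static int overlap_demo_sketch(void)
{
	char buf[8] = "abcdefg";

	shift_within_buffer_sketch(buf, 2, 0, 5);
	return memcmp(buf, "ababcde", 7) == 0;
}
```

The same situation arises when the buffer passed to bpf_dynptr_read or
bpf_dynptr_write aliases the memory backing the dynptr.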

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/helpers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 0a4017eb3616..2dc3f5ce8f9b 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1489,7 +1489,7 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern
 	if (err)
 		return err;
 
-	memcpy(dst, src->data + src->offset + offset, len);
+	memmove(dst, src->data + src->offset + offset, len);
 
 	return 0;
 }
@@ -1517,7 +1517,7 @@ BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, v
 	if (err)
 		return err;
 
-	memcpy(dst->data + dst->offset + offset, src, len);
+	memmove(dst->data + dst->offset + offset, src, len);
 
 	return 0;
 }
-- 
2.38.0



* [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (7 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-19 16:59   ` David Vernet
  2022-10-18 13:59 ` [PATCH bpf-next v1 10/13] selftests/bpf: Add dynptr pruning tests Kumar Kartikeya Dwivedi
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

The original support for bpf_user_ringbuf_drain callbacks simply
short-circuited checks for the dynptr state, allowing users to pass
PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
dynptr. This bug would have also surfaced with other dynptr helpers in
the future that changed dynptr view or modified it in some way.

Include test cases for both bpf_dynptr_from_mem and
bpf_ringbuf_reserve_dynptr, and ensure the verifier rejects both of them.
Without the fix, both of these programs load and pass verification.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  2 ++
 .../selftests/bpf/progs/user_ringbuf_fail.c   | 35 +++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
index 39882580cb90..500a63bb70a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
@@ -676,6 +676,8 @@ static struct {
 	{"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
+	{"user_ringbuf_callback_reinit_dynptr_mem", "Dynptr has to be an uninitialized dynptr"},
+	{"user_ringbuf_callback_reinit_dynptr_ringbuf", "Dynptr has to be an uninitialized dynptr"},
 };
 
 #define SUCCESS_TEST(_func) { _func, #_func }
diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
index 82aba4529aa9..7730d13c0cea 100644
--- a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
+++ b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
@@ -18,6 +18,13 @@ struct {
 	__uint(type, BPF_MAP_TYPE_USER_RINGBUF);
 } user_ringbuf SEC(".maps");
 
+struct {
+	__uint(type, BPF_MAP_TYPE_RINGBUF);
+	__uint(max_entries, 2);
+} ringbuf SEC(".maps");
+
+static int map_value;
+
 static long
 bad_access1(struct bpf_dynptr *dynptr, void *context)
 {
@@ -175,3 +182,31 @@ int user_ringbuf_callback_invalid_return(void *ctx)
 
 	return 0;
 }
+
+static long
+try_reinit_dynptr_mem(struct bpf_dynptr *dynptr, void *context)
+{
+	bpf_dynptr_from_mem(&map_value, 4, 0, dynptr);
+	return 0;
+}
+
+static long
+try_reinit_dynptr_ringbuf(struct bpf_dynptr *dynptr, void *context)
+{
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 8, 0, dynptr);
+	return 0;
+}
+
+SEC("?raw_tp/sys_nanosleep")
+int user_ringbuf_callback_reinit_dynptr_mem(void *ctx)
+{
+	bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_mem, NULL, 0);
+	return 0;
+}
+
+SEC("?raw_tp/sys_nanosleep")
+int user_ringbuf_callback_reinit_dynptr_ringbuf(void *ctx)
+{
+	bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_ringbuf, NULL, 0);
+	return 0;
+}
-- 
2.38.0



* [PATCH bpf-next v1 10/13] selftests/bpf: Add dynptr pruning tests
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (8 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 11/13] selftests/bpf: Add dynptr var_off tests Kumar Kartikeya Dwivedi
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Add verifier tests that verify the new pruning behavior for STACK_DYNPTR
slots, and ensure that state equivalence takes into account changes to
the old and current verifier state correctly.

Without the prior fixes, both of these bugs trigger with unprivileged
BPF mode.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/verifier/dynptr.c | 90 +++++++++++++++++++
 1 file changed, 90 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/verifier/dynptr.c

diff --git a/tools/testing/selftests/bpf/verifier/dynptr.c b/tools/testing/selftests/bpf/verifier/dynptr.c
new file mode 100644
index 000000000000..798f4f7e0c57
--- /dev/null
+++ b/tools/testing/selftests/bpf/verifier/dynptr.c
@@ -0,0 +1,90 @@
+{
+       "dynptr: rewrite dynptr slot",
+        .insns = {
+        BPF_MOV64_IMM(BPF_REG_0, 0),
+        BPF_LD_MAP_FD(BPF_REG_6, 0),
+        BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+        BPF_MOV64_IMM(BPF_REG_2, 8),
+        BPF_MOV64_IMM(BPF_REG_3, 0),
+        BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+        BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -16),
+        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_reserve_dynptr),
+        BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 1),
+        BPF_JMP_IMM(BPF_JA, 0, 0, 1),
+        BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0xeB9F),
+        BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+        BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -16),
+        BPF_MOV64_IMM(BPF_REG_2, 0),
+        BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_discard_dynptr),
+        BPF_MOV64_IMM(BPF_REG_0, 0),
+        BPF_EXIT_INSN(),
+        },
+	.fixup_map_ringbuf = { 1 },
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "unknown func bpf_ringbuf_reserve_dynptr#198",
+	.result = REJECT,
+	.errstr = "arg 1 is an unacquired reference",
+},
+{
+       "dynptr: type confusion",
+       .insns = {
+       BPF_MOV64_IMM(BPF_REG_0, 0),
+       BPF_LD_MAP_FD(BPF_REG_6, 0),
+       BPF_LD_MAP_FD(BPF_REG_7, 0),
+       BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+       BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+       BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+       BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -24),
+       BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0xeB9FeB9F),
+       BPF_ST_MEM(BPF_DW, BPF_REG_10, -24, 0xeB9FeB9F),
+       BPF_MOV64_IMM(BPF_REG_4, 0),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_2),
+       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_update_elem),
+       BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+       BPF_MOV64_REG(BPF_REG_2, BPF_REG_8),
+       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+       BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+       BPF_EXIT_INSN(),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_0),
+       BPF_MOV64_REG(BPF_REG_1, BPF_REG_7),
+       BPF_MOV64_IMM(BPF_REG_2, 8),
+       BPF_MOV64_IMM(BPF_REG_3, 0),
+       BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -16),
+       BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
+       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_reserve_dynptr),
+       BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
+       /* pad with insns to trigger add_new_state heuristic for straight line path */
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_MOV64_REG(BPF_REG_8, BPF_REG_8),
+       BPF_JMP_IMM(BPF_JA, 0, 0, 9),
+       BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+       BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0),
+       BPF_MOV64_REG(BPF_REG_1, BPF_REG_8),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 8),
+       BPF_MOV64_IMM(BPF_REG_2, 0),
+       BPF_MOV64_IMM(BPF_REG_3, 0),
+       BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -16),
+       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_dynptr_from_mem),
+       BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+       BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -16),
+       BPF_MOV64_IMM(BPF_REG_2, 0),
+       BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_discard_dynptr),
+       BPF_MOV64_IMM(BPF_REG_0, 0),
+       BPF_EXIT_INSN(),
+       },
+       .fixup_map_hash_16b = { 1 },
+       .fixup_map_ringbuf = { 3 },
+       .result_unpriv = REJECT,
+       .errstr_unpriv = "unknown func bpf_ringbuf_reserve_dynptr#198",
+       .result = REJECT,
+       .errstr = "arg 1 is an unacquired reference",
+},
-- 
2.38.0



* [PATCH bpf-next v1 11/13] selftests/bpf: Add dynptr var_off tests
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (9 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 10/13] selftests/bpf: Add dynptr pruning tests Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 12/13] selftests/bpf: Add dynptr partial slot overwrite tests Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Ensure that variable offset is handled correctly, and verifier takes
both fixed and variable part into account. Also ensure that only
constant var_off is allowed.

Make sure that unprivileged BPF cannot use var_off for dynptr.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/verifier/dynptr.c | 38 ++++++++++++++++++-
 1 file changed, 36 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/verifier/dynptr.c b/tools/testing/selftests/bpf/verifier/dynptr.c
index 798f4f7e0c57..1aa7241e8a9e 100644
--- a/tools/testing/selftests/bpf/verifier/dynptr.c
+++ b/tools/testing/selftests/bpf/verifier/dynptr.c
@@ -1,5 +1,5 @@
 {
-       "dynptr: rewrite dynptr slot",
+       "dynptr: rewrite dynptr slot (pruning)",
         .insns = {
         BPF_MOV64_IMM(BPF_REG_0, 0),
         BPF_LD_MAP_FD(BPF_REG_6, 0),
@@ -26,7 +26,7 @@
 	.errstr = "arg 1 is an unacquired reference",
 },
 {
-       "dynptr: type confusion",
+       "dynptr: type confusion (pruning)",
        .insns = {
        BPF_MOV64_IMM(BPF_REG_0, 0),
        BPF_LD_MAP_FD(BPF_REG_6, 0),
@@ -88,3 +88,37 @@
        .result = REJECT,
        .errstr = "arg 1 is an unacquired reference",
 },
+{
+       "dynptr: rewrite dynptr slot (var_off)",
+	.insns = {
+	BPF_ST_MEM(BPF_W, BPF_REG_10, -4, 16),
+	BPF_LDX_MEM(BPF_W, BPF_REG_8, BPF_REG_10, -4),
+	BPF_JMP_IMM(BPF_JGE, BPF_REG_8, 0, 2),
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	BPF_JMP_IMM(BPF_JLE, BPF_REG_8, 16, 2),
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_IMM(BPF_AND, BPF_REG_8, 16),
+	BPF_LD_MAP_FD(BPF_REG_1, 0),
+	BPF_MOV64_IMM(BPF_REG_2, 8),
+	BPF_MOV64_IMM(BPF_REG_3, 0),
+	BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -32),
+	BPF_ALU64_REG(BPF_ADD, BPF_REG_4, BPF_REG_8),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_reserve_dynptr),
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0xeB9F),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -32),
+	BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_8),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_discard_dynptr),
+	BPF_MOV64_REG(BPF_REG_0, BPF_REG_8),
+	BPF_EXIT_INSN(),
+	},
+	.fixup_map_ringbuf = { 9 },
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "R4 variable stack access prohibited for !root, var_off=(0x0; 0x10) off=-32",
+	.result = REJECT,
+	.errstr = "dynptr has to be at the constant offset",
+},
-- 
2.38.0



* [PATCH bpf-next v1 12/13] selftests/bpf: Add dynptr partial slot overwrite tests
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (10 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 11/13] selftests/bpf: Add dynptr var_off tests Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2022-10-18 13:59 ` [PATCH bpf-next v1 13/13] selftests/bpf: Add dynptr helper tests Kumar Kartikeya Dwivedi
  2023-10-31  7:05 ` CVE-2023-39191 - Dynptr fixes - reg Nandhini Rengaraj
  13 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Try creating a dynptr, then overwriting its second slot with the first
slot of another dynptr. The first slot of the first dynptr should then
also be invalidated, but without our fix that does not happen. As a
consequence, the unfixed case allows passing the first dynptr (as the
kernel check only checks for slot_type and then first_slot == true).

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/verifier/dynptr.c | 58 +++++++++++++++++++
 1 file changed, 58 insertions(+)
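
Reader's note (not part of the patch): the invariant exercised by this test
can be modelled in user space. The sketch below is only an illustration under
stated assumptions: a dynptr occupies two adjacent stack slots (spi and
spi - 1), and every sketch_* name is hypothetical rather than verifier code.
With fixed == false a store only touches the slot written; with fixed == true
it destroys both slots of the dynptr it lands in.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_SLOTS 4

enum sketch_slot_type { SKETCH_INVALID, SKETCH_MISC, SKETCH_DYNPTR };

struct sketch_slot {
	enum sketch_slot_type type;
	bool first_slot;
};

/* A dynptr spans two slots: spi is the first slot, spi - 1 the second. */
static void sketch_mark_dynptr(struct sketch_slot *stack, int spi)
{
	assert(spi >= 1 && spi < SKETCH_NR_SLOTS);
	stack[spi] = (struct sketch_slot){ SKETCH_DYNPTR, true };
	stack[spi - 1] = (struct sketch_slot){ SKETCH_DYNPTR, false };
}

/* Plain store to one slot. With fixed == true, both slots of a dynptr go. */
static void sketch_store(struct sketch_slot *stack, int spi, bool fixed)
{
	assert(spi >= 0 && spi < SKETCH_NR_SLOTS);
	if (fixed && stack[spi].type == SKETCH_DYNPTR) {
		int first = stack[spi].first_slot ? spi : spi + 1;

		stack[first] = (struct sketch_slot){ SKETCH_INVALID, false };
		stack[first - 1] = (struct sketch_slot){ SKETCH_INVALID, false };
	}
	stack[spi] = (struct sketch_slot){ SKETCH_MISC, false };
}

/* Mirrors the check named in the commit log: slot type, then first_slot. */
static bool sketch_is_valid_init(const struct sketch_slot *stack, int spi)
{
	return spi >= 1 && spi < SKETCH_NR_SLOTS &&
	       stack[spi].type == SKETCH_DYNPTR && stack[spi].first_slot &&
	       stack[spi - 1].type == SKETCH_DYNPTR;
}

/* Returns true if the stale first dynptr is still accepted afterwards. */
bool sketch_stale_dynptr_accepted(bool fixed)
{
	struct sketch_slot stack[SKETCH_NR_SLOTS] = { 0 };

	sketch_mark_dynptr(stack, 2);	/* first dynptr: slots 2 and 1 */
	sketch_store(stack, 1, fixed);	/* overwrite its second slot */
	sketch_mark_dynptr(stack, 1);	/* second dynptr: slots 1 and 0 */
	return sketch_is_valid_init(stack, 2);
}
```

In this model, calling sketch_stale_dynptr_accepted(false) reproduces the
unfixed behaviour (the first dynptr still looks initialized), while passing
true models the behaviour the test expects.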

diff --git a/tools/testing/selftests/bpf/verifier/dynptr.c b/tools/testing/selftests/bpf/verifier/dynptr.c
index 1aa7241e8a9e..8c57bc9e409f 100644
--- a/tools/testing/selftests/bpf/verifier/dynptr.c
+++ b/tools/testing/selftests/bpf/verifier/dynptr.c
@@ -122,3 +122,61 @@
 	.result = REJECT,
 	.errstr = "dynptr has to be at the constant offset",
 },
+{
+       "dynptr: partial dynptr slot invalidate",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_LD_MAP_FD(BPF_REG_6, 0),
+	BPF_LD_MAP_FD(BPF_REG_7, 0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_7),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+	BPF_ST_MEM(BPF_DW, BPF_REG_2, 0, 0),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_2),
+	BPF_MOV64_IMM(BPF_REG_4, 0),
+	BPF_MOV64_REG(BPF_REG_8, BPF_REG_2),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_update_elem),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_7),
+	BPF_MOV64_REG(BPF_REG_2, BPF_REG_8),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_MOV64_IMM(BPF_REG_2, 8),
+	BPF_MOV64_IMM(BPF_REG_3, 0),
+	BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -24),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_reserve_dynptr),
+	BPF_ST_MEM(BPF_DW, BPF_REG_10, -16, 0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_7),
+	BPF_MOV64_IMM(BPF_REG_2, 8),
+	BPF_MOV64_IMM(BPF_REG_3, 0),
+	BPF_MOV64_REG(BPF_REG_4, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, -16),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_dynptr_from_mem),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -512),
+	BPF_MOV64_IMM(BPF_REG_2, 488),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, -24),
+	BPF_MOV64_IMM(BPF_REG_4, 0),
+	BPF_MOV64_IMM(BPF_REG_5, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_dynptr_read),
+	BPF_MOV64_IMM(BPF_REG_8, 1),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_MOV64_IMM(BPF_REG_8, 0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -24),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_ringbuf_discard_dynptr),
+	BPF_MOV64_REG(BPF_REG_0, BPF_REG_8),
+	BPF_EXIT_INSN(),
+	},
+	.fixup_map_ringbuf = { 1 },
+	.fixup_map_hash_8b = { 3 },
+	.result_unpriv = REJECT,
+	.errstr_unpriv = "unknown func bpf_ringbuf_reserve_dynptr#198",
+	.result = REJECT,
+	.errstr = "Expected an initialized dynptr as arg #3",
+},
-- 
2.38.0


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* [PATCH bpf-next v1 13/13] selftests/bpf: Add dynptr helper tests
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (11 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 12/13] selftests/bpf: Add dynptr partial slot overwrite tests Kumar Kartikeya Dwivedi
@ 2022-10-18 13:59 ` Kumar Kartikeya Dwivedi
  2023-10-31  7:05 ` CVE-2023-39191 - Dynptr fixes - reg Nandhini Rengaraj
  13 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-18 13:59 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Test that MEM_UNINIT doesn't allow writing dynptr stack slots. Next,
also add a test triggering the memmove case for dynptr_read and
dynptr_write.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../testing/selftests/bpf/prog_tests/dynptr.c |  3 ++
 .../testing/selftests/bpf/progs/dynptr_fail.c | 35 +++++++++++++++++++
 .../selftests/bpf/progs/dynptr_success.c      | 20 +++++++++++
 3 files changed, 58 insertions(+)
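
Reader's note (not part of the patch): test_overlap below reads from offset 0
of the dynptr into p + 1, where p is a slice of the same memory, so source and
destination overlap. A minimal user-space sketch of why that case calls for
memmove follows; copy_within is a hypothetical helper, not a BPF API, and the
buffer stands in for the ringbuf sample.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy len bytes from buf + src_off to buf + dst_off inside one buffer.
 * The two ranges may overlap, which memcpy() does not permit but
 * memmove() handles.
 */
void copy_within(char *buf, size_t buf_len, size_t dst_off, size_t src_off,
		 size_t len)
{
	assert(dst_off <= buf_len && len <= buf_len - dst_off);
	assert(src_off <= buf_len && len <= buf_len - src_off);
	memmove(buf + dst_off, buf + src_off, len);
}
```

With a 16-byte buffer, copy_within(buf, 16, 1, 0, 8) corresponds to the
bpf_dynptr_read(p + 1, 8, &ptr, 0, 0) call in the test.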

diff --git a/tools/testing/selftests/bpf/prog_tests/dynptr.c b/tools/testing/selftests/bpf/prog_tests/dynptr.c
index 947126d217bd..20910598a0a6 100644
--- a/tools/testing/selftests/bpf/prog_tests/dynptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/dynptr.c
@@ -42,11 +42,14 @@ static struct {
 	{"release_twice_callback", "arg 1 is an unacquired reference"},
 	{"dynptr_from_mem_invalid_api",
 		"Unsupported reg type fp for bpf_dynptr_from_mem data"},
+	{"dynptr_read_into_slot", "potential write to dynptr at off=-16"},
+	{"uninit_write_into_slot", "potential write to dynptr at off=-16"},
 
 	/* success cases */
 	{"test_read_write", NULL},
 	{"test_data_slice", NULL},
 	{"test_ringbuf", NULL},
+	{"test_overlap", NULL},
 };
 
 static void verify_fail(const char *prog_name, const char *expected_err_msg)
diff --git a/tools/testing/selftests/bpf/progs/dynptr_fail.c b/tools/testing/selftests/bpf/progs/dynptr_fail.c
index b0f08ff024fb..43a0ed3736a9 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_fail.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_fail.c
@@ -622,3 +622,38 @@ int dynptr_from_mem_invalid_api(void *ctx)
 
 	return 0;
 }
+
+/* Reject writes to dynptr slot from bpf_dynptr_read */
+SEC("?raw_tp")
+int dynptr_read_into_slot(void *ctx)
+{
+	union {
+		struct {
+			char _pad[48];
+			struct bpf_dynptr ptr;
+		};
+		char buf[64];
+	} data;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 64, 0, &data.ptr);
+	/* this should fail */
+	bpf_dynptr_read(data.buf, sizeof(data.buf), &data.ptr, 0, 0);
+
+	return 0;
+}
+
+/* Reject writes to dynptr slot for uninit arg */
+SEC("?raw_tp")
+int uninit_write_into_slot(void *ctx)
+{
+	struct {
+		char buf[64];
+		struct bpf_dynptr ptr;
+	} data;
+
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 80, 0, &data.ptr);
+	/* this should fail */
+	bpf_get_current_comm(data.buf, 80);
+
+	return 0;
+}
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index a3a6103c8569..401e924b15a0 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -162,3 +162,23 @@ int test_ringbuf(void *ctx)
 	bpf_ringbuf_discard_dynptr(&ptr, 0);
 	return 0;
 }
+
+SEC("tp/syscalls/sys_enter_nanosleep")
+int test_overlap(void *ctx)
+{
+	struct bpf_dynptr ptr;
+	void *p;
+
+	if (bpf_get_current_pid_tgid() >> 32 != pid)
+		return 0;
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 16, 0, &ptr);
+	p = bpf_dynptr_data(&ptr, 0, 16);
+	if (!p) {
+		err = 1;
+		goto done;
+	}
+	bpf_dynptr_read(p + 1, 8, &ptr, 0, 0);
+done:
+	bpf_ringbuf_discard_dynptr(&ptr, 0);
+	return 0;
+}
-- 
2.38.0


^ permalink raw reply related	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-10-18 19:45   ` David Vernet
  2022-10-19  6:04     ` Kumar Kartikeya Dwivedi
  2022-10-19 22:59   ` Joanne Koong
  1 sibling, 1 reply; 54+ messages in thread
From: David Vernet @ 2022-10-18 19:45 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Oct 18, 2022 at 07:29:08PM +0530, Kumar Kartikeya Dwivedi wrote:

Hey Kumar, thanks for looking at this stuff.

> ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> the underlying register type is subjected to more special checks to
> determine the type of object represented by the pointer and its state
> consistency.
> 
> Move dynptr checks to their own 'process_dynptr_func' function so that
> is consistent and in-line with existing code. This also makes it easier
> to reuse this code for kfunc handling.

Just out of curiosity, do you have a specific use case for when you'd envision
a kfunc taking a dynptr? I'm not saying there are none, just curious if you
have any specifically that you've considered.

> To this end, remove the dependency on bpf_call_arg_meta parameter by
> instead taking the uninit_dynptr_regno by pointer. This is only needed
> to be set to a valid pointer when arg_type has MEM_UNINIT.
> 
> Then, reuse this consolidated function in kfunc dynptr handling too.
> Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> been lifted.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf_verifier.h                  |   8 +-
>  kernel/bpf/btf.c                              |  17 +--
>  kernel/bpf/verifier.c                         | 115 ++++++++++--------
>  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
>  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
>  5 files changed, 69 insertions(+), 88 deletions(-)
> 
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 9e1e6965f407..a33683e0618b 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
>  			     u32 regno);
>  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>  		   u32 regno, u32 mem_size);
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -			      struct bpf_reg_state *reg);
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -			     struct bpf_reg_state *reg,
> -			     enum bpf_arg_type arg_type);
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +			enum bpf_arg_type arg_type, int argno,
> +			u8 *uninit_dynptr_regno);
>  
>  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
>  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index eba603cec2c5..1827d889e08a 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
>  						return -EINVAL;
>  					}
>  
> -					if (!is_dynptr_reg_valid_init(env, reg)) {
> -						bpf_log(log,
> -							"arg#%d pointer type %s %s must be valid and initialized\n",
> -							i, btf_type_str(ref_t),
> -							ref_tname);
> +					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
>  						return -EINVAL;
> -					}

Could you please clarify why you're removing the DYNPTR_TYPE_LOCAL constraint
for kfuncs?

You seem to have removed the following negative selftest:

> -SEC("?lsm.s/bpf")
> -int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
> -	     unsigned int size)
> -{
> -	char write_data[64] = "hello there, world!!";
> -	struct bpf_dynptr ptr;
> -
> -	bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
> -
> -	return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
> -}
> -

But it was clearly the intention of the test to validate that we can't pass a
dynptr to a ringbuf region to this kfunc, so I'm curious what's changed since
that test was added.

> -
> -					if (!is_dynptr_type_expected(env, reg,
> -							ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> -						bpf_log(log,
> -							"arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> -							i, btf_type_str(ref_t),
> -							ref_tname);
> -						return -EINVAL;
> -					}
> -
>  					continue;
>  				}
>  
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 6f6d2d511c06..31c0c999448e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>  	return true;
>  }
>  
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -			      struct bpf_reg_state *reg)
> +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
>  	int spi = get_spi(reg->off);
> @@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
>  	return true;
>  }
>  
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -			     struct bpf_reg_state *reg,
> -			     enum bpf_arg_type arg_type)
> +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> +				    enum bpf_arg_type arg_type)
>  {
>  	struct bpf_func_state *state = func(env, reg);
>  	enum bpf_dynptr_type dynptr_type;
> @@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>  	return 0;
>  }
>  
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +			enum bpf_arg_type arg_type, int argno,
> +			u8 *uninit_dynptr_regno)
> +{

IMO 'process' is a bit too generic of a term. If we decide to go with this,
what do you think about changing the name to check_func_dynptr_arg(), or just
check_dynptr_arg()?

> +	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> +
> +	/* We only need to check for initialized / uninitialized helper
> +	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> +	 * assumption is that if it is, that a helper function
> +	 * initialized the dynptr on behalf of the BPF program.
> +	 */
> +	if (base_type(reg->type) == PTR_TO_DYNPTR)
> +		return 0;
> +	if (arg_type & MEM_UNINIT) {
> +		if (!is_dynptr_reg_valid_uninit(env, reg)) {
> +			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> +			return -EINVAL;
> +		}
> +
> +		/* We only support one dynptr being uninitialized at the moment,
> +		 * which is sufficient for the helper functions we have right now.
> +		 */
> +		if (*uninit_dynptr_regno) {
> +			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> +			return -EFAULT;
> +		}
> +
> +		*uninit_dynptr_regno = regno;
> +	} else {
> +		if (!is_dynptr_reg_valid_init(env, reg)) {
> +			verbose(env,
> +				"Expected an initialized dynptr as arg #%d\n",
> +				argno + 1);
> +			return -EINVAL;
> +		}
> +
> +		if (!is_dynptr_type_expected(env, reg, arg_type)) {
> +			const char *err_extra = "";
> +
> +			switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> +			case DYNPTR_TYPE_LOCAL:
> +				err_extra = "local";
> +				break;
> +			case DYNPTR_TYPE_RINGBUF:
> +				err_extra = "ringbuf";
> +				break;
> +			default:
> +				err_extra = "<unknown>";
> +				break;
> +			}
> +			verbose(env,
> +				"Expected a dynptr of type %s as arg #%d\n",
> +				err_extra, argno + 1);
> +			return -EINVAL;
> +		}
> +	}
> +	return 0;
> +}

[...]

Seems like a reasonable cleanup overall, comments aside.

Thanks,
David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  2022-10-18 13:59 ` [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM Kumar Kartikeya Dwivedi
@ 2022-10-18 21:38   ` sdf
  2022-10-19  6:19     ` Kumar Kartikeya Dwivedi
  2022-11-07 22:35   ` Joanne Koong
  1 sibling, 1 reply; 54+ messages in thread
From: sdf @ 2022-10-18 21:38 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On 10/18, Kumar Kartikeya Dwivedi wrote:
> Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM, and
> RET_PTR_TO_ALLOC_MEM_OR_NULL, however the former is confusingly named to
> imply that it carries MEM_ALLOC, while only the latter does. This causes
> confusion during code review leading to conclusions like that the return
> value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is RET_PTR_TO_ALLOC_MEM |
> PTR_MAYBE_NULL) may be consumable by bpf_ringbuf_{submit,commit}.

> Rename it to make it clear MEM_ALLOC needs to be tacked on top of
> RET_PTR_TO_MEM.
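
Reader's note: the naming point can be illustrated with a small sketch of how
a base return type and type flags compose. All SKETCH_*/sketch_* names and the
numeric values are hypothetical and chosen only for illustration; they are not
the kernel's definitions. The point is that both *_OR_NULL variants share the
same base type and only one of them carries the alloc flag.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the low bits hold the base type, higher bits hold flags. */
#define SKETCH_BASE_TYPE_BITS	8
#define SKETCH_BASE_TYPE_MASK	((1u << SKETCH_BASE_TYPE_BITS) - 1)

#define SKETCH_RET_PTR_TO_MEM	1u
#define SKETCH_PTR_MAYBE_NULL	(1u << (SKETCH_BASE_TYPE_BITS + 0))
#define SKETCH_MEM_ALLOC	(1u << (SKETCH_BASE_TYPE_BITS + 1))

#define SKETCH_RET_PTR_TO_ALLOC_MEM_OR_NULL \
	(SKETCH_PTR_MAYBE_NULL | SKETCH_MEM_ALLOC | SKETCH_RET_PTR_TO_MEM)
#define SKETCH_RET_PTR_TO_DYNPTR_MEM_OR_NULL \
	(SKETCH_PTR_MAYBE_NULL | SKETCH_RET_PTR_TO_MEM)

/* Strip the flags and keep only the base type. */
uint32_t sketch_base_type(uint32_t type)
{
	return type & SKETCH_BASE_TYPE_MASK;
}

/* Only types carrying the alloc flag model memory that can be released. */
bool sketch_is_alloc_mem(uint32_t type)
{
	return (type & SKETCH_MEM_ALLOC) != 0;
}
```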

> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>   include/linux/bpf.h   | 6 +++---
>   kernel/bpf/verifier.c | 2 +-
>   2 files changed, 4 insertions(+), 4 deletions(-)

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 13c6ff2de540..834276ba56c9 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -538,7 +538,7 @@ enum bpf_return_type {
>   	RET_PTR_TO_SOCKET,		/* returns a pointer to a socket */
>   	RET_PTR_TO_TCP_SOCK,		/* returns a pointer to a tcp_sock */
>   	RET_PTR_TO_SOCK_COMMON,		/* returns a pointer to a sock_common */
> -	RET_PTR_TO_ALLOC_MEM,		/* returns a pointer to dynamically allocated memory */
> +	RET_PTR_TO_MEM,			/* returns a pointer to dynamically allocated memory */

What about the comment? It still says that it's a pointer to a
dynamically allocated memory :-/ Does it make sense to clarify it as
well?

>   	RET_PTR_TO_MEM_OR_BTF_ID,	/* returns a pointer to a valid memory or a btf_id */
>   	RET_PTR_TO_BTF_ID,		/* returns a pointer to a btf_id */
>   	__BPF_RET_TYPE_MAX,
> @@ -548,8 +548,8 @@ enum bpf_return_type {
>   	RET_PTR_TO_SOCKET_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
>   	RET_PTR_TO_TCP_SOCK_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
>   	RET_PTR_TO_SOCK_COMMON_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
> -	RET_PTR_TO_ALLOC_MEM_OR_NULL	= PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
> -	RET_PTR_TO_DYNPTR_MEM_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
> +	RET_PTR_TO_ALLOC_MEM_OR_NULL	= PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_MEM,
> +	RET_PTR_TO_DYNPTR_MEM_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_MEM,
>   	RET_PTR_TO_BTF_ID_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,

>   	/* This must be the last entry. Its purpose is to ensure the enum is
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 87d9cccd1623..a49b95c1af1b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7612,7 +7612,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>   		mark_reg_known_zero(env, regs, BPF_REG_0);
>   		regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag;
>   		break;
> -	case RET_PTR_TO_ALLOC_MEM:
> +	case RET_PTR_TO_MEM:
>   		mark_reg_known_zero(env, regs, BPF_REG_0);
>   		regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
>   		regs[BPF_REG_0].mem_size = meta.mem_size;
> --
> 2.38.0


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off
  2022-10-18 13:59 ` [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
@ 2022-10-18 21:55   ` sdf
  2022-10-19  6:24     ` Kumar Kartikeya Dwivedi
  2022-11-07 23:17   ` Joanne Koong
  1 sibling, 1 reply; 54+ messages in thread
From: sdf @ 2022-10-18 21:55 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On 10/18, Kumar Kartikeya Dwivedi wrote:
> While check_func_arg_reg_off is the place which performs generic checks
> needed by various candidates of reg->type, there is some handling for
> special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> ARG_PTR_TO_ALLOC_MEM.

> This commit aims to streamline these special cases and instead leave
> other things up to argument type specific code to handle.

> This is done primarily for two reasons: associating back reg->type to
> its argument leaves room for the list getting out of sync when a new
> reg->type is supported by an arg_type.

> The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> we already handle, whenever a release argument is expected, it should
> be passed as the pointer that was received from the acquire function.
> Hence zero fixed and variable offset.

> There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> with non-zero offset to other helper functions, which makes sense.

> Hence, lift the arg_type_is_release check for reg->off and cover all
> possible register types, instead of duplicating the same kind of check
> twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).

> Finally, for the release argument, arg_type_is_dynptr is the special
> case, where we go to actual object being freed through the dynptr, so
> the offset of the pointer still needs to allow fixed and variable offset
> and process_dynptr_func will verify them later for the release argument
> case as well.

> Finally, since check_func_arg_reg_off is meant to be generic, move
> dynptr specific check into process_dynptr_func.

> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>   kernel/bpf/verifier.c                         | 55 +++++++++++++++----
>   .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
>   2 files changed, 44 insertions(+), 13 deletions(-)

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a49b95c1af1b..a8c277e51d63 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5654,6 +5654,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>   		return -EFAULT;
>   	}

> +	/* CONST_PTR_TO_DYNPTR has fixed and variable offset as zero, ensured by
> +	 * check_func_arg_reg_off, so this is only needed for PTR_TO_STACK.
> +	 */
> +	if (reg->off % BPF_REG_SIZE) {
> +		verbose(env, "cannot pass in dynptr at an offset\n");
> +		return -EINVAL;
> +	}

This is what I'm missing here and in the original code as well, maybe you
can clarify?

"if (reg->off % BPF_REG_SIZE)" here vs "if (reg->off)" below. What's the
difference?
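
Reader's note: going only by the quoted diff, the two conditions test
different properties. `reg->off % BPF_REG_SIZE` rejects offsets that are not
a multiple of the slot size, while `reg->off` on the release path rejects any
non-zero offset. A minimal sketch, assuming a slot size of 8 bytes (the
SKETCH_REG_SIZE constant and both function names are hypothetical):

```c
#include <stdbool.h>

/* Assumption for illustration: one stack slot is 8 bytes wide. */
#define SKETCH_REG_SIZE 8

/* True when off lands exactly on a slot boundary (0, -8, -16, ...). */
bool sketch_off_is_slot_aligned(int off)
{
	return off % SKETCH_REG_SIZE == 0;
}

/* True only for the pointer exactly as it was handed out. */
bool sketch_off_is_zero(int off)
{
	return off == 0;
}
```

Under that assumption a stack offset such as -16 passes the first check but
not the second, and -12 passes neither.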

> +
>   	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>   	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>   	 *
> @@ -5672,6 +5680,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>   	 *		 destroyed, including mutation of the memory it points
>   	 *		 to.
>   	 */
> +
>   	if (arg_type & MEM_UNINIT) {
>   		if (!is_dynptr_reg_valid_uninit(env, reg)) {
>   			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> @@ -5983,14 +5992,37 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>   	enum bpf_reg_type type = reg->type;
>   	bool fixed_off_ok = false;

> -	switch ((u32)type) {
> -	/* Pointer types where reg offset is explicitly allowed: */
> -	case PTR_TO_STACK:
> -		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -			verbose(env, "cannot pass in dynptr at an offset\n");
> +	/* When referenced register is passed to release function, its fixed
> +	 * offset must be 0.
> +	 *
> +	 * We will check arg_type_is_release reg has ref_obj_id when storing
> +	 * meta->release_regno.
> +	 */
> +	if (arg_type_is_release(arg_type)) {
> +		/* ARG_PTR_TO_DYNPTR is a bit special, as it may not directly
> +		 * point to the object being released, but to dynptr pointing
> +		 * to such object, which might be at some offset on the stack.
> +		 *
> +		 * In that case, we simply fall back to the default handling.
> +		 */
> +		if (arg_type_is_dynptr(arg_type) && type == PTR_TO_STACK)
> +			goto check_type;
> +		/* Going straight to check will catch this because fixed_off_ok
> +		 * is false, but checking here allows us to give the user a
> +		 * better error message.
> +		 */
> +		if (reg->off) {
> +			verbose(env, "R%d must have zero offset when passed to release func\n",
> +				regno);
>   			return -EINVAL;
>   		}
> -		fallthrough;
> +		goto check;
> +	}
> +check_type:
> +	switch ((u32)type) {
> +	/* Pointer types where both fixed and variable reg offset is explicitly
> +	 * allowed: */
> +	case PTR_TO_STACK:
>   	case PTR_TO_PACKET:
>   	case PTR_TO_PACKET_META:
>   	case PTR_TO_MAP_KEY:
> @@ -6001,12 +6033,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>   	case PTR_TO_BUF:
>   	case PTR_TO_BUF | MEM_RDONLY:
>   	case SCALAR_VALUE:
> -		/* Some of the argument types nevertheless require a
> -		 * zero register offset.
> -		 */
> -		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> -			return 0;
> -		break;
> +		return 0;
>   	/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
>   	 * fixed offset.
>   	 */
> @@ -6023,12 +6050,16 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>   		/* For arg is release pointer, fixed_off_ok must be false, but
>   		 * we already checked and rejected reg->off != 0 above, so set
>   		 * to true to allow fixed offset for all other cases.
> +		 *
> +		 * var_off always must be 0 for PTR_TO_BTF_ID, hence we still
> +		 * need to do checks instead of returning.
>   		 */
>   		fixed_off_ok = true;
>   		break;
>   	default:
>   		break;
>   	}
> +check:
>   	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>   }

> diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
> index b64d33e4833c..92e3f6a61a79 100644
> --- a/tools/testing/selftests/bpf/verifier/ringbuf.c
> +++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
> @@ -28,7 +28,7 @@
>   	},
>   	.fixup_map_ringbuf = { 1 },
>   	.result = REJECT,
> -	.errstr = "dereference of modified alloc_mem ptr R1",
> +	.errstr = "R1 must have zero offset when passed to release func",
>   },
>   {
>   	"ringbuf: invalid reservation offset 2",
> --
> 2.38.0


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func
  2022-10-18 13:59 ` [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-10-18 23:16   ` David Vernet
  2022-10-19  6:18     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: David Vernet @ 2022-10-18 23:16 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Oct 18, 2022 at 07:29:09PM +0530, Kumar Kartikeya Dwivedi wrote:
> Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
> for use in callback state, because in case of user ringbuf helpers,
> there is no dynptr on the stack that is passed into the callback. To
> reflect such a state, a special register type was created.
> 
> However, some checks have been bypassed incorrectly during the addition
> of this feature. First, for arg_type with MEM_UNINIT flag which
> initialize a dynptr, they must be rejected for such register type.

Ahhh, great point. Thanks a lot for catching this.

> Secondly, in the future, there are plans to dynptr helpers that operate
> on the dynptr itself and may change its offset and other properties.

small nit: s/to dynptr helpers/to add dynptr helpers

> In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> to such helpers, however the current code simply returns 0.
>
> The rejection for helpers that release the dynptr is already handled.
> 
> For fixing this, we take a step back and rework existing code in a way
> that will allow fitting in all classes of helpers and have a coherent
> model for dealing with the variety of use cases in which dynptr is used.
> 
> First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
>
> Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> that the helper only operates on the memory pointed to by the dynptr,

Hmmm, it feels a bit confusing to overload MEM_RDONLY like this. I
understand the intention (which is logical) to imply that the pointer to
the dynptr is read only, but the fact that the memory contained in the
dynptr may not be read only will doubtless confuse people.

I don't really have a better suggestion. This is the proper use of
MEM_RDONLY, but it really feels super confusing. I guess this is
somewhat mitigated by the fact that the read-only nature of the dynptr
is something that will be validated at runtime?

> not the dynptr itself. In C parlance, it would be equivalent to taking
> the dynptr as a pointer to const argument.
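
Reader's note: the C analogy can be made concrete with a short user-space
sketch. It is only an illustration; struct sketch_dynptr and both functions
are hypothetical and do not reflect the kernel's struct bpf_dynptr layout.
A function taking the object by pointer to const cannot change the object
itself, yet it can still write to the memory the object points to, which is
the distinction being discussed here.

```c
#include <stddef.h>
#include <string.h>

struct sketch_dynptr {
	char *data;	/* memory the dynptr refers to */
	size_t size;	/* number of bytes available at data */
};

/* Non-const pointer: allowed to (re)initialize the dynptr itself. */
void sketch_dynptr_init(struct sketch_dynptr *ptr, char *mem, size_t size)
{
	ptr->data = mem;
	ptr->size = size;
}

/*
 * Pointer to const: ptr->data and ptr->size cannot be modified here,
 * but the bytes that ptr->data points to can still be written.
 */
int sketch_dynptr_write(const struct sketch_dynptr *ptr, size_t off,
			const void *src, size_t len)
{
	if (off > ptr->size || len > ptr->size - off)
		return -1;
	memcpy(ptr->data + off, src, len);
	return 0;
}
```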
> 
> When either of these flags are not present, the helper is allowed to
> mutate both the dynptr itself and also the memory it points to.
> Currently, the read only status of the memory is not tracked in the
> dynptr, but it would be trivial to add this support inside dynptr state
> of the register.
> 
> With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
> better reflect its usage, it can no longer be passed to helpers that
> initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
> 
> A note to reviewers is that in code that does mark_stack_slots_dynptr,
> and unmark_stack_slots_dynptr, we implicitly rely on the fact that
> PTR_TO_STACK reg is the only case that can reach that code path, as one
> cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
> both cases such helpers won't be setting that flag.
> 
> The next patch will add a couple of selftest cases to make sure this
> doesn't break.
> 
> Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf.h                           |   4 +-
>  include/uapi/linux/bpf.h                      |   8 +-
>  kernel/bpf/btf.c                              |   7 +-
>  kernel/bpf/helpers.c                          |  18 +-
>  kernel/bpf/verifier.c                         | 203 ++++++++++++++----
>  scripts/bpf_doc.py                            |   1 +
>  tools/include/uapi/linux/bpf.h                |   8 +-
>  .../selftests/bpf/prog_tests/user_ringbuf.c   |  10 +-
>  8 files changed, 185 insertions(+), 74 deletions(-)

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 31c0c999448e..87d9cccd1623 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -563,7 +563,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
>  		[PTR_TO_BUF]		= "buf",
>  		[PTR_TO_FUNC]		= "func",
>  		[PTR_TO_MAP_KEY]	= "map_key",
> -		[PTR_TO_DYNPTR]		= "dynptr_ptr",
> +		[CONST_PTR_TO_DYNPTR]	= "dynptr",
>  	};
>  
>  	if (type & PTR_MAYBE_NULL) {
> @@ -697,6 +697,27 @@ static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
>  	return type == BPF_DYNPTR_TYPE_RINGBUF;
>  }
>  
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +			       struct bpf_reg_state *reg2,
> +			       enum bpf_dynptr_type type);
> +
> +static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> +				struct bpf_reg_state *reg);
> +
> +static void mark_dynptr_stack_regs(struct bpf_reg_state *sreg1,
> +				   struct bpf_reg_state *sreg2,
> +				   enum bpf_dynptr_type type)
> +{
> +	__mark_dynptr_regs(sreg1, sreg2, type);
> +}
> +
> +static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> +			       enum bpf_dynptr_type type)
> +{
> +	__mark_dynptr_regs(reg1, NULL, type);
> +}
> +
> +
>  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>  				   enum bpf_arg_type arg_type, int insn_idx)
>  {
> @@ -718,9 +739,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>  	if (type == BPF_DYNPTR_TYPE_INVALID)
>  		return -EINVAL;
>  
> -	state->stack[spi].spilled_ptr.dynptr.first_slot = true;
> -	state->stack[spi].spilled_ptr.dynptr.type = type;
> -	state->stack[spi - 1].spilled_ptr.dynptr.type = type;
> +	mark_dynptr_stack_regs(&state->stack[spi].spilled_ptr,
> +			       &state->stack[spi - 1].spilled_ptr, type);
>  
>  	if (dynptr_type_refcounted(type)) {
>  		/* The id is used to track proper releasing */
> @@ -728,8 +748,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>  		if (id < 0)
>  			return id;
>  
> -		state->stack[spi].spilled_ptr.id = id;
> -		state->stack[spi - 1].spilled_ptr.id = id;
> +		state->stack[spi].spilled_ptr.ref_obj_id = id;
> +		state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
>  	}
>  
>  	return 0;
> @@ -751,25 +771,23 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
>  	}
>  
>  	/* Invalidate any slices associated with this dynptr */
> -	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> -		release_reference(env, state->stack[spi].spilled_ptr.id);
> -		state->stack[spi].spilled_ptr.id = 0;
> -		state->stack[spi - 1].spilled_ptr.id = 0;
> -	}
> -
> -	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
> -	state->stack[spi].spilled_ptr.dynptr.type = 0;
> -	state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
> +	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
> +		WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
>  
> +	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> +	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
>  	return 0;
>  }
>  
>  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> -	int i;
> +	int spi, i;
>  
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return false;
> +
> +	spi = get_spi(reg->off);
>  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
>  		return true;
>  
> @@ -785,9 +803,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>  static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> +	int spi;
>  	int i;
>  
> +	/* This already represents first slot of initialized bpf_dynptr */
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return true;
> +
> +	spi = get_spi(reg->off);
>  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
>  	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
>  		return false;
> @@ -806,15 +829,21 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
>  {
>  	struct bpf_func_state *state = func(env, reg);
>  	enum bpf_dynptr_type dynptr_type;
> -	int spi = get_spi(reg->off);
> +	int spi;
>  
> +	/* Fold MEM_RDONLY, caller already checked it */
> +	arg_type &= ~MEM_RDONLY;

This is already done in the caller, I think it can just be removed?

>  	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
>  	if (arg_type == ARG_PTR_TO_DYNPTR)
>  		return true;
>  
>  	dynptr_type = arg_to_dynptr_type(arg_type);
> -
> -	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +	if (reg->type == CONST_PTR_TO_DYNPTR) {
> +		return reg->dynptr.type == dynptr_type;
> +	} else {
> +		spi = get_spi(reg->off);
> +		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +	}
>  }
>  
>  /* The reg state of a pointer or a bounded scalar was saved when
> @@ -1317,9 +1346,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
>  	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
>  };
>  
> -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> -				struct bpf_reg_state *reg);
> -
>  /* This helper doesn't clear reg->id */
>  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
>  {
> @@ -1382,6 +1408,25 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
>  	__mark_reg_known_zero(regs + regno);
>  }
>  
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +			       struct bpf_reg_state *reg2,
> +			       enum bpf_dynptr_type type)
> +{
> +	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> +	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR.
> +	 */

Meh, this is mildly confusing. Please correct me if my understanding is wrong,
but the reason this is the case is that we only set the struct bpf_reg_state
from the stack, whereas the actual reg itself of course has PTR_TO_STACK. If
that's the case, can we go into just a bit more detail here in this comment
about what's going on? It's kind of confusing that we have an actual register
of type PTR_TO_STACK, which points to stack register state of (inconsequential)
type CONST_PTR_TO_DYNPTR. It's also kind of weird (but also inconsequential)
that we have dynptr.first_slot for CONST_PTR_TO_DYNPTR.

Just my two cents as well, but even if the field isn't really used for
anything, I would still add an additional enum bpf_reg_type parameter that sets
this to STACK_DYNPTR, with a comment that says it's currently only used by
CONST_PTR_TO_DYNPTR registers.

> +	__mark_reg_known_zero(reg1);
> +	reg1->type = CONST_PTR_TO_DYNPTR;
> +	reg1->dynptr.type = type;
> +	reg1->dynptr.first_slot = true;
> +	if (!reg2)
> +		return;
> +	__mark_reg_known_zero(reg2);
> +	reg2->type = CONST_PTR_TO_DYNPTR;
> +	reg2->dynptr.type = type;
> +	reg2->dynptr.first_slot = false;
> +}
> +
>  static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
>  {
>  	if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
> @@ -5571,19 +5616,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>  	return 0;
>  }
>  
> +/* Implementation details:
> + *
> + * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
> + * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
> + *
> + * In both cases we deal with the first 8 bytes, but need to mark the next 8
> + * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
> + * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
> + *
> + * Mutability of bpf_dynptr is at two levels, one is at the level of struct
> + * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
> + * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
> + * mutate the view of the dynptr and also possibly destroy it. In the latter
> + * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
> + * memory that dynptr points to.
> + *
> + * The verifier will keep track both levels of mutation (bpf_dynptr's in
> + * reg->type and the memory's in reg->dynptr.type), but there is no support for
> + * readonly dynptr view yet, hence only the first case is tracked and checked.
> + *
> + * This is consistent with how C applies the const modifier to a struct object,
> + * where the pointer itself inside bpf_dynptr becomes const but not what it
> + * points to.
> + *
> + * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
> + * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
> + */
>  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  			enum bpf_arg_type arg_type, int argno,
>  			u8 *uninit_dynptr_regno)
>  {
>  	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
>  
> -	/* We only need to check for initialized / uninitialized helper
> -	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> -	 * assumption is that if it is, that a helper function
> -	 * initialized the dynptr on behalf of the BPF program.
> +	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> +		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> +		return -EFAULT;
> +	}
> +
> +	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
> +	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> +	 *
> +	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
> +	 *		 constructing a mutable bpf_dynptr object.
> +	 *
> +	 *		 Currently, this is only possible with PTR_TO_STACK
> +	 *		 pointing to a region of atleast 16 bytes which doesn't
> +	 *		 contain an existing bpf_dynptr.
> +	 *
> +	 *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
> +	 *		 mutated or destroyed. However, the memory it points to
> +	 *		 may be mutated.
> +	 *
> +	 *  None       - Points to a initialized dynptr that can be mutated and
> +	 *		 destroyed, including mutation of the memory it points
> +	 *		 to.
>  	 */
> -	if (base_type(reg->type) == PTR_TO_DYNPTR)
> -		return 0;
>  	if (arg_type & MEM_UNINIT) {
>  		if (!is_dynptr_reg_valid_uninit(env, reg)) {
>  			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> @@ -5597,9 +5685,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
>  			return -EFAULT;
>  		}
> -
>  		*uninit_dynptr_regno = regno;
>  	} else {
> +		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> +		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> +			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> +			return -EINVAL;
> +		}
> +
>  		if (!is_dynptr_reg_valid_init(env, reg)) {
>  			verbose(env,
>  				"Expected an initialized dynptr as arg #%d\n",
> @@ -5607,6 +5700,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  			return -EINVAL;
>  		}
>  
> +		arg_type &= ~MEM_RDONLY;
>  		if (!is_dynptr_type_expected(env, reg, arg_type)) {
>  			const char *err_extra = "";
>  
> @@ -5762,7 +5856,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
>  static const struct bpf_reg_types dynptr_types = {
>  	.types = {
>  		PTR_TO_STACK,
> -		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
> +		CONST_PTR_TO_DYNPTR,
>  	}
>  };
>  
> @@ -5938,12 +6032,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>  
> -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> +	int spi;
>  
> -	return state->stack[spi].spilled_ptr.id;
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return reg->ref_obj_id;
> +	spi = get_spi(reg->off);
> +	return state->stack[spi].spilled_ptr.ref_obj_id;
>  }
>  
>  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -6007,11 +6104,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>  	if (arg_type_is_release(arg_type)) {
>  		if (arg_type_is_dynptr(arg_type)) {
>  			struct bpf_func_state *state = func(env, reg);
> -			int spi = get_spi(reg->off);
> +			int spi;
>  
> -			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> -			    !state->stack[spi].spilled_ptr.id) {
> -				verbose(env, "arg %d is an unacquired reference\n", regno);
> +			if (reg->type == PTR_TO_STACK) {
> +				spi = get_spi(reg->off);
> +				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> +				    !state->stack[spi].spilled_ptr.ref_obj_id) {
> +					verbose(env, "arg %d is an unacquired reference\n", regno);
> +					return -EINVAL;
> +				}
> +			} else {
> +				verbose(env, "cannot release unowned const bpf_dynptr\n");
>  				return -EINVAL;
>  			}
>  		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
> @@ -6946,11 +7049,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
>  {
>  	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
>  	 *			  callback_ctx, u64 flags);
> -	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
> +	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
>  	 */
>  	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
> -	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
> -	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
> +	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
>  	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
>  
>  	/* unused */
> @@ -7328,6 +7430,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  
>  	regs = cur_regs(env);
>  
> +	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
> +	 * is safe to do.
> +	 */
>  	if (meta.uninit_dynptr_regno) {
>  		/* we write BPF_DW bits (8 bytes) at a time */
>  		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
> @@ -7346,6 +7452,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  
>  	if (meta.release_regno) {
>  		err = -EINVAL;
> +		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
> +		 * is safe to do.
> +		 */
>  		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
>  			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
>  		else if (meta.ref_obj_id)
> @@ -7428,11 +7538,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  					return -EFAULT;
>  				}
>  
> -				if (base_type(reg->type) != PTR_TO_DYNPTR)
> -					/* Find the id of the dynptr we're
> -					 * tracking the reference of
> -					 */
> -					meta.ref_obj_id = stack_slot_get_id(env, reg);
> +				/* Find the id of the dynptr we're
> +				 * tracking the reference of
> +				 */

I think this can be brought onto one line now.

> +				meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
>  				break;
>  			}
>  		}

[...]

Overall this looks great. Thanks again for working on this. I'd love to hear
Andrii and/or Joanne's thoughts, but overall this looks good and like a solid
improvement (both in terms of fixing 205715673844 ("bpf: Add
bpf_user_ringbuf_drain() helper"), and in terms of the right direction for
dynptrs architecturally).

Thanks,
David


* Re: [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-18 19:45   ` David Vernet
@ 2022-10-19  6:04     ` Kumar Kartikeya Dwivedi
  2022-10-19 15:26       ` David Vernet
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-19  6:04 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Wed, Oct 19, 2022 at 01:15:37AM IST, David Vernet wrote:
> On Tue, Oct 18, 2022 at 07:29:08PM +0530, Kumar Kartikeya Dwivedi wrote:
>
> Hey Kumar, thanks for looking at this stuff.
>
> > ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> > the underlying register type is subjected to more special checks to
> > determine the type of object represented by the pointer and its state
> > consistency.
> >
> > Move dynptr checks to their own 'process_dynptr_func' function so that
> > is consistent and in-line with existing code. This also makes it easier
> > to reuse this code for kfunc handling.
>
> Just out of curiosity, do you have a specific use case for when you'd envision
> a kfunc taking a dynptr? I'm not saying there are none, just curious if you
> have any specifically that you've considered.
>

There is already a kfunc that takes dynptrs, bpf_verify_pkcs7_signature. I am
sure we'll get more in the future.

> > To this end, remove the dependency on bpf_call_arg_meta parameter by
> > instead taking the uninit_dynptr_regno by pointer. This is only needed
> > to be set to a valid pointer when arg_type has MEM_UNINIT.
> >
> > Then, reuse this consolidated function in kfunc dynptr handling too.
> > Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> > been lifted.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf_verifier.h                  |   8 +-
> >  kernel/bpf/btf.c                              |  17 +--
> >  kernel/bpf/verifier.c                         | 115 ++++++++++--------
> >  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
> >  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
> >  5 files changed, 69 insertions(+), 88 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 9e1e6965f407..a33683e0618b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
> >  			     u32 regno);
> >  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> >  		   u32 regno, u32 mem_size);
> > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > -			      struct bpf_reg_state *reg);
> > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > -			     struct bpf_reg_state *reg,
> > -			     enum bpf_arg_type arg_type);
> > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > +			enum bpf_arg_type arg_type, int argno,
> > +			u8 *uninit_dynptr_regno);
> >
> >  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
> >  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index eba603cec2c5..1827d889e08a 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> >  						return -EINVAL;
> >  					}
> >
> > -					if (!is_dynptr_reg_valid_init(env, reg)) {
> > -						bpf_log(log,
> > -							"arg#%d pointer type %s %s must be valid and initialized\n",
> > -							i, btf_type_str(ref_t),
> > -							ref_tname);
> > +					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
> >  						return -EINVAL;
> > -					}
>
> Could you please clarify why you're removing the DYNPTR_TYPE_LOCAL constraint
> for kfuncs?
>
> You seemed to have removed the following negative selftest:
>
> > -SEC("?lsm.s/bpf")
> > -int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
> > -	     unsigned int size)
> > -{
> > -	char write_data[64] = "hello there, world!!";
> > -	struct bpf_dynptr ptr;
> > -
> > -	bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
> > -
> > -	return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
> > -}
> > -
>
> But it was clearly the intention of the test validate that we can't pass a
> dynptr to a ringbuf region to this kfunc, so I'm curious what's changed since
> that test was added.
>

There was no inherent limitation requiring only local dynptrs. When this was
added, I suggested sticking to one kind because of the code divergence between
kfunc argument checking and helper argument checking.

Now that both share the same code, it's easier to handle everything in one place
and make it work the same way everywhere.

Also, the next patch adds a very clear distinction between argument types which
only operate on the dynamically sized memory slice and ones which may also
modify the dynptr, which also makes it easier to support things for kfuncs by
setting MEM_RDONLY.
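
To make that distinction concrete, here is a toy model of the decision table
described in the comments of patch 02. It is only a sketch: the toy_* names,
the flag values and the 'initialized' parameter are made up for illustration
and do not match the kernel's encodings or the real process_dynptr_func().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag and type values; the kernel's real encodings differ. */
enum toy_flag { TOY_MEM_UNINIT = 1 << 0, TOY_MEM_RDONLY = 1 << 1 };
enum toy_reg_type { TOY_PTR_TO_STACK, TOY_CONST_PTR_TO_DYNPTR };

/* Returns 0 if a register of 'reg_type', whose dynptr is or is not
 * initialized, may be passed to an argument carrying 'flags', else a
 * negative errno.
 */
int toy_check_dynptr_arg(enum toy_reg_type reg_type, bool initialized,
			 unsigned int flags)
{
	unsigned int both = TOY_MEM_UNINIT | TOY_MEM_RDONLY;

	/* A helper cannot both construct a dynptr and promise not to touch it. */
	if ((flags & both) == both)
		return -EFAULT;

	if (flags & TOY_MEM_UNINIT) {
		/* Only a stack region without a live dynptr can be initialized. */
		if (reg_type != TOY_PTR_TO_STACK || initialized)
			return -EINVAL;
		return 0;
	}

	/* A const dynptr may only go to helpers that leave the dynptr intact. */
	if (reg_type == TOY_CONST_PTR_TO_DYNPTR && !(flags & TOY_MEM_RDONLY))
		return -EINVAL;

	return initialized ? 0 : -EINVAL;
}

/* Small self-check covering the callback (const dynptr) cases. */
void toy_check_demo(void)
{
	assert(toy_check_dynptr_arg(TOY_CONST_PTR_TO_DYNPTR, true, TOY_MEM_RDONLY) == 0);
	assert(toy_check_dynptr_arg(TOY_CONST_PTR_TO_DYNPTR, true, 0) == -EINVAL);
}
```

In this model a kfunc that only touches the memory slice would carry the
read-only flag and so accept both a stack dynptr and a callback dynptr.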

> > -
> > -					if (!is_dynptr_type_expected(env, reg,
> > -							ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> > -						bpf_log(log,
> > -							"arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> > -							i, btf_type_str(ref_t),
> > -							ref_tname);
> > -						return -EINVAL;
> > -					}
> > -
> >  					continue;
> >  				}
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 6f6d2d511c06..31c0c999448e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
> >  	return true;
> >  }
> >
> > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > -			      struct bpf_reg_state *reg)
> > +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> >  	int spi = get_spi(reg->off);
> > @@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> >  	return true;
> >  }
> >
> > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > -			     struct bpf_reg_state *reg,
> > -			     enum bpf_arg_type arg_type)
> > +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > +				    enum bpf_arg_type arg_type)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> >  	enum bpf_dynptr_type dynptr_type;
> > @@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
> >  	return 0;
> >  }
> >
> > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > +			enum bpf_arg_type arg_type, int argno,
> > +			u8 *uninit_dynptr_regno)
> > +{
>
> IMO 'process' is a bit too generic of a term. If we decide to go with this,
> what do you think about changing the name to check_func_dynptr_arg(), or just
> check_dynptr_arg()?
>

While I agree, it would then differ from the existing ones and look a bit odd
in the list (e.g. process_spin_lock, process_kptr_func, etc.). So I am not very
sure, but if you still feel it's better I don't mind.

> [...]


* Re: [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func
  2022-10-18 23:16   ` David Vernet
@ 2022-10-19  6:18     ` Kumar Kartikeya Dwivedi
  2022-10-19 16:05       ` David Vernet
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-19  6:18 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Wed, Oct 19, 2022 at 04:46:57AM IST, David Vernet wrote:
> On Tue, Oct 18, 2022 at 07:29:09PM +0530, Kumar Kartikeya Dwivedi wrote:
> > Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
> > for use in callback state, because in case of user ringbuf helpers,
> > there is no dynptr on the stack that is passed into the callback. To
> > reflect such a state, a special register type was created.
> >
> > However, some checks have been bypassed incorrectly during the addition
> > of this feature. First, for arg_type with MEM_UNINIT flag which
> > initialize a dynptr, they must be rejected for such register type.
>
> Ahhh, great point. Thanks a lot for catching this.
>
> > Secondly, in the future, there are plans to dynptr helpers that operate
> > on the dynptr itself and may change its offset and other properties.
>
> small nit: s/to dynptr helpers/to add dynptr helpers
>

Ack.

> > In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> > to such helpers, however the current code simply returns 0.
> >
> > The rejection for helpers that release the dynptr is already handled.
> >
> > For fixing this, we take a step back and rework existing code in a way
> > that will allow fitting in all classes of helpers and have a coherent
> > model for dealing with the variety of use cases in which dynptr is used.
> >
> > First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> > with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
> >
> > Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> > fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> > that the helper only operates on the memory pointed to by the dynptr,
>
> Hmmm, it feels a bit confusing to overload MEM_RDONLY like this. I
> understand the intention (which is logical) to imply that the pointer to
> the dynptr is read only, but the fact that the memory contained in the
> dynptr may not be read only will doubtless confuse people.
>
> I don't really have a better suggestion. This is the proper use of
> MEM_RDONLY, but it really feels super confusing. I guess this is
> somewhat mitigated by the fact that the read-only nature of the dynptr
> is something that will be validated at runtime?
>

Nope, both the dynptr's const-ness and the const-ness of the memory it points to
are supposed to be tracked statically. It's part of the type of the dynptr.

The second case doesn't exist yet, but will soon (with skb dynptrs abstracting
over read only __sk_buff ctx).

So what MEM_RDONLY in the argument type really means is that the helper takes a
pointer to const struct bpf_dynptr, which means it can't modify the struct
bpf_dynptr itself (so its size, offset, ptr, etc.), but that is independent of
the r/w state of what it points to.

const T *p vs T *const p

In this case it's the latter. Soon we will also support const T *const p.

Hence, MEM_RDONLY is at the argument type level, translating to reg->type, and
the read only status for the dynptr's memory slice will be part of dynptr
specific register state (dynptr.type).
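
As a plain C analogy of the two levels (just a sketch, with a made-up struct
demo_dynptr standing in for struct bpf_dynptr, not the kernel layout): taking
the struct by const pointer freezes its fields, including the data pointer,
but not the bytes that pointer refers to.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_dynptr; not the kernel layout. */
struct demo_dynptr {
	char *data;	/* memory the dynptr points to */
	size_t size;
};

/* Analogue of a MEM_RDONLY dynptr argument: the struct is const, so
 * dp->data and dp->size cannot be reassigned, but the bytes behind
 * dp->data are still writable (like T *const p).
 */
int demo_write(const struct demo_dynptr *dp, size_t off, char c)
{
	if (off >= dp->size)
		return -1;
	dp->data[off] = c;	/* allowed: mutates the pointed-to memory */
	/* dp->size = 0; would not compile: the dynptr itself is const */
	return 0;
}

/* Analogue of an argument without MEM_RDONLY: may change the view. */
int demo_advance(struct demo_dynptr *dp, size_t n)
{
	if (n > dp->size)
		return -1;
	dp->data += n;
	dp->size -= n;
	return 0;
}

/* Small self-check exercising both levels. */
void demo_run(void)
{
	char buf[4] = "abc";
	struct demo_dynptr dp = { buf, 3 };

	assert(demo_write(&dp, 1, 'X') == 0 && buf[1] == 'X');
	assert(demo_advance(&dp, 2) == 0 && dp.size == 1);
}
```

The future read-only slice case would correspond to making 'data' a
'const char *' as well, which is a separate property from the const on the
struct pointer.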

But I am open to more suggestions on how to write this stuff, if it makes the
code easier to read.

> >  [...]
> >  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> > -	int spi = get_spi(reg->off);
> > -	int i;
> > +	int spi, i;
> >
> > +	if (reg->type == CONST_PTR_TO_DYNPTR)
> > +		return false;
> > +
> > +	spi = get_spi(reg->off);
> >  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
> >  		return true;
> >
> > @@ -785,9 +803,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
> >  static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> > -	int spi = get_spi(reg->off);
> > +	int spi;
> >  	int i;
> >
> > +	/* This already represents first slot of initialized bpf_dynptr */
> > +	if (reg->type == CONST_PTR_TO_DYNPTR)
> > +		return true;
> > +
> > +	spi = get_spi(reg->off);
> >  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> >  	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
> >  		return false;
> > @@ -806,15 +829,21 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> >  	enum bpf_dynptr_type dynptr_type;
> > -	int spi = get_spi(reg->off);
> > +	int spi;
> >
> > +	/* Fold MEM_RDONLY, caller already checked it */
> > +	arg_type &= ~MEM_RDONLY;
>
> This is already done in the caller, I think it can just be removed?
>

Right, I was first doing it inside, but then I moved it out and forgot to remove
this hunk.

> >  	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
> >  	if (arg_type == ARG_PTR_TO_DYNPTR)
> >  		return true;
> >
> >  	dynptr_type = arg_to_dynptr_type(arg_type);
> > -
> > -	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > +	if (reg->type == CONST_PTR_TO_DYNPTR) {
> > +		return reg->dynptr.type == dynptr_type;
> > +	} else {
> > +		spi = get_spi(reg->off);
> > +		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > +	}
> >  }
> >
> >  /* The reg state of a pointer or a bounded scalar was saved when
> > @@ -1317,9 +1346,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
> >  	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
> >  };
> >
> > -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> > -				struct bpf_reg_state *reg);
> > -
> >  /* This helper doesn't clear reg->id */
> >  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
> >  {
> > @@ -1382,6 +1408,25 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
> >  	__mark_reg_known_zero(regs + regno);
> >  }
> >
> > +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> > +			       struct bpf_reg_state *reg2,
> > +			       enum bpf_dynptr_type type)
> > +{
> > +	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> > +	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR.
> > +	 */
>
> Meh, this is mildly confusing. Please correct me if my understanding is wrong,
> but the reason this is the case is that we only set the struct bpf_reg_state
> from the stack, whereas the actual reg itself of course has PTR_TO_STACK. If
> that's the case, can we go into just a bit more detail here in this comment
> about what's going on? It's kind of confusing that we have an actual register
> of type PTR_TO_STACK, which points to stack register state of (inconsequential)
> type CONST_PTR_TO_DYNPTR. It's also kind of weird (but also inconsequential)
> that we have dynptr.first_slot for CONST_PTR_TO_DYNPTR.
>

There are two cases which this function is called for, one is for the
spilled registers for dynptr on the stack. In that case it *is* the dynptr, so
reg->type as CONST_PTR_TO_DYNPTR is meaningless/wrong, and not checked. The type
is already part of slot_type == STACK_DYNPTR.

We reuse spilled_reg part of stack state to store info about the dynptr. We need
two spilled_regs to fully track it.

Later, we will have more owned objects on the stack (bpf_list_head, bpf_rb_root)
where you splice it out. Their handling will have to be similar.

PTR_TO_STACK points to the slots whose spilled registers we will call this
function for. That is different from the second case, i.e. for callback R1,
where it will be CONST_PTR_TO_DYNPTR. For consistency, I marked it as first_slot
because we always work using the first dynptr slot.

So to summarize:

PTR_TO_STACK points to bpf_dynptr on stack. So we store this info as 2 spilled
registers on the stack. In that case both of them are the first and second slot
of the dynptr (8-bytes each). They are the actual dynptr object.

In the second case we set the dynptr state on the reg itself, which points to
the actual dynptr object. The reference now records the information we need
about the object.

Yes, it is a bit confusing, and again, I'm open to better ideas. The
difference/confusion is mainly because of different places where state is
tracked. For the stack we track it in stack state precisely, for
CONST_PTR_TO_DYNPTR it is recorded in the pointer to dynptr object.
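
A rough model of the two lookup paths, in case it helps. This is only a sketch:
the model_* types are invented for illustration and are far simpler than
struct bpf_reg_state and the real stack state, but the branch mirrors what
is_dynptr_type_expected() does in the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for the verifier's types. */
enum model_reg_type { MODEL_PTR_TO_STACK, MODEL_CONST_PTR_TO_DYNPTR };
enum model_dynptr_type { MODEL_DYNPTR_INVALID, MODEL_DYNPTR_LOCAL, MODEL_DYNPTR_RINGBUF };

struct model_slot {
	enum model_dynptr_type dynptr_type;	/* state kept in the stack slot */
	bool first_slot;
};

struct model_reg {
	enum model_reg_type type;
	int spi;				/* slot index, PTR_TO_STACK only */
	enum model_dynptr_type dynptr_type;	/* used for CONST_PTR_TO_DYNPTR only */
};

/* Find the dynptr type a register refers to: on the register itself for
 * a const dynptr, in the stack slot it points at otherwise.
 */
enum model_dynptr_type model_dynptr_type_of(const struct model_reg *reg,
					    const struct model_slot *stack)
{
	if (reg->type == MODEL_CONST_PTR_TO_DYNPTR)
		return reg->dynptr_type;
	return stack[reg->spi].dynptr_type;
}

/* Small self-check: a callback argument needs no stack state at all. */
void model_demo(void)
{
	struct model_reg cb = { MODEL_CONST_PTR_TO_DYNPTR, 0, MODEL_DYNPTR_LOCAL };

	assert(model_dynptr_type_of(&cb, 0) == MODEL_DYNPTR_LOCAL);
}
```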

> Just my two cents as well, but even if the field isn't really used for
> anything, I would still add an additional enum bpf_reg_type parameter that sets
> this to STACK_DYNPTR, with a comment that says it's currently only used by
> CONST_PTR_TO_DYNPTR registers.
>
> > +	__mark_reg_known_zero(reg1);
> > +	reg1->type = CONST_PTR_TO_DYNPTR;
> > +	reg1->dynptr.type = type;
> > +	reg1->dynptr.first_slot = true;
> > +	if (!reg2)
> > +		return;
> > +	__mark_reg_known_zero(reg2);
> > +	reg2->type = CONST_PTR_TO_DYNPTR;
> > +	reg2->dynptr.type = type;
> > +	reg2->dynptr.first_slot = false;
> > +}
> > +
> >  static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
> >  {
> >  	if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
> > @@ -5571,19 +5616,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
> >  	return 0;
> >  }
> >
> > +/* Implementation details:
> > + *
> > + * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
> > + * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
> > + *
> > + * In both cases we deal with the first 8 bytes, but need to mark the next 8
> > + * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
> > + * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
> > + *
> > + * Mutability of bpf_dynptr is at two levels, one is at the level of struct
> > + * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
> > + * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
> > + * mutate the view of the dynptr and also possibly destroy it. In the latter
> > + * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
> > + * memory that dynptr points to.
> > + *
> > + * The verifier will keep track both levels of mutation (bpf_dynptr's in
> > + * reg->type and the memory's in reg->dynptr.type), but there is no support for
> > + * readonly dynptr view yet, hence only the first case is tracked and checked.
> > + *
> > + * This is consistent with how C applies the const modifier to a struct object,
> > + * where the pointer itself inside bpf_dynptr becomes const but not what it
> > + * points to.
> > + *
> > + * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
> > + * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
> > + */
> >  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  			enum bpf_arg_type arg_type, int argno,
> >  			u8 *uninit_dynptr_regno)
> >  {
> >  	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> >
> > -	/* We only need to check for initialized / uninitialized helper
> > -	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> > -	 * assumption is that if it is, that a helper function
> > -	 * initialized the dynptr on behalf of the BPF program.
> > +	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> > +		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> > +		return -EFAULT;
> > +	}
> > +
> > +	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
> > +	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> > +	 *
> > +	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
> > +	 *		 constructing a mutable bpf_dynptr object.
> > +	 *
> > +	 *		 Currently, this is only possible with PTR_TO_STACK
> > +	 *		 pointing to a region of atleast 16 bytes which doesn't
> > +	 *		 contain an existing bpf_dynptr.
> > +	 *
> > +	 *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
> > +	 *		 mutated or destroyed. However, the memory it points to
> > +	 *		 may be mutated.
> > +	 *
> > +	 *  None       - Points to a initialized dynptr that can be mutated and
> > +	 *		 destroyed, including mutation of the memory it points
> > +	 *		 to.
> >  	 */
> > -	if (base_type(reg->type) == PTR_TO_DYNPTR)
> > -		return 0;
> >  	if (arg_type & MEM_UNINIT) {
> >  		if (!is_dynptr_reg_valid_uninit(env, reg)) {
> >  			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> > @@ -5597,9 +5685,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> >  			return -EFAULT;
> >  		}
> > -
> >  		*uninit_dynptr_regno = regno;
> >  	} else {
> > +		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> > +		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> > +			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> > +			return -EINVAL;
> > +		}
> > +
> >  		if (!is_dynptr_reg_valid_init(env, reg)) {
> >  			verbose(env,
> >  				"Expected an initialized dynptr as arg #%d\n",
> > @@ -5607,6 +5700,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  			return -EINVAL;
> >  		}
> >
> > +		arg_type &= ~MEM_RDONLY;
> >  		if (!is_dynptr_type_expected(env, reg, arg_type)) {
> >  			const char *err_extra = "";
> >
> > @@ -5762,7 +5856,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
> >  static const struct bpf_reg_types dynptr_types = {
> >  	.types = {
> >  		PTR_TO_STACK,
> > -		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
> > +		CONST_PTR_TO_DYNPTR,
> >  	}
> >  };
> >
> > @@ -5938,12 +6032,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >  	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
> >  }
> >
> > -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > +static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> > -	int spi = get_spi(reg->off);
> > +	int spi;
> >
> > -	return state->stack[spi].spilled_ptr.id;
> > +	if (reg->type == CONST_PTR_TO_DYNPTR)
> > +		return reg->ref_obj_id;
> > +	spi = get_spi(reg->off);
> > +	return state->stack[spi].spilled_ptr.ref_obj_id;
> >  }
> >
> >  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > @@ -6007,11 +6104,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >  	if (arg_type_is_release(arg_type)) {
> >  		if (arg_type_is_dynptr(arg_type)) {
> >  			struct bpf_func_state *state = func(env, reg);
> > -			int spi = get_spi(reg->off);
> > +			int spi;
> >
> > -			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> > -			    !state->stack[spi].spilled_ptr.id) {
> > -				verbose(env, "arg %d is an unacquired reference\n", regno);
> > +			if (reg->type == PTR_TO_STACK) {
> > +				spi = get_spi(reg->off);
> > +				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> > +				    !state->stack[spi].spilled_ptr.ref_obj_id) {
> > +					verbose(env, "arg %d is an unacquired reference\n", regno);
> > +					return -EINVAL;
> > +				}
> > +			} else {
> > +				verbose(env, "cannot release unowned const bpf_dynptr\n");
> >  				return -EINVAL;
> >  			}
> >  		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
> > @@ -6946,11 +7049,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
> >  {
> >  	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
> >  	 *			  callback_ctx, u64 flags);
> > -	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
> > +	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
> >  	 */
> >  	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
> > -	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
> > -	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
> > +	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
> >  	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
> >
> >  	/* unused */
> > @@ -7328,6 +7430,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >
> >  	regs = cur_regs(env);
> >
> > +	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> > +	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
> > +	 * is safe to do.
> > +	 */
> >  	if (meta.uninit_dynptr_regno) {
> >  		/* we write BPF_DW bits (8 bytes) at a time */
> >  		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
> > @@ -7346,6 +7452,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >
> >  	if (meta.release_regno) {
> >  		err = -EINVAL;
> > +		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> > +		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
> > +		 * is safe to do.
> > +		 */
> >  		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
> >  			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
> >  		else if (meta.ref_obj_id)
> > @@ -7428,11 +7538,10 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >  					return -EFAULT;
> >  				}
> >
> > -				if (base_type(reg->type) != PTR_TO_DYNPTR)
> > -					/* Find the id of the dynptr we're
> > -					 * tracking the reference of
> > -					 */
> > -					meta.ref_obj_id = stack_slot_get_id(env, reg);
> > +				/* Find the id of the dynptr we're
> > +				 * tracking the reference of
> > +				 */
>
> I think this can be brought onto one line now.
>

Ack.

> > +				meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
> >  				break;
> >  			}
> >  		}
>
> [...]
>
> Overall this looks great. Thanks again for working on this. I'd love to hear
> Andrii and/or Joanne's thoughts, but overall this looks good and like a solid
> improvement (both in terms of fixing 205715673844 ("bpf: Add
> bpf_user_ringbuf_drain() helper"), and in terms of the right direction for
> dynptrs architecturally).
>

Thanks for the reviews!


* Re: [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  2022-10-18 21:38   ` sdf
@ 2022-10-19  6:19     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-19  6:19 UTC (permalink / raw)
  To: sdf
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Wed, Oct 19, 2022 at 03:08:21AM IST, sdf@google.com wrote:
> On 10/18, Kumar Kartikeya Dwivedi wrote:
> > Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM, and
> > RET_PTR_TO_ALLOC_MEM_OR_NULL, however the former is confusingly named to
> > imply that it carries MEM_ALLOC, while only the latter does. This causes
> > confusion during code review leading to conclusions like that the return
> > value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is RET_PTR_TO_ALLOC_MEM |
> > PTR_MAYBE_NULL) may be consumable by bpf_ringbuf_{submit,commit}.
>
> > Rename it to make it clear MEM_ALLOC needs to be tacked on top of
> > RET_PTR_TO_MEM.
>
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >   include/linux/bpf.h   | 6 +++---
> >   kernel/bpf/verifier.c | 2 +-
> >   2 files changed, 4 insertions(+), 4 deletions(-)
>
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 13c6ff2de540..834276ba56c9 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -538,7 +538,7 @@ enum bpf_return_type {
> >   	RET_PTR_TO_SOCKET,		/* returns a pointer to a socket */
> >   	RET_PTR_TO_TCP_SOCK,		/* returns a pointer to a tcp_sock */
> >   	RET_PTR_TO_SOCK_COMMON,		/* returns a pointer to a sock_common */
> > -	RET_PTR_TO_ALLOC_MEM,		/* returns a pointer to dynamically allocated memory */
> > +	RET_PTR_TO_MEM,			/* returns a pointer to dynamically allocated memory */
>
> What about the comment? It still says that it's a pointer to a
> dynamically allocated memory :-/ Does it make sense to clarify it as
> well?
>

Argh, right, I will change that. Thanks for spotting it!


* Re: [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off
  2022-10-18 21:55   ` sdf
@ 2022-10-19  6:24     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-19  6:24 UTC (permalink / raw)
  To: sdf
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Wed, Oct 19, 2022 at 03:25:21AM IST, sdf@google.com wrote:
> On 10/18, Kumar Kartikeya Dwivedi wrote:
> > While check_func_arg_reg_off is the place which performs generic checks
> > needed by various candidates of reg->type, there is some handling for
> > special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> > ARG_PTR_TO_ALLOC_MEM.
>
> > This commit aims to streamline these special cases and instead leave
> > other things up to argument type specific code to handle.
>
> > This is done primarily for two reasons: associating back reg->type to
> > its argument leaves room for the list getting out of sync when a new
> > reg->type is supported by an arg_type.
>
> > The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> > we already handle, whenever a release argument is expected, it should
> > be passed as the pointer that was received from the acquire function.
> > Hence zero fixed and variable offset.
>
> > There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> > its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> > with non-zero offset to other helper functions, which makes sense.
>
> > Hence, lift the arg_type_is_release check for reg->off and cover all
> > possible register types, instead of duplicating the same kind of check
> > twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
>
> > Finally, for the release argument, arg_type_is_dynptr is the special
> > case, where we go to actual object being freed through the dynptr, so
> > the offset of the pointer still needs to allow fixed and variable offset
> > and process_dynptr_func will verify them later for the release argument
> > case as well.
>
> > Finally, since check_func_arg_reg_off is meant to be generic, move
> > dynptr specific check into process_dynptr_func.
>
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >   kernel/bpf/verifier.c                         | 55 +++++++++++++++----
> >   .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
> >   2 files changed, 44 insertions(+), 13 deletions(-)
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a49b95c1af1b..a8c277e51d63 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5654,6 +5654,14 @@ int process_dynptr_func(struct bpf_verifier_env
> > *env, int regno,
> >   		return -EFAULT;
> >   	}
>
> > +	/* CONST_PTR_TO_DYNPTR has fixed and variable offset as zero, ensured by
> > +	 * check_func_arg_reg_off, so this is only needed for PTR_TO_STACK.
> > +	 */
> > +	if (reg->off % BPF_REG_SIZE) {
> > +		verbose(env, "cannot pass in dynptr at an offset\n");
> > +		return -EINVAL;
> > +	}
>
> This is what I'm missing here and in the original code as well, maybe you
> can clarify?
>
> "if (reg->off % BPF_REG_SIZE)" here vs "if (reg->off)" below. What's the
> difference?
>

The second one happens earlier, in check_func_arg_reg_off; this check happens
later.

Usually, when we have release arguments, we want the pointer to the object to
be unmodified, so the fixed and variable offsets must be 0. The checks in
check_func_arg_reg_off ensure that. But for dynptr release functions,
PTR_TO_STACK points to the dynptr object on the stack, which has to be released.

In this case fp will have some fixed offset, so we make an exception for it and
fall back to the normal checks for PTR_TO_STACK.

Later, when we come here, we reach the function for two kinds of registers,
CONST_PTR_TO_DYNPTR and PTR_TO_STACK. For PTR_TO_STACK, reg->off must be 8-byte
aligned, since we want to find the stack slot index (each slot is 8 bytes) of
the dynptr to operate on it.

For CONST_PTR_TO_DYNPTR it directly points to dynptr with 0 offset, which
check_func_arg_reg_off already ensures for it.

Note that this reg->off check is actually broken; the correct one is in patch 6,
which takes the variable offset into account.

You can consider check_func_arg_reg_off to only do high level checks which are
common for all helpers, and later processing builds upon those guarantees and
does further checking.
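
To make the slot arithmetic above concrete, here is a minimal standalone sketch
in plain C. It is not the verifier code: the sketch_* names are hypothetical,
the offset-to-slot formula is assumed to match the verifier's get_spi(), and
only the fixed offset is modelled (no variable offset, which patch 6 handles).

```c
#include <stdbool.h>

#define BPF_REG_SIZE 8          /* bytes per stack slot */
#define BPF_DYNPTR_NR_SLOTS 2   /* a 16-byte bpf_dynptr spans two slots */

/* Assumed to mirror get_spi(): fp-8 -> slot 0, fp-16 -> slot 1, ... */
static int sketch_get_spi(int off)
{
	return (-off - 1) / BPF_REG_SIZE;
}

/* Hypothetical check: can a fixed stack offset address a dynptr? */
static bool sketch_dynptr_off_ok(int off, int allocated_slots)
{
	int spi;

	/* Must lie below fp and sit exactly on a slot boundary. */
	if (off >= 0 || off % BPF_REG_SIZE)
		return false;

	spi = sketch_get_spi(off);
	/* The dynptr occupies slots spi and spi - 1; both must exist. */
	return spi - BPF_DYNPTR_NR_SLOTS + 1 >= 0 && spi < allocated_slots;
}
```

With 16 bytes of allocated stack (two slots), fp-16 is accepted, while fp-12
(misaligned) and fp-8 (only one slot available) are rejected.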


* Re: [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-19  6:04     ` Kumar Kartikeya Dwivedi
@ 2022-10-19 15:26       ` David Vernet
  0 siblings, 0 replies; 54+ messages in thread
From: David Vernet @ 2022-10-19 15:26 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Wed, Oct 19, 2022 at 11:34:12AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Wed, Oct 19, 2022 at 01:15:37AM IST, David Vernet wrote:
> > On Tue, Oct 18, 2022 at 07:29:08PM +0530, Kumar Kartikeya Dwivedi wrote:
> >
> > Hey Kumar, thanks for looking at this stuff.
> >
> > > ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> > > the underlying register type is subjected to more special checks to
> > > determine the type of object represented by the pointer and its state
> > > consistency.
> > >
> > > Move dynptr checks to their own 'process_dynptr_func' function so that
> > > is consistent and in-line with existing code. This also makes it easier
> > > to reuse this code for kfunc handling.
> >
> > Just out of curiosity, do you have a specific use case for when you'd envision
> > a kfunc taking a dynptr? I'm not saying there are none, just curious if you
> > have any specifically that you've considered.
> >
> 
> There is already a kfunc that takes dynptrs, bpf_verify_pkcs7_signature. I am
> sure we'll get more in the future.

Ah, ok, that's why the negative selftest you removed called that kfunc
with a ringbuf dynptr.

> > > To this end, remove the dependency on bpf_call_arg_meta parameter by
> > > instead taking the uninit_dynptr_regno by pointer. This is only needed
> > > to be set to a valid pointer when arg_type has MEM_UNINIT.
> > >
> > > Then, reuse this consolidated function in kfunc dynptr handling too.
> > > Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> > > been lifted.
> > >
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  include/linux/bpf_verifier.h                  |   8 +-
> > >  kernel/bpf/btf.c                              |  17 +--
> > >  kernel/bpf/verifier.c                         | 115 ++++++++++--------
> > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
> > >  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
> > >  5 files changed, 69 insertions(+), 88 deletions(-)
> > >
> > > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > > index 9e1e6965f407..a33683e0618b 100644
> > > --- a/include/linux/bpf_verifier.h
> > > +++ b/include/linux/bpf_verifier.h
> > > @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
> > >  			     u32 regno);
> > >  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > >  		   u32 regno, u32 mem_size);
> > > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > > -			      struct bpf_reg_state *reg);
> > > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > > -			     struct bpf_reg_state *reg,
> > > -			     enum bpf_arg_type arg_type);
> > > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > > +			enum bpf_arg_type arg_type, int argno,
> > > +			u8 *uninit_dynptr_regno);
> > >
> > >  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
> > >  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index eba603cec2c5..1827d889e08a 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> > >  						return -EINVAL;
> > >  					}
> > >
> > > -					if (!is_dynptr_reg_valid_init(env, reg)) {
> > > -						bpf_log(log,
> > > -							"arg#%d pointer type %s %s must be valid and initialized\n",
> > > -							i, btf_type_str(ref_t),
> > > -							ref_tname);
> > > +					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
> > >  						return -EINVAL;
> > > -					}
> >
> > Could you please clarify why you're removing the DYNPTR_TYPE_LOCAL constraint
> > for kfuncs?
> >
> > You seemed to have removed the following negative selftest:
> >
> > > -SEC("?lsm.s/bpf")
> > > -int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
> > > -	     unsigned int size)
> > > -{
> > > -	char write_data[64] = "hello there, world!!";
> > > -	struct bpf_dynptr ptr;
> > > -
> > > -	bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
> > > -
> > > -	return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
> > > -}
> > > -
> >
> > But it was clearly the intention of the test validate that we can't pass a
> > dynptr to a ringbuf region to this kfunc, so I'm curious what's changed since
> > that test was added.
> >
> 
> There was no inherent limitation for just accepting local dynptrs, it's that
> when this was added I suggested sticking to one kind back then, because of the
> code divergence between kfunc argument checking and helper argument checking.
> 
> Now that both share the same code, it's easier to handle everything one place
> and make it work everywhere the same way.
> 
> Also, next patch adds a very clear distinction between argument type which only
> operates on the dynamically sized memory slice and ones which may also modify
> dynptr, which also makes it easier to support things for kfuncs by setting
> MEM_RDONLY.

Makes sense, thanks for clarifying.

> > > -
> > > -					if (!is_dynptr_type_expected(env, reg,
> > > -							ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> > > -						bpf_log(log,
> > > -							"arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> > > -							i, btf_type_str(ref_t),
> > > -							ref_tname);
> > > -						return -EINVAL;
> > > -					}
> > > -
> > >  					continue;
> > >  				}
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 6f6d2d511c06..31c0c999448e 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
> > >  	return true;
> > >  }
> > >
> > > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > > -			      struct bpf_reg_state *reg)
> > > +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > >  {
> > >  	struct bpf_func_state *state = func(env, reg);
> > >  	int spi = get_spi(reg->off);
> > > @@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > >  	return true;
> > >  }
> > >
> > > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > > -			     struct bpf_reg_state *reg,
> > > -			     enum bpf_arg_type arg_type)
> > > +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > > +				    enum bpf_arg_type arg_type)
> > >  {
> > >  	struct bpf_func_state *state = func(env, reg);
> > >  	enum bpf_dynptr_type dynptr_type;
> > > @@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
> > >  	return 0;
> > >  }
> > >
> > > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > > +			enum bpf_arg_type arg_type, int argno,
> > > +			u8 *uninit_dynptr_regno)
> > > +{
> >
> > IMO 'process' is a bit too generic of a term. If we decide to go with this,
> > what do you think about changing the name to check_func_dynptr_arg(), or just
> > check_dynptr_arg()?
> >
> 
> While I agree, then it would be different from the existing ones and look a bit
> odd in the list (e.g. process_spin_lock, process_kptr_func, etc.). So I am not
> very sure, but if you still feel it's better I don't mind.

Uniformity should trump my own personal preferences. We can stick with
process_dynptr_func().

LGTM, thanks for answering my questions.

Acked-by: David Vernet <void@manifault.com>


* Re: [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func
  2022-10-19  6:18     ` Kumar Kartikeya Dwivedi
@ 2022-10-19 16:05       ` David Vernet
  2022-10-20  1:09         ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: David Vernet @ 2022-10-19 16:05 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Wed, Oct 19, 2022 at 11:48:21AM +0530, Kumar Kartikeya Dwivedi wrote:

[...]

> > > In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> > > to such helpers, however the current code simply returns 0.
> > >
> > > The rejection for helpers that release the dynptr is already handled.
> > >
> > > For fixing this, we take a step back and rework existing code in a way
> > > that will allow fitting in all classes of helpers and have a coherent
> > > model for dealing with the variety of use cases in which dynptr is used.
> > >
> > > First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> > > with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
> > >
> > > Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> > > fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> > > that the helper only operates on the memory pointed to by the dynptr,
> >
> > Hmmm, it feels a bit confusing to overload MEM_RDONLY like this. I
> > understand the intention (which is logical) to imply that the pointer to
> > the dynptr is read only, but the fact that the memory contained in the
> > dynptr may not be read only will doubtless confuse people.
> >
> > I don't really have a better suggestion. This is the proper use of
> > MEM_RDONLY, but it really feels super confusing. I guess this is
> > somewhat mitigated by the fact that the read-only nature of the dynptr
> > is something that will be validated at runtime?
> >
> 
> Nope, both dynptr's const-ness and const-ness of the memory it points to are
> supposed to be tracked statically. It's part of the type of the dynptr.

Could you please clarify what you're "noping" here? The dynptr being
read-only is tracked statically, but based on the discussion in the
thread at [0] I thought the plan was to enforce this property at
runtime. Am I wrong about that?

[0]: https://lore.kernel.org/bpf/CAJnrk1Y0r3++RLpT2jvp4st-79x3dUYk3uP-4tfnAeL5_kgM0Q@mail.gmail.com/

My point was just that it might be harder to confuse
CONST_PTR_TO_DYNPTR | MEM_RDONLY with the memory contained in the dynptr
region if there's a separate field inside the dynptr itself which tracks
whether that region is R/O. I'm mostly just thinking out loud -- as I
said in the last email, I think using MEM_RDONLY as you are is logical.

> The second case doesn't exist yet, but will soon (with skb dynptrs abstracting
> over read only __sk_buff ctx).
> 
> So what MEM_RDONLY in argument type really means is that I take a pointer to
> const struct bpf_dynptr, which means I can't modify the struct bpf_dynptr itself
> (so it's size, offset, ptr, etc.), but that is independent of r/w state of what
> it points to.
>
> const T *p vs T *const p

Right, I understand the intention of the patch (which was why I said it
was a logical choice) and the distinction between the two variants of
const. My point was that at first glance, someone who's not a verifier
expert who's trying to understand all of this to enable their writing of
a BPF program may be thrown off by seeing "PTR_TO_DYNPTR | RDONLY".
Hopefully that's something we can address by adequately documenting
helpers, and in any case, it's certainly not an argument against your
overall approach.
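
For anyone following along, the two levels of const can be sketched in plain
C. This is only an illustration: struct sketch_dynptr and the sketch_*
functions are hypothetical stand-ins, not the real struct bpf_dynptr layout or
any existing helper.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct bpf_dynptr. */
struct sketch_dynptr {
	char *data;
	size_t size;
};

/* Pointer to const dynptr: the view (data, size) cannot be changed here,
 * but the bytes it points to can still be written.
 */
static int sketch_write_byte(const struct sketch_dynptr *p, size_t off, char c)
{
	if (off >= p->size)
		return -1;
	/* Through a const struct, p->data is 'char *const', not 'const char *'. */
	p->data[off] = c;
	return 0;
}

/* Pointer to mutable dynptr: the view itself may be changed. */
static int sketch_trim(struct sketch_dynptr *p, size_t n)
{
	if (n > p->size)
		return -1;
	p->size -= n;
	return 0;
}
```

The first function corresponds to a helper whose dynptr argument carries
MEM_RDONLY, the second to one that does not; assigning to p->size inside
sketch_write_byte() would not compile.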

Also, I think it will end up being more clear if and when we have e.g.
a helper that takes a CONST_PTR_TO_DYNPTR | MEM_RDONLY dynptr, and
returns e.g. an R/O PTR_TO_MEM | MEM_RDONLY pointer to its backing
memory.

Anyways, at the end of the day this is really all implementation details
of the verifier and BPF internals, so I digress...

> In this case it's the latter. Soon we will also support const T *const p.
> 
> Hence, MEM_RDONLY is at the argument type level, translating to reg->type, and
> the read only status for the dynptr's memory slice will be part of dynptr
> specific register state (dynptr.type).
> 
> But I am open to more suggestions on how to write this stuff, if it makes the
> code easier to read.

I think what you have makes sense and IMO is the cleanest way to express
all of this.

The only thing that I'm now wondering after sleeping on this is whether
it's really necessary to rename the register type to CONST_PTR_TO_DYNPTR.
We're already restricting that it always be called with MEM_RDONLY. Are
we _100%_ sure that it will always be fully static whether a dynptr is
R/O? I know that Joanne said probably yes in [1], but it feels perhaps
unnecessarily restrictive to codify that by making the register type
CONST_PTR_TO_DYNPTR. Why not just make it PTR_TO_DYNPTR and keep the
verifications you added in this patch that it's always specified with
MEM_RDONLY, and then if we ever change our minds and later decide to add
helpers that can change the access permissions on the dynptr, it will
just be a matter of changing our expectations around the presence of
that MEM_RDONLY modifier?

[1]: https://lore.kernel.org/bpf/CAJnrk1Zmne1uDn8EKdNKJe6O-k_moU9Sryfws_J-TF2BvX2QMg@mail.gmail.com/

[...]

> > >  	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
> > >  	if (arg_type == ARG_PTR_TO_DYNPTR)
> > >  		return true;
> > >
> > >  	dynptr_type = arg_to_dynptr_type(arg_type);
> > > -
> > > -	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > > +	if (reg->type == CONST_PTR_TO_DYNPTR) {
> > > +		return reg->dynptr.type == dynptr_type;
> > > +	} else {
> > > +		spi = get_spi(reg->off);
> > > +		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > > +	}
> > >  }
> > >
> > >  /* The reg state of a pointer or a bounded scalar was saved when
> > > @@ -1317,9 +1346,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
> > >  	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
> > >  };
> > >
> > > -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> > > -				struct bpf_reg_state *reg);
> > > -
> > >  /* This helper doesn't clear reg->id */
> > >  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
> > >  {
> > > @@ -1382,6 +1408,25 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
> > >  	__mark_reg_known_zero(regs + regno);
> > >  }
> > >
> > > +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> > > +			       struct bpf_reg_state *reg2,
> > > +			       enum bpf_dynptr_type type)
> > > +{
> > > +	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> > > +	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR.
> > > +	 */
> >
> > Meh, this is mildly confusing. Please correct me if my understanding is wrong,
> > but the reason this is the case is that we only set the struct bpf_reg_state
> > from the stack, whereas the actual reg itself of course has PTR_TO_STACK. If
> > that's the case, can we go into just a bit more detail here in this comment
> > about what's going on? It's kind of confusing that we have an actual register
> > of type PTR_TO_STACK, which points to stack register state of (inconsequential)
> > type CONST_PTR_TO_DYNPTR. It's also kind of weird (but also inconsequential)
> > that we have dynptr.first_slot for CONST_PTR_TO_DYNPTR.
> >
> 
> There are two cases which this function is called for, one is for the
> spilled registers for dynptr on the stack. In that case it *is* the dynptr, so
> reg->type as CONST_PTR_TO_DYNPTR is meaningless/wrong, and not checked. The type
> is already part of slot_type == STACK_DYNPTR.

Ok, thanks for confirming my understanding.

> We reuse spilled_reg part of stack state to store info about the dynptr. We need
> two spilled_regs to fully track it.
> 
> Later, we will have more owned objects on the stack (bpf_list_head, bpf_rb_root)
> where you splice it out. Their handling will have to be similar.
> 
> PTR_TO_STACK points to the slots whose spilled registers we will call this
> function for. That is different from the second case, i.e. for callback R1,
> where it will be CONST_PTR_TO_DYNPTR. For consistency, I marked it as first_slot
> because we always work using the first dynptr slot.
> 
> So to summarize:
> 
> PTR_TO_STACK points to bpf_dynptr on stack. So we store this info as 2 spilled
> registers on the stack. In that case both of them are the first and second slot
> of the dynptr (8-bytes each). They are the actual dynptr object.
> 
> In second case we set dynptr state on the reg itself, which points to actual
> dynptr object. The reference now records the information we need about the
> object.
> 
> Yes, it is a bit confusing, and again, I'm open to better ideas. The
> difference/confusion is mainly because of different places where state is
> tracked. For the stack we track it in stack state precisely, for
> CONST_PTR_TO_DYNPTR it is recorded in the pointer to dynptr object.

Thanks for clarifying, then my initial understanding was correct. If
that's the case, what do you think about this suggestion to make the
code a bit more consistent:

> > Just my two cents as well, but even if the field isn't really used for
> > anything, I would still add an additional enum bpf_reg_type parameter that sets
> > this to STACK_DYNPTR, with a comment that says it's currently only used by
> > CONST_PTR_TO_DYNPTR registers.

I would rather reg->type be _unused_ for the dynptr in the spilled
registers on the stack, than be both unused and meaningless/wrong (as
you put it).

[...]

Thanks,
David


* Re: [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  2022-10-18 13:59 ` [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
@ 2022-10-19 16:59   ` David Vernet
  0 siblings, 0 replies; 54+ messages in thread
From: David Vernet @ 2022-10-19 16:59 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Oct 18, 2022 at 07:29:16PM +0530, Kumar Kartikeya Dwivedi wrote:
> The original support for bpf_user_ringbuf_drain callbacks simply
> short-circuited checks for the dynptr state, allowing users to pass
> PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
> dynptr. This bug would have also surfaced with other dynptr helpers in
> the future that changed dynptr view or modified it in some way.
> 
> Include test cases for all cases, i.e. both bpf_dynptr_from_mem and
> bpf_ringbuf_reserve_dynptr, and ensure verifier rejects both of them.
> Without the fix, both of these programs load and pass verification.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---

LGTM

Acked-by: David Vernet <void@manifault.com>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-18 13:59 ` [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR Kumar Kartikeya Dwivedi
@ 2022-10-19 18:52   ` Alexei Starovoitov
  2022-10-20  1:04     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Alexei Starovoitov @ 2022-10-19 18:52 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, the dynptr function is not checking the variable offset part
> of PTR_TO_STACK that it needs to check. The fixed offset is considered
> when computing the stack pointer index, but if the variable offset was
> not a constant (such that it could not be accumulated in reg->off), we
> will end up with a discrepancy where the runtime pointer does not point to the
> actual stack slot we mark as STACK_DYNPTR.
>
> It is impossible to precisely track dynptr state when variable offset is
> not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> simply reject the case where reg->var_off is not constant. Then,
> consider both reg->off and reg->var_off.value when computing the stack
> pointer index.
>
> A new helper dynptr_get_spi is introduced to hide these details
> since the dynptr needs to be located in multiple places outside the
> process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> need to enforce these checks in all places.
>
> Note that it is disallowed for unprivileged users to have a non-constant
> var_off, so this problem should only be possible to trigger from
> programs having CAP_PERFMON. However, its effects can vary.
>
> Without the fix, it is possible to replace the contents of the dynptr
> arbitrarily by making verifier mark different stack slots than actual
> location and then doing writes to the actual stack address of dynptr at
> runtime.
>
> Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
>  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
>  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
>  3 files changed, 67 insertions(+), 21 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 8f667180f70f..0fd73f96c5e2 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
>                 verbose(env, "D");
>  }
>
> -static int get_spi(s32 off)
> +static int __get_spi(s32 off)
>  {
>         return (-off - 1) / BPF_REG_SIZE;
>  }
>
> +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +{
> +       int spi;
> +
> +       if (reg->off % BPF_REG_SIZE) {
> +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> +               return -EINVAL;
> +       }

I think this cannot happen.

> +       if (!tnum_is_const(reg->var_off)) {
> +               verbose(env, "dynptr has to be at the constant offset\n");
> +               return -EINVAL;
> +       }

This part can.

> +       spi = __get_spi(reg->off + reg->var_off.value);
> +       if (spi < 1) {
> +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> +                       (int)(reg->off + reg->var_off.value));
> +               return -EINVAL;
> +       }
> +       return spi;
> +}

This one is a more conservative (read: redundant) check.
is_spi_bounds_valid() does it better.
How about keeping get_spi(reg) error-free and using it
directly in places where it cannot fail, without a
defensive WARN_ON_ONCE?
int get_spi(reg)
{ return (-reg->off - reg->var_off.value - 1) / BPF_REG_SIZE; }

And moving the tnum_is_const() check into is_spi_bounds_valid()?

Like is_spi_bounds_valid(state, reg, spi)?

We should probably remove BPF_DYNPTR_NR_SLOTS since
there are so many other places where dynptr is assumed
to be 16-bytes. That macro doesn't help at all.
It only causes confusion.

I guess we can replace is_spi_bounds_valid() with a different
helper that checks and computes spi.
Like get_spi_and_check(state, reg, &spi)
and use it in places where we have get_spi + is_spi_bounds_valid
while using unchecked get_spi where it cannot fail?

If we only have get_spi_and_check() we'd have to add
WARN_ON_ONCE in a few places and that bothers me...
due to defensive programming...
If code is so complex that we cannot think it through
we have to refactor it. Sprinkling WARN_ON_ONCE (just to be sure)
doesn't inspire confidence.
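
To make the suggestion concrete, here is a minimal standalone sketch
(plain C, outside the kernel) of an unchecked get_spi() paired with a
checking helper. The struct layouts are simplified stand-ins, and
get_spi_and_check() with this signature is only the hypothetical name
floated above, not existing verifier code. It assumes a dynptr always
occupies two 8-byte slots.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define BPF_REG_SIZE 8
#define DYNPTR_NR_SLOTS 2 /* a dynptr is 16 bytes, i.e. two 8-byte slots */

/* Simplified stand-ins for the verifier types (assumption). */
struct tnum { uint64_t value; uint64_t mask; };
struct reg_state { int32_t off; struct tnum var_off; };
struct func_state { int allocated_stack; };

/* A tnum is constant when no bit is unknown. */
static bool tnum_is_const(struct tnum t)
{
	return !t.mask;
}

/* Unchecked: only meaningful once var_off is known to be constant. */
static int get_spi(const struct reg_state *reg)
{
	return (-reg->off - (int)reg->var_off.value - 1) / BPF_REG_SIZE;
}

/* Checked variant: rejects non-constant var_off and out-of-bounds slots. */
static int get_spi_and_check(const struct func_state *state,
			     const struct reg_state *reg, int *spi)
{
	int allocated_slots = state->allocated_stack / BPF_REG_SIZE;
	int s;

	if (!tnum_is_const(reg->var_off))
		return -EINVAL;
	s = get_spi(reg);
	/* Slots [s - 1, s] must both lie inside the allocated stack. */
	if (s - DYNPTR_NR_SLOTS + 1 < 0 || s >= allocated_slots)
		return -EINVAL;
	*spi = s;
	return 0;
}
```

With this split, call sites that have already validated the register
can use get_spi() directly, and only the entry points need the checked
variant.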

> +
>  static bool is_spi_bounds_valid(struct bpf_func_state *state, int spi, int nr_slots)
>  {
>         int allocated_slots = state->allocated_stack / BPF_REG_SIZE;
> @@ -725,7 +748,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>         enum bpf_dynptr_type type;
>         int spi, i, id;
>
> -       spi = get_spi(reg->off);
> +       spi = dynptr_get_spi(env, reg);
> +       if (spi < 0)
> +               return spi;
>
>         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
>                 return -EINVAL;

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
  2022-10-18 19:45   ` David Vernet
@ 2022-10-19 22:59   ` Joanne Koong
  2022-10-20  0:55     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 54+ messages in thread
From: Joanne Koong @ 2022-10-19 22:59 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> the underlying register type is subjected to more special checks to
> determine the type of object represented by the pointer and its state
> consistency.
>
> Move dynptr checks to their own 'process_dynptr_func' function so that
> it is consistent and in line with existing code. This also makes it easier
> to reuse this code for kfunc handling.
>
> To this end, remove the dependency on bpf_call_arg_meta parameter by
> instead taking the uninit_dynptr_regno by pointer. This is only needed
> to be set to a valid pointer when arg_type has MEM_UNINIT.
>
> Then, reuse this consolidated function in kfunc dynptr handling too.
> Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> been lifted.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf_verifier.h                  |   8 +-
>  kernel/bpf/btf.c                              |  17 +--
>  kernel/bpf/verifier.c                         | 115 ++++++++++--------
>  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
>  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
>  5 files changed, 69 insertions(+), 88 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 9e1e6965f407..a33683e0618b 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
>                              u32 regno);
>  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>                    u32 regno, u32 mem_size);
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -                             struct bpf_reg_state *reg);
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -                            struct bpf_reg_state *reg,
> -                            enum bpf_arg_type arg_type);
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +                       enum bpf_arg_type arg_type, int argno,
> +                       u8 *uninit_dynptr_regno);
>
>  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
>  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index eba603cec2c5..1827d889e08a 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
>                                                 return -EINVAL;
>                                         }
>
> -                                       if (!is_dynptr_reg_valid_init(env, reg)) {
> -                                               bpf_log(log,
> -                                                       "arg#%d pointer type %s %s must be valid and initialized\n",
> -                                                       i, btf_type_str(ref_t),
> -                                                       ref_tname);
> +                                       if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))

I think it'd be helpful to add a bpf_log statement here that this failed

>                                                 return -EINVAL;
> -                                       }
> -
> -                                       if (!is_dynptr_type_expected(env, reg,
> -                                                       ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> -                                               bpf_log(log,
> -                                                       "arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> -                                                       i, btf_type_str(ref_t),
> -                                                       ref_tname);
> -                                               return -EINVAL;
> -                                       }
> -
>                                         continue;
>                                 }
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 6f6d2d511c06..31c0c999448e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>         return true;
>  }
>
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -                             struct bpf_reg_state *reg)
> +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
>         int spi = get_spi(reg->off);
> @@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
>         return true;
>  }
>
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -                            struct bpf_reg_state *reg,
> -                            enum bpf_arg_type arg_type)
> +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> +                                   enum bpf_arg_type arg_type)
>  {
>         struct bpf_func_state *state = func(env, reg);
>         enum bpf_dynptr_type dynptr_type;
> @@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>         return 0;
>  }
>
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +                       enum bpf_arg_type arg_type, int argno,

Do we need both regno and argno given that regno is always argno +
BPF_REG_1 and in this function we only use the argno param for "argno
+ 1"? I think we could just pass in regno.
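
A small standalone sketch of that relationship: since argument
registers start at BPF_REG_1, the 1-based argument number printed in
the error messages can be recovered from regno alone. The helper name
arg_number_from_regno() is made up for illustration; the enum only
mirrors the first few BPF register numbers.

```c
/* Mirrors the start of the BPF register numbering (R0 holds the return value). */
enum { BPF_REG_0 = 0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5 };

/* Hypothetical helper: 1-based argument number for messages like "arg #%d". */
static int arg_number_from_regno(int regno)
{
	int argno = regno - BPF_REG_1;	/* 0-based argument index */

	return argno + 1;		/* the value the verbose() calls print */
}
```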

> +                       u8 *uninit_dynptr_regno)

nit: this is personal preference, but I think it looks cleaner passing
"struct bpf_call_arg_meta *meta" here instead of "u8
*uninit_dynptr_regno".

> +{
> +       struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> +
> +       /* We only need to check for initialized / uninitialized helper
> +        * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> +        * assumption is that if it is, that a helper function
> +        * initialized the dynptr on behalf of the BPF program.
> +        */
> +       if (base_type(reg->type) == PTR_TO_DYNPTR)
> +               return 0;
> +       if (arg_type & MEM_UNINIT) {
> +               if (!is_dynptr_reg_valid_uninit(env, reg)) {
> +                       verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> +                       return -EINVAL;
> +               }
> +
> +               /* We only support one dynptr being uninitialized at the moment,
> +                * which is sufficient for the helper functions we have right now.
> +                */
> +               if (*uninit_dynptr_regno) {
> +                       verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> +                       return -EFAULT;
> +               }
> +
> +               *uninit_dynptr_regno = regno;
> +       } else {
> +               if (!is_dynptr_reg_valid_init(env, reg)) {
> +                       verbose(env,
> +                               "Expected an initialized dynptr as arg #%d\n",
> +                               argno + 1);
> +                       return -EINVAL;
> +               }
> +
> +               if (!is_dynptr_type_expected(env, reg, arg_type)) {
> +                       const char *err_extra = "";
> +
> +                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> +                       case DYNPTR_TYPE_LOCAL:
> +                               err_extra = "local";
> +                               break;
> +                       case DYNPTR_TYPE_RINGBUF:
> +                               err_extra = "ringbuf";
> +                               break;
> +                       default:
> +                               err_extra = "<unknown>";
> +                               break;
> +                       }
> +                       verbose(env,
> +                               "Expected a dynptr of type %s as arg #%d\n",
> +                               err_extra, argno + 1);
> +                       return -EINVAL;
> +               }
> +       }
> +       return 0;
> +}
> +
>  static bool arg_type_is_mem_size(enum bpf_arg_type type)
>  {
>         return type == ARG_CONST_SIZE ||
> @@ -6086,52 +6143,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                 err = check_mem_size_reg(env, reg, regno, true, meta);
>                 break;
>         case ARG_PTR_TO_DYNPTR:
> -               /* We only need to check for initialized / uninitialized helper
> -                * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> -                * assumption is that if it is, that a helper function
> -                * initialized the dynptr on behalf of the BPF program.
> -                */
> -               if (base_type(reg->type) == PTR_TO_DYNPTR)
> -                       break;
> -               if (arg_type & MEM_UNINIT) {
> -                       if (!is_dynptr_reg_valid_uninit(env, reg)) {
> -                               verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> -                               return -EINVAL;
> -                       }
> -
> -                       /* We only support one dynptr being uninitialized at the moment,
> -                        * which is sufficient for the helper functions we have right now.
> -                        */
> -                       if (meta->uninit_dynptr_regno) {
> -                               verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> -                               return -EFAULT;
> -                       }
> -
> -                       meta->uninit_dynptr_regno = regno;
> -               } else if (!is_dynptr_reg_valid_init(env, reg)) {
> -                       verbose(env,
> -                               "Expected an initialized dynptr as arg #%d\n",
> -                               arg + 1);
> -                       return -EINVAL;
> -               } else if (!is_dynptr_type_expected(env, reg, arg_type)) {
> -                       const char *err_extra = "";
> -
> -                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> -                       case DYNPTR_TYPE_LOCAL:
> -                               err_extra = "local";
> -                               break;
> -                       case DYNPTR_TYPE_RINGBUF:
> -                               err_extra = "ringbuf";
> -                               break;
> -                       default:
> -                               err_extra = "<unknown>";
> -                               break;
> -                       }
> -                       verbose(env,
> -                               "Expected a dynptr of type %s as arg #%d\n",
> -                               err_extra, arg + 1);
> -                       return -EINVAL;
> -               }
> +               if (process_dynptr_func(env, regno, arg_type, arg, &meta->uninit_dynptr_regno))
> +                       return -EACCES;

process_dynptr_func could return -EFAULT so I think we should do "err
= process_dynptr_func(...)" here instead.
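
A minimal sketch of the difference, using a stub in place of
process_dynptr_func() (process_stub() and both check_arg_*() names are
hypothetical): the first caller collapses every failure into -EACCES,
the second hands the callee's error code back unchanged.

```c
#include <errno.h>

/* Stand-in for process_dynptr_func(): returns 0 or a negative errno. */
static int process_stub(int outcome)
{
	return outcome;
}

/* Pattern in the patch: every failure is reported as -EACCES. */
static int check_arg_collapsing(int outcome)
{
	if (process_stub(outcome))
		return -EACCES;
	return 0;
}

/* Suggested pattern: propagate the callee's error code as is. */
static int check_arg_propagating(int outcome)
{
	int err = process_stub(outcome);

	if (err)
		return err;
	return 0;
}
```

With the collapsing form, an internal error such as -EFAULT would be
indistinguishable from an ordinary rejection at the caller.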

>                 break;
>         case ARG_CONST_ALLOC_SIZE_OR_ZERO:
>                 if (!tnum_is_const(reg->var_off)) {
> diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> index c210657d4d0a..fc562e863e79 100644
> --- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> @@ -18,10 +18,7 @@ static struct {
>         const char *expected_verifier_err_msg;
>         int expected_runtime_err;
>  } kfunc_dynptr_tests[] = {
> -       {"dynptr_type_not_supp",
> -        "arg#0 pointer type STRUCT bpf_dynptr_kern points to unsupported dynamic pointer type", 0},
> -       {"not_valid_dynptr",
> -        "arg#0 pointer type STRUCT bpf_dynptr_kern must be valid and initialized", 0},
> +       {"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0},
>         {"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0},
>         {"dynptr_data_null", NULL, -EBADMSG},
>  };
> diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> index ce39d096bba3..f4a8250329b2 100644
> --- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> +++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> @@ -32,18 +32,6 @@ int err, pid;
>
>  char _license[] SEC("license") = "GPL";
>
> -SEC("?lsm.s/bpf")
> -int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
> -            unsigned int size)
> -{
> -       char write_data[64] = "hello there, world!!";
> -       struct bpf_dynptr ptr;
> -
> -       bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
> -
> -       return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
> -}
> -
>  SEC("?lsm.s/bpf")
>  int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size)
>  {
> --
> 2.38.0
>

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-10-19 22:59   ` Joanne Koong
@ 2022-10-20  0:55     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-20  0:55 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Thu, Oct 20, 2022 at 04:29:57AM IST, Joanne Koong wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> > the underlying register type is subjected to more special checks to
> > determine the type of object represented by the pointer and its state
> > consistency.
> >
> > Move dynptr checks to their own 'process_dynptr_func' function so that
> > it is consistent and in line with existing code. This also makes it easier
> > to reuse this code for kfunc handling.
> >
> > To this end, remove the dependency on bpf_call_arg_meta parameter by
> > instead taking the uninit_dynptr_regno by pointer. This is only needed
> > to be set to a valid pointer when arg_type has MEM_UNINIT.
> >
> > Then, reuse this consolidated function in kfunc dynptr handling too.
> > Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> > been lifted.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf_verifier.h                  |   8 +-
> >  kernel/bpf/btf.c                              |  17 +--
> >  kernel/bpf/verifier.c                         | 115 ++++++++++--------
> >  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
> >  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
> >  5 files changed, 69 insertions(+), 88 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 9e1e6965f407..a33683e0618b 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
> >                              u32 regno);
> >  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> >                    u32 regno, u32 mem_size);
> > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > -                             struct bpf_reg_state *reg);
> > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > -                            struct bpf_reg_state *reg,
> > -                            enum bpf_arg_type arg_type);
> > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > +                       enum bpf_arg_type arg_type, int argno,
> > +                       u8 *uninit_dynptr_regno);
> >
> >  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
> >  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index eba603cec2c5..1827d889e08a 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -6486,23 +6486,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> >                                                 return -EINVAL;
> >                                         }
> >
> > -                                       if (!is_dynptr_reg_valid_init(env, reg)) {
> > -                                               bpf_log(log,
> > -                                                       "arg#%d pointer type %s %s must be valid and initialized\n",
> > -                                                       i, btf_type_str(ref_t),
> > -                                                       ref_tname);
> > +                                       if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, i, NULL))
>
> I think it'd be helpful to add a bpf_log statement here that this failed
>

I left it out because process_dynptr_func itself will do the logging we were
doing here before.

> >                                                 return -EINVAL;
> > -                                       }
> > -
> > -                                       if (!is_dynptr_type_expected(env, reg,
> > -                                                       ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> > -                                               bpf_log(log,
> > -                                                       "arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> > -                                                       i, btf_type_str(ref_t),
> > -                                                       ref_tname);
> > -                                               return -EINVAL;
> > -                                       }
> > -
> >                                         continue;
> >                                 }
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 6f6d2d511c06..31c0c999448e 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -782,8 +782,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
> >         return true;
> >  }
> >
> > -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> > -                             struct bpf_reg_state *reg)
> > +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >         struct bpf_func_state *state = func(env, reg);
> >         int spi = get_spi(reg->off);
> > @@ -802,9 +801,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> >         return true;
> >  }
> >
> > -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> > -                            struct bpf_reg_state *reg,
> > -                            enum bpf_arg_type arg_type)
> > +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > +                                   enum bpf_arg_type arg_type)
> >  {
> >         struct bpf_func_state *state = func(env, reg);
> >         enum bpf_dynptr_type dynptr_type;
> > @@ -5573,6 +5571,65 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
> >         return 0;
> >  }
> >
> > +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > +                       enum bpf_arg_type arg_type, int argno,
>
> Do we need both regno and argno given that regno is always argno +
> BPF_REG_1 and in this function we only use the argno param for "argno
> + 1"? I think we could just pass in regno.
>

Hmm, not really. I can drop it.

> > +                       u8 *uninit_dynptr_regno)
>
> nit: this is personal preference, but I think it looks cleaner passing
> "struct bpf_call_arg_meta *meta" here instead of "u8
> *uninit_dynptr_regno".
>

Right, the thinking was that kfuncs could also handle the MEM_UNINIT case. The
meta type differs between helpers and kfuncs, but this parameter could stay the
same. Let's think about that when/if a dynptr API function is added as a kfunc.

> > +{
> > +       struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> > +
> > +       /* We only need to check for initialized / uninitialized helper
> > +        * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> > +        * assumption is that if it is, that a helper function
> > +        * initialized the dynptr on behalf of the BPF program.
> > +        */
> > +       if (base_type(reg->type) == PTR_TO_DYNPTR)
> > +               return 0;
> > +       if (arg_type & MEM_UNINIT) {
> > +               if (!is_dynptr_reg_valid_uninit(env, reg)) {
> > +                       verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> > +                       return -EINVAL;
> > +               }
> > +
> > +               /* We only support one dynptr being uninitialized at the moment,
> > +                * which is sufficient for the helper functions we have right now.
> > +                */
> > +               if (*uninit_dynptr_regno) {
> > +                       verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> > +                       return -EFAULT;
> > +               }
> > +
> > +               *uninit_dynptr_regno = regno;
> > +       } else {
> > +               if (!is_dynptr_reg_valid_init(env, reg)) {
> > +                       verbose(env,
> > +                               "Expected an initialized dynptr as arg #%d\n",
> > +                               argno + 1);
> > +                       return -EINVAL;
> > +               }
> > +
> > +               if (!is_dynptr_type_expected(env, reg, arg_type)) {
> > +                       const char *err_extra = "";
> > +
> > +                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> > +                       case DYNPTR_TYPE_LOCAL:
> > +                               err_extra = "local";
> > +                               break;
> > +                       case DYNPTR_TYPE_RINGBUF:
> > +                               err_extra = "ringbuf";
> > +                               break;
> > +                       default:
> > +                               err_extra = "<unknown>";
> > +                               break;
> > +                       }
> > +                       verbose(env,
> > +                               "Expected a dynptr of type %s as arg #%d\n",
> > +                               err_extra, argno + 1);
> > +                       return -EINVAL;
> > +               }
> > +       }
> > +       return 0;
> > +}
> > +
> >  static bool arg_type_is_mem_size(enum bpf_arg_type type)
> >  {
> >         return type == ARG_CONST_SIZE ||
> > @@ -6086,52 +6143,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >                 err = check_mem_size_reg(env, reg, regno, true, meta);
> >                 break;
> >         case ARG_PTR_TO_DYNPTR:
> > -               /* We only need to check for initialized / uninitialized helper
> > -                * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> > -                * assumption is that if it is, that a helper function
> > -                * initialized the dynptr on behalf of the BPF program.
> > -                */
> > -               if (base_type(reg->type) == PTR_TO_DYNPTR)
> > -                       break;
> > -               if (arg_type & MEM_UNINIT) {
> > -                       if (!is_dynptr_reg_valid_uninit(env, reg)) {
> > -                               verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> > -                               return -EINVAL;
> > -                       }
> > -
> > -                       /* We only support one dynptr being uninitialized at the moment,
> > -                        * which is sufficient for the helper functions we have right now.
> > -                        */
> > -                       if (meta->uninit_dynptr_regno) {
> > -                               verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> > -                               return -EFAULT;
> > -                       }
> > -
> > -                       meta->uninit_dynptr_regno = regno;
> > -               } else if (!is_dynptr_reg_valid_init(env, reg)) {
> > -                       verbose(env,
> > -                               "Expected an initialized dynptr as arg #%d\n",
> > -                               arg + 1);
> > -                       return -EINVAL;
> > -               } else if (!is_dynptr_type_expected(env, reg, arg_type)) {
> > -                       const char *err_extra = "";
> > -
> > -                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> > -                       case DYNPTR_TYPE_LOCAL:
> > -                               err_extra = "local";
> > -                               break;
> > -                       case DYNPTR_TYPE_RINGBUF:
> > -                               err_extra = "ringbuf";
> > -                               break;
> > -                       default:
> > -                               err_extra = "<unknown>";
> > -                               break;
> > -                       }
> > -                       verbose(env,
> > -                               "Expected a dynptr of type %s as arg #%d\n",
> > -                               err_extra, arg + 1);
> > -                       return -EINVAL;
> > -               }
> > +               if (process_dynptr_func(env, regno, arg_type, arg, &meta->uninit_dynptr_regno))
> > +                       return -EACCES;
>
> process_dynptr_func could return -EFAULT so I think we should do "err
> = process_dynptr_func(...)" here instead.
>

Agreed, I'll also propagate errors from other similarly named functions.
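
To make the difference concrete, here is a minimal sketch of the two patterns. The names process_dynptr_stub, check_arg_collapsed and check_arg_propagated are hypothetical stand-ins made up for illustration; none of this is verifier code.

```c
#include <errno.h>

/* Hypothetical stand-in for process_dynptr_func(): returns 0 on success,
 * -EINVAL for a bad argument and -EFAULT for an internal error.
 */
static int process_dynptr_stub(int scenario)
{
	switch (scenario) {
	case 0:
		return 0;
	case 1:
		return -EINVAL;
	default:
		return -EFAULT;
	}
}

/* Pattern in the patch as posted: every failure is reported as -EACCES. */
static int check_arg_collapsed(int scenario)
{
	if (process_dynptr_stub(scenario))
		return -EACCES;
	return 0;
}

/* Pattern suggested in review: return the callee's error code unchanged. */
static int check_arg_propagated(int scenario)
{
	int err;

	err = process_dynptr_stub(scenario);
	if (err)
		return err;
	return 0;
}
```

With the second form, an internal error reported as -EFAULT stays distinguishable from an ordinary rejection.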

* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-19 18:52   ` Alexei Starovoitov
@ 2022-10-20  1:04     ` Kumar Kartikeya Dwivedi
  2022-10-20  2:13       ` Alexei Starovoitov
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-20  1:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > Currently, the dynptr function is not checking the variable offset part
> > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > when computing the stack pointer index, but if the variable offset was
> > not a constant (such that it could not be accumulated in reg->off), we
> > will end up with a discrepancy where the runtime pointer does not point to the
> > actual stack slot we mark as STACK_DYNPTR.
> >
> > It is impossible to precisely track dynptr state when variable offset is
> > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > simply reject the case where reg->var_off is not constant. Then,
> > consider both reg->off and reg->var_off.value when computing the stack
> > pointer index.
> >
> > A new helper dynptr_get_spi is introduced to hide over these details
> > since the dynptr needs to be located in multiple places outside the
> > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > need to enforce these checks in all places.
> >
> > Note that it is disallowed for unprivileged users to have a non-constant
> > var_off, so this problem should only be possible to trigger from
> > programs having CAP_PERFMON. However, its effects can vary.
> >
> > Without the fix, it is possible to replace the contents of the dynptr
> > arbitrarily by making verifier mark different stack slots than actual
> > location and then doing writes to the actual stack address of dynptr at
> > runtime.
> >
> > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> >  3 files changed, 67 insertions(+), 21 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 8f667180f70f..0fd73f96c5e2 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> >                 verbose(env, "D");
> >  }
> >
> > -static int get_spi(s32 off)
> > +static int __get_spi(s32 off)
> >  {
> >         return (-off - 1) / BPF_REG_SIZE;
> >  }
> >
> > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > +{
> > +       int spi;
> > +
> > +       if (reg->off % BPF_REG_SIZE) {
> > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > +               return -EINVAL;
> > +       }
>
> I think this cannot happen.
>

There are existing selftests that trigger this.
Or do you mean it cannot happen anymore? If so, why?

> > +       if (!tnum_is_const(reg->var_off)) {
> > +               verbose(env, "dynptr has to be at the constant offset\n");
> > +               return -EINVAL;
> > +       }
>
> This part can.
>
> > +       spi = __get_spi(reg->off + reg->var_off.value);
> > +       if (spi < 1) {
> > +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> > +                       (int)(reg->off + reg->var_off.value));
> > +               return -EINVAL;
> > +       }
> > +       return spi;
> > +}
>
> This one is a more conservative (read: redundant) check.
> The is_spi_bounds_valid() is doing it better.

The problem is, is_spi_bounds_valid returning an error is not always a problem.
See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds,
then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it
to grow the stack depth and do MAX_BPF_STACK check.

> How about keeping get_spi(reg) as error free and use it
> directly in places where it cannot fail without
> defensive WARN_ON_ONCE.
> int get_spi(reg)
> { return (-reg->off - reg->var_off.value - 1) / BPF_REG_SIZE; }
>
> While moving tnum_is_const() check into is_spi_bounds_valid() ?
>
> Like is_spi_bounds_valid(state, reg, spi) ?
>
> We should probably remove BPF_DYNPTR_NR_SLOTS since
> there are so many other places where dynptr is assumed
> to be 16-bytes. That macro doesn't help at all.
> It only causes confusion.
>
> I guess we can replace is_spi_bounds_valid() with a different
> helper that checks and computes spi.
> Like get_spi_and_check(state, reg, &spi)
> and use it in places where we have get_spi + is_spi_bounds_valid
> while using unchecked get_spi where it cannot fail?
>
> If we only have get_spi_and_check() we'd have to add
> WARN_ON_ONCE in a few places and that bothers me...
> due to defensive programming...
> If code is so complex that we cannot think it through
> we have to refactor it. Sprinkling WARN_ON_ONCE (just to be sure)
> doesn't inspire confidence.
>

I will think about this and reply later today.

* Re: [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func
  2022-10-19 16:05       ` David Vernet
@ 2022-10-20  1:09         ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-20  1:09 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Wed, Oct 19, 2022 at 09:35:02PM IST, David Vernet wrote:
> On Wed, Oct 19, 2022 at 11:48:21AM +0530, Kumar Kartikeya Dwivedi wrote:
>
> [...]
>
> > > > In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> > > > to such helpers, however the current code simply returns 0.
> > > >
> > > > The rejection for helpers that release the dynptr is already handled.
> > > >
> > > > For fixing this, we take a step back and rework existing code in a way
> > > > that will allow fitting in all classes of helpers and have a coherent
> > > > model for dealing with the variety of use cases in which dynptr is used.
> > > >
> > > > First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> > > > with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
> > > >
> > > > Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> > > > fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> > > > that the helper only operates on the memory pointed to by the dynptr,
> > >
> > > Hmmm, it feels a bit confusing to overload MEM_RDONLY like this. I
> > > understand the intention (which is logical) to imply that the pointer to
> > > the dynptr is read only, but the fact that the memory contained in the
> > > dynptr may not be read only will doubtless confuse people.
> > >
> > > I don't really have a better suggestion. This is the proper use of
> > > MEM_RDONLY, but it really feels super confusing. I guess this is
> > > somewhat mitigated by the fact that the read-only nature of the dynptr
> > > is something that will be validated at runtime?
> > >
> >
> > Nope, both dynptr's const-ness and const-ness of the memory it points to are
> > supposed to be tracked statically. It's part of the type of the dynptr.
>
> Could you please clarify what you're "noping" here? The dynptr being
> read-only is tracked statically, but based on the discussion in the
> thread at [0] I thought the plan was to enforce this property at
> runtime. Am I wrong about that?
>
> [0]: https://lore.kernel.org/bpf/CAJnrk1Y0r3++RLpT2jvp4st-79x3dUYk3uP-4tfnAeL5_kgM0Q@mail.gmail.com/
>

A more up-to-date version of [0] is https://lore.kernel.org/bpf/CAEf4BzawD+_buWqp_U3cu71QZH_OVTseuSUyEcva9qCd1=GQ-A@mail.gmail.com .

> My point was just that it might be harder to confuse
> CONST_PTR_TO_DYNPTR | MEM_RDONLY with the memory contained in the dynptr
> region if there's a separate field inside the dynptr itself which tracks
> whether that region is R/O. I'm mostly just thinking out loud -- as I
> said in the last email I think using MEM_RDONLY as you are is logical.
>
> > The second case doesn't exist yet, but will soon (with skb dynptrs abstracting
> > over read only __sk_buff ctx).
> >
> > So what MEM_RDONLY in argument type really means is that I take a pointer to
> > const struct bpf_dynptr, which means I can't modify the struct bpf_dynptr itself
> > (so its size, offset, ptr, etc.), but that is independent of r/w state of what
> > it points to.
> >
> > const T *p vs T *const p
>
> Right, I understand the intention of the patch (which was why I said it
> was a logical choice) and the distinction between the two variants of
> const. My point was that at first glance, someone who's not a verifier
> expert who's trying to understand all of this to enable their writing of
> a BPF program may be thrown off by seeing "PTR_TO_DYNPTR | RDONLY".
> Hopefully that's something we can address with adequately documenting
> helpers, and in any case, it's certainly not an argument against your
> overall approach.
>
> Also, I think it will end up being more clear if and when we have e.g.
> a helper that takes a CONST_PTR_TO_DYNPTR | MEM_RDONLY dynptr, and
> returns e.g. an R/O PTR_TO_MEM | MEM_RDONLY pointer to its backing
> memory.
>
> Anyways, at the end of the day this is really all implementation details
> of the verifier and BPF internals, so I digress...
>
> > In this case it's the latter. Soon we will also support const T *const p.
> >
> > Hence, MEM_RDONLY is at the argument type level, translating to reg->type, and
> > the read only status for the dynptr's memory slice will be part of dynptr
> > specific register state (dynptr.type).
> >
> > But I am open to more suggestions on how to write this stuff, if it makes the
> > code easier to read.
>
> I think what you have makes sense and IMO is the cleanest way to express
> all of this.
>
> The only thing that I'm now wondering after sleeping on this is whether
> it's really necessary to rename the register type to CONST_PTR_TO_DYNPTR.
> We're already restricting that it always be called with MEM_RDONLY. Are
> we _100%_ sure that it will always be fully static whether a dynptr is
> R/O? I know that Joanne said probably yes in [1], but it feels perhaps
> unnecessarily restrictive to codify that by making the register type
> CONST_PTR_TO_DYNPTR. Why not just make it PTR_TO_DYNPTR and keep the
> verifications you added in this patch that it's always specified with
> MEM_RDONLY, and then if we ever change our minds and later decide to add
> helpers that can change the access permissions on the dynptr, it will
> just be a matter of changing our expectations around the presence of
> that MEM_RDONLY modifier?
>

I'm not too worried about whether it can change in the future or not; if it does,
we can rename the register type and rework the code accordingly. But mostly this
will be used to pass in dynptr ref to callbacks and similar cases, where you
don't want the callback to modify the dynptr itself that is passed in.

Maybe a use case will come up later, but we can revisit it when that happens.
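
As a plain C analogy for the callback case (a sketch only: struct toy_dynptr and fill_cb are hypothetical names, and the layout is not that of the kernel's struct bpf_dynptr), a callback that takes a pointer to a const dynptr cannot change the dynptr's own fields, yet can still write to the memory it refers to:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified dynptr used only for this illustration. */
struct toy_dynptr {
	unsigned char *data;	/* memory the dynptr refers to */
	size_t size;		/* number of bytes reachable through data */
};

/* The callback gets a pointer to a const dynptr. Assigning to d->data or
 * d->size here would be a compile error, but the bytes behind d->data are
 * not const, so they remain writable.
 */
static int fill_cb(const struct toy_dynptr *d, unsigned char byte)
{
	if (!d->data)
		return -1;
	memset(d->data, byte, d->size);
	return 0;
}
```

Making the pointed-to memory read-only as well would be a separate property, which in the discussion above is meant to live in the dynptr-specific state rather than in MEM_RDONLY on the argument.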

> [1]: https://lore.kernel.org/bpf/CAJnrk1Zmne1uDn8EKdNKJe6O-k_moU9Sryfws_J-TF2BvX2QMg@mail.gmail.com/
>
> [...]
>
> > > >  	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
> > > >  	if (arg_type == ARG_PTR_TO_DYNPTR)
> > > >  		return true;
> > > >
> > > >  	dynptr_type = arg_to_dynptr_type(arg_type);
> > > > -
> > > > -	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > > > +	if (reg->type == CONST_PTR_TO_DYNPTR) {
> > > > +		return reg->dynptr.type == dynptr_type;
> > > > +	} else {
> > > > +		spi = get_spi(reg->off);
> > > > +		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> > > > +	}
> > > >  }
> > > >
> > > >  /* The reg state of a pointer or a bounded scalar was saved when
> > > > @@ -1317,9 +1346,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
> > > >  	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
> > > >  };
> > > >
> > > > -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> > > > -				struct bpf_reg_state *reg);
> > > > -
> > > >  /* This helper doesn't clear reg->id */
> > > >  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
> > > >  {
> > > > @@ -1382,6 +1408,25 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
> > > >  	__mark_reg_known_zero(regs + regno);
> > > >  }
> > > >
> > > > +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> > > > +			       struct bpf_reg_state *reg2,
> > > > +			       enum bpf_dynptr_type type)
> > > > +{
> > > > +	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> > > > +	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR.
> > > > +	 */
> > >
> > > Meh, this is mildly confusing. Please correct me if my understanding is wrong,
> > > but the reason this is the case is that we only set the struct bpf_reg_state
> > > from the stack, whereas the actual reg itself of course has PTR_TO_STACK. If
> > > that's the case, can we go into just a bit more detail here in this comment
> > > about what's going on? It's kind of confusing that we have an actual register
> > > of type PTR_TO_STACK, which points to stack register state of (inconsequential)
> > > type CONST_PTR_TO_DYNPTR. It's also kind of weird (but also inconsequential)
> > > that we have dynptr.first_slot for CONST_PTR_TO_DYNPTR.
> > >
> >
> > There are two cases which this function is called for, one is for the
> > spilled registers for dynptr on the stack. In that case it *is* the dynptr, so
> > reg->type as CONST_PTR_TO_DYNPTR is meaningless/wrong, and not checked. The type
> > is already part of slot_type == STACK_DYNPTR.
>
> Ok, thanks for confirming my understanding.
>
> > We reuse spilled_reg part of stack state to store info about the dynptr. We need
> > two spilled_regs to fully track it.
> >
> > Later, we will have more owned objects on the stack (bpf_list_head, bpf_rb_root)
> > where you splice it out. Their handling will have to be similar.
> >
> > PTR_TO_STACK points to the slots whose spilled registers we will call this
> > function for. That is different from the second case, i.e. for callback R1,
> > where it will be CONST_PTR_TO_DYNPTR. For consistency, I marked it as first_slot
> > because we always work using the first dynptr slot.
> >
> > So to summarize:
> >
> > PTR_TO_STACK points to bpf_dynptr on stack. So we store this info as 2 spilled
> > registers on the stack. In that case both of them are the first and second slot
> > of the dynptr (8-bytes each). They are the actual dynptr object.
> >
> > In second case we set dynptr state on the reg itself, which points to actual
> > dynptr object. The reference now records the information we need about the
> > object.
> >
> > Yes, it is a bit confusing, and again, I'm open to better ideas. The
> > difference/confusion is mainly because of different places where state is
> > tracked. For the stack we track it in stack state precisely, for
> > CONST_PTR_TO_DYNPTR it is recorded in the pointer to dynptr object.
>
> Thanks for clarifying, then my initial understanding was correct. If
> that's the case, what do you think about this suggestion to make the
> code a bit more consistent:
>
> > > Just my two cents as well, but even if the field isn't really used for
> > > anything, I would still add an additional enum bpf_reg_type parameter that sets
> > > this to STACK_DYNPTR, with a comment that says it's currently only used by
> > > CONST_PTR_TO_DYNPTR registers.
>
> I would rather reg->type be _unused_ for the dynptr in the spilled
> registers on the stack, than be both unused and meaningless/wrong (as
> you put it).
>
> [...]
>
> Thanks,
> David

* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-20  1:04     ` Kumar Kartikeya Dwivedi
@ 2022-10-20  2:13       ` Alexei Starovoitov
  2022-10-20  2:40         ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Alexei Starovoitov @ 2022-10-20  2:13 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > Currently, the dynptr function is not checking the variable offset part
> > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > when computing the stack pointer index, but if the variable offset was
> > > not a constant (such that it could not be accumulated in reg->off), we
> > > will end up with a discrepancy where the runtime pointer does not point to the
> > > actual stack slot we mark as STACK_DYNPTR.
> > >
> > > It is impossible to precisely track dynptr state when variable offset is
> > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > simply reject the case where reg->var_off is not constant. Then,
> > > consider both reg->off and reg->var_off.value when computing the stack
> > > pointer index.
> > >
> > > A new helper dynptr_get_spi is introduced to hide over these details
> > > since the dynptr needs to be located in multiple places outside the
> > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > need to enforce these checks in all places.
> > >
> > > Note that it is disallowed for unprivileged users to have a non-constant
> > > var_off, so this problem should only be possible to trigger from
> > > programs having CAP_PERFMON. However, its effects can vary.
> > >
> > > Without the fix, it is possible to replace the contents of the dynptr
> > > arbitrarily by making verifier mark different stack slots than actual
> > > location and then doing writes to the actual stack address of dynptr at
> > > runtime.
> > >
> > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 8f667180f70f..0fd73f96c5e2 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > >                 verbose(env, "D");
> > >  }
> > >
> > > -static int get_spi(s32 off)
> > > +static int __get_spi(s32 off)
> > >  {
> > >         return (-off - 1) / BPF_REG_SIZE;
> > >  }
> > >
> > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +{
> > > +       int spi;
> > > +
> > > +       if (reg->off % BPF_REG_SIZE) {
> > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > +               return -EINVAL;
> > > +       }
> >
> > I think this cannot happen.
> >
>
> There are existing selftests that trigger this.

Really? Which one is that?
Those that you've modified in this patch are hitting
"cannot pass in dynptr..." message from the check below, no?

> Or do you mean it cannot happen anymore? If so, why?

Why would it? There is an alignment check earlier.

> > > +       if (!tnum_is_const(reg->var_off)) {
> > > +               verbose(env, "dynptr has to be at the constant offset\n");
> > > +               return -EINVAL;
> > > +       }
> >
> > This part can.
> >
> > > +       spi = __get_spi(reg->off + reg->var_off.value);
> > > +       if (spi < 1) {
> > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> > > +                       (int)(reg->off + reg->var_off.value));
> > > +               return -EINVAL;
> > > +       }
> > > +       return spi;
> > > +}
> >
> > This one is a more conservative (read: redundant) check.
> > The is_spi_bounds_valid() is doing it better.
>
> The problem is, is_spi_bounds_valid returning an error is not always a problem.
> See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds,
> then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it
> to grow the stack depth and do MAX_BPF_STACK check.

It's a weird one. I'm not sure it's actually correct to do it this way.

> > How about keeping get_spi(reg) as error free and use it
> > directly in places where it cannot fail without
> > defensive WARN_ON_ONCE.
> > int get_spi(reg)
> > { return (-reg->off - reg->var_off.value - 1) / BPF_REG_SIZE; }
> >
> > While moving tnum_is_const() check into is_spi_bounds_valid() ?
> >
> > Like is_spi_bounds_valid(state, reg, spi) ?
> >
> > We should probably remove BPF_DYNPTR_NR_SLOTS since
> > there are so many other places where dynptr is assumed
> > to be 16-bytes. That macro doesn't help at all.
> > It only causes confusion.
> >
> > I guess we can replace is_spi_bounds_valid() with a different
> > helper that checks and computes spi.
> > Like get_spi_and_check(state, reg, &spi)
> > and use it in places where we have get_spi + is_spi_bounds_valid
> > while using unchecked get_spi where it cannot fail?
> >
> > If we only have get_spi_and_check() we'd have to add
> > WARN_ON_ONCE in a few places and that bothers me...
> > due to defensive programming...
> > If code is so complex that we cannot think it through
> > we have to refactor it. Sprinkling WARN_ON_ONCE (just to be sure)
> > doesn't inspire confidence.
> >
>
> I will think about this and reply later today.

* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-20  2:13       ` Alexei Starovoitov
@ 2022-10-20  2:40         ` Kumar Kartikeya Dwivedi
  2022-10-20  2:56           ` Alexei Starovoitov
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-20  2:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote:
> On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > Currently, the dynptr function is not checking the variable offset part
> > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > > when computing the stack pointer index, but if the variable offset was
> > > > not a constant (such that it could not be accumulated in reg->off), we
> > > > will end up with a discrepancy where the runtime pointer does not point to the
> > > > actual stack slot we mark as STACK_DYNPTR.
> > > >
> > > > It is impossible to precisely track dynptr state when variable offset is
> > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > > simply reject the case where reg->var_off is not constant. Then,
> > > > consider both reg->off and reg->var_off.value when computing the stack
> > > > pointer index.
> > > >
> > > > A new helper dynptr_get_spi is introduced to hide over these details
> > > > since the dynptr needs to be located in multiple places outside the
> > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > > need to enforce these checks in all places.
> > > >
> > > > Note that it is disallowed for unprivileged users to have a non-constant
> > > > var_off, so this problem should only be possible to trigger from
> > > > programs having CAP_PERFMON. However, its effects can vary.
> > > >
> > > > Without the fix, it is possible to replace the contents of the dynptr
> > > > arbitrarily by making verifier mark different stack slots than actual
> > > > location and then doing writes to the actual stack address of dynptr at
> > > > runtime.
> > > >
> > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > ---
> > > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 8f667180f70f..0fd73f96c5e2 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > > >                 verbose(env, "D");
> > > >  }
> > > >
> > > > -static int get_spi(s32 off)
> > > > +static int __get_spi(s32 off)
> > > >  {
> > > >         return (-off - 1) / BPF_REG_SIZE;
> > > >  }
> > > >
> > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > +{
> > > > +       int spi;
> > > > +
> > > > +       if (reg->off % BPF_REG_SIZE) {
> > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > > +               return -EINVAL;
> > > > +       }
> > >
> > > I think this cannot happen.
> > >
> >
> > There are existing selftests that trigger this.
>
> Really. Which one is that?
> Those that you've modified in this patch are hitting
> "cannot pass in dynptr..." message from the check below, no?
>

Just taking one example, invalid_read2 which does:

bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0);

does hit this one, it passes fp-15, no var_off.

Same with invalid_helper2 that was updated.
Same with invalid_offset that was updated.
invalid_write3 gained coverage from this patch, earlier it was probably just
being rejected because of arg_type_is_release checking spilled_ptr.id.
not_valid_dynptr is also hitting this one, not the one below.

The others now started hitting this error as the order of checks was changed in
the verifier. Since arg_type_is_release checking happens before
process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr.
At that point no checks have been made of the dynptr argument, so dynptr_get_spi
is required to ensure spi is in bounds.

The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone
is not sufficient. This is why I wrapped everything into dynptr_get_spi.
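
For reference, the three checks can be modelled outside the kernel roughly as follows. This is a simplified sketch: slot_index and dynptr_slot_index are hypothetical names, the tnum is reduced to a flag plus a value, and the note about two slots is my reading of the spi < 1 test rather than something the patch states.

```c
#include <errno.h>

#define BPF_REG_SIZE 8	/* bytes per stack slot */

/* Same arithmetic as __get_spi() in the quoted hunk. */
static int slot_index(int off)
{
	return (-off - 1) / BPF_REG_SIZE;
}

/* Simplified model of dynptr_get_spi(). 'off' plays the role of reg->off,
 * 'var_off_const' says whether the variable offset is a known constant,
 * and 'var_off_value' is that constant.
 */
static int dynptr_slot_index(int off, int var_off_const, int var_off_value)
{
	int spi;

	/* Misaligned fixed offset, e.g. fp-15 from (void *)&ptr + 1. */
	if (off % BPF_REG_SIZE)
		return -EINVAL;
	/* A non-constant variable offset cannot be tracked precisely. */
	if (!var_off_const)
		return -EINVAL;
	spi = slot_index(off + var_off_value);
	/* Presumably because a dynptr spans two 8-byte slots, spi and spi - 1. */
	if (spi < 1)
		return -EINVAL;
	return spi;
}
```

In this model fp-16 with a zero var_off is the first offset that passes all three checks, while fp-15 fails the alignment test and fp-8 fails the spi < 1 test.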

> > Or do you mean it cannot happen anymore? If so, why?
>
> Why would it? There is an alignment check earlier.
>

I removed the one in check_func_arg_reg_off. So this is the only place now where
this alignment check happens.

> > > > +       if (!tnum_is_const(reg->var_off)) {
> > > > +               verbose(env, "dynptr has to be at the constant offset\n");
> > > > +               return -EINVAL;
> > > > +       }
> > >
> > > This part can.
> > >
> > > > +       spi = __get_spi(reg->off + reg->var_off.value);
> > > > +       if (spi < 1) {
> > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> > > > +                       (int)(reg->off + reg->var_off.value));
> > > > +               return -EINVAL;
> > > > +       }
> > > > +       return spi;
> > > > +}
> > >
> > > This one is a more conservative (read: redundant) check.
> > > The is_spi_bounds_valid() is doing it better.
> >
> > The problem is, is_spi_bounds_valid returning an error is not always a problem.
> > See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds,
> > then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it
> > to grow the stack depth and do MAX_BPF_STACK check.
>
> It's a weird one. I'm not sure it's actually correct to do it this way.
>

Yeah, when looking at this I was actually surprised by that return true,
thinking that was by accident and the stack depth was not being updated, but it
later happens using check_mem_access in that if block.

I'm open to other ideas, like separating out code in
check_stack_write_fixed_off, but the issue there is code divergence: with the
logic duplicated, we may miss checks that are needed in both places. Let me know
what you think.

But however you do it, it has to be done after check_func_arg. The stack depth
should not be updated until all other arguments have been checked. If you
consider meta.access_size handling, that happens in a similar way.
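
The ordering constraint can be sketched as a validate-then-commit pattern. Everything below is a toy model with hypothetical names (toy_state, toy_meta, check_one_arg, check_call) and an assumed MAX_STACK limit; it only illustrates deferring the side effect, not how check_func_arg is actually structured.

```c
#include <errno.h>

#define MAX_STACK 512	/* assumed stack limit for this toy model */

struct toy_state {
	int stack_depth;	/* bytes of stack the program is known to use */
};

struct toy_meta {
	int pending_depth;	/* depth requested by some argument, 0 if none */
};

/* Phase 1: validate one argument and only record what it would need. */
static int check_one_arg(int arg_ok, int wanted_depth, struct toy_meta *meta)
{
	if (!arg_ok)
		return -EINVAL;
	if (wanted_depth > meta->pending_depth)
		meta->pending_depth = wanted_depth;
	return 0;
}

/* Phase 2 runs only after every argument has passed phase 1. */
static int check_call(struct toy_state *st, const int *arg_ok,
		      const int *wanted_depth, int nargs)
{
	struct toy_meta meta = { 0 };
	int i, err;

	for (i = 0; i < nargs; i++) {
		err = check_one_arg(arg_ok[i], wanted_depth[i], &meta);
		if (err)
			return err;	/* st->stack_depth is left untouched */
	}
	if (meta.pending_depth > MAX_STACK)
		return -EACCES;
	if (meta.pending_depth > st->stack_depth)
		st->stack_depth = meta.pending_depth;
	return 0;
}
```

If a later argument is rejected, the recorded depth is simply dropped, so a failed call leaves no trace in the state.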

* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-20  2:40         ` Kumar Kartikeya Dwivedi
@ 2022-10-20  2:56           ` Alexei Starovoitov
  2022-10-20  3:23             ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Alexei Starovoitov @ 2022-10-20  2:56 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Wed, Oct 19, 2022 at 7:40 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote:
> > On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > > <memxor@gmail.com> wrote:
> > > > >
> > > > > Currently, the dynptr function is not checking the variable offset part
> > > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > > > when computing the stack pointer index, but if the variable offset was
> > > > > not a constant (such that it could not be accumulated in reg->off), we
> > > > > > will end up with a discrepancy where the runtime pointer does not point to the
> > > > > actual stack slot we mark as STACK_DYNPTR.
> > > > >
> > > > > It is impossible to precisely track dynptr state when variable offset is
> > > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > > > simply reject the case where reg->var_off is not constant. Then,
> > > > > consider both reg->off and reg->var_off.value when computing the stack
> > > > > pointer index.
> > > > >
> > > > > A new helper dynptr_get_spi is introduced to hide over these details
> > > > > since the dynptr needs to be located in multiple places outside the
> > > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > > > need to enforce these checks in all places.
> > > > >
> > > > > Note that it is disallowed for unprivileged users to have a non-constant
> > > > > var_off, so this problem should only be possible to trigger from
> > > > > programs having CAP_PERFMON. However, its effects can vary.
> > > > >
> > > > > Without the fix, it is possible to replace the contents of the dynptr
> > > > > arbitrarily by making verifier mark different stack slots than actual
> > > > > location and then doing writes to the actual stack address of dynptr at
> > > > > runtime.
> > > > >
> > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > > ---
> > > > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > > > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > > > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > > > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > > > >
> > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > index 8f667180f70f..0fd73f96c5e2 100644
> > > > > --- a/kernel/bpf/verifier.c
> > > > > +++ b/kernel/bpf/verifier.c
> > > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > > > >                 verbose(env, "D");
> > > > >  }
> > > > >
> > > > > -static int get_spi(s32 off)
> > > > > +static int __get_spi(s32 off)
> > > > >  {
> > > > >         return (-off - 1) / BPF_REG_SIZE;
> > > > >  }
> > > > >
> > > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > > +{
> > > > > +       int spi;
> > > > > +
> > > > > +       if (reg->off % BPF_REG_SIZE) {
> > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > > > +               return -EINVAL;
> > > > > +       }
> > > >
> > > > I think this cannot happen.
> > > >
> > >
> > > There are existing selftests that trigger this.
> >
> > Really. Which one is that?
> > Those that you've modified in this patch are hitting
> > "cannot pass in dynptr..." message from the check below, no?
> >
>
> Just taking one example, invalid_read2 which does:
>
> bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0);
>
> does hit this one, it passes fp-15, no var_off.
>
> Same with invalid_helper2 that was updated.
> Same with invalid_offset that was updated.
> invalid_write3 gained coverage from this patch, earlier it was probably just
> being rejected because of arg_type_is_release checking spilled_ptr.id.
> not_valid_dynptr is also hitting this one, not the one below.
>
> The others now started hitting this error as the order of checks was changed in
> the verifier. Since arg_type_is_release checking happens before
> process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr.
> At that point no checks have been made of the dynptr argument, so dynptr_get_spi
> is required to ensure spi is in bounds.
>
> The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone
> is not sufficient. This is why I wrapped everything into dynptr_get_spi.

I see. That was not obvious at all that some other patch
is removing that check from check_func_arg_reg_off.

Why is the check there not sufficient?

> > > Or do you mean it cannot happen anymore? If so, why?
> >
> > Why would it? There is an alignment check earlier.
> >
>
> I removed the one in check_func_arg_reg_off. So this is the only place now where
> this alignment check happens.
>
> > > > > +       if (!tnum_is_const(reg->var_off)) {
> > > > > +               verbose(env, "dynptr has to be at the constant offset\n");
> > > > > +               return -EINVAL;
> > > > > +       }
> > > >
> > > > This part can.
> > > >
> > > > > +       spi = __get_spi(reg->off + reg->var_off.value);
> > > > > +       if (spi < 1) {
> > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> > > > > +                       (int)(reg->off + reg->var_off.value));
> > > > > +               return -EINVAL;
> > > > > +       }
> > > > > +       return spi;
> > > > > +}
> > > >
> > > > This one is a more conservative (read: redundant) check.
> > > > The is_spi_bounds_valid() is doing it better.
> > >
> > > The problem is, is_spi_bounds_valid returning an error is not always a problem.
> > > See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds,
> > > then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it
> > > to grow the stack depth and do MAX_BPF_STACK check.
> >
> > It's a weird one. I'm not sure it's actually correct to do it this way.
> >
>
> Yeah, when looking at this I was actually surprised by that return true,
> thinking that was by accident and the stack depth was not being updated, but it
> later happens using check_mem_access in that if block.
>
> I'm open to other ideas, like separating out code in
> check_stack_write_fixed_off, but the risk is that the two copies diverge and a
> check needed in both places gets missed in one. Let me know what you think.

Not following. What does check_stack_write_fixed_off have to do with any of that?

The bug you're fixing is missing tnum_is_const(reg->var_off), right?
All other changes make it hard to understand what is going on.
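
For readers following the var_off point, a minimal userspace sketch of why a
non-constant var_off breaks slot tracking. The struct layout and
tnum_is_const follow the kernel's tracked-number idea (a value plus a mask of
unknown bits); get_spi copies the arithmetic from the diff above. The scenario
(fp-32 with one unknown bit) and the name var_off_selfcheck are assumptions made
up for illustration, not verifier code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BPF_REG_SIZE 8

/* Tracked number: bits set in mask are unknown, all others equal value. */
struct tnum {
	uint64_t value;
	uint64_t mask;
};

/* A tnum is constant when it has no unknown bits. */
static bool tnum_is_const(struct tnum a)
{
	return a.mask == 0;
}

/* Stack slot index for a negative frame-pointer offset. */
static int get_spi(int32_t off)
{
	return (-off - 1) / BPF_REG_SIZE;
}

/* Hypothetical case: fp-32 plus a var_off whose bit 3 is unknown (0 or 8). */
void var_off_selfcheck(void)
{
	int32_t fixed_off = -32;
	struct tnum var_off = { .value = 0, .mask = 8 };

	/* Slot computed from the fixed offset alone. */
	int marked = get_spi(fixed_off);
	/* Slots the pointer may really refer to at runtime. */
	int runtime_lo = get_spi(fixed_off + (int32_t)var_off.value);
	int runtime_hi = get_spi(fixed_off +
				 (int32_t)(var_off.value | var_off.mask));

	assert(!tnum_is_const(var_off));
	assert(marked == 3);
	assert(runtime_lo == 3);
	assert(runtime_hi == 2);	/* differs from the marked slot */
}
```

With an unknown bit the runtime pointer can land on spi 2 while spi 3 is the
slot that gets marked, which is the mismatch the commit message describes.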

> But however you do it, it has to be done after check_func_arg. The stack depth
> should not be updated until all other arguments have been checked. If you
> consider meta.access_size handling, that happens in a similar way.

Not following.


* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-20  2:56           ` Alexei Starovoitov
@ 2022-10-20  3:23             ` Kumar Kartikeya Dwivedi
  2022-10-21  0:46               ` Alexei Starovoitov
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-20  3:23 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Thu, Oct 20, 2022 at 08:26:44AM IST, Alexei Starovoitov wrote:
> On Wed, Oct 19, 2022 at 7:40 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote:
> > > On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > > > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > > > <memxor@gmail.com> wrote:
> > > > > >
> > > > > > Currently, the dynptr function is not checking the variable offset part
> > > > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > > > > when computing the stack pointer index, but if the variable offset was
> > > > > > not a constant (such that it could not be accumulated in reg->off), we
> > > > > > > will end up with a discrepancy where runtime pointer does not point to the
> > > > > > actual stack slot we mark as STACK_DYNPTR.
> > > > > >
> > > > > > It is impossible to precisely track dynptr state when variable offset is
> > > > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > > > > simply reject the case where reg->var_off is not constant. Then,
> > > > > > consider both reg->off and reg->var_off.value when computing the stack
> > > > > > pointer index.
> > > > > >
> > > > > > A new helper dynptr_get_spi is introduced to hide over these details
> > > > > > since the dynptr needs to be located in multiple places outside the
> > > > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > > > > need to enforce these checks in all places.
> > > > > >
> > > > > > Note that it is disallowed for unprivileged users to have a non-constant
> > > > > > var_off, so this problem should only be possible to trigger from
> > > > > > programs having CAP_PERFMON. However, its effects can vary.
> > > > > >
> > > > > > Without the fix, it is possible to replace the contents of the dynptr
> > > > > > arbitrarily by making verifier mark different stack slots than actual
> > > > > > location and then doing writes to the actual stack address of dynptr at
> > > > > > runtime.
> > > > > >
> > > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > > > ---
> > > > > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > > > > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > > > > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > > > > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > > > > >
> > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > index 8f667180f70f..0fd73f96c5e2 100644
> > > > > > --- a/kernel/bpf/verifier.c
> > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > > > > >                 verbose(env, "D");
> > > > > >  }
> > > > > >
> > > > > > -static int get_spi(s32 off)
> > > > > > +static int __get_spi(s32 off)
> > > > > >  {
> > > > > >         return (-off - 1) / BPF_REG_SIZE;
> > > > > >  }
> > > > > >
> > > > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > > > +{
> > > > > > +       int spi;
> > > > > > +
> > > > > > +       if (reg->off % BPF_REG_SIZE) {
> > > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > >
> > > > > I think this cannot happen.
> > > > >
> > > >
> > > > There are existing selftests that trigger this.
> > >
> > > Really. Which one is that?
> > > Those that you've modified in this patch are hitting
> > > "cannot pass in dynptr..." message from the check below, no?
> > >
> >
> > Just taking one example, invalid_read2 which does:
> >
> > bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0);
> >
> > does hit this one, it passes fp-15, no var_off.
> >
> > Same with invalid_helper2 that was updated.
> > Same with invalid_offset that was updated.
> > invalid_write3 gained coverage from this patch, earlier it was probably just
> > being rejected because of arg_type_is_release checking spilled_ptr.id.
> > not_valid_dynptr is also hitting this one, not the one below.
> >
> > The others now started hitting this error as the order of checks was changed in
> > the verifier. Since arg_type_is_release checking happens before
> > process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr.
> > At that point no checks have been made of the dynptr argument, so dynptr_get_spi
> > is required to ensure spi is in bounds.
> >
> > The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone
> > is not sufficient. This is why I wrapped everything into dynptr_get_spi.
>
> I see. That was not obvious at all that some other patch
> is removing that check from check_func_arg_reg_off.
>

It is done in patch 4. There I move that check from the check_func_arg_reg_off
to process_dynptr_func.

> Why is the check there not sufficient?
>

I wanted to keep check_func_arg_reg_off free of assumptions for helper specific
checks. It just ensures a few rules:

When OBJ_RELEASE, offsets (fixed and var) are 0.
Otherwise, for some specific register types, allow fixed and var_off.
For PTR_TO_BTF_ID, allow fixed but not var_off.
Reject any fixed or var_off for all other cases.

Everything else is handled on top of that.
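
The four rules above can be sketched as a small userspace model. This is only
an illustration under stated assumptions: the enum reg_class, struct reg_model
and check_reg_off_model are hypothetical names, and REG_OFFSET_OK stands in for
the "specific register types" without listing them.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical grouping of register types, one class per rule. */
enum reg_class {
	REG_OFFSET_OK,	/* types where fixed and variable offsets are allowed */
	REG_BTF_ID,	/* fixed offset allowed, variable offset not */
	REG_OTHER,	/* no offset of any kind */
};

struct reg_model {
	enum reg_class cls;
	int32_t off;		/* fixed offset */
	bool var_off_zero;	/* true when var_off is the constant 0 */
};

/* Returns 0 when the offsets are acceptable, -EINVAL otherwise. */
int check_reg_off_model(const struct reg_model *reg, bool is_release)
{
	bool has_fixed = reg->off != 0;
	bool has_var = !reg->var_off_zero;

	/* Release arguments must point at the start of the object. */
	if (is_release)
		return (has_fixed || has_var) ? -EINVAL : 0;

	switch (reg->cls) {
	case REG_OFFSET_OK:
		return 0;
	case REG_BTF_ID:
		return has_var ? -EINVAL : 0;
	default:
		return (has_fixed || has_var) ? -EINVAL : 0;
	}
}

void reg_off_selfcheck(void)
{
	struct reg_model stack = { REG_OFFSET_OK, -16, false };
	struct reg_model btf = { REG_BTF_ID, 8, true };
	struct reg_model other = { REG_OTHER, 8, true };

	assert(check_reg_off_model(&stack, false) == 0);
	assert(check_reg_off_model(&stack, true) == -EINVAL);
	assert(check_reg_off_model(&btf, false) == 0);
	assert(check_reg_off_model(&other, false) == -EINVAL);
}
```

Anything dynptr specific, such as slot alignment, would then sit on top of this
generic result rather than inside it.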

> > > > Or do you mean it cannot happen anymore? If so, why?
> > >
> > > Why would it? There is an alignment check earlier.
> > >
> >
> > I removed the one in check_func_arg_reg_off. So this is the only place now where
> > this alignment check happens.
> >
> > > > > > +       if (!tnum_is_const(reg->var_off)) {
> > > > > > +               verbose(env, "dynptr has to be at the constant offset\n");
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > >
> > > > > This part can.
> > > > >
> > > > > > +       spi = __get_spi(reg->off + reg->var_off.value);
> > > > > > +       if (spi < 1) {
> > > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n",
> > > > > > +                       (int)(reg->off + reg->var_off.value));
> > > > > > +               return -EINVAL;
> > > > > > +       }
> > > > > > +       return spi;
> > > > > > +}
> > > > >
> > > > > This one is a more conservative (read: redundant) check.
> > > > > The is_spi_bounds_valid() is doing it better.
> > > >
> > > > The problem is, is_spi_bounds_valid returning an error is not always a problem.
> > > > See how in is_dynptr_reg_valid_uninit we just return true on invalid bounds,
> > > > then later simulate two 8-byte accesses for uninit_dynptr_regno and rely on it
> > > > to grow the stack depth and do MAX_BPF_STACK check.
> > >
> > > It's a weird one. I'm not sure it's actually correct to do it this way.
> > >
> >
> > Yeah, when looking at this I was actually surprised by that return true,
> > thinking that was by accident and the stack depth was not being updated, but it
> > later happens using check_mem_access in that if block.
> >
> > I'm open to other ideas, like separating out code in
> > check_stack_write_fixed_off, but the risk is that the two copies diverge and a
> > check needed in both places gets missed in one. Let me know what you think.
>
> Not following. What does check_stack_write_fixed_off have to do with any of that?
>

Well, I thought you didn't consider check_mem_access based simulation of writes
to grow stack bounds to be clean, so I was soliciting opinions on how it could
be done otherwise. It ends up calling check_stack_write_fixed_off internally.

per
> > > It's a weird one. I'm not sure it's actually correct to do it this way.

but maybe I misunderstood and you meant it for is_spi_bounds_valid only.

> The bug you're fixing is missing tnum_is_const(reg->var_off), right?
> All other changes make it hard to understand what is going on.
>

In this patch, there is no other change. Every site that used get_spi(reg->off)
now uses get_spi(reg->off + reg->var_off.value) essentially.

For dynptr, only spi 1 and above are valid values.
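
To make the spi arithmetic concrete, here is a rough userspace model of the
checks in dynptr_get_spi from the diff. The struct stack_reg_model, the
function dynptr_get_spi_model and the flattening of var_off into a flag plus a
value are simplifications assumed for this sketch; the order of checks follows
the quoted patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define BPF_REG_SIZE 8

/* Same arithmetic as __get_spi in the diff. */
static int get_spi(int32_t off)
{
	return (-off - 1) / BPF_REG_SIZE;
}

/* Simplified PTR_TO_STACK register: var_off reduced to a flag and a value. */
struct stack_reg_model {
	int32_t off;
	bool var_off_is_const;
	int32_t var_off_value;
};

/* Returns the first slot index of the dynptr, or -EINVAL. */
int dynptr_get_spi_model(const struct stack_reg_model *reg)
{
	int spi;

	if (reg->off % BPF_REG_SIZE)	/* e.g. fp-15 */
		return -EINVAL;
	if (!reg->var_off_is_const)	/* slot cannot be pinned down */
		return -EINVAL;
	spi = get_spi(reg->off + reg->var_off_value);
	if (spi < 1)			/* dynptr needs spi and spi - 1 */
		return -EINVAL;
	return spi;
}

void dynptr_spi_selfcheck(void)
{
	struct stack_reg_model fp16 = { -16, true, 0 };
	struct stack_reg_model fp8 = { -8, true, 0 };
	struct stack_reg_model fp15 = { -15, true, 0 };
	struct stack_reg_model fp32_plus8 = { -32, true, 8 };
	struct stack_reg_model unknown = { -16, false, 0 };

	assert(dynptr_get_spi_model(&fp16) == 1);	/* lowest valid slot */
	assert(dynptr_get_spi_model(&fp8) == -EINVAL);	/* spi 0, no room */
	assert(dynptr_get_spi_model(&fp15) == -EINVAL);	/* misaligned */
	assert(dynptr_get_spi_model(&fp32_plus8) == 2);	/* fp-24 */
	assert(dynptr_get_spi_model(&unknown) == -EINVAL);
}
```

A dynptr is two slots wide, so fp-16 (spi 1 and spi 0) is the closest it can
sit to the frame pointer, which is where the spi < 1 rejection comes from.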

The main ugliness comes because it needs to get ref_obj_id earlier before
argument processing begins in arg_type_is_release block. Maybe that step should
be moved later below, I don't see anything using meta->ref_obj_id inside
functions called by the switch case.

Also, going back to what you said earlier:
> If we only have get_spi_and_check() we'd have to add
> WARN_ON_ONCE in a few places and that bothers me...
> due to defensive programming...
> If code is so complex that we cannot think it through
> we have to refactor it. Sprinkling WARN_ON_ONCE (just to be sure)
> doesn't inspire confidence.
>

Once we are done with process_dynptr_func, the rest of code can assume it points
to a valid stack location where dynptr needs to be marked/unmarked, so the rest
of the code doesn't do any checking of the spi etc.


* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-20  3:23             ` Kumar Kartikeya Dwivedi
@ 2022-10-21  0:46               ` Alexei Starovoitov
  2022-10-21  1:53                 ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Alexei Starovoitov @ 2022-10-21  0:46 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Thu, Oct 20, 2022 at 08:53:45AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Thu, Oct 20, 2022 at 08:26:44AM IST, Alexei Starovoitov wrote:
> > On Wed, Oct 19, 2022 at 7:40 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote:
> > > > On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
> > > > <memxor@gmail.com> wrote:
> > > > >
> > > > > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > > > > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > > > > <memxor@gmail.com> wrote:
> > > > > > >
> > > > > > > Currently, the dynptr function is not checking the variable offset part
> > > > > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > > > > > when computing the stack pointer index, but if the variable offset was
> > > > > > > not a constant (such that it could not be accumulated in reg->off), we
> > > > > > > > will end up with a discrepancy where runtime pointer does not point to the
> > > > > > > actual stack slot we mark as STACK_DYNPTR.
> > > > > > >
> > > > > > > It is impossible to precisely track dynptr state when variable offset is
> > > > > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > > > > > simply reject the case where reg->var_off is not constant. Then,
> > > > > > > consider both reg->off and reg->var_off.value when computing the stack
> > > > > > > pointer index.
> > > > > > >
> > > > > > > A new helper dynptr_get_spi is introduced to hide over these details
> > > > > > > since the dynptr needs to be located in multiple places outside the
> > > > > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > > > > > need to enforce these checks in all places.
> > > > > > >
> > > > > > > Note that it is disallowed for unprivileged users to have a non-constant
> > > > > > > var_off, so this problem should only be possible to trigger from
> > > > > > > programs having CAP_PERFMON. However, its effects can vary.
> > > > > > >
> > > > > > > Without the fix, it is possible to replace the contents of the dynptr
> > > > > > > arbitrarily by making verifier mark different stack slots than actual
> > > > > > > location and then doing writes to the actual stack address of dynptr at
> > > > > > > runtime.
> > > > > > >
> > > > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > > > > ---
> > > > > > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > > > > > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > > > > > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > > > > > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > > > > > >
> > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > index 8f667180f70f..0fd73f96c5e2 100644
> > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > > > > > >                 verbose(env, "D");
> > > > > > >  }
> > > > > > >
> > > > > > > -static int get_spi(s32 off)
> > > > > > > +static int __get_spi(s32 off)
> > > > > > >  {
> > > > > > >         return (-off - 1) / BPF_REG_SIZE;
> > > > > > >  }
> > > > > > >
> > > > > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > > > > +{
> > > > > > > +       int spi;
> > > > > > > +
> > > > > > > +       if (reg->off % BPF_REG_SIZE) {
> > > > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > > > > > +               return -EINVAL;
> > > > > > > +       }
> > > > > >
> > > > > > I think this cannot happen.
> > > > > >
> > > > >
> > > > > There are existing selftests that trigger this.
> > > >
> > > > Really. Which one is that?
> > > > Those that you've modified in this patch are hitting
> > > > "cannot pass in dynptr..." message from the check below, no?
> > > >
> > >
> > > Just taking one example, invalid_read2 which does:
> > >
> > > bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0);
> > >
> > > does hit this one, it passes fp-15, no var_off.
> > >
> > > Same with invalid_helper2 that was updated.
> > > Same with invalid_offset that was updated.
> > > invalid_write3 gained coverage from this patch, earlier it was probably just
> > > being rejected because of arg_type_is_release checking spilled_ptr.id.
> > > not_valid_dynptr is also hitting this one, not the one below.
> > >
> > > The others now started hitting this error as the order of checks was changed in
> > > the verifier. Since arg_type_is_release checking happens before
> > > process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr.
> > > At that point no checks have been made of the dynptr argument, so dynptr_get_spi
> > > is required to ensure spi is in bounds.
> > >
> > > The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone
> > > is not sufficient. This is why I wrapped everything into dynptr_get_spi.
> >
> > I see. That was not obvious at all that some other patch
> > is removing that check from check_func_arg_reg_off.
> >
> 
> It is done in patch 4. There I move that check from the check_func_arg_reg_off
> to process_dynptr_func.

"Finally, since check_func_arg_reg_off is meant to be generic, move
dynptr specific check into process_dynptr_func."

It's a sign that patch 4 is doing too much. It should be at least two patches.

> 
> > Why is the check there not sufficient?
> >
> 
> I wanted to keep check_func_arg_reg_off free of assumptions for helper specific
> checks. It just ensures a few rules:

Currently it's
        case PTR_TO_STACK:
                if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
it's not really helper specific.

process_dynptr_func may be the right place to check for alignment,
but imo the patch set is doing way too much.
Instead of fixing dynptr specific issues it goes into massive refactoring.
Please do one or the other.
One patch set for refactoring only with no functional changes.
Another patch set with fixes.
Either order is fine.


* Re: [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR
  2022-10-21  0:46               ` Alexei Starovoitov
@ 2022-10-21  1:53                 ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-21  1:53 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

On Fri, Oct 21, 2022 at 06:16:27AM IST, Alexei Starovoitov wrote:
> On Thu, Oct 20, 2022 at 08:53:45AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Thu, Oct 20, 2022 at 08:26:44AM IST, Alexei Starovoitov wrote:
> > > On Wed, Oct 19, 2022 at 7:40 PM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > On Thu, Oct 20, 2022 at 07:43:16AM IST, Alexei Starovoitov wrote:
> > > > > On Wed, Oct 19, 2022 at 6:04 PM Kumar Kartikeya Dwivedi
> > > > > <memxor@gmail.com> wrote:
> > > > > >
> > > > > > On Thu, Oct 20, 2022 at 12:22:56AM IST, Alexei Starovoitov wrote:
> > > > > > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > > > > > <memxor@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Currently, the dynptr function is not checking the variable offset part
> > > > > > > > of PTR_TO_STACK that it needs to check. The fixed offset is considered
> > > > > > > > when computing the stack pointer index, but if the variable offset was
> > > > > > > > not a constant (such that it could not be accumulated in reg->off), we
> > > > > > > > > will end up with a discrepancy where runtime pointer does not point to the
> > > > > > > > actual stack slot we mark as STACK_DYNPTR.
> > > > > > > >
> > > > > > > > It is impossible to precisely track dynptr state when variable offset is
> > > > > > > > not constant, hence, just like bpf_timer, kptr, bpf_spin_lock, etc.
> > > > > > > > simply reject the case where reg->var_off is not constant. Then,
> > > > > > > > consider both reg->off and reg->var_off.value when computing the stack
> > > > > > > > pointer index.
> > > > > > > >
> > > > > > > > A new helper dynptr_get_spi is introduced to hide over these details
> > > > > > > > since the dynptr needs to be located in multiple places outside the
> > > > > > > > process_dynptr_func checks, hence once we know it's a PTR_TO_STACK, we
> > > > > > > > need to enforce these checks in all places.
> > > > > > > >
> > > > > > > > Note that it is disallowed for unprivileged users to have a non-constant
> > > > > > > > var_off, so this problem should only be possible to trigger from
> > > > > > > > programs having CAP_PERFMON. However, its effects can vary.
> > > > > > > >
> > > > > > > > Without the fix, it is possible to replace the contents of the dynptr
> > > > > > > > arbitrarily by making verifier mark different stack slots than actual
> > > > > > > > location and then doing writes to the actual stack address of dynptr at
> > > > > > > > runtime.
> > > > > > > >
> > > > > > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > > > > > ---
> > > > > > > >  kernel/bpf/verifier.c                         | 80 +++++++++++++++----
> > > > > > > >  .../testing/selftests/bpf/prog_tests/dynptr.c |  6 +-
> > > > > > > >  .../bpf/prog_tests/kfunc_dynptr_param.c       |  2 +-
> > > > > > > >  3 files changed, 67 insertions(+), 21 deletions(-)
> > > > > > > >
> > > > > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > > > > index 8f667180f70f..0fd73f96c5e2 100644
> > > > > > > > --- a/kernel/bpf/verifier.c
> > > > > > > > +++ b/kernel/bpf/verifier.c
> > > > > > > > @@ -610,11 +610,34 @@ static void print_liveness(struct bpf_verifier_env *env,
> > > > > > > >                 verbose(env, "D");
> > > > > > > >  }
> > > > > > > >
> > > > > > > > -static int get_spi(s32 off)
> > > > > > > > +static int __get_spi(s32 off)
> > > > > > > >  {
> > > > > > > >         return (-off - 1) / BPF_REG_SIZE;
> > > > > > > >  }
> > > > > > > >
> > > > > > > > +static int dynptr_get_spi(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > > > > > > +{
> > > > > > > > +       int spi;
> > > > > > > > +
> > > > > > > > +       if (reg->off % BPF_REG_SIZE) {
> > > > > > > > +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > > > > > > +               return -EINVAL;
> > > > > > > > +       }
> > > > > > >
> > > > > > > I think this cannot happen.
> > > > > > >
> > > > > >
> > > > > > There are existing selftests that trigger this.
> > > > >
> > > > > Really. Which one is that?
> > > > > Those that you've modified in this patch are hitting
> > > > > "cannot pass in dynptr..." message from the check below, no?
> > > > >
> > > >
> > > > Just taking one example, invalid_read2 which does:
> > > >
> > > > bpf_dynptr_read(read_data, sizeof(read_data), (void *)&ptr + 1, 0, 0);
> > > >
> > > > does hit this one, it passes fp-15, no var_off.
> > > >
> > > > Same with invalid_helper2 that was updated.
> > > > Same with invalid_offset that was updated.
> > > > invalid_write3 gained coverage from this patch, earlier it was probably just
> > > > being rejected because of arg_type_is_release checking spilled_ptr.id.
> > > > not_valid_dynptr is also hitting this one, not the one below.
> > > >
> > > > The others now started hitting this error as the order of checks was changed in
> > > > the verifier. Since arg_type_is_release checking happens before
> > > > process_dynptr_func, it uses dynptr_get_spi to check ref_obj_id of spilled_ptr.
> > > > At that point no checks have been made of the dynptr argument, so dynptr_get_spi
> > > > is required to ensure spi is in bounds.
> > > >
> > > > The reg->off % BPF_REG_SIZE was earlier in check_func_arg_reg_off but that alone
> > > > is not sufficient. This is why I wrapped everything into dynptr_get_spi.
> > >
> > > I see. That was not obvious at all that some other patch
> > > is removing that check from check_func_arg_reg_off.
> > >
> >
> > It is done in patch 4. There I move that check from the check_func_arg_reg_off
> > to process_dynptr_func.
>
> "Finally, since check_func_arg_reg_off is meant to be generic, move
> dynptr specific check into process_dynptr_func."
>
> It's a sign that patch 4 is doing too much. It should be at least two patches.
>
> >
> > > Why is the check there not sufficient?
> > >
> >
> > I wanted to keep check_func_arg_reg_off free of assumptions for helper specific
> > checks. It just ensures a few rules:
>
> Currently it's
>         case PTR_TO_STACK:
>                 if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> it's not really helper specific.
>
> process_dynptr_func may be the right place to check for alignment,
> but imo the patch set is doing way too much.
> Instead of fixing dynptr specific issues it goes into massive refactoring.
> Please do one or the other.
> One patch set for refactoring only with no functional changes.
> Another patch set with fixes.
> Either order is fine.

Ok, I will split it into two. First send the refactorings (and incorporate
feedback based on the discussion), and then the fixes on top of that.

Thanks.


* Re: [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write}
  2022-10-18 13:59 ` [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
@ 2022-10-21 18:12   ` Joanne Koong
  0 siblings, 0 replies; 54+ messages in thread
From: Joanne Koong @ 2022-10-21 18:12 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> It may happen that destination buffer memory overlaps with memory dynptr
> points to. Hence, we must use memmove to correctly copy from dynptr to
> destination buffer, or source buffer to dynptr.
>
> This actually isn't a problem right now, as memcpy implementation falls
> back to memmove on detecting overlap and warns about it, but we
> shouldn't be relying on that.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  kernel/bpf/helpers.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 0a4017eb3616..2dc3f5ce8f9b 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1489,7 +1489,7 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern
>         if (err)
>                 return err;
>
> -       memcpy(dst, src->data + src->offset + offset, len);
> +       memmove(dst, src->data + src->offset + offset, len);
>
>         return 0;
>  }
> @@ -1517,7 +1517,7 @@ BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, v
>         if (err)
>                 return err;
>
> -       memcpy(dst->data + dst->offset + offset, src, len);
> +       memmove(dst->data + dst->offset + offset, src, len);
>
>         return 0;
>  }
> --
> 2.38.0
>
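
As a small illustration of the overlap case in plain userspace C (not the
helper code itself): memmove is defined for overlapping source and destination,
while standard C leaves memcpy undefined there. The buffer contents and the
function name overlap_copy_selfcheck are made up for the example.

```c
#include <assert.h>
#include <string.h>

/* Copy within one buffer so that source and destination ranges overlap. */
void overlap_copy_selfcheck(void)
{
	char buf[] = "abcdefgh";

	/* Source is bytes 0..5, destination is bytes 2..7. */
	memmove(buf + 2, buf, 6);

	/* The original six bytes arrive intact at the new position. */
	assert(memcmp(buf, "ababcdef", 8) == 0);
}
```

The same situation arises when the buffer passed to bpf_dynptr_read or
bpf_dynptr_write lies inside the memory the dynptr points to.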


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-10-18 13:59 ` [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes Kumar Kartikeya Dwivedi
@ 2022-10-21 22:50   ` Joanne Koong
  2022-10-21 22:57     ` Joanne Koong
  2022-10-22  4:08     ` Kumar Kartikeya Dwivedi
  0 siblings, 2 replies; 54+ messages in thread
From: Joanne Koong @ 2022-10-21 22:50 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, while reads are disallowed for dynptr stack slots, writes are
> not. Reads don't work from both direct access and helpers, while writes
> do work in both cases, but have the effect of overwriting the slot_type.
>
> While this is fine, handling for a few edge cases is missing. Firstly,
> a user can overwrite the stack slots of dynptr partially.
>
> Consider the following layout:
> spi: [d][d][?]
>       2  1  0
>
> First slot is at spi 2, second at spi 1.
> Now, do a write of 1 to 8 bytes for spi 1.
>
> This will essentially either write STACK_MISC for all slot_types or
> STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write
> of zeroes). The end result is that slot is scrubbed.
>
> Now, the layout is:
> spi: [d][m][?]
>       2  1  0
>
> Suppose if user initializes spi = 1 as dynptr.
> We get:
> spi: [d][d][d]
>       2  1  0
>
> But this time, both spi 2 and spi 1 have first_slot = true.
>
> Now, when passing spi 2 to dynptr helper, it will consider it as
> initialized as it does not check whether second slot has first_slot ==
> false. And spi 1 should already work as normal.
>
> This effectively replaced size + offset of first dynptr, hence allowing
> invalid OOB reads and writes.
>
> Make a few changes to protect against this:
> When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
> STACK_DYNPTR type, mark both first and second slot (regardless of which
> slot we touch) as STACK_INVALID. Reads are already prevented.
>
> Second, prevent writing to stack memory from helpers if the range may
> contain any STACK_DYNPTR slots. Reads are already prevented.
>
> For helpers, we cannot allow it to destroy dynptrs from the writes as
> depending on arguments, helper may take uninit_mem and dynptr both at
> the same time. This would mean that helper may write to uninit_mem
> before it reads the dynptr, which would be bad.
>
> PTR_TO_MEM: [?????dd]
>
> Depending on the code inside the helper, it may end up overwriting the
> dynptr contents first and then read those as the dynptr argument.
>
> Verifier would only simulate destruction when it does byte by byte
> access simulation in check_helper_call for meta.access_size, and
> fail to catch this case, as it happens after argument checks.
>
> The same would need to be done for any other non-trivial objects created
> on the stack in the future, such as bpf_list_head on stack, or
> bpf_rb_root on stack.
>
> A common misunderstanding in the current code is that MEM_UNINIT means
> writes, but note that writes may also be performed even without
> MEM_UNINIT in case of helpers, in that case the code after handling meta
> && meta->raw_mode will complain when it sees STACK_DYNPTR. So that
> invalid read case also covers writes to potential STACK_DYNPTR slots.
> The only loophole was in case of meta->raw_mode which simulated writes
> through instructions which could overwrite them.
>
> A future series sequenced after this will focus on the clean up of
> helper access checks and bugs around that.

Thanks for your work on this (and on the rest of the stack, which I'm
still working on reviewing).

Regarding writes leading to partial dynptr stack slots, I'm regretting
not having the verifier flat-out reject this in the first place
(instead of allowing it and internally marking the stack slot as
invalid). I think it ends up being more confusing to end users, since
it's not obvious at all that writing to the dynptr on the stack
automatically invalidates it. I'm not sure whether it's too late from
a public API behavior perspective to change this or not.
Anyways, assuming it is too late, I left a few comments below.

>
> Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c | 76 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 76 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 0fd73f96c5e2..89ae384ea6a7 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -740,6 +740,8 @@ static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
>         __mark_dynptr_regs(reg1, NULL, type);
>  }
>
> +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> +                                      struct bpf_func_state *state, int spi);
>
>  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>                                    enum bpf_arg_type arg_type, int insn_idx)
> @@ -755,6 +757,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
>                 return -EINVAL;
>
> +       destroy_stack_slots_dynptr(env, state, spi);
> +       destroy_stack_slots_dynptr(env, state, spi - 1);

I don't think we need these two lines. mark_stack_slots_dynptr() is
called only when an uninitialized dynptr is getting initialized;
is_dynptr_reg_valid_uninit() will already have been called before this
(in check_func_arg()), and it checks that for any uninitialized
dynptr, the stack slot has not already been marked as STACK_DYNPTR.
Maybe I'm missing something in this analysis? What are your thoughts?

> +
>         for (i = 0; i < BPF_REG_SIZE; i++) {
>                 state->stack[spi].slot_type[i] = STACK_DYNPTR;
>                 state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
> @@ -829,6 +834,44 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
>         return 0;
>  }
>
> +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> +                                      struct bpf_func_state *state, int spi)
> +{
> +       int i;
> +
> +       /* We always ensure that STACK_DYNPTR is never set partially,
> +        * hence just checking for slot_type[0] is enough. This is
> +        * different for STACK_SPILL, where it may be only set for
> +        * 1 byte, so code has to use is_spilled_reg.
> +        */
> +       if (state->stack[spi].slot_type[0] != STACK_DYNPTR)
> +               return;
> +       /* Reposition spi to first slot */
> +       if (!state->stack[spi].spilled_ptr.dynptr.first_slot)
> +               spi = spi + 1;
> +
> +       mark_stack_slot_scratched(env, spi);
> +       mark_stack_slot_scratched(env, spi - 1);
> +
> +       /* Writing partially to one dynptr stack slot destroys both. */
> +       for (i = 0; i < BPF_REG_SIZE; i++) {
> +               state->stack[spi].slot_type[i] = STACK_INVALID;
> +               state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> +       }
> +
> +       /* Do not release reference state, we are destroying dynptr on stack,
> +        * not using some helper to release it. Just reset register.
> +        */
> +       __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> +       __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> +
> +       /* Same reason as unmark_stack_slots_dynptr above */
> +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +
> +       return;
> +}

I think it'd be cleaner if we combined this and
unmark_stack_slots_dynptr() into one function. The logic is pretty
much the same except for whether the reference state is released.

> +
>  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
> @@ -3183,6 +3226,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
>                         env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
>         }
>
> +       destroy_stack_slots_dynptr(env, state, spi);

If the stack slot is a dynptr, I think we can just return after this
call, else we do extra work and mark the stack slots as STACK_MISC
(3rd case in the if statement).

> +
>         mark_stack_slot_scratched(env, spi);
>         if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) &&
>             !register_is_null(reg) && env->bpf_capable) {
> @@ -3296,6 +3341,13 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
>         if (err)
>                 return err;
>
> +       for (i = min_off; i < max_off; i++) {
> +               int slot, spi;
> +
> +               slot = -i - 1;
> +               spi = slot / BPF_REG_SIZE;
> +               destroy_stack_slots_dynptr(env, state, spi);
> +       }
>

Instead of calling destroy_stack_slots_dynptr() in
check_stack_write_fixed_off() and check_stack_write_var_off(), I think
check_stack_write() would be a better place to call it. I think
that'd be more efficient as well: if it is a write to a dynptr, we can
return directly after invalidating the stack slot.

>         /* Variable offset writes destroy any spilled pointers in range. */
>         for (i = min_off; i < max_off; i++) {
> @@ -5257,6 +5309,30 @@ static int check_stack_range_initialized(
>         }
>
>         if (meta && meta->raw_mode) {
> +               /* Ensure we won't be overwriting dynptrs when simulating byte
> +                * by byte access in check_helper_call using meta.access_size.
> +                * This would be a problem if we have a helper in the future
> +                * which takes:
> +                *
> +                *      helper(uninit_mem, len, dynptr)
> +                *
> +                * Now, uninint_mem may overlap with dynptr pointer. Hence, it
> +                * may end up writing to dynptr itself when touching memory from
> +                * arg 1. This can be relaxed on a case by case basis for known
> +                * safe cases, but reject due to the possibilitiy of aliasing by
> +                * default.
> +                */
> +               for (i = min_off; i < max_off + access_size; i++) {
> +                       slot = -i - 1;
> +                       spi = slot / BPF_REG_SIZE;

I think we can just use get_spi(i) here

> +                       /* raw_mode may write past allocated_stack */
> +                       if (state->allocated_stack <= slot)
> +                               continue;

break?

> +                       if (state->stack[spi].slot_type[slot % BPF_REG_SIZE] == STACK_DYNPTR) {
> +                               verbose(env, "potential write to dynptr at off=%d disallowed\n", i);
> +                               return -EACCES;
> +                       }
> +               }
>                 meta->access_size = access_size;
>                 meta->regno = regno;
>                 return 0;
> --
> 2.38.0
>


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-10-21 22:50   ` Joanne Koong
@ 2022-10-21 22:57     ` Joanne Koong
  2022-10-22  4:08     ` Kumar Kartikeya Dwivedi
  1 sibling, 0 replies; 54+ messages in thread
From: Joanne Koong @ 2022-10-21 22:57 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

[...]
>
> > +                       /* raw_mode may write past allocated_stack */
> > +                       if (state->allocated_stack <= slot)
> > +                               continue;
>
> break?

nvm, i think this should stay "continue".
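
To spell out one reading of why "continue" is right (the thread does not state the reason, so treat this as an assumption): the loop walks offsets upward from min_off, so slot = -i - 1 shrinks on every iteration. The first bytes visited are the ones furthest from the frame pointer, which may lie past allocated_stack, while later bytes can still be inside it. A "break" would skip those later bytes. The sketch below is a userspace model with hypothetical names (`range_hits_dynptr`, a flat per-byte `dynptr_byte` array), not verifier code.

```c
#include <assert.h>
#include <stdbool.h>

/* Returns true if any byte at stack offsets [min_off, end_off) lands on a
 * byte marked as belonging to a dynptr. Offsets are negative, as in BPF
 * stack accesses relative to the frame pointer. dynptr_byte[] is indexed
 * by byte slot (slot = -off - 1) and covers allocated_stack bytes.
 */
static bool range_hits_dynptr(const bool *dynptr_byte, int allocated_stack,
			      int min_off, int end_off)
{
	assert(min_off <= end_off && end_off <= 0);

	for (int i = min_off; i < end_off; i++) {
		int slot = -i - 1;

		/* Bytes past the allocated stack cannot hold a dynptr, but
		 * bytes visited later (smaller slot) may be allocated, so
		 * skip this byte instead of ending the loop.
		 */
		if (allocated_stack <= slot)
			continue;
		if (dynptr_byte[slot])
			return true;
	}
	return false;
}
```

With 16 allocated bytes all marked as dynptr and a range covering offsets -24 to -9, the first eight bytes are skipped and the ninth one reports the hit; a "break" at the first skipped byte would have returned false.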

>
> > +                       if (state->stack[spi].slot_type[slot % BPF_REG_SIZE] == STACK_DYNPTR) {
> > +                               verbose(env, "potential write to dynptr at off=%d disallowed\n", i);
> > +                               return -EACCES;
> > +                       }
> > +               }
> >                 meta->access_size = access_size;
> >                 meta->regno = regno;
> >                 return 0;
> > --
> > 2.38.0
> >


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-10-21 22:50   ` Joanne Koong
  2022-10-21 22:57     ` Joanne Koong
@ 2022-10-22  4:08     ` Kumar Kartikeya Dwivedi
  2022-11-03 14:07       ` Joanne Koong
  1 sibling, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-10-22  4:08 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Sat, Oct 22, 2022 at 04:20:28AM IST, Joanne Koong wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > [...]
>
> thanks for your work on this (and on the rest of the stack, which I'm
> still working on reviewing)
>
> Regarding writes leading to partial dynptr stack slots, I'm regretting
> not having the verifier flat-out reject this in the first place
> (instead of it being allowed but internally the stack slot gets marked
> as invalid) - I think it overall ends up being more confusing to end
> users, where there it's not obvious at all that writing to the dynptr
> on the stack automatically invalidates it. I'm not sure whether it's
> too late from a public API behavior perspective to change this or not.

It would be incorrect to reject writes into dynptrs whose reference is not
tracked by the verifier (so bpf_dynptr_from_mem), because the compiler would be
free to reuse the stack space for some other variable when the local dynptr
variable's lifetime ends, and the verifier would have no way to know when the
variable went out of scope.

I feel it is also incorrect to refuse bpf_dynptr_from_mem when an unreferenced
dynptr already exists in those slots. Right now the verifier sees STACK_DYNPTR
in the slot_type and fails. But consider something like this:

void prog(void)
{
	{
		struct bpf_dynptr ptr;
		bpf_dynptr_from_mem(...);
		...
	}

	...

	{
		struct bpf_dynptr ptr;
		bpf_dynptr_from_mem(...);
	}
}

The program is valid, but if ptr in both scopes share the same stack slots, the
call in the second scope would fail because verifier would see STACK_DYNPTR in
slot_type.

It is fine though to simply reject writes in case of dynptrs obtained from
bpf_ringbuf_reserve_dynptr, because if they are overwritten before being
released, it will end up being an error later due to unreleased reference state.
The lifetime of the object in this case is being controlled using BPF helpers
explicitly.

So I think rejecting writes is OK in the second (reference-tracked) case, and it
is unaffected by backward compatibility constraints. It wouldn't have been
possible for the unref case even when you started out with this.
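
As a rough model of the two rules being compared (a sketch with made-up names, not verifier code): a "strict" init that refuses any slot already holding a dynptr, which is how the current STACK_DYNPTR check is described above, and a "relaxed" init that only refuses when the existing dynptr is reference-tracked. The `pol_*` types and functions are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* What a pair of stack slots currently holds in this toy model. */
enum pol_kind {
	POL_NONE,	/* no dynptr */
	POL_LOCAL,	/* unreferenced, e.g. from bpf_dynptr_from_mem */
	POL_RINGBUF,	/* reference-tracked, e.g. from bpf_ringbuf_reserve_dynptr */
};

struct pol_slot {
	enum pol_kind kind;
};

/* Strict rule: refuse to initialize over any existing dynptr. */
static bool pol_init_strict(struct pol_slot *s, enum pol_kind new_kind)
{
	assert(s);
	if (s->kind != POL_NONE)
		return false;
	s->kind = new_kind;
	return true;
}

/* Relaxed rule: only a reference-tracked dynptr blocks reuse of the slot. */
static bool pol_init_relaxed(struct pol_slot *s, enum pol_kind new_kind)
{
	assert(s);
	if (s->kind == POL_RINGBUF)
		return false;
	s->kind = new_kind;
	return true;
}
```

Under the strict rule, the second bpf_dynptr_from_mem in the two-scope program fails if both variables share stack slots; under the relaxed rule it succeeds, while an unreleased ringbuf dynptr still blocks reuse.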

> Anyways, assuming it is too late, I left a few comments below.
>
> >
> > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 76 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 76 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 0fd73f96c5e2..89ae384ea6a7 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -740,6 +740,8 @@ static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> >         __mark_dynptr_regs(reg1, NULL, type);
> >  }
> >
> > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > +                                      struct bpf_func_state *state, int spi);
> >
> >  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> >                                    enum bpf_arg_type arg_type, int insn_idx)
> > @@ -755,6 +757,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> >         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
> >                 return -EINVAL;
> >
> > +       destroy_stack_slots_dynptr(env, state, spi);
> > +       destroy_stack_slots_dynptr(env, state, spi - 1);
>
> I don't think we need these two lines. mark_stack_slots_dynptr() is
> called only in the case where an uninitialized dynptr is getting
> initialized; is_dynptr_reg_valid_uninit() will have already been
> called prior to this (in check_func_arg()), where
> is_dynptr_reg_valid_uninit() will have checked that for any
> uninitialized dynptr, the stack slot has not already been marked as
> STACK_DYNPTR. Maybe I'm missing something in this analysis? What are
> your thoughts?
>

You're right, it shouldn't be needed here now.
In case of insn writes we already destroy both slots of a pair.

If we decide to allow mark_stack_slots_dynptr on STACK_DYNPTR that is
unreferenced, per the discussion above, I will keep it, because it would be
needed then, otherwise I will drop it.

> > +
> >         for (i = 0; i < BPF_REG_SIZE; i++) {
> >                 state->stack[spi].slot_type[i] = STACK_DYNPTR;
> >                 state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
> > @@ -829,6 +834,44 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> >         return 0;
> >  }
> >
> > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > +                                      struct bpf_func_state *state, int spi)
> > +{
> > +       int i;
> > +
> > +       /* We always ensure that STACK_DYNPTR is never set partially,
> > +        * hence just checking for slot_type[0] is enough. This is
> > +        * different for STACK_SPILL, where it may be only set for
> > +        * 1 byte, so code has to use is_spilled_reg.
> > +        */
> > +       if (state->stack[spi].slot_type[0] != STACK_DYNPTR)
> > +               return;
> > +       /* Reposition spi to first slot */
> > +       if (!state->stack[spi].spilled_ptr.dynptr.first_slot)
> > +               spi = spi + 1;
> > +
> > +       mark_stack_slot_scratched(env, spi);
> > +       mark_stack_slot_scratched(env, spi - 1);
> > +
> > +       /* Writing partially to one dynptr stack slot destroys both. */
> > +       for (i = 0; i < BPF_REG_SIZE; i++) {
> > +               state->stack[spi].slot_type[i] = STACK_INVALID;
> > +               state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> > +       }
> > +
> > +       /* Do not release reference state, we are destroying dynptr on stack,
> > +        * not using some helper to release it. Just reset register.
> > +        */
> > +       __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > +       __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > +
> > +       /* Same reason as unmark_stack_slots_dynptr above */
> > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +
> > +       return;
> > +}
>
> I think it'd be cleaner if we combined this and
> unmark_stack_slots_dynptr() into one function. The logic is pretty
> much the same except for if the reference state should be released.
>

Ack, will do. I can put this logic in a common function and both could be
callers of that, passing true/false, so it remains readable while avoiding the
duplication.
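
Roughly the shape I have in mind, as a userspace sketch rather than the final code: one shared helper takes a bool saying whether to release the reference, and the two existing entry points become thin wrappers. All `demo_*` names and the `live_refs` counter are hypothetical simplifications of the verifier's reference tracking.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_slot_type { DEMO_INVALID, DEMO_DYNPTR };

struct demo_state {
	enum demo_slot_type slot[2];	/* the two slots backing one dynptr */
	int ref_obj_id;			/* nonzero while the slot tracks a reference */
	int live_refs;			/* toy count of acquired, unreleased references */
};

/* Shared logic: invalidate both slots, optionally releasing the reference. */
static void demo_invalidate_dynptr(struct demo_state *st, bool release_ref)
{
	assert(st);
	if (release_ref && st->ref_obj_id)
		st->live_refs--;
	st->slot[0] = DEMO_INVALID;
	st->slot[1] = DEMO_INVALID;
	st->ref_obj_id = 0;
}

/* Release through a helper: the reference goes away with the dynptr. */
static void demo_unmark_dynptr(struct demo_state *st)
{
	demo_invalidate_dynptr(st, true);
}

/* Stray write: slots are scrubbed, the reference stays outstanding. */
static void demo_destroy_dynptr(struct demo_state *st)
{
	demo_invalidate_dynptr(st, false);
}
```

In the destroy case the outstanding count stays nonzero, which models the unreleased reference being reported later.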

> > +
> >  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >         struct bpf_func_state *state = func(env, reg);
> > @@ -3183,6 +3226,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
> >                         env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
> >         }
> >
> > +       destroy_stack_slots_dynptr(env, state, spi);
>
> If the stack slot is a dynptr, I think we can just return after this
> call, else we do extra work and mark the stack slots as STACK_MISC
> (3rd case in the if statement).
>

That is the intention here. The destroy_stack_slots_dynptr overwrites two slots,
while we still simulate the write to the slot being written to.

[?][d][d]
 2  1  0

If I wrote to spi = 1, it would now be [?][m][?].
Earlier it would have been [?][m][d].

Any stray write (either fixed or variable offset) to a dynptr slot ends the
lifetime of the dynptr object, so both slots representing the dynptr object need
to be invalidated.

But the write itself needs to happen, and its state has to be reflected in the
stack state for those particular slot(s).

The main point here is to prevent partial destruction, which allows manifesting
the case described in the commit log. Writing to one slot of the two
representing a dynptr invalidates both.
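
A small userspace model of that sequence, for illustration only: the stray write first destroys both slots of the pair, then the write itself is recorded on the slot that was touched. The `toy_*` names, the four-slot stack, and the three slot types are assumptions made for this sketch and are much simpler than the verifier's state.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_SLOTS 4

enum toy_slot_type { TOY_INVALID, TOY_MISC, TOY_DYNPTR };

struct toy_slot {
	enum toy_slot_type type;
	bool first_slot;	/* only meaningful when type == TOY_DYNPTR */
};

/* Mark spi (first slot) and spi - 1 (second slot) as one dynptr. */
static void toy_mark_dynptr(struct toy_slot *stack, int spi)
{
	assert(spi >= 1 && spi < TOY_NR_SLOTS);
	stack[spi].type = TOY_DYNPTR;
	stack[spi].first_slot = true;
	stack[spi - 1].type = TOY_DYNPTR;
	stack[spi - 1].first_slot = false;
}

/* If spi belongs to a dynptr, invalidate both slots of the pair. */
static void toy_destroy_dynptr(struct toy_slot *stack, int spi)
{
	if (stack[spi].type != TOY_DYNPTR)
		return;
	/* Reposition spi to the first slot of the pair. */
	if (!stack[spi].first_slot)
		spi = spi + 1;
	stack[spi].type = TOY_INVALID;
	stack[spi].first_slot = false;
	stack[spi - 1].type = TOY_INVALID;
	stack[spi - 1].first_slot = false;
}

/* Simulate a store to one slot: destroy any dynptr, then record the write. */
static void toy_write_slot(struct toy_slot *stack, int spi)
{
	assert(spi >= 0 && spi < TOY_NR_SLOTS);
	toy_destroy_dynptr(stack, spi);
	stack[spi].type = TOY_MISC;
}
```

Starting from [?][d][d] with the dynptr at spi 1 and 0, `toy_write_slot(stack, 1)` leaves spi 1 as misc and spi 0 as invalid, matching the [?][m][?] layout above.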

> > +
> >         mark_stack_slot_scratched(env, spi);
> >         if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) &&
> >             !register_is_null(reg) && env->bpf_capable) {
> > @@ -3296,6 +3341,13 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
> >         if (err)
> >                 return err;
> >
> > +       for (i = min_off; i < max_off; i++) {
> > +               int slot, spi;
> > +
> > +               slot = -i - 1;
> > +               spi = slot / BPF_REG_SIZE;
> > +               destroy_stack_slots_dynptr(env, state, spi);
> > +       }
> >
>
> Instead of calling destroy_stack_slots_dynptr() in
> check_stack_write_fixed_off() and check_stack_write_var_off(), I think
> calling it from check_stack_write() would be a better place. I think
> that'd be more efficient as well where if it is a write to a dynptr,
> we can directly return after invalidating the stack slot.
>

We cannot directly return, as explained above.

> >         /* Variable offset writes destroy any spilled pointers in range. */
> >         for (i = min_off; i < max_off; i++) {
> > @@ -5257,6 +5309,30 @@ static int check_stack_range_initialized(
> >         }
> >
> >         if (meta && meta->raw_mode) {
> > +               /* Ensure we won't be overwriting dynptrs when simulating byte
> > +                * by byte access in check_helper_call using meta.access_size.
> > +                * This would be a problem if we have a helper in the future
> > +                * which takes:
> > +                *
> > +                *      helper(uninit_mem, len, dynptr)
> > +                *
> > +                * Now, uninint_mem may overlap with dynptr pointer. Hence, it
> > +                * may end up writing to dynptr itself when touching memory from
> > +                * arg 1. This can be relaxed on a case by case basis for known
> > +                * safe cases, but reject due to the possibilitiy of aliasing by
> > +                * default.
> > +                */
> > +               for (i = min_off; i < max_off + access_size; i++) {
> > +                       slot = -i - 1;
> > +                       spi = slot / BPF_REG_SIZE;
>
> I think we can just use get_spi(i) here
>

Ack.

> > +                       /* raw_mode may write past allocated_stack */
> > +                       if (state->allocated_stack <= slot)
> > +                               continue;
>
> break?
>

I think you realised why it's continue in your other reply :).


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-10-22  4:08     ` Kumar Kartikeya Dwivedi
@ 2022-11-03 14:07       ` Joanne Koong
  2022-11-04 22:14         ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Joanne Koong @ 2022-11-03 14:07 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Sat, Oct 22, 2022 at 5:08 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Sat, Oct 22, 2022 at 04:20:28AM IST, Joanne Koong wrote:
> > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > [...]
> >
> > thanks for your work on this (and on the rest of the stack, which I'm
> > still working on reviewing)
> >
> > Regarding writes leading to partial dynptr stack slots, I'm regretting
> > not having the verifier flat-out reject this in the first place
> > (instead of it being allowed but internally the stack slot gets marked
> > as invalid) - I think it overall ends up being more confusing to end
> > users, where there it's not obvious at all that writing to the dynptr
> > on the stack automatically invalidates it. I'm not sure whether it's
> > too late from a public API behavior perspective to change this or not.
>
> It would be incorrect to reject writes into dynptrs whose reference is not
> tracked by the verifier (so bpf_dynptr_from_mem), because the compiler would be
> free to reuse the stack space for some other variable when the local dynptr
> variable's lifetime ends, and the verifier would have no way to know when the
> variable went out of scope.
>
> I feel it is also incorrect to refuse bpf_dynptr_from_mem where unref dynptr
> already exists as well. Right now it sees STACK_DYNPTR in the slot_type and
> fails. But consider something like this:
>
> void prog(void)
> {
>         {
>                 struct bpf_dynptr ptr;
>                 bpf_dynptr_from_mem(...);
>                 ...
>         }
>
>         ...
>
>         {
>                 struct bpf_dynptr ptr;
>                 bpf_dynptr_from_mem(...);
>         }
> }
>
> The program is valid, but if ptr in both scopes share the same stack slots, the
> call in the second scope would fail because verifier would see STACK_DYNPTR in
> slot_type.
>
> It is fine though to simply reject writes in case of dynptrs obtained from
> bpf_ringbuf_reserve_dynptr, because if they are overwritten before being
> released, it will end up being an error later due to unreleased reference state.
> The lifetime of the object in this case is being controlled using BPF helpers
> explicitly.
>
> So I think it is ok to do in the second case, and it is unaffected by backward
> compatibility constraints. It wouldn't have been possible for the unref case
> even when you started out with this.

I see! I didn't realize the compiler can reuse the stack slot for
different variables within the same stack frame. I agree with your
thoughts.

>
> > Anyways, assuming it is too late, I left a few comments below.
> >
> > >
> > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 76 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 76 insertions(+)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 0fd73f96c5e2..89ae384ea6a7 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -740,6 +740,8 @@ static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> > >         __mark_dynptr_regs(reg1, NULL, type);
> > >  }
> > >
> > > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > > +                                      struct bpf_func_state *state, int spi);
> > >
> > >  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > >                                    enum bpf_arg_type arg_type, int insn_idx)
> > > @@ -755,6 +757,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> > >         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
> > >                 return -EINVAL;
> > >
> > > +       destroy_stack_slots_dynptr(env, state, spi);
> > > +       destroy_stack_slots_dynptr(env, state, spi - 1);
> >
> > I don't think we need these two lines. mark_stack_slots_dynptr() is
> > called only in the case where an uninitialized dynptr is getting
> > initialized; is_dynptr_reg_valid_uninit() will have already been
> > called prior to this (in check_func_arg()), where
> > is_dynptr_reg_valid_uninit() will have checked that for any
> > uninitialized dynptr, the stack slot has not already been marked as
> > STACK_DYNPTR. Maybe I'm missing something in this analysis? What are
> > your thoughts?
> >
>
> You're right, it shouldn't be needed here now.
> In case of insn writes we already destroy both slots of a pair.
>
> If we decide to allow mark_stack_slots_dynptr on STACK_DYNPTR that is
> unreferenced, per the discussion above, I will keep it, because it would be
> needed then, otherwise I will drop it.

I think we should remove these two lines from this patch and have the
code for allowing mark_stack_slots_dynptr on unreferenced
STACK_DYNPTRs as a separate patch since that will also require changes
to is_dynptr_reg_valid_uninit().

>
> > > +
> > >         for (i = 0; i < BPF_REG_SIZE; i++) {
> > >                 state->stack[spi].slot_type[i] = STACK_DYNPTR;
> > >                 state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
> > > @@ -829,6 +834,44 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> > >         return 0;
> > >  }
> > >
> > > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > > +                                      struct bpf_func_state *state, int spi)
> > > +{
> > > +       int i;
> > > +
> > > +       /* We always ensure that STACK_DYNPTR is never set partially,
> > > +        * hence just checking for slot_type[0] is enough. This is
> > > +        * different for STACK_SPILL, where it may be only set for
> > > +        * 1 byte, so code has to use is_spilled_reg.
> > > +        */
> > > +       if (state->stack[spi].slot_type[0] != STACK_DYNPTR)
> > > +               return;
> > > +       /* Reposition spi to first slot */
> > > +       if (!state->stack[spi].spilled_ptr.dynptr.first_slot)
> > > +               spi = spi + 1;
> > > +
> > > +       mark_stack_slot_scratched(env, spi);
> > > +       mark_stack_slot_scratched(env, spi - 1);
> > > +
> > > +       /* Writing partially to one dynptr stack slot destroys both. */
> > > +       for (i = 0; i < BPF_REG_SIZE; i++) {
> > > +               state->stack[spi].slot_type[i] = STACK_INVALID;
> > > +               state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> > > +       }
> > > +
> > > +       /* Do not release reference state, we are destroying dynptr on stack,
> > > +        * not using some helper to release it. Just reset register.
> > > +        */
> > > +       __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > > +       __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > > +
> > > +       /* Same reason as unmark_stack_slots_dynptr above */
> > > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +
> > > +       return;
> > > +}
> >
> > I think it'd be cleaner if we combined this and
> > unmark_stack_slots_dynptr() into one function. The logic is pretty
> > much the same except for if the reference state should be released.
> >
>
> Ack, will do. I can put this logic in a common function and both could be
> callers of that, passing true/false, so it remains readable while avoiding the
> duplication.
>
> > > +
> > >  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > >  {
> > >         struct bpf_func_state *state = func(env, reg);
> > > @@ -3183,6 +3226,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
> > >                         env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
> > >         }
> > >
> > > +       destroy_stack_slots_dynptr(env, state, spi);
> >
> > If the stack slot is a dynptr, I think we can just return after this
> > call, else we do extra work and mark the stack slots as STACK_MISC
> > (3rd case in the if statement).
> >
>
> That is the intention here. The destroy_stack_slots_dynptr overwrites two slots,
> while we still simulate the write to the slot being written to.
>
> [?][d][d]
>  2  1  0
>
> If I wrote to spi = 1, it would now be [?][m][?].
> Earlier it would have been [?][m][d].
>
> Any stray write (either fixed or variable offset) to a dynptr slot ends the
> lifetime of the dynptr object, so both slots representing the dynptr object need
> to be invalidated.
>
> But the write itself needs to happen, and its state has to be reflected in the
> stack state for those particular slot(s).
>
> The main point here is to prevent partial destruction, which allows manifesting
> the case described in the commit log. Writing to one slot of the two
> representing a dynptr invalidates both.

Returning after the destroy_stack_slots_dynptr call would overwrite
both slots; my previous point was that we could return after calling
this instead of also going through the 3rd if statement below. But I
just realized that we do need to go through the 3rd if statement since
the stack slots need to be STACK_MISC, not STACK_INVALID.
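
To make the slot states concrete, here is a toy userspace model of the
behaviour discussed above. It is only a sketch, not verifier code: it keeps
one tag per 8-byte slot instead of the per-byte slot_type[] array, and all
toy_* names are made up for illustration. It shows why the write still has to
be simulated after both dynptr slots are invalidated.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, one tag per stack slot. */
#define TOY_NR_SLOTS 3

enum toy_slot { TOY_INVALID, TOY_MISC, TOY_DYNPTR };

struct toy_stack {
	enum toy_slot type[TOY_NR_SLOTS];
	bool first_slot[TOY_NR_SLOTS];
};

/* Invalidate both slots of the dynptr covering spi, if there is one. */
static void toy_destroy_dynptr(struct toy_stack *s, int spi)
{
	if (s->type[spi] != TOY_DYNPTR)
		return;
	/* Reposition spi to the first slot; the second slot is spi - 1. */
	if (!s->first_slot[spi])
		spi = spi + 1;
	assert(spi >= 1 && spi < TOY_NR_SLOTS);

	s->type[spi] = TOY_INVALID;
	s->type[spi - 1] = TOY_INVALID;
	s->first_slot[spi] = false;
	s->first_slot[spi - 1] = false;
}

/* Simulate an instruction write that touches slot spi. */
static void toy_write(struct toy_stack *s, int spi)
{
	/* A stray write ends the lifetime of the whole dynptr... */
	toy_destroy_dynptr(s, spi);
	/* ...but the written slot itself now holds ordinary data. */
	s->type[spi] = TOY_MISC;
}
```

Starting from [?][d][d] (dynptr in spi 1 and 0, first slot at spi 1), calling
toy_write() on spi 1 leaves [?][m][?] in this model, rather than [?][m][d] or
[?][?][?].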

>
> > > +
> > >         mark_stack_slot_scratched(env, spi);
> > >         if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) &&
> > >             !register_is_null(reg) && env->bpf_capable) {
> > > @@ -3296,6 +3341,13 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
> > >         if (err)
> > >                 return err;
> > >
> > > +       for (i = min_off; i < max_off; i++) {
> > > +               int slot, spi;
> > > +
> > > +               slot = -i - 1;
> > > +               spi = slot / BPF_REG_SIZE;
> > > +               destroy_stack_slots_dynptr(env, state, spi);
> > > +       }
> > >
> >
> > Instead of calling destroy_stack_slots_dynptr() in
> > check_stack_write_fixed_off() and check_stack_write_var_off(), I think
> > calling it from check_stack_write() would be a better place. I think
> > that'd be more efficient as well where if it is a write to a dynptr,
> > we can directly return after invalidating the stack slot.
> >
>
> We cannot directly return, as explained above.
>
> > >         /* Variable offset writes destroy any spilled pointers in range. */
> > >         for (i = min_off; i < max_off; i++) {
> > > @@ -5257,6 +5309,30 @@ static int check_stack_range_initialized(
> > >         }
> > >
> > >         if (meta && meta->raw_mode) {
> > > +               /* Ensure we won't be overwriting dynptrs when simulating byte
> > > +                * by byte access in check_helper_call using meta.access_size.
> > > +                * This would be a problem if we have a helper in the future
> > > +                * which takes:
> > > +                *
> > > +                *      helper(uninit_mem, len, dynptr)
> > > +                *
> > > +                * Now, uninit_mem may overlap with dynptr pointer. Hence, it
> > > +                * may end up writing to dynptr itself when touching memory from
> > > +                * arg 1. This can be relaxed on a case by case basis for known
> > > +                * safe cases, but reject due to the possibility of aliasing by
> > > +                * default.
> > > +                */
> > > +               for (i = min_off; i < max_off + access_size; i++) {
> > > +                       slot = -i - 1;
> > > +                       spi = slot / BPF_REG_SIZE;
> >
> > I think we can just use get_spi(i) here
> >
>
> Ack.
>
> > > +                       /* raw_mode may write past allocated_stack */
> > > +                       if (state->allocated_stack <= slot)
> > > +                               continue;
> >
> > break?
> >
>
> I think you realised why it's continue in your other reply :).


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-11-03 14:07       ` Joanne Koong
@ 2022-11-04 22:14         ` Andrii Nakryiko
  2022-11-04 23:02           ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Andrii Nakryiko @ 2022-11-04 22:14 UTC (permalink / raw)
  To: Joanne Koong
  Cc: Kumar Kartikeya Dwivedi, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau, David Vernet

On Thu, Nov 3, 2022 at 7:07 AM Joanne Koong <joannelkoong@gmail.com> wrote:
>
> On Sat, Oct 22, 2022 at 5:08 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Sat, Oct 22, 2022 at 04:20:28AM IST, Joanne Koong wrote:
> > > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > Currently, while reads are disallowed for dynptr stack slots, writes are
> > > > not. Reads don't work from both direct access and helpers, while writes
> > > > do work in both cases, but have the effect of overwriting the slot_type.
> > > >
> > > > While this is fine, handling for a few edge cases is missing. Firstly,
> > > > a user can overwrite the stack slots of dynptr partially.
> > > >
> > > > Consider the following layout:
> > > > spi: [d][d][?]
> > > >       2  1  0
> > > >
> > > > First slot is at spi 2, second at spi 1.
> > > > Now, do a write of 1 to 8 bytes for spi 1.
> > > >
> > > > This will essentially either write STACK_MISC for all slot_types or
> > > > STACK_MISC and STACK_ZERO (in case of size < BPF_REG_SIZE partial write
> > > > of zeroes). The end result is that slot is scrubbed.
> > > >
> > > > Now, the layout is:
> > > > spi: [d][m][?]
> > > >       2  1  0
> > > >
> > > > Suppose if user initializes spi = 1 as dynptr.
> > > > We get:
> > > > spi: [d][d][d]
> > > >       2  1  0
> > > >
> > > > But this time, both spi 2 and spi 1 have first_slot = true.
> > > >
> > > > Now, when passing spi 2 to dynptr helper, it will consider it as
> > > > initialized as it does not check whether second slot has first_slot ==
> > > > false. And spi 1 should already work as normal.
> > > >
> > > > This effectively replaced size + offset of first dynptr, hence allowing
> > > > invalid OOB reads and writes.
> > > >
> > > > Make a few changes to protect against this:
> > > > When writing to PTR_TO_STACK using BPF insns, when we touch spi of a
> > > > STACK_DYNPTR type, mark both first and second slot (regardless of which
> > > > slot we touch) as STACK_INVALID. Reads are already prevented.
> > > >
> > > > Second, prevent writing to stack memory from helpers if the range may
> > > > contain any STACK_DYNPTR slots. Reads are already prevented.
> > > >
> > > > For helpers, we cannot allow it to destroy dynptrs from the writes as
> > > > depending on arguments, helper may take uninit_mem and dynptr both at
> > > > the same time. This would mean that helper may write to uninit_mem
> > > > before it reads the dynptr, which would be bad.
> > > >
> > > > PTR_TO_MEM: [?????dd]
> > > >
> > > > Depending on the code inside the helper, it may end up overwriting the
> > > > dynptr contents first and then read those as the dynptr argument.
> > > >
> > > > Verifier would only simulate destruction when it does byte by byte
> > > > access simulation in check_helper_call for meta.access_size, and
> > > > fail to catch this case, as it happens after argument checks.
> > > >
> > > > The same would need to be done for any other non-trivial objects created
> > > > on the stack in the future, such as bpf_list_head on stack, or
> > > > bpf_rb_root on stack.
> > > >
> > > > A common misunderstanding in the current code is that MEM_UNINIT means
> > > > writes, but note that writes may also be performed even without
> > > > MEM_UNINIT in case of helpers, in that case the code after handling meta
> > > > && meta->raw_mode will complain when it sees STACK_DYNPTR. So that
> > > > invalid read case also covers writes to potential STACK_DYNPTR slots.
> > > > The only loophole was in case of meta->raw_mode which simulated writes
> > > > through instructions which could overwrite them.
> > > >
> > > > A future series sequenced after this will focus on the clean up of
> > > > helper access checks and bugs around that.
> > >
> > > thanks for your work on this (and on the rest of the stack, which I'm
> > > still working on reviewing)
> > >
> > > Regarding writes leading to partial dynptr stack slots, I'm regretting
> > > not having the verifier flat-out reject this in the first place
> > > (instead of it being allowed but internally the stack slot gets marked
> > > as invalid) - I think it overall ends up being more confusing to end
> > > > users, where it's not obvious at all that writing to the dynptr
> > > on the stack automatically invalidates it. I'm not sure whether it's
> > > too late from a public API behavior perspective to change this or not.
> >
> > It would be incorrect to reject writes into dynptrs whose reference is not
> > tracked by the verifier (so bpf_dynptr_from_mem), because the compiler would be
> > free to reuse the stack space for some other variable when the local dynptr
> > variable's lifetime ends, and the verifier would have no way to know when the
> > variable went out of scope.
> >
> > I feel it is also incorrect to refuse bpf_dynptr_from_mem where unref dynptr
> > already exists as well. Right now it sees STACK_DYNPTR in the slot_type and
> > fails. But consider something like this:
> >
> > void prog(void)
> > {
> >         {
> >                 struct bpf_dynptr ptr;
> >                 bpf_dynptr_from_mem(...);
> >                 ...
> >         }
> >
> >         ...
> >
> >         {
> >                 struct bpf_dynptr ptr;
> >                 bpf_dynptr_from_mem(...);
> >         }
> > }
> >
> > The program is valid, but if ptr in both scopes share the same stack slots, the
> > call in the second scope would fail because verifier would see STACK_DYNPTR in
> > slot_type.

I don't think compiler is allowed to reuse the same stack slot for
those two ptrs, because we are passing a pointer to it into a
black-box bpf_dynptr_from_mem() function, so kernel can't assume that
this slot is free to be reused just because no one is accessing it
after bpf_dynptr_from_mem (I think?)

Would it make sense to allow *optional* bpf_dynptr_free (or is it
bpf_dynptr_put, not sure) for non-reference-tracked dynptrs if indeed
we wanted to reuse the same stack variable for multiple dynptrs,
though?

> >
> > It is fine though to simply reject writes in case of dynptrs obtained from
> > bpf_ringbuf_reserve_dynptr, because if they are overwritten before being
> > released, it will end up being an error later due to unreleased reference state.
> > The lifetime of the object in this case is being controlled using BPF helpers
> > explicitly.
> >
> > So I think it is ok to do in the second case, and it is unaffected by backward
> > compatibility constraints. It wouldn't have been possible for the unref case
> > even when you started out with this.
>
> I see! I didn't realize the compiler can reuse the stack slot for
> different variables within the same stack frame. I agree with your
> thoughts.
>
> >
> > > > Anyways, assuming it is too late, I left a few comments below.
> > >
> > > >
> > > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > ---
> > > >  kernel/bpf/verifier.c | 76 +++++++++++++++++++++++++++++++++++++++++++
> > > >  1 file changed, 76 insertions(+)
> > > >
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index 0fd73f96c5e2..89ae384ea6a7 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -740,6 +740,8 @@ static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> > > >         __mark_dynptr_regs(reg1, NULL, type);
> > > >  }
> > > >
> > > > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > > > +                                      struct bpf_func_state *state, int spi);
> > > >
> > > >  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> > > >                                    enum bpf_arg_type arg_type, int insn_idx)
> > > > @@ -755,6 +757,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> > > >         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
> > > >                 return -EINVAL;
> > > >
> > > > +       destroy_stack_slots_dynptr(env, state, spi);
> > > > +       destroy_stack_slots_dynptr(env, state, spi - 1);
> > >
> > > I don't think we need these two lines. mark_stack_slots_dynptr() is
> > > called only in the case where an uninitialized dynptr is getting
> > > initialized; is_dynptr_reg_valid_uninit() will have already been
> > > called prior to this (in check_func_arg()), where
> > > is_dynptr_reg_valid_uninit() will have checked that for any
> > > uninitialized dynptr, the stack slot has not already been marked as
> > > > STACK_DYNPTR. Maybe I'm missing something in this analysis? What are
> > > your thoughts?
> > >
> >
> > You're right, it shouldn't be needed here now.
> > In case of insn writes we already destroy both slots of a pair.
> >
> > If we decide to allow mark_stack_slots_dynptr on STACK_DYNPTR that is
> > unreferenced, per the discussion above, I will keep it, because it would be
> > needed then, otherwise I will drop it.
>
> I think we should remove these two lines from this patch and have the
> code for allowing mark_stack_slots_dynptr on unreferenced
> STACK_DYNPTRs as a separate patch since that will also require changes
> to is_dynptr_reg_valid_uninit().
>
> >
> > > > +
> > > >         for (i = 0; i < BPF_REG_SIZE; i++) {
> > > >                 state->stack[spi].slot_type[i] = STACK_DYNPTR;
> > > >                 state->stack[spi - 1].slot_type[i] = STACK_DYNPTR;
> > > > @@ -829,6 +834,44 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> > > >         return 0;
> > > >  }
> > > >
> > > > +static void destroy_stack_slots_dynptr(struct bpf_verifier_env *env,
> > > > +                                      struct bpf_func_state *state, int spi)
> > > > +{
> > > > +       int i;
> > > > +
> > > > +       /* We always ensure that STACK_DYNPTR is never set partially,
> > > > +        * hence just checking for slot_type[0] is enough. This is
> > > > +        * different for STACK_SPILL, where it may be only set for
> > > > +        * 1 byte, so code has to use is_spilled_reg.
> > > > +        */
> > > > +       if (state->stack[spi].slot_type[0] != STACK_DYNPTR)
> > > > +               return;
> > > > +       /* Reposition spi to first slot */
> > > > +       if (!state->stack[spi].spilled_ptr.dynptr.first_slot)
> > > > +               spi = spi + 1;
> > > > +
> > > > +       mark_stack_slot_scratched(env, spi);
> > > > +       mark_stack_slot_scratched(env, spi - 1);
> > > > +
> > > > +       /* Writing partially to one dynptr stack slot destroys both. */
> > > > +       for (i = 0; i < BPF_REG_SIZE; i++) {
> > > > +               state->stack[spi].slot_type[i] = STACK_INVALID;
> > > > +               state->stack[spi - 1].slot_type[i] = STACK_INVALID;
> > > > +       }
> > > > +
> > > > +       /* Do not release reference state, we are destroying dynptr on stack,
> > > > +        * not using some helper to release it. Just reset register.
> > > > +        */
> > > > +       __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > > > +       __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > > > +
> > > > +       /* Same reason as unmark_stack_slots_dynptr above */
> > > > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > > +
> > > > +       return;
> > > > +}
> > >
> > > I think it'd be cleaner if we combined this and
> > > unmark_stack_slots_dynptr() into one function. The logic is pretty
> > > much the same except for if the reference state should be released.
> > >
> >
> > Ack, will do. I can put this logic in a common function and both could be
> > callers of that, passing true/false, so it remains readable while avoiding the
> > duplication.
> >
> > > > +
> > > >  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > >  {
> > > >         struct bpf_func_state *state = func(env, reg);
> > > > @@ -3183,6 +3226,8 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
> > > >                         env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
> > > >         }
> > > >
> > > > +       destroy_stack_slots_dynptr(env, state, spi);
> > >
> > > If the stack slot is a dynptr, I think we can just return after this
> > > call, else we do extra work and mark the stack slots as STACK_MISC
> > > (3rd case in the if statement).
> > >
> >
> > That is the intention here. The destroy_stack_slots_dynptr overwrites two slots,
> > while we still simulate the write to the slot being written to.
> >
> > [?][d][d]
> >  2  1  0
> >
> > If I wrote to spi = 1, it would now be [?][m][?].
> > Earlier it would have been [?][m][d].
> >
> > Any stray write (either fixed or variable offset) to a dynptr slot ends the
> > lifetime of the dynptr object, so both slots representing the dynptr object need
> > to be invalidated.
> >
> > But the write itself needs to happen, and its state has to be reflected in the
> > stack state for those particular slot(s).
> >
> > The main point here is to prevent partial destruction, which allows manifesting
> > the case described in the commit log. Writing to one slot of the two
> > representing a dynptr invalidates both.
>
> Returning after the destroy_stack_slots_dynptr call would overwrite
> both slots; my previous point was that we could return after calling
> this instead of also going through the 3rd if statement below. But I
> just realized that we do need to go through the 3rd if statement since
> the stack slots need to be STACK_MISC, not STACK_INVALID.
>
> >
> > > > +
> > > >         mark_stack_slot_scratched(env, spi);
> > > >         if (reg && !(off % BPF_REG_SIZE) && register_is_bounded(reg) &&
> > > >             !register_is_null(reg) && env->bpf_capable) {
> > > > @@ -3296,6 +3341,13 @@ static int check_stack_write_var_off(struct bpf_verifier_env *env,
> > > >         if (err)
> > > >                 return err;
> > > >
> > > > +       for (i = min_off; i < max_off; i++) {
> > > > +               int slot, spi;
> > > > +
> > > > +               slot = -i - 1;
> > > > +               spi = slot / BPF_REG_SIZE;
> > > > +               destroy_stack_slots_dynptr(env, state, spi);
> > > > +       }
> > > >
> > >
> > > Instead of calling destroy_stack_slots_dynptr() in
> > > check_stack_write_fixed_off() and check_stack_write_var_off(), I think
> > > calling it from check_stack_write() would be a better place. I think
> > > that'd be more efficient as well where if it is a write to a dynptr,
> > > we can directly return after invalidating the stack slot.
> > >
> >
> > We cannot directly return, as explained above.
> >
> > > >         /* Variable offset writes destroy any spilled pointers in range. */
> > > >         for (i = min_off; i < max_off; i++) {
> > > > @@ -5257,6 +5309,30 @@ static int check_stack_range_initialized(
> > > >         }
> > > >
> > > >         if (meta && meta->raw_mode) {
> > > > +               /* Ensure we won't be overwriting dynptrs when simulating byte
> > > > +                * by byte access in check_helper_call using meta.access_size.
> > > > +                * This would be a problem if we have a helper in the future
> > > > +                * which takes:
> > > > +                *
> > > > +                *      helper(uninit_mem, len, dynptr)
> > > > +                *
> > > > > +                * Now, uninit_mem may overlap with dynptr pointer. Hence, it
> > > > > +                * may end up writing to dynptr itself when touching memory from
> > > > > +                * arg 1. This can be relaxed on a case by case basis for known
> > > > > +                * safe cases, but reject due to the possibility of aliasing by
> > > > +                * default.
> > > > +                */
> > > > +               for (i = min_off; i < max_off + access_size; i++) {
> > > > +                       slot = -i - 1;
> > > > +                       spi = slot / BPF_REG_SIZE;
> > >
> > > I think we can just use get_spi(i) here
> > >
> >
> > Ack.
> >
> > > > +                       /* raw_mode may write past allocated_stack */
> > > > +                       if (state->allocated_stack <= slot)
> > > > +                               continue;
> > >
> > > break?
> > >
> >
> > I think you realised why it's continue in your other reply :).


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-11-04 22:14         ` Andrii Nakryiko
@ 2022-11-04 23:02           ` Kumar Kartikeya Dwivedi
  2022-11-04 23:08             ` Andrii Nakryiko
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-04 23:02 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Joanne Koong, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, David Vernet

On Sat, Nov 05, 2022 at 03:44:53AM IST, Andrii Nakryiko wrote:
> On Thu, Nov 3, 2022 at 7:07 AM Joanne Koong <joannelkoong@gmail.com> wrote:
> >
> > On Sat, Oct 22, 2022 at 5:08 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Sat, Oct 22, 2022 at 04:20:28AM IST, Joanne Koong wrote:
> > > > [...]
> > > >
> > > > thanks for your work on this (and on the rest of the stack, which I'm
> > > > still working on reviewing)
> > > >
> > > > Regarding writes leading to partial dynptr stack slots, I'm regretting
> > > > not having the verifier flat-out reject this in the first place
> > > > (instead of it being allowed but internally the stack slot gets marked
> > > > as invalid) - I think it overall ends up being more confusing to end
> > > > > users, where it's not obvious at all that writing to the dynptr
> > > > on the stack automatically invalidates it. I'm not sure whether it's
> > > > too late from a public API behavior perspective to change this or not.
> > >
> > > It would be incorrect to reject writes into dynptrs whose reference is not
> > > tracked by the verifier (so bpf_dynptr_from_mem), because the compiler would be
> > > free to reuse the stack space for some other variable when the local dynptr
> > > variable's lifetime ends, and the verifier would have no way to know when the
> > > variable went out of scope.
> > >
> > > I feel it is also incorrect to refuse bpf_dynptr_from_mem where unref dynptr
> > > already exists as well. Right now it sees STACK_DYNPTR in the slot_type and
> > > fails. But consider something like this:
> > >
> > > void prog(void)
> > > {
> > >         {
> > >                 struct bpf_dynptr ptr;
> > >                 bpf_dynptr_from_mem(...);
> > >                 ...
> > >         }
> > >
> > >         ...
> > >
> > >         {
> > >                 struct bpf_dynptr ptr;
> > >                 bpf_dynptr_from_mem(...);
> > >         }
> > > }
> > >
> > > The program is valid, but if ptr in both scopes share the same stack slots, the
> > > call in the second scope would fail because verifier would see STACK_DYNPTR in
> > > slot_type.
>
> I don't think compiler is allowed to reuse the same stack slot for
> those two ptrs, because we are passing a pointer to it into a
> black-box bpf_dynptr_from_mem() function, so kernel can't assume that
> this slot is free to be reused just because no one is accessing it
> after bpf_dynptr_from_mem (I think?)
>

At the C level, once the lifetime of the object ends upon execution going out of
its enclosing scope, even if its pointer escaped, the ending of the lifetime
implicitly invalidates any such pointers (as per the abstract machine), so the
compiler is well within its rights (because using such a pointer any more is UB)
to reuse the same storage for another object.

E.g. https://godbolt.org/z/Wcvzfbfbr

For the following:

struct bpf_dynptr {
	unsigned long a, b;
};

extern void bpf_dynptr_func(struct bpf_dynptr *);
extern void bpf_dynptr_ro(const struct bpf_dynptr *);

int main(void)
{
	{
		struct bpf_dynptr d2;

		bpf_dynptr_func(&d2);
		bpf_dynptr_ro(&d2);
	}
	{
		struct bpf_dynptr d3;

		bpf_dynptr_func(&d3);
		bpf_dynptr_ro(&d3);
	}
}

clang produces:

main:                                   # @main
        r6 = r10
        r6 += -16
        call bpf_dynptr_func
        call bpf_dynptr_ro
        r6 = r10
        r6 += -16
        call bpf_dynptr_func
        call bpf_dynptr_ro
        exit

> Would it make sense to allow *optional* bpf_dynptr_free (or is it
> bpf_dynptr_put, not sure) for non-reference-tracked dynptrs if indeed
> we wanted to reuse the same stack variable for multiple dynptrs,
> though?

I think it's fine to simply overwrite the type of object when reusing the same
stack slot.

For some precedent:

This is what compilers essentially do for effective(C)/dynamic(C++) types, for
instance GCC will simply overwrite the effective type of the object (even for
declared objects, even though standard only permits overwriting it for allocated
objects) whenever you do a store using a lvalue of some type T or memcpy
(transferring the effective type to dst).

For a more concrete analogy, "Storage reuse" in
https://en.cppreference.com/w/cpp/language/lifetime describes how placement new
simply reuses the storage of trivially destructible objects without requiring
the destructor to be called. Because this is C++, if you replace the storage of
a non-trivially destructible object, it must be switched back to the original
type before the runtime calls the destructors emitted implicitly for the
declared type.

So in BPF terms, local dynptr (dynptr_from_mem) is trivially destructible, but
ringbuf dynptr requires its 'destructor' to be called before storage is reused.

There are two choices in the second case: either the verifier complains as soon
as such a ringbuf dynptr's storage is reused without calling its release
function, or it complains later when it sees an unreleased reference. I think
complaining early is better, as otherwise the user has to correlate the insn_idx
of the leaked reference. The program is going to fail in both cases anyway, so
it makes sense to give a better error back to the user.
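
A minimal sketch of the "complain early" choice, written as a toy model and not
taken from the verifier: every name below (toy_slot, toy_reuse_slot,
toy_release_slot) is hypothetical. It only illustrates that a slot holding an
unreferenced (local) dynptr can simply be overwritten, while a slot holding a
reference-tracked (ringbuf) dynptr is rejected at the point of reuse until it
has been released.

```c
/* Hypothetical slot states; the real verifier tracks far more than this. */
enum toy_slot_kind {
	TOY_SLOT_EMPTY,
	TOY_SLOT_LOCAL_DYNPTR,   /* e.g. from bpf_dynptr_from_mem: no reference held */
	TOY_SLOT_RINGBUF_DYNPTR, /* holds a reference that must be released first */
};

struct toy_slot {
	enum toy_slot_kind kind;
};

/*
 * Reinitialize a slot as new_kind.
 * Returns 0 on success, -1 if the slot still holds an unreleased
 * reference-tracked dynptr (the "complain early" case).
 */
static int toy_reuse_slot(struct toy_slot *slot, enum toy_slot_kind new_kind)
{
	if (slot->kind == TOY_SLOT_RINGBUF_DYNPTR)
		return -1;
	/* Empty or "trivially destructible" contents: just overwrite the type. */
	slot->kind = new_kind;
	return 0;
}

/* Release the reference held by a ringbuf dynptr; -1 if there is none. */
static int toy_release_slot(struct toy_slot *slot)
{
	if (slot->kind != TOY_SLOT_RINGBUF_DYNPTR)
		return -1;
	slot->kind = TOY_SLOT_EMPTY;
	return 0;
}
```

With this model, the error is reported at the reuse site itself, so there is no
need to trace a leaked reference back to the instruction that acquired it.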


* Re: [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes
  2022-11-04 23:02           ` Kumar Kartikeya Dwivedi
@ 2022-11-04 23:08             ` Andrii Nakryiko
  0 siblings, 0 replies; 54+ messages in thread
From: Andrii Nakryiko @ 2022-11-04 23:08 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Joanne Koong, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, David Vernet

On Fri, Nov 4, 2022 at 4:02 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> On Sat, Nov 05, 2022 at 03:44:53AM IST, Andrii Nakryiko wrote:
> > On Thu, Nov 3, 2022 at 7:07 AM Joanne Koong <joannelkoong@gmail.com> wrote:
> > >
> > > On Sat, Oct 22, 2022 at 5:08 AM Kumar Kartikeya Dwivedi
> > > <memxor@gmail.com> wrote:
> > > >
> > > > On Sat, Oct 22, 2022 at 04:20:28AM IST, Joanne Koong wrote:
> > > > > [...]
> > > > >
> > > > > thanks for your work on this (and on the rest of the stack, which I'm
> > > > > still working on reviewing)
> > > > >
> > > > > Regarding writes leading to partial dynptr stack slots, I'm regretting
> > > > > not having the verifier flat-out reject this in the first place
> > > > > (instead of it being allowed but internally the stack slot gets marked
> > > > > as invalid) - I think it overall ends up being more confusing to end
> > > > > users, where there it's not obvious at all that writing to the dynptr
> > > > > on the stack automatically invalidates it. I'm not sure whether it's
> > > > > too late from a public API behavior perspective to change this or not.
> > > >
> > > > It would be incorrect to reject writes into dynptrs whose reference is not
> > > > tracked by the verifier (so bpf_dynptr_from_mem), because the compiler would be
> > > > free to reuse the stack space for some other variable when the local dynptr
> > > > variable's lifetime ends, and the verifier would have no way to know when the
> > > > variable went out of scope.
> > > >
> > > > I feel it is also incorrect to refuse bpf_dynptr_from_mem where unref dynptr
> > > > already exists as well. Right now it sees STACK_DYNPTR in the slot_type and
> > > > fails. But consider something like this:
> > > >
> > > > void prog(void)
> > > > {
> > > >         {
> > > >                 struct bpf_dynptr ptr;
> > > >                 bpf_dynptr_from_mem(...);
> > > >                 ...
> > > >         }
> > > >
> > > >         ...
> > > >
> > > >         {
> > > >                 struct bpf_dynptr ptr;
> > > >                 bpf_dynptr_from_mem(...);
> > > >         }
> > > > }
> > > >
> > > > The program is valid, but if ptr in both scopes share the same stack slots, the
> > > > call in the second scope would fail because verifier would see STACK_DYNPTR in
> > > > slot_type.
> >
> > I don't think compiler is allowed to reuse the same stack slot for
> > those two ptrs, because we are passing a pointer to it into a
> > black-box bpf_dynptr_from_mem() function, so kernel can't assume that
> > this slot is free to be reused just because no one is accessing it
> > after bpf_dynptr_from_mem (I think?)
> >
>
> At the C level, once the lifetime of the object ends upon execution going out of
> its enclosing scope, even if its pointer escaped, the ending of the lifetime
> implicitly invalidates any such pointers (as per the abstract machine), so the
> compiler is well within its rights (because using such a pointer any more is UB)
> to reuse the same storage for another object.
>
> E.g. https://godbolt.org/z/Wcvzfbfbr
>
> For the following:
>
> struct bpf_dynptr {
>         unsigned long a, b;
> };
>
> extern void bpf_dynptr_func(struct bpf_dynptr *);
> extern void bpf_dynptr_ro(const struct bpf_dynptr *);
>
> int main(void)
> {
>         {
>                 struct bpf_dynptr d2;
>
>                 bpf_dynptr_func(&d2);
>                 bpf_dynptr_ro(&d2);
>         }
>         {
>                 struct bpf_dynptr d3;
>
>                 bpf_dynptr_func(&d3);
>                 bpf_dynptr_ro(&d3);
>         }
> }

oh, I completely missed that it's nested lexical scopes, doh. Never
mind what I said then.

>
> clang produces:
>
> main:                                   # @main
>         r6 = r10
>         r6 += -16
>         call bpf_dynptr_func
>         call bpf_dynptr_ro
>         r6 = r10
>         r6 += -16
>         call bpf_dynptr_func
>         call bpf_dynptr_ro
>         exit
>
> > Would it make sense to allow *optional* bpf_dynptr_free (or is it
> > bpf_dynptr_put, not sure) for non-reference-tracked dynptrs if indeed
> > we wanted to reuse the same stack variable for multiple dynptrs,
> > though?
>
> I think it's fine to simply overwrite the type of object when reusing the same
> stack slot.
>
> For some precedent:
>
> This is what compilers essentially do for effective(C)/dynamic(C++) types, for
> instance GCC will simply overwrite the effective type of the object (even for
> declared objects, even though standard only permits overwriting it for allocated
> objects) whenever you do a store using a lvalue of some type T or memcpy
> (transferring the effective type to dst).
>
> For a more concrete analogy, Storage Reuse in
> https://en.cppreference.com/w/cpp/language/lifetime describes how placement new
> simply reuses storage of trivially destructible objects without requiring the
> destructor to be called. Since it is C++ if you replace storage of non-trivial
> object it must be switched back to the same type before the runtime calls the
> original destructors emitted implicitly for declared type.
>
> So in BPF terms, local dynptr (dynptr_from_mem) is trivially destructible, but
> ringbuf dynptr requires its 'destructor' to be called before storage is reused.
>
> There are two choices in the second case, either complain when such a ringbuf
> dynptr's storage is reused without calling its release function, or the verifier
> will complain later when seeing an unreleased reference. I think complaining
> early is better as later the user has to correlate insn_idx of leaked reference.
> The program is going to fail in both cases anyway, so it makes sense to give a
> better error back to the user.

Agreed.


* Re: [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  2022-10-18 13:59 ` [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM Kumar Kartikeya Dwivedi
  2022-10-18 21:38   ` sdf
@ 2022-11-07 22:35   ` Joanne Koong
  2022-11-07 23:12     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 54+ messages in thread
From: Joanne Koong @ 2022-11-07 22:35 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM, and
> RET_PTR_TO_ALLOC_MEM_OR_NULL, however the former is confusingly named to
> imply that it carries MEM_ALLOC, while only the latter does. This causes
> confusion during code review leading to conclusions like that the return
> value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is RET_PTR_TO_ALLOC_MEM |
> PTR_MAYBE_NULL) may be consumable by bpf_ringbuf_{submit,commit}.
>
> Rename it to make it clear MEM_ALLOC needs to be tacked on top of
> RET_PTR_TO_MEM.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf.h   | 6 +++---
>  kernel/bpf/verifier.c | 2 +-
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 13c6ff2de540..834276ba56c9 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -538,7 +538,7 @@ enum bpf_return_type {
>         RET_PTR_TO_SOCKET,              /* returns a pointer to a socket */
>         RET_PTR_TO_TCP_SOCK,            /* returns a pointer to a tcp_sock */
>         RET_PTR_TO_SOCK_COMMON,         /* returns a pointer to a sock_common */
> -       RET_PTR_TO_ALLOC_MEM,           /* returns a pointer to dynamically allocated memory */
> +       RET_PTR_TO_MEM,                 /* returns a pointer to dynamically allocated memory */
>         RET_PTR_TO_MEM_OR_BTF_ID,       /* returns a pointer to a valid memory or a btf_id */
>         RET_PTR_TO_BTF_ID,              /* returns a pointer to a btf_id */
>         __BPF_RET_TYPE_MAX,
> @@ -548,8 +548,8 @@ enum bpf_return_type {
>         RET_PTR_TO_SOCKET_OR_NULL       = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
>         RET_PTR_TO_TCP_SOCK_OR_NULL     = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
>         RET_PTR_TO_SOCK_COMMON_OR_NULL  = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
> -       RET_PTR_TO_ALLOC_MEM_OR_NULL    = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
> -       RET_PTR_TO_DYNPTR_MEM_OR_NULL   = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
> +       RET_PTR_TO_ALLOC_MEM_OR_NULL    = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_MEM,

Can you also rename this to RET_PTR_TO_RINGBUF_MEM_OR_NULL instead of
RET_PTR_TO_ALLOC_MEM_OR_NULL, and MEM_RINGBUF instead of MEM_ALLOC?
RET_PTR_TO_ALLOC_MEM_OR_NULL only pertains to ringbuf records, not
generic dynamically allocated memory, so I think this rename would
make this a lot more clear.

> +       RET_PTR_TO_DYNPTR_MEM_OR_NULL   = PTR_MAYBE_NULL | RET_PTR_TO_MEM,
>         RET_PTR_TO_BTF_ID_OR_NULL       = PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
>
>         /* This must be the last entry. Its purpose is to ensure the enum is
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 87d9cccd1623..a49b95c1af1b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7612,7 +7612,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                 mark_reg_known_zero(env, regs, BPF_REG_0);
>                 regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag;
>                 break;
> -       case RET_PTR_TO_ALLOC_MEM:
> +       case RET_PTR_TO_MEM:
>                 mark_reg_known_zero(env, regs, BPF_REG_0);
>                 regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
>                 regs[BPF_REG_0].mem_size = meta.mem_size;
> --
> 2.38.0
>


* Re: [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM
  2022-11-07 22:35   ` Joanne Koong
@ 2022-11-07 23:12     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-07 23:12 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Nov 08, 2022 at 04:05:22AM IST, Joanne Koong wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > Currently, the verifier has two return types, RET_PTR_TO_ALLOC_MEM, and
> > RET_PTR_TO_ALLOC_MEM_OR_NULL, however the former is confusingly named to
> > imply that it carries MEM_ALLOC, while only the latter does. This causes
> > confusion during code review leading to conclusions like that the return
> > value of RET_PTR_TO_DYNPTR_MEM_OR_NULL (which is RET_PTR_TO_ALLOC_MEM |
> > PTR_MAYBE_NULL) may be consumable by bpf_ringbuf_{submit,commit}.
> >
> > Rename it to make it clear MEM_ALLOC needs to be tacked on top of
> > RET_PTR_TO_MEM.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf.h   | 6 +++---
> >  kernel/bpf/verifier.c | 2 +-
> >  2 files changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 13c6ff2de540..834276ba56c9 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -538,7 +538,7 @@ enum bpf_return_type {
> >         RET_PTR_TO_SOCKET,              /* returns a pointer to a socket */
> >         RET_PTR_TO_TCP_SOCK,            /* returns a pointer to a tcp_sock */
> >         RET_PTR_TO_SOCK_COMMON,         /* returns a pointer to a sock_common */
> > -       RET_PTR_TO_ALLOC_MEM,           /* returns a pointer to dynamically allocated memory */
> > +       RET_PTR_TO_MEM,                 /* returns a pointer to dynamically allocated memory */
> >         RET_PTR_TO_MEM_OR_BTF_ID,       /* returns a pointer to a valid memory or a btf_id */
> >         RET_PTR_TO_BTF_ID,              /* returns a pointer to a btf_id */
> >         __BPF_RET_TYPE_MAX,
> > @@ -548,8 +548,8 @@ enum bpf_return_type {
> >         RET_PTR_TO_SOCKET_OR_NULL       = PTR_MAYBE_NULL | RET_PTR_TO_SOCKET,
> >         RET_PTR_TO_TCP_SOCK_OR_NULL     = PTR_MAYBE_NULL | RET_PTR_TO_TCP_SOCK,
> >         RET_PTR_TO_SOCK_COMMON_OR_NULL  = PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
> > -       RET_PTR_TO_ALLOC_MEM_OR_NULL    = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
> > -       RET_PTR_TO_DYNPTR_MEM_OR_NULL   = PTR_MAYBE_NULL | RET_PTR_TO_ALLOC_MEM,
> > +       RET_PTR_TO_ALLOC_MEM_OR_NULL    = PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_MEM,
>
> Can you also rename this to RET_PTR_TO_RINGBUF_MEM_OR_NULL instead of
> RET_PTR_TO_ALLOC_MEM_OR_NULL, and MEM_RINGBUF instead of MEM_ALLOC?
> RET_PTR_TO_ALLOC_MEM_OR_NULL only pertains to ringbuf records, not
> generic dynamically allocated memory, so I think this rename would
> make this a lot more clear.
>

I have posted it here: https://lore.kernel.org/bpf/20221107230950.7117-6-memxor@gmail.com


* Re: [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off
  2022-10-18 13:59 ` [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
  2022-10-18 21:55   ` sdf
@ 2022-11-07 23:17   ` Joanne Koong
  2022-11-08 18:22     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 54+ messages in thread
From: Joanne Koong @ 2022-11-07 23:17 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> While check_func_arg_reg_off is the place which performs generic checks
> needed by various candidates of reg->type, there is some handling for
> special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> ARG_PTR_TO_ALLOC_MEM.
>
> This commit aims to streamline these special cases and instead leave
> other things up to argument type specific code to handle.
>
> This is done primarily for two reasons: associating back reg->type to
> its argument leaves room for the list getting out of sync when a new
> reg->type is supported by an arg_type.
>
> The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> we already handle, whenever a release argument is expected, it should
> be passed as the pointer that was received from the acquire function.
> Hence zero fixed and variable offset.
>
> There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> with non-zero offset to other helper functions, which makes sense.
>
> Hence, lift the arg_type_is_release check for reg->off and cover all
> possible register types, instead of duplicating the same kind of check
> twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
>
> Finally, for the release argument, arg_type_is_dynptr is the special
> case, where we go to actual object being freed through the dynptr, so
> the offset of the pointer still needs to allow fixed and variable offset
> and process_dynptr_func will verify them later for the release argument
> case as well.
>
> Finally, since check_func_arg_reg_off is meant to be generic, move
> dynptr specific check into process_dynptr_func.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c                         | 55 +++++++++++++++----
>  .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
>  2 files changed, 44 insertions(+), 13 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a49b95c1af1b..a8c277e51d63 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5654,6 +5654,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                 return -EFAULT;
>         }
>
> +       /* CONST_PTR_TO_DYNPTR has fixed and variable offset as zero, ensured by
> +        * check_func_arg_reg_off, so this is only needed for PTR_TO_STACK.
> +        */
> +       if (reg->off % BPF_REG_SIZE) {
> +               verbose(env, "cannot pass in dynptr at an offset\n");
> +               return -EINVAL;
> +       }
> +

Imo, this logic belongs more in check_func_arg_reg_off(). It's cleaner
to me to have all the logic for reg->off checking consolidated in one
place.

>         /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>          * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>          *
> @@ -5672,6 +5680,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>          *               destroyed, including mutation of the memory it points
>          *               to.
>          */
> +
>         if (arg_type & MEM_UNINIT) {
>                 if (!is_dynptr_reg_valid_uninit(env, reg)) {
>                         verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> @@ -5983,14 +5992,37 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         enum bpf_reg_type type = reg->type;
>         bool fixed_off_ok = false;
>
> -       switch ((u32)type) {
> -       /* Pointer types where reg offset is explicitly allowed: */
> -       case PTR_TO_STACK:
> -               if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -                       verbose(env, "cannot pass in dynptr at an offset\n");
> +       /* When referenced register is passed to release function, it's fixed
> +        * offset must be 0.
> +        *
> +        * We will check arg_type_is_release reg has ref_obj_id when storing
> +        * meta->release_regno.
> +        */
> +       if (arg_type_is_release(arg_type)) {
> +               /* ARG_PTR_TO_DYNPTR is a bit special, as it may not directly
> +                * point to the object being released, but to dynptr pointing
> +                * to such object, which might be at some offset on the stack.
> +                *
> +                * In that case, we simply to fallback to the default handling.
> +                */
> +               if (arg_type_is_dynptr(arg_type) && type == PTR_TO_STACK)

Do we need the "arg_type_is_dynptr(arg_type)" part? I think just "if
(type == PTR_TO_STACK)" suffices since any release args on the stack
will be at some fp offset.

> +                       goto check_type;

I think this logic is a lot simpler to read:

if (arg_type_is_release(arg_type)) {
    if (type != PTR_TO_STACK) {
        if (reg->off) {
            verbose(env, "R%d must have zero offset...");
            return -EINVAL;
        }
        return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
    }
}

> +               /* Going straight to check will catch this because fixed_off_ok
> +                * is false, but checking here allows us to give the user a
> +                * better error message.
> +                */
> +               if (reg->off) {
> +                       verbose(env, "R%d must have zero offset when passed to release func\n",
> +                               regno);
>                         return -EINVAL;
>                 }
> -               fallthrough;
> +               goto check;

I think it's cleaner here to just "return __check_ptr_off_reg(env,
reg, regno, fixed_off_ok);" instead of adding the goto check.

> +       }
> +check_type:
> +       switch ((u32)type) {

btw I don't think we need this (u32) cast. type is an enum
bpf_reg_type, which is by default a u32.

> +       /* Pointer types where both fixed and variable reg offset is explicitly
> +        * allowed: */
> +       case PTR_TO_STACK:
>         case PTR_TO_PACKET:
>         case PTR_TO_PACKET_META:
>         case PTR_TO_MAP_KEY:
> @@ -6001,12 +6033,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         case PTR_TO_BUF:
>         case PTR_TO_BUF | MEM_RDONLY:
>         case SCALAR_VALUE:
> -               /* Some of the argument types nevertheless require a
> -                * zero register offset.
> -                */
> -               if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> -                       return 0;
> -               break;
> +               return 0;
>         /* All the rest must be rejected, except PTR_TO_BTF_ID which allows
>          * fixed offset.
>          */

We should also remove the "if (arg_type_is_release(arg_type) &&
reg->off)" code in the PTR_TO_BTF_ID case.

> @@ -6023,12 +6050,16 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>                 /* For arg is release pointer, fixed_off_ok must be false, but
>                  * we already checked and rejected reg->off != 0 above, so set
>                  * to true to allow fixed offset for all other cases.
> +                *
> +                * var_off always must be 0 for PTR_TO_BTF_ID, hence we still
> +                * need to do checks instead of returning.
>                  */
>                 fixed_off_ok = true;
>                 break;
>         default:
>                 break;
>         }
> +check:
>         return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>
> diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
> index b64d33e4833c..92e3f6a61a79 100644
> --- a/tools/testing/selftests/bpf/verifier/ringbuf.c
> +++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
> @@ -28,7 +28,7 @@
>         },
>         .fixup_map_ringbuf = { 1 },
>         .result = REJECT,
> -       .errstr = "dereference of modified alloc_mem ptr R1",
> +       .errstr = "R1 must have zero offset when passed to release func",
>  },
>  {
>         "ringbuf: invalid reservation offset 2",
> --
> 2.38.0
>


* Re: [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off
  2022-11-07 23:17   ` Joanne Koong
@ 2022-11-08 18:22     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-08 18:22 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Nov 08, 2022 at 04:47:08AM IST, Joanne Koong wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > While check_func_arg_reg_off is the place which performs generic checks
> > needed by various candidates of reg->type, there is some handling for
> > special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> > ARG_PTR_TO_ALLOC_MEM.
> >
> > This commit aims to streamline these special cases and instead leave
> > other things up to argument type specific code to handle.
> >
> > This is done primarily for two reasons: associating back reg->type to
> > its argument leaves room for the list getting out of sync when a new
> > reg->type is supported by an arg_type.
> >
> > The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> > we already handle, whenever a release argument is expected, it should
> > be passed as the pointer that was received from the acquire function.
> > Hence zero fixed and variable offset.
> >
> > There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> > its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> > with non-zero offset to other helper functions, which makes sense.
> >
> > Hence, lift the arg_type_is_release check for reg->off and cover all
> > possible register types, instead of duplicating the same kind of check
> > twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
> >
> > Finally, for the release argument, arg_type_is_dynptr is the special
> > case, where we go to actual object being freed through the dynptr, so
> > the offset of the pointer still needs to allow fixed and variable offset
> > and process_dynptr_func will verify them later for the release argument
> > case as well.
> >
> > Finally, since check_func_arg_reg_off is meant to be generic, move
> > dynptr specific check into process_dynptr_func.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c                         | 55 +++++++++++++++----
> >  .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
> >  2 files changed, 44 insertions(+), 13 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a49b95c1af1b..a8c277e51d63 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5654,6 +5654,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >                 return -EFAULT;
> >         }
> >
> > +       /* CONST_PTR_TO_DYNPTR has fixed and variable offset as zero, ensured by
> > +        * check_func_arg_reg_off, so this is only needed for PTR_TO_STACK.
> > +        */
> > +       if (reg->off % BPF_REG_SIZE) {
> > +               verbose(env, "cannot pass in dynptr at an offset\n");
> > +               return -EINVAL;
> > +       }
> > +
>
> Imo, this logic belongs more in check_func_arg_reg_off(). It's cleaner
> to me to have all the logic for reg->off checking consolidated in one
> place.
>

I think this alignment requirement is specific to dynptr, so it should be here.
My idea with this patch was to only enforce offset rules per register type that
don't hardcode any assumptions about what each helper does with that register
type. Each ARG_TYPE_* can then further build upon what this function guarantees
about the register type.

e.g. PTR_TO_MAP_VALUE has no restriction requiring a constant var_off, but a
lot of helpers require it to be const. It wouldn't make sense to move their
helper specific offset checks to this function. Same reasoning here.
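
A minimal sketch of this layering, as a toy model and not the verifier code:
the function names and the toy_reg_type enum are hypothetical, and only the
value 8 for the register size mirrors BPF_REG_SIZE. The generic layer accepts
any fixed offset for a stack pointer, and the dynptr-specific layer then adds
its own alignment rule on top.

```c
#define TOY_BPF_REG_SIZE 8

enum toy_reg_type {
	TOY_PTR_TO_STACK,
	TOY_OTHER,
};

/* Generic layer: per-register-type rule, with no knowledge of the helper. */
static int toy_generic_off_check(enum toy_reg_type type, int off)
{
	if (type == TOY_PTR_TO_STACK)
		return 0;            /* any fixed offset is allowed */
	return off ? -1 : 0;         /* toy rule: everything else needs zero offset */
}

/* Argument-specific layer: a dynptr must start on a register-sized boundary. */
static int toy_dynptr_off_check(enum toy_reg_type type, int off)
{
	int err = toy_generic_off_check(type, off);

	if (err)
		return err;
	/* Stack offsets are negative; in C, -12 % 8 is -4, which is non-zero. */
	if (off % TOY_BPF_REG_SIZE)
		return -1;           /* "cannot pass in dynptr at an offset" */
	return 0;
}
```

Here toy_dynptr_off_check(TOY_PTR_TO_STACK, -16) succeeds while an offset of
-12 is rejected, even though the generic layer accepts both.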

> >         /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
> >          * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> >          *
> > @@ -5672,6 +5680,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >          *               destroyed, including mutation of the memory it points
> >          *               to.
> >          */
> > +
> >         if (arg_type & MEM_UNINIT) {
> >                 if (!is_dynptr_reg_valid_uninit(env, reg)) {
> >                         verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> > @@ -5983,14 +5992,37 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >         enum bpf_reg_type type = reg->type;
> >         bool fixed_off_ok = false;
> >
> > -       switch ((u32)type) {
> > -       /* Pointer types where reg offset is explicitly allowed: */
> > -       case PTR_TO_STACK:
> > -               if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> > -                       verbose(env, "cannot pass in dynptr at an offset\n");
> > +       /* When referenced register is passed to release function, it's fixed
> > +        * offset must be 0.
> > +        *
> > +        * We will check arg_type_is_release reg has ref_obj_id when storing
> > +        * meta->release_regno.
> > +        */
> > +       if (arg_type_is_release(arg_type)) {
> > +               /* ARG_PTR_TO_DYNPTR is a bit special, as it may not directly
> > +                * point to the object being released, but to dynptr pointing
> > +                * to such object, which might be at some offset on the stack.
> > +                *
> > +                * In that case, we simply to fallback to the default handling.
> > +                */
> > +               if (arg_type_is_dynptr(arg_type) && type == PTR_TO_STACK)
>
> Do we need the "arg_type_is_dynptr(arg_type)" part? I think just "if
> (type == PTR_TO_STACK)" suffices since any release args on the stack
> will be at some fp offset.
>

Just being more careful here. We can drop it once we have more cases, but it's
better IMO to be more restrictive by default to prevent things from slipping
through.

In the future there will be more helpers that work similarly to dynptr (e.g.
initializing a bpf_list_head on the stack); then we can abstract them behind a
common check.

> > +                       goto check_type;
>
> I think this logic is a lot simpler to read:
>
> if (arg_type_is_release(arg_type)) {
>     if (type != PTR_TO_STACK) {
>         if (reg->off) {
>             verbose(env, "R%d must have zero offset...");
>             return -EINVAL;
>         }
>         return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>     }
> }
>

Sure, I can rewrite it like this.

> > +               /* Going straight to check will catch this because fixed_off_ok
> > +                * is false, but checking here allows us to give the user a
> > +                * better error message.
> > +                */
> > +               if (reg->off) {
> > +                       verbose(env, "R%d must have zero offset when passed to release func\n",
> > +                               regno);
> >                         return -EINVAL;
> >                 }
> > -               fallthrough;
> > +               goto check;
>
> I think it's cleaner here to just "return __check_ptr_off_reg(env,
> reg, regno, fixed_off_ok);" instead of adding the goto check.
>
> > +       }
> > +check_type:
> > +       switch ((u32)type) {
>
> btw I don't think we need this (u32) cast. type is an enum
> bpf_reg_type, which is by default a u32.
>

nit: it's an int by default.

I've found clang complaining when you switch over an enum type and cases contain
non-enum constants (like type | flag). For clarity I'll declare the variable as
u32.
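
A small standalone sketch of the situation, with hypothetical TOY_* names that
are not the kernel's definitions: the case label TOY_PTR_TO_BUF |
TOY_MEM_RDONLY is not one of the enumerators, so switching directly on the
enum may trigger a compiler diagnostic, whereas switching on an unsigned
integer does not.

```c
#define TOY_MEM_RDONLY (1u << 8)    /* illustrative flag bit */

enum toy_reg_type {
	TOY_SCALAR_VALUE = 1,
	TOY_PTR_TO_BUF   = 2,
	TOY_PTR_TO_OTHER = 3,
};

/* Returns 1 if a register offset is allowed for this (possibly flagged) type. */
static int toy_offset_allowed(enum toy_reg_type type)
{
	/*
	 * Switch on an unsigned integer rather than the enum, so that case
	 * labels combining an enumerator with a flag are ordinary integer
	 * constants and not "values outside the enumerated type".
	 */
	switch ((unsigned int)type) {
	case TOY_PTR_TO_BUF:
	case TOY_PTR_TO_BUF | TOY_MEM_RDONLY:
	case TOY_SCALAR_VALUE:
		return 1;
	default:
		return 0;
	}
}
```

Declaring the variable itself as an unsigned integer type, as proposed above,
has the same effect as the cast in this sketch.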

> > +       /* Pointer types where both fixed and variable reg offset is explicitly
> > +        * allowed: */
> > +       case PTR_TO_STACK:
> >         case PTR_TO_PACKET:
> >         case PTR_TO_PACKET_META:
> >         case PTR_TO_MAP_KEY:
> > @@ -6001,12 +6033,7 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >         case PTR_TO_BUF:
> >         case PTR_TO_BUF | MEM_RDONLY:
> >         case SCALAR_VALUE:
> > -               /* Some of the argument types nevertheless require a
> > -                * zero register offset.
> > -                */
> > -               if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> > -                       return 0;
> > -               break;
> > +               return 0;
> >         /* All the rest must be rejected, except PTR_TO_BTF_ID which allows
> >          * fixed offset.
> >          */
>
> We should also remove the "if (arg_type_is_release(arg_type) &&
> reg->off)" code in the PTR_TO_BTF_ID case.
>

Ack.


* Re: [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots
  2022-10-18 13:59 ` [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots Kumar Kartikeya Dwivedi
@ 2022-11-08 20:22   ` Joanne Koong
  2022-11-09 18:39     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 54+ messages in thread
From: Joanne Koong @ 2022-11-08 20:22 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> The root of the problem is missing liveness marking for STACK_DYNPTR
> slots. This leads to all kinds of problems inside stacksafe.
>
> The verifier by default inside stacksafe ignores spilled_ptr in stack
> slots which do not have REG_LIVE_READ marks. Since this is being checked
> in the 'old' explored state, it must have already done clean_live_states
> for this old bpf_func_state. Hence, it won't be receiving any more
> liveness marks from to be explored insns (it has received REG_LIVE_DONE
> marking from liveness point of view).

To summarize, your last two sentences are saying that once an "old"
state has been explored, its liveness will never be modified when the
verifier explores other newer states, correct?

>
> What this means is that verifier considers that it's safe to not compare
> the stack slot if was never read by children states. While liveness
> marks are usually propagated correctly following the parentage chain for
> spilled registers (SCALAR_VALUE and PTR_* types), the same is not the
> case for STACK_DYNPTR.
>

In the context of liveness marking, what is "parent" and "children"?
Is the parent the first possibility that is explored, and every
possibility after that is its children?

> clean_live_states hence simply rewrites these stack slots to the type
> STACK_INVALID since it sees no REG_LIVE_READ marks.
>
> The end result is that we will never see STACK_DYNPTR slots in explored
> state. Even if verifier was conservatively matching !REG_LIVE_READ
> slots, very next check continuing the stacksafe loop on seeing
> STACK_INVALID would again prevent further checks.
>
> Now as long as verifier stores an explored state which we can compare to
> when reaching a pruning point, we can abuse this bug to make verifier
> prune search for obviously unsafe paths using STACK_DYNPTR slots
> thinking they are never used hence safe.
>
> Doing this in unprivileged mode is a bit challenging. add_new_state is
> only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges)
> or when jmps_processed difference is >= 2 and insn_processed difference
> is >= 8. So coming up with the unprivileged case requires a little more
> work, but it is still totally possible. The test case being discussed
> below triggers the heuristic even in unprivileged mode.
>
> However, it no longer works since commit
> 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF").
>
> Let's try to study the test step by step.
>
> Consider the following program (C style BPF ASM):
>
> 0  r0 = 0;
> 1  r6 = &ringbuf_map;
> 3  r1 = r6;
> 4  r2 = 8;
> 5  r3 = 0;
> 6  r4 = r10;
> 7  r4 -= -16;
> 8  call bpf_ringbuf_reserve_dynptr;
> 9  if r0 == 0 goto pc+1;
> 10 goto pc+1;
> 11 *(r10 - 16) = 0xeB9F;
> 12 r1 = r10;
> 13 r1 -= -16;
> 14 r2 = 0;
> 15 call bpf_ringbuf_discard_dynptr;
> 16 r0 = 0;
> 17 exit;
>
> We know that insn 12 will be a pruning point, hence if we force
> add_new_state for it, it will first verify the following path as
> safe in straight line exploration:
> 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17
>
> Then, when we arrive at insn 12 from the following path:
> 0 1 3 4 5 6 7 8 9 -> 11 (12)
>
> We will find a state that has been verified as safe already at insn 12.
> Since register state is same at this point, regsafe will pass. Next, in
> stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped
> seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true.
> Next, refsafe is also true as reference state is unchanged in both
> states.
>
> The states are considered equivalent and search is pruned.
>
> Hence, we are able to construct a dynptr with arbitrary contents and use
> the dynptr API to operate on this arbitrary pointer and arbitrary size +
> offset.
>
> To fix this, first define a mark_dynptr_read function that propagates
> liveness marks whenever a valid initialized dynptr is accessed by dynptr
> helpers. REG_LIVE_WRITTEN is marked whenever we initialize an
> uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows
> screening off mark_reg_read and not propagating marks upwards from that
> point.
>
> This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or
> none, so clean_live_states either sets both slots to STACK_INVALID or
> none of them. This is the invariant the checks inside stacksafe rely on.
>
> Next, do a complete comparison of both stack slots whenever they have
> STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and
> also whether both form the same first_slot. Only then is the later path
> safe.
>
> Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 73 insertions(+)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a8c277e51d63..8f667180f70f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -752,6 +752,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>                 state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
>         }
>
> +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +

Is the purpose of REG_LIVE_WRITTEN that it indicates to the verifier
that this register needs to be checked when comparing the state
against another possible state?

>         return 0;
>  }
>
> @@ -776,6 +779,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
>
>         __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
>         __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> +
> +       /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot?
> +        *
> +        * While we don't allow reading STACK_INVALID, it is still possible to
> +        * do <8 byte writes marking some but not all slots as STACK_MISC. Then,
> +        * helpers or insns can do partial read of that part without failing,
> +        * but check_stack_range_initialized, check_stack_read_var_off, and
> +        * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of
> +        * the slot conservatively. Hence we need to screen off those liveness
> +        * marking walks.
> +        *
> +        * This was not a problem before because STACK_INVALID is only set by
> +        * default, or in clean_live_states after REG_LIVE_DONE, not randomly
> +        * during verifier state exploration. Hence, for this case parentage
> +        * chain will still be live, while earlier reg->parent was NULL, so we
> +        * need REG_LIVE_WRITTEN to screen off read marker propagation.

What does "screen off" mean in "screen off those liveness marking
walks" and "screen off read marker propagation"?

> +        */
> +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> +


>         return 0;
>  }
>
> @@ -2354,6 +2377,30 @@ static int mark_reg_read(struct bpf_verifier_env *env,
>         return 0;
>  }
>
> +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +{
> +       struct bpf_func_state *state = func(env, reg);
> +       int spi, ret;
> +
> +       /* For CONST_PTR_TO_DYNPTR, it must have already been done by
> +        * check_reg_arg in check_helper_call and mark_btf_func_reg_size in
> +        * check_kfunc_call.
> +        */
> +       if (reg->type == CONST_PTR_TO_DYNPTR)
> +               return 0;
> +       spi = get_spi(reg->off);
> +       /* Caller ensures dynptr is valid and initialized, which means spi is in
> +        * bounds and spi is the first dynptr slot. Simply mark stack slot as
> +        * read.
> +        */
> +       ret = mark_reg_read(env, &state->stack[spi].spilled_ptr,
> +                           state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64);
> +       if (ret)
> +               return ret;
> +       return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr,
> +                            state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64);
> +}
> +
>  /* This function is supposed to be used by the following 32-bit optimization
>   * code only. It returns TRUE if the source or destination register operates
>   * on 64-bit, otherwise return FALSE.
> @@ -5648,6 +5695,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                         u8 *uninit_dynptr_regno)
>  {
>         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> +       int err;
>
>         if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
>                 verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> @@ -5729,6 +5777,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                                 err_extra, argno + 1);
>                         return -EINVAL;
>                 }
> +
> +               err = mark_dynptr_read(env, reg);
> +               if (err)
> +                       return err;
>         }
>         return 0;
>  }
> @@ -11793,6 +11845,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
>                          * return false to continue verification of this path
>                          */
>                         return false;
> +               /* Both are same slot_type, but STACK_DYNPTR requires more
> +                * checks before it can considered safe.
> +                */
> +               if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) {
> +                       /* If both are STACK_DYNPTR, type must be same */
> +                       if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type)
> +                               return false;
> +                       /* Both should also have first slot at same spi */
> +                       if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot)
> +                               return false;
> +                       /* ids should be same */
> +                       if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id)

Do we need two !s or is just one ! enough?

> +                               return false;
> +                       if (old->stack[spi].spilled_ptr.ref_obj_id &&
> +                           !check_ids(old->stack[spi].spilled_ptr.ref_obj_id,
> +                                      cur->stack[spi].spilled_ptr.ref_obj_id, idmap))
> +                               return false;
> +                       WARN_ON_ONCE(i % BPF_REG_SIZE);
> +                       i += BPF_REG_SIZE - 1;
> +                       continue;
> +               }
>                 if (i % BPF_REG_SIZE != BPF_REG_SIZE - 1)
>                         continue;
>                 if (!is_spilled_reg(&old->stack[spi]))
> --
> 2.38.0
>


* Re: [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots
  2022-11-08 20:22   ` Joanne Koong
@ 2022-11-09 18:39     ` Kumar Kartikeya Dwivedi
  2022-11-10  0:41       ` Joanne Koong
  0 siblings, 1 reply; 54+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-09 18:39 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Wed, Nov 09, 2022 at 01:52:50AM IST, Joanne Koong wrote:
> On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > The root of the problem is missing liveness marking for STACK_DYNPTR
> > slots. This leads to all kinds of problems inside stacksafe.
> >
> > The verifier by default inside stacksafe ignores spilled_ptr in stack
> > slots which do not have REG_LIVE_READ marks. Since this is being checked
> > in the 'old' explored state, it must have already done clean_live_states
> > for this old bpf_func_state. Hence, it won't be receiving any more
> > liveness marks from to be explored insns (it has received REG_LIVE_DONE
> > marking from liveness point of view).
>
> To summarize, your last two sentences are saying that once an "old"
> state has been explored, its liveness will never be modified when the
> verifier explores other newer states, correct?
>

Yes, once you enter is_state_visited for a prune point with the current state,
it will do clean_live_states for all existing states at that insn_idx (with the
same callsite) for states with branches == 0. The branches counter keeps track
of currently being explored paths. It is always 1 by default for cur.

So it is not ok to set REG_LIVE_DONE for them when branches > 0, as it shows
that there are still other paths that need to be explored and this is possibly a
parent of those.

So when branches == 0, it becomes a completely explored state with nothing left
to be simulated for it. It will be receiving no more liveness marks from
anything, so we can set REG_LIVE_DONE for its regs.

An example is an if statement. When you hit one, push_stack pushes one state
to be explored later, and you then follow the 'else' in the current state. When
current reaches BPF_EXIT, you pop the other branch and continue exploring
from there. So after push_stack, cur->parent->branches becomes 2, while it
stays 1 for both cur and the pushed state.
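
As a rough sketch of that bookkeeping, here is a toy model. It is not the
verifier code: toy_state and the toy_* helpers are hypothetical names, and
the model only assumes that each state carries a parent pointer and a
branches count, that a fork bumps the parent's count, and that finishing a
path decrements counts up the chain until one stays non-zero:

```c
#include <stddef.h>

/* Toy stand-in for a verifier state; only the two fields discussed here. */
struct toy_state {
	struct toy_state *parent;
	unsigned int branches;
};

/* A freshly created state has exactly one path running through it. */
void toy_init(struct toy_state *st, struct toy_state *parent)
{
	st->parent = parent;
	st->branches = 1;
}

/* Fork at a conditional jump: queue a sibling that shares cur's parent. */
void toy_push_branch(const struct toy_state *cur, struct toy_state *pushed)
{
	pushed->parent = cur->parent;
	pushed->branches = 1;
	if (pushed->parent)
		pushed->parent->branches++;
}

/* A path ended (e.g. at exit): drop it from every ancestor that has no
 * other path left, and stop at the first ancestor that still has one.
 */
void toy_path_done(struct toy_state *st)
{
	while (st) {
		if (--st->branches)
			break;
		st = st->parent;
	}
}

/* Only a state with no remaining paths may be treated as final. */
int toy_fully_explored(const struct toy_state *st)
{
	return st->branches == 0;
}
```

With one parent and two forked children, the parent only becomes fully
explored after both children have finished, which is the point at which it
is safe to freeze its liveness marks.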

> >
> > What this means is that verifier considers that it's safe to not compare
> > the stack slot if was never read by children states. While liveness
> > marks are usually propagated correctly following the parentage chain for
> > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the
> > case for STACK_DYNPTR.
> >
>
> In the context of liveness marking, what is "parent" and "children"?
> Is the parent the first possibility that is explored, and every
> possibility after that is its children?
>

The parent is the state we forked from; the registers in the current state are
connected to the same registers in the parent through reg->parent. Liveness
tells us whether the forked child state ever actually used a particular reg of
our state later.

Once everything has been explored for our children and our branches count drops
to 0, any reg that never had REG_LIVE_READ propagated to it (reads propagate by
following reg->parent) is known to be unused, so it isn't required to compare
it during states_equal checks. Both regsafe and stacksafe ignore regs without
REG_LIVE_READ.

The child sets REG_LIVE_WRITTEN whenever it overwrites a reg, as that value of
the parent no longer matters to it. There are some subtleties (like not setting
it when size != BPF_REG_SIZE, because the partial contents still continue to
matter).
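
To make the consequence concrete, here is a deliberately simplified sketch
of such a comparison. toy_reg, TOY_LIVE_READ and toy_reg_equiv are
hypothetical names and the real regsafe/stacksafe checks do far more; the
only point is that a reg without a read mark in the old state is skipped,
whatever the current state holds:

```c
#include <stdbool.h>

enum { TOY_LIVE_NONE = 0, TOY_LIVE_READ = 1 };

/* Toy register: one abstract value plus its liveness bits. */
struct toy_reg {
	int value;
	unsigned int live;
};

/* 'old' comes from a fully explored state. If nothing ever read it, its
 * contents cannot influence safety, so the comparison is skipped.
 */
bool toy_reg_equiv(const struct toy_reg *old, const struct toy_reg *cur)
{
	if (!(old->live & TOY_LIVE_READ))
		return true;
	return old->value == cur->value;
}
```

This is why a missing read mark is dangerous: the old and current values
can differ arbitrarily and the states are still treated as equivalent.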

> > clean_live_states hence simply rewrites these stack slots to the type
> > STACK_INVALID since it sees no REG_LIVE_READ marks.
> >
> > The end result is that we will never see STACK_DYNPTR slots in explored
> > state. Even if verifier was conservatively matching !REG_LIVE_READ
> > slots, very next check continuing the stacksafe loop on seeing
> > STACK_INVALID would again prevent further checks.
> >
> > Now as long as verifier stores an explored state which we can compare to
> > when reaching a pruning point, we can abuse this bug to make verifier
> > prune search for obviously unsafe paths using STACK_DYNPTR slots
> > thinking they are never used hence safe.
> >
> > Doing this in unprivileged mode is a bit challenging. add_new_state is
> > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges)
> > or when jmps_processed difference is >= 2 and insn_processed difference
> > is >= 8. So coming up with the unprivileged case requires a little more
> > work, but it is still totally possible. The test case being discussed
> > below triggers the heuristic even in unprivileged mode.
> >
> > However, it no longer works since commit
> > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF").
> >
> > Let's try to study the test step by step.
> >
> > Consider the following program (C style BPF ASM):
> >
> > 0  r0 = 0;
> > 1  r6 = &ringbuf_map;
> > 3  r1 = r6;
> > 4  r2 = 8;
> > 5  r3 = 0;
> > 6  r4 = r10;
> > 7  r4 -= -16;
> > 8  call bpf_ringbuf_reserve_dynptr;
> > 9  if r0 == 0 goto pc+1;
> > 10 goto pc+1;
> > 11 *(r10 - 16) = 0xeB9F;
> > 12 r1 = r10;
> > 13 r1 -= -16;
> > 14 r2 = 0;
> > 15 call bpf_ringbuf_discard_dynptr;
> > 16 r0 = 0;
> > 17 exit;
> >
> > We know that insn 12 will be a pruning point, hence if we force
> > add_new_state for it, it will first verify the following path as
> > safe in straight line exploration:
> > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17
> >
> > Then, when we arrive at insn 12 from the following path:
> > 0 1 3 4 5 6 7 8 9 -> 11 (12)
> >
> > We will find a state that has been verified as safe already at insn 12.
> > Since register state is same at this point, regsafe will pass. Next, in
> > stacksafe, for spi = 0 and spi = 1 (location of our dynptr) is skipped
> > seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true.
> > Next, refsafe is also true as reference state is unchanged in both
> > states.
> >
> > The states are considered equivalent and search is pruned.
> >
> > Hence, we are able to construct a dynptr with arbitrary contents and use
> > the dynptr API to operate on this arbitrary pointer and arbitrary size +
> > offset.
> >
> > To fix this, first define a mark_dynptr_read function that propagates
> > liveness marks whenever a valid initialized dynptr is accessed by dynptr
> > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an
> > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows
> > screening off mark_reg_read and not propagating marks upwards from that
> > point.
> >
> > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or
> > none, so clean_live_states either sets both slots to STACK_INVALID or
> > none of them. This is the invariant the checks inside stacksafe rely on.
> >
> > Next, do a complete comparison of both stack slots whenever they have
> > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and
> > also whether both form the same first_slot. Only then is the later path
> > safe.
> >
> > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 73 insertions(+)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a8c277e51d63..8f667180f70f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -752,6 +752,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> >                 state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
> >         }
> >
> > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +
>
> Is the purpose of REG_LIVE_WRITTEN that it indicates to the verifier
> that this register needs to be checked when comparing the state
> against another possible state?
>

It simply means that the slot was overwritten, so we shouldn't be following
spilled_ptr.parent when doing mark_reg_read for the spilled_ptr in
mark_dynptr_read. __mark_reg_not_init won't touch live, parent, and other such
states, so they need to be handled separately.

> >         return 0;
> >  }
> >
> > @@ -776,6 +779,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> >
> >         __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> >         __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > +
> > +       /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot?
> > +        *
> > +        * While we don't allow reading STACK_INVALID, it is still possible to
> > +        * do <8 byte writes marking some but not all slots as STACK_MISC. Then,
> > +        * helpers or insns can do partial read of that part without failing,
> > +        * but check_stack_range_initialized, check_stack_read_var_off, and
> > +        * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of
> > +        * the slot conservatively. Hence we need to screen off those liveness
> > +        * marking walks.
> > +        *
> > +        * This was not a problem before because STACK_INVALID is only set by
> > +        * default, or in clean_live_states after REG_LIVE_DONE, not randomly
> > +        * during verifier state exploration. Hence, for this case parentage
> > +        * chain will still be live, while earlier reg->parent was NULL, so we
> > +        * need REG_LIVE_WRITTEN to screen off read marker propagation.
>
> What does "screen off" mean in "screen off those liveness marking
> walks" and "screen off read marker propagation"?
>

"screen off" simply means setting REG_LIVE_WRITTEN prevents mark_reg_read from
following reg->parent pointers after that point. What was in this stack slot
doesn't matter anymore as it was overwritten, so its liveness won't matter and
any reads don't have to be propagated upwards. They need to be propagated to
this slot if children states use it.
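
A small sketch of that walk, as a toy model only: toy_slot, the TOY_LIVE_*
bits and toy_mark_read are hypothetical names, and the real mark_reg_read
has more conditions. The walk marks each ancestor as read and stops as soon
as it reaches a slot that was itself written:

```c
#include <stddef.h>

enum { TOY_LIVE_READ = 1, TOY_LIVE_WRITTEN = 2 };

/* Toy stack slot: a link to the same slot in the parent state plus
 * liveness bits.
 */
struct toy_slot {
	struct toy_slot *parent;
	unsigned int live;
};

/* Propagate a read of 'slot' up the parentage chain. */
void toy_mark_read(struct toy_slot *slot)
{
	while (slot->parent) {
		/* This slot was overwritten here, so the value being read
		 * did not come from the parent: stop, i.e. "screen off".
		 */
		if (slot->live & TOY_LIVE_WRITTEN)
			break;
		/* Parent already marked, so the rest of the chain is too. */
		if (slot->parent->live & TOY_LIVE_READ)
			break;
		slot->parent->live |= TOY_LIVE_READ;
		slot = slot->parent;
	}
}
```

With a chain grand <- mid <- leaf where mid carries the written bit, a read
of leaf marks mid but never reaches grand.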

> > +        */
> > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > +
>
>
> >         return 0;
> >  }
> >
> > @@ -2354,6 +2377,30 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> >         return 0;
> >  }
> >
> > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > +{
> > +       struct bpf_func_state *state = func(env, reg);
> > +       int spi, ret;
> > +
> > +       /* For CONST_PTR_TO_DYNPTR, it must have already been done by
> > +        * check_reg_arg in check_helper_call and mark_btf_func_reg_size in
> > +        * check_kfunc_call.
> > +        */
> > +       if (reg->type == CONST_PTR_TO_DYNPTR)
> > +               return 0;
> > +       spi = get_spi(reg->off);
> > +       /* Caller ensures dynptr is valid and initialized, which means spi is in
> > +        * bounds and spi is the first dynptr slot. Simply mark stack slot as
> > +        * read.
> > +        */
> > +       ret = mark_reg_read(env, &state->stack[spi].spilled_ptr,
> > +                           state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64);
> > +       if (ret)
> > +               return ret;
> > +       return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr,
> > +                            state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64);
> > +}
> > +
> >  /* This function is supposed to be used by the following 32-bit optimization
> >   * code only. It returns TRUE if the source or destination register operates
> >   * on 64-bit, otherwise return FALSE.
> > @@ -5648,6 +5695,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >                         u8 *uninit_dynptr_regno)
> >  {
> >         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> > +       int err;
> >
> >         if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> >                 verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> > @@ -5729,6 +5777,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >                                 err_extra, argno + 1);
> >                         return -EINVAL;
> >                 }
> > +
> > +               err = mark_dynptr_read(env, reg);
> > +               if (err)
> > +                       return err;
> >         }
> >         return 0;
> >  }
> > @@ -11793,6 +11845,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> >                          * return false to continue verification of this path
> >                          */
> >                         return false;
> > +               /* Both are same slot_type, but STACK_DYNPTR requires more
> > +                * checks before it can considered safe.
> > +                */
> > +               if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) {
> > +                       /* If both are STACK_DYNPTR, type must be same */
> > +                       if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type)
> > +                               return false;
> > +                       /* Both should also have first slot at same spi */
> > +                       if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot)
> > +                               return false;
> > +                       /* ids should be same */
> > +                       if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id)
>
> Do we need two !s or is just one ! enough?
>

It is trying to test whether both have ref_obj_id or none.
So if set it turns non-zero to 1, otherwise 0.
Then 1 != 0 or 0 != 1 fails, 0 == 0 or 1 == 1 succeeds,
and if ref_obj_id is set we match the ids in the next comparison.
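
A standalone sketch of just that presence test (the function names are made
up for illustration). Note that a single ! on each side produces the same
truth table; the double form only normalizes each id to 0 or 1 before
comparing:

```c
#include <stdbool.h>
#include <stdint.h>

/* True when both ids are set or both are unset; the id values themselves
 * are not compared here.
 */
bool toy_ref_presence_matches(uint32_t old_id, uint32_t cur_id)
{
	return !!old_id == !!cur_id;
}

/* Same test written with a single negation on each side. */
bool toy_ref_presence_matches_single(uint32_t old_id, uint32_t cur_id)
{
	return !old_id == !cur_id;
}
```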


* Re: [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots
  2022-11-09 18:39     ` Kumar Kartikeya Dwivedi
@ 2022-11-10  0:41       ` Joanne Koong
  0 siblings, 0 replies; 54+ messages in thread
From: Joanne Koong @ 2022-11-10  0:41 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Wed, Nov 9, 2022 at 10:41 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Wed, Nov 09, 2022 at 01:52:50AM IST, Joanne Koong wrote:
> > On Tue, Oct 18, 2022 at 6:59 AM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > The root of the problem is missing liveness marking for STACK_DYNPTR
> > > slots. This leads to all kinds of problems inside stacksafe.
> > >
> > > The verifier by default inside stacksafe ignores spilled_ptr in stack
> > > slots which do not have REG_LIVE_READ marks. Since this is being checked
> > > in the 'old' explored state, it must have already done clean_live_states
> > > for this old bpf_func_state. Hence, it won't be receiving any more
> > > liveness marks from to be explored insns (it has received REG_LIVE_DONE
> > > marking from liveness point of view).
> >
> > To summarize, your last two sentences are saying that once an "old"
> > state has been explored, its liveness will never be modified when the
> > verifier explores other newer states, correct?
> >
>
> Yes, once you enter is_state_visited for a prune point with the current state,
> it will do clean_live_states for all existing states at that insn_idx (with the
> same callsite) for states with branches == 0. The branches counter keeps track
> of currently being explored paths. It is always 1 by default for cur.
>
> So it is not ok to set REG_LIVE_DONE for them when branches > 0, as it shows
> that there are still other paths that need to be explored and this is possibly a
> parent of those.
>
> So when branches == 0, it becomes a completely explored state with nothing left
> to be simulated for it. It will be receiving no more liveness marks from
> anything, so we can set REG_LIVE_DONE for its regs.
>
> An example is an if statement. When you do that you use push_stack to push one
> state to be explored, and then follow the 'else' in the current state. When
> current reaches BPF_EXIT, you will pop the other branch and continue exploring
> from there. So when we do push_stack cur->parent.branches will become 2, and it
> will be 1 for both cur and the pushed state.
>
> > >
> > > What this means is that verifier considers that it's safe to not compare
> > > the stack slot if was never read by children states. While liveness
> > > marks are usually propagated correctly following the parentage chain for
> > > spilled registers (SCALAR_VALUE and PTR_* types), the same is not the
> > > case for STACK_DYNPTR.
> > >
> >
> > In the context of liveness marking, what is "parent" and "children"?
> > Is the parent the first possibility that is explored, and every
> > possibility after that is its children?
> >
>
> parent will be the state we forked from, the registers in current are connected
> to the same registers in parent using reg->parent. Liveness allows us to know
> whether that forked child state ever really used the particular reg in our state
> later.
>
> Once everything has been explored for our children and our branches count drops
> to 0, and a reg never got REG_LIVE_READ propagated by following reg->parent
> whenever a read was done for a reg, we know that the reg was unused, so it isn't
> required to compare it during states_equal checks. Both regsafe and stacksafe
> ignore regs without REG_LIVE_READ.
>
> The child sets REG_LIVE_WRITTEN whenever it overwrites a reg, as that value of
> the parent no longer matters to it. There are some subtleties (like not setting
> it when size != BPF_REG_SIZE, because the partial contents still continue to
> matter).
>
> > > clean_live_states hence simply rewrites these stack slots to the type
> > > STACK_INVALID since it sees no REG_LIVE_READ marks.
> > >
> > > The end result is that we will never see STACK_DYNPTR slots in explored
> > > state. Even if verifier was conservatively matching !REG_LIVE_READ
> > > slots, very next check continuing the stacksafe loop on seeing
> > > STACK_INVALID would again prevent further checks.
> > >
> > > Now as long as verifier stores an explored state which we can compare to
> > > when reaching a pruning point, we can abuse this bug to make verifier
> > > prune search for obviously unsafe paths using STACK_DYNPTR slots
> > > thinking they are never used hence safe.
> > >
> > > Doing this in unprivileged mode is a bit challenging. add_new_state is
> > > only set when seeing BPF_F_TEST_STATE_FREQ (which requires privileges)
> > > or when jmps_processed difference is >= 2 and insn_processed difference
> > > is >= 8. So coming up with the unprivileged case requires a little more
> > > work, but it is still totally possible. The test case being discussed
> > > below triggers the heuristic even in unprivileged mode.
> > >
> > > However, it no longer works since commit
> > > 8addbfc7b308 ("bpf: Gate dynptr API behind CAP_BPF").
> > >
> > > Let's try to study the test step by step.
> > >
> > > Consider the following program (C style BPF ASM):
> > >
> > > 0  r0 = 0;
> > > 1  r6 = &ringbuf_map;
> > > 3  r1 = r6;
> > > 4  r2 = 8;
> > > 5  r3 = 0;
> > > 6  r4 = r10;
> > > 7  r4 -= -16;
> > > 8  call bpf_ringbuf_reserve_dynptr;
> > > 9  if r0 == 0 goto pc+1;
> > > 10 goto pc+1;
> > > 11 *(r10 - 16) = 0xeB9F;
> > > 12 r1 = r10;
> > > 13 r1 -= -16;
> > > 14 r2 = 0;
> > > 15 call bpf_ringbuf_discard_dynptr;
> > > 16 r0 = 0;
> > > 17 exit;
> > >
> > > We know that insn 12 will be a pruning point, hence if we force
> > > add_new_state for it, it will first verify the following path as
> > > safe in straight line exploration:
> > > 0 1 3 4 5 6 7 8 9 -> 10 -> (12) 13 14 15 16 17
> > >
> > > Then, when we arrive at insn 12 from the following path:
> > > 0 1 3 4 5 6 7 8 9 -> 11 (12)
> > >
> > > We will find a state that has been verified as safe already at insn 12.
> > > Since the register state is the same at this point, regsafe will pass. Next,
> > > in stacksafe, the check for spi = 0 and spi = 1 (the location of our dynptr)
> > > is skipped on seeing !REG_LIVE_READ. The rest matches, so stacksafe returns true.
> > > Next, refsafe is also true as reference state is unchanged in both
> > > states.
> > >
> > > The states are considered equivalent and search is pruned.
> > >
> > > Hence, we are able to construct a dynptr with arbitrary contents and use
> > > the dynptr API to operate on this arbitrary pointer and arbitrary size +
> > > offset.
> > >
> > > To fix this, first define a mark_dynptr_read function that propagates
> > > liveness marks whenever a valid initialized dynptr is accessed by dynptr
> > > helpers. REG_LIVE_WRITTEN is marked whenever we initialize an
> > > uninitialized dynptr. This is done in mark_stack_slots_dynptr. It allows
> > > screening off mark_reg_read and not propagating marks upwards from that
> > > point.
> > >
> > > This ensures that we either set REG_LIVE_READ64 on both dynptr slots, or
> > > none, so clean_live_states either sets both slots to STACK_INVALID or
> > > none of them. This is the invariant the checks inside stacksafe rely on.
> > >
> > > Next, do a complete comparison of both stack slots whenever they have
> > > STACK_DYNPTR. Compare the dynptr type stored in the spilled_ptr, and
> > > also whether both form the same first_slot. Only then is the later path
> > > safe.
> > >
> > > Fixes: 97e03f521050 ("bpf: Add verifier support for dynptrs")
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 73 +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 73 insertions(+)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index a8c277e51d63..8f667180f70f 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -752,6 +752,9 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
> > >                 state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
> > >         }
> > >
> > > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +
> >
> > Is the purpose of REG_LIVE_WRITTEN that it indicates to the verifier
> > that this register needs to be checked when comparing the state
> > against another possible state?
> >
>
> It simply means that the slot was overwritten, so we shouldn't be following
> spilled_ptr.parent when doing mark_reg_read for the spilled_ptr in
> mark_dynptr_read. __mark_reg_not_init won't touch live, parent, and other such
> states, so they need to be handled separately.
>
> > >         return 0;
> > >  }
> > >
> > > @@ -776,6 +779,26 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
> > >
> > >         __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> > >         __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
> > > +
> > > +       /* Why do we need to set REG_LIVE_WRITTEN for STACK_INVALID slot?
> > > +        *
> > > +        * While we don't allow reading STACK_INVALID, it is still possible to
> > > +        * do <8 byte writes marking some but not all slots as STACK_MISC. Then,
> > > +        * helpers or insns can do partial read of that part without failing,
> > > +        * but check_stack_range_initialized, check_stack_read_var_off, and
> > > +        * check_stack_read_fixed_off will do mark_reg_read for all 8-bytes of
> > > +        * the slot conservatively. Hence we need to screen off those liveness
> > > +        * marking walks.
> > > +        *
> > > +        * This was not a problem before because STACK_INVALID is only set by
> > > +        * default, or in clean_live_states after REG_LIVE_DONE, not randomly
> > > +        * during verifier state exploration. Hence, for this case parentage
> > > +        * chain will still be live, while earlier reg->parent was NULL, so we
> > > +        * need REG_LIVE_WRITTEN to screen off read marker propagation.
> >
> > What does "screen off" mean in "screen off those liveness marking
> > walks" and "screen off read marker propagation"?
> >
>
> "screen off" simply means setting REG_LIVE_WRITTEN prevents mark_reg_read from
> following reg->parent pointers after that point. What was in this stack slot
> doesn't matter anymore as it was overwritten, so its liveness won't matter and
> any reads don't have to be propagated upwards. They need to be propagated to
> this slot if children states use it.
>
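
The "screen off" behaviour explained above can be illustrated with a toy model
of the parentage walk. This is a simplified sketch, not the kernel's
mark_reg_read: the names toy_reg, toy_mark_read and the TOY_LIVE_* values are
hypothetical, and the real function has additional early exits and checks that
are left out here.

```c
#include <stddef.h>

/* Toy liveness flags; the values are illustrative, not the kernel's. */
enum {
	TOY_LIVE_NONE    = 0,
	TOY_LIVE_READ    = 1 << 0,
	TOY_LIVE_WRITTEN = 1 << 1,
};

struct toy_reg {
	struct toy_reg *parent; /* same slot in the parent verifier state */
	unsigned int live;      /* TOY_LIVE_* flags for this state */
};

/*
 * Propagate a read of `reg` up the parentage chain.
 * Returns the number of ancestor states that received a read mark.
 */
static int toy_mark_read(struct toy_reg *reg)
{
	int marked = 0;

	while (reg->parent) {
		/* A write in this state screens off everything above it. */
		if (reg->live & TOY_LIVE_WRITTEN)
			break;
		reg->parent->live |= TOY_LIVE_READ;
		reg = reg->parent;
		marked++;
	}
	return marked;
}
```

With a chain child -> parent -> grandparent, a read in the child marks both
ancestors. If the parent carries the written flag, the parent still receives
the read mark from the child, but the walk stops there and the grandparent is
left untouched.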

Awesome, thanks for the explanation here and above.

Looking forward to v2 and getting this upstreamed asap.

> > > +        */
> > > +       state->stack[spi].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +       state->stack[spi - 1].spilled_ptr.live |= REG_LIVE_WRITTEN;
> > > +
> >
> >
> > >         return 0;
> > >  }
> > >
> > > @@ -2354,6 +2377,30 @@ static int mark_reg_read(struct bpf_verifier_env *env,
> > >         return 0;
> > >  }
> > >
> > > +static int mark_dynptr_read(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > > +{
> > > +       struct bpf_func_state *state = func(env, reg);
> > > +       int spi, ret;
> > > +
> > > +       /* For CONST_PTR_TO_DYNPTR, it must have already been done by
> > > +        * check_reg_arg in check_helper_call and mark_btf_func_reg_size in
> > > +        * check_kfunc_call.
> > > +        */
> > > +       if (reg->type == CONST_PTR_TO_DYNPTR)
> > > +               return 0;
> > > +       spi = get_spi(reg->off);
> > > +       /* Caller ensures dynptr is valid and initialized, which means spi is in
> > > +        * bounds and spi is the first dynptr slot. Simply mark stack slot as
> > > +        * read.
> > > +        */
> > > +       ret = mark_reg_read(env, &state->stack[spi].spilled_ptr,
> > > +                           state->stack[spi].spilled_ptr.parent, REG_LIVE_READ64);
> > > +       if (ret)
> > > +               return ret;
> > > +       return mark_reg_read(env, &state->stack[spi - 1].spilled_ptr,
> > > +                            state->stack[spi - 1].spilled_ptr.parent, REG_LIVE_READ64);
> > > +}
> > > +
> > >  /* This function is supposed to be used by the following 32-bit optimization
> > >   * code only. It returns TRUE if the source or destination register operates
> > >   * on 64-bit, otherwise return FALSE.
> > > @@ -5648,6 +5695,7 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > >                         u8 *uninit_dynptr_regno)
> > >  {
> > >         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> > > +       int err;
> > >
> > >         if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> > >                 verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> > > @@ -5729,6 +5777,10 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > >                                 err_extra, argno + 1);
> > >                         return -EINVAL;
> > >                 }
> > > +
> > > +               err = mark_dynptr_read(env, reg);
> > > +               if (err)
> > > +                       return err;
> > >         }
> > >         return 0;
> > >  }
> > > @@ -11793,6 +11845,27 @@ static bool stacksafe(struct bpf_verifier_env *env, struct bpf_func_state *old,
> > >                          * return false to continue verification of this path
> > >                          */
> > >                         return false;
> > > +               /* Both are same slot_type, but STACK_DYNPTR requires more
> > > +                * checks before it can be considered safe.
> > > +                */
> > > +               if (old->stack[spi].slot_type[i % BPF_REG_SIZE] == STACK_DYNPTR) {
> > > +                       /* If both are STACK_DYNPTR, type must be same */
> > > +                       if (old->stack[spi].spilled_ptr.dynptr.type != cur->stack[spi].spilled_ptr.dynptr.type)
> > > +                               return false;
> > > +                       /* Both should also have first slot at same spi */
> > > +                       if (old->stack[spi].spilled_ptr.dynptr.first_slot != cur->stack[spi].spilled_ptr.dynptr.first_slot)
> > > +                               return false;
> > > +                       /* ids should be same */
> > > +                       if (!!old->stack[spi].spilled_ptr.ref_obj_id != !!cur->stack[spi].spilled_ptr.ref_obj_id)
> >
> > Do we need two !s or is just one ! enough?
> >
>
> It is trying to test whether both have ref_obj_id or none.
> So if set it turns non-zero to 1, otherwise 0.
> Then 1 != 0 or 0 != 1 fails, 0 == 0 or 1 == 1 succeeds,
> and if ref_obj_id is set we match the ids in the next comparison.

Does just using one ! not test whether both/none have ref_obj_id?
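
For reference, the two forms can be compared side by side in plain C. In C the
result of ! is always 0 or 1, so a single negation on each side already
compares "is set" against "is set". The function names below are hypothetical
and exist only for this illustration; they are not part of the patch.

```c
#include <stdbool.h>

/* Form used in the patch: normalize each id to 0 or 1, then compare. */
static bool presence_differs_double(unsigned int old_id, unsigned int cur_id)
{
	return !!old_id != !!cur_id;
}

/* Single negation: each side is 1 when the id is unset, 0 when it is set. */
static bool presence_differs_single(unsigned int old_id, unsigned int cur_id)
{
	return !old_id != !cur_id;
}
```

Both functions return true exactly when one id is zero and the other is not,
so the choice between them is a matter of readability rather than behaviour.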


* CVE-2023-39191 - Dynptr fixes - reg.
  2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
                   ` (12 preceding siblings ...)
  2022-10-18 13:59 ` [PATCH bpf-next v1 13/13] selftests/bpf: Add dynptr helper tests Kumar Kartikeya Dwivedi
@ 2023-10-31  7:05 ` Nandhini Rengaraj
  2023-10-31  7:13   ` Greg KH
  2023-10-31  7:57   ` Shung-Hsi Yu
  13 siblings, 2 replies; 54+ messages in thread
From: Nandhini Rengaraj @ 2023-10-31  7:05 UTC (permalink / raw)
  To: memxor; +Cc: andrii, ast, bpf, daniel, joannelkoong, martin.lau, void

Hi,
This is marked as a fix for CVE-2023-39191. Does this vulnerability also affect dynptr in stable kernel v6.1? If so, would you please be able to help us backport the fix to stable kernel v6.1?

Thank you,
Nandhini Rengaraj


* Re: CVE-2023-39191 - Dynptr fixes - reg.
  2023-10-31  7:05 ` CVE-2023-39191 - Dynptr fixes - reg Nandhini Rengaraj
@ 2023-10-31  7:13   ` Greg KH
  2023-10-31  7:57   ` Shung-Hsi Yu
  1 sibling, 0 replies; 54+ messages in thread
From: Greg KH @ 2023-10-31  7:13 UTC (permalink / raw)
  To: Nandhini Rengaraj
  Cc: memxor, andrii, ast, bpf, daniel, joannelkoong, martin.lau, void

On Tue, Oct 31, 2023 at 07:05:56AM +0000, Nandhini Rengaraj wrote:
> Hi,
> This is marked as a fix for CVE-2023-39191. Does this vulnerability also affect dynptr in stable kernel v6.1? If so, would you please be able to help us backport the fix to stable kernel v6.1?

Have you tried to backport it and tested it properly?  Why require
someone else to do this if you are seeing the issue in the 6.1.y kernel
release?

thanks,

greg k-h


* Re: CVE-2023-39191 - Dynptr fixes - reg.
  2023-10-31  7:05 ` CVE-2023-39191 - Dynptr fixes - reg Nandhini Rengaraj
  2023-10-31  7:13   ` Greg KH
@ 2023-10-31  7:57   ` Shung-Hsi Yu
  1 sibling, 0 replies; 54+ messages in thread
From: Shung-Hsi Yu @ 2023-10-31  7:57 UTC (permalink / raw)
  To: Nandhini Rengaraj
  Cc: memxor, andrii, ast, bpf, daniel, joannelkoong, martin.lau, void

On Tue, Oct 31, 2023 at 07:05:56AM +0000, Nandhini Rengaraj wrote:
> Hi,
> This is marked as a fix for CVE-2023-39191. Does this vulnerability also affect dynptr in stable kernel v6.1? If so, would you please be able to help us backport the fix to stable kernel v6.1?

I have not worked with v6.1, only with our distro kernel based on an earlier
kernel (which requires more tweaking since it doesn't have user ringbuf).

Regarding backport to v6.1, this series depends on the "Dynptr
refactorings" series[0]; once that's backported this series should apply
relatively cleanly.

Shung-Hsi

0: https://lore.kernel.org/bpf/20221207204141.308952-1-memxor@gmail.com/

> Thank you,
> Nandhini Rengaraj


Thread overview: 54+ messages
2022-10-18 13:59 [PATCH bpf-next v1 00/13] Fixes for dynptr Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 01/13] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
2022-10-18 19:45   ` David Vernet
2022-10-19  6:04     ` Kumar Kartikeya Dwivedi
2022-10-19 15:26       ` David Vernet
2022-10-19 22:59   ` Joanne Koong
2022-10-20  0:55     ` Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 02/13] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
2022-10-18 23:16   ` David Vernet
2022-10-19  6:18     ` Kumar Kartikeya Dwivedi
2022-10-19 16:05       ` David Vernet
2022-10-20  1:09         ` Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 03/13] bpf: Rename confusingly named RET_PTR_TO_ALLOC_MEM Kumar Kartikeya Dwivedi
2022-10-18 21:38   ` sdf
2022-10-19  6:19     ` Kumar Kartikeya Dwivedi
2022-11-07 22:35   ` Joanne Koong
2022-11-07 23:12     ` Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 04/13] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
2022-10-18 21:55   ` sdf
2022-10-19  6:24     ` Kumar Kartikeya Dwivedi
2022-11-07 23:17   ` Joanne Koong
2022-11-08 18:22     ` Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 05/13] bpf: Fix state pruning for STACK_DYNPTR stack slots Kumar Kartikeya Dwivedi
2022-11-08 20:22   ` Joanne Koong
2022-11-09 18:39     ` Kumar Kartikeya Dwivedi
2022-11-10  0:41       ` Joanne Koong
2022-10-18 13:59 ` [PATCH bpf-next v1 06/13] bpf: Fix missing var_off check for ARG_PTR_TO_DYNPTR Kumar Kartikeya Dwivedi
2022-10-19 18:52   ` Alexei Starovoitov
2022-10-20  1:04     ` Kumar Kartikeya Dwivedi
2022-10-20  2:13       ` Alexei Starovoitov
2022-10-20  2:40         ` Kumar Kartikeya Dwivedi
2022-10-20  2:56           ` Alexei Starovoitov
2022-10-20  3:23             ` Kumar Kartikeya Dwivedi
2022-10-21  0:46               ` Alexei Starovoitov
2022-10-21  1:53                 ` Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 07/13] bpf: Fix partial dynptr stack slot reads/writes Kumar Kartikeya Dwivedi
2022-10-21 22:50   ` Joanne Koong
2022-10-21 22:57     ` Joanne Koong
2022-10-22  4:08     ` Kumar Kartikeya Dwivedi
2022-11-03 14:07       ` Joanne Koong
2022-11-04 22:14         ` Andrii Nakryiko
2022-11-04 23:02           ` Kumar Kartikeya Dwivedi
2022-11-04 23:08             ` Andrii Nakryiko
2022-10-18 13:59 ` [PATCH bpf-next v1 08/13] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
2022-10-21 18:12   ` Joanne Koong
2022-10-18 13:59 ` [PATCH bpf-next v1 09/13] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
2022-10-19 16:59   ` David Vernet
2022-10-18 13:59 ` [PATCH bpf-next v1 10/13] selftests/bpf: Add dynptr pruning tests Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 11/13] selftests/bpf: Add dynptr var_off tests Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 12/13] selftests/bpf: Add dynptr partial slot overwrite tests Kumar Kartikeya Dwivedi
2022-10-18 13:59 ` [PATCH bpf-next v1 13/13] selftests/bpf: Add dynptr helper tests Kumar Kartikeya Dwivedi
2023-10-31  7:05 ` CVE-2023-39191 - Dynptr fixes - reg Nandhini Rengaraj
2023-10-31  7:13   ` Greg KH
2023-10-31  7:57   ` Shung-Hsi Yu
