All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v1 0/7] Dynptr refactorings
@ 2022-11-15  0:01 Kumar Kartikeya Dwivedi
  2022-11-15  0:01 ` [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

This is part 1 of https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com.
This thread also gives some background on why the refactor is being done:
https://lore.kernel.org/bpf/CAEf4Bzb4beTHgVo+G+jehSj8oCeAjRbRcm6MRe=Gr+cajRBwEw@mail.gmail.com

As requested in patch 6 by Alexei, it only includes patches which
refactors the code, on top of which further fixes will be made in part
2. The refactor itself fixes another issue as a side effect. No
functional change is intended (except a few modified log messages).

Changelog:
----------
Fixes v1 -> v1:
Fixes v1: https://lore.kernel.org/bpf/20221018135920.726360-1-memxor@gmail.com

 * Collect acks from Joanne and David
 * Fix misc nits pointed out by Joanne, David
 * Split move of reg->off alignment check for dynptr into separate
   change (Alexei)

Kumar Kartikeya Dwivedi (7):
  bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  bpf: Propagate errors from process_* checks in check_func_arg
  bpf: Rework process_dynptr_func
  bpf: Rework check_func_arg_reg_off
  bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  bpf: Use memmove for bpf_dynptr_{read,write}
  selftests/bpf: Add test for dynptr reinit in user_ringbuf callback

 include/linux/bpf.h                           |   4 +-
 include/linux/bpf_verifier.h                  |   8 +-
 include/uapi/linux/bpf.h                      |   8 +-
 kernel/bpf/btf.c                              |  22 +-
 kernel/bpf/helpers.c                          |  22 +-
 kernel/bpf/verifier.c                         | 416 ++++++++++++------
 scripts/bpf_doc.py                            |   1 +
 tools/include/uapi/linux/bpf.h                |   8 +-
 .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  12 +-
 .../bpf/progs/test_kfunc_dynptr_param.c       |  12 -
 .../selftests/bpf/progs/user_ringbuf_fail.c   |  35 ++
 .../testing/selftests/bpf/verifier/ringbuf.c  |   2 +-
 13 files changed, 355 insertions(+), 200 deletions(-)


base-commit: de763fbb2c5bfad1ab7c4232e6a804726f0b0744
-- 
2.38.1


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-15  4:15   ` Joanne Koong
  2022-11-15  0:01 ` [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg Kumar Kartikeya Dwivedi
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: David Vernet, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Joanne Koong

ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
the underlying register type is subjected to more special checks to
determine the type of object represented by the pointer and its state
consistency.

Move dynptr checks to their own 'process_dynptr_func' function so that
is consistent and in-line with existing code. This also makes it easier
to reuse this code for kfunc handling.

Then, reuse this consolidated function in kfunc dynptr handling too.
Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
been lifted.

Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h                  |   8 +-
 kernel/bpf/btf.c                              |  17 +--
 kernel/bpf/verifier.c                         | 116 ++++++++++--------
 .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
 .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
 5 files changed, 70 insertions(+), 88 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 1a32baa78ce2..a69b6d49d40c 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
 			     u32 regno);
 int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 		   u32 regno, u32 mem_size);
-bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
-			      struct bpf_reg_state *reg);
-bool is_dynptr_type_expected(struct bpf_verifier_env *env,
-			     struct bpf_reg_state *reg,
-			     enum bpf_arg_type arg_type);
+struct bpf_call_arg_meta;
+int process_dynptr_func(struct bpf_verifier_env *env, int regno,
+			enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta);
 
 /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
 static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 5579ff3a5b54..d02ae2f4249b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6575,23 +6575,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 						return -EINVAL;
 					}
 
-					if (!is_dynptr_reg_valid_init(env, reg)) {
-						bpf_log(log,
-							"arg#%d pointer type %s %s must be valid and initialized\n",
-							i, btf_type_str(ref_t),
-							ref_tname);
+					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, NULL))
 						return -EINVAL;
-					}
-
-					if (!is_dynptr_type_expected(env, reg,
-							ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
-						bpf_log(log,
-							"arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
-							i, btf_type_str(ref_t),
-							ref_tname);
-						return -EINVAL;
-					}
-
 					continue;
 				}
 
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 07c0259dfc1a..56f48ab9827f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -784,8 +784,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 	return true;
 }
 
-bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
-			      struct bpf_reg_state *reg)
+static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
 	int spi = get_spi(reg->off);
@@ -804,9 +803,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
 	return true;
 }
 
-bool is_dynptr_type_expected(struct bpf_verifier_env *env,
-			     struct bpf_reg_state *reg,
-			     enum bpf_arg_type arg_type)
+static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
+				    enum bpf_arg_type arg_type)
 {
 	struct bpf_func_state *state = func(env, reg);
 	enum bpf_dynptr_type dynptr_type;
@@ -5694,6 +5692,66 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
 	return 0;
 }
 
+int process_dynptr_func(struct bpf_verifier_env *env, int regno,
+			enum bpf_arg_type arg_type,
+			struct bpf_call_arg_meta *meta)
+{
+	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
+	int argno = regno - 1;
+
+	/* We only need to check for initialized / uninitialized helper
+	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
+	 * assumption is that if it is, that a helper function
+	 * initialized the dynptr on behalf of the BPF program.
+	 */
+	if (base_type(reg->type) == PTR_TO_DYNPTR)
+		return 0;
+	if (arg_type & MEM_UNINIT) {
+		if (!is_dynptr_reg_valid_uninit(env, reg)) {
+			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
+			return -EINVAL;
+		}
+
+		/* We only support one dynptr being uninitialized at the moment,
+		 * which is sufficient for the helper functions we have right now.
+		 */
+		if (meta->uninit_dynptr_regno) {
+			verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
+			return -EFAULT;
+		}
+
+		meta->uninit_dynptr_regno = regno;
+	} else {
+		if (!is_dynptr_reg_valid_init(env, reg)) {
+			verbose(env,
+				"Expected an initialized dynptr as arg #%d\n",
+				argno + 1);
+			return -EINVAL;
+		}
+
+		if (!is_dynptr_type_expected(env, reg, arg_type)) {
+			const char *err_extra = "";
+
+			switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
+			case DYNPTR_TYPE_LOCAL:
+				err_extra = "local";
+				break;
+			case DYNPTR_TYPE_RINGBUF:
+				err_extra = "ringbuf";
+				break;
+			default:
+				err_extra = "<unknown>";
+				break;
+			}
+			verbose(env,
+				"Expected a dynptr of type %s as arg #%d\n",
+				err_extra, argno + 1);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
 static bool arg_type_is_mem_size(enum bpf_arg_type type)
 {
 	return type == ARG_CONST_SIZE ||
@@ -6197,52 +6255,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		err = check_mem_size_reg(env, reg, regno, true, meta);
 		break;
 	case ARG_PTR_TO_DYNPTR:
-		/* We only need to check for initialized / uninitialized helper
-		 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
-		 * assumption is that if it is, that a helper function
-		 * initialized the dynptr on behalf of the BPF program.
-		 */
-		if (base_type(reg->type) == PTR_TO_DYNPTR)
-			break;
-		if (arg_type & MEM_UNINIT) {
-			if (!is_dynptr_reg_valid_uninit(env, reg)) {
-				verbose(env, "Dynptr has to be an uninitialized dynptr\n");
-				return -EINVAL;
-			}
-
-			/* We only support one dynptr being uninitialized at the moment,
-			 * which is sufficient for the helper functions we have right now.
-			 */
-			if (meta->uninit_dynptr_regno) {
-				verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
-				return -EFAULT;
-			}
-
-			meta->uninit_dynptr_regno = regno;
-		} else if (!is_dynptr_reg_valid_init(env, reg)) {
-			verbose(env,
-				"Expected an initialized dynptr as arg #%d\n",
-				arg + 1);
-			return -EINVAL;
-		} else if (!is_dynptr_type_expected(env, reg, arg_type)) {
-			const char *err_extra = "";
-
-			switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
-			case DYNPTR_TYPE_LOCAL:
-				err_extra = "local";
-				break;
-			case DYNPTR_TYPE_RINGBUF:
-				err_extra = "ringbuf";
-				break;
-			default:
-				err_extra = "<unknown>";
-				break;
-			}
-			verbose(env,
-				"Expected a dynptr of type %s as arg #%d\n",
-				err_extra, arg + 1);
-			return -EINVAL;
-		}
+		if (process_dynptr_func(env, regno, arg_type, meta))
+			return -EACCES;
 		break;
 	case ARG_CONST_ALLOC_SIZE_OR_ZERO:
 		if (!tnum_is_const(reg->var_off)) {
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
index c210657d4d0a..fc562e863e79 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
@@ -18,10 +18,7 @@ static struct {
 	const char *expected_verifier_err_msg;
 	int expected_runtime_err;
 } kfunc_dynptr_tests[] = {
-	{"dynptr_type_not_supp",
-	 "arg#0 pointer type STRUCT bpf_dynptr_kern points to unsupported dynamic pointer type", 0},
-	{"not_valid_dynptr",
-	 "arg#0 pointer type STRUCT bpf_dynptr_kern must be valid and initialized", 0},
+	{"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0},
 	{"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0},
 	{"dynptr_data_null", NULL, -EBADMSG},
 };
diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
index ce39d096bba3..f4a8250329b2 100644
--- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
+++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
@@ -32,18 +32,6 @@ int err, pid;
 
 char _license[] SEC("license") = "GPL";
 
-SEC("?lsm.s/bpf")
-int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
-	     unsigned int size)
-{
-	char write_data[64] = "hello there, world!!";
-	struct bpf_dynptr ptr;
-
-	bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
-
-	return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
-}
-
 SEC("?lsm.s/bpf")
 int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size)
 {
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
  2022-11-15  0:01 ` [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-15  3:53   ` Joanne Koong
  2022-11-15  0:01 ` [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Currently, we simply ignore the errors in process_spin_lock,
process_timer_func, process_kptr_func, process_dynptr_func.
Instead, bubble up storing and checking err variable.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 25 +++++++++++++++----------
 1 file changed, 15 insertions(+), 10 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 56f48ab9827f..41ef7e4b73e4 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6220,19 +6220,22 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		break;
 	case ARG_PTR_TO_SPIN_LOCK:
 		if (meta->func_id == BPF_FUNC_spin_lock) {
-			if (process_spin_lock(env, regno, true))
-				return -EACCES;
+			err = process_spin_lock(env, regno, true);
+			if (err)
+				return err;
 		} else if (meta->func_id == BPF_FUNC_spin_unlock) {
-			if (process_spin_lock(env, regno, false))
-				return -EACCES;
+			err = process_spin_lock(env, regno, false);
+			if (err)
+				return err;
 		} else {
 			verbose(env, "verifier internal error\n");
 			return -EFAULT;
 		}
 		break;
 	case ARG_PTR_TO_TIMER:
-		if (process_timer_func(env, regno, meta))
-			return -EACCES;
+		err = process_timer_func(env, regno, meta);
+		if (err)
+			return err;
 		break;
 	case ARG_PTR_TO_FUNC:
 		meta->subprogno = reg->subprogno;
@@ -6255,8 +6258,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		err = check_mem_size_reg(env, reg, regno, true, meta);
 		break;
 	case ARG_PTR_TO_DYNPTR:
-		if (process_dynptr_func(env, regno, arg_type, meta))
-			return -EACCES;
+		err = process_dynptr_func(env, regno, arg_type, meta);
+		if (err)
+			return err;
 		break;
 	case ARG_CONST_ALLOC_SIZE_OR_ZERO:
 		if (!tnum_is_const(reg->var_off)) {
@@ -6323,8 +6327,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 		break;
 	}
 	case ARG_PTR_TO_KPTR:
-		if (process_kptr_func(env, regno, meta))
-			return -EACCES;
+		err = process_kptr_func(env, regno, meta);
+		if (err)
+			return err;
 		break;
 	}
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
  2022-11-15  0:01 ` [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
  2022-11-15  0:01 ` [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-16 18:04   ` Joanne Koong
  2022-11-17 21:11   ` David Vernet
  2022-11-15  0:01 ` [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
for use in callback state, because in case of user ringbuf helpers,
there is no dynptr on the stack that is passed into the callback. To
reflect such a state, a special register type was created.

However, some checks have been bypassed incorrectly during the addition
of this feature. First, for arg_type with MEM_UNINIT flag which
initialize a dynptr, they must be rejected for such register type.
Secondly, in the future, there are plans to add dynptr helpers that
operate on the dynptr itself and may change its offset and other
properties.

In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
to such helpers, however the current code simply returns 0.

The rejection for helpers that release the dynptr is already handled.

For fixing this, we take a step back and rework existing code in a way
that will allow fitting in all classes of helpers and have a coherent
model for dealing with the variety of use cases in which dynptr is used.

First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
with a DYNPTR_TYPE_* constant that denotes the only type it accepts.

Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
fact. To make the distinction clear, use MEM_RDONLY flag to indicate
that the helper only operates on the memory pointed to by the dynptr,
not the dynptr itself. In C parlance, it would be equivalent to taking
the dynptr as a point to const argument.

When either of these flags are not present, the helper is allowed to
mutate both the dynptr itself and also the memory it points to.
Currently, the read only status of the memory is not tracked in the
dynptr, but it would be trivial to add this support inside dynptr state
of the register.

With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
better reflect its usage, it can no longer be passed to helpers that
initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.

A note to reviewers is that in code that does mark_stack_slots_dynptr,
and unmark_stack_slots_dynptr, we implicitly rely on the fact that
PTR_TO_STACK reg is the only case that can reach that code path, as one
cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
both cases such helpers won't be setting that flag.

The next patch will add a couple of selftest cases to make sure this
doesn't break.

Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h                           |   4 +-
 include/uapi/linux/bpf.h                      |   8 +-
 kernel/bpf/btf.c                              |   7 +-
 kernel/bpf/helpers.c                          |  18 +-
 kernel/bpf/verifier.c                         | 220 +++++++++++++-----
 scripts/bpf_doc.py                            |   1 +
 tools/include/uapi/linux/bpf.h                |   8 +-
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  10 +-
 8 files changed, 195 insertions(+), 81 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 798aec816970..dfe45f6caa4f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -709,7 +709,7 @@ enum bpf_reg_type {
 	PTR_TO_MEM,		 /* reg points to valid memory region */
 	PTR_TO_BUF,		 /* reg points to a read/write buffer */
 	PTR_TO_FUNC,		 /* reg points to a bpf program function */
-	PTR_TO_DYNPTR,		 /* reg points to a dynptr */
+	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
 	__BPF_REG_TYPE_MAX,
 
 	/* Extended reg_types. */
@@ -2761,7 +2761,7 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
 		     enum bpf_dynptr_type type, u32 offset, u32 size);
 void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
 int bpf_dynptr_check_size(u32 size);
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
+u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
 
 #ifdef CONFIG_BPF_LSM
 void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index fb4c911d2a03..b7543efd6284 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5290,7 +5290,7 @@ union bpf_attr {
  *	Return
  *		Nothing. Always succeeds.
  *
- * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
  *	Description
  *		Read *len* bytes from *src* into *dst*, starting from *offset*
  *		into *src*.
@@ -5300,7 +5300,7 @@ union bpf_attr {
  *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
  *		*flags* is not 0.
  *
- * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
  *	Description
  *		Write *len* bytes from *src* into *dst*, starting from *offset*
  *		into *dst*.
@@ -5310,7 +5310,7 @@ union bpf_attr {
  *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
  *		is a read-only dynptr or if *flags* is not 0.
  *
- * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
  *	Description
  *		Get a pointer to the underlying dynptr data.
  *
@@ -5411,7 +5411,7 @@ union bpf_attr {
  *		Drain samples from the specified user ring buffer, and invoke
  *		the provided callback for each such sample:
  *
- *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
+ *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
  *
  *		If **callback_fn** returns 0, the helper will continue to try
  *		and drain the next sample, up to a maximum of
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index d02ae2f4249b..ba3b50895f6b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6568,14 +6568,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				}
 
 				if (arg_dynptr) {
-					if (reg->type != PTR_TO_STACK) {
-						bpf_log(log, "arg#%d pointer type %s %s not to stack\n",
+					if (reg->type != PTR_TO_STACK &&
+					    reg->type != CONST_PTR_TO_DYNPTR) {
+						bpf_log(log, "arg#%d pointer type %s %s not to stack or dynptr\n",
 							i, btf_type_str(ref_t),
 							ref_tname);
 						return -EINVAL;
 					}
 
-					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, NULL))
+					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR | MEM_RDONLY, NULL))
 						return -EINVAL;
 					continue;
 				}
diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 283f55bbeb70..714f5c9d0c1f 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1398,7 +1398,7 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
 #define DYNPTR_SIZE_MASK	0xFFFFFF
 #define DYNPTR_RDONLY_BIT	BIT(31)
 
-static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
+static bool bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_RDONLY_BIT;
 }
@@ -1408,7 +1408,7 @@ static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_typ
 	ptr->size |= type << DYNPTR_TYPE_SHIFT;
 }
 
-u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
+u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
 {
 	return ptr->size & DYNPTR_SIZE_MASK;
 }
@@ -1432,7 +1432,7 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
 	memset(ptr, 0, sizeof(*ptr));
 }
 
-static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
+static int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
 {
 	u32 size = bpf_dynptr_get_size(ptr);
 
@@ -1477,7 +1477,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
 	.arg4_type	= ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,
 };
 
-BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src,
+BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern *, src,
 	   u32, offset, u64, flags)
 {
 	int err;
@@ -1500,12 +1500,12 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = {
 	.ret_type	= RET_INTEGER,
 	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
 	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
-	.arg3_type	= ARG_PTR_TO_DYNPTR,
+	.arg3_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg4_type	= ARG_ANYTHING,
 	.arg5_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
+BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
 	   u32, len, u64, flags)
 {
 	int err;
@@ -1526,14 +1526,14 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {
 	.func		= bpf_dynptr_write,
 	.gpl_only	= false,
 	.ret_type	= RET_INTEGER,
-	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg2_type	= ARG_ANYTHING,
 	.arg3_type	= ARG_PTR_TO_MEM | MEM_RDONLY,
 	.arg4_type	= ARG_CONST_SIZE_OR_ZERO,
 	.arg5_type	= ARG_ANYTHING,
 };
 
-BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
+BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
 {
 	int err;
 
@@ -1554,7 +1554,7 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
 	.func		= bpf_dynptr_data,
 	.gpl_only	= false,
 	.ret_type	= RET_PTR_TO_DYNPTR_MEM_OR_NULL,
-	.arg1_type	= ARG_PTR_TO_DYNPTR,
+	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
 	.arg2_type	= ARG_ANYTHING,
 	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
 };
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 41ef7e4b73e4..c484e632b0cd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -565,7 +565,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
 		[PTR_TO_BUF]		= "buf",
 		[PTR_TO_FUNC]		= "func",
 		[PTR_TO_MAP_KEY]	= "map_key",
-		[PTR_TO_DYNPTR]		= "dynptr_ptr",
+		[CONST_PTR_TO_DYNPTR]	= "dynptr",
 	};
 
 	if (type & PTR_MAYBE_NULL) {
@@ -699,6 +699,27 @@ static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
 	return type == BPF_DYNPTR_TYPE_RINGBUF;
 }
 
+static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
+			       struct bpf_reg_state *reg2,
+			       enum bpf_dynptr_type type);
+
+static void __mark_reg_not_init(const struct bpf_verifier_env *env,
+				struct bpf_reg_state *reg);
+
+static void mark_dynptr_stack_regs(struct bpf_reg_state *sreg1,
+				   struct bpf_reg_state *sreg2,
+				   enum bpf_dynptr_type type)
+{
+	__mark_dynptr_regs(sreg1, sreg2, type);
+}
+
+static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
+			       enum bpf_dynptr_type type)
+{
+	__mark_dynptr_regs(reg1, NULL, type);
+}
+
+
 static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
 				   enum bpf_arg_type arg_type, int insn_idx)
 {
@@ -720,9 +741,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 	if (type == BPF_DYNPTR_TYPE_INVALID)
 		return -EINVAL;
 
-	state->stack[spi].spilled_ptr.dynptr.first_slot = true;
-	state->stack[spi].spilled_ptr.dynptr.type = type;
-	state->stack[spi - 1].spilled_ptr.dynptr.type = type;
+	mark_dynptr_stack_regs(&state->stack[spi].spilled_ptr,
+			       &state->stack[spi - 1].spilled_ptr, type);
 
 	if (dynptr_type_refcounted(type)) {
 		/* The id is used to track proper releasing */
@@ -730,8 +750,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
 		if (id < 0)
 			return id;
 
-		state->stack[spi].spilled_ptr.id = id;
-		state->stack[spi - 1].spilled_ptr.id = id;
+		state->stack[spi].spilled_ptr.ref_obj_id = id;
+		state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
 	}
 
 	return 0;
@@ -753,25 +773,23 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
 	}
 
 	/* Invalidate any slices associated with this dynptr */
-	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
-		release_reference(env, state->stack[spi].spilled_ptr.id);
-		state->stack[spi].spilled_ptr.id = 0;
-		state->stack[spi - 1].spilled_ptr.id = 0;
-	}
-
-	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
-	state->stack[spi].spilled_ptr.dynptr.type = 0;
-	state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
+	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
+		WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
 
+	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
+	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
 	return 0;
 }
 
 static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
-	int i;
+	int spi, i;
+
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return false;
 
+	spi = get_spi(reg->off);
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
 		return true;
 
@@ -787,9 +805,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
 static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
+	int spi;
 	int i;
 
+	/* This already represents first slot of initialized bpf_dynptr */
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return true;
+
+	spi = get_spi(reg->off);
 	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
 	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
 		return false;
@@ -808,15 +831,19 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
 {
 	struct bpf_func_state *state = func(env, reg);
 	enum bpf_dynptr_type dynptr_type;
-	int spi = get_spi(reg->off);
+	int spi;
 
 	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
 	if (arg_type == ARG_PTR_TO_DYNPTR)
 		return true;
 
 	dynptr_type = arg_to_dynptr_type(arg_type);
-
-	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
+	if (reg->type == CONST_PTR_TO_DYNPTR) {
+		return reg->dynptr.type == dynptr_type;
+	} else {
+		spi = get_spi(reg->off);
+		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
+	}
 }
 
 /* The reg state of a pointer or a bounded scalar was saved when
@@ -1324,9 +1351,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
 	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
 };
 
-static void __mark_reg_not_init(const struct bpf_verifier_env *env,
-				struct bpf_reg_state *reg);
-
 /* This helper doesn't clear reg->id */
 static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
 {
@@ -1389,6 +1413,26 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
 	__mark_reg_known_zero(regs + regno);
 }
 
+static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
+			       struct bpf_reg_state *reg2,
+			       enum bpf_dynptr_type type)
+{
+	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
+	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR, so simply
+	 * set it unconditionally as it is ignored for STACK_DYNPTR anyway.
+	 */
+	__mark_reg_known_zero(reg1);
+	reg1->type = CONST_PTR_TO_DYNPTR;
+	reg1->dynptr.type = type;
+	reg1->dynptr.first_slot = true;
+	if (!reg2)
+		return;
+	__mark_reg_known_zero(reg2);
+	reg2->type = CONST_PTR_TO_DYNPTR;
+	reg2->dynptr.type = type;
+	reg2->dynptr.first_slot = false;
+}
+
 static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
 {
 	if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
@@ -5692,20 +5736,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
 	return 0;
 }
 
+/* Implementation details:
+ *
+ * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
+ * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
+ *
+ * In both cases we deal with the first 8 bytes, but need to mark the next 8
+ * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
+ * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
+ *
+ * Mutability of bpf_dynptr is at two levels, one is at the level of struct
+ * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
+ * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
+ * mutate the view of the dynptr and also possibly destroy it. In the latter
+ * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
+ * memory that dynptr points to.
+ *
+ * The verifier will keep track both levels of mutation (bpf_dynptr's in
+ * reg->type and the memory's in reg->dynptr.type), but there is no support for
+ * readonly dynptr view yet, hence only the first case is tracked and checked.
+ *
+ * This is consistent with how C applies the const modifier to a struct object,
+ * where the pointer itself inside bpf_dynptr becomes const but not what it
+ * points to.
+ *
+ * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
+ * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
+ */
 int process_dynptr_func(struct bpf_verifier_env *env, int regno,
-			enum bpf_arg_type arg_type,
-			struct bpf_call_arg_meta *meta)
+			enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta)
 {
 	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
 	int argno = regno - 1;
 
-	/* We only need to check for initialized / uninitialized helper
-	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
-	 * assumption is that if it is, that a helper function
-	 * initialized the dynptr on behalf of the BPF program.
+	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
+		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
+		return -EFAULT;
+	}
+
+	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
+	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
+	 *
+	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
+	 *		 constructing a mutable bpf_dynptr object.
+	 *
+	 *		 Currently, this is only possible with PTR_TO_STACK
+	 *		 pointing to a region of atleast 16 bytes which doesn't
+	 *		 contain an existing bpf_dynptr.
+	 *
+	 *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
+	 *		 mutated or destroyed. However, the memory it points to
+	 *		 may be mutated.
+	 *
+	 *  None       - Points to a initialized dynptr that can be mutated and
+	 *		 destroyed, including mutation of the memory it points
+	 *		 to.
 	 */
-	if (base_type(reg->type) == PTR_TO_DYNPTR)
-		return 0;
 	if (arg_type & MEM_UNINIT) {
 		if (!is_dynptr_reg_valid_uninit(env, reg)) {
 			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
@@ -5722,6 +5808,12 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 
 		meta->uninit_dynptr_regno = regno;
 	} else {
+		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
+		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
+			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
+			return -EINVAL;
+		}
+
 		if (!is_dynptr_reg_valid_init(env, reg)) {
 			verbose(env,
 				"Expected an initialized dynptr as arg #%d\n",
@@ -5729,6 +5821,8 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 			return -EINVAL;
 		}
 
+		/* Fold MEM_RDONLY, only inspect arg_type */
+		arg_type &= ~MEM_RDONLY;
 		if (!is_dynptr_type_expected(env, reg, arg_type)) {
 			const char *err_extra = "";
 
@@ -5874,7 +5968,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
 static const struct bpf_reg_types dynptr_types = {
 	.types = {
 		PTR_TO_STACK,
-		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
+		CONST_PTR_TO_DYNPTR,
 	}
 };
 
@@ -6050,12 +6144,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
 }
 
-static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
+static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
 {
 	struct bpf_func_state *state = func(env, reg);
-	int spi = get_spi(reg->off);
+	int spi;
 
-	return state->stack[spi].spilled_ptr.id;
+	if (reg->type == CONST_PTR_TO_DYNPTR)
+		return reg->ref_obj_id;
+	spi = get_spi(reg->off);
+	return state->stack[spi].spilled_ptr.ref_obj_id;
 }
 
 static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
@@ -6119,11 +6216,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 	if (arg_type_is_release(arg_type)) {
 		if (arg_type_is_dynptr(arg_type)) {
 			struct bpf_func_state *state = func(env, reg);
-			int spi = get_spi(reg->off);
+			int spi;
 
-			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
-			    !state->stack[spi].spilled_ptr.id) {
-				verbose(env, "arg %d is an unacquired reference\n", regno);
+			if (reg->type == PTR_TO_STACK) {
+				spi = get_spi(reg->off);
+				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
+				    !state->stack[spi].spilled_ptr.ref_obj_id) {
+					verbose(env, "arg %d is an unacquired reference\n", regno);
+					return -EINVAL;
+				}
+			} else {
+				verbose(env, "cannot release unowned const bpf_dynptr\n");
 				return -EINVAL;
 			}
 		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
@@ -7091,11 +7194,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
 {
 	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
 	 *			  callback_ctx, u64 flags);
-	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
+	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
 	 */
 	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
-	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
-	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
+	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
 	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
 
 	/* unused */
@@ -7474,7 +7576,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	regs = cur_regs(env);
 
+	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
+	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
+	 * is safe to do directly.
+	 */
 	if (meta.uninit_dynptr_regno) {
+		if (WARN_ON_ONCE(regs[meta.uninit_dynptr_regno].type == CONST_PTR_TO_DYNPTR))
+			return -EFAULT;
 		/* we write BPF_DW bits (8 bytes) at a time */
 		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
 			err = check_mem_access(env, insn_idx, meta.uninit_dynptr_regno,
@@ -7492,15 +7600,22 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 	if (meta.release_regno) {
 		err = -EINVAL;
-		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
+		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
+		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
+		 * is safe to do directly.
+		 */
+		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1])) {
+			if (WARN_ON_ONCE(regs[meta.release_regno].type == CONST_PTR_TO_DYNPTR))
+				return -EFAULT;
 			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
-		else if (meta.ref_obj_id)
+		} else if (meta.ref_obj_id) {
 			err = release_reference(env, meta.ref_obj_id);
-		/* meta.ref_obj_id can only be 0 if register that is meant to be
-		 * released is NULL, which must be > R0.
-		 */
-		else if (register_is_null(&regs[meta.release_regno]))
+		} else if (register_is_null(&regs[meta.release_regno])) {
+			/* meta.ref_obj_id can only be 0 if register that is meant to be
+			 * released is NULL, which must be > R0.
+			 */
 			err = 0;
+		}
 		if (err) {
 			verbose(env, "func %s#%d reference has not been acquired before\n",
 				func_id_name(func_id), func_id);
@@ -7574,11 +7689,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 					return -EFAULT;
 				}
 
-				if (base_type(reg->type) != PTR_TO_DYNPTR)
-					/* Find the id of the dynptr we're
-					 * tracking the reference of
-					 */
-					meta.ref_obj_id = stack_slot_get_id(env, reg);
+				/* Find the id of the dynptr we're tracking the reference of */
+				meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
 				break;
 			}
 		}
diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
index fdb0aff8cb5a..e8d90829f23e 100755
--- a/scripts/bpf_doc.py
+++ b/scripts/bpf_doc.py
@@ -752,6 +752,7 @@ class PrinterHelpers(Printer):
             'struct bpf_timer',
             'struct mptcp_sock',
             'struct bpf_dynptr',
+            'const struct bpf_dynptr',
             'struct iphdr',
             'struct ipv6hdr',
     }
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index fb4c911d2a03..b7543efd6284 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5290,7 +5290,7 @@ union bpf_attr {
  *	Return
  *		Nothing. Always succeeds.
  *
- * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
+ * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
  *	Description
  *		Read *len* bytes from *src* into *dst*, starting from *offset*
  *		into *src*.
@@ -5300,7 +5300,7 @@ union bpf_attr {
  *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
  *		*flags* is not 0.
  *
- * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
+ * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
  *	Description
  *		Write *len* bytes from *src* into *dst*, starting from *offset*
  *		into *dst*.
@@ -5310,7 +5310,7 @@ union bpf_attr {
  *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
  *		is a read-only dynptr or if *flags* is not 0.
  *
- * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
+ * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
  *	Description
  *		Get a pointer to the underlying dynptr data.
  *
@@ -5411,7 +5411,7 @@ union bpf_attr {
  *		Drain samples from the specified user ring buffer, and invoke
  *		the provided callback for each such sample:
  *
- *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
+ *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
  *
  *		If **callback_fn** returns 0, the helper will continue to try
  *		and drain the next sample, up to a maximum of
diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
index 02b18d018b36..39882580cb90 100644
--- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
@@ -668,13 +668,13 @@ static struct {
 	const char *expected_err_msg;
 } failure_tests[] = {
 	/* failure cases */
-	{"user_ringbuf_callback_bad_access1", "negative offset dynptr_ptr ptr"},
-	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr_ptr ptr"},
-	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr_ptr'"},
+	{"user_ringbuf_callback_bad_access1", "negative offset dynptr ptr"},
+	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr ptr"},
+	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr'"},
 	{"user_ringbuf_callback_null_context_write", "invalid mem access 'scalar'"},
 	{"user_ringbuf_callback_null_context_read", "invalid mem access 'scalar'"},
-	{"user_ringbuf_callback_discard_dynptr", "arg 1 is an unacquired reference"},
-	{"user_ringbuf_callback_submit_dynptr", "arg 1 is an unacquired reference"},
+	{"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
+	{"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
 };
 
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2022-11-15  0:01 ` [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-15 18:24   ` Joanne Koong
  2022-11-17 23:42   ` David Vernet
  2022-11-15  0:01 ` [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

While check_func_arg_reg_off is the place which performs generic checks
needed by various candidates of reg->type, there is some handling for
special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
ARG_PTR_TO_ALLOC_MEM.

This commit aims to streamline these special cases and instead leave
other things up to argument type specific code to handle. The function
will be restrictive by default, and cover all possible cases when
OBJ_RELEASE is set, without having to update the function again (and
missing to do that being a bug).

This is done primarily for two reasons: associating back reg->type to
its argument leaves room for the list getting out of sync when a new
reg->type is supported by an arg_type.

The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
we already handle, whenever a release argument is expected, it should
be passed as the pointer that was received from the acquire function.
Hence zero fixed and variable offset.

There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
with non-zero offset to other helper functions, which makes sense.

Hence, lift the arg_type_is_release check for reg->off and cover all
possible register types, instead of duplicating the same kind of check
twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).

For the release argument, arg_type_is_dynptr is the special case, where
we go to actual object being freed through the dynptr, so the offset of
the pointer still needs to allow fixed and variable offset and
process_dynptr_func will verify them later for the release argument case
as well.

This is not specific to ARG_PTR_TO_DYNPTR though, we will need to make
this exception for any future object on the stack that needs to be
released. In this sense, PTR_TO_STACK as a candidate for object on stack
argument is a special case for release offset checks, and they need to
be done by the helper releasing the object on stack.

Since the check has been lifted above all register type checks, remove
the duplicated check that is being done for PTR_TO_BTF_ID.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c                         | 62 ++++++++++++-------
 .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
 2 files changed, 39 insertions(+), 25 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index c484e632b0cd..34e67d04579b 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6092,11 +6092,38 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 			   const struct bpf_reg_state *reg, int regno,
 			   enum bpf_arg_type arg_type)
 {
-	enum bpf_reg_type type = reg->type;
-	bool fixed_off_ok = false;
+	u32 type = reg->type;
 
-	switch ((u32)type) {
-	/* Pointer types where reg offset is explicitly allowed: */
+	/* When referenced register is passed to release function, it's fixed
+	 * offset must be 0.
+	 *
+	 * We will check arg_type_is_release reg has ref_obj_id when storing
+	 * meta->release_regno.
+	 */
+	if (arg_type_is_release(arg_type)) {
+		/* ARG_PTR_TO_DYNPTR with OBJ_RELEASE is a bit special, as it
+		 * may not directly point to the object being released, but to
+		 * dynptr pointing to such object, which might be at some offset
+		 * on the stack. In that case, we simply to fallback to the
+		 * default handling.
+		 */
+		if (!arg_type_is_dynptr(arg_type) || type != PTR_TO_STACK) {
+			/* Doing check_ptr_off_reg check for the offset will
+			 * catch this because fixed_off_ok is false, but
+			 * checking here allows us to give the user a better
+			 * error message.
+			 */
+			if (reg->off) {
+				verbose(env, "R%d must have zero offset when passed to release func\n",
+					regno);
+				return -EINVAL;
+			}
+			return __check_ptr_off_reg(env, reg, regno, false);
+		}
+		/* Fallback to default handling */
+	}
+	switch (type) {
+	/* Pointer types where both fixed and variable offset is explicitly allowed: */
 	case PTR_TO_STACK:
 		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
 			verbose(env, "cannot pass in dynptr at an offset\n");
@@ -6113,35 +6140,22 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	case PTR_TO_BUF:
 	case PTR_TO_BUF | MEM_RDONLY:
 	case SCALAR_VALUE:
-		/* Some of the argument types nevertheless require a
-		 * zero register offset.
-		 */
-		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
-			return 0;
-		break;
+		return 0;
 	/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
 	 * fixed offset.
 	 */
 	case PTR_TO_BTF_ID:
 		/* When referenced PTR_TO_BTF_ID is passed to release function,
 		 * it's fixed offset must be 0.	In the other cases, fixed offset
-		 * can be non-zero.
+		 * can be non-zero. This was already checked above. So pass
+		 * fixed_off_ok as true to allow fixed offset for all other
+		 * cases. var_off always must be 0 for PTR_TO_BTF_ID, hence we
+		 * still need to do checks instead of returning.
 		 */
-		if (arg_type_is_release(arg_type) && reg->off) {
-			verbose(env, "R%d must have zero offset when passed to release func\n",
-				regno);
-			return -EINVAL;
-		}
-		/* For arg is release pointer, fixed_off_ok must be false, but
-		 * we already checked and rejected reg->off != 0 above, so set
-		 * to true to allow fixed offset for all other cases.
-		 */
-		fixed_off_ok = true;
-		break;
+		return __check_ptr_off_reg(env, reg, regno, true);
 	default:
-		break;
+		return __check_ptr_off_reg(env, reg, regno, false);
 	}
-	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
 }
 
 static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
index b64d33e4833c..92e3f6a61a79 100644
--- a/tools/testing/selftests/bpf/verifier/ringbuf.c
+++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
@@ -28,7 +28,7 @@
 	},
 	.fixup_map_ringbuf = { 1 },
 	.result = REJECT,
-	.errstr = "dereference of modified alloc_mem ptr R1",
+	.errstr = "R1 must have zero offset when passed to release func",
 },
 {
 	"ringbuf: invalid reservation offset 2",
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
                   ` (3 preceding siblings ...)
  2022-11-15  0:01 ` [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-15 18:29   ` Joanne Koong
  2022-11-17 23:49   ` David Vernet
  2022-11-15  0:01 ` [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
  2022-11-15  0:01 ` [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
  6 siblings, 2 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong, David Vernet

After previous commit, we are minimizing helper specific assumptions
from check_func_arg_reg_off, making it generic, and offloading checks
for a specific argument type to their respective functions called after
check_func_arg_reg_off has been called.

This allows relying on a consistent set of guarantees after that call
and then relying on them in code that deals with registers for each
argument type later. This is in line with how process_spin_lock,
process_timer_func, process_kptr_func check reg->var_off to be constant.
The same reasoning is used here to move the alignment check into
process_dynptr_func. Note that it also needs to check for constant
var_off, and accumulate the constant var_off when computing the spi in
get_spi, but that fix will come in later changes.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/verifier.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 34e67d04579b..fd292f762d53 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
 		return -EFAULT;
 	}
 
+	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
+	 * check_func_arg_reg_off's logic. We only need to check offset
+	 * alignment for PTR_TO_STACK.
+	 */
+	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
+		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
+		return -EINVAL;
+	}
 	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
 	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
 	 *
@@ -6125,11 +6133,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
 	switch (type) {
 	/* Pointer types where both fixed and variable offset is explicitly allowed: */
 	case PTR_TO_STACK:
-		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
-			verbose(env, "cannot pass in dynptr at an offset\n");
-			return -EINVAL;
-		}
-		fallthrough;
 	case PTR_TO_PACKET:
 	case PTR_TO_PACKET_META:
 	case PTR_TO_MAP_KEY:
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write}
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
                   ` (4 preceding siblings ...)
  2022-11-15  0:01 ` [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-17 23:51   ` David Vernet
  2022-11-15  0:01 ` [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
  6 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: Joanne Koong, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, David Vernet

It may happen that destination buffer memory overlaps with memory dynptr
points to. Hence, we must use memmove to correctly copy from dynptr to
destination buffer, or source buffer to dynptr.

This actually isn't a problem right now, as memcpy implementation falls
back to memmove on detecting overlap and warns about it, but we
shouldn't be relying on that.

Acked-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/helpers.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
index 714f5c9d0c1f..1099ed1e7712 100644
--- a/kernel/bpf/helpers.c
+++ b/kernel/bpf/helpers.c
@@ -1489,7 +1489,7 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern
 	if (err)
 		return err;
 
-	memcpy(dst, src->data + src->offset + offset, len);
+	memmove(dst, src->data + src->offset + offset, len);
 
 	return 0;
 }
@@ -1517,7 +1517,7 @@ BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, v
 	if (err)
 		return err;
 
-	memcpy(dst->data + dst->offset + offset, src, len);
+	memmove(dst->data + dst->offset + offset, src, len);
 
 	return 0;
 }
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
                   ` (5 preceding siblings ...)
  2022-11-15  0:01 ` [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
@ 2022-11-15  0:01 ` Kumar Kartikeya Dwivedi
  2022-11-15 18:36   ` Joanne Koong
  6 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15  0:01 UTC (permalink / raw)
  To: bpf
  Cc: David Vernet, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Joanne Koong

The original support for bpf_user_ringbuf_drain callbacks simply
short-circuited checks for the dynptr state, allowing users to pass
PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
dynptr. This bug would have also surfaced with other dynptr helpers in
the future that changed dynptr view or modified it in some way.

Include test cases for all cases, i.e. both bpf_dynptr_from_mem and
bpf_ringbuf_reserve_dynptr, and ensure verifier rejects both of them.
Without the fix, both of these programs load and pass verification.

Acked-by: David Vernet <void@manifault.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../selftests/bpf/prog_tests/user_ringbuf.c   |  2 ++
 .../selftests/bpf/progs/user_ringbuf_fail.c   | 35 +++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
index 39882580cb90..500a63bb70a8 100644
--- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
+++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
@@ -676,6 +676,8 @@ static struct {
 	{"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
 	{"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
+	{"user_ringbuf_callback_reinit_dynptr_mem", "Dynptr has to be an uninitialized dynptr"},
+	{"user_ringbuf_callback_reinit_dynptr_ringbuf", "Dynptr has to be an uninitialized dynptr"},
 };
 
 #define SUCCESS_TEST(_func) { _func, #_func }
diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
index 82aba4529aa9..7730d13c0cea 100644
--- a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
+++ b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
@@ -18,6 +18,13 @@ struct {
 	__uint(type, BPF_MAP_TYPE_USER_RINGBUF);
 } user_ringbuf SEC(".maps");
 
+struct {
+	__uint(type, BPF_MAP_TYPE_RINGBUF);
+	__uint(max_entries, 2);
+} ringbuf SEC(".maps");
+
+static int map_value;
+
 static long
 bad_access1(struct bpf_dynptr *dynptr, void *context)
 {
@@ -175,3 +182,31 @@ int user_ringbuf_callback_invalid_return(void *ctx)
 
 	return 0;
 }
+
+static long
+try_reinit_dynptr_mem(struct bpf_dynptr *dynptr, void *context)
+{
+	bpf_dynptr_from_mem(&map_value, 4, 0, dynptr);
+	return 0;
+}
+
+static long
+try_reinit_dynptr_ringbuf(struct bpf_dynptr *dynptr, void *context)
+{
+	bpf_ringbuf_reserve_dynptr(&ringbuf, 8, 0, dynptr);
+	return 0;
+}
+
+SEC("?raw_tp/sys_nanosleep")
+int user_ringbuf_callback_reinit_dynptr_mem(void *ctx)
+{
+	bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_mem, NULL, 0);
+	return 0;
+}
+
+SEC("?raw_tp/sys_nanosleep")
+int user_ringbuf_callback_reinit_dynptr_ringbuf(void *ctx)
+{
+	bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_ringbuf, NULL, 0);
+	return 0;
+}
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg
  2022-11-15  0:01 ` [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg Kumar Kartikeya Dwivedi
@ 2022-11-15  3:53   ` Joanne Koong
  0 siblings, 0 replies; 28+ messages in thread
From: Joanne Koong @ 2022-11-15  3:53 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Currently, we simply ignore the errors in process_spin_lock,
> process_timer_func, process_kptr_func, process_dynptr_func.
> Instead, bubble up storing and checking err variable.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  kernel/bpf/verifier.c | 25 +++++++++++++++----------
>  1 file changed, 15 insertions(+), 10 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 56f48ab9827f..41ef7e4b73e4 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6220,19 +6220,22 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                 break;
>         case ARG_PTR_TO_SPIN_LOCK:
>                 if (meta->func_id == BPF_FUNC_spin_lock) {
> -                       if (process_spin_lock(env, regno, true))
> -                               return -EACCES;
> +                       err = process_spin_lock(env, regno, true);
> +                       if (err)
> +                               return err;
>                 } else if (meta->func_id == BPF_FUNC_spin_unlock) {
> -                       if (process_spin_lock(env, regno, false))
> -                               return -EACCES;
> +                       err = process_spin_lock(env, regno, false);
> +                       if (err)
> +                               return err;
>                 } else {
>                         verbose(env, "verifier internal error\n");
>                         return -EFAULT;
>                 }
>                 break;
>         case ARG_PTR_TO_TIMER:
> -               if (process_timer_func(env, regno, meta))
> -                       return -EACCES;
> +               err = process_timer_func(env, regno, meta);
> +               if (err)
> +                       return err;
>                 break;
>         case ARG_PTR_TO_FUNC:
>                 meta->subprogno = reg->subprogno;
> @@ -6255,8 +6258,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                 err = check_mem_size_reg(env, reg, regno, true, meta);
>                 break;
>         case ARG_PTR_TO_DYNPTR:
> -               if (process_dynptr_func(env, regno, arg_type, meta))
> -                       return -EACCES;
> +               err = process_dynptr_func(env, regno, arg_type, meta);
> +               if (err)
> +                       return err;
>                 break;
>         case ARG_CONST_ALLOC_SIZE_OR_ZERO:
>                 if (!tnum_is_const(reg->var_off)) {
> @@ -6323,8 +6327,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                 break;
>         }
>         case ARG_PTR_TO_KPTR:
> -               if (process_kptr_func(env, regno, meta))
> -                       return -EACCES;
> +               err = process_kptr_func(env, regno, meta);
> +               if (err)
> +                       return err;
>                 break;
>         }
>
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func
  2022-11-15  0:01 ` [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-15  4:15   ` Joanne Koong
  0 siblings, 0 replies; 28+ messages in thread
From: Joanne Koong @ 2022-11-15  4:15 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, David Vernet, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> ARG_PTR_TO_DYNPTR is akin to ARG_PTR_TO_TIMER, ARG_PTR_TO_KPTR, where
> the underlying register type is subjected to more special checks to
> determine the type of object represented by the pointer and its state
> consistency.
>
> Move dynptr checks to their own 'process_dynptr_func' function so that
> is consistent and in-line with existing code. This also makes it easier
> to reuse this code for kfunc handling.
>
> Then, reuse this consolidated function in kfunc dynptr handling too.
> Note that for kfuncs, the arg_type constraint of DYNPTR_TYPE_LOCAL has
> been lifted.
>
> Acked-by: David Vernet <void@manifault.com>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

Left a couple of comments below but otherwise LGTM!

> ---
>  include/linux/bpf_verifier.h                  |   8 +-
>  kernel/bpf/btf.c                              |  17 +--
>  kernel/bpf/verifier.c                         | 116 ++++++++++--------
>  .../bpf/prog_tests/kfunc_dynptr_param.c       |   5 +-
>  .../bpf/progs/test_kfunc_dynptr_param.c       |  12 --
>  5 files changed, 70 insertions(+), 88 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 1a32baa78ce2..a69b6d49d40c 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -593,11 +593,9 @@ int check_kfunc_mem_size_reg(struct bpf_verifier_env *env, struct bpf_reg_state
>                              u32 regno);
>  int check_mem_reg(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>                    u32 regno, u32 mem_size);
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -                             struct bpf_reg_state *reg);
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -                            struct bpf_reg_state *reg,
> -                            enum bpf_arg_type arg_type);
> +struct bpf_call_arg_meta;
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +                       enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta);
>
>  /* this lives here instead of in bpf.h because it needs to dereference tgt_prog */
>  static inline u64 bpf_trampoline_compute_key(const struct bpf_prog *tgt_prog,
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index 5579ff3a5b54..d02ae2f4249b 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6575,23 +6575,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
>                                                 return -EINVAL;
>                                         }
>
> -                                       if (!is_dynptr_reg_valid_init(env, reg)) {
> -                                               bpf_log(log,
> -                                                       "arg#%d pointer type %s %s must be valid and initialized\n",
> -                                                       i, btf_type_str(ref_t),
> -                                                       ref_tname);
> +                                       if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, NULL))
>                                                 return -EINVAL;
> -                                       }
> -
> -                                       if (!is_dynptr_type_expected(env, reg,
> -                                                       ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL)) {
> -                                               bpf_log(log,
> -                                                       "arg#%d pointer type %s %s points to unsupported dynamic pointer type\n",
> -                                                       i, btf_type_str(ref_t),
> -                                                       ref_tname);
> -                                               return -EINVAL;
> -                                       }
> -
>                                         continue;
>                                 }
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 07c0259dfc1a..56f48ab9827f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -784,8 +784,7 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>         return true;
>  }
>
> -bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
> -                             struct bpf_reg_state *reg)
> +static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
>         int spi = get_spi(reg->off);
> @@ -804,9 +803,8 @@ bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env,
>         return true;
>  }
>
> -bool is_dynptr_type_expected(struct bpf_verifier_env *env,
> -                            struct bpf_reg_state *reg,
> -                            enum bpf_arg_type arg_type)
> +static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> +                                   enum bpf_arg_type arg_type)
>  {
>         struct bpf_func_state *state = func(env, reg);
>         enum bpf_dynptr_type dynptr_type;
> @@ -5694,6 +5692,66 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>         return 0;
>  }
>
> +int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> +                       enum bpf_arg_type arg_type,
> +                       struct bpf_call_arg_meta *meta)
> +{
> +       struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> +       int argno = regno - 1;

I think we can just get rid of argno here and replace every instance
of argno in this function with regno

> +
> +       /* We only need to check for initialized / uninitialized helper
> +        * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> +        * assumption is that if it is, that a helper function
> +        * initialized the dynptr on behalf of the BPF program.
> +        */
> +       if (base_type(reg->type) == PTR_TO_DYNPTR)
> +               return 0;
> +       if (arg_type & MEM_UNINIT) {
> +               if (!is_dynptr_reg_valid_uninit(env, reg)) {
> +                       verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> +                       return -EINVAL;
> +               }
> +
> +               /* We only support one dynptr being uninitialized at the moment,
> +                * which is sufficient for the helper functions we have right now.
> +                */
> +               if (meta->uninit_dynptr_regno) {
> +                       verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> +                       return -EFAULT;
> +               }
> +
> +               meta->uninit_dynptr_regno = regno;
> +       } else {
> +               if (!is_dynptr_reg_valid_init(env, reg)) {
> +                       verbose(env,
> +                               "Expected an initialized dynptr as arg #%d\n",
> +                               argno + 1);
> +                       return -EINVAL;
> +               }
> +
> +               if (!is_dynptr_type_expected(env, reg, arg_type)) {
> +                       const char *err_extra = "";
> +
> +                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> +                       case DYNPTR_TYPE_LOCAL:
> +                               err_extra = "local";
> +                               break;
> +                       case DYNPTR_TYPE_RINGBUF:
> +                               err_extra = "ringbuf";
> +                               break;
> +                       default:
> +                               err_extra = "<unknown>";
> +                               break;
> +                       }
> +                       verbose(env,
> +                               "Expected a dynptr of type %s as arg #%d\n",
> +                               err_extra, argno + 1);
> +                       return -EINVAL;
> +               }
> +       }
> +       return 0;
> +}
> +
>  static bool arg_type_is_mem_size(enum bpf_arg_type type)
>  {
>         return type == ARG_CONST_SIZE ||
> @@ -6197,52 +6255,8 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                 err = check_mem_size_reg(env, reg, regno, true, meta);
>                 break;
>         case ARG_PTR_TO_DYNPTR:
> -               /* We only need to check for initialized / uninitialized helper
> -                * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> -                * assumption is that if it is, that a helper function
> -                * initialized the dynptr on behalf of the BPF program.
> -                */
> -               if (base_type(reg->type) == PTR_TO_DYNPTR)
> -                       break;
> -               if (arg_type & MEM_UNINIT) {
> -                       if (!is_dynptr_reg_valid_uninit(env, reg)) {
> -                               verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> -                               return -EINVAL;
> -                       }
> -
> -                       /* We only support one dynptr being uninitialized at the moment,
> -                        * which is sufficient for the helper functions we have right now.
> -                        */
> -                       if (meta->uninit_dynptr_regno) {
> -                               verbose(env, "verifier internal error: multiple uninitialized dynptr args\n");
> -                               return -EFAULT;
> -                       }
> -
> -                       meta->uninit_dynptr_regno = regno;
> -               } else if (!is_dynptr_reg_valid_init(env, reg)) {
> -                       verbose(env,
> -                               "Expected an initialized dynptr as arg #%d\n",
> -                               arg + 1);
> -                       return -EINVAL;
> -               } else if (!is_dynptr_type_expected(env, reg, arg_type)) {
> -                       const char *err_extra = "";
> -
> -                       switch (arg_type & DYNPTR_TYPE_FLAG_MASK) {
> -                       case DYNPTR_TYPE_LOCAL:
> -                               err_extra = "local";
> -                               break;
> -                       case DYNPTR_TYPE_RINGBUF:
> -                               err_extra = "ringbuf";
> -                               break;
> -                       default:
> -                               err_extra = "<unknown>";
> -                               break;
> -                       }
> -                       verbose(env,
> -                               "Expected a dynptr of type %s as arg #%d\n",
> -                               err_extra, arg + 1);
> -                       return -EINVAL;
> -               }
> +               if (process_dynptr_func(env, regno, arg_type, meta))
> +                       return -EACCES;

Should this return -EACCES or propagate the error from process_dynptr_func?

>                 break;
>         case ARG_CONST_ALLOC_SIZE_OR_ZERO:
>                 if (!tnum_is_const(reg->var_off)) {
> diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> index c210657d4d0a..fc562e863e79 100644
> --- a/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> +++ b/tools/testing/selftests/bpf/prog_tests/kfunc_dynptr_param.c
> @@ -18,10 +18,7 @@ static struct {
>         const char *expected_verifier_err_msg;
>         int expected_runtime_err;
>  } kfunc_dynptr_tests[] = {
> -       {"dynptr_type_not_supp",
> -        "arg#0 pointer type STRUCT bpf_dynptr_kern points to unsupported dynamic pointer type", 0},
> -       {"not_valid_dynptr",
> -        "arg#0 pointer type STRUCT bpf_dynptr_kern must be valid and initialized", 0},
> +       {"not_valid_dynptr", "Expected an initialized dynptr as arg #1", 0},
>         {"not_ptr_to_stack", "arg#0 pointer type STRUCT bpf_dynptr_kern not to stack", 0},
>         {"dynptr_data_null", NULL, -EBADMSG},
>  };
> diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> index ce39d096bba3..f4a8250329b2 100644
> --- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> +++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> @@ -32,18 +32,6 @@ int err, pid;
>
>  char _license[] SEC("license") = "GPL";
>
> -SEC("?lsm.s/bpf")
> -int BPF_PROG(dynptr_type_not_supp, int cmd, union bpf_attr *attr,
> -            unsigned int size)
> -{
> -       char write_data[64] = "hello there, world!!";
> -       struct bpf_dynptr ptr;
> -
> -       bpf_ringbuf_reserve_dynptr(&ringbuf, sizeof(write_data), 0, &ptr);
> -
> -       return bpf_verify_pkcs7_signature(&ptr, &ptr, NULL);
> -}
> -
>  SEC("?lsm.s/bpf")
>  int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size)
>  {
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off
  2022-11-15  0:01 ` [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
@ 2022-11-15 18:24   ` Joanne Koong
  2022-11-17 23:42   ` David Vernet
  1 sibling, 0 replies; 28+ messages in thread
From: Joanne Koong @ 2022-11-15 18:24 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> While check_func_arg_reg_off is the place which performs generic checks
> needed by various candidates of reg->type, there is some handling for
> special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> ARG_PTR_TO_ALLOC_MEM.
>
> This commit aims to streamline these special cases and instead leave
> other things up to argument type specific code to handle. The function
> will be restrictive by default, and cover all possible cases when
> OBJ_RELEASE is set, without having to update the function again (and
> missing to do that being a bug).
>
> This is done primarily for two reasons: associating back reg->type to
> its argument leaves room for the list getting out of sync when a new
> reg->type is supported by an arg_type.
>
> The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> we already handle, whenever a release argument is expected, it should
> be passed as the pointer that was received from the acquire function.
> Hence zero fixed and variable offset.
>
> There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> with non-zero offset to other helper functions, which makes sense.
>
> Hence, lift the arg_type_is_release check for reg->off and cover all
> possible register types, instead of duplicating the same kind of check
> twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
>
> For the release argument, arg_type_is_dynptr is the special case, where
> we go to actual object being freed through the dynptr, so the offset of
> the pointer still needs to allow fixed and variable offset and
> process_dynptr_func will verify them later for the release argument case
> as well.
>
> This is not specific to ARG_PTR_TO_DYNPTR though, we will need to make
> this exception for any future object on the stack that needs to be
> released. In this sense, PTR_TO_STACK as a candidate for object on stack
> argument is a special case for release offset checks, and they need to
> be done by the helper releasing the object on stack.
>
> Since the check has been lifted above all register type checks, remove
> the duplicated check that is being done for PTR_TO_BTF_ID.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  kernel/bpf/verifier.c                         | 62 ++++++++++++-------
>  .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
>  2 files changed, 39 insertions(+), 25 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index c484e632b0cd..34e67d04579b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6092,11 +6092,38 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>                            const struct bpf_reg_state *reg, int regno,
>                            enum bpf_arg_type arg_type)
>  {
> -       enum bpf_reg_type type = reg->type;
> -       bool fixed_off_ok = false;
> +       u32 type = reg->type;
>
> -       switch ((u32)type) {
> -       /* Pointer types where reg offset is explicitly allowed: */
> +       /* When referenced register is passed to release function, it's fixed
> +        * offset must be 0.
> +        *
> +        * We will check arg_type_is_release reg has ref_obj_id when storing
> +        * meta->release_regno.
> +        */
> +       if (arg_type_is_release(arg_type)) {
> +               /* ARG_PTR_TO_DYNPTR with OBJ_RELEASE is a bit special, as it
> +                * may not directly point to the object being released, but to
> +                * dynptr pointing to such object, which might be at some offset
> +                * on the stack. In that case, we simply to fallback to the
> +                * default handling.
> +                */
> +               if (!arg_type_is_dynptr(arg_type) || type != PTR_TO_STACK) {
> +                       /* Doing check_ptr_off_reg check for the offset will
> +                        * catch this because fixed_off_ok is false, but
> +                        * checking here allows us to give the user a better
> +                        * error message.
> +                        */
> +                       if (reg->off) {
> +                               verbose(env, "R%d must have zero offset when passed to release func\n",
> +                                       regno);
> +                               return -EINVAL;
> +                       }
> +                       return __check_ptr_off_reg(env, reg, regno, false);
> +               }
> +               /* Fallback to default handling */
> +       }
> +       switch (type) {
> +       /* Pointer types where both fixed and variable offset is explicitly allowed: */
>         case PTR_TO_STACK:
>                 if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
>                         verbose(env, "cannot pass in dynptr at an offset\n");
> @@ -6113,35 +6140,22 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         case PTR_TO_BUF:
>         case PTR_TO_BUF | MEM_RDONLY:
>         case SCALAR_VALUE:
> -               /* Some of the argument types nevertheless require a
> -                * zero register offset.
> -                */
> -               if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> -                       return 0;
> -               break;
> +               return 0;
>         /* All the rest must be rejected, except PTR_TO_BTF_ID which allows
>          * fixed offset.
>          */
>         case PTR_TO_BTF_ID:
>                 /* When referenced PTR_TO_BTF_ID is passed to release function,
>                  * it's fixed offset must be 0. In the other cases, fixed offset
> -                * can be non-zero.
> +                * can be non-zero. This was already checked above. So pass
> +                * fixed_off_ok as true to allow fixed offset for all other
> +                * cases. var_off always must be 0 for PTR_TO_BTF_ID, hence we
> +                * still need to do checks instead of returning.
>                  */
> -               if (arg_type_is_release(arg_type) && reg->off) {
> -                       verbose(env, "R%d must have zero offset when passed to release func\n",
> -                               regno);
> -                       return -EINVAL;
> -               }
> -               /* For arg is release pointer, fixed_off_ok must be false, but
> -                * we already checked and rejected reg->off != 0 above, so set
> -                * to true to allow fixed offset for all other cases.
> -                */
> -               fixed_off_ok = true;
> -               break;
> +               return __check_ptr_off_reg(env, reg, regno, true);
>         default:
> -               break;
> +               return __check_ptr_off_reg(env, reg, regno, false);
>         }
> -       return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>
>  static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
> index b64d33e4833c..92e3f6a61a79 100644
> --- a/tools/testing/selftests/bpf/verifier/ringbuf.c
> +++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
> @@ -28,7 +28,7 @@
>         },
>         .fixup_map_ringbuf = { 1 },
>         .result = REJECT,
> -       .errstr = "dereference of modified alloc_mem ptr R1",
> +       .errstr = "R1 must have zero offset when passed to release func",
>  },
>  {
>         "ringbuf: invalid reservation offset 2",
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-15  0:01 ` [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-15 18:29   ` Joanne Koong
  2022-11-17 23:49   ` David Vernet
  1 sibling, 0 replies; 28+ messages in thread
From: Joanne Koong @ 2022-11-15 18:29 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> After previous commit, we are minimizing helper specific assumptions
> from check_func_arg_reg_off, making it generic, and offloading checks
> for a specific argument type to their respective functions called after
> check_func_arg_reg_off has been called.
>
> This allows relying on a consistent set of guarantees after that call
> and then relying on them in code that deals with registers for each
> argument type later. This is in line with how process_spin_lock,
> process_timer_func, process_kptr_func check reg->var_off to be constant.
> The same reasoning is used here to move the alignment check into
> process_dynptr_func. Note that it also needs to check for constant
> var_off, and accumulate the constant var_off when computing the spi in
> get_spi, but that fix will come in later changes.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  kernel/bpf/verifier.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 34e67d04579b..fd292f762d53 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                 return -EFAULT;
>         }
>
> +       /* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> +        * check_func_arg_reg_off's logic. We only need to check offset
> +        * alignment for PTR_TO_STACK.
> +        */
> +       if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> +               verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> +               return -EINVAL;
> +       }
>         /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>          * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>          *
> @@ -6125,11 +6133,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         switch (type) {
>         /* Pointer types where both fixed and variable offset is explicitly allowed: */
>         case PTR_TO_STACK:
> -               if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -                       verbose(env, "cannot pass in dynptr at an offset\n");
> -                       return -EINVAL;
> -               }
> -               fallthrough;
>         case PTR_TO_PACKET:
>         case PTR_TO_PACKET_META:
>         case PTR_TO_MAP_KEY:
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  2022-11-15  0:01 ` [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
@ 2022-11-15 18:36   ` Joanne Koong
  2022-11-15 19:41     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 28+ messages in thread
From: Joanne Koong @ 2022-11-15 18:36 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, David Vernet, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> The original support for bpf_user_ringbuf_drain callbacks simply
> short-circuited checks for the dynptr state, allowing users to pass
> PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
> dynptr. This bug would have also surfaced with other dynptr helpers in
> the future that changed dynptr view or modified it in some way.
>
> Include test cases for all cases, i.e. both bpf_dynptr_from_mem and
> bpf_ringbuf_reserve_dynptr, and ensure verifier rejects both of them.
> Without the fix, both of these programs load and pass verification.
>
> Acked-by: David Vernet <void@manifault.com>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

Acked-by: Joanne Koong <joannelkoong@gmail.com>

Left a small comment below.

> ---
>  .../selftests/bpf/prog_tests/user_ringbuf.c   |  2 ++
>  .../selftests/bpf/progs/user_ringbuf_fail.c   | 35 +++++++++++++++++++
>  2 files changed, 37 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> index 39882580cb90..500a63bb70a8 100644
> --- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> @@ -676,6 +676,8 @@ static struct {
>         {"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
>         {"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
>         {"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
> +       {"user_ringbuf_callback_reinit_dynptr_mem", "Dynptr has to be an uninitialized dynptr"},
> +       {"user_ringbuf_callback_reinit_dynptr_ringbuf", "Dynptr has to be an uninitialized dynptr"},
>  };
>
>  #define SUCCESS_TEST(_func) { _func, #_func }
> diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> index 82aba4529aa9..7730d13c0cea 100644
> --- a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> +++ b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> @@ -18,6 +18,13 @@ struct {
>         __uint(type, BPF_MAP_TYPE_USER_RINGBUF);
>  } user_ringbuf SEC(".maps");
>
> +struct {
> +       __uint(type, BPF_MAP_TYPE_RINGBUF);
> +       __uint(max_entries, 2);
> +} ringbuf SEC(".maps");
> +
> +static int map_value;
> +
>  static long
>  bad_access1(struct bpf_dynptr *dynptr, void *context)
>  {
> @@ -175,3 +182,31 @@ int user_ringbuf_callback_invalid_return(void *ctx)
>
>         return 0;
>  }
> +
> +static long
> +try_reinit_dynptr_mem(struct bpf_dynptr *dynptr, void *context)
> +{
> +       bpf_dynptr_from_mem(&map_value, 4, 0, dynptr);
> +       return 0;
> +}
> +
> +static long
> +try_reinit_dynptr_ringbuf(struct bpf_dynptr *dynptr, void *context)
> +{
> +       bpf_ringbuf_reserve_dynptr(&ringbuf, 8, 0, dynptr);
> +       return 0;
> +}
> +
> +SEC("?raw_tp/sys_nanosleep")
> +int user_ringbuf_callback_reinit_dynptr_mem(void *ctx)
> +{
> +       bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_mem, NULL, 0);
> +       return 0;
> +}
> +
> +SEC("?raw_tp/sys_nanosleep")

nit: here and above, I think this should just be "?raw_tp/" without
the nanosleep, since there is no nanosleep tracepoint.

> +int user_ringbuf_callback_reinit_dynptr_ringbuf(void *ctx)
> +{
> +       bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_ringbuf, NULL, 0);
> +       return 0;
> +}
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback
  2022-11-15 18:36   ` Joanne Koong
@ 2022-11-15 19:41     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-15 19:41 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, David Vernet, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau

On Wed, Nov 16, 2022 at 12:06:36AM IST, Joanne Koong wrote:
> On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > The original support for bpf_user_ringbuf_drain callbacks simply
> > short-circuited checks for the dynptr state, allowing users to pass
> > PTR_TO_DYNPTR (now CONST_PTR_TO_DYNPTR) to helpers that initialize a
> > dynptr. This bug would have also surfaced with other dynptr helpers in
> > the future that changed dynptr view or modified it in some way.
> >
> > Include test cases for all cases, i.e. both bpf_dynptr_from_mem and
> > bpf_ringbuf_reserve_dynptr, and ensure verifier rejects both of them.
> > Without the fix, both of these programs load and pass verification.
> >
> > Acked-by: David Vernet <void@manifault.com>
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
>
> Acked-by: Joanne Koong <joannelkoong@gmail.com>
>
> Left a small comment below.
>
> > ---
> >  .../selftests/bpf/prog_tests/user_ringbuf.c   |  2 ++
> >  .../selftests/bpf/progs/user_ringbuf_fail.c   | 35 +++++++++++++++++++
> >  2 files changed, 37 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> > index 39882580cb90..500a63bb70a8 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> > @@ -676,6 +676,8 @@ static struct {
> >         {"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
> >         {"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
> >         {"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
> > +       {"user_ringbuf_callback_reinit_dynptr_mem", "Dynptr has to be an uninitialized dynptr"},
> > +       {"user_ringbuf_callback_reinit_dynptr_ringbuf", "Dynptr has to be an uninitialized dynptr"},
> >  };
> >
> >  #define SUCCESS_TEST(_func) { _func, #_func }
> > diff --git a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> > index 82aba4529aa9..7730d13c0cea 100644
> > --- a/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> > +++ b/tools/testing/selftests/bpf/progs/user_ringbuf_fail.c
> > @@ -18,6 +18,13 @@ struct {
> >         __uint(type, BPF_MAP_TYPE_USER_RINGBUF);
> >  } user_ringbuf SEC(".maps");
> >
> > +struct {
> > +       __uint(type, BPF_MAP_TYPE_RINGBUF);
> > +       __uint(max_entries, 2);
> > +} ringbuf SEC(".maps");
> > +
> > +static int map_value;
> > +
> >  static long
> >  bad_access1(struct bpf_dynptr *dynptr, void *context)
> >  {
> > @@ -175,3 +182,31 @@ int user_ringbuf_callback_invalid_return(void *ctx)
> >
> >         return 0;
> >  }
> > +
> > +static long
> > +try_reinit_dynptr_mem(struct bpf_dynptr *dynptr, void *context)
> > +{
> > +       bpf_dynptr_from_mem(&map_value, 4, 0, dynptr);
> > +       return 0;
> > +}
> > +
> > +static long
> > +try_reinit_dynptr_ringbuf(struct bpf_dynptr *dynptr, void *context)
> > +{
> > +       bpf_ringbuf_reserve_dynptr(&ringbuf, 8, 0, dynptr);
> > +       return 0;
> > +}
> > +
> > +SEC("?raw_tp/sys_nanosleep")
> > +int user_ringbuf_callback_reinit_dynptr_mem(void *ctx)
> > +{
> > +       bpf_user_ringbuf_drain(&user_ringbuf, try_reinit_dynptr_mem, NULL, 0);
> > +       return 0;
> > +}
> > +
> > +SEC("?raw_tp/sys_nanosleep")
>
> nit: here and above, I think this should just be "?raw_tp/" without
> the nanosleep, since there is no nanosleep tracepoint.
>

True, looks like it's the same for all prior cases in this file as well. I will
drop sys_nanosleep for all of them.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func
  2022-11-15  0:01 ` [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
@ 2022-11-16 18:04   ` Joanne Koong
  2022-11-17 21:11   ` David Vernet
  1 sibling, 0 replies; 28+ messages in thread
From: Joanne Koong @ 2022-11-16 18:04 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, David Vernet

On Mon, Nov 14, 2022 at 4:01 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
> for use in callback state, because in case of user ringbuf helpers,
> there is no dynptr on the stack that is passed into the callback. To
> reflect such a state, a special register type was created.
>
> However, some checks have been bypassed incorrectly during the addition
> of this feature. First, for arg_type with MEM_UNINIT flag which
> initialize a dynptr, they must be rejected for such register type.
> Secondly, in the future, there are plans to add dynptr helpers that
> operate on the dynptr itself and may change its offset and other
> properties.
>
> In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> to such helpers, however the current code simply returns 0.
>
> The rejection for helpers that release the dynptr is already handled.
>
> For fixing this, we take a step back and rework existing code in a way
> that will allow fitting in all classes of helpers and have a coherent
> model for dealing with the variety of use cases in which dynptr is used.
>
> First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
>
> Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> that the helper only operates on the memory pointed to by the dynptr,
> not the dynptr itself. In C parlance, it would be equivalent to taking
> the dynptr as a point to const argument.
>
> When either of these flags are not present, the helper is allowed to
> mutate both the dynptr itself and also the memory it points to.
> Currently, the read only status of the memory is not tracked in the
> dynptr, but it would be trivial to add this support inside dynptr state
> of the register.
>
> With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
> better reflect its usage, it can no longer be passed to helpers that
> initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
>
> A note to reviewers is that in code that does mark_stack_slots_dynptr,
> and unmark_stack_slots_dynptr, we implicitly rely on the fact that
> PTR_TO_STACK reg is the only case that can reach that code path, as one
> cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
> both cases such helpers won't be setting that flag.
>
> The next patch will add a couple of selftest cases to make sure this
> doesn't break.
>
> Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>

overall lgtm

Acked-by: Joanne Koong <joannelkoong@gmail.com>

> ---
>  include/linux/bpf.h                           |   4 +-
>  include/uapi/linux/bpf.h                      |   8 +-
>  kernel/bpf/btf.c                              |   7 +-
>  kernel/bpf/helpers.c                          |  18 +-
>  kernel/bpf/verifier.c                         | 220 +++++++++++++-----
>  scripts/bpf_doc.py                            |   1 +
>  tools/include/uapi/linux/bpf.h                |   8 +-
>  .../selftests/bpf/prog_tests/user_ringbuf.c   |  10 +-
>  8 files changed, 195 insertions(+), 81 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 798aec816970..dfe45f6caa4f 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -709,7 +709,7 @@ enum bpf_reg_type {
>         PTR_TO_MEM,              /* reg points to valid memory region */
>         PTR_TO_BUF,              /* reg points to a read/write buffer */
>         PTR_TO_FUNC,             /* reg points to a bpf program function */
> -       PTR_TO_DYNPTR,           /* reg points to a dynptr */
> +       CONST_PTR_TO_DYNPTR,     /* reg points to a const struct bpf_dynptr */
>         __BPF_REG_TYPE_MAX,
>
>         /* Extended reg_types. */
> @@ -2761,7 +2761,7 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
>                      enum bpf_dynptr_type type, u32 offset, u32 size);
>  void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
>  int bpf_dynptr_check_size(u32 size);
> -u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
> +u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
>
>  #ifdef CONFIG_BPF_LSM
>  void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index fb4c911d2a03..b7543efd6284 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5290,7 +5290,7 @@ union bpf_attr {
>   *     Return
>   *             Nothing. Always succeeds.
>   *
> - * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
> + * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
>   *     Description
>   *             Read *len* bytes from *src* into *dst*, starting from *offset*
>   *             into *src*.
> @@ -5300,7 +5300,7 @@ union bpf_attr {
>   *             of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
>   *             *flags* is not 0.
>   *
> - * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
> + * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
>   *     Description
>   *             Write *len* bytes from *src* into *dst*, starting from *offset*
>   *             into *dst*.
> @@ -5310,7 +5310,7 @@ union bpf_attr {
>   *             of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
>   *             is a read-only dynptr or if *flags* is not 0.
>   *
> - * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> + * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
>   *     Description
>   *             Get a pointer to the underlying dynptr data.
>   *
> @@ -5411,7 +5411,7 @@ union bpf_attr {
>   *             Drain samples from the specified user ring buffer, and invoke
>   *             the provided callback for each such sample:
>   *
> - *             long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
> + *             long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
>   *
>   *             If **callback_fn** returns 0, the helper will continue to try
>   *             and drain the next sample, up to a maximum of
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index d02ae2f4249b..ba3b50895f6b 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6568,14 +6568,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
>                                 }
>
>                                 if (arg_dynptr) {
> -                                       if (reg->type != PTR_TO_STACK) {
> -                                               bpf_log(log, "arg#%d pointer type %s %s not to stack\n",
> +                                       if (reg->type != PTR_TO_STACK &&
> +                                           reg->type != CONST_PTR_TO_DYNPTR) {
> +                                               bpf_log(log, "arg#%d pointer type %s %s not to stack or dynptr\n",
>                                                         i, btf_type_str(ref_t),
>                                                         ref_tname);
>                                                 return -EINVAL;
>                                         }
>
> -                                       if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, NULL))
> +                                       if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR | MEM_RDONLY, NULL))
>                                                 return -EINVAL;
>                                         continue;
>                                 }
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 283f55bbeb70..714f5c9d0c1f 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1398,7 +1398,7 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
>  #define DYNPTR_SIZE_MASK       0xFFFFFF
>  #define DYNPTR_RDONLY_BIT      BIT(31)
>
> -static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
> +static bool bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
>  {
>         return ptr->size & DYNPTR_RDONLY_BIT;
>  }
> @@ -1408,7 +1408,7 @@ static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_typ
>         ptr->size |= type << DYNPTR_TYPE_SHIFT;
>  }
>
> -u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
> +u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
>  {
>         return ptr->size & DYNPTR_SIZE_MASK;
>  }
> @@ -1432,7 +1432,7 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
>         memset(ptr, 0, sizeof(*ptr));
>  }
>
> -static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
> +static int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
>  {
>         u32 size = bpf_dynptr_get_size(ptr);
>
> @@ -1477,7 +1477,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
>         .arg4_type      = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,
>  };
>
> -BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src,
> +BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern *, src,
>            u32, offset, u64, flags)
>  {
>         int err;
> @@ -1500,12 +1500,12 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = {
>         .ret_type       = RET_INTEGER,
>         .arg1_type      = ARG_PTR_TO_UNINIT_MEM,
>         .arg2_type      = ARG_CONST_SIZE_OR_ZERO,
> -       .arg3_type      = ARG_PTR_TO_DYNPTR,
> +       .arg3_type      = ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>         .arg4_type      = ARG_ANYTHING,
>         .arg5_type      = ARG_ANYTHING,
>  };
>
> -BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
> +BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
>            u32, len, u64, flags)
>  {
>         int err;
> @@ -1526,14 +1526,14 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {
>         .func           = bpf_dynptr_write,
>         .gpl_only       = false,
>         .ret_type       = RET_INTEGER,
> -       .arg1_type      = ARG_PTR_TO_DYNPTR,
> +       .arg1_type      = ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>         .arg2_type      = ARG_ANYTHING,
>         .arg3_type      = ARG_PTR_TO_MEM | MEM_RDONLY,
>         .arg4_type      = ARG_CONST_SIZE_OR_ZERO,
>         .arg5_type      = ARG_ANYTHING,
>  };
>
> -BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
> +BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
>  {
>         int err;
>
> @@ -1554,7 +1554,7 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
>         .func           = bpf_dynptr_data,
>         .gpl_only       = false,
>         .ret_type       = RET_PTR_TO_DYNPTR_MEM_OR_NULL,
> -       .arg1_type      = ARG_PTR_TO_DYNPTR,
> +       .arg1_type      = ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>         .arg2_type      = ARG_ANYTHING,
>         .arg3_type      = ARG_CONST_ALLOC_SIZE_OR_ZERO,
>  };
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 41ef7e4b73e4..c484e632b0cd 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -565,7 +565,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
>                 [PTR_TO_BUF]            = "buf",
>                 [PTR_TO_FUNC]           = "func",
>                 [PTR_TO_MAP_KEY]        = "map_key",
> -               [PTR_TO_DYNPTR]         = "dynptr_ptr",
> +               [CONST_PTR_TO_DYNPTR]   = "dynptr",

nit: personally I think "= dynptr_ptr" is more useful here

>         };
>
>         if (type & PTR_MAYBE_NULL) {
> @@ -699,6 +699,27 @@ static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
>         return type == BPF_DYNPTR_TYPE_RINGBUF;
>  }
>
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +                              struct bpf_reg_state *reg2,
> +                              enum bpf_dynptr_type type);
> +
> +static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> +                               struct bpf_reg_state *reg);
> +
> +static void mark_dynptr_stack_regs(struct bpf_reg_state *sreg1,
> +                                  struct bpf_reg_state *sreg2,
> +                                  enum bpf_dynptr_type type)
> +{
> +       __mark_dynptr_regs(sreg1, sreg2, type);
> +}
> +
> +static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> +                              enum bpf_dynptr_type type)
> +{
> +       __mark_dynptr_regs(reg1, NULL, type);
> +}
> +
> +
>  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>                                    enum bpf_arg_type arg_type, int insn_idx)
>  {
> @@ -720,9 +741,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>         if (type == BPF_DYNPTR_TYPE_INVALID)
>                 return -EINVAL;
>
> -       state->stack[spi].spilled_ptr.dynptr.first_slot = true;
> -       state->stack[spi].spilled_ptr.dynptr.type = type;
> -       state->stack[spi - 1].spilled_ptr.dynptr.type = type;
> +       mark_dynptr_stack_regs(&state->stack[spi].spilled_ptr,
> +                              &state->stack[spi - 1].spilled_ptr, type);
>
>         if (dynptr_type_refcounted(type)) {
>                 /* The id is used to track proper releasing */
> @@ -730,8 +750,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>                 if (id < 0)
>                         return id;
>
> -               state->stack[spi].spilled_ptr.id = id;
> -               state->stack[spi - 1].spilled_ptr.id = id;
> +               state->stack[spi].spilled_ptr.ref_obj_id = id;
> +               state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
>         }
>
>         return 0;
> @@ -753,25 +773,23 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
>         }
>
>         /* Invalidate any slices associated with this dynptr */
> -       if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> -               release_reference(env, state->stack[spi].spilled_ptr.id);
> -               state->stack[spi].spilled_ptr.id = 0;
> -               state->stack[spi - 1].spilled_ptr.id = 0;
> -       }
> -
> -       state->stack[spi].spilled_ptr.dynptr.first_slot = false;
> -       state->stack[spi].spilled_ptr.dynptr.type = 0;
> -       state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
> +       if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
> +               WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
>
> +       __mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> +       __mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
>         return 0;
>  }
>
>  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
> -       int spi = get_spi(reg->off);
> -       int i;
> +       int spi, i;
> +
> +       if (reg->type == CONST_PTR_TO_DYNPTR)
> +               return false;
>
> +       spi = get_spi(reg->off);
>         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
>                 return true;
>
> @@ -787,9 +805,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>  static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
> -       int spi = get_spi(reg->off);
> +       int spi;
>         int i;
>
> +       /* This already represents first slot of initialized bpf_dynptr */
> +       if (reg->type == CONST_PTR_TO_DYNPTR)
> +               return true;
> +
> +       spi = get_spi(reg->off);
>         if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
>             !state->stack[spi].spilled_ptr.dynptr.first_slot)
>                 return false;
> @@ -808,15 +831,19 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
>  {
>         struct bpf_func_state *state = func(env, reg);
>         enum bpf_dynptr_type dynptr_type;
> -       int spi = get_spi(reg->off);
> +       int spi;
>
>         /* ARG_PTR_TO_DYNPTR takes any type of dynptr */
>         if (arg_type == ARG_PTR_TO_DYNPTR)
>                 return true;
>
>         dynptr_type = arg_to_dynptr_type(arg_type);
> -
> -       return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +       if (reg->type == CONST_PTR_TO_DYNPTR) {
> +               return reg->dynptr.type == dynptr_type;
> +       } else {
> +               spi = get_spi(reg->off);
> +               return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +       }
>  }
>
>  /* The reg state of a pointer or a bounded scalar was saved when
> @@ -1324,9 +1351,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
>         BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
>  };
>
> -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> -                               struct bpf_reg_state *reg);
> -
>  /* This helper doesn't clear reg->id */
>  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
>  {
> @@ -1389,6 +1413,26 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
>         __mark_reg_known_zero(regs + regno);
>  }
>
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +                              struct bpf_reg_state *reg2,
> +                              enum bpf_dynptr_type type)

nit: Given that the logic is the same, I think it'd be cleaner if
__mark_dynptr_regs takes 1 bpf_reg_state arg, since in some cases (eg
mark_dynptr_cb_reg) NULL is passed in for reg2 anyways.

> +{
> +       /* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> +        * callback arguments, it does need to be CONST_PTR_TO_DYNPTR, so simply
> +        * set it unconditionally as it is ignored for STACK_DYNPTR anyway.
> +        */
> +       __mark_reg_known_zero(reg1);
> +       reg1->type = CONST_PTR_TO_DYNPTR;
> +       reg1->dynptr.type = type;
> +       reg1->dynptr.first_slot = true;
> +       if (!reg2)
> +               return;
> +       __mark_reg_known_zero(reg2);
> +       reg2->type = CONST_PTR_TO_DYNPTR;
> +       reg2->dynptr.type = type;
> +       reg2->dynptr.first_slot = false;
> +}
> +
>  static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
>  {
>         if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
> @@ -5692,20 +5736,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>         return 0;
>  }
>
> +/* Implementation details:
> + *
> + * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
> + * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
> + *
> + * In both cases we deal with the first 8 bytes, but need to mark the next 8
> + * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
> + * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
> + *
> + * Mutability of bpf_dynptr is at two levels, one is at the level of struct
> + * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
> + * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
> + * mutate the view of the dynptr and also possibly destroy it. In the latter
> + * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
> + * memory that dynptr points to.
> + *
> + * The verifier will keep track both levels of mutation (bpf_dynptr's in
> + * reg->type and the memory's in reg->dynptr.type), but there is no support for
> + * readonly dynptr view yet, hence only the first case is tracked and checked.
> + *
> + * This is consistent with how C applies the const modifier to a struct object,
> + * where the pointer itself inside bpf_dynptr becomes const but not what it
> + * points to.
> + *
> + * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
> + * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
> + */
>  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> -                       enum bpf_arg_type arg_type,
> -                       struct bpf_call_arg_meta *meta)
> +                       enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta)
>  {
>         struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
>         int argno = regno - 1;
>
> -       /* We only need to check for initialized / uninitialized helper
> -        * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> -        * assumption is that if it is, that a helper function
> -        * initialized the dynptr on behalf of the BPF program.
> +       if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> +               verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> +               return -EFAULT;
> +       }
> +
I think it' dbe more helpful if the comment block below was placed
before this if statement above, which explains that MEM_UNINIT and
MEM_RDONLY are mutually exclusive.

> +       /* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
> +        * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> +        *
> +        *  MEM_UNINIT - Points to memory that is an appropriate candidate for
> +        *               constructing a mutable bpf_dynptr object.
> +        *
> +        *               Currently, this is only possible with PTR_TO_STACK
> +        *               pointing to a region of atleast 16 bytes which doesn't
> +        *               contain an existing bpf_dynptr.
> +        *
> +        *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
> +        *               mutated or destroyed. However, the memory it points to
> +        *               may be mutated.
> +        *
> +        *  None       - Points to a initialized dynptr that can be mutated and
> +        *               destroyed, including mutation of the memory it points
> +        *               to.
>          */
> -       if (base_type(reg->type) == PTR_TO_DYNPTR)
> -               return 0;
>         if (arg_type & MEM_UNINIT) {
>                 if (!is_dynptr_reg_valid_uninit(env, reg)) {
>                         verbose(env, "Dynptr has to be an uninitialized dynptr\n");
> @@ -5722,6 +5808,12 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>
>                 meta->uninit_dynptr_regno = regno;
>         } else {
> +               /* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> +               if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> +                       verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> +                       return -EINVAL;
> +               }
> +
>                 if (!is_dynptr_reg_valid_init(env, reg)) {
>                         verbose(env,
>                                 "Expected an initialized dynptr as arg #%d\n",
> @@ -5729,6 +5821,8 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>                         return -EINVAL;
>                 }
>
> +               /* Fold MEM_RDONLY, only inspect arg_type */
> +               arg_type &= ~MEM_RDONLY;

nit: it seems cleaner to me to call !is_dynptr_type_expected(env, reg,
arg_type & ~MEM_RDONLY) rather than mutate arg_type

>                 if (!is_dynptr_type_expected(env, reg, arg_type)) {
>                         const char *err_extra = "";
>
> @@ -5874,7 +5968,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
>  static const struct bpf_reg_types dynptr_types = {
>         .types = {
>                 PTR_TO_STACK,
> -               PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
> +               CONST_PTR_TO_DYNPTR,
>         }
>  };
>
> @@ -6050,12 +6144,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>         return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>
> -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>         struct bpf_func_state *state = func(env, reg);
> -       int spi = get_spi(reg->off);
> +       int spi;
>
> -       return state->stack[spi].spilled_ptr.id;
> +       if (reg->type == CONST_PTR_TO_DYNPTR)
> +               return reg->ref_obj_id;

nit: extra line here would improve readibility

> +       spi = get_spi(reg->off);
> +       return state->stack[spi].spilled_ptr.ref_obj_id;
>  }
>
>  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -6119,11 +6216,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>         if (arg_type_is_release(arg_type)) {
>                 if (arg_type_is_dynptr(arg_type)) {
>                         struct bpf_func_state *state = func(env, reg);
> -                       int spi = get_spi(reg->off);
> +                       int spi;
>
> -                       if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> -                           !state->stack[spi].spilled_ptr.id) {
> -                               verbose(env, "arg %d is an unacquired reference\n", regno);
> +                       if (reg->type == PTR_TO_STACK) {
> +                               spi = get_spi(reg->off);
> +                               if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> +                                   !state->stack[spi].spilled_ptr.ref_obj_id) {
> +                                       verbose(env, "arg %d is an unacquired reference\n", regno);
> +                                       return -EINVAL;
> +                               }
> +                       } else {
> +                               verbose(env, "cannot release unowned const bpf_dynptr\n");
>                                 return -EINVAL;
>                         }
>                 } else if (!reg->ref_obj_id && !register_is_null(reg)) {
> @@ -7091,11 +7194,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
>  {
>         /* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
>          *                        callback_ctx, u64 flags);
> -        * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
> +        * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
>          */
>         __mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
> -       callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
> -       __mark_reg_known_zero(&callee->regs[BPF_REG_1]);
> +       mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
>         callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
>
>         /* unused */
> @@ -7474,7 +7576,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>
>         regs = cur_regs(env);
>
> +       /* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +        * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
> +        * is safe to do directly.
> +        */
>         if (meta.uninit_dynptr_regno) {
> +               if (WARN_ON_ONCE(regs[meta.uninit_dynptr_regno].type == CONST_PTR_TO_DYNPTR))
> +                       return -EFAULT;
>                 /* we write BPF_DW bits (8 bytes) at a time */
>                 for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
>                         err = check_mem_access(env, insn_idx, meta.uninit_dynptr_regno,
> @@ -7492,15 +7600,22 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>
>         if (meta.release_regno) {
>                 err = -EINVAL;
> -               if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
> +               /* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +                * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
> +                * is safe to do directly.
> +                */
> +               if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1])) {
> +                       if (WARN_ON_ONCE(regs[meta.release_regno].type == CONST_PTR_TO_DYNPTR))
> +                               return -EFAULT;
>                         err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
> -               else if (meta.ref_obj_id)
> +               } else if (meta.ref_obj_id) {
>                         err = release_reference(env, meta.ref_obj_id);
> -               /* meta.ref_obj_id can only be 0 if register that is meant to be
> -                * released is NULL, which must be > R0.
> -                */
> -               else if (register_is_null(&regs[meta.release_regno]))
> +               } else if (register_is_null(&regs[meta.release_regno])) {
> +                       /* meta.ref_obj_id can only be 0 if register that is meant to be
> +                        * released is NULL, which must be > R0.
> +                        */
>                         err = 0;
> +               }
>                 if (err) {
>                         verbose(env, "func %s#%d reference has not been acquired before\n",
>                                 func_id_name(func_id), func_id);
> @@ -7574,11 +7689,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>                                         return -EFAULT;
>                                 }
>
> -                               if (base_type(reg->type) != PTR_TO_DYNPTR)
> -                                       /* Find the id of the dynptr we're
> -                                        * tracking the reference of
> -                                        */
> -                                       meta.ref_obj_id = stack_slot_get_id(env, reg);
> +                               /* Find the id of the dynptr we're tracking the reference of */
> +                               meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
>                                 break;
>                         }
>                 }
> diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
> index fdb0aff8cb5a..e8d90829f23e 100755
> --- a/scripts/bpf_doc.py
> +++ b/scripts/bpf_doc.py
> @@ -752,6 +752,7 @@ class PrinterHelpers(Printer):
>              'struct bpf_timer',
>              'struct mptcp_sock',
>              'struct bpf_dynptr',
> +            'const struct bpf_dynptr',
>              'struct iphdr',
>              'struct ipv6hdr',
>      }
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index fb4c911d2a03..b7543efd6284 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5290,7 +5290,7 @@ union bpf_attr {
>   *     Return
>   *             Nothing. Always succeeds.
>   *
> - * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
> + * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
>   *     Description
>   *             Read *len* bytes from *src* into *dst*, starting from *offset*
>   *             into *src*.
> @@ -5300,7 +5300,7 @@ union bpf_attr {
>   *             of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
>   *             *flags* is not 0.
>   *
> - * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
> + * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
>   *     Description
>   *             Write *len* bytes from *src* into *dst*, starting from *offset*
>   *             into *dst*.
> @@ -5310,7 +5310,7 @@ union bpf_attr {
>   *             of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
>   *             is a read-only dynptr or if *flags* is not 0.
>   *
> - * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> + * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
>   *     Description
>   *             Get a pointer to the underlying dynptr data.
>   *
> @@ -5411,7 +5411,7 @@ union bpf_attr {
>   *             Drain samples from the specified user ring buffer, and invoke
>   *             the provided callback for each such sample:
>   *
> - *             long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
> + *             long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
>   *
>   *             If **callback_fn** returns 0, the helper will continue to try
>   *             and drain the next sample, up to a maximum of
> diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> index 02b18d018b36..39882580cb90 100644
> --- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> @@ -668,13 +668,13 @@ static struct {
>         const char *expected_err_msg;
>  } failure_tests[] = {
>         /* failure cases */
> -       {"user_ringbuf_callback_bad_access1", "negative offset dynptr_ptr ptr"},
> -       {"user_ringbuf_callback_bad_access2", "dereference of modified dynptr_ptr ptr"},
> -       {"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr_ptr'"},
> +       {"user_ringbuf_callback_bad_access1", "negative offset dynptr ptr"},
> +       {"user_ringbuf_callback_bad_access2", "dereference of modified dynptr ptr"},
> +       {"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr'"},
>         {"user_ringbuf_callback_null_context_write", "invalid mem access 'scalar'"},
>         {"user_ringbuf_callback_null_context_read", "invalid mem access 'scalar'"},
> -       {"user_ringbuf_callback_discard_dynptr", "arg 1 is an unacquired reference"},
> -       {"user_ringbuf_callback_submit_dynptr", "arg 1 is an unacquired reference"},
> +       {"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
> +       {"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
>         {"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
>  };
>
> --
> 2.38.1
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func
  2022-11-15  0:01 ` [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
  2022-11-16 18:04   ` Joanne Koong
@ 2022-11-17 21:11   ` David Vernet
  2022-11-20 18:06     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 28+ messages in thread
From: David Vernet @ 2022-11-17 21:11 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Nov 15, 2022 at 05:31:26AM +0530, Kumar Kartikeya Dwivedi wrote:
> Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
> for use in callback state, because in case of user ringbuf helpers,
> there is no dynptr on the stack that is passed into the callback. To
> reflect such a state, a special register type was created.
> 
> However, some checks have been bypassed incorrectly during the addition
> of this feature. First, for arg_type with MEM_UNINIT flag which
> initialize a dynptr, they must be rejected for such register type.
> Secondly, in the future, there are plans to add dynptr helpers that
> operate on the dynptr itself and may change its offset and other
> properties.
> 
> In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> to such helpers, however the current code simply returns 0.
> 
> The rejection for helpers that release the dynptr is already handled.
> 
> For fixing this, we take a step back and rework existing code in a way
> that will allow fitting in all classes of helpers and have a coherent
> model for dealing with the variety of use cases in which dynptr is used.
> 
> First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
> 
> Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> that the helper only operates on the memory pointed to by the dynptr,
> not the dynptr itself. In C parlance, it would be equivalent to taking
> the dynptr as a point to const argument.
> 
> When either of these flags are not present, the helper is allowed to
> mutate both the dynptr itself and also the memory it points to.
> Currently, the read only status of the memory is not tracked in the
> dynptr, but it would be trivial to add this support inside dynptr state
> of the register.
> 
> With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
> better reflect its usage, it can no longer be passed to helpers that
> initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
> 
> A note to reviewers is that in code that does mark_stack_slots_dynptr,
> and unmark_stack_slots_dynptr, we implicitly rely on the fact that
> PTR_TO_STACK reg is the only case that can reach that code path, as one
> cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
> both cases such helpers won't be setting that flag.
> 
> The next patch will add a couple of selftest cases to make sure this
> doesn't break.
> 
> Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---

Overall this LGTM. Left some comments / nits before stamping.

>  include/linux/bpf.h                           |   4 +-
>  include/uapi/linux/bpf.h                      |   8 +-
>  kernel/bpf/btf.c                              |   7 +-
>  kernel/bpf/helpers.c                          |  18 +-
>  kernel/bpf/verifier.c                         | 220 +++++++++++++-----
>  scripts/bpf_doc.py                            |   1 +
>  tools/include/uapi/linux/bpf.h                |   8 +-
>  .../selftests/bpf/prog_tests/user_ringbuf.c   |  10 +-
>  8 files changed, 195 insertions(+), 81 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 798aec816970..dfe45f6caa4f 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -709,7 +709,7 @@ enum bpf_reg_type {
>  	PTR_TO_MEM,		 /* reg points to valid memory region */
>  	PTR_TO_BUF,		 /* reg points to a read/write buffer */
>  	PTR_TO_FUNC,		 /* reg points to a bpf program function */
> -	PTR_TO_DYNPTR,		 /* reg points to a dynptr */
> +	CONST_PTR_TO_DYNPTR,	 /* reg points to a const struct bpf_dynptr */
>  	__BPF_REG_TYPE_MAX,
>  
>  	/* Extended reg_types. */
> @@ -2761,7 +2761,7 @@ void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
>  		     enum bpf_dynptr_type type, u32 offset, u32 size);
>  void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
>  int bpf_dynptr_check_size(u32 size);
> -u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr);
> +u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr);
>  
>  #ifdef CONFIG_BPF_LSM
>  void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index fb4c911d2a03..b7543efd6284 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5290,7 +5290,7 @@ union bpf_attr {
>   *	Return
>   *		Nothing. Always succeeds.
>   *
> - * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
> + * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
>   *	Description
>   *		Read *len* bytes from *src* into *dst*, starting from *offset*
>   *		into *src*.
> @@ -5300,7 +5300,7 @@ union bpf_attr {
>   *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
>   *		*flags* is not 0.
>   *
> - * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
> + * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
>   *	Description
>   *		Write *len* bytes from *src* into *dst*, starting from *offset*
>   *		into *dst*.
> @@ -5310,7 +5310,7 @@ union bpf_attr {
>   *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
>   *		is a read-only dynptr or if *flags* is not 0.
>   *
> - * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> + * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
>   *	Description
>   *		Get a pointer to the underlying dynptr data.
>   *
> @@ -5411,7 +5411,7 @@ union bpf_attr {
>   *		Drain samples from the specified user ring buffer, and invoke
>   *		the provided callback for each such sample:
>   *
> - *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
> + *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
>   *
>   *		If **callback_fn** returns 0, the helper will continue to try
>   *		and drain the next sample, up to a maximum of
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index d02ae2f4249b..ba3b50895f6b 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6568,14 +6568,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
>  				}
>  
>  				if (arg_dynptr) {
> -					if (reg->type != PTR_TO_STACK) {
> -						bpf_log(log, "arg#%d pointer type %s %s not to stack\n",
> +					if (reg->type != PTR_TO_STACK &&
> +					    reg->type != CONST_PTR_TO_DYNPTR) {
> +						bpf_log(log, "arg#%d pointer type %s %s not to stack or dynptr\n",
>  							i, btf_type_str(ref_t),
>  							ref_tname);
>  						return -EINVAL;
>  					}

Should we bring this check into process_dynptr_func()? It's kind of
unfortunate that we have divergent logic between helpers and kfuncs,
when it comes to checking types. Let's at least be uniform here?

> -					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR, NULL))
> +					if (process_dynptr_func(env, regno, ARG_PTR_TO_DYNPTR | MEM_RDONLY, NULL))
>  						return -EINVAL;
>  					continue;
>  				}
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 283f55bbeb70..714f5c9d0c1f 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1398,7 +1398,7 @@ static const struct bpf_func_proto bpf_kptr_xchg_proto = {
>  #define DYNPTR_SIZE_MASK	0xFFFFFF
>  #define DYNPTR_RDONLY_BIT	BIT(31)
>  
> -static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
> +static bool bpf_dynptr_is_rdonly(const struct bpf_dynptr_kern *ptr)
>  {
>  	return ptr->size & DYNPTR_RDONLY_BIT;
>  }
> @@ -1408,7 +1408,7 @@ static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_typ
>  	ptr->size |= type << DYNPTR_TYPE_SHIFT;
>  }
>  
> -u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
> +u32 bpf_dynptr_get_size(const struct bpf_dynptr_kern *ptr)
>  {
>  	return ptr->size & DYNPTR_SIZE_MASK;
>  }
> @@ -1432,7 +1432,7 @@ void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr)
>  	memset(ptr, 0, sizeof(*ptr));
>  }
>  
> -static int bpf_dynptr_check_off_len(struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
> +static int bpf_dynptr_check_off_len(const struct bpf_dynptr_kern *ptr, u32 offset, u32 len)
>  {
>  	u32 size = bpf_dynptr_get_size(ptr);
>  
> @@ -1477,7 +1477,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
>  	.arg4_type	= ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL | MEM_UNINIT,
>  };
>  
> -BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src,
> +BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern *, src,
>  	   u32, offset, u64, flags)
>  {
>  	int err;
> @@ -1500,12 +1500,12 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = {
>  	.ret_type	= RET_INTEGER,
>  	.arg1_type	= ARG_PTR_TO_UNINIT_MEM,
>  	.arg2_type	= ARG_CONST_SIZE_OR_ZERO,
> -	.arg3_type	= ARG_PTR_TO_DYNPTR,
> +	.arg3_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>  	.arg4_type	= ARG_ANYTHING,
>  	.arg5_type	= ARG_ANYTHING,
>  };
>  
> -BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
> +BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
>  	   u32, len, u64, flags)
>  {
>  	int err;
> @@ -1526,14 +1526,14 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {
>  	.func		= bpf_dynptr_write,
>  	.gpl_only	= false,
>  	.ret_type	= RET_INTEGER,
> -	.arg1_type	= ARG_PTR_TO_DYNPTR,
> +	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>  	.arg2_type	= ARG_ANYTHING,
>  	.arg3_type	= ARG_PTR_TO_MEM | MEM_RDONLY,
>  	.arg4_type	= ARG_CONST_SIZE_OR_ZERO,
>  	.arg5_type	= ARG_ANYTHING,
>  };
>  
> -BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
> +BPF_CALL_3(bpf_dynptr_data, const struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
>  {
>  	int err;
>  
> @@ -1554,7 +1554,7 @@ static const struct bpf_func_proto bpf_dynptr_data_proto = {
>  	.func		= bpf_dynptr_data,
>  	.gpl_only	= false,
>  	.ret_type	= RET_PTR_TO_DYNPTR_MEM_OR_NULL,
> -	.arg1_type	= ARG_PTR_TO_DYNPTR,
> +	.arg1_type	= ARG_PTR_TO_DYNPTR | MEM_RDONLY,
>  	.arg2_type	= ARG_ANYTHING,
>  	.arg3_type	= ARG_CONST_ALLOC_SIZE_OR_ZERO,
>  };
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 41ef7e4b73e4..c484e632b0cd 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -565,7 +565,7 @@ static const char *reg_type_str(struct bpf_verifier_env *env,
>  		[PTR_TO_BUF]		= "buf",
>  		[PTR_TO_FUNC]		= "func",
>  		[PTR_TO_MAP_KEY]	= "map_key",
> -		[PTR_TO_DYNPTR]		= "dynptr_ptr",
> +		[CONST_PTR_TO_DYNPTR]	= "dynptr",
>  	};
>  
>  	if (type & PTR_MAYBE_NULL) {
> @@ -699,6 +699,27 @@ static bool dynptr_type_refcounted(enum bpf_dynptr_type type)
>  	return type == BPF_DYNPTR_TYPE_RINGBUF;
>  }
>  
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +			       struct bpf_reg_state *reg2,
> +			       enum bpf_dynptr_type type);
> +
> +static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> +				struct bpf_reg_state *reg);
> +
> +static void mark_dynptr_stack_regs(struct bpf_reg_state *sreg1,
> +				   struct bpf_reg_state *sreg2,
> +				   enum bpf_dynptr_type type)
> +{
> +	__mark_dynptr_regs(sreg1, sreg2, type);
> +}
> +
> +static void mark_dynptr_cb_reg(struct bpf_reg_state *reg1,
> +			       enum bpf_dynptr_type type)
> +{
> +	__mark_dynptr_regs(reg1, NULL, type);
> +}
> +
> +
>  static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
>  				   enum bpf_arg_type arg_type, int insn_idx)
>  {
> @@ -720,9 +741,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>  	if (type == BPF_DYNPTR_TYPE_INVALID)
>  		return -EINVAL;
>  
> -	state->stack[spi].spilled_ptr.dynptr.first_slot = true;
> -	state->stack[spi].spilled_ptr.dynptr.type = type;
> -	state->stack[spi - 1].spilled_ptr.dynptr.type = type;
> +	mark_dynptr_stack_regs(&state->stack[spi].spilled_ptr,
> +			       &state->stack[spi - 1].spilled_ptr, type);
>  
>  	if (dynptr_type_refcounted(type)) {
>  		/* The id is used to track proper releasing */
> @@ -730,8 +750,8 @@ static int mark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_reg_
>  		if (id < 0)
>  			return id;
>  
> -		state->stack[spi].spilled_ptr.id = id;
> -		state->stack[spi - 1].spilled_ptr.id = id;
> +		state->stack[spi].spilled_ptr.ref_obj_id = id;
> +		state->stack[spi - 1].spilled_ptr.ref_obj_id = id;
>  	}
>  
>  	return 0;
> @@ -753,25 +773,23 @@ static int unmark_stack_slots_dynptr(struct bpf_verifier_env *env, struct bpf_re
>  	}
>  
>  	/* Invalidate any slices associated with this dynptr */
> -	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type)) {
> -		release_reference(env, state->stack[spi].spilled_ptr.id);
> -		state->stack[spi].spilled_ptr.id = 0;
> -		state->stack[spi - 1].spilled_ptr.id = 0;
> -	}
> -
> -	state->stack[spi].spilled_ptr.dynptr.first_slot = false;
> -	state->stack[spi].spilled_ptr.dynptr.type = 0;
> -	state->stack[spi - 1].spilled_ptr.dynptr.type = 0;
> +	if (dynptr_type_refcounted(state->stack[spi].spilled_ptr.dynptr.type))
> +		WARN_ON_ONCE(release_reference(env, state->stack[spi].spilled_ptr.ref_obj_id));
>  
> +	__mark_reg_not_init(env, &state->stack[spi].spilled_ptr);
> +	__mark_reg_not_init(env, &state->stack[spi - 1].spilled_ptr);
>  	return 0;
>  }
>  
>  static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> -	int i;
> +	int spi, i;
> +
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return false;
>  
> +	spi = get_spi(reg->off);
>  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS))
>  		return true;
>  
> @@ -787,9 +805,14 @@ static bool is_dynptr_reg_valid_uninit(struct bpf_verifier_env *env, struct bpf_
>  static bool is_dynptr_reg_valid_init(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> +	int spi;
>  	int i;
>  
> +	/* This already represents first slot of initialized bpf_dynptr */
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return true;
> +
> +	spi = get_spi(reg->off);
>  	if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
>  	    !state->stack[spi].spilled_ptr.dynptr.first_slot)
>  		return false;
> @@ -808,15 +831,19 @@ static bool is_dynptr_type_expected(struct bpf_verifier_env *env, struct bpf_reg
>  {
>  	struct bpf_func_state *state = func(env, reg);
>  	enum bpf_dynptr_type dynptr_type;
> -	int spi = get_spi(reg->off);
> +	int spi;
>  
>  	/* ARG_PTR_TO_DYNPTR takes any type of dynptr */
>  	if (arg_type == ARG_PTR_TO_DYNPTR)
>  		return true;
>  
>  	dynptr_type = arg_to_dynptr_type(arg_type);
> -
> -	return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +	if (reg->type == CONST_PTR_TO_DYNPTR) {
> +		return reg->dynptr.type == dynptr_type;
> +	} else {
> +		spi = get_spi(reg->off);
> +		return state->stack[spi].spilled_ptr.dynptr.type == dynptr_type;
> +	}
>  }
>  
>  /* The reg state of a pointer or a bounded scalar was saved when
> @@ -1324,9 +1351,6 @@ static const int caller_saved[CALLER_SAVED_REGS] = {
>  	BPF_REG_0, BPF_REG_1, BPF_REG_2, BPF_REG_3, BPF_REG_4, BPF_REG_5
>  };
>  
> -static void __mark_reg_not_init(const struct bpf_verifier_env *env,
> -				struct bpf_reg_state *reg);
> -
>  /* This helper doesn't clear reg->id */
>  static void ___mark_reg_known(struct bpf_reg_state *reg, u64 imm)
>  {
> @@ -1389,6 +1413,26 @@ static void mark_reg_known_zero(struct bpf_verifier_env *env,
>  	__mark_reg_known_zero(regs + regno);
>  }
>  
> +static void __mark_dynptr_regs(struct bpf_reg_state *reg1,
> +			       struct bpf_reg_state *reg2,
> +			       enum bpf_dynptr_type type)
> +{
> +	/* reg->type has no meaning for STACK_DYNPTR, but when we set reg for
> +	 * callback arguments, it does need to be CONST_PTR_TO_DYNPTR, so simply
> +	 * set it unconditionally as it is ignored for STACK_DYNPTR anyway.
> +	 */
> +	__mark_reg_known_zero(reg1);
> +	reg1->type = CONST_PTR_TO_DYNPTR;
> +	reg1->dynptr.type = type;
> +	reg1->dynptr.first_slot = true;
> +	if (!reg2)
> +		return;
> +	__mark_reg_known_zero(reg2);
> +	reg2->type = CONST_PTR_TO_DYNPTR;
> +	reg2->dynptr.type = type;
> +	reg2->dynptr.first_slot = false;
> +}
> +
>  static void mark_ptr_not_null_reg(struct bpf_reg_state *reg)
>  {
>  	if (base_type(reg->type) == PTR_TO_MAP_VALUE) {
> @@ -5692,20 +5736,62 @@ static int process_kptr_func(struct bpf_verifier_env *env, int regno,
>  	return 0;
>  }
>  
> +/* Implementation details:

I don't think "Implementation details" is necessary to include in the
function header (and thank you for adding this header btw).

> + *
> + * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
> + * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
> + *
> + * In both cases we deal with the first 8 bytes, but need to mark the next 8
> + * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
> + * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
> + *
> + * Mutability of bpf_dynptr is at two levels, one is at the level of struct
> + * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
> + * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
> + * mutate the view of the dynptr and also possibly destroy it. In the latter
> + * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
> + * memory that dynptr points to.
> + *
> + * The verifier will keep track both levels of mutation (bpf_dynptr's in
> + * reg->type and the memory's in reg->dynptr.type), but there is no support for
> + * readonly dynptr view yet, hence only the first case is tracked and checked.
> + *
> + * This is consistent with how C applies the const modifier to a struct object,
> + * where the pointer itself inside bpf_dynptr becomes const but not what it
> + * points to.
> + *
> + * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
> + * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
> + */
>  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> -			enum bpf_arg_type arg_type,
> -			struct bpf_call_arg_meta *meta)
> +			enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta)
>  {
>  	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
>  	int argno = regno - 1;
>  
> -	/* We only need to check for initialized / uninitialized helper
> -	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> -	 * assumption is that if it is, that a helper function
> -	 * initialized the dynptr on behalf of the BPF program.
> +	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> +		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> +		return -EFAULT;
> +	}
> +
> +	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a

s/applied to a/applied to an

> +	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> +	 *
> +	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
> +	 *		 constructing a mutable bpf_dynptr object.
> +	 *
> +	 *		 Currently, this is only possible with PTR_TO_STACK
> +	 *		 pointing to a region of atleast 16 bytes which doesn't

s/atleast/at least

> +	 *		 contain an existing bpf_dynptr.
> +	 *
> +	 *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
> +	 *		 mutated or destroyed. However, the memory it points to
> +	 *		 may be mutated.
> +	 *
> +	 *  None       - Points to a initialized dynptr that can be mutated and
> +	 *		 destroyed, including mutation of the memory it points
> +	 *		 to.
>  	 */
> -	if (base_type(reg->type) == PTR_TO_DYNPTR)
> -		return 0;
>  	if (arg_type & MEM_UNINIT) {
>  		if (!is_dynptr_reg_valid_uninit(env, reg)) {
>  			verbose(env, "Dynptr has to be an uninitialized dynptr\n");

Not your change, but this is an awkwardly phrased error message. IMO
"dynptr must be initialized" is more succinct. Feel free to ignore if
you'd like, I'm happy to submit a separate patch to change it as some
point.

> @@ -5722,6 +5808,12 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  
>  		meta->uninit_dynptr_regno = regno;
>  	} else {
> +		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> +		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> +			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> +			return -EINVAL;
> +		}
> +
>  		if (!is_dynptr_reg_valid_init(env, reg)) {
>  			verbose(env,
>  				"Expected an initialized dynptr as arg #%d\n",
> @@ -5729,6 +5821,8 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  			return -EINVAL;
>  		}
>  
> +		/* Fold MEM_RDONLY, only inspect arg_type */
> +		arg_type &= ~MEM_RDONLY;

This seems brittle. Can is_dynptr_type_expected() just check the base
type?

>  		if (!is_dynptr_type_expected(env, reg, arg_type)) {
>  			const char *err_extra = "";
>  
> @@ -5874,7 +5968,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
>  static const struct bpf_reg_types dynptr_types = {
>  	.types = {
>  		PTR_TO_STACK,
> -		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
> +		CONST_PTR_TO_DYNPTR,
>  	}
>  };
>  
> @@ -6050,12 +6144,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>  
> -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
>  {
>  	struct bpf_func_state *state = func(env, reg);
> -	int spi = get_spi(reg->off);
> +	int spi;
>  
> -	return state->stack[spi].spilled_ptr.id;
> +	if (reg->type == CONST_PTR_TO_DYNPTR)
> +		return reg->ref_obj_id;
> +	spi = get_spi(reg->off);
> +	return state->stack[spi].spilled_ptr.ref_obj_id;
>  }
>  
>  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -6119,11 +6216,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>  	if (arg_type_is_release(arg_type)) {
>  		if (arg_type_is_dynptr(arg_type)) {
>  			struct bpf_func_state *state = func(env, reg);
> -			int spi = get_spi(reg->off);
> +			int spi;
>  
> -			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> -			    !state->stack[spi].spilled_ptr.id) {
> -				verbose(env, "arg %d is an unacquired reference\n", regno);

Can we add a comment here explaining why only PTR_TO_STACK dynptrs are
expected to be released? I know we have such comments elsewhere already,
but if we're going to have logic like this which is hard-coded against
assumptions of what types of dynptrs can be used in which contexts /
helpers, I think it is important to be verbose in calling that out as
it's not obvious from the code itself why this is the case.

> +			if (reg->type == PTR_TO_STACK) {
> +				spi = get_spi(reg->off);
> +				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> +				    !state->stack[spi].spilled_ptr.ref_obj_id) {
> +					verbose(env, "arg %d is an unacquired reference\n", regno);
> +					return -EINVAL;
> +				}
> +			} else {
> +				verbose(env, "cannot release unowned const bpf_dynptr\n");
>  				return -EINVAL;
>  			}
>  		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
> @@ -7091,11 +7194,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
>  {
>  	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
>  	 *			  callback_ctx, u64 flags);
> -	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
> +	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
>  	 */
>  	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
> -	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
> -	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
> +	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
>  	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
>  
>  	/* unused */
> @@ -7474,7 +7576,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  
>  	regs = cur_regs(env);
>  
> +	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
> +	 * is safe to do directly.
> +	 */
>  	if (meta.uninit_dynptr_regno) {
> +		if (WARN_ON_ONCE(regs[meta.uninit_dynptr_regno].type == CONST_PTR_TO_DYNPTR))

I think we tend to do "verifier internal error" logs for situations like
this rather than WARN_ON_ONCE(), right?

> +			return -EFAULT;
>  		/* we write BPF_DW bits (8 bytes) at a time */
>  		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
>  			err = check_mem_access(env, insn_idx, meta.uninit_dynptr_regno,
> @@ -7492,15 +7600,22 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  
>  	if (meta.release_regno) {
>  		err = -EINVAL;
> -		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
> +		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> +		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
> +		 * is safe to do directly.
> +		 */
> +		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1])) {
> +			if (WARN_ON_ONCE(regs[meta.release_regno].type == CONST_PTR_TO_DYNPTR))

Same here r.e. using verifier log rather than WARN_ON_ONCE(). Also, can
we bring this comment inside the if statement to match the alignemnt of
the one you added below?

> +				return -EFAULT;
>  			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
> -		else if (meta.ref_obj_id)
> +		} else if (meta.ref_obj_id) {
>  			err = release_reference(env, meta.ref_obj_id);
> -		/* meta.ref_obj_id can only be 0 if register that is meant to be
> -		 * released is NULL, which must be > R0.
> -		 */
> -		else if (register_is_null(&regs[meta.release_regno]))
> +		} else if (register_is_null(&regs[meta.release_regno])) {
> +			/* meta.ref_obj_id can only be 0 if register that is meant to be
> +			 * released is NULL, which must be > R0.
> +			 */
>  			err = 0;
> +		}
>  		if (err) {
>  			verbose(env, "func %s#%d reference has not been acquired before\n",
>  				func_id_name(func_id), func_id);
> @@ -7574,11 +7689,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>  					return -EFAULT;
>  				}
>  
> -				if (base_type(reg->type) != PTR_TO_DYNPTR)
> -					/* Find the id of the dynptr we're
> -					 * tracking the reference of
> -					 */
> -					meta.ref_obj_id = stack_slot_get_id(env, reg);
> +				/* Find the id of the dynptr we're tracking the reference of */

Not your change, but IMO this comment does not add any value.

> +				meta.ref_obj_id = dynptr_ref_obj_id(env, reg);
>  				break;
>  			}
>  		}
> diff --git a/scripts/bpf_doc.py b/scripts/bpf_doc.py
> index fdb0aff8cb5a..e8d90829f23e 100755
> --- a/scripts/bpf_doc.py
> +++ b/scripts/bpf_doc.py
> @@ -752,6 +752,7 @@ class PrinterHelpers(Printer):
>              'struct bpf_timer',
>              'struct mptcp_sock',
>              'struct bpf_dynptr',
> +            'const struct bpf_dynptr',
>              'struct iphdr',
>              'struct ipv6hdr',
>      }
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index fb4c911d2a03..b7543efd6284 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5290,7 +5290,7 @@ union bpf_attr {
>   *	Return
>   *		Nothing. Always succeeds.
>   *
> - * long bpf_dynptr_read(void *dst, u32 len, struct bpf_dynptr *src, u32 offset, u64 flags)
> + * long bpf_dynptr_read(void *dst, u32 len, const struct bpf_dynptr *src, u32 offset, u64 flags)
>   *	Description
>   *		Read *len* bytes from *src* into *dst*, starting from *offset*
>   *		into *src*.
> @@ -5300,7 +5300,7 @@ union bpf_attr {
>   *		of *src*'s data, -EINVAL if *src* is an invalid dynptr or if
>   *		*flags* is not 0.
>   *
> - * long bpf_dynptr_write(struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
> + * long bpf_dynptr_write(const struct bpf_dynptr *dst, u32 offset, void *src, u32 len, u64 flags)
>   *	Description
>   *		Write *len* bytes from *src* into *dst*, starting from *offset*
>   *		into *dst*.
> @@ -5310,7 +5310,7 @@ union bpf_attr {
>   *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
>   *		is a read-only dynptr or if *flags* is not 0.
>   *
> - * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> + * void *bpf_dynptr_data(const struct bpf_dynptr *ptr, u32 offset, u32 len)
>   *	Description
>   *		Get a pointer to the underlying dynptr data.
>   *
> @@ -5411,7 +5411,7 @@ union bpf_attr {
>   *		Drain samples from the specified user ring buffer, and invoke
>   *		the provided callback for each such sample:
>   *
> - *		long (\*callback_fn)(struct bpf_dynptr \*dynptr, void \*ctx);
> + *		long (\*callback_fn)(const struct bpf_dynptr \*dynptr, void \*ctx);
>   *
>   *		If **callback_fn** returns 0, the helper will continue to try
>   *		and drain the next sample, up to a maximum of
> diff --git a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> index 02b18d018b36..39882580cb90 100644
> --- a/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/user_ringbuf.c
> @@ -668,13 +668,13 @@ static struct {
>  	const char *expected_err_msg;
>  } failure_tests[] = {
>  	/* failure cases */
> -	{"user_ringbuf_callback_bad_access1", "negative offset dynptr_ptr ptr"},
> -	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr_ptr ptr"},
> -	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr_ptr'"},
> +	{"user_ringbuf_callback_bad_access1", "negative offset dynptr ptr"},
> +	{"user_ringbuf_callback_bad_access2", "dereference of modified dynptr ptr"},
> +	{"user_ringbuf_callback_write_forbidden", "invalid mem access 'dynptr'"},
>  	{"user_ringbuf_callback_null_context_write", "invalid mem access 'scalar'"},
>  	{"user_ringbuf_callback_null_context_read", "invalid mem access 'scalar'"},
> -	{"user_ringbuf_callback_discard_dynptr", "arg 1 is an unacquired reference"},
> -	{"user_ringbuf_callback_submit_dynptr", "arg 1 is an unacquired reference"},
> +	{"user_ringbuf_callback_discard_dynptr", "cannot release unowned const bpf_dynptr"},
> +	{"user_ringbuf_callback_submit_dynptr", "cannot release unowned const bpf_dynptr"},
>  	{"user_ringbuf_callback_invalid_return", "At callback return the register R0 has value"},
>  };
>  
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off
  2022-11-15  0:01 ` [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
  2022-11-15 18:24   ` Joanne Koong
@ 2022-11-17 23:42   ` David Vernet
  2022-11-20 18:41     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 28+ messages in thread
From: David Vernet @ 2022-11-17 23:42 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Nov 15, 2022 at 05:31:27AM +0530, Kumar Kartikeya Dwivedi wrote:
> While check_func_arg_reg_off is the place which performs generic checks
> needed by various candidates of reg->type, there is some handling for
> special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> ARG_PTR_TO_ALLOC_MEM.
> 
> This commit aims to streamline these special cases and instead leave
> other things up to argument type specific code to handle. The function
> will be restrictive by default, and cover all possible cases when
> OBJ_RELEASE is set, without having to update the function again (and
> missing to do that being a bug).
>
> This is done primarily for two reasons: associating back reg->type to
> its argument leaves room for the list getting out of sync when a new
> reg->type is supported by an arg_type.
>
> The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> we already handle, whenever a release argument is expected, it should
> be passed as the pointer that was received from the acquire function.
> Hence zero fixed and variable offset.
> 
> There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> with non-zero offset to other helper functions, which makes sense.

Agreed -- just out of curiosity do you know what the rationale was for
it disallowing offsets in the first place?

> Hence, lift the arg_type_is_release check for reg->off and cover all
> possible register types, instead of duplicating the same kind of check
> twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
> 
> For the release argument, arg_type_is_dynptr is the special case, where
> we go to actual object being freed through the dynptr, so the offset of
> the pointer still needs to allow fixed and variable offset and
> process_dynptr_func will verify them later for the release argument case
> as well.
> 
> This is not specific to ARG_PTR_TO_DYNPTR though, we will need to make
> this exception for any future object on the stack that needs to be
> released. In this sense, PTR_TO_STACK as a candidate for object on stack
> argument is a special case for release offset checks, and they need to
> be done by the helper releasing the object on stack.
> 
> Since the check has been lifted above all register type checks, remove
> the duplicated check that is being done for PTR_TO_BTF_ID.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c                         | 62 ++++++++++++-------
>  .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
>  2 files changed, 39 insertions(+), 25 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index c484e632b0cd..34e67d04579b 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -6092,11 +6092,38 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  			   const struct bpf_reg_state *reg, int regno,
>  			   enum bpf_arg_type arg_type)
>  {
> -	enum bpf_reg_type type = reg->type;
> -	bool fixed_off_ok = false;
> +	u32 type = reg->type;
>  
> -	switch ((u32)type) {
> -	/* Pointer types where reg offset is explicitly allowed: */
> +	/* When referenced register is passed to release function, it's fixed

s/it's/its

> +	 * offset must be 0.
> +	 *
> +	 * We will check arg_type_is_release reg has ref_obj_id when storing
> +	 * meta->release_regno.
> +	 */
> +	if (arg_type_is_release(arg_type)) {
> +		/* ARG_PTR_TO_DYNPTR with OBJ_RELEASE is a bit special, as it
> +		 * may not directly point to the object being released, but to
> +		 * dynptr pointing to such object, which might be at some offset
> +		 * on the stack. In that case, we simply to fallback to the
> +		 * default handling.
> +		 */

I personally am not a fan of this. We'd now have an entirely separate
branch of logic for most OBJ_RELEASE args, but then we fall through to
default handling for only PTR_TO_STACK specifically? That kind of
tactical complexity / readability tax is unfortunate, and adds up
quickly. It's already gotten pretty pervasive in the verifier, and I
personally would prefer to see us moving away from adding small one-off
conditional branches like this in the verifier when possible.

It seems to me like the problem we're trying to solve is that both arg
type and register type are relevant when detecting whether a register is
allowed to have nonzero offsets. It would be ideal if we could encode
some of this at build-time, similar to how we statically define and
compare compatible_reg_types in check_reg_type(). Is there some way for
us to do this for allowing offsets? Like maybe we can hoist some of the
build logic for how we define compatible_reg_types into a separate
table-like header file that can be included in multiple places to
construct statically-defined, constant arrays where all of this logic is
encoded? These check-offset / check-type functions could then really
just be lookups into these const arrays, and e.g. enabling new arg types
for register types could be captured by adding an entry to that
header-file table.

Maybe that's overkill for this specific example, but I really feel like
we need to start to try and move away from these one-off conditional
branches which collectively create a lot of complexity, and make it
difficult to reason about high-level correctness in the verifier.

To be clear: I won't block your change on this, but I'd like to hear
what you think about it as an idea, and I personally would really like
to see the verifier moving in that direction in general.

> +		if (!arg_type_is_dynptr(arg_type) || type != PTR_TO_STACK) {
> +			/* Doing check_ptr_off_reg check for the offset will
> +			 * catch this because fixed_off_ok is false, but
> +			 * checking here allows us to give the user a better
> +			 * error message.
> +			 */
> +			if (reg->off) {
> +				verbose(env, "R%d must have zero offset when passed to release func\n",
> +					regno);
> +				return -EINVAL;
> +			}

Why is this check necessary? To get a better error message in the
verifier if the register offset is nonzero? If so, IMO I don't think
that's a good enough reason to completely circumvent calling
__check_ptr_off_reg(). Not for correctness, but because it's
__check_ptr_off_reg()'s job to look at the register offset (that's what
the fixed_off_ok arg is for after all), and adding one-off checks like
which duplicate checks in other functions, it ends up ballooning the
code and complexity in the verifier. If we want to have a different
message for testing the offset being zero depending on the arg type,
that's where it belongs.

> +			return __check_ptr_off_reg(env, reg, regno, false);
> +		}
> +		/* Fallback to default handling */
> +	}
> +	switch (type) {
> +	/* Pointer types where both fixed and variable offset is explicitly allowed: */
>  	case PTR_TO_STACK:
>  		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
>  			verbose(env, "cannot pass in dynptr at an offset\n");
> @@ -6113,35 +6140,22 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  	case PTR_TO_BUF:
>  	case PTR_TO_BUF | MEM_RDONLY:
>  	case SCALAR_VALUE:
> -		/* Some of the argument types nevertheless require a
> -		 * zero register offset.
> -		 */
> -		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> -			return 0;

Doesn't this result in a change of behavior beyond just verifying an
offset of 0? Before, in addition to checking offset 0, we were also
verifying that e.g. reg->off was nonnegative for ARG_PTR_TO_ALLOC_MEM.
Now we only do that if it's OBJ_RELEASE. I'm actually not following why
we wouldn't want to invoke __check_ptr_offset() for any of these other
register types.

> -		break;
> +		return 0;
>  	/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
>  	 * fixed offset.
>  	 */
>  	case PTR_TO_BTF_ID:
>  		/* When referenced PTR_TO_BTF_ID is passed to release function,
>  		 * it's fixed offset must be 0.	In the other cases, fixed offset

Not your change, but while you're here could you please fix s/it's/its

> -		 * can be non-zero.
> +		 * can be non-zero. This was already checked above. So pass
> +		 * fixed_off_ok as true to allow fixed offset for all other
> +		 * cases. var_off always must be 0 for PTR_TO_BTF_ID, hence we
> +		 * still need to do checks instead of returning.
>  		 */
> -		if (arg_type_is_release(arg_type) && reg->off) {
> -			verbose(env, "R%d must have zero offset when passed to release func\n",
> -				regno);
> -			return -EINVAL;
> -		}
> -		/* For arg is release pointer, fixed_off_ok must be false, but
> -		 * we already checked and rejected reg->off != 0 above, so set
> -		 * to true to allow fixed offset for all other cases.
> -		 */
> -		fixed_off_ok = true;
> -		break;
> +		return __check_ptr_off_reg(env, reg, regno, true);
>  	default:
> -		break;
> +		return __check_ptr_off_reg(env, reg, regno, false);
>  	}
> -	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>  }
>  
>  static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> diff --git a/tools/testing/selftests/bpf/verifier/ringbuf.c b/tools/testing/selftests/bpf/verifier/ringbuf.c
> index b64d33e4833c..92e3f6a61a79 100644
> --- a/tools/testing/selftests/bpf/verifier/ringbuf.c
> +++ b/tools/testing/selftests/bpf/verifier/ringbuf.c
> @@ -28,7 +28,7 @@
>  	},
>  	.fixup_map_ringbuf = { 1 },
>  	.result = REJECT,
> -	.errstr = "dereference of modified alloc_mem ptr R1",
> +	.errstr = "R1 must have zero offset when passed to release func",
>  },
>  {
>  	"ringbuf: invalid reservation offset 2",
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-15  0:01 ` [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func Kumar Kartikeya Dwivedi
  2022-11-15 18:29   ` Joanne Koong
@ 2022-11-17 23:49   ` David Vernet
  2022-11-20 19:10     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 28+ messages in thread
From: David Vernet @ 2022-11-17 23:49 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> After previous commit, we are minimizing helper specific assumptions
> from check_func_arg_reg_off, making it generic, and offloading checks
> for a specific argument type to their respective functions called after
> check_func_arg_reg_off has been called.

What's the point of check_func_arg_reg_off() if helpers have to check
offsets after it's been called? Also, in [0], there's now logic in
check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
there's still a precedent for looking at arg types there. IMO it's
actually less confusing to just push as much offset checking as possible
into one place.

[0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/

> This allows relying on a consistent set of guarantees after that call
> and then relying on them in code that deals with registers for each
> argument type later. This is in line with how process_spin_lock,
> process_timer_func, process_kptr_func check reg->var_off to be constant.
> The same reasoning is used here to move the alignment check into
> process_dynptr_func. Note that it also needs to check for constant
> var_off, and accumulate the constant var_off when computing the spi in
> get_spi, but that fix will come in later changes.
> 
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  kernel/bpf/verifier.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 34e67d04579b..fd292f762d53 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
>  		return -EFAULT;
>  	}
>  
> +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> +	 * check_func_arg_reg_off's logic. We only need to check offset
> +	 * alignment for PTR_TO_STACK.
> +	 */
> +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> +		return -EINVAL;
> +	}

As alluded to above, I personally think it's more confusing to have this
check in process_dynptr_func(). On the one hand you have
check_func_arg_reg_off() which verifies that a register has the correct
offset, but then here we have to check for the register offset for
PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
as much of the offset-checking complexity into one place as possible?

>  	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>  	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
>  	 *
> @@ -6125,11 +6133,6 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>  	switch (type) {
>  	/* Pointer types where both fixed and variable offset is explicitly allowed: */
>  	case PTR_TO_STACK:
> -		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> -			verbose(env, "cannot pass in dynptr at an offset\n");
> -			return -EINVAL;
> -		}
> -		fallthrough;
>  	case PTR_TO_PACKET:
>  	case PTR_TO_PACKET_META:
>  	case PTR_TO_MAP_KEY:
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write}
  2022-11-15  0:01 ` [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
@ 2022-11-17 23:51   ` David Vernet
  0 siblings, 0 replies; 28+ messages in thread
From: David Vernet @ 2022-11-17 23:51 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Joanne Koong, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau

On Tue, Nov 15, 2022 at 05:31:29AM +0530, Kumar Kartikeya Dwivedi wrote:
> It may happen that destination buffer memory overlaps with memory dynptr
> points to. Hence, we must use memmove to correctly copy from dynptr to
> destination buffer, or source buffer to dynptr.
> 
> This actually isn't a problem right now, as memcpy implementation falls
> back to memmove on detecting overlap and warns about it, but we
> shouldn't be relying on that.
> 
> Acked-by: Joanne Koong <joannelkoong@gmail.com>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---

Good call, thanks for this.

Acked-by: David Vernet <void@manifault.com>

>  kernel/bpf/helpers.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 714f5c9d0c1f..1099ed1e7712 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1489,7 +1489,7 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, const struct bpf_dynptr_kern
>  	if (err)
>  		return err;
>  
> -	memcpy(dst, src->data + src->offset + offset, len);
> +	memmove(dst, src->data + src->offset + offset, len);

Can you add a comment here and in bpf_dynptr_write() which explain why
memmove is needed?

>  
>  	return 0;
>  }
> @@ -1517,7 +1517,7 @@ BPF_CALL_5(bpf_dynptr_write, const struct bpf_dynptr_kern *, dst, u32, offset, v
>  	if (err)
>  		return err;
>  
> -	memcpy(dst->data + dst->offset + offset, src, len);
> +	memmove(dst->data + dst->offset + offset, src, len);
>  
>  	return 0;
>  }
> -- 
> 2.38.1
> 

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func
  2022-11-17 21:11   ` David Vernet
@ 2022-11-20 18:06     ` Kumar Kartikeya Dwivedi
  2022-11-20 18:16       ` David Vernet
  0 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-20 18:06 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Fri, Nov 18, 2022 at 02:41:57AM IST, David Vernet wrote:
> On Tue, Nov 15, 2022 at 05:31:26AM +0530, Kumar Kartikeya Dwivedi wrote:
> > Recently, user ringbuf support introduced a PTR_TO_DYNPTR register type
> > for use in callback state, because in case of user ringbuf helpers,
> > there is no dynptr on the stack that is passed into the callback. To
> > reflect such a state, a special register type was created.
> >
> > However, some checks have been bypassed incorrectly during the addition
> > of this feature. First, for arg_type with MEM_UNINIT flag which
> > initialize a dynptr, they must be rejected for such register type.
> > Secondly, in the future, there are plans to add dynptr helpers that
> > operate on the dynptr itself and may change its offset and other
> > properties.
> >
> > In all of these cases, PTR_TO_DYNPTR shouldn't be allowed to be passed
> > to such helpers, however the current code simply returns 0.
> >
> > The rejection for helpers that release the dynptr is already handled.
> >
> > For fixing this, we take a step back and rework existing code in a way
> > that will allow fitting in all classes of helpers and have a coherent
> > model for dealing with the variety of use cases in which dynptr is used.
> >
> > First, for ARG_PTR_TO_DYNPTR, it can either be set alone or together
> > with a DYNPTR_TYPE_* constant that denotes the only type it accepts.
> >
> > Next, helpers which initialize a dynptr use MEM_UNINIT to indicate this
> > fact. To make the distinction clear, use MEM_RDONLY flag to indicate
> > that the helper only operates on the memory pointed to by the dynptr,
> > not the dynptr itself. In C parlance, it would be equivalent to taking
> > the dynptr as a point to const argument.
> >
> > When either of these flags are not present, the helper is allowed to
> > mutate both the dynptr itself and also the memory it points to.
> > Currently, the read only status of the memory is not tracked in the
> > dynptr, but it would be trivial to add this support inside dynptr state
> > of the register.
> >
> > With these changes and renaming PTR_TO_DYNPTR to CONST_PTR_TO_DYNPTR to
> > better reflect its usage, it can no longer be passed to helpers that
> > initialize a dynptr, i.e. bpf_dynptr_from_mem, bpf_ringbuf_reserve_dynptr.
> >
> > A note to reviewers is that in code that does mark_stack_slots_dynptr,
> > and unmark_stack_slots_dynptr, we implicitly rely on the fact that
> > PTR_TO_STACK reg is the only case that can reach that code path, as one
> > cannot pass CONST_PTR_TO_DYNPTR to helpers that don't set MEM_RDONLY. In
> > both cases such helpers won't be setting that flag.
> >
> > The next patch will add a couple of selftest cases to make sure this
> > doesn't break.
> >
> > Fixes: 205715673844 ("bpf: Add bpf_user_ringbuf_drain() helper")
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
>
> Overall this LGTM. Left some comments / nits before stamping.
> [...]
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index d02ae2f4249b..ba3b50895f6b 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -6568,14 +6568,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
> >  				}
> >
> >  				if (arg_dynptr) {
> > -					if (reg->type != PTR_TO_STACK) {
> > -						bpf_log(log, "arg#%d pointer type %s %s not to stack\n",
> > +					if (reg->type != PTR_TO_STACK &&
> > +					    reg->type != CONST_PTR_TO_DYNPTR) {
> > +						bpf_log(log, "arg#%d pointer type %s %s not to stack or dynptr\n",
> >  							i, btf_type_str(ref_t),
> >  							ref_tname);
> >  						return -EINVAL;
> >  					}
>
> Should we bring this check into process_dynptr_func()? It's kind of
> unfortunate that we have divergent logic between helpers and kfuncs,
> when it comes to checking types. Let's at least be uniform here?
>

Maybe. This is just a hardcoded version of check_reg_type for arg_dynptr.
process_dynptr_func can then just assume it only gets called with these two
register types in case of helpers or kfuncs. You would then just be repeating
the same check for check_helper_call inside process_dynptr_func.

So yes, we have to repeat the check differently here, but we're still a few
refactorings away before we can reasonably share code directly between
check_helper_call and check_kfunc_call. kfunc support has grown organically, it
was never meant to do all this. The recent refactor is one step in that
direction.

> > [...]
> > +/* Implementation details:
>
> I don't think "Implementation details" is necessary to include in the
> function header (and thank you for adding this header btw).
>

Ack.

> > + *
> > + * There are two register types representing a bpf_dynptr, one is PTR_TO_STACK
> > + * which points to a stack slot, and the other is CONST_PTR_TO_DYNPTR.
> > + *
> > + * In both cases we deal with the first 8 bytes, but need to mark the next 8
> > + * bytes as STACK_DYNPTR in case of PTR_TO_STACK. In case of
> > + * CONST_PTR_TO_DYNPTR, we are guaranteed to get the beginning of the object.
> > + *
> > + * Mutability of bpf_dynptr is at two levels, one is at the level of struct
> > + * bpf_dynptr itself, i.e. whether the helper is receiving a pointer to struct
> > + * bpf_dynptr or pointer to const struct bpf_dynptr. In the former case, it can
> > + * mutate the view of the dynptr and also possibly destroy it. In the latter
> > + * case, it cannot mutate the bpf_dynptr itself but it can still mutate the
> > + * memory that dynptr points to.
> > + *
> > + * The verifier will keep track both levels of mutation (bpf_dynptr's in
> > + * reg->type and the memory's in reg->dynptr.type), but there is no support for
> > + * readonly dynptr view yet, hence only the first case is tracked and checked.
> > + *
> > + * This is consistent with how C applies the const modifier to a struct object,
> > + * where the pointer itself inside bpf_dynptr becomes const but not what it
> > + * points to.
> > + *
> > + * Helpers which do not mutate the bpf_dynptr set MEM_RDONLY in their argument
> > + * type, and declare it as 'const struct bpf_dynptr *' in their prototype.
> > + */
> >  int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > -			enum bpf_arg_type arg_type,
> > -			struct bpf_call_arg_meta *meta)
> > +			enum bpf_arg_type arg_type, struct bpf_call_arg_meta *meta)
> >  {
> >  	struct bpf_reg_state *regs = cur_regs(env), *reg = &regs[regno];
> >  	int argno = regno - 1;
> >
> > -	/* We only need to check for initialized / uninitialized helper
> > -	 * dynptr args if the dynptr is not PTR_TO_DYNPTR, as the
> > -	 * assumption is that if it is, that a helper function
> > -	 * initialized the dynptr on behalf of the BPF program.
> > +	if ((arg_type & (MEM_UNINIT | MEM_RDONLY)) == (MEM_UNINIT | MEM_RDONLY)) {
> > +		verbose(env, "verifier internal error: misconfigured dynptr helper type flags\n");
> > +		return -EFAULT;
> > +	}
> > +
> > +	/* MEM_UNINIT and MEM_RDONLY are exclusive, when applied to a
>
> s/applied to a/applied to an
>

Ack.

> > +	 * ARG_PTR_TO_DYNPTR (or ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_*):
> > +	 *
> > +	 *  MEM_UNINIT - Points to memory that is an appropriate candidate for
> > +	 *		 constructing a mutable bpf_dynptr object.
> > +	 *
> > +	 *		 Currently, this is only possible with PTR_TO_STACK
> > +	 *		 pointing to a region of atleast 16 bytes which doesn't
>
> s/atleast/at least
>

Ack.

> > +	 *		 contain an existing bpf_dynptr.
> > +	 *
> > +	 *  MEM_RDONLY - Points to a initialized bpf_dynptr that will not be
> > +	 *		 mutated or destroyed. However, the memory it points to
> > +	 *		 may be mutated.
> > +	 *
> > +	 *  None       - Points to a initialized dynptr that can be mutated and
> > +	 *		 destroyed, including mutation of the memory it points
> > +	 *		 to.
> >  	 */
> > -	if (base_type(reg->type) == PTR_TO_DYNPTR)
> > -		return 0;
> >  	if (arg_type & MEM_UNINIT) {
> >  		if (!is_dynptr_reg_valid_uninit(env, reg)) {
> >  			verbose(env, "Dynptr has to be an uninitialized dynptr\n");
>
> Not your change, but this is an awkwardly phrased error message. IMO
> "dynptr must be initialized" is more succinct. Feel free to ignore if
> you'd like, I'm happy to submit a separate patch to change it as some
> point.
>

Feel free to, since I think unrelated changes should not be mixed in this patch.

> > @@ -5722,6 +5808,12 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >
> >  		meta->uninit_dynptr_regno = regno;
> >  	} else {
> > +		/* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> > +		if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> > +			verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> > +			return -EINVAL;
> > +		}
> > +
> >  		if (!is_dynptr_reg_valid_init(env, reg)) {
> >  			verbose(env,
> >  				"Expected an initialized dynptr as arg #%d\n",
> > @@ -5729,6 +5821,8 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  			return -EINVAL;
> >  		}
> >
> > +		/* Fold MEM_RDONLY, only inspect arg_type */
> > +		arg_type &= ~MEM_RDONLY;
>
> This seems brittle. Can is_dynptr_type_expected() just check the base
> type?
>

Yeah, it can. I'll switch it.

> >  		if (!is_dynptr_type_expected(env, reg, arg_type)) {
> >  			const char *err_extra = "";
> >
> > @@ -5874,7 +5968,7 @@ static const struct bpf_reg_types kptr_types = { .types = { PTR_TO_MAP_VALUE } }
> >  static const struct bpf_reg_types dynptr_types = {
> >  	.types = {
> >  		PTR_TO_STACK,
> > -		PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL,
> > +		CONST_PTR_TO_DYNPTR,
> >  	}
> >  };
> >
> > @@ -6050,12 +6144,15 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >  	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
> >  }
> >
> > -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> > +static u32 dynptr_ref_obj_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> >  {
> >  	struct bpf_func_state *state = func(env, reg);
> > -	int spi = get_spi(reg->off);
> > +	int spi;
> >
> > -	return state->stack[spi].spilled_ptr.id;
> > +	if (reg->type == CONST_PTR_TO_DYNPTR)
> > +		return reg->ref_obj_id;
> > +	spi = get_spi(reg->off);
> > +	return state->stack[spi].spilled_ptr.ref_obj_id;
> >  }
> >
> >  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > @@ -6119,11 +6216,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >  	if (arg_type_is_release(arg_type)) {
> >  		if (arg_type_is_dynptr(arg_type)) {
> >  			struct bpf_func_state *state = func(env, reg);
> > -			int spi = get_spi(reg->off);
> > +			int spi;
> >
> > -			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> > -			    !state->stack[spi].spilled_ptr.id) {
> > -				verbose(env, "arg %d is an unacquired reference\n", regno);
>
> Can we add a comment here explaining why only PTR_TO_STACK dynptrs are
> expected to be released? I know we have such comments elsewhere already,
> but if we're going to have logic like this which is hard-coded against
> assumptions of what types of dynptrs can be used in which contexts /
> helpers, I think it is important to be verbose in calling that out as
> it's not obvious from the code itself why this is the case.
>

Sure, but you mean a code comment, right?

> > +			if (reg->type == PTR_TO_STACK) {
> > +				spi = get_spi(reg->off);
> > +				if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> > +				    !state->stack[spi].spilled_ptr.ref_obj_id) {
> > +					verbose(env, "arg %d is an unacquired reference\n", regno);
> > +					return -EINVAL;
> > +				}
> > +			} else {
> > +				verbose(env, "cannot release unowned const bpf_dynptr\n");
> >  				return -EINVAL;
> >  			}
> >  		} else if (!reg->ref_obj_id && !register_is_null(reg)) {
> > @@ -7091,11 +7194,10 @@ static int set_user_ringbuf_callback_state(struct bpf_verifier_env *env,
> >  {
> >  	/* bpf_user_ringbuf_drain(struct bpf_map *map, void *callback_fn, void
> >  	 *			  callback_ctx, u64 flags);
> > -	 * callback_fn(struct bpf_dynptr_t* dynptr, void *callback_ctx);
> > +	 * callback_fn(const struct bpf_dynptr_t* dynptr, void *callback_ctx);
> >  	 */
> >  	__mark_reg_not_init(env, &callee->regs[BPF_REG_0]);
> > -	callee->regs[BPF_REG_1].type = PTR_TO_DYNPTR | DYNPTR_TYPE_LOCAL;
> > -	__mark_reg_known_zero(&callee->regs[BPF_REG_1]);
> > +	mark_dynptr_cb_reg(&callee->regs[BPF_REG_1], BPF_DYNPTR_TYPE_LOCAL);
> >  	callee->regs[BPF_REG_2] = caller->regs[BPF_REG_3];
> >
> >  	/* unused */
> > @@ -7474,7 +7576,13 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >
> >  	regs = cur_regs(env);
> >
> > +	/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> > +	 * be reinitialized by any dynptr helper. Hence, mark_stack_slots_dynptr
> > +	 * is safe to do directly.
> > +	 */
> >  	if (meta.uninit_dynptr_regno) {
> > +		if (WARN_ON_ONCE(regs[meta.uninit_dynptr_regno].type == CONST_PTR_TO_DYNPTR))
>
> I think we tend to do "verifier internal error" logs for situations like
> this rather than WARN_ON_ONCE(), right?
>

Yeah, I'll switch it.

> > +			return -EFAULT;
> >  		/* we write BPF_DW bits (8 bytes) at a time */
> >  		for (i = 0; i < BPF_DYNPTR_SIZE; i += 8) {
> >  			err = check_mem_access(env, insn_idx, meta.uninit_dynptr_regno,
> > @@ -7492,15 +7600,22 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >
> >  	if (meta.release_regno) {
> >  		err = -EINVAL;
> > -		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1]))
> > +		/* This can only be set for PTR_TO_STACK, as CONST_PTR_TO_DYNPTR cannot
> > +		 * be released by any dynptr helper. Hence, unmark_stack_slots_dynptr
> > +		 * is safe to do directly.
> > +		 */
> > +		if (arg_type_is_dynptr(fn->arg_type[meta.release_regno - BPF_REG_1])) {
> > +			if (WARN_ON_ONCE(regs[meta.release_regno].type == CONST_PTR_TO_DYNPTR))
>
> Same here r.e. using verifier log rather than WARN_ON_ONCE(). Also, can
> we bring this comment inside the if statement to match the alignemnt of
> the one you added below?
>

Ack.

> > +				return -EFAULT;
> >  			err = unmark_stack_slots_dynptr(env, &regs[meta.release_regno]);
> > -		else if (meta.ref_obj_id)
> > +		} else if (meta.ref_obj_id) {
> >  			err = release_reference(env, meta.ref_obj_id);
> > -		/* meta.ref_obj_id can only be 0 if register that is meant to be
> > -		 * released is NULL, which must be > R0.
> > -		 */
> > -		else if (register_is_null(&regs[meta.release_regno]))
> > +		} else if (register_is_null(&regs[meta.release_regno])) {
> > +			/* meta.ref_obj_id can only be 0 if register that is meant to be
> > +			 * released is NULL, which must be > R0.
> > +			 */
> >  			err = 0;
> > +		}
> >  		if (err) {
> >  			verbose(env, "func %s#%d reference has not been acquired before\n",
> >  				func_id_name(func_id), func_id);
> > @@ -7574,11 +7689,8 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
> >  					return -EFAULT;
> >  				}
> >
> > -				if (base_type(reg->type) != PTR_TO_DYNPTR)
> > -					/* Find the id of the dynptr we're
> > -					 * tracking the reference of
> > -					 */
> > -					meta.ref_obj_id = stack_slot_get_id(env, reg);
> > +				/* Find the id of the dynptr we're tracking the reference of */
>
> Not your change, but IMO this comment does not add any value.
>

Ok, I'll drop it.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func
  2022-11-20 18:06     ` Kumar Kartikeya Dwivedi
@ 2022-11-20 18:16       ` David Vernet
  0 siblings, 0 replies; 28+ messages in thread
From: David Vernet @ 2022-11-20 18:16 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Sun, Nov 20, 2022 at 11:36:51PM +0530, Kumar Kartikeya Dwivedi wrote:

[...]

> > Not your change, but this is an awkwardly phrased error message. IMO
> > "dynptr must be initialized" is more succinct. Feel free to ignore if
> > you'd like, I'm happy to submit a separate patch to change it as some
> > point.
> >
> 
> Feel free to, since I think unrelated changes should not be mixed in this patch.

No problem, will do.

[...]

> > >  static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > > @@ -6119,11 +6216,17 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > >  	if (arg_type_is_release(arg_type)) {
> > >  		if (arg_type_is_dynptr(arg_type)) {
> > >  			struct bpf_func_state *state = func(env, reg);
> > > -			int spi = get_spi(reg->off);
> > > +			int spi;
> > >
> > > -			if (!is_spi_bounds_valid(state, spi, BPF_DYNPTR_NR_SLOTS) ||
> > > -			    !state->stack[spi].spilled_ptr.id) {
> > > -				verbose(env, "arg %d is an unacquired reference\n", regno);
> >
> > Can we add a comment here explaining why only PTR_TO_STACK dynptrs are
> > expected to be released? I know we have such comments elsewhere already,
> > but if we're going to have logic like this which is hard-coded against
> > assumptions of what types of dynptrs can be used in which contexts /
> > helpers, I think it is important to be verbose in calling that out as
> > it's not obvious from the code itself why this is the case.
> >
> 
> Sure, but you mean a code comment, right?

Yep, a code comment.

[...]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off
  2022-11-17 23:42   ` David Vernet
@ 2022-11-20 18:41     ` Kumar Kartikeya Dwivedi
  2022-11-21  5:39       ` David Vernet
  0 siblings, 1 reply; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-20 18:41 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Fri, Nov 18, 2022 at 05:12:07AM IST, David Vernet wrote:
> On Tue, Nov 15, 2022 at 05:31:27AM +0530, Kumar Kartikeya Dwivedi wrote:
> > While check_func_arg_reg_off is the place which performs generic checks
> > needed by various candidates of reg->type, there is some handling for
> > special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> > ARG_PTR_TO_ALLOC_MEM.
> >
> > This commit aims to streamline these special cases and instead leave
> > other things up to argument type specific code to handle. The function
> > will be restrictive by default, and cover all possible cases when
> > OBJ_RELEASE is set, without having to update the function again (and
> > missing to do that being a bug).
> >
> > This is done primarily for two reasons: associating back reg->type to
> > its argument leaves room for the list getting out of sync when a new
> > reg->type is supported by an arg_type.
> >
> > The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> > we already handle, whenever a release argument is expected, it should
> > be passed as the pointer that was received from the acquire function.
> > Hence zero fixed and variable offset.
> >
> > There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> > its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> > with non-zero offset to other helper functions, which makes sense.
>
> Agreed -- just out of curiosity do you know what the rationale was for
> it disallowing offsets in the first place?
>

64620e0a1e71 ("bpf: Fix out of bounds access for ringbuf helpers")
It predates OBJ_RELEASE's addition.

> > Hence, lift the arg_type_is_release check for reg->off and cover all
> > possible register types, instead of duplicating the same kind of check
> > twice for current OBJ_RELEASE arg_types (alloc_mem and ptr_to_btf_id).
> >
> > For the release argument, arg_type_is_dynptr is the special case, where
> > we go to actual object being freed through the dynptr, so the offset of
> > the pointer still needs to allow fixed and variable offset and
> > process_dynptr_func will verify them later for the release argument case
> > as well.
> >
> > This is not specific to ARG_PTR_TO_DYNPTR though, we will need to make
> > this exception for any future object on the stack that needs to be
> > released. In this sense, PTR_TO_STACK as a candidate for object on stack
> > argument is a special case for release offset checks, and they need to
> > be done by the helper releasing the object on stack.
> >
> > Since the check has been lifted above all register type checks, remove
> > the duplicated check that is being done for PTR_TO_BTF_ID.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c                         | 62 ++++++++++++-------
> >  .../testing/selftests/bpf/verifier/ringbuf.c  |  2 +-
> >  2 files changed, 39 insertions(+), 25 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index c484e632b0cd..34e67d04579b 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -6092,11 +6092,38 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >  			   const struct bpf_reg_state *reg, int regno,
> >  			   enum bpf_arg_type arg_type)
> >  {
> > -	enum bpf_reg_type type = reg->type;
> > -	bool fixed_off_ok = false;
> > +	u32 type = reg->type;
> >
> > -	switch ((u32)type) {
> > -	/* Pointer types where reg offset is explicitly allowed: */
> > +	/* When referenced register is passed to release function, it's fixed
>
> s/it's/its
>

Ack.

> > +	 * offset must be 0.
> > +	 *
> > +	 * We will check arg_type_is_release reg has ref_obj_id when storing
> > +	 * meta->release_regno.
> > +	 */
> > +	if (arg_type_is_release(arg_type)) {
> > +		/* ARG_PTR_TO_DYNPTR with OBJ_RELEASE is a bit special, as it
> > +		 * may not directly point to the object being released, but to
> > +		 * dynptr pointing to such object, which might be at some offset
> > +		 * on the stack. In that case, we simply to fallback to the
> > +		 * default handling.
> > +		 */
>
> I personally am not a fan of this. We'd now have an entirely separate
> branch of logic for most OBJ_RELEASE args, but then we fall through to
> default handling for only PTR_TO_STACK specifically? That kind of
> tactical complexity / readability tax is unfortunate, and adds up
> quickly. It's already gotten pretty pervasive in the verifier, and I
> personally would prefer to see us moving away from adding small one-off
> conditional branches like this in the verifier when possible.

I think that's because PTR_TO_STACK is a bit special. Unlike normal register
types which you get from acquire functions, you instead create some kind of
resource on the stack in this case. So the actual pointer that is to be released
(dynptr wrapping ringbuf reservation) is actually on the stack, and unlike the
normal case there is an indirection.

This will be the same with more cases in the future (e.g. moving a list_head to
the stack, which you will have to release, etc.).

>
> It seems to me like the problem we're trying to solve is that both arg
> type and register type are relevant when detecting whether a register is
> allowed to have nonzero offsets. It would be ideal if we could encode
> some of this at build-time, similar to how we statically define and
> compare compatible_reg_types in check_reg_type(). Is there some way for
> us to do this for allowing offsets? Like maybe we can hoist some of the
> build logic for how we define compatible_reg_types into a separate
> table-like header file that can be included in multiple places to
> construct statically-defined, constant arrays where all of this logic is
> encoded? These check-offset / check-type functions could then really
> just be lookups into these const arrays, and e.g. enabling new arg types
> for register types could be captured by adding an entry to that
> header-file table.
>

It would be easier to understand this if you can share it as an RFC/PoC (i.e. if
you determine it helps and is an improvement in terms of readability).

> Maybe that's overkill for this specific example, but I really feel like
> we need to start to try and move away from these one-off conditional
> branches which collectively create a lot of complexity, and make it
> difficult to reason about high-level correctness in the verifier.
>

I totally agree. The code is complicated.

> To be clear: I won't block your change on this, but I'd like to hear
> what you think about it as an idea, and I personally would really like
> to see the verifier moving in that direction in general.
>

I'm all for simplifications. I already tried dropping a few special cases, but I
was not able to figure out how we can capture the requirement for objects on
stack being released any better than this.

Feel free to provide more suggestions though, based on the above.

> > +		if (!arg_type_is_dynptr(arg_type) || type != PTR_TO_STACK) {

So technically this will be called arg_type_is_object_on_stack once we have more
cases in the future.

> > +			/* Doing check_ptr_off_reg check for the offset will
> > +			 * catch this because fixed_off_ok is false, but
> > +			 * checking here allows us to give the user a better
> > +			 * error message.
> > +			 */
> > +			if (reg->off) {
> > +				verbose(env, "R%d must have zero offset when passed to release func\n",
> > +					regno);
> > +				return -EINVAL;
> > +			}
>
> Why is this check necessary? To get a better error message in the
> verifier if the register offset is nonzero? If so, IMO I don't think
> that's a good enough reason to completely circumvent calling
> __check_ptr_off_reg().

But it's not circumvented, it's called after this check?

> Not for correctness, but because it's
> __check_ptr_off_reg()'s job to look at the register offset (that's what
> the fixed_off_ok arg is for after all), and adding one-off checks like
> which duplicate checks in other functions, it ends up ballooning the
> code and complexity in the verifier. If we want to have a different
> message for testing the offset being zero depending on the arg type,
> that's where it belongs.
>

The problem is that it's also invoked in other places outside of
check_func_arg_reg_off where that bool is_release_arg parameter would make no
sense.

I don't have a strong opinion on moving it inside the function though, just
sharing my thoughts.

> > +			return __check_ptr_off_reg(env, reg, regno, false);
> > +		}
> > +		/* Fallback to default handling */
> > +	}
> > +	switch (type) {
> > +	/* Pointer types where both fixed and variable offset is explicitly allowed: */
> >  	case PTR_TO_STACK:
> >  		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> >  			verbose(env, "cannot pass in dynptr at an offset\n");
> > @@ -6113,35 +6140,22 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> >  	case PTR_TO_BUF:
> >  	case PTR_TO_BUF | MEM_RDONLY:
> >  	case SCALAR_VALUE:
> > -		/* Some of the argument types nevertheless require a
> > -		 * zero register offset.
> > -		 */
> > -		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> > -			return 0;
>
> Doesn't this result in a change of behavior beyond just verifying an
> offset of 0? Before, in addition to checking offset 0, we were also
> verifying that e.g. reg->off was nonnegative for ARG_PTR_TO_ALLOC_MEM.

Since that translates to PTR_TO_MEM, __check_mem_access handles the negative
offset case (all memory types do, and PTR_TO_BTF_ID interprets off as u32).
PTR_TO_STACK has to explicitly permit reg->off < 0.

> Now we only do that if it's OBJ_RELEASE. I'm actually not following why
> we wouldn't want to invoke __check_ptr_offset() for any of these other
> register types.
>

Because all of them having non-zero reg->off, reg->var_off is totally ok. The
point of this function is to reject all other register types.

> > -		break;
> > +		return 0;
> >  	/* All the rest must be rejected, except PTR_TO_BTF_ID which allows
> >  	 * fixed offset.
> >  	 */
> >  	case PTR_TO_BTF_ID:
> >  		/* When referenced PTR_TO_BTF_ID is passed to release function,
> >  		 * it's fixed offset must be 0.	In the other cases, fixed offset
>
> Not your change, but while you're here could you please fix s/it's/its
>

Will do.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-17 23:49   ` David Vernet
@ 2022-11-20 19:10     ` Kumar Kartikeya Dwivedi
  2022-11-20 19:40       ` Alexei Starovoitov
  2022-11-21  7:27       ` David Vernet
  0 siblings, 2 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-20 19:10 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > After previous commit, we are minimizing helper specific assumptions
> > from check_func_arg_reg_off, making it generic, and offloading checks
> > for a specific argument type to their respective functions called after
> > check_func_arg_reg_off has been called.
>
> What's the point of check_func_arg_reg_off() if helpers have to check
> offsets after it's been called? Also, in [0], there's now logic in
> check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> there's still a precedent for looking at arg types there. IMO it's
> actually less confusing to just push as much offset checking as possible
> into one place.
>

I think you need to define 'as much offset checking'.

Consider process_kptr_func, it requires var_off to be constant. Same for
bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
of that into check_func_arg_reg_off?

Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
exception go in this function as well?

Where do you draw the line here in terms of what that function does?

IMO, there are a certain set of properties that check_func_arg_reg_off provides,
you could say that if each register type was a class, then the checks there
would be what you would do while constructing them on calling:

PtrToStack(off, var_off /* can be const or variable */)
PtrToMapValue(off, var_off /* can be const or variable */)
PtrToBtfId(off /* must be >= 0 */) /* no var_off */

How they get used by each helper and what further checks each helper needs to do
on them based on the arg_type should be done at a later stage when the specific
argument type is processed.

Agreed that, it's not perfect, with the odd case for PTR_TO_STACK having non-0
reg->off for OBJ_RELEASE. But IMO once you realise it makes no sense to release
PTR_TO_STACK and that PTR_TO_STACK actually points to the real pointer being
released, it needs to be handled specially.

> [0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/
>
> > This allows relying on a consistent set of guarantees after that call
> > and then relying on them in code that deals with registers for each
> > argument type later. This is in line with how process_spin_lock,
> > process_timer_func, process_kptr_func check reg->var_off to be constant.
> > The same reasoning is used here to move the alignment check into
> > process_dynptr_func. Note that it also needs to check for constant
> > var_off, and accumulate the constant var_off when computing the spi in
> > get_spi, but that fix will come in later changes.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  kernel/bpf/verifier.c | 13 ++++++++-----
> >  1 file changed, 8 insertions(+), 5 deletions(-)
> >
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index 34e67d04579b..fd292f762d53 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> >  		return -EFAULT;
> >  	}
> >
> > +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> > +	 * check_func_arg_reg_off's logic. We only need to check offset
> > +	 * alignment for PTR_TO_STACK.
> > +	 */
> > +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> > +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > +		return -EINVAL;
> > +	}
>
> As alluded to above, I personally think it's more confusing to have this
> check in process_dynptr_func(). On the one hand you have
> check_func_arg_reg_off() which verifies that a register has the correct
> offset, but then here we have to check for the register offset for
> PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
> as much of the offset-checking complexity into one place as possible?
>

I'm trying to make a split between 'offset correct for the reg->type in
general', and 'offset correct for the reg->type when when passed as an argument
for arg_type'. I think the latter is specific and different for each case and
thus belongs inside the case ARG_TYPE_* block of each of those.

Anyhow, all of this is not to reject your point, but to say that if we're
keeping that check in check_func_arg_reg_off for dynptr, let's also examine why
we should/shouldn't move checks for other arg_types inside it as well, and
whether the end result is going to be better than this.

In that case, atleast to me, it doesn't make sense to check reg->off %
BPF_REG_SIZE for ARG_PTR_TO_DYNPTR while leaving out other arg types.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-20 19:10     ` Kumar Kartikeya Dwivedi
@ 2022-11-20 19:40       ` Alexei Starovoitov
  2022-11-20 21:02         ` Kumar Kartikeya Dwivedi
  2022-11-21  7:27       ` David Vernet
  1 sibling, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2022-11-20 19:40 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: David Vernet, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Joanne Koong

On Sun, Nov 20, 2022 at 11:10 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> you could say that if each register type was a class, then the checks there
> would be what you would do while constructing them on calling:
>
> PtrToStack(off, var_off /* can be const or variable */)
> PtrToMapValue(off, var_off /* can be const or variable */)
> PtrToBtfId(off /* must be >= 0 */) /* no var_off */

Just to complicate things a bit... ;)
There was a request to allow var_off in ptr_to_btf_id.
Sometimes there are fixed size arrays in structs and
programs need to iterate those arrays in a loop.
Another case is to access flex_array at the end of a struct.
Like cgroup->ancestors[].
Both are currently impossible, but the verifier has to
get this capability.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-20 19:40       ` Alexei Starovoitov
@ 2022-11-20 21:02         ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-11-20 21:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David Vernet, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Joanne Koong

On Mon, Nov 21, 2022 at 01:10:21AM IST, Alexei Starovoitov wrote:
> On Sun, Nov 20, 2022 at 11:10 AM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> > you could say that if each register type was a class, then the checks there
> > would be what you would do while constructing them on calling:
> >
> > PtrToStack(off, var_off /* can be const or variable */)
> > PtrToMapValue(off, var_off /* can be const or variable */)
> > PtrToBtfId(off /* must be >= 0 */) /* no var_off */
>
> Just to complicate things a bit... ;)
> There was a request to allow var_off in ptr_to_btf_id.
> Sometimes there are fixed size arrays in structs and
> programs need to iterate those arrays in a loop.

Honestly, I don't see why this case needs var_off.
Each iteration will bump the reg->off while indexing into the flex/fixed size
array. Even ptr += scalar where scalar is known will accumulate into reg->off.
But maybe I missed some specific case.

> Another case is to access flex_array at the end of a struct.
> Like cgroup->ancestors[].

This might, but still, to mark the dst_reg as pointer you need to know whether
the possibly variable offset going beyond type->size is 8-byte aligned, right?

Otherwise the best that could be done conservatively is mark_reg_unknown.

> Both are currently impossible, but the verifier has to
> get this capability.

I guess it will be a very involved change. All over the verifier there are
implicit assumptions in places (after certain checks) that PTR_TO_BTF_ID never
has var_off.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off
  2022-11-20 18:41     ` Kumar Kartikeya Dwivedi
@ 2022-11-21  5:39       ` David Vernet
  0 siblings, 0 replies; 28+ messages in thread
From: David Vernet @ 2022-11-21  5:39 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Mon, Nov 21, 2022 at 12:11:37AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Nov 18, 2022 at 05:12:07AM IST, David Vernet wrote:
> > On Tue, Nov 15, 2022 at 05:31:27AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > While check_func_arg_reg_off is the place which performs generic checks
> > > needed by various candidates of reg->type, there is some handling for
> > > special cases, like ARG_PTR_TO_DYNPTR, OBJ_RELEASE, and
> > > ARG_PTR_TO_ALLOC_MEM.
> > >
> > > This commit aims to streamline these special cases and instead leave
> > > other things up to argument type specific code to handle. The function
> > > will be restrictive by default, and cover all possible cases when
> > > OBJ_RELEASE is set, without having to update the function again (and
> > > missing to do that being a bug).
> > >
> > > This is done primarily for two reasons: associating back reg->type to
> > > its argument leaves room for the list getting out of sync when a new
> > > reg->type is supported by an arg_type.
> > >
> > > The other case is ARG_PTR_TO_ALLOC_MEM. The problem there is something
> > > we already handle, whenever a release argument is expected, it should
> > > be passed as the pointer that was received from the acquire function.
> > > Hence zero fixed and variable offset.
> > >
> > > There is nothing special about ARG_PTR_TO_ALLOC_MEM, where technically
> > > its target register type PTR_TO_MEM | MEM_ALLOC can already be passed
> > > with non-zero offset to other helper functions, which makes sense.
> >
> > Agreed -- just out of curiosity do you know what the rationale was for
> > it disallowing offsets in the first place?
> >
> 
> 64620e0a1e71 ("bpf: Fix out of bounds access for ringbuf helpers")
> It predates OBJ_RELEASE's addition.

Thanks

[...]

TL;DR: going to ack this for now so the set can land before in plenty of
time before the 6.1 release. Thanks again for the cleanup + fixes in
this set. Left some other comments below that I'd like to discuss and
then possibly address at some point down the road.

Acked-by: David Vernet <void@manifault.com>

> > > +	 * offset must be 0.
> > > +	 *
> > > +	 * We will check arg_type_is_release reg has ref_obj_id when storing
> > > +	 * meta->release_regno.
> > > +	 */
> > > +	if (arg_type_is_release(arg_type)) {
> > > +		/* ARG_PTR_TO_DYNPTR with OBJ_RELEASE is a bit special, as it
> > > +		 * may not directly point to the object being released, but to
> > > +		 * dynptr pointing to such object, which might be at some offset
> > > +		 * on the stack. In that case, we simply to fallback to the
> > > +		 * default handling.
> > > +		 */
> >
> > I personally am not a fan of this. We'd now have an entirely separate
> > branch of logic for most OBJ_RELEASE args, but then we fall through to
> > default handling for only PTR_TO_STACK specifically? That kind of
> > tactical complexity / readability tax is unfortunate, and adds up
> > quickly. It's already gotten pretty pervasive in the verifier, and I
> > personally would prefer to see us moving away from adding small one-off
> > conditional branches like this in the verifier when possible.
> 
> I think that's because PTR_TO_STACK is a bit special. Unlike normal register
> types which you get from acquire functions, you instead create some kind of
> resource on the stack in this case. So the actual pointer that is to be released
> (dynptr wrapping ringbuf reservation) is actually on the stack, and unlike the
> normal case there is an indirection.
> 
> This will be the same with more cases in the future (e.g. moving a list_head to
> the stack, which you will have to release, etc.).

Yeah, I understand why it's this way. For now I think it's fine, but IMO
the fact that we branch out like this for different register types is a
sign that we need to add some more indirection here to clarify things a
bit. As you reasonably requested below, once I have more time on my
hands (probably not until 2023), I'm happy to explore this and send out
an RFC patch set. Alexei mentioned that Daniel had been considering
spending some time cleaning up the verifier as well, so I'd be curious
to chat with him and see what he's been thinking.

> > It seems to me like the problem we're trying to solve is that both arg
> > type and register type are relevant when detecting whether a register is
> > allowed to have nonzero offsets. It would be ideal if we could encode
> > some of this at build-time, similar to how we statically define and
> > compare compatible_reg_types in check_reg_type(). Is there some way for
> > us to do this for allowing offsets? Like maybe we can hoist some of the
> > build logic for how we define compatible_reg_types into a separate
> > table-like header file that can be included in multiple places to
> > construct statically-defined, constant arrays where all of this logic is
> > encoded? These check-offset / check-type functions could then really
> > just be lookups into these const arrays, and e.g. enabling new arg types
> > for register types could be captured by adding an entry to that
> > header-file table.
> >
> 
> It would be easier to understand this if you can share it as an RFC/PoC (i.e. if
> you determine it helps and is an improvement in terms of readability).

Totally reasonable request, I'll try to work on this once I get a few
more things taken care of (probably won't be until 2023).

> 
> > Maybe that's overkill for this specific example, but I really feel like
> > we need to start to try and move away from these one-off conditional
> > branches which collectively create a lot of complexity, and make it
> > difficult to reason about high-level correctness in the verifier.
> >
> 
> I totally agree. The code is complicated.
> 
> > To be clear: I won't block your change on this, but I'd like to hear
> > what you think about it as an idea, and I personally would really like
> > to see the verifier moving in that direction in general.
> >
> 
> I'm all for simplifications. I already tried dropping a few special cases, but I
> was not able to figure out how we can capture the requirement for objects on
> stack being released any better than this.
> 
> Feel free to provide more suggestions though, based on the above.

I don't have anything concrete to suggest for this revision.

> > > +		if (!arg_type_is_dynptr(arg_type) || type != PTR_TO_STACK) {
> 
> So technically this will be called arg_type_is_object_on_stack once we have more
> cases in the future.
> 
> > > +			/* Doing check_ptr_off_reg check for the offset will
> > > +			 * catch this because fixed_off_ok is false, but
> > > +			 * checking here allows us to give the user a better
> > > +			 * error message.
> > > +			 */
> > > +			if (reg->off) {
> > > +				verbose(env, "R%d must have zero offset when passed to release func\n",
> > > +					regno);
> > > +				return -EINVAL;
> > > +			}
> >
> > Why is this check necessary? To get a better error message in the
> > verifier if the register offset is nonzero? If so, IMO I don't think
> > that's a good enough reason to completely circumvent calling
> > __check_ptr_off_reg().
> 
> But it's not circumvented, it's called after this check?

Circumvent was probably the wrong word. What I meant was to say that
this is unnecessarily short-circuiting the same offset check that is
already present in __check_ptr_off_reg(). IMO, the purpose of
check_func_arg_reg_off() is essentially to call __check_ptr_off_reg()
with the correct parameters. It's mapping (reg x arg_type) -> what type
of offset check should be performed for this register. I'm not convinced
that adding another verifier message which complains about specifically
release args is worth the extra cognitive overhead of someone reading
this and trying to understand why there are multiple places where we're
checking reg->off > 0. I think that is especially true given that some
callers are setting OBJ_RELEASE to force a zero-offset check even though
they're not actually release args, so the message may be misleading for
some programs. Perhaps something like this is more generalizable:

int err;
bool fixed_off_ok = false;

...

err = __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
if (err)
	verbose(env, "R%d reg with reg type %s, arg type %s had invalid offset %d\n",
		regno, reg_type_str(...), arg_type_str(...), reg->off);

return err;

This could be done at the bottom of the function so that all callers
receive the same information in the event of an invalid reg offset.
Apologies for the poor choice of using the word "circumvent" on the
first email.

> 
> > Not for correctness, but because it's
> > __check_ptr_off_reg()'s job to look at the register offset (that's what
> > the fixed_off_ok arg is for after all), and adding one-off checks like
> > which duplicate checks in other functions, it ends up ballooning the
> > code and complexity in the verifier. If we want to have a different
> > message for testing the offset being zero depending on the arg type,
> > that's where it belongs.
> >
> 
> The problem is that it's also invoked in other places outside of
> check_func_arg_reg_off where that bool is_release_arg parameter would make no
> sense.

I agree with you on this -- adding arg_type to __check_ptr_off_reg()
doesn't make sense (that's the point of check_func_arg_reg_off()). What
do you think about my suggestion above instead?

> I don't have a strong opinion on moving it inside the function though, just
> sharing my thoughts.

You're making reasonable points as well. Let me know if the above
clarifies what I meant a bit.

> > > +			return __check_ptr_off_reg(env, reg, regno, false);
> > > +		}
> > > +		/* Fallback to default handling */
> > > +	}
> > > +	switch (type) {
> > > +	/* Pointer types where both fixed and variable offset is explicitly allowed: */
> > >  	case PTR_TO_STACK:
> > >  		if (arg_type_is_dynptr(arg_type) && reg->off % BPF_REG_SIZE) {
> > >  			verbose(env, "cannot pass in dynptr at an offset\n");
> > > @@ -6113,35 +6140,22 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
> > >  	case PTR_TO_BUF:
> > >  	case PTR_TO_BUF | MEM_RDONLY:
> > >  	case SCALAR_VALUE:
> > > -		/* Some of the argument types nevertheless require a
> > > -		 * zero register offset.
> > > -		 */
> > > -		if (base_type(arg_type) != ARG_PTR_TO_ALLOC_MEM)
> > > -			return 0;
> >
> > Doesn't this result in a change of behavior beyond just verifying an
> > offset of 0? Before, in addition to checking offset 0, we were also
> > verifying that e.g. reg->off was nonnegative for ARG_PTR_TO_ALLOC_MEM.
> 
> Since that translates to PTR_TO_MEM, __check_mem_access handles the negative
> offset case (all memory types do, and PTR_TO_BTF_ID interprets off as u32).
> PTR_TO_STACK has to explicitly permit reg->off < 0.

Ahh right, it's just PTR_TO_MEM. Thanks for clarifying.

> > Now we only do that if it's OBJ_RELEASE. I'm actually not following why
> > we wouldn't want to invoke __check_ptr_offset() for any of these other
> > register types.
> >
> 
> Because all of them having non-zero reg->off, reg->var_off is totally ok. The
> point of this function is to reject all other register types.

Fair enough. I think part of my confusion here stemmed from the fact
that these register types actually _do_ need to have their offsets
verified, it just happens to be checked somewhere else (as you said, in
__check_mem_access()). I wonder if at some point we want to refactor the
verifier to bring __check_mem_access() into this function, and call it
here for these pointer types? Or some other refactor that unifies where
and how validation is done for different register types and arg types.

IMO a somehwat significant part of the complexity in the verifier is the
result of divergences like this between e.g. different reg types and arg
types. Someone reading check_func_arg(), for example, will see that we
call check_func_arg_reg_off(), and would arguably reasonably assume that
the register offset has been verified by the time this is called.

Anyways, I certainly don't mean to drag your patches through this
digression. If and when we start to consider doing some more refactoring
in the verifier, I think this is something we should keep in mind.

In general this patch LGTM. As mentioned above, I'll sign off now so we
can make sure this gets in before the 6.1 release, but I think it'd be
good to revisit the verifier message thing I mentioned above once we all
have a bit more bandwidth.

Acked-by: David Vernet <void@manifault.com>

[...]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-20 19:10     ` Kumar Kartikeya Dwivedi
  2022-11-20 19:40       ` Alexei Starovoitov
@ 2022-11-21  7:27       ` David Vernet
  2022-12-07 20:41         ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 28+ messages in thread
From: David Vernet @ 2022-11-21  7:27 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Mon, Nov 21, 2022 at 12:40:13AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> > On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > After previous commit, we are minimizing helper specific assumptions
> > > from check_func_arg_reg_off, making it generic, and offloading checks
> > > for a specific argument type to their respective functions called after
> > > check_func_arg_reg_off has been called.
> >
> > What's the point of check_func_arg_reg_off() if helpers have to check
> > offsets after it's been called? Also, in [0], there's now logic in
> > check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> > there's still a precedent for looking at arg types there. IMO it's
> > actually less confusing to just push as much offset checking as possible
> > into one place.
> >
> 
> I think you need to define 'as much offset checking'.

We certainly don't want to make check_func_arg_reg_off() a monster
function, but I think we can do better in terms of making the verifier
consistent and easier to reason about by pushing more logic into it.

My view (subject to change upon learning new context I may be missing)
is that the job of check_func_arg_reg_off() should be to map all
(reg_type x arg_type) combinations to checking the correct offset,
likely by calling __check_ptr_off_reg() with the correct arguments. The
signature / implementation of __check_ptr_off_reg() may need to change
for that to happen, and we may need to e.g. also leverage
__check_mem_access() to accommodate the mem register types.

Yes, doing this may cause check_func_arg_reg_off() to have a potentially
very large switch statement (though there are other ways to address
that), but isn't that preferable to having to read through hundreds or
thousands of lines of verifier code to convince yourself that an offset
was correctly determined for every possible (register x arg_type)?
Having all of that contained in one, well-defined spot seems much
simpler. Please let me know if I've grossly misunderstood something, or
am missing important / relevant context.

> Consider process_kptr_func, it requires var_off to be constant. Same for

IMO that check should certainly go in check_func_arg_reg_off().
__check_ptr_off_reg() already checks for tnum_is_const(reg->var_off) in
__check_ptr_off_reg(), and check_func_arg_reg_off() has all the
information it needs to encapsulate the logic for that check.

> bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
> PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
> of that into check_func_arg_reg_off?
> 
> Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
> exception go in this function as well?
> 
> Where do you draw the line here in terms of what that function does?

Personally I think "drawing the line" is the wrong way to think about
it. We need to decide what role the function plays, and generalize it in
a way that's consistent and clear. IMO its role at a high level should
be, "The verify arg / register offsets step in the verifier". If you
break that encapsulation, it becomes much more difficult to build a
consistent mental model of what the verifier is doing. Note that this
applies to other verification steps as well, such as e.g. verifying
types, verifying proper refcounts, etc. Perhaps I'm grossly naive for
thinking this is possible? Please let me know if you think that's the
case.

Anyways, it might not be possible to aggregate all logic for checking
reg->off into the function in the codebase as it exists today, but it
seems like a desirable end-state, and it feels like a step backwards to
start selectively moving reg->off checking out of
check_func_arg_reg_off() and into arg / helper specific functions.

> IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> you could say that if each register type was a class, then the checks there
> would be what you would do while constructing them on calling:
> 
> PtrToStack(off, var_off /* can be const or variable */)
> PtrToMapValue(off, var_off /* can be const or variable */)
> PtrToBtfId(off /* must be >= 0 */) /* no var_off */

Hmmm, but these are all just reg_type, right? Why are we checking
OBJ_RELEASE in check_func_arg_reg_off() if that's the case?

> How they get used by each helper and what further checks each helper needs to do
> on them based on the arg_type should be done at a later stage when the specific
> argument type is processed.

I definitely agree that there may be helper-specific verification that
needs to be done. We're talking about arg_type and reg_type, though.
Those aren't specific to an _individual_ helper (though yes, of course
arg_types are specific to whatever _sets_ of helpers take them, such as
e.g. dynptrs).

If we go with the approach of having individual arg types or sets of
helpers verify offsets, I think that still needs to be generalized so
that it's happening in a single place. This would involve something
like:

1. Having check_func_arg_reg_off() as it exists today be renamed to
check_func_reg_off(), and be solely responsible for checking reg_type.

2. Update check_func_arg_reg_off() to contain all the logic which does
actual arg type checking, possibly calling out to one of many possible
helper functions depending on the arg type.

My main issue is really just the fact that all of this logic is
scattered throughout the verifier.

> Agreed that, it's not perfect, with the odd case for PTR_TO_STACK having non-0
> reg->off for OBJ_RELEASE. But IMO once you realise it makes no sense to release
> PTR_TO_STACK and that PTR_TO_STACK actually points to the real pointer being
> released, it needs to be handled specially.
> 
> > [0]: https://lore.kernel.org/all/20221115000130.1967465-5-memxor@gmail.com/
> >
> > > This allows relying on a consistent set of guarantees after that call
> > > and then relying on them in code that deals with registers for each
> > > argument type later. This is in line with how process_spin_lock,
> > > process_timer_func, process_kptr_func check reg->var_off to be constant.
> > > The same reasoning is used here to move the alignment check into
> > > process_dynptr_func. Note that it also needs to check for constant
> > > var_off, and accumulate the constant var_off when computing the spi in
> > > get_spi, but that fix will come in later changes.
> > >
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  kernel/bpf/verifier.c | 13 ++++++++-----
> > >  1 file changed, 8 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index 34e67d04579b..fd292f762d53 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -5774,6 +5774,14 @@ int process_dynptr_func(struct bpf_verifier_env *env, int regno,
> > >  		return -EFAULT;
> > >  	}
> > >
> > > +	/* CONST_PTR_TO_DYNPTR already has fixed and var_off as 0 due to
> > > +	 * check_func_arg_reg_off's logic. We only need to check offset
> > > +	 * alignment for PTR_TO_STACK.
> > > +	 */
> > > +	if (reg->type == PTR_TO_STACK && (reg->off % BPF_REG_SIZE)) {
> > > +		verbose(env, "cannot pass in dynptr at an offset=%d\n", reg->off);
> > > +		return -EINVAL;
> > > +	}
> >
> > As alluded to above, I personally think it's more confusing to have this
> > check in process_dynptr_func(). On the one hand you have
> > check_func_arg_reg_off() which verifies that a register has the correct
> > offset, but then here we have to check for the register offset for
> > PTR_TO_STACK dynptrs specifically? Wouldn't it be better to try and push
> > as much of the offset-checking complexity into one place as possible?
> >
> 
> I'm trying to make a split between 'offset correct for the reg->type in
> general', and 'offset correct for the reg->type when when passed as an argument
> for arg_type'. I think the latter is specific and different for each case and
> thus belongs inside the case ARG_TYPE_* block of each of those.

If we're going to do that, IMO we should just remove the arg_type
parameter from the function altogether. Having specific arg_type logic
in the function feels artificial.

> Anyhow, all of this is not to reject your point, but to say that if we're
> keeping that check in check_func_arg_reg_off for dynptr, let's also examine why
> we should/shouldn't move checks for other arg_types inside it as well, and
> whether the end result is going to be better than this.

Agreed that this is the crux of the issue. I think I've said my piece so
I won't reply directly to this point here.

> In that case, atleast to me, it doesn't make sense to check reg->off %
> BPF_REG_SIZE for ARG_PTR_TO_DYNPTR while leaving out other arg types.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func
  2022-11-21  7:27       ` David Vernet
@ 2022-12-07 20:41         ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 28+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-12-07 20:41 UTC (permalink / raw)
  To: David Vernet
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Joanne Koong

On Mon, Nov 21, 2022 at 12:57:26PM IST, David Vernet wrote:
> On Mon, Nov 21, 2022 at 12:40:13AM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Fri, Nov 18, 2022 at 05:19:27AM IST, David Vernet wrote:
> > > On Tue, Nov 15, 2022 at 05:31:28AM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > After previous commit, we are minimizing helper specific assumptions
> > > > from check_func_arg_reg_off, making it generic, and offloading checks
> > > > for a specific argument type to their respective functions called after
> > > > check_func_arg_reg_off has been called.
> > >
> > > What's the point of check_func_arg_reg_off() if helpers have to check
> > > offsets after it's been called? Also, in [0], there's now logic in
> > > check_func_arg_reg_off() which checks for OBJ_RELEASE arg types, so
> > > there's still a precedent for looking at arg types there. IMO it's
> > > actually less confusing to just push as much offset checking as possible
> > > into one place.
> > >
> >
> > I think you need to define 'as much offset checking'.
>
> We certainly don't want to make check_func_arg_reg_off() a monster
> function, but I think we can do better in terms of making the verifier
> consistent and easier to reason about by pushing more logic into it.
>
> My view (subject to change upon learning new context I may be missing)
> is that the job of check_func_arg_reg_off() should be to map all
> (reg_type x arg_type) combinations to checking the correct offset,
> likely by calling __check_ptr_off_reg() with the correct arguments. The
> signature / implementation of __check_ptr_off_reg() may need to change
> for that to happen, and we may need to e.g. also leverage
> __check_mem_access() to accommodate the mem register types.
>
> Yes, doing this may cause check_func_arg_reg_off() to have a potentially
> very large switch statement (though there are other ways to address
> that), but isn't that preferable to having to read through hundreds or
> thousands of lines of verifier code to convince yourself that an offset
> was correctly determined for every possible (register x arg_type)?
> Having all of that contained in one, well-defined spot seems much
> simpler. Please let me know if I've grossly misunderstood something, or
> am missing important / relevant context.
>
> > Consider process_kptr_func, it requires var_off to be constant. Same for
>
> IMO that check should certainly go in check_func_arg_reg_off().
> __check_ptr_off_reg() already checks for tnum_is_const(reg->var_off) in
> __check_ptr_off_reg(), and check_func_arg_reg_off() has all the
> information it needs to encapsulate the logic for that check.
>

I think this will be a bigger change that should be probably go in on its own,
since you've expressed in the reply to patch 4 that you intend to discuss this
with Daniel, I'm respinning without this for now.

> > bpf_timer, bpf_spin_lock, bpf_list_head, bpf_list_node, etc. They take
> > PTR_TO_MAP_VALUE, PTR_TO_BTF_ID, PTR_TO_BTF_ID | MEM_ALLOC. Should we move all
> > of that into check_func_arg_reg_off?
> >
> > Some argument types like ARG_PTR_TO_MEM are ok with variable offset, should that
> > exception go in this function as well?
> >
> > Where do you draw the line here in terms of what that function does?
>
> Personally I think "drawing the line" is the wrong way to think about
> it. We need to decide what role the function plays, and generalize it in
> a way that's consistent and clear. IMO its role at a high level should
> be, "The verify arg / register offsets step in the verifier". If you
> break that encapsulation, it becomes much more difficult to build a
> consistent mental model of what the verifier is doing. Note that this
> applies to other verification steps as well, such as e.g. verifying
> types, verifying proper refcounts, etc. Perhaps I'm grossly naive for
> thinking this is possible? Please let me know if you think that's the
> case.
>
> Anyways, it might not be possible to aggregate all logic for checking
> reg->off into the function in the codebase as it exists today, but it
> seems like a desirable end-state, and it feels like a step backwards to
> start selectively moving reg->off checking out of
> check_func_arg_reg_off() and into arg / helper specific functions.
>
> > IMO, there are a certain set of properties that check_func_arg_reg_off provides,
> > you could say that if each register type was a class, then the checks there
> > would be what you would do while constructing them on calling:
> >
> > PtrToStack(off, var_off /* can be const or variable */)
> > PtrToMapValue(off, var_off /* can be const or variable */)
> > PtrToBtfId(off /* must be >= 0 */) /* no var_off */
>
> Hmmm, but these are all just reg_type, right? Why are we checking
> OBJ_RELEASE in check_func_arg_reg_off() if that's the case?
>

Probably for the same reason we handle meta->release_regno in a unified manner
for all helpers. Earlier OBJ_RELEASE was not an arg_type, but a list of func_ids
matched in a function, so in that sense it had nothing to do with the arg type
until then.

> > How they get used by each helper and what further checks each helper needs to do
> > on them based on the arg_type should be done at a later stage when the specific
> > argument type is processed.
>
> I definitely agree that there may be helper-specific verification that
> needs to be done. We're talking about arg_type and reg_type, though.
> Those aren't specific to an _individual_ helper (though yes, of course
> arg_types are specific to whatever _sets_ of helpers take them, such as
> e.g. dynptrs).
>
> If we go with the approach of having individual arg types or sets of
> helpers verify offsets, I think that still needs to be generalized so
> that it's happening in a single place. This would involve something
> like:
>
> 1. Having check_func_arg_reg_off() as it exists today be renamed to
> check_func_reg_off(), and be solely responsible for checking reg_type.
>
> 2. Update check_func_arg_reg_off() to contain all the logic which does
> actual arg type checking, possibly calling out to one of many possible
> helper functions depending on the arg type.
>
> My main issue is really just the fact that all of this logic is
> scattered throughout the verifier.
>

I think it's a difference of opinion. I guess it's fine to move all offset
related checks to this function, for every arg_type and every single case as
well, but then it should be part of a bigger change that it does for all of the
existing cases, not just limited to ARG_PTR_TO_{DYNPTR, KPTR, SPIN_LOCK, TIMER}
etc. It will be a bigger refactoring that rearranges and puts all those checks
in a single place, which I think should be done later as its own set.

Though you might end up duplicating some checks because the verification path
from insns won't call check_func_arg_reg_off, but those shared functions are
called from both insns and helpers side, so the checks would need to remain
there unless this refactored function is called for each insn (in which case it
would need to check based on insn type as well). So I might be wrong but there
won't be a lot of code reduction without more involved refactorings.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2022-12-07 20:41 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-11-15  0:01 [PATCH bpf-next v1 0/7] Dynptr refactorings Kumar Kartikeya Dwivedi
2022-11-15  0:01 ` [PATCH bpf-next v1 1/7] bpf: Refactor ARG_PTR_TO_DYNPTR checks into process_dynptr_func Kumar Kartikeya Dwivedi
2022-11-15  4:15   ` Joanne Koong
2022-11-15  0:01 ` [PATCH bpf-next v1 2/7] bpf: Propagate errors from process_* checks in check_func_arg Kumar Kartikeya Dwivedi
2022-11-15  3:53   ` Joanne Koong
2022-11-15  0:01 ` [PATCH bpf-next v1 3/7] bpf: Rework process_dynptr_func Kumar Kartikeya Dwivedi
2022-11-16 18:04   ` Joanne Koong
2022-11-17 21:11   ` David Vernet
2022-11-20 18:06     ` Kumar Kartikeya Dwivedi
2022-11-20 18:16       ` David Vernet
2022-11-15  0:01 ` [PATCH bpf-next v1 4/7] bpf: Rework check_func_arg_reg_off Kumar Kartikeya Dwivedi
2022-11-15 18:24   ` Joanne Koong
2022-11-17 23:42   ` David Vernet
2022-11-20 18:41     ` Kumar Kartikeya Dwivedi
2022-11-21  5:39       ` David Vernet
2022-11-15  0:01 ` [PATCH bpf-next v1 5/7] bpf: Move PTR_TO_STACK alignment check to process_dynptr_func Kumar Kartikeya Dwivedi
2022-11-15 18:29   ` Joanne Koong
2022-11-17 23:49   ` David Vernet
2022-11-20 19:10     ` Kumar Kartikeya Dwivedi
2022-11-20 19:40       ` Alexei Starovoitov
2022-11-20 21:02         ` Kumar Kartikeya Dwivedi
2022-11-21  7:27       ` David Vernet
2022-12-07 20:41         ` Kumar Kartikeya Dwivedi
2022-11-15  0:01 ` [PATCH bpf-next v1 6/7] bpf: Use memmove for bpf_dynptr_{read,write} Kumar Kartikeya Dwivedi
2022-11-17 23:51   ` David Vernet
2022-11-15  0:01 ` [PATCH bpf-next v1 7/7] selftests/bpf: Add test for dynptr reinit in user_ringbuf callback Kumar Kartikeya Dwivedi
2022-11-15 18:36   ` Joanne Koong
2022-11-15 19:41     ` Kumar Kartikeya Dwivedi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.