bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
@ 2022-03-06 23:43 Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
                   ` (5 more replies)
  0 siblings, 6 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
returning a packet pointer with a fixed immutable range. This can be useful to
enable DPA without having to use memcpy (currently the case in
bpf_xdp_load_bytes and bpf_xdp_store_bytes).

The intended usage to read and write data for multi-buff XDP is:

	int err = 0;
	char buf[N];

	off &= 0xffff;
	ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
	if (unlikely(!ptr)) {
		if (err < 0)
			return XDP_ABORTED;
		err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
		if (err < 0)
			return XDP_ABORTED;
		ptr = buf;
	}
	...
	// Do some stores and loads in [ptr, ptr + N) region
	...
	if (unlikely(ptr == buf)) {
		err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
		if (err < 0)
			return XDP_ABORTED;
	}

Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
it is also more meaningful to the user to see return value as R0=pkt.

This series is meant to collect feedback on the approach, next version can
include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
hooks, and explore not resetting range to zero on r0 += rX, instead check access
like check_mem_region_access (var_off + off < range), since there would be no
data_end to compare against and obtain a new range.

The common name and func_id is supposed to allow writing generic code using
bpf_packet_pointer that works for both XDP and TC programs.

Please see the individual patches for implementation details.

Kumar Kartikeya Dwivedi (5):
  bpf: Add ARG_SCALAR and ARG_CONSTANT
  bpf: Introduce pkt_uid concept for PTR_TO_PACKET
  bpf: Introduce bpf_packet_pointer helper to do DPA
  selftests/bpf: Add verifier tests for pkt pointer with pkt_uid
  selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer

 include/linux/bpf.h                           |   4 +
 include/linux/bpf_verifier.h                  |   9 +-
 include/uapi/linux/bpf.h                      |  12 ++
 kernel/bpf/verifier.c                         |  97 ++++++++++--
 net/core/filter.c                             |  48 +++---
 tools/include/uapi/linux/bpf.h                |  12 ++
 .../bpf/prog_tests/xdp_adjust_frags.c         |  46 ++++--
 .../bpf/progs/test_xdp_update_frags.c         |  46 ++++--
 tools/testing/selftests/bpf/verifier/xdp.c    | 146 ++++++++++++++++++
 9 files changed, 358 insertions(+), 62 deletions(-)

-- 
2.35.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
@ 2022-03-06 23:43 ` Kumar Kartikeya Dwivedi
  2022-03-07 19:28   ` Joanne Koong
  2022-03-08  5:42   ` Andrii Nakryiko
  2022-03-06 23:43 ` [PATCH bpf-next v1 2/5] bpf: Introduce pkt_uid concept for PTR_TO_PACKET Kumar Kartikeya Dwivedi
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

In the next patch, we will introduce a new helper 'bpf_packet_pointer'
that takes offset and len and returns a packet pointer. There we want to
statically enforce offset is in range [0, 0xffff], and that len is a
constant value, in range [1, 0xffff]. This also helps us avoid a
pointless runtime check. To make these checks possible, we need to
ensure we only get a scalar type. Although a lot of other argument types
take scalars, their intent is different. Hence add general ARG_SCALAR
and ARG_CONSTANT types, where the latter is also checked to be constant
in addition to being scalar.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h   |  2 ++
 kernel/bpf/verifier.c | 13 +++++++++++++
 2 files changed, 15 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 88449fbbe063..7841d90b83df 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -391,6 +391,8 @@ enum bpf_arg_type {
 	ARG_PTR_TO_STACK,	/* pointer to stack */
 	ARG_PTR_TO_CONST_STR,	/* pointer to a null terminated read-only string */
 	ARG_PTR_TO_TIMER,	/* pointer to bpf_timer */
+	ARG_SCALAR,		/* a scalar with any value(s) */
+	ARG_CONSTANT,		/* a scalar with constant value */
 	__BPF_ARG_TYPE_MAX,
 
 	/* Extended arg_types. */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ec3a7b6c9515..0373d5bd240f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
 	       type == ARG_PTR_TO_LONG;
 }
 
+static bool arg_type_is_scalar(enum bpf_arg_type type)
+{
+	return type == ARG_SCALAR ||
+	       type == ARG_CONSTANT;
+}
+
 static int int_ptr_type_to_size(enum bpf_arg_type type)
 {
 	if (type == ARG_PTR_TO_INT)
@@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
 	[ARG_PTR_TO_STACK]		= &stack_ptr_types,
 	[ARG_PTR_TO_CONST_STR]		= &const_str_ptr_types,
 	[ARG_PTR_TO_TIMER]		= &timer_types,
+	[ARG_SCALAR]			= &scalar_types,
+	[ARG_CONSTANT]			= &scalar_types,
 };
 
 static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
@@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			verbose(env, "string is not zero-terminated\n");
 			return -EINVAL;
 		}
+	} else if (arg_type_is_scalar(arg_type)) {
+		if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
+			verbose(env, "R%d is not a known constant\n", regno);
+			return -EACCES;
+		}
 	}
 
 	return err;
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v1 2/5] bpf: Introduce pkt_uid concept for PTR_TO_PACKET
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
@ 2022-03-06 23:43 ` Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 3/5] bpf: Introduce bpf_packet_pointer helper to do DPA Kumar Kartikeya Dwivedi
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

Add a new member in PTR_TO_PACKET specific register state, namely
pkt_uid. This is used to classify packet pointers into different sets,
and the invariant is that any pkt pointers not belonging to the same
set, i.e. not sharing same pkt_uid, won't be allowed for comparison with
each other. During range propagation in __find_good_pkt_pointers, we now
need to take care to skip packet pointers with a different pkt_uid.

This can be used to set for a packet pointer returned from a helper
'bpf_packet_pointer' in the next patch, that encodes the range from the
len parameter it gets. Generating a distinct pkt_uid means this pointer
cannot be compared with other packet pointers and its range cannot be
manipulated.

Note that for helpers which change underlying packet data, we don't make
any distinction based on pkt_uid for clear_all_pkt_pointers, since even
though the pkt_uid is different, they all point into ctx.

regsafe is updated to match non-zero pkt_uid using the idmap to ensure
it rejects distinct pkt_uid pkt pointers.

We also replace memset of reg->raw to set range to 0, but it is helpful
to elaborate on why replacing it with reg->range = 0 is correct. In
commit 0962590e5533 ("bpf: fix partial copy of map_ptr when dst is scalar"),
the copying was changed to use raw so that all possible members of type
specific register state are copied, since at that point the type of
register is not known. But inside the reg_is_pkt_pointer block, there is
no need to memset the whole 'raw' struct, since we also have a pkt_uid
member that we now want to preserve after copying from one register to
another, for pkt pointers. A test for this case has been included to
prevent regressions.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf_verifier.h |  9 ++++++-
 kernel/bpf/verifier.c        | 47 ++++++++++++++++++++++++++++--------
 2 files changed, 45 insertions(+), 11 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index c1fc4af47f69..0379f953cf22 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -50,7 +50,14 @@ struct bpf_reg_state {
 	s32 off;
 	union {
 		/* valid when type == PTR_TO_PACKET */
-		int range;
+		struct {
+			int range;
+			/* This is used to tag some PTR_TO_PACKET so that they
+			 * cannot be compared existing PTR_TO_PACKET with
+			 * different pkt_uid.
+			 */
+			u32 pkt_uid;
+		};
 
 		/* valid when type == CONST_PTR_TO_MAP | PTR_TO_MAP_VALUE |
 		 *   PTR_TO_MAP_VALUE_OR_NULL
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0373d5bd240f..88ac2c833bed 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -712,8 +712,14 @@ static void print_verifier_state(struct bpf_verifier_env *env,
 				verbose_a("ref_obj_id=%d", reg->ref_obj_id);
 			if (t != SCALAR_VALUE)
 				verbose_a("off=%d", reg->off);
-			if (type_is_pkt_pointer(t))
+			if (type_is_pkt_pointer(t)) {
 				verbose_a("r=%d", reg->range);
+				/* pkt_uid is only set for PTR_TO_PACKET, so
+				 * type_is_pkt_pointer check is enough.
+				 */
+				if (reg->pkt_uid)
+					verbose_a("pkt_uid=%d", reg->pkt_uid);
+			}
 			else if (base_type(t) == CONST_PTR_TO_MAP ||
 				 base_type(t) == PTR_TO_MAP_KEY ||
 				 base_type(t) == PTR_TO_MAP_VALUE)
@@ -7604,7 +7610,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 		if (reg_is_pkt_pointer(ptr_reg)) {
 			dst_reg->id = ++env->id_gen;
 			/* something was added to pkt_ptr, set range to zero */
-			memset(&dst_reg->raw, 0, sizeof(dst_reg->raw));
+			dst_reg->range = 0;
 		}
 		break;
 	case BPF_SUB:
@@ -7664,7 +7670,7 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 			dst_reg->id = ++env->id_gen;
 			/* something was added to pkt_ptr, set range to zero */
 			if (smin_val < 0)
-				memset(&dst_reg->raw, 0, sizeof(dst_reg->raw));
+				dst_reg->range = 0;
 		}
 		break;
 	case BPF_AND:
@@ -8701,7 +8707,8 @@ static void __find_good_pkt_pointers(struct bpf_func_state *state,
 
 	for (i = 0; i < MAX_BPF_REG; i++) {
 		reg = &state->regs[i];
-		if (reg->type == type && reg->id == dst_reg->id)
+		if (reg->type == type && reg->id == dst_reg->id &&
+		    reg->pkt_uid == dst_reg->pkt_uid)
 			/* keep the maximum range already checked */
 			reg->range = max(reg->range, new_range);
 	}
@@ -8709,7 +8716,8 @@ static void __find_good_pkt_pointers(struct bpf_func_state *state,
 	bpf_for_each_spilled_reg(i, state, reg) {
 		if (!reg)
 			continue;
-		if (reg->type == type && reg->id == dst_reg->id)
+		if (reg->type == type && reg->id == dst_reg->id &&
+		    reg->pkt_uid == dst_reg->pkt_uid)
 			reg->range = max(reg->range, new_range);
 	}
 }
@@ -9330,6 +9338,14 @@ static void mark_ptr_or_null_regs(struct bpf_verifier_state *vstate, u32 regno,
 		__mark_ptr_or_null_regs(vstate->frame[i], id, is_null);
 }
 
+static bool is_bad_pkt_comparison(const struct bpf_reg_state *dst_reg,
+				  const struct bpf_reg_state *src_reg)
+{
+	if (!reg_is_pkt_pointer_any(dst_reg) || !reg_is_pkt_pointer_any(src_reg))
+		return false;
+	return dst_reg->pkt_uid != src_reg->pkt_uid;
+}
+
 static bool try_match_pkt_pointers(const struct bpf_insn *insn,
 				   struct bpf_reg_state *dst_reg,
 				   struct bpf_reg_state *src_reg,
@@ -9343,6 +9359,9 @@ static bool try_match_pkt_pointers(const struct bpf_insn *insn,
 	if (BPF_CLASS(insn->code) == BPF_JMP32)
 		return false;
 
+	if (is_bad_pkt_comparison(dst_reg, src_reg))
+		return false;
+
 	switch (BPF_OP(insn->code)) {
 	case BPF_JGT:
 		if ((dst_reg->type == PTR_TO_PACKET &&
@@ -9640,11 +9659,17 @@ static int check_cond_jmp_op(struct bpf_verifier_env *env,
 		mark_ptr_or_null_regs(other_branch, insn->dst_reg,
 				      opcode == BPF_JEQ);
 	} else if (!try_match_pkt_pointers(insn, dst_reg, &regs[insn->src_reg],
-					   this_branch, other_branch) &&
-		   is_pointer_value(env, insn->dst_reg)) {
-		verbose(env, "R%d pointer comparison prohibited\n",
-			insn->dst_reg);
-		return -EACCES;
+					   this_branch, other_branch)) {
+		if (is_pointer_value(env, insn->dst_reg)) {
+			verbose(env, "R%d pointer comparison prohibited\n",
+				insn->dst_reg);
+			return -EACCES;
+		}
+		if (is_bad_pkt_comparison(dst_reg, &regs[insn->src_reg])) {
+			verbose(env, "R%d, R%d pkt pointer comparison prohibited\n",
+				insn->dst_reg, insn->src_reg);
+			return -EACCES;
+		}
 	}
 	if (env->log.level & BPF_LOG_LEVEL)
 		print_insn_state(env, this_branch->frame[this_branch->curframe]);
@@ -10891,6 +10916,8 @@ static bool regsafe(struct bpf_verifier_env *env, struct bpf_reg_state *rold,
 		/* id relations must be preserved */
 		if (rold->id && !check_ids(rold->id, rcur->id, idmap))
 			return false;
+		if (rold->pkt_uid && !check_ids(rold->pkt_uid, rcur->pkt_uid, idmap))
+			return false;
 		/* new val must satisfy old val knowledge */
 		return range_within(rold, rcur) &&
 		       tnum_in(rold->var_off, rcur->var_off);
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v1 3/5] bpf: Introduce bpf_packet_pointer helper to do DPA
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 2/5] bpf: Introduce pkt_uid concept for PTR_TO_PACKET Kumar Kartikeya Dwivedi
@ 2022-03-06 23:43 ` Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 4/5] selftests/bpf: Add verifier tests for pkt pointer with pkt_uid Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

Introduce a new helper 'bpf_packet_pointer', that returns a packet
pointer to a linear area in a possibly multi-buffer XDP buff. Earlier,
user had to use bpf_xdp_load_bytes and bpf_xdp_store_bytes to read from
and write to multi-bufer XDP buff, but this led to a memcpy for an ideal
case (where we detect a linear area in the initial frame or frags).
Instead, we can expose the bpf_packet_pointer function, and return a
packet pointer with a fixed range, so that user can do direct packet
access in the contiguous region.

The name bpf_packet_pointer is chosen so this helper can also be
implemented for TC programs in the future, using skb as ctx.

The helper either returns the pointer to linear contiguous area, or NULL
if it fails to find one. In that case, user can resort to the existing
helpers to do access across frame or frag boundaries. The case of offset
+ len > xdp_get_buff_len is still rejected, but the user can already
check for that beforehand so the error code is dropped for it, and NULL
is returned.

We use the support for ARG_SCALAR, ARG_CONSTANT, and pkt_uid for
PTR_TO_PACKET in this commit. First, it is enforced that offset is only
in range [0, 0xffff], and that len is a constant, with value in range
[1, 0xffff]. Then, we introduce ret_pkt_len member in bpf_call_arg_meta
to remember the length to set for the returned packet pointer. A fresh
ID is assigned to pkt_uid on each call, so that comparisons of these
PTR_TO_PACKET is rejected with existing packet pointers obtained from
ctx or other calls to bpf_packet_pointer, to prevent range manipulation.
The existing bpf_xdp_load_bytes/bpf_xdp_store_bytes now do a call to
bpf_xdp_copy_buf directly. The intended usage is that user first calls
bpf_packet_pointer, and on receiving NULL from the call, invokes these
'slow path' helpers that handle the access across head/frag boundary.

Note that the reason we choose PTR_TO_PACKET as the return value, and
not PTR_TO_MEM with a fixed mem_size, is because these pointers need
to be invalided (by clear_all_pkt_pointers) when a helper that changes
packet is invoked. Instead of special casing PTR_TO_MEM for that
purpose, it is better to adjust PTR_TO_PACKET to work for this mode with
minimal additions on the verifier side (from previous commit). Also, the
verifier errors related to bad access mention pkt pointer and not
pointer to memory, which is more meaningful to the BPF programmer.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h            |  2 ++
 include/uapi/linux/bpf.h       | 12 +++++++++
 kernel/bpf/verifier.c          | 37 ++++++++++++++++++++++++++
 net/core/filter.c              | 48 +++++++++++++++++-----------------
 tools/include/uapi/linux/bpf.h | 12 +++++++++
 5 files changed, 87 insertions(+), 24 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 7841d90b83df..981e87c64e47 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -421,6 +421,7 @@ enum bpf_return_type {
 	RET_PTR_TO_ALLOC_MEM,		/* returns a pointer to dynamically allocated memory */
 	RET_PTR_TO_MEM_OR_BTF_ID,	/* returns a pointer to a valid memory or a btf_id */
 	RET_PTR_TO_BTF_ID,		/* returns a pointer to a btf_id */
+	RET_PTR_TO_PACKET,		/* returns a pointer to a packet */
 	__BPF_RET_TYPE_MAX,
 
 	/* Extended ret_types. */
@@ -430,6 +431,7 @@ enum bpf_return_type {
 	RET_PTR_TO_SOCK_COMMON_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_SOCK_COMMON,
 	RET_PTR_TO_ALLOC_MEM_OR_NULL	= PTR_MAYBE_NULL | MEM_ALLOC | RET_PTR_TO_ALLOC_MEM,
 	RET_PTR_TO_BTF_ID_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_BTF_ID,
+	RET_PTR_TO_PACKET_OR_NULL	= PTR_MAYBE_NULL | RET_PTR_TO_PACKET,
 
 	/* This must be the last entry. Its purpose is to ensure the enum is
 	 * wide enough to hold the higher bits reserved for bpf_type_flag.
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 4eebea830613..3736cfbb325e 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5117,6 +5117,17 @@ union bpf_attr {
  *		0 on success.
  *		**-EINVAL** for invalid input
  *		**-EOPNOTSUPP** for unsupported delivery_time_type and protocol
+ *
+ * void *bpf_packet_pointer(void *ctx, u32 offset, u32 len)
+ *	Description
+ *		Return a pointer to linear area in packet at *offset* of length
+ *		*len*. The returned packet pointer cannot be compared to any
+ *		other packet pointers.
+ *
+ *		This helper is only available to XDP programs.
+ *	Return
+ *		Pointer to packet on success that can be accessed for *len*
+ *		bytes, or NULL when it fails.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5312,6 +5323,7 @@ union bpf_attr {
 	FN(xdp_store_bytes),		\
 	FN(copy_from_user_task),	\
 	FN(skb_set_delivery_time),      \
+	FN(packet_pointer),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 88ac2c833bed..e6e494e07f4c 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -257,6 +257,7 @@ struct bpf_call_arg_meta {
 	struct btf *ret_btf;
 	u32 ret_btf_id;
 	u32 subprogno;
+	int ret_pkt_len;
 };
 
 struct btf *btf_vmlinux;
@@ -5654,6 +5655,32 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			verbose(env, "R%d is not a known constant\n", regno);
 			return -EACCES;
 		}
+
+		if (meta->func_id == BPF_FUNC_packet_pointer) {
+			struct tnum range;
+
+			switch (arg + 1) {
+			case 2:
+				/* arg2 = offset, enforce that the range is [0, 0xffff] */
+				range = tnum_range(0, 0xffff);
+				if (!tnum_in(range, reg->var_off)) {
+					verbose(env, "R%d must be in range [0, 0xffff]\n", regno);
+					return -EINVAL;
+				}
+				break;
+			case 3:
+				/* arg3 = len, already checked to be constant */
+				if (!reg->var_off.value || reg->var_off.value > 0xffff) {
+					verbose(env, "R%d must be in range [1, 0xffff]\n", regno);
+					return -EINVAL;
+				}
+				meta->ret_pkt_len = reg->var_off.value;
+				break;
+			default:
+				verbose(env, "verifier internal error: bpf_xdp_pointer unknown arg\n");
+				return -EFAULT;
+			}
+		}
 	}
 
 	return err;
@@ -6873,6 +6900,16 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 		 */
 		regs[BPF_REG_0].btf = btf_vmlinux;
 		regs[BPF_REG_0].btf_id = ret_btf_id;
+	} else if (base_type(ret_type) == RET_PTR_TO_PACKET) {
+		mark_reg_known_zero(env, regs, BPF_REG_0);
+		regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag;
+		regs[BPF_REG_0].pkt_uid = ++env->id_gen;
+		if (!meta.ret_pkt_len) {
+			verbose(env, "verifier internal error: ret_pkt_len unset\n");
+			return -EFAULT;
+		}
+		/* Already checked to be in range [1, 0xffff] */
+		regs[BPF_REG_0].range = meta.ret_pkt_len;
 	} else {
 		verbose(env, "unknown return type %u of func %s#%d\n",
 			base_type(ret_type), func_id_name(func_id), func_id);
diff --git a/net/core/filter.c b/net/core/filter.c
index 88767f7da150..4fc19b9e64c7 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3889,18 +3889,15 @@ static void bpf_xdp_copy_buf(struct xdp_buff *xdp, unsigned long off,
 	}
 }
 
-static void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len)
+BPF_CALL_3(bpf_xdp_pointer, struct xdp_buff *, xdp, u32, offset, u32, len)
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_buff(xdp);
 	u32 size = xdp->data_end - xdp->data;
 	void *addr = xdp->data;
 	int i;
 
-	if (unlikely(offset > 0xffff || len > 0xffff))
-		return ERR_PTR(-EFAULT);
-
 	if (offset + len > xdp_get_buff_len(xdp))
-		return ERR_PTR(-EINVAL);
+		return (unsigned long)NULL;
 
 	if (offset < size) /* linear area */
 		goto out;
@@ -3917,23 +3914,28 @@ static void *bpf_xdp_pointer(struct xdp_buff *xdp, u32 offset, u32 len)
 		offset -= frag_size;
 	}
 out:
-	return offset + len < size ? addr + offset : NULL;
+	return offset + len < size ? (unsigned long)addr + offset : (unsigned long)NULL;
 }
 
+static const struct bpf_func_proto bpf_xdp_pointer_proto = {
+	.func		= bpf_xdp_pointer,
+	.gpl_only	= false,
+	.ret_type	= RET_PTR_TO_PACKET_OR_NULL,
+	.arg1_type	= ARG_PTR_TO_CTX,
+	.arg2_type	= ARG_SCALAR,
+	.arg3_type	= ARG_CONSTANT,
+};
+
 BPF_CALL_4(bpf_xdp_load_bytes, struct xdp_buff *, xdp, u32, offset,
 	   void *, buf, u32, len)
 {
-	void *ptr;
-
-	ptr = bpf_xdp_pointer(xdp, offset, len);
-	if (IS_ERR(ptr))
-		return PTR_ERR(ptr);
+	if (unlikely(offset > 0xffff || len > 0xffff))
+		return -EFAULT;
 
-	if (!ptr)
-		bpf_xdp_copy_buf(xdp, offset, buf, len, false);
-	else
-		memcpy(buf, ptr, len);
+	if (offset + len > xdp_get_buff_len(xdp))
+		return -EINVAL;
 
+	bpf_xdp_copy_buf(xdp, offset, buf, len, false);
 	return 0;
 }
 
@@ -3950,17 +3952,13 @@ static const struct bpf_func_proto bpf_xdp_load_bytes_proto = {
 BPF_CALL_4(bpf_xdp_store_bytes, struct xdp_buff *, xdp, u32, offset,
 	   void *, buf, u32, len)
 {
-	void *ptr;
-
-	ptr = bpf_xdp_pointer(xdp, offset, len);
-	if (IS_ERR(ptr))
-		return PTR_ERR(ptr);
+	if (unlikely(offset > 0xffff || len > 0xffff))
+		return -EFAULT;
 
-	if (!ptr)
-		bpf_xdp_copy_buf(xdp, offset, buf, len, true);
-	else
-		memcpy(ptr, buf, len);
+	if (offset + len > xdp_get_buff_len(xdp))
+		return -EINVAL;
 
+	bpf_xdp_copy_buf(xdp, offset, buf, len, true);
 	return 0;
 }
 
@@ -7820,6 +7818,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_xdp_load_bytes_proto;
 	case BPF_FUNC_xdp_store_bytes:
 		return &bpf_xdp_store_bytes_proto;
+	case BPF_FUNC_packet_pointer:
+		return &bpf_xdp_pointer_proto;
 	case BPF_FUNC_fib_lookup:
 		return &bpf_xdp_fib_lookup_proto;
 	case BPF_FUNC_check_mtu:
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 4eebea830613..3736cfbb325e 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5117,6 +5117,17 @@ union bpf_attr {
  *		0 on success.
  *		**-EINVAL** for invalid input
  *		**-EOPNOTSUPP** for unsupported delivery_time_type and protocol
+ *
+ * void *bpf_packet_pointer(void *ctx, u32 offset, u32 len)
+ *	Description
+ *		Return a pointer to linear area in packet at *offset* of length
+ *		*len*. The returned packet pointer cannot be compared to any
+ *		other packet pointers.
+ *
+ *		This helper is only available to XDP programs.
+ *	Return
+ *		Pointer to packet on success that can be accessed for *len*
+ *		bytes, or NULL when it fails.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -5312,6 +5323,7 @@ union bpf_attr {
 	FN(xdp_store_bytes),		\
 	FN(copy_from_user_task),	\
 	FN(skb_set_delivery_time),      \
+	FN(packet_pointer),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v1 4/5] selftests/bpf: Add verifier tests for pkt pointer with pkt_uid
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2022-03-06 23:43 ` [PATCH bpf-next v1 3/5] bpf: Introduce bpf_packet_pointer helper to do DPA Kumar Kartikeya Dwivedi
@ 2022-03-06 23:43 ` Kumar Kartikeya Dwivedi
  2022-03-06 23:43 ` [PATCH bpf-next v1 5/5] selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer Kumar Kartikeya Dwivedi
  2022-03-08  5:48 ` [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Andrii Nakryiko
  5 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

Use bpf_packet_pointer to obtain such pkt pointers, and verify various
behaviors, like find_good_pkt_pointers skipping pkt pointers with
unequal pkt_uid, ensuring that offset, len to bpf_packet_pointer are
within limits imposed statically by verifier, rejecting comparion of pkt
pointer with pkt_uid against unequal pkt_uid, ensuring
clear_all_pkt_pointers doens't skip pkt_uid pkts.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/verifier/xdp.c | 146 +++++++++++++++++++++
 1 file changed, 146 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/xdp.c b/tools/testing/selftests/bpf/verifier/xdp.c
index 5ac390508139..580b294cde11 100644
--- a/tools/testing/selftests/bpf/verifier/xdp.c
+++ b/tools/testing/selftests/bpf/verifier/xdp.c
@@ -12,3 +12,149 @@
 	.prog_type = BPF_PROG_TYPE_XDP,
 	.retval = 1,
 },
+{
+	"XDP bpf_packet_pointer offset cannot be > 0xffff",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_2, 0x10000),
+	BPF_MOV64_IMM(BPF_REG_3, 42),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R2 must be in range [0, 0xffff]",
+},
+{
+	"XDP bpf_packet_pointer len must be constant",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct xdp_md, ingress_ifindex)),
+	BPF_JMP32_IMM(BPF_JSGE, BPF_REG_2, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_JMP32_IMM(BPF_JSLE, BPF_REG_2, 0xffff, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_3, BPF_REG_2),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R3 is not a known constant",
+},
+{
+	"XDP bpf_packet_pointer len cannot be 0",
+	.insns = {
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1, offsetof(struct xdp_md, ingress_ifindex)),
+	BPF_JMP32_IMM(BPF_JSGE, BPF_REG_2, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_JMP32_IMM(BPF_JSLE, BPF_REG_2, 0xffff, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_IMM(BPF_REG_3, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R3 must be in range [1, 0xffff]",
+},
+{
+	"XDP bpf_packet_pointer R0 cannot be compared with xdp_md pkt ptr",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_MOV64_IMM(BPF_REG_3, 42),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct xdp_md, data_end)),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, 16),
+	BPF_JMP_REG(BPF_JGE, BPF_REG_0, BPF_REG_1, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R0, R1 pkt pointer comparison prohibited",
+},
+{
+	"XDP bpf_packet_pointer R0 range propagation skips unequal pkt_uid",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_MOV64_IMM(BPF_REG_3, 1),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct xdp_md, data)),
+	BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_6, offsetof(struct xdp_md, data)),
+	BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_6, offsetof(struct xdp_md, data)),
+	BPF_LDX_MEM(BPF_W, BPF_REG_4, BPF_REG_6, offsetof(struct xdp_md, data_end)),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 16),
+	BPF_JMP_REG(BPF_JLT, BPF_REG_1, BPF_REG_4, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, -16),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_2, 4),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_3, 8),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "invalid access to packet, off=0 size=8, R0(id=0,off=0,r=1)",
+},
+{
+	"XDP clear_all_pkt_pointers doesn't skip pkt_uid != 0",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_MOV64_IMM(BPF_REG_3, 16),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_7, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_MOV64_IMM(BPF_REG_2, 1),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_xdp_adjust_tail),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_7, BPF_REG_7, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R7 invalid mem access 'scalar'",
+},
+{
+	"XDP pkt_uid preserved when resetting range on rX += var",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
+	BPF_MOV64_IMM(BPF_REG_2, 0),
+	BPF_MOV64_IMM(BPF_REG_3, 16),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_packet_pointer),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_6, offsetof(struct xdp_md, ingress_ifindex)),
+	BPF_JMP32_IMM(BPF_JGE, BPF_REG_1, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_JMP32_IMM(BPF_JLE, BPF_REG_1, 4, 1),
+	BPF_EXIT_INSN(),
+	BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_0),
+	BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct xdp_md, data_end)),
+	BPF_JMP_REG(BPF_JLT, BPF_REG_1, BPF_REG_0, 1),
+	BPF_EXIT_INSN(),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_XDP,
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R1, R0 pkt pointer comparison prohibited",
+},
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH bpf-next v1 5/5] selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
                   ` (3 preceding siblings ...)
  2022-03-06 23:43 ` [PATCH bpf-next v1 4/5] selftests/bpf: Add verifier tests for pkt pointer with pkt_uid Kumar Kartikeya Dwivedi
@ 2022-03-06 23:43 ` Kumar Kartikeya Dwivedi
  2022-03-08  5:48 ` [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Andrii Nakryiko
  5 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-06 23:43 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

Test that in case of linear region, we always are able to do DPA without
any errors. Note how offset is clamped to range [0, 0xffff] and len is a
constant. Ensure that helper vs DPA is detected and tested.

Add a force_helper mode, that forces use of bpf_xdp_load_bytes and
bpf_xdp_store_bytes instead of using bpf_packet_pointer, even for
contiguous regions, to make sure that case keeps working.

Also, we can take this opportunity to convert it to use BPF skeleton.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../bpf/prog_tests/xdp_adjust_frags.c         | 46 +++++++++++++------
 .../bpf/progs/test_xdp_update_frags.c         | 46 +++++++++++++------
 2 files changed, 65 insertions(+), 27 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_frags.c b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_frags.c
index 2f033da4cd45..cfb50a575b11 100644
--- a/tools/testing/selftests/bpf/prog_tests/xdp_adjust_frags.c
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_adjust_frags.c
@@ -2,26 +2,24 @@
 #include <test_progs.h>
 #include <network_helpers.h>
 
-static void test_xdp_update_frags(void)
+#include "test_xdp_update_frags.skel.h"
+
+static void test_xdp_update_frags(bool force_helper)
 {
-	const char *file = "./test_xdp_update_frags.o";
 	int err, prog_fd, max_skb_frags, buf_size, num;
-	struct bpf_program *prog;
-	struct bpf_object *obj;
+	LIBBPF_OPTS(bpf_test_run_opts, topts);
+	struct test_xdp_update_frags *skel;
 	__u32 *offset;
 	__u8 *buf;
 	FILE *f;
-	LIBBPF_OPTS(bpf_test_run_opts, topts);
 
-	obj = bpf_object__open(file);
-	if (libbpf_get_error(obj))
+	skel = test_xdp_update_frags__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "test_xdp_update_frags__open_and_load"))
 		return;
 
-	prog = bpf_object__next_program(obj, NULL);
-	if (bpf_object__load(obj))
-		return;
+	skel->bss->force_helper = force_helper;
 
-	prog_fd = bpf_program__fd(prog);
+	prog_fd = bpf_program__fd(skel->progs.xdp_adjust_frags);
 
 	buf = malloc(128);
 	if (!ASSERT_OK_PTR(buf, "alloc buf 128b"))
@@ -45,6 +43,13 @@ static void test_xdp_update_frags(void)
 	ASSERT_EQ(topts.retval, XDP_PASS, "xdp_update_frag retval");
 	ASSERT_EQ(buf[16], 0xbb, "xdp_update_frag buf[16]");
 	ASSERT_EQ(buf[31], 0xbb, "xdp_update_frag buf[31]");
+	if (force_helper) {
+		ASSERT_EQ(skel->bss->used_dpa, false, "did not use DPA");
+		ASSERT_EQ(skel->bss->used_helper, true, "used helper");
+	} else {
+		ASSERT_EQ(skel->bss->used_dpa, true, "used DPA");
+		ASSERT_EQ(skel->bss->used_helper, false, "did not use helper");
+	}
 
 	free(buf);
 
@@ -70,6 +75,13 @@ static void test_xdp_update_frags(void)
 	ASSERT_EQ(topts.retval, XDP_PASS, "xdp_update_frag retval");
 	ASSERT_EQ(buf[5000], 0xbb, "xdp_update_frag buf[5000]");
 	ASSERT_EQ(buf[5015], 0xbb, "xdp_update_frag buf[5015]");
+	if (force_helper) {
+		ASSERT_EQ(skel->bss->used_dpa, false, "did not use DPA");
+		ASSERT_EQ(skel->bss->used_helper, true, "used helper");
+	} else {
+		ASSERT_EQ(skel->bss->used_dpa, true, "used DPA");
+		ASSERT_EQ(skel->bss->used_helper, false, "did not use helper");
+	}
 
 	memset(buf, 0, 9000);
 	offset = (__u32 *)buf;
@@ -84,6 +96,8 @@ static void test_xdp_update_frags(void)
 	ASSERT_EQ(topts.retval, XDP_PASS, "xdp_update_frag retval");
 	ASSERT_EQ(buf[3510], 0xbb, "xdp_update_frag buf[3510]");
 	ASSERT_EQ(buf[3525], 0xbb, "xdp_update_frag buf[3525]");
+	ASSERT_EQ(skel->bss->used_dpa, false, "did not use DPA");
+	ASSERT_EQ(skel->bss->used_helper, true, "used helper");
 
 	memset(buf, 0, 9000);
 	offset = (__u32 *)buf;
@@ -98,6 +112,8 @@ static void test_xdp_update_frags(void)
 	ASSERT_EQ(topts.retval, XDP_PASS, "xdp_update_frag retval");
 	ASSERT_EQ(buf[7606], 0xbb, "xdp_update_frag buf[7606]");
 	ASSERT_EQ(buf[7621], 0xbb, "xdp_update_frag buf[7621]");
+	ASSERT_EQ(skel->bss->used_dpa, false, "did not use DPA");
+	ASSERT_EQ(skel->bss->used_helper, true, "used helper");
 
 	free(buf);
 
@@ -136,11 +152,13 @@ static void test_xdp_update_frags(void)
 		  "unsupported buf size, possible non-default /proc/sys/net/core/max_skb_flags?");
 	free(buf);
 out:
-	bpf_object__close(obj);
+	test_xdp_update_frags__destroy(skel);
 }
 
 void test_xdp_adjust_frags(void)
 {
-	if (test__start_subtest("xdp_adjust_frags"))
-		test_xdp_update_frags();
+	if (test__start_subtest("xdp_adjust_frags-force-nodpa"))
+		test_xdp_update_frags(true);
+	if (test__start_subtest("xdp_adjust_frags-dpa+memcpy"))
+		test_xdp_update_frags(false);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c b/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
index 2a3496d8e327..1ad5c45e06e0 100644
--- a/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
+++ b/tools/testing/selftests/bpf/progs/test_xdp_update_frags.c
@@ -4,37 +4,57 @@
  * modify it under the terms of version 2 of the GNU General Public
  * License as published by the Free Software Foundation.
  */
-#include <linux/bpf.h>
-#include <linux/if_ether.h>
+#include <vmlinux.h>
 #include <bpf/bpf_helpers.h>
 
 int _version SEC("version") = 1;
 
+bool force_helper;
+bool used_dpa;
+bool used_helper;
+
+#define XDP_LEN 16
+
 SEC("xdp.frags")
 int xdp_adjust_frags(struct xdp_md *xdp)
 {
 	__u8 *data_end = (void *)(long)xdp->data_end;
 	__u8 *data = (void *)(long)xdp->data;
-	__u8 val[16] = {};
+	__u8 val[XDP_LEN] = {};
+	__u8 *ptr = NULL;
 	__u32 offset;
 	int err;
 
+	used_dpa = false;
+	used_helper = false;
+
 	if (data + sizeof(__u32) > data_end)
 		return XDP_DROP;
 
 	offset = *(__u32 *)data;
-	err = bpf_xdp_load_bytes(xdp, offset, val, sizeof(val));
-	if (err < 0)
+	offset &= 0xffff;
+	if (!force_helper)
+		ptr = bpf_packet_pointer(xdp, offset, XDP_LEN);
+	if (!ptr) {
+		used_helper = true;
+		err = bpf_xdp_load_bytes(xdp, offset, val, sizeof(val));
+		if (err < 0)
+			return XDP_DROP;
+		ptr = val;
+	} else {
+		used_dpa = true;
+	}
+
+	if (ptr[0] != 0xaa || ptr[15] != 0xaa) /* marker */
 		return XDP_DROP;
 
-	if (val[0] != 0xaa || val[15] != 0xaa) /* marker */
-		return XDP_DROP;
-
-	val[0] = 0xbb; /* update the marker */
-	val[15] = 0xbb;
-	err = bpf_xdp_store_bytes(xdp, offset, val, sizeof(val));
-	if (err < 0)
-		return XDP_DROP;
+	ptr[0] = 0xbb; /* update the marker */
+	ptr[15] = 0xbb;
+	if (ptr == val) {
+		err = bpf_xdp_store_bytes(xdp, offset, val, sizeof(val));
+		if (err < 0)
+			return XDP_DROP;
+	}
 
 	return XDP_PASS;
 }
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
@ 2022-03-07 19:28   ` Joanne Koong
  2022-03-07 21:22     ` Kumar Kartikeya Dwivedi
  2022-03-08  5:42   ` Andrii Nakryiko
  1 sibling, 1 reply; 24+ messages in thread
From: Joanne Koong @ 2022-03-07 19:28 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

On Mon, Mar 7, 2022 at 1:24 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> that takes offset and len and returns a packet pointer. There we want to
> statically enforce offset is in range [0, 0xffff], and that len is a
> constant value, in range [1, 0xffff]. This also helps us avoid a
> pointless runtime check. To make these checks possible, we need to
> ensure we only get a scalar type. Although a lot of other argument types
> take scalars, their intent is different. Hence add general ARG_SCALAR
> and ARG_CONSTANT types, where the latter is also checked to be constant
> in addition to being scalar.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf.h   |  2 ++
>  kernel/bpf/verifier.c | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 88449fbbe063..7841d90b83df 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -391,6 +391,8 @@ enum bpf_arg_type {
>         ARG_PTR_TO_STACK,       /* pointer to stack */
>         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
>         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> +       ARG_SCALAR,             /* a scalar with any value(s) */
> +       ARG_CONSTANT,           /* a scalar with constant value */
>         __BPF_ARG_TYPE_MAX,
>

Should we rename ARG_CONST_SIZE and ARG_CONST_SIZE_OR_ZERO to
something like ARG_MEM_CONST_SIZE / ARG_MEM_CONST_SIZE_OR_ZERO to make
the interface more explicit? I think that would make it less confusing
to differentiate between ARG_CONST_SIZE and ARG_CONSTANT for arguments
that describe the length/size but are not associated with a memory
buffer.


>         /* Extended arg_types. */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index ec3a7b6c9515..0373d5bd240f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
>                type == ARG_PTR_TO_LONG;
>  }
>
> +static bool arg_type_is_scalar(enum bpf_arg_type type)
> +{
> +       return type == ARG_SCALAR ||
> +              type == ARG_CONSTANT;
> +}
> +

I think this function name might be a bit misleading since
ARG_CONST_SIZE / ARG_CONST_SIZE_OR_ZERO / ARG_CONST_ALLOC_SIZE_OR_ZERO
are also scalar arg types. Maybe one alternative is to add a function
"arg_type_is_const" and then in check_func_arg, enforce that if
arg_type_is_const, then tnum_is_const(reg->var_off) must be true.
WDYT?

>  static int int_ptr_type_to_size(enum bpf_arg_type type)
>  {
>         if (type == ARG_PTR_TO_INT)
> @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
>         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
>         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
>         [ARG_PTR_TO_TIMER]              = &timer_types,
> +       [ARG_SCALAR]                    = &scalar_types,
> +       [ARG_CONSTANT]                  = &scalar_types,
>  };
>
>  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                         verbose(env, "string is not zero-terminated\n");
>                         return -EINVAL;
>                 }
> +       } else if (arg_type_is_scalar(arg_type)) {
> +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> +                       verbose(env, "R%d is not a known constant\n", regno);
> +                       return -EACCES;
> +               }
>         }
>
>         return err;
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-07 19:28   ` Joanne Koong
@ 2022-03-07 21:22     ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-07 21:22 UTC (permalink / raw)
  To: Joanne Koong
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, netdev

On Tue, Mar 08, 2022 at 12:58:13AM IST, Joanne Koong wrote:
> On Mon, Mar 7, 2022 at 1:24 AM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >
> > In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> > that takes offset and len and returns a packet pointer. There we want to
> > statically enforce offset is in range [0, 0xffff], and that len is a
> > constant value, in range [1, 0xffff]. This also helps us avoid a
> > pointless runtime check. To make these checks possible, we need to
> > ensure we only get a scalar type. Although a lot of other argument types
> > take scalars, their intent is different. Hence add general ARG_SCALAR
> > and ARG_CONSTANT types, where the latter is also checked to be constant
> > in addition to being scalar.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf.h   |  2 ++
> >  kernel/bpf/verifier.c | 13 +++++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 88449fbbe063..7841d90b83df 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -391,6 +391,8 @@ enum bpf_arg_type {
> >         ARG_PTR_TO_STACK,       /* pointer to stack */
> >         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
> >         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> > +       ARG_SCALAR,             /* a scalar with any value(s) */
> > +       ARG_CONSTANT,           /* a scalar with constant value */
> >         __BPF_ARG_TYPE_MAX,
> >
>
> Should we rename ARG_CONST_SIZE and ARG_CONST_SIZE_OR_ZERO to
> something like ARG_MEM_CONST_SIZE / ARG_MEM_CONST_SIZE_OR_ZERO to make
> the interface more explicit? I think that would make it less confusing
> to differentiate between ARG_CONST_SIZE and ARG_CONSTANT for arguments
> that describe the length/size but are not associated with a memory
> buffer.
>

I don't have a problem with that. I was just avoiding the churn since
ARG_CONST_SIZE vs ARG_CONSTANT didn't seem confusing to me.

>
> >         /* Extended arg_types. */
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index ec3a7b6c9515..0373d5bd240f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
> >                type == ARG_PTR_TO_LONG;
> >  }
> >
> > +static bool arg_type_is_scalar(enum bpf_arg_type type)
> > +{
> > +       return type == ARG_SCALAR ||
> > +              type == ARG_CONSTANT;
> > +}
> > +
>
> I think this function name might be a bit misleading since
> ARG_CONST_SIZE / ARG_CONST_SIZE_OR_ZERO / ARG_CONST_ALLOC_SIZE_OR_ZERO
> are also scalar arg types. Maybe one alternative is to add a function

They already have their own arg_type_is_mem_size, I think the naming here has no
relation to the underlying compatible register types. ARG_CONSTANT is just a
stronger ARG_SCALAR, so it makes sense to put both under arg_type_is_scalar
umbrella.

> "arg_type_is_const" and then in check_func_arg, enforce that if
> arg_type_is_const, then tnum_is_const(reg->var_off) must be true.
> WDYT?

I don't think ARG_CONST_SIZE[_OR_ZERO] have any requirement that the scalar
value must be a known constant, so tnum_is_const check would be prohibitive.

I think the 'CONST' in their name is more misleading, so it might make sense to
drop that instead.

>
> >  static int int_ptr_type_to_size(enum bpf_arg_type type)
> >  {
> >         if (type == ARG_PTR_TO_INT)
> > @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> >         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
> >         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
> >         [ARG_PTR_TO_TIMER]              = &timer_types,
> > +       [ARG_SCALAR]                    = &scalar_types,
> > +       [ARG_CONSTANT]                  = &scalar_types,
> >  };
> >
> >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >                         verbose(env, "string is not zero-terminated\n");
> >                         return -EINVAL;
> >                 }
> > +       } else if (arg_type_is_scalar(arg_type)) {
> > +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> > +                       verbose(env, "R%d is not a known constant\n", regno);
> > +                       return -EACCES;
> > +               }
> >         }
> >
> >         return err;
> > --
> > 2.35.1
> >

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
  2022-03-07 19:28   ` Joanne Koong
@ 2022-03-08  5:42   ` Andrii Nakryiko
  2022-03-08  6:26     ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2022-03-08  5:42 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> that takes offset and len and returns a packet pointer. There we want to
> statically enforce offset is in range [0, 0xffff], and that len is a
> constant value, in range [1, 0xffff]. This also helps us avoid a
> pointless runtime check. To make these checks possible, we need to
> ensure we only get a scalar type. Although a lot of other argument types
> take scalars, their intent is different. Hence add general ARG_SCALAR
> and ARG_CONSTANT types, where the latter is also checked to be constant
> in addition to being scalar.
>
> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> ---
>  include/linux/bpf.h   |  2 ++
>  kernel/bpf/verifier.c | 13 +++++++++++++
>  2 files changed, 15 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 88449fbbe063..7841d90b83df 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -391,6 +391,8 @@ enum bpf_arg_type {
>         ARG_PTR_TO_STACK,       /* pointer to stack */
>         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
>         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> +       ARG_SCALAR,             /* a scalar with any value(s) */

What's the difference between ARG_ANYTHING and ARG_SCALAR?

> +       ARG_CONSTANT,           /* a scalar with constant value */

This ARG_CONSTANT serves a very similar purpose as
ARG_CONST_ALLOC_SIZE_OR_ZERO, tbh. The only difference is that one is
used to set meta->mem_size and this one is used (through extra func_id
special handling) to set meta->ret_pkt_len. But meta->mem_size and
meta->ret_pkt_len mean the same thing: how many bytes are directly
accessible through a pointer returned from the helper. So I feel like
there is some opportunity to unify and generalize, instead of adding
more custom variants of constants. WDYT?



>         __BPF_ARG_TYPE_MAX,
>
>         /* Extended arg_types. */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index ec3a7b6c9515..0373d5bd240f 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
>                type == ARG_PTR_TO_LONG;
>  }
>
> +static bool arg_type_is_scalar(enum bpf_arg_type type)
> +{
> +       return type == ARG_SCALAR ||
> +              type == ARG_CONSTANT;
> +}
> +
>  static int int_ptr_type_to_size(enum bpf_arg_type type)
>  {
>         if (type == ARG_PTR_TO_INT)
> @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
>         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
>         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
>         [ARG_PTR_TO_TIMER]              = &timer_types,
> +       [ARG_SCALAR]                    = &scalar_types,
> +       [ARG_CONSTANT]                  = &scalar_types,
>  };
>
>  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                         verbose(env, "string is not zero-terminated\n");
>                         return -EINVAL;
>                 }
> +       } else if (arg_type_is_scalar(arg_type)) {
> +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> +                       verbose(env, "R%d is not a known constant\n", regno);
> +                       return -EACCES;
> +               }
>         }
>
>         return err;
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
                   ` (4 preceding siblings ...)
  2022-03-06 23:43 ` [PATCH bpf-next v1 5/5] selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer Kumar Kartikeya Dwivedi
@ 2022-03-08  5:48 ` Andrii Nakryiko
  2022-03-08  7:08   ` Kumar Kartikeya Dwivedi
  5 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2022-03-08  5:48 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> returning a packet pointer with a fixed immutable range. This can be useful to
> enable DPA without having to use memcpy (currently the case in
> bpf_xdp_load_bytes and bpf_xdp_store_bytes).
>
> The intended usage to read and write data for multi-buff XDP is:
>
>         int err = 0;
>         char buf[N];
>
>         off &= 0xffff;
>         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
>         if (unlikely(!ptr)) {
>                 if (err < 0)
>                         return XDP_ABORTED;
>                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
>                 if (err < 0)
>                         return XDP_ABORTED;
>                 ptr = buf;
>         }
>         ...
>         // Do some stores and loads in [ptr, ptr + N) region
>         ...
>         if (unlikely(ptr == buf)) {
>                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
>                 if (err < 0)
>                         return XDP_ABORTED;
>         }
>
> Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> it is also more meaningful to the user to see return value as R0=pkt.
>
> This series is meant to collect feedback on the approach, next version can
> include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> hooks, and explore not resetting range to zero on r0 += rX, instead check access
> like check_mem_region_access (var_off + off < range), since there would be no
> data_end to compare against and obtain a new range.
>
> The common name and func_id is supposed to allow writing generic code using
> bpf_packet_pointer that works for both XDP and TC programs.
>
> Please see the individual patches for implementation details.
>

Joanne is working on a "bpf_dynptr" framework that will support
exactly this feature, in addition to working with dynamically
allocated memory, working with memory of statically unknown size (but
safe and checked at runtime), etc. And all that within a generic
common feature implemented uniformly within the verifier. E.g., it
won't need any of the custom bits of logic added in patch #2 and #3.
So I'm thinking that instead of custom-implementing a partial case of
bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
and do it only once there?

See also my ARG_CONSTANT comment. It seems like a pretty common thing
where input constant is used to characterize some pointer returned
from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
that for bpf_dynptr for exactly this "give me direct access of N
bytes, if possible" case. So improving/generalizing it now before
dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
feature itself.

> Kumar Kartikeya Dwivedi (5):
>   bpf: Add ARG_SCALAR and ARG_CONSTANT
>   bpf: Introduce pkt_uid concept for PTR_TO_PACKET
>   bpf: Introduce bpf_packet_pointer helper to do DPA
>   selftests/bpf: Add verifier tests for pkt pointer with pkt_uid
>   selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer
>
>  include/linux/bpf.h                           |   4 +
>  include/linux/bpf_verifier.h                  |   9 +-
>  include/uapi/linux/bpf.h                      |  12 ++
>  kernel/bpf/verifier.c                         |  97 ++++++++++--
>  net/core/filter.c                             |  48 +++---
>  tools/include/uapi/linux/bpf.h                |  12 ++
>  .../bpf/prog_tests/xdp_adjust_frags.c         |  46 ++++--
>  .../bpf/progs/test_xdp_update_frags.c         |  46 ++++--
>  tools/testing/selftests/bpf/verifier/xdp.c    | 146 ++++++++++++++++++
>  9 files changed, 358 insertions(+), 62 deletions(-)
>
> --
> 2.35.1
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-08  5:42   ` Andrii Nakryiko
@ 2022-03-08  6:26     ` Kumar Kartikeya Dwivedi
  2022-03-10 23:05       ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-08  6:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Tue, Mar 08, 2022 at 11:12:13AM IST, Andrii Nakryiko wrote:
> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >
> > In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> > that takes offset and len and returns a packet pointer. There we want to
> > statically enforce offset is in range [0, 0xffff], and that len is a
> > constant value, in range [1, 0xffff]. This also helps us avoid a
> > pointless runtime check. To make these checks possible, we need to
> > ensure we only get a scalar type. Although a lot of other argument types
> > take scalars, their intent is different. Hence add general ARG_SCALAR
> > and ARG_CONSTANT types, where the latter is also checked to be constant
> > in addition to being scalar.
> >
> > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > ---
> >  include/linux/bpf.h   |  2 ++
> >  kernel/bpf/verifier.c | 13 +++++++++++++
> >  2 files changed, 15 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 88449fbbe063..7841d90b83df 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -391,6 +391,8 @@ enum bpf_arg_type {
> >         ARG_PTR_TO_STACK,       /* pointer to stack */
> >         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
> >         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> > +       ARG_SCALAR,             /* a scalar with any value(s) */
>
> What's the difference between ARG_ANYTHING and ARG_SCALAR?
>

ARG_SCALAR only accepts reg->type == SCALAR, ARG_ANYTHING accepts anything as
long as reg->type != NOT_INIT (due to SRC_OP for check_reg_arg and early return
without further checks).

> > +       ARG_CONSTANT,           /* a scalar with constant value */
>
> This ARG_CONSTANT serves a very similar purpose as
> ARG_CONST_ALLOC_SIZE_OR_ZERO, tbh. The only difference is that one is
> used to set meta->mem_size and this one is used (through extra func_id
> special handling) to set meta->ret_pkt_len. But meta->mem_size and
> meta->ret_pkt_len mean the same thing: how many bytes are directly
> accessible through a pointer returned from the helper. So I feel like
> there is some opportunity to unify and generalize, instead of adding
> more custom variants of constants. WDYT?
>

I see, indeed it would make sense to make both equivalent, since
CONST_ALLOC_SIZE must also be a constant. Joanne also mentioned consolidating,
but I didn't understand how that would work for ARG_CONSTANT and ARG_CONST_SIZE
ones.

I'm wondering whether we can take a step back and should go with the following
convention:

ARG_MEM_SIZE, and two type flags, ARG_ZERO | ARG_CONSTANT

Old				New (in bpf_func_proto)
-------------------------------------------------------------------
ARG_CONST_SIZE			ARG_MEM_SIZE
ARG_CONST_SIZE_OR_ZERO		ARG_MEM_SIZE | ARG_ZERO
ARG_CONST_ALLOC_SIZE		ARG_MEM_SIZE | ARG_CONST
ARG_CONST_ALLOC_SIZE_OR_ZERO	ARG_MEM_SIZE | ARG_CONST | ARG_ZERO
ARG_CONSTANT (mine)		ARG_MEM_SIZE | ARG_CONST

When we detect ARG_CONST, we always set meta->mem_size, which can be used to
refine returned pointer range, otherwise meta->mem_size = -1 by default (so it
will be -1 for the !tnum_is_const(reg->var_off) case).

if (arg_type & ARG_CONST)
	meta->mem_size = reg->var_off.value;
	if (!(arg_type & ARG_ZERO) && !meta->mem_size)
		// error

The check_mem_size_reg call is only made when we see that previous reg was
ARG_PTR_TO_MEM. When preceding argument is not ARG_PTR_TO_MEM, we error if
ARG_CONST is not set for ARG_MEM_SIZE (so that either the mem size is for
previous parameter, or otherwise a constant size for the returned pointer).
We can also only allow certain pointer return types for that case.

If that is too much automagic, we can also discern using ARG_MEM_SIZE vs
ARG_RET_MEM_SIZE, but I think the above is fine.

ARG_CONST ofcourse only applies to args taking scalar type.

>
>
> >         __BPF_ARG_TYPE_MAX,
> >
> >         /* Extended arg_types. */
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index ec3a7b6c9515..0373d5bd240f 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
> >                type == ARG_PTR_TO_LONG;
> >  }
> >
> > +static bool arg_type_is_scalar(enum bpf_arg_type type)
> > +{
> > +       return type == ARG_SCALAR ||
> > +              type == ARG_CONSTANT;
> > +}
> > +
> >  static int int_ptr_type_to_size(enum bpf_arg_type type)
> >  {
> >         if (type == ARG_PTR_TO_INT)
> > @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> >         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
> >         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
> >         [ARG_PTR_TO_TIMER]              = &timer_types,
> > +       [ARG_SCALAR]                    = &scalar_types,
> > +       [ARG_CONSTANT]                  = &scalar_types,
> >  };
> >
> >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >                         verbose(env, "string is not zero-terminated\n");
> >                         return -EINVAL;
> >                 }
> > +       } else if (arg_type_is_scalar(arg_type)) {
> > +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> > +                       verbose(env, "R%d is not a known constant\n", regno);
> > +                       return -EACCES;
> > +               }
> >         }
> >
> >         return err;
> > --
> > 2.35.1
> >

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-08  5:48 ` [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Andrii Nakryiko
@ 2022-03-08  7:08   ` Kumar Kartikeya Dwivedi
  2022-03-08 13:40     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-08  7:08 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >
> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> > returning a packet pointer with a fixed immutable range. This can be useful to
> > enable DPA without having to use memcpy (currently the case in
> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> >
> > The intended usage to read and write data for multi-buff XDP is:
> >
> >         int err = 0;
> >         char buf[N];
> >
> >         off &= 0xffff;
> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> >         if (unlikely(!ptr)) {
> >                 if (err < 0)
> >                         return XDP_ABORTED;
> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> >                 if (err < 0)
> >                         return XDP_ABORTED;
> >                 ptr = buf;
> >         }
> >         ...
> >         // Do some stores and loads in [ptr, ptr + N) region
> >         ...
> >         if (unlikely(ptr == buf)) {
> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> >                 if (err < 0)
> >                         return XDP_ABORTED;
> >         }
> >
> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> > it is also more meaningful to the user to see return value as R0=pkt.
> >
> > This series is meant to collect feedback on the approach, next version can
> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> > like check_mem_region_access (var_off + off < range), since there would be no
> > data_end to compare against and obtain a new range.
> >
> > The common name and func_id is supposed to allow writing generic code using
> > bpf_packet_pointer that works for both XDP and TC programs.
> >
> > Please see the individual patches for implementation details.
> >
>
> Joanne is working on a "bpf_dynptr" framework that will support
> exactly this feature, in addition to working with dynamically
> allocated memory, working with memory of statically unknown size (but
> safe and checked at runtime), etc. And all that within a generic
> common feature implemented uniformly within the verifier. E.g., it
> won't need any of the custom bits of logic added in patch #2 and #3.
> So I'm thinking that instead of custom-implementing a partial case of
> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> and do it only once there?
>

Interesting stuff, looking forward to it.

> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> where input constant is used to characterize some pointer returned
> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> that for bpf_dynptr for exactly this "give me direct access of N
> bytes, if possible" case. So improving/generalizing it now before
> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> feature itself.

No worries, we can continue the discussion in patch 1, I'll split out the arg
changes into a separate patch, and wait for dynptr to be posted before reworking
this.

>
> > Kumar Kartikeya Dwivedi (5):
> >   bpf: Add ARG_SCALAR and ARG_CONSTANT
> >   bpf: Introduce pkt_uid concept for PTR_TO_PACKET
> >   bpf: Introduce bpf_packet_pointer helper to do DPA
> >   selftests/bpf: Add verifier tests for pkt pointer with pkt_uid
> >   selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer
> >
> >  include/linux/bpf.h                           |   4 +
> >  include/linux/bpf_verifier.h                  |   9 +-
> >  include/uapi/linux/bpf.h                      |  12 ++
> >  kernel/bpf/verifier.c                         |  97 ++++++++++--
> >  net/core/filter.c                             |  48 +++---
> >  tools/include/uapi/linux/bpf.h                |  12 ++
> >  .../bpf/prog_tests/xdp_adjust_frags.c         |  46 ++++--
> >  .../bpf/progs/test_xdp_update_frags.c         |  46 ++++--
> >  tools/testing/selftests/bpf/verifier/xdp.c    | 146 ++++++++++++++++++
> >  9 files changed, 358 insertions(+), 62 deletions(-)
> >
> > --
> > 2.35.1
> >

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-08  7:08   ` Kumar Kartikeya Dwivedi
@ 2022-03-08 13:40     ` Toke Høiland-Jørgensen
  2022-03-10 23:12       ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-03-08 13:40 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi, Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Jesper Dangaard Brouer, Lorenzo Bianconi,
	John Fastabend, Jakub Kicinski, Lorenz Bauer, Networking

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
>> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> >
>> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
>> > returning a packet pointer with a fixed immutable range. This can be useful to
>> > enable DPA without having to use memcpy (currently the case in
>> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
>> >
>> > The intended usage to read and write data for multi-buff XDP is:
>> >
>> >         int err = 0;
>> >         char buf[N];
>> >
>> >         off &= 0xffff;
>> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
>> >         if (unlikely(!ptr)) {
>> >                 if (err < 0)
>> >                         return XDP_ABORTED;
>> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
>> >                 if (err < 0)
>> >                         return XDP_ABORTED;
>> >                 ptr = buf;
>> >         }
>> >         ...
>> >         // Do some stores and loads in [ptr, ptr + N) region
>> >         ...
>> >         if (unlikely(ptr == buf)) {
>> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
>> >                 if (err < 0)
>> >                         return XDP_ABORTED;
>> >         }
>> >
>> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
>> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
>> > it is also more meaningful to the user to see return value as R0=pkt.
>> >
>> > This series is meant to collect feedback on the approach, next version can
>> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
>> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
>> > like check_mem_region_access (var_off + off < range), since there would be no
>> > data_end to compare against and obtain a new range.
>> >
>> > The common name and func_id is supposed to allow writing generic code using
>> > bpf_packet_pointer that works for both XDP and TC programs.
>> >
>> > Please see the individual patches for implementation details.
>> >
>>
>> Joanne is working on a "bpf_dynptr" framework that will support
>> exactly this feature, in addition to working with dynamically
>> allocated memory, working with memory of statically unknown size (but
>> safe and checked at runtime), etc. And all that within a generic
>> common feature implemented uniformly within the verifier. E.g., it
>> won't need any of the custom bits of logic added in patch #2 and #3.
>> So I'm thinking that instead of custom-implementing a partial case of
>> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
>> and do it only once there?
>>
>
> Interesting stuff, looking forward to it.
>
>> See also my ARG_CONSTANT comment. It seems like a pretty common thing
>> where input constant is used to characterize some pointer returned
>> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
>> that for bpf_dynptr for exactly this "give me direct access of N
>> bytes, if possible" case. So improving/generalizing it now before
>> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
>> feature itself.
>
> No worries, we can continue the discussion in patch 1, I'll split out the arg
> changes into a separate patch, and wait for dynptr to be posted before reworking
> this.

This does raise the question of what we do in the meantime, though? Your
patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
making it, really has to go in before those hit a release and become
UAPI.

One option would be to still make the change to those other helpers;
they'd become a bit slower, but if we have a solution for that coming,
that may be OK for a single release? WDYT?

-Toke


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-08  6:26     ` Kumar Kartikeya Dwivedi
@ 2022-03-10 23:05       ` Andrii Nakryiko
  2022-03-10 23:09         ` Andrii Nakryiko
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2022-03-10 23:05 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Mon, Mar 7, 2022 at 10:26 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
>
> On Tue, Mar 08, 2022 at 11:12:13AM IST, Andrii Nakryiko wrote:
> > On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > >
> > > In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> > > that takes offset and len and returns a packet pointer. There we want to
> > > statically enforce offset is in range [0, 0xffff], and that len is a
> > > constant value, in range [1, 0xffff]. This also helps us avoid a
> > > pointless runtime check. To make these checks possible, we need to
> > > ensure we only get a scalar type. Although a lot of other argument types
> > > take scalars, their intent is different. Hence add general ARG_SCALAR
> > > and ARG_CONSTANT types, where the latter is also checked to be constant
> > > in addition to being scalar.
> > >
> > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > ---
> > >  include/linux/bpf.h   |  2 ++
> > >  kernel/bpf/verifier.c | 13 +++++++++++++
> > >  2 files changed, 15 insertions(+)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 88449fbbe063..7841d90b83df 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -391,6 +391,8 @@ enum bpf_arg_type {
> > >         ARG_PTR_TO_STACK,       /* pointer to stack */
> > >         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
> > >         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> > > +       ARG_SCALAR,             /* a scalar with any value(s) */
> >
> > What's the difference between ARG_ANYTHING and ARG_SCALAR?
> >
>
> ARG_SCALAR only accepts reg->type == SCALAR, ARG_ANYTHING accepts anything as
> long as reg->type != NOT_INIT (due to SRC_OP for check_reg_arg and early return
> without further checks).
>

Ah, ok, didn't realize that it's not always scalar for ARG_ANYTHING


> > > +       ARG_CONSTANT,           /* a scalar with constant value */
> >
> > This ARG_CONSTANT serves a very similar purpose as
> > ARG_CONST_ALLOC_SIZE_OR_ZERO, tbh. The only difference is that one is
> > used to set meta->mem_size and this one is used (through extra func_id
> > special handling) to set meta->ret_pkt_len. But meta->mem_size and
> > meta->ret_pkt_len mean the same thing: how many bytes are directly
> > accessible through a pointer returned from the helper. So I feel like
> > there is some opportunity to unify and generalize, instead of adding
> > more custom variants of constants. WDYT?
> >
>
> I see, indeed it would make sense to make both equivalent, since
> CONST_ALLOC_SIZE must also be a constant. Joanne also mentioned consolidating,
> but I didn't understand how that would work for ARG_CONSTANT and ARG_CONST_SIZE
> ones.
>
> I'm wondering whether we can take a step back and should go with the following
> convention:
>
> ARG_MEM_SIZE, and two type flags, ARG_ZERO | ARG_CONSTANT
>
> Old                             New (in bpf_func_proto)
> -------------------------------------------------------------------
> ARG_CONST_SIZE                  ARG_MEM_SIZE
> ARG_CONST_SIZE_OR_ZERO          ARG_MEM_SIZE | ARG_ZERO
> ARG_CONST_ALLOC_SIZE            ARG_MEM_SIZE | ARG_CONST
> ARG_CONST_ALLOC_SIZE_OR_ZERO    ARG_MEM_SIZE | ARG_CONST | ARG_ZERO
> ARG_CONSTANT (mine)             ARG_MEM_SIZE | ARG_CONST
>

I think using "ARG_MEM_SIZE" as part of ARG_CONSTANT is backwards and
misleading. It makes more sense to me to have ARG_CONSTANT and use
ARG_ZERO (or rather ARG_MAYBE_ZERO?) and ARG_MEM_SIZE (to specify that
this constant is describing the size of memory of a pointer that is
passed in a previous argument).

Basically, something like:

ARG_CONST_SIZE => ARG_CONSTANT | ARG_MEM_SIZE
ARG_CONST_SIZE_OR_ZERO => ARG_CONSTANT | ARG_MEM_SIZE | ARG_MAYBE_ZERO

Then we can replace ARG_CONST_ALLOC_SIZE and
ARG_CONST_ALLOC_SIZE_OR_ZERO with ARG_CONSTANT  and ARG_CONSTANT |
ARG_MAYBE_ZERO and we'll have a bit of special case to handle
bpf_ringbuf_reserve.

For ARG_CONSTANT, verifier will remember the value in
bpf_call_arg_meta, and then we can use it as necessary (e.g., instead
of mem_size when ARG_MEM_SIZE is specified) depending on context,
helper being called, etc.

Adding ARG_CONST just makes no sense as we always want constant value,
otherwise it might as well be just ARG_ANYTHING, right?

I haven't spent much time thinking about this, though, so I'm probably
missing something.


> When we detect ARG_CONST, we always set meta->mem_size, which can be used to
> refine returned pointer range, otherwise meta->mem_size = -1 by default (so it
> will be -1 for the !tnum_is_const(reg->var_off) case).
>
> if (arg_type & ARG_CONST)
>         meta->mem_size = reg->var_off.value;
>         if (!(arg_type & ARG_ZERO) && !meta->mem_size)
>                 // error
>
> The check_mem_size_reg call is only made when we see that previous reg was
> ARG_PTR_TO_MEM. When preceding argument is not ARG_PTR_TO_MEM, we error if
> ARG_CONST is not set for ARG_MEM_SIZE (so that either the mem size is for
> previous parameter, or otherwise a constant size for the returned pointer).
> We can also only allow certain pointer return types for that case.
>
> If that is too much automagic, we can also discern using ARG_MEM_SIZE vs
> ARG_RET_MEM_SIZE, but I think the above is fine.
>
> ARG_CONST ofcourse only applies to args taking scalar type.
>
> >
> >
> > >         __BPF_ARG_TYPE_MAX,
> > >
> > >         /* Extended arg_types. */
> > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > index ec3a7b6c9515..0373d5bd240f 100644
> > > --- a/kernel/bpf/verifier.c
> > > +++ b/kernel/bpf/verifier.c
> > > @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
> > >                type == ARG_PTR_TO_LONG;
> > >  }
> > >
> > > +static bool arg_type_is_scalar(enum bpf_arg_type type)
> > > +{
> > > +       return type == ARG_SCALAR ||
> > > +              type == ARG_CONSTANT;
> > > +}
> > > +
> > >  static int int_ptr_type_to_size(enum bpf_arg_type type)
> > >  {
> > >         if (type == ARG_PTR_TO_INT)
> > > @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> > >         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
> > >         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
> > >         [ARG_PTR_TO_TIMER]              = &timer_types,
> > > +       [ARG_SCALAR]                    = &scalar_types,
> > > +       [ARG_CONSTANT]                  = &scalar_types,
> > >  };
> > >
> > >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > > @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > >                         verbose(env, "string is not zero-terminated\n");
> > >                         return -EINVAL;
> > >                 }
> > > +       } else if (arg_type_is_scalar(arg_type)) {
> > > +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> > > +                       verbose(env, "R%d is not a known constant\n", regno);
> > > +                       return -EACCES;
> > > +               }
> > >         }
> > >
> > >         return err;
> > > --
> > > 2.35.1
> > >
>
> --
> Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-10 23:05       ` Andrii Nakryiko
@ 2022-03-10 23:09         ` Andrii Nakryiko
  2022-03-11  7:31           ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2022-03-10 23:09 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Thu, Mar 10, 2022 at 3:05 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Mar 7, 2022 at 10:26 PM Kumar Kartikeya Dwivedi
> <memxor@gmail.com> wrote:
> >
> > On Tue, Mar 08, 2022 at 11:12:13AM IST, Andrii Nakryiko wrote:
> > > On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > > >
> > > > In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> > > > that takes offset and len and returns a packet pointer. There we want to
> > > > statically enforce offset is in range [0, 0xffff], and that len is a
> > > > constant value, in range [1, 0xffff]. This also helps us avoid a
> > > > pointless runtime check. To make these checks possible, we need to
> > > > ensure we only get a scalar type. Although a lot of other argument types
> > > > take scalars, their intent is different. Hence add general ARG_SCALAR
> > > > and ARG_CONSTANT types, where the latter is also checked to be constant
> > > > in addition to being scalar.
> > > >
> > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > ---
> > > >  include/linux/bpf.h   |  2 ++
> > > >  kernel/bpf/verifier.c | 13 +++++++++++++
> > > >  2 files changed, 15 insertions(+)
> > > >
> > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > index 88449fbbe063..7841d90b83df 100644
> > > > --- a/include/linux/bpf.h
> > > > +++ b/include/linux/bpf.h
> > > > @@ -391,6 +391,8 @@ enum bpf_arg_type {
> > > >         ARG_PTR_TO_STACK,       /* pointer to stack */
> > > >         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
> > > >         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> > > > +       ARG_SCALAR,             /* a scalar with any value(s) */
> > >
> > > What's the difference between ARG_ANYTHING and ARG_SCALAR?
> > >
> >
> > ARG_SCALAR only accepts reg->type == SCALAR, ARG_ANYTHING accepts anything as
> > long as reg->type != NOT_INIT (due to SRC_OP for check_reg_arg and early return
> > without further checks).
> >
>
> Ah, ok, didn't realize that it's not always scalar for ARG_ANYTHING
>
>
> > > > +       ARG_CONSTANT,           /* a scalar with constant value */
> > >
> > > This ARG_CONSTANT serves a very similar purpose as
> > > ARG_CONST_ALLOC_SIZE_OR_ZERO, tbh. The only difference is that one is
> > > used to set meta->mem_size and this one is used (through extra func_id
> > > special handling) to set meta->ret_pkt_len. But meta->mem_size and
> > > meta->ret_pkt_len mean the same thing: how many bytes are directly
> > > accessible through a pointer returned from the helper. So I feel like
> > > there is some opportunity to unify and generalize, instead of adding
> > > more custom variants of constants. WDYT?
> > >
> >
> > I see, indeed it would make sense to make both equivalent, since
> > CONST_ALLOC_SIZE must also be a constant. Joanne also mentioned consolidating,
> > but I didn't understand how that would work for ARG_CONSTANT and ARG_CONST_SIZE
> > ones.
> >
> > I'm wondering whether we can take a step back and should go with the following
> > convention:
> >
> > ARG_MEM_SIZE, and two type flags, ARG_ZERO | ARG_CONSTANT
> >
> > Old                             New (in bpf_func_proto)
> > -------------------------------------------------------------------
> > ARG_CONST_SIZE                  ARG_MEM_SIZE
> > ARG_CONST_SIZE_OR_ZERO          ARG_MEM_SIZE | ARG_ZERO
> > ARG_CONST_ALLOC_SIZE            ARG_MEM_SIZE | ARG_CONST
> > ARG_CONST_ALLOC_SIZE_OR_ZERO    ARG_MEM_SIZE | ARG_CONST | ARG_ZERO
> > ARG_CONSTANT (mine)             ARG_MEM_SIZE | ARG_CONST
> >
>
> I think using "ARG_MEM_SIZE" as part of ARG_CONSTANT is backwards and
> misleading. It makes more sense to me to have ARG_CONSTANT and use
> ARG_ZERO (or rather ARG_MAYBE_ZERO?) and ARG_MEM_SIZE (to specify that
> this constant is describing the size of memory of a pointer that is
> passed in a previous argument).
>
> Basically, something like:
>
> ARG_CONST_SIZE => ARG_CONSTANT | ARG_MEM_SIZE
> ARG_CONST_SIZE_OR_ZERO => ARG_CONSTANT | ARG_MEM_SIZE | ARG_MAYBE_ZERO
>
> Then we can replace ARG_CONST_ALLOC_SIZE and
> ARG_CONST_ALLOC_SIZE_OR_ZERO with ARG_CONSTANT  and ARG_CONSTANT |
> ARG_MAYBE_ZERO and we'll have a bit of special case to handle
> bpf_ringbuf_reserve.
>
> For ARG_CONSTANT, verifier will remember the value in
> bpf_call_arg_meta, and then we can use it as necessary (e.g., instead
> of mem_size when ARG_MEM_SIZE is specified) depending on context,
> helper being called, etc.
>
> Adding ARG_CONST just makes no sense as we always want constant value,
> otherwise it might as well be just ARG_ANYTHING, right?

Re-reading this, this paragraph is very confusing (especially taking
into account what I wrote above). My point was that in your table, you
have ARG_MEM_SIZE as a "base type" and ARG_CONST as "modifier". And
that makes little sense to me, because in all cases we have a
constant, but not in all cases we use that constant to describe the
size of memory passed in a previous argument. So I inverted that,
ARG_CONSTANT as "base type", ARG_MEM_SIZE and ARG_MAYBE_ZERO as
modifiers. And we then don't need 5 different resulting types because
"CONST_ALLOC_SIZE" handling is just a custom constant handling for
bpf_ringbuf_reserve. Just like for your use case you wanted to use
plain ARG_CONSTANT and add some extra logic for your
bpf_packet_pointer(). I hope this clarifies it a bit.

>
> I haven't spent much time thinking about this, though, so I'm probably
> missing something.
>
>
> > When we detect ARG_CONST, we always set meta->mem_size, which can be used to
> > refine returned pointer range, otherwise meta->mem_size = -1 by default (so it
> > will be -1 for the !tnum_is_const(reg->var_off) case).
> >
> > if (arg_type & ARG_CONST)
> >         meta->mem_size = reg->var_off.value;
> >         if (!(arg_type & ARG_ZERO) && !meta->mem_size)
> >                 // error
> >
> > The check_mem_size_reg call is only made when we see that previous reg was
> > ARG_PTR_TO_MEM. When preceding argument is not ARG_PTR_TO_MEM, we error if
> > ARG_CONST is not set for ARG_MEM_SIZE (so that either the mem size is for
> > previous parameter, or otherwise a constant size for the returned pointer).
> > We can also only allow certain pointer return types for that case.
> >
> > If that is too much automagic, we can also discern using ARG_MEM_SIZE vs
> > ARG_RET_MEM_SIZE, but I think the above is fine.
> >
> > ARG_CONST ofcourse only applies to args taking scalar type.
> >
> > >
> > >
> > > >         __BPF_ARG_TYPE_MAX,
> > > >
> > > >         /* Extended arg_types. */
> > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > index ec3a7b6c9515..0373d5bd240f 100644
> > > > --- a/kernel/bpf/verifier.c
> > > > +++ b/kernel/bpf/verifier.c
> > > > @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
> > > >                type == ARG_PTR_TO_LONG;
> > > >  }
> > > >
> > > > +static bool arg_type_is_scalar(enum bpf_arg_type type)
> > > > +{
> > > > +       return type == ARG_SCALAR ||
> > > > +              type == ARG_CONSTANT;
> > > > +}
> > > > +
> > > >  static int int_ptr_type_to_size(enum bpf_arg_type type)
> > > >  {
> > > >         if (type == ARG_PTR_TO_INT)
> > > > @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> > > >         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
> > > >         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
> > > >         [ARG_PTR_TO_TIMER]              = &timer_types,
> > > > +       [ARG_SCALAR]                    = &scalar_types,
> > > > +       [ARG_CONSTANT]                  = &scalar_types,
> > > >  };
> > > >
> > > >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > > > @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > > >                         verbose(env, "string is not zero-terminated\n");
> > > >                         return -EINVAL;
> > > >                 }
> > > > +       } else if (arg_type_is_scalar(arg_type)) {
> > > > +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> > > > +                       verbose(env, "R%d is not a known constant\n", regno);
> > > > +                       return -EACCES;
> > > > +               }
> > > >         }
> > > >
> > > >         return err;
> > > > --
> > > > 2.35.1
> > > >
> >
> > --
> > Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-08 13:40     ` Toke Høiland-Jørgensen
@ 2022-03-10 23:12       ` Andrii Nakryiko
  2022-03-10 23:30         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 24+ messages in thread
From: Andrii Nakryiko @ 2022-03-10 23:12 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Kumar Kartikeya Dwivedi, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
>
> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >> >
> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> >> > enable DPA without having to use memcpy (currently the case in
> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> >> >
> >> > The intended usage to read and write data for multi-buff XDP is:
> >> >
> >> >         int err = 0;
> >> >         char buf[N];
> >> >
> >> >         off &= 0xffff;
> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> >> >         if (unlikely(!ptr)) {
> >> >                 if (err < 0)
> >> >                         return XDP_ABORTED;
> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> >> >                 if (err < 0)
> >> >                         return XDP_ABORTED;
> >> >                 ptr = buf;
> >> >         }
> >> >         ...
> >> >         // Do some stores and loads in [ptr, ptr + N) region
> >> >         ...
> >> >         if (unlikely(ptr == buf)) {
> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> >> >                 if (err < 0)
> >> >                         return XDP_ABORTED;
> >> >         }
> >> >
> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> >> > it is also more meaningful to the user to see return value as R0=pkt.
> >> >
> >> > This series is meant to collect feedback on the approach, next version can
> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> >> > like check_mem_region_access (var_off + off < range), since there would be no
> >> > data_end to compare against and obtain a new range.
> >> >
> >> > The common name and func_id is supposed to allow writing generic code using
> >> > bpf_packet_pointer that works for both XDP and TC programs.
> >> >
> >> > Please see the individual patches for implementation details.
> >> >
> >>
> >> Joanne is working on a "bpf_dynptr" framework that will support
> >> exactly this feature, in addition to working with dynamically
> >> allocated memory, working with memory of statically unknown size (but
> >> safe and checked at runtime), etc. And all that within a generic
> >> common feature implemented uniformly within the verifier. E.g., it
> >> won't need any of the custom bits of logic added in patch #2 and #3.
> >> So I'm thinking that instead of custom-implementing a partial case of
> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> >> and do it only once there?
> >>
> >
> > Interesting stuff, looking forward to it.
> >
> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> >> where input constant is used to characterize some pointer returned
> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> >> that for bpf_dynptr for exactly this "give me direct access of N
> >> bytes, if possible" case. So improving/generalizing it now before
> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> >> feature itself.
> >
> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> > changes into a separate patch, and wait for dynptr to be posted before reworking
> > this.
>
> This does raise the question of what we do in the meantime, though? Your
> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> making it, really has to go in before those hit a release and become
> UAPI.
>
> One option would be to still make the change to those other helpers;
> they'd become a bit slower, but if we have a solution for that coming,
> that may be OK for a single release? WDYT?

I must have missed important changes to bpf_xdp_{load,store}_bytes().
Does anything change about its behavior? If there are some fixes
specific to those helpers, we should fix them as well as a separate
patch. My main objection is adding a bpf_packet_pointer() special case
when we have a generic mechanism in the works that will come this use
case (among other use cases).

>
> -Toke
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-10 23:12       ` Andrii Nakryiko
@ 2022-03-10 23:30         ` Toke Høiland-Jørgensen
  2022-03-10 23:35           ` Alexei Starovoitov
  2022-03-11  7:19           ` Kumar Kartikeya Dwivedi
  0 siblings, 2 replies; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-03-10 23:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Kumar Kartikeya Dwivedi, bpf, Alexei Starovoitov,
	Andrii Nakryiko, Daniel Borkmann, Martin KaFai Lau,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:

> On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
>>
>> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
>> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> >> >
>> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
>> >> > returning a packet pointer with a fixed immutable range. This can be useful to
>> >> > enable DPA without having to use memcpy (currently the case in
>> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
>> >> >
>> >> > The intended usage to read and write data for multi-buff XDP is:
>> >> >
>> >> >         int err = 0;
>> >> >         char buf[N];
>> >> >
>> >> >         off &= 0xffff;
>> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
>> >> >         if (unlikely(!ptr)) {
>> >> >                 if (err < 0)
>> >> >                         return XDP_ABORTED;
>> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
>> >> >                 if (err < 0)
>> >> >                         return XDP_ABORTED;
>> >> >                 ptr = buf;
>> >> >         }
>> >> >         ...
>> >> >         // Do some stores and loads in [ptr, ptr + N) region
>> >> >         ...
>> >> >         if (unlikely(ptr == buf)) {
>> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
>> >> >                 if (err < 0)
>> >> >                         return XDP_ABORTED;
>> >> >         }
>> >> >
>> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
>> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
>> >> > it is also more meaningful to the user to see return value as R0=pkt.
>> >> >
>> >> > This series is meant to collect feedback on the approach, next version can
>> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
>> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
>> >> > like check_mem_region_access (var_off + off < range), since there would be no
>> >> > data_end to compare against and obtain a new range.
>> >> >
>> >> > The common name and func_id is supposed to allow writing generic code using
>> >> > bpf_packet_pointer that works for both XDP and TC programs.
>> >> >
>> >> > Please see the individual patches for implementation details.
>> >> >
>> >>
>> >> Joanne is working on a "bpf_dynptr" framework that will support
>> >> exactly this feature, in addition to working with dynamically
>> >> allocated memory, working with memory of statically unknown size (but
>> >> safe and checked at runtime), etc. And all that within a generic
>> >> common feature implemented uniformly within the verifier. E.g., it
>> >> won't need any of the custom bits of logic added in patch #2 and #3.
>> >> So I'm thinking that instead of custom-implementing a partial case of
>> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
>> >> and do it only once there?
>> >>
>> >
>> > Interesting stuff, looking forward to it.
>> >
>> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
>> >> where input constant is used to characterize some pointer returned
>> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
>> >> that for bpf_dynptr for exactly this "give me direct access of N
>> >> bytes, if possible" case. So improving/generalizing it now before
>> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
>> >> feature itself.
>> >
>> > No worries, we can continue the discussion in patch 1, I'll split out the arg
>> > changes into a separate patch, and wait for dynptr to be posted before reworking
>> > this.
>>
>> This does raise the question of what we do in the meantime, though? Your
>> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
>> making it, really has to go in before those hit a release and become
>> UAPI.
>>
>> One option would be to still make the change to those other helpers;
>> they'd become a bit slower, but if we have a solution for that coming,
>> that may be OK for a single release? WDYT?
>
> I must have missed important changes to bpf_xdp_{load,store}_bytes().
> Does anything change about its behavior? If there are some fixes
> specific to those helpers, we should fix them as well as a separate
> patch. My main objection is adding a bpf_packet_pointer() special case
> when we have a generic mechanism in the works that will come this use
> case (among other use cases).

Well it's not a functional change per se, but Kartikeya's patch is
removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
use of the bpf_xdp_pointer()) in favour of making it available directly
to BPF. So if we don't do that change before those helpers are
finalised, we will end up either introducing a performance regression
for code using those helpers, or being stuck with the bpf_xdp_pointer()
use inside them even though it makes more sense to move it out to BPF.

So the "safe" thing to do would do the change to the store/load helpers
now, and get rid of the bpf_xdp_pointer() entirely until it can be
introduced as a BPF helper in a generic way. Of course this depends on
whether you consider performance regressions to be something to avoid,
but this being XDP IMO we should :)

-Toke


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-10 23:30         ` Toke Høiland-Jørgensen
@ 2022-03-10 23:35           ` Alexei Starovoitov
  2022-03-11  7:25             ` Kumar Kartikeya Dwivedi
  2022-03-11  7:19           ` Kumar Kartikeya Dwivedi
  1 sibling, 1 reply; 24+ messages in thread
From: Alexei Starovoitov @ 2022-03-10 23:35 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, Kumar Kartikeya Dwivedi, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Jesper Dangaard Brouer, Lorenzo Bianconi,
	John Fastabend, Jakub Kicinski, Lorenz Bauer, Networking

On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> >>
> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >> >> >
> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> >> >> > enable DPA without having to use memcpy (currently the case in
> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> >> >> >
> >> >> > The intended usage to read and write data for multi-buff XDP is:
> >> >> >
> >> >> >         int err = 0;
> >> >> >         char buf[N];
> >> >> >
> >> >> >         off &= 0xffff;
> >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> >> >> >         if (unlikely(!ptr)) {
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 ptr = buf;
> >> >> >         }
> >> >> >         ...
> >> >> >         // Do some stores and loads in [ptr, ptr + N) region
> >> >> >         ...
> >> >> >         if (unlikely(ptr == buf)) {
> >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >         }
> >> >> >
> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> >> >> > it is also more meaningful to the user to see return value as R0=pkt.
> >> >> >
> >> >> > This series is meant to collect feedback on the approach, next version can
> >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> >> >> > like check_mem_region_access (var_off + off < range), since there would be no
> >> >> > data_end to compare against and obtain a new range.
> >> >> >
> >> >> > The common name and func_id is supposed to allow writing generic code using
> >> >> > bpf_packet_pointer that works for both XDP and TC programs.
> >> >> >
> >> >> > Please see the individual patches for implementation details.
> >> >> >
> >> >>
> >> >> Joanne is working on a "bpf_dynptr" framework that will support
> >> >> exactly this feature, in addition to working with dynamically
> >> >> allocated memory, working with memory of statically unknown size (but
> >> >> safe and checked at runtime), etc. And all that within a generic
> >> >> common feature implemented uniformly within the verifier. E.g., it
> >> >> won't need any of the custom bits of logic added in patch #2 and #3.
> >> >> So I'm thinking that instead of custom-implementing a partial case of
> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> >> >> and do it only once there?
> >> >>
> >> >
> >> > Interesting stuff, looking forward to it.
> >> >
> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> >> >> where input constant is used to characterize some pointer returned
> >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> >> >> that for bpf_dynptr for exactly this "give me direct access of N
> >> >> bytes, if possible" case. So improving/generalizing it now before
> >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> >> >> feature itself.
> >> >
> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> >> > changes into a separate patch, and wait for dynptr to be posted before reworking
> >> > this.
> >>
> >> This does raise the question of what we do in the meantime, though? Your
> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> >> making it, really has to go in before those hit a release and become
> >> UAPI.
> >>
> >> One option would be to still make the change to those other helpers;
> >> they'd become a bit slower, but if we have a solution for that coming,
> >> that may be OK for a single release? WDYT?
> >
> > I must have missed important changes to bpf_xdp_{load,store}_bytes().
> > Does anything change about its behavior? If there are some fixes
> > specific to those helpers, we should fix them as well as a separate
> > patch. My main objection is adding a bpf_packet_pointer() special case
> > when we have a generic mechanism in the works that will come this use
> > case (among other use cases).
>
> Well it's not a functional change per se, but Kartikeya's patch is
> removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
> use of the bpf_xdp_pointer()) in favour of making it available directly
> to BPF. So if we don't do that change before those helpers are
> finalised, we will end up either introducing a performance regression
> for code using those helpers, or being stuck with the bpf_xdp_pointer()
> use inside them even though it makes more sense to move it out to BPF.
>
> So the "safe" thing to do would do the change to the store/load helpers
> now, and get rid of the bpf_xdp_pointer() entirely until it can be
> introduced as a BPF helper in a generic way. Of course this depends on
> whether you consider performance regressions to be something to avoid,
> but this being XDP IMO we should :)

I don't follow this logic.
Would you mean by "get rid of the bpf_xdp_pointer" ?
It's just an internal static function.

Also I don't believe that this patch set and exposing
bpf_xdp_pointer as a helper actually gives measurable performance wins.
It looks quirky to me and hard to use.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-10 23:30         ` Toke Høiland-Jørgensen
  2022-03-10 23:35           ` Alexei Starovoitov
@ 2022-03-11  7:19           ` Kumar Kartikeya Dwivedi
  2022-03-11 10:34             ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-11  7:19 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Jesper Dangaard Brouer,
	Lorenzo Bianconi, John Fastabend, Jakub Kicinski, Lorenz Bauer,
	Networking

On Fri, Mar 11, 2022 at 05:00:42AM IST, Toke Høiland-Jørgensen wrote:
> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>
> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> >>
> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> >> >> >
> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> >> >> > enable DPA without having to use memcpy (currently the case in
> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> >> >> >
> >> >> > The intended usage to read and write data for multi-buff XDP is:
> >> >> >
> >> >> >         int err = 0;
> >> >> >         char buf[N];
> >> >> >
> >> >> >         off &= 0xffff;
> >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> >> >> >         if (unlikely(!ptr)) {
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >                 ptr = buf;
> >> >> >         }
> >> >> >         ...
> >> >> >         // Do some stores and loads in [ptr, ptr + N) region
> >> >> >         ...
> >> >> >         if (unlikely(ptr == buf)) {
> >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> >> >> >                 if (err < 0)
> >> >> >                         return XDP_ABORTED;
> >> >> >         }
> >> >> >
> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> >> >> > it is also more meaningful to the user to see return value as R0=pkt.
> >> >> >
> >> >> > This series is meant to collect feedback on the approach, next version can
> >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> >> >> > like check_mem_region_access (var_off + off < range), since there would be no
> >> >> > data_end to compare against and obtain a new range.
> >> >> >
> >> >> > The common name and func_id is supposed to allow writing generic code using
> >> >> > bpf_packet_pointer that works for both XDP and TC programs.
> >> >> >
> >> >> > Please see the individual patches for implementation details.
> >> >> >
> >> >>
> >> >> Joanne is working on a "bpf_dynptr" framework that will support
> >> >> exactly this feature, in addition to working with dynamically
> >> >> allocated memory, working with memory of statically unknown size (but
> >> >> safe and checked at runtime), etc. And all that within a generic
> >> >> common feature implemented uniformly within the verifier. E.g., it
> >> >> won't need any of the custom bits of logic added in patch #2 and #3.
> >> >> So I'm thinking that instead of custom-implementing a partial case of
> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> >> >> and do it only once there?
> >> >>
> >> >
> >> > Interesting stuff, looking forward to it.
> >> >
> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> >> >> where input constant is used to characterize some pointer returned
> >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> >> >> that for bpf_dynptr for exactly this "give me direct access of N
> >> >> bytes, if possible" case. So improving/generalizing it now before
> >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> >> >> feature itself.
> >> >
> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> >> > changes into a separate patch, and wait for dynptr to be posted before reworking
> >> > this.
> >>
> >> This does raise the question of what we do in the meantime, though? Your
> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> >> making it, really has to go in before those hit a release and become
> >> UAPI.
> >>
> >> One option would be to still make the change to those other helpers;
> >> they'd become a bit slower, but if we have a solution for that coming,
> >> that may be OK for a single release? WDYT?
> >
> > I must have missed important changes to bpf_xdp_{load,store}_bytes().
> > Does anything change about its behavior? If there are some fixes
> > specific to those helpers, we should fix them as well as a separate
> > patch. My main objection is adding a bpf_packet_pointer() special case
> > when we have a generic mechanism in the works that will come this use
> > case (among other use cases).
>
> Well it's not a functional change per se, but Kartikeya's patch is
> removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
> use of the bpf_xdp_pointer()) in favour of making it available directly
> to BPF. So if we don't do that change before those helpers are
> finalised, we will end up either introducing a performance regression
> for code using those helpers, or being stuck with the bpf_xdp_pointer()
> use inside them even though it makes more sense to move it out to BPF.
>

So IIUC, the case we're worried about is when a linear region is in head or a
frag and bpf_xdp_pointer can be used to do a direct memcpy for it. But in my
testing there doesn't seem to be any difference. With or without the call, the
time taken e.g. for bpf_xdp_load_bytes lies in the 30-40ns range. It would make
sense, because for this case the code in bpf_xdp_pointer and bpf_xdp_copy_buf
are almost the same, just that the latter has a conditional jump out of the loop
based on len. bpf_xdp_copy_buf is still only doing a single memcpy, the cost
seems to be dominated by that.

Otoh, removing it would improve the case for the other scenario (when region
touches two or more frags) because we wouldn't spend time in bpf_xdp_pointer and
returning NULL from it failing to find a linear region, but that shouldn't be a
regression.

Please let me know if I missed something.

> So the "safe" thing to do would do the change to the store/load helpers
> now, and get rid of the bpf_xdp_pointer() entirely until it can be
> introduced as a BPF helper in a generic way. Of course this depends on
> whether you consider performance regressions to be something to avoid,
> but this being XDP IMO we should :)
>
> -Toke
>

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-10 23:35           ` Alexei Starovoitov
@ 2022-03-11  7:25             ` Kumar Kartikeya Dwivedi
  2022-03-11  7:32               ` Kumar Kartikeya Dwivedi
  2022-03-11 17:27               ` Alexei Starovoitov
  0 siblings, 2 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-11  7:25 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Jesper Dangaard Brouer, Lorenzo Bianconi,
	John Fastabend, Jakub Kicinski, Lorenz Bauer, Networking

On Fri, Mar 11, 2022 at 05:05:24AM IST, Alexei Starovoitov wrote:
> On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> >
> > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > >>
> > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> > >>
> > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > >> >> >
> > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> > >> >> > enable DPA without having to use memcpy (currently the case in
> > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> > >> >> >
> > >> >> > The intended usage to read and write data for multi-buff XDP is:
> > >> >> >
> > >> >> >         int err = 0;
> > >> >> >         char buf[N];
> > >> >> >
> > >> >> >         off &= 0xffff;
> > >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> > >> >> >         if (unlikely(!ptr)) {
> > >> >> >                 if (err < 0)
> > >> >> >                         return XDP_ABORTED;
> > >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> > >> >> >                 if (err < 0)
> > >> >> >                         return XDP_ABORTED;
> > >> >> >                 ptr = buf;
> > >> >> >         }
> > >> >> >         ...
> > >> >> >         // Do some stores and loads in [ptr, ptr + N) region
> > >> >> >         ...
> > >> >> >         if (unlikely(ptr == buf)) {
> > >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> > >> >> >                 if (err < 0)
> > >> >> >                         return XDP_ABORTED;
> > >> >> >         }
> > >> >> >
> > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> > >> >> > it is also more meaningful to the user to see return value as R0=pkt.
> > >> >> >
> > >> >> > This series is meant to collect feedback on the approach, next version can
> > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> > >> >> > like check_mem_region_access (var_off + off < range), since there would be no
> > >> >> > data_end to compare against and obtain a new range.
> > >> >> >
> > >> >> > The common name and func_id is supposed to allow writing generic code using
> > >> >> > bpf_packet_pointer that works for both XDP and TC programs.
> > >> >> >
> > >> >> > Please see the individual patches for implementation details.
> > >> >> >
> > >> >>
> > >> >> Joanne is working on a "bpf_dynptr" framework that will support
> > >> >> exactly this feature, in addition to working with dynamically
> > >> >> allocated memory, working with memory of statically unknown size (but
> > >> >> safe and checked at runtime), etc. And all that within a generic
> > >> >> common feature implemented uniformly within the verifier. E.g., it
> > >> >> won't need any of the custom bits of logic added in patch #2 and #3.
> > >> >> So I'm thinking that instead of custom-implementing a partial case of
> > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> > >> >> and do it only once there?
> > >> >>
> > >> >
> > >> > Interesting stuff, looking forward to it.
> > >> >
> > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> > >> >> where input constant is used to characterize some pointer returned
> > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> > >> >> that for bpf_dynptr for exactly this "give me direct access of N
> > >> >> bytes, if possible" case. So improving/generalizing it now before
> > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> > >> >> feature itself.
> > >> >
> > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> > >> > changes into a separate patch, and wait for dynptr to be posted before reworking
> > >> > this.
> > >>
> > >> This does raise the question of what we do in the meantime, though? Your
> > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> > >> making it, really has to go in before those hit a release and become
> > >> UAPI.
> > >>
> > >> One option would be to still make the change to those other helpers;
> > >> they'd become a bit slower, but if we have a solution for that coming,
> > >> that may be OK for a single release? WDYT?
> > >
> > > I must have missed important changes to bpf_xdp_{load,store}_bytes().
> > > Does anything change about its behavior? If there are some fixes
> > > specific to those helpers, we should fix them as well as a separate
> > > patch. My main objection is adding a bpf_packet_pointer() special case
> > > when we have a generic mechanism in the works that will come this use
> > > case (among other use cases).
> >
> > Well it's not a functional change per se, but Kartikeya's patch is
> > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
> > use of the bpf_xdp_pointer()) in favour of making it available directly
> > to BPF. So if we don't do that change before those helpers are
> > finalised, we will end up either introducing a performance regression
> > for code using those helpers, or being stuck with the bpf_xdp_pointer()
> > use inside them even though it makes more sense to move it out to BPF.
> >
> > So the "safe" thing to do would do the change to the store/load helpers
> > now, and get rid of the bpf_xdp_pointer() entirely until it can be
> > introduced as a BPF helper in a generic way. Of course this depends on
> > whether you consider performance regressions to be something to avoid,
> > but this being XDP IMO we should :)
>
> I don't follow this logic.
> Would you mean by "get rid of the bpf_xdp_pointer" ?
> It's just an internal static function.
>
> Also I don't believe that this patch set and exposing
> bpf_xdp_pointer as a helper actually gives measurable performance wins.
> It looks quirky to me and hard to use.

This is actually inspired from your idea to avoid memcpy when reading and
writing to multi-buff XDP [0]. But instead of passing in the stack or mem
pointer (as discussed in that thread), I let the user set it and detect it
themselves, which makes the implementation simpler.

I am sure accessing a few bytes directly is going to be faster than first
memcpy'ing it to a local buffer, reading, and then possibly writing things
back again using a memcpy, but I will be happy to come with some numbers when
I respin this later, when Joanne posts the dynptr series.

Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf
pointer' idea, but then we have to conservatively invalidate the pointer even if
it points to stack buffer on clear_all_pkt_pointers. The current approach looked
better to me.

  [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT
  2022-03-10 23:09         ` Andrii Nakryiko
@ 2022-03-11  7:31           ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-11  7:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Toke Høiland-Jørgensen,
	Jesper Dangaard Brouer, Lorenzo Bianconi, John Fastabend,
	Jakub Kicinski, Lorenz Bauer, Networking

On Fri, Mar 11, 2022 at 04:39:40AM IST, Andrii Nakryiko wrote:
> On Thu, Mar 10, 2022 at 3:05 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Mar 7, 2022 at 10:26 PM Kumar Kartikeya Dwivedi
> > <memxor@gmail.com> wrote:
> > >
> > > On Tue, Mar 08, 2022 at 11:12:13AM IST, Andrii Nakryiko wrote:
> > > > On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > > > >
> > > > > In the next patch, we will introduce a new helper 'bpf_packet_pointer'
> > > > > that takes offset and len and returns a packet pointer. There we want to
> > > > > statically enforce offset is in range [0, 0xffff], and that len is a
> > > > > constant value, in range [1, 0xffff]. This also helps us avoid a
> > > > > pointless runtime check. To make these checks possible, we need to
> > > > > ensure we only get a scalar type. Although a lot of other argument types
> > > > > take scalars, their intent is different. Hence add general ARG_SCALAR
> > > > > and ARG_CONSTANT types, where the latter is also checked to be constant
> > > > > in addition to being scalar.
> > > > >
> > > > > Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
> > > > > ---
> > > > >  include/linux/bpf.h   |  2 ++
> > > > >  kernel/bpf/verifier.c | 13 +++++++++++++
> > > > >  2 files changed, 15 insertions(+)
> > > > >
> > > > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > > > index 88449fbbe063..7841d90b83df 100644
> > > > > --- a/include/linux/bpf.h
> > > > > +++ b/include/linux/bpf.h
> > > > > @@ -391,6 +391,8 @@ enum bpf_arg_type {
> > > > >         ARG_PTR_TO_STACK,       /* pointer to stack */
> > > > >         ARG_PTR_TO_CONST_STR,   /* pointer to a null terminated read-only string */
> > > > >         ARG_PTR_TO_TIMER,       /* pointer to bpf_timer */
> > > > > +       ARG_SCALAR,             /* a scalar with any value(s) */
> > > >
> > > > What's the difference between ARG_ANYTHING and ARG_SCALAR?
> > > >
> > >
> > > ARG_SCALAR only accepts reg->type == SCALAR, ARG_ANYTHING accepts anything as
> > > long as reg->type != NOT_INIT (due to SRC_OP for check_reg_arg and early return
> > > without further checks).
> > >
> >
> > Ah, ok, didn't realize that it's not always scalar for ARG_ANYTHING
> >
> >
> > > > > +       ARG_CONSTANT,           /* a scalar with constant value */
> > > >
> > > > This ARG_CONSTANT serves a very similar purpose as
> > > > ARG_CONST_ALLOC_SIZE_OR_ZERO, tbh. The only difference is that one is
> > > > used to set meta->mem_size and this one is used (through extra func_id
> > > > special handling) to set meta->ret_pkt_len. But meta->mem_size and
> > > > meta->ret_pkt_len mean the same thing: how many bytes are directly
> > > > accessible through a pointer returned from the helper. So I feel like
> > > > there is some opportunity to unify and generalize, instead of adding
> > > > more custom variants of constants. WDYT?
> > > >
> > >
> > > I see, indeed it would make sense to make both equivalent, since
> > > CONST_ALLOC_SIZE must also be a constant. Joanne also mentioned consolidating,
> > > but I didn't understand how that would work for ARG_CONSTANT and ARG_CONST_SIZE
> > > ones.
> > >
> > > I'm wondering whether we can take a step back and should go with the following
> > > convention:
> > >
> > > ARG_MEM_SIZE, and two type flags, ARG_ZERO | ARG_CONSTANT
> > >
> > > Old                             New (in bpf_func_proto)
> > > -------------------------------------------------------------------
> > > ARG_CONST_SIZE                  ARG_MEM_SIZE
> > > ARG_CONST_SIZE_OR_ZERO          ARG_MEM_SIZE | ARG_ZERO
> > > ARG_CONST_ALLOC_SIZE            ARG_MEM_SIZE | ARG_CONST
> > > ARG_CONST_ALLOC_SIZE_OR_ZERO    ARG_MEM_SIZE | ARG_CONST | ARG_ZERO
> > > ARG_CONSTANT (mine)             ARG_MEM_SIZE | ARG_CONST
> > >
> >
> > I think using "ARG_MEM_SIZE" as part of ARG_CONSTANT is backwards and
> > misleading. It makes more sense to me to have ARG_CONSTANT and use
> > ARG_ZERO (or rather ARG_MAYBE_ZERO?) and ARG_MEM_SIZE (to specify that
> > this constant is describing the size of memory of a pointer that is
> > passed in a previous argument).
> >
> > Basically, something like:
> >
> > ARG_CONST_SIZE => ARG_CONSTANT | ARG_MEM_SIZE
> > ARG_CONST_SIZE_OR_ZERO => ARG_CONSTANT | ARG_MEM_SIZE | ARG_MAYBE_ZERO
> >
> > Then we can replace ARG_CONST_ALLOC_SIZE and
> > ARG_CONST_ALLOC_SIZE_OR_ZERO with ARG_CONSTANT  and ARG_CONSTANT |
> > ARG_MAYBE_ZERO and we'll have a bit of special case to handle
> > bpf_ringbuf_reserve.
> >
> > For ARG_CONSTANT, verifier will remember the value in
> > bpf_call_arg_meta, and then we can use it as necessary (e.g., instead
> > of mem_size when ARG_MEM_SIZE is specified) depending on context,
> > helper being called, etc.
> >
> > Adding ARG_CONST just makes no sense as we always want constant value,
> > otherwise it might as well be just ARG_ANYTHING, right?
>
> Re-reading this, this paragraph is very confusing (especially taking
> into account what I wrote above). My point was that in your table, you
> have ARG_MEM_SIZE as a "base type" and ARG_CONST as "modifier". And
> that makes little sense to me, because in all cases we have a
> constant, but not in all cases we use that constant to describe the
> size of memory passed in a previous argument. So I inverted that,
> ARG_CONSTANT as "base type", ARG_MEM_SIZE and ARG_MAYBE_ZERO as
> modifiers. And we then don't need 5 different resulting types because
> "CONST_ALLOC_SIZE" handling is just a custom constant handling for
> bpf_ringbuf_reserve. Just like for your use case you wanted to use
> plain ARG_CONSTANT and add some extra logic for your
> bpf_packet_pointer(). I hope this clarifies it a bit.
>

Makes sense, I'll split it out as a separate change. Thanks!

> >
> > I haven't spent much time thinking about this, though, so I'm probably
> > missing something.
> >
> >
> > > When we detect ARG_CONST, we always set meta->mem_size, which can be used to
> > > refine returned pointer range, otherwise meta->mem_size = -1 by default (so it
> > > will be -1 for the !tnum_is_const(reg->var_off) case).
> > >
> > > if (arg_type & ARG_CONST)
> > >         meta->mem_size = reg->var_off.value;
> > >         if (!(arg_type & ARG_ZERO) && !meta->mem_size)
> > >                 // error
> > >
> > > The check_mem_size_reg call is only made when we see that previous reg was
> > > ARG_PTR_TO_MEM. When preceding argument is not ARG_PTR_TO_MEM, we error if
> > > ARG_CONST is not set for ARG_MEM_SIZE (so that either the mem size is for
> > > previous parameter, or otherwise a constant size for the returned pointer).
> > > We can also only allow certain pointer return types for that case.
> > >
> > > If that is too much automagic, we can also discern using ARG_MEM_SIZE vs
> > > ARG_RET_MEM_SIZE, but I think the above is fine.
> > >
> > > ARG_CONST ofcourse only applies to args taking scalar type.
> > >
> > > >
> > > >
> > > > >         __BPF_ARG_TYPE_MAX,
> > > > >
> > > > >         /* Extended arg_types. */
> > > > > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > > > > index ec3a7b6c9515..0373d5bd240f 100644
> > > > > --- a/kernel/bpf/verifier.c
> > > > > +++ b/kernel/bpf/verifier.c
> > > > > @@ -5163,6 +5163,12 @@ static bool arg_type_is_int_ptr(enum bpf_arg_type type)
> > > > >                type == ARG_PTR_TO_LONG;
> > > > >  }
> > > > >
> > > > > +static bool arg_type_is_scalar(enum bpf_arg_type type)
> > > > > +{
> > > > > +       return type == ARG_SCALAR ||
> > > > > +              type == ARG_CONSTANT;
> > > > > +}
> > > > > +
> > > > >  static int int_ptr_type_to_size(enum bpf_arg_type type)
> > > > >  {
> > > > >         if (type == ARG_PTR_TO_INT)
> > > > > @@ -5302,6 +5308,8 @@ static const struct bpf_reg_types *compatible_reg_types[__BPF_ARG_TYPE_MAX] = {
> > > > >         [ARG_PTR_TO_STACK]              = &stack_ptr_types,
> > > > >         [ARG_PTR_TO_CONST_STR]          = &const_str_ptr_types,
> > > > >         [ARG_PTR_TO_TIMER]              = &timer_types,
> > > > > +       [ARG_SCALAR]                    = &scalar_types,
> > > > > +       [ARG_CONSTANT]                  = &scalar_types,
> > > > >  };
> > > > >
> > > > >  static int check_reg_type(struct bpf_verifier_env *env, u32 regno,
> > > > > @@ -5635,6 +5643,11 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> > > > >                         verbose(env, "string is not zero-terminated\n");
> > > > >                         return -EINVAL;
> > > > >                 }
> > > > > +       } else if (arg_type_is_scalar(arg_type)) {
> > > > > +               if (arg_type == ARG_CONSTANT && !tnum_is_const(reg->var_off)) {
> > > > > +                       verbose(env, "R%d is not a known constant\n", regno);
> > > > > +                       return -EACCES;
> > > > > +               }
> > > > >         }
> > > > >
> > > > >         return err;
> > > > > --
> > > > > 2.35.1
> > > > >
> > >
> > > --
> > > Kartikeya

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-11  7:25             ` Kumar Kartikeya Dwivedi
@ 2022-03-11  7:32               ` Kumar Kartikeya Dwivedi
  2022-03-11 17:27               ` Alexei Starovoitov
  1 sibling, 0 replies; 24+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-03-11  7:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Jesper Dangaard Brouer, Lorenzo Bianconi,
	John Fastabend, Jakub Kicinski, Lorenz Bauer, Networking

On Fri, Mar 11, 2022 at 12:55:45PM IST, Kumar Kartikeya Dwivedi wrote:
> On Fri, Mar 11, 2022 at 05:05:24AM IST, Alexei Starovoitov wrote:
> > On Thu, Mar 10, 2022 at 3:30 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > >
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
> > >
> > > > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > > >>
> > > >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
> > > >>
> > > >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
> > > >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > > >> >> >
> > > >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
> > > >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
> > > >> >> > enable DPA without having to use memcpy (currently the case in
> > > >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
> > > >> >> >
> > > >> >> > The intended usage to read and write data for multi-buff XDP is:
> > > >> >> >
> > > >> >> >         int err = 0;
> > > >> >> >         char buf[N];
> > > >> >> >
> > > >> >> >         off &= 0xffff;
> > > >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
> > > >> >> >         if (unlikely(!ptr)) {
> > > >> >> >                 if (err < 0)
> > > >> >> >                         return XDP_ABORTED;
> > > >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
> > > >> >> >                 if (err < 0)
> > > >> >> >                         return XDP_ABORTED;
> > > >> >> >                 ptr = buf;
> > > >> >> >         }
> > > >> >> >         ...
> > > >> >> >         // Do some stores and loads in [ptr, ptr + N) region
> > > >> >> >         ...
> > > >> >> >         if (unlikely(ptr == buf)) {
> > > >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
> > > >> >> >                 if (err < 0)
> > > >> >> >                         return XDP_ABORTED;
> > > >> >> >         }
> > > >> >> >
> > > >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
> > > >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
> > > >> >> > it is also more meaningful to the user to see return value as R0=pkt.
> > > >> >> >
> > > >> >> > This series is meant to collect feedback on the approach, next version can
> > > >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
> > > >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
> > > >> >> > like check_mem_region_access (var_off + off < range), since there would be no
> > > >> >> > data_end to compare against and obtain a new range.
> > > >> >> >
> > > >> >> > The common name and func_id is supposed to allow writing generic code using
> > > >> >> > bpf_packet_pointer that works for both XDP and TC programs.
> > > >> >> >
> > > >> >> > Please see the individual patches for implementation details.
> > > >> >> >
> > > >> >>
> > > >> >> Joanne is working on a "bpf_dynptr" framework that will support
> > > >> >> exactly this feature, in addition to working with dynamically
> > > >> >> allocated memory, working with memory of statically unknown size (but
> > > >> >> safe and checked at runtime), etc. And all that within a generic
> > > >> >> common feature implemented uniformly within the verifier. E.g., it
> > > >> >> won't need any of the custom bits of logic added in patch #2 and #3.
> > > >> >> So I'm thinking that instead of custom-implementing a partial case of
> > > >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
> > > >> >> and do it only once there?
> > > >> >>
> > > >> >
> > > >> > Interesting stuff, looking forward to it.
> > > >> >
> > > >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
> > > >> >> where input constant is used to characterize some pointer returned
> > > >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
> > > >> >> that for bpf_dynptr for exactly this "give me direct access of N
> > > >> >> bytes, if possible" case. So improving/generalizing it now before
> > > >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
> > > >> >> feature itself.
> > > >> >
> > > >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
> > > >> > changes into a separate patch, and wait for dynptr to be posted before reworking
> > > >> > this.
> > > >>
> > > >> This does raise the question of what we do in the meantime, though? Your
> > > >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
> > > >> making it, really has to go in before those hit a release and become
> > > >> UAPI.
> > > >>
> > > >> One option would be to still make the change to those other helpers;
> > > >> they'd become a bit slower, but if we have a solution for that coming,
> > > >> that may be OK for a single release? WDYT?
> > > >
> > > > I must have missed important changes to bpf_xdp_{load,store}_bytes().
> > > > Does anything change about its behavior? If there are some fixes
> > > > specific to those helpers, we should fix them as well as a separate
> > > > patch. My main objection is adding a bpf_packet_pointer() special case
> > > > when we have a generic mechanism in the works that will come this use
> > > > case (among other use cases).
> > >
> > > Well it's not a functional change per se, but Kartikeya's patch is
> > > removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
> > > use of the bpf_xdp_pointer()) in favour of making it available directly
> > > to BPF. So if we don't do that change before those helpers are
> > > finalised, we will end up either introducing a performance regression
> > > for code using those helpers, or being stuck with the bpf_xdp_pointer()
> > > use inside them even though it makes more sense to move it out to BPF.
> > >
> > > So the "safe" thing to do would do the change to the store/load helpers
> > > now, and get rid of the bpf_xdp_pointer() entirely until it can be
> > > introduced as a BPF helper in a generic way. Of course this depends on
> > > whether you consider performance regressions to be something to avoid,
> > > but this being XDP IMO we should :)
> >
> > I don't follow this logic.
> > Would you mean by "get rid of the bpf_xdp_pointer" ?
> > It's just an internal static function.
> >
> > Also I don't believe that this patch set and exposing
> > bpf_xdp_pointer as a helper actually gives measurable performance wins.
> > It looks quirky to me and hard to use.
>
> This is actually inspired from your idea to avoid memcpy when reading and
> writing to multi-buff XDP [0]. But instead of passing in the stack or mem
> pointer (as discussed in that thread), I let the user set it and detect it
> themselves, which makes the implementation simpler.
>
> I am sure accessing a few bytes directly is going to be faster than first
> memcpy'ing it to a local buffer, reading, and then possibly writing things
> back again using a memcpy, but I will be happy to come with some numbers when
> I respin this later, when Joanne posts the dynptr series.
>
> Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf
> pointer' idea, but then we have to conservatively invalidate the pointer even if
> it points to stack buffer on clear_all_pkt_pointers. The current approach looked
> better to me.
>
>   [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com
>

This is probably the correct link:
https://lore.kernel.org/bpf/CAADnVQ+XXGUxzqMdbPMYf+t_ViDkqvGDdogrmv-wH-dckzujLw@mail.gmail.com

> --
> Kartikeya

--
Kartikeya

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-11  7:19           ` Kumar Kartikeya Dwivedi
@ 2022-03-11 10:34             ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 24+ messages in thread
From: Toke Høiland-Jørgensen @ 2022-03-11 10:34 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Andrii Nakryiko, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Martin KaFai Lau, Jesper Dangaard Brouer,
	Lorenzo Bianconi, John Fastabend, Jakub Kicinski, Lorenz Bauer,
	Networking

Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:

> On Fri, Mar 11, 2022 at 05:00:42AM IST, Toke Høiland-Jørgensen wrote:
>> Andrii Nakryiko <andrii.nakryiko@gmail.com> writes:
>>
>> > On Tue, Mar 8, 2022 at 5:40 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Kumar Kartikeya Dwivedi <memxor@gmail.com> writes:
>> >>
>> >> > On Tue, Mar 08, 2022 at 11:18:52AM IST, Andrii Nakryiko wrote:
>> >> >> On Sun, Mar 6, 2022 at 3:43 PM Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> >> >> >
>> >> >> > Expose existing 'bpf_xdp_pointer' as a BPF helper named 'bpf_packet_pointer'
>> >> >> > returning a packet pointer with a fixed immutable range. This can be useful to
>> >> >> > enable DPA without having to use memcpy (currently the case in
>> >> >> > bpf_xdp_load_bytes and bpf_xdp_store_bytes).
>> >> >> >
>> >> >> > The intended usage to read and write data for multi-buff XDP is:
>> >> >> >
>> >> >> >         int err = 0;
>> >> >> >         char buf[N];
>> >> >> >
>> >> >> >         off &= 0xffff;
>> >> >> >         ptr = bpf_packet_pointer(ctx, off, sizeof(buf), &err);
>> >> >> >         if (unlikely(!ptr)) {
>> >> >> >                 if (err < 0)
>> >> >> >                         return XDP_ABORTED;
>> >> >> >                 err = bpf_xdp_load_bytes(ctx, off, buf, sizeof(buf));
>> >> >> >                 if (err < 0)
>> >> >> >                         return XDP_ABORTED;
>> >> >> >                 ptr = buf;
>> >> >> >         }
>> >> >> >         ...
>> >> >> >         // Do some stores and loads in [ptr, ptr + N) region
>> >> >> >         ...
>> >> >> >         if (unlikely(ptr == buf)) {
>> >> >> >                 err = bpf_xdp_store_bytes(ctx, off, buf, sizeof(buf));
>> >> >> >                 if (err < 0)
>> >> >> >                         return XDP_ABORTED;
>> >> >> >         }
>> >> >> >
>> >> >> > Note that bpf_packet_pointer returns a PTR_TO_PACKET, not PTR_TO_MEM, because
>> >> >> > these pointers need to be invalidated on clear_all_pkt_pointers invocation, and
>> >> >> > it is also more meaningful to the user to see return value as R0=pkt.
>> >> >> >
>> >> >> > This series is meant to collect feedback on the approach, next version can
>> >> >> > include a bpf_skb_pointer and exposing it as bpf_packet_pointer helper for TC
>> >> >> > hooks, and explore not resetting range to zero on r0 += rX, instead check access
>> >> >> > like check_mem_region_access (var_off + off < range), since there would be no
>> >> >> > data_end to compare against and obtain a new range.
>> >> >> >
>> >> >> > The common name and func_id is supposed to allow writing generic code using
>> >> >> > bpf_packet_pointer that works for both XDP and TC programs.
>> >> >> >
>> >> >> > Please see the individual patches for implementation details.
>> >> >> >
>> >> >>
>> >> >> Joanne is working on a "bpf_dynptr" framework that will support
>> >> >> exactly this feature, in addition to working with dynamically
>> >> >> allocated memory, working with memory of statically unknown size (but
>> >> >> safe and checked at runtime), etc. And all that within a generic
>> >> >> common feature implemented uniformly within the verifier. E.g., it
>> >> >> won't need any of the custom bits of logic added in patch #2 and #3.
>> >> >> So I'm thinking that instead of custom-implementing a partial case of
>> >> >> bpf_dynptr just for skb and xdp packets, let's maybe wait for dynptr
>> >> >> and do it only once there?
>> >> >>
>> >> >
>> >> > Interesting stuff, looking forward to it.
>> >> >
>> >> >> See also my ARG_CONSTANT comment. It seems like a pretty common thing
>> >> >> where input constant is used to characterize some pointer returned
>> >> >> from the helper (e.g., bpf_ringbuf_reserve() case), and we'll need
>> >> >> that for bpf_dynptr for exactly this "give me direct access of N
>> >> >> bytes, if possible" case. So improving/generalizing it now before
>> >> >> dynptr lands makes a lot of sense, outside of bpf_packet_pointer()
>> >> >> feature itself.
>> >> >
>> >> > No worries, we can continue the discussion in patch 1, I'll split out the arg
>> >> > changes into a separate patch, and wait for dynptr to be posted before reworking
>> >> > this.
>> >>
>> >> This does raise the question of what we do in the meantime, though? Your
>> >> patch includes a change to bpf_xdp_{load,store}_bytes() which, if we're
>> >> making it, really has to go in before those hit a release and become
>> >> UAPI.
>> >>
>> >> One option would be to still make the change to those other helpers;
>> >> they'd become a bit slower, but if we have a solution for that coming,
>> >> that may be OK for a single release? WDYT?
>> >
>> > I must have missed important changes to bpf_xdp_{load,store}_bytes().
>> > Does anything change about its behavior? If there are some fixes
>> > specific to those helpers, we should fix them as well as a separate
>> > patch. My main objection is adding a bpf_packet_pointer() special case
>> > when we have a generic mechanism in the works that will come this use
>> > case (among other use cases).
>>
>> Well it's not a functional change per se, but Kartikeya's patch is
>> removing an optimisation from bpf_xdp_{load_store}_bytes() (i.e., the
>> use of the bpf_xdp_pointer()) in favour of making it available directly
>> to BPF. So if we don't do that change before those helpers are
>> finalised, we will end up either introducing a performance regression
>> for code using those helpers, or being stuck with the bpf_xdp_pointer()
>> use inside them even though it makes more sense to move it out to BPF.
>>
>
> So IIUC, the case we're worried about is when a linear region is in head or a
> frag and bpf_xdp_pointer can be used to do a direct memcpy for it. But in my
> testing there doesn't seem to be any difference. With or without the call, the
> time taken e.g. for bpf_xdp_load_bytes lies in the 30-40ns range. It would make
> sense, because for this case the code in bpf_xdp_pointer and bpf_xdp_copy_buf
> are almost the same, just that the latter has a conditional jump out of the loop
> based on len. bpf_xdp_copy_buf is still only doing a single memcpy, the cost
> seems to be dominated by that.
>
> Otoh, removing it would improve the case for the other scenario (when region
> touches two or more frags) because we wouldn't spend time in bpf_xdp_pointer and
> returning NULL from it failing to find a linear region, but that shouldn't be a
> regression.

Yeah, that was basically what I was worried about; thanks for testing!
So this implies that the current use of the bpf_xdp_pointer() helper
function is pretty pointless, right? But at least it's an internal
detail so there's no hurry in fixing it...

-Toke


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper
  2022-03-11  7:25             ` Kumar Kartikeya Dwivedi
  2022-03-11  7:32               ` Kumar Kartikeya Dwivedi
@ 2022-03-11 17:27               ` Alexei Starovoitov
  1 sibling, 0 replies; 24+ messages in thread
From: Alexei Starovoitov @ 2022-03-11 17:27 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: Toke Høiland-Jørgensen, Andrii Nakryiko, bpf,
	Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Jesper Dangaard Brouer, Lorenzo Bianconi,
	John Fastabend, Jakub Kicinski, Lorenz Bauer, Networking

On Thu, Mar 10, 2022 at 11:25 PM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
> >
> > Also I don't believe that this patch set and exposing
> > bpf_xdp_pointer as a helper actually gives measurable performance wins.
> > It looks quirky to me and hard to use.
>
> This is actually inspired from your idea to avoid memcpy when reading and
> writing to multi-buff XDP [0]. But instead of passing in the stack or mem
> pointer (as discussed in that thread), I let the user set it and detect it
> themselves, which makes the implementation simpler.
>
> I am sure accessing a few bytes directly is going to be faster than first
> memcpy'ing it to a local buffer, reading, and then possibly writing things
> back again using a memcpy, but I will be happy to come with some numbers when
> I respin this later, when Joanne posts the dynptr series.
>
> Ofcourse, we could just make return value PTR_TO_MEM even for the 'pass buf
> pointer' idea, but then we have to conservatively invalidate the pointer even if
> it points to stack buffer on clear_all_pkt_pointers. The current approach looked
> better to me.
>
>   [0]: https://lore.kernel.org/bpf/CAADnVQKbrkOxfNoixUx-RLJEWULJLyhqjZ=M_X2cFG_APwNyCg@mail.gmail.com

There is a big difference in what was proposed earlier
vs what you've implemented.
In that proposal bpf_packet_pointer would behave similar to
skb_header_pointer. The kernel developers are familiar with it
and the concept is tested by time.
While your bpf_packet_pointer is different.
It fails on frag boundaries, so the user has to open code
the checks and add explicit bpf_xdp_load_bytes.
I guess you're arguing that it's the same.
My point that it's a big conceptual difference in api.
This kind of difference should be discussed instead
of dumping a patch set. Especially without RFC tag.

Please focus on kptr patches. They were close enough and have
a chance of landing in this release.
This bpf_packet_pointer stuff is certainly not going to make it.

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2022-03-11 17:27 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-06 23:43 [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Kumar Kartikeya Dwivedi
2022-03-06 23:43 ` [PATCH bpf-next v1 1/5] bpf: Add ARG_SCALAR and ARG_CONSTANT Kumar Kartikeya Dwivedi
2022-03-07 19:28   ` Joanne Koong
2022-03-07 21:22     ` Kumar Kartikeya Dwivedi
2022-03-08  5:42   ` Andrii Nakryiko
2022-03-08  6:26     ` Kumar Kartikeya Dwivedi
2022-03-10 23:05       ` Andrii Nakryiko
2022-03-10 23:09         ` Andrii Nakryiko
2022-03-11  7:31           ` Kumar Kartikeya Dwivedi
2022-03-06 23:43 ` [PATCH bpf-next v1 2/5] bpf: Introduce pkt_uid concept for PTR_TO_PACKET Kumar Kartikeya Dwivedi
2022-03-06 23:43 ` [PATCH bpf-next v1 3/5] bpf: Introduce bpf_packet_pointer helper to do DPA Kumar Kartikeya Dwivedi
2022-03-06 23:43 ` [PATCH bpf-next v1 4/5] selftests/bpf: Add verifier tests for pkt pointer with pkt_uid Kumar Kartikeya Dwivedi
2022-03-06 23:43 ` [PATCH bpf-next v1 5/5] selftests/bpf: Update xdp_adjust_frags to use bpf_packet_pointer Kumar Kartikeya Dwivedi
2022-03-08  5:48 ` [PATCH bpf-next v1 0/5] Introduce bpf_packet_pointer helper Andrii Nakryiko
2022-03-08  7:08   ` Kumar Kartikeya Dwivedi
2022-03-08 13:40     ` Toke Høiland-Jørgensen
2022-03-10 23:12       ` Andrii Nakryiko
2022-03-10 23:30         ` Toke Høiland-Jørgensen
2022-03-10 23:35           ` Alexei Starovoitov
2022-03-11  7:25             ` Kumar Kartikeya Dwivedi
2022-03-11  7:32               ` Kumar Kartikeya Dwivedi
2022-03-11 17:27               ` Alexei Starovoitov
2022-03-11  7:19           ` Kumar Kartikeya Dwivedi
2022-03-11 10:34             ` Toke Høiland-Jørgensen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).