From: sdf@google.com
To: Joanne Koong <joannelkoong@gmail.com>
Cc: bpf@vger.kernel.org, andrii@kernel.org, daniel@iogearbox.net,
	ast@kernel.org
Subject: Re: [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs
Date: Wed, 27 Jul 2022 10:13:59 -0700	[thread overview]
Message-ID: <YuFyVwiFkrKjSmFN@google.com> (raw)
In-Reply-To: <20220726184706.954822-2-joannelkoong@gmail.com>

On 07/26, Joanne Koong wrote:
> Add skb dynptrs, which are dynptrs whose underlying pointer points
> to a skb. The dynptr acts on skb data. skb dynptrs have two main
> benefits. One is that they allow operations on sizes that are not
> statically known at compile-time (eg variable-sized accesses).
> Another is that parsing the packet data through dynptrs (instead of
> through direct access of skb->data and skb->data_end) can be more
> ergonomic and less brittle (eg does not need manual if checking for
> being within bounds of data_end).

> For bpf prog types that don't support writes on skb data, the dynptr is
> read-only (writes and data slices are not permitted). Reads on the
> dynptr may access data in the non-linear paged buffers, but for writes
> and data slices, if the data is in a paged buffer, the user must first
> call bpf_skb_pull_data to pull the data into the linear portion.

> Additionally, any helper calls that change the underlying packet buffer
> (eg bpf_skb_pull_data) invalidate any data slices of the associated
> dynptr.

> Right now, skb dynptrs can only be constructed from skbs that are
> the bpf program context - as such, there does not need to be any
> reference tracking or release on skb dynptrs.

> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
>   include/linux/bpf.h            |  8 ++++-
>   include/linux/filter.h         |  4 +++
>   include/uapi/linux/bpf.h       | 42 ++++++++++++++++++++++++--
>   kernel/bpf/helpers.c           | 54 +++++++++++++++++++++++++++++++++-
>   kernel/bpf/verifier.c          | 43 +++++++++++++++++++++++----
>   net/core/filter.c              | 53 ++++++++++++++++++++++++++++++---
>   tools/include/uapi/linux/bpf.h | 42 ++++++++++++++++++++++++--
>   7 files changed, 229 insertions(+), 17 deletions(-)

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 20c26aed7896..7fbd4324c848 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -407,11 +407,14 @@ enum bpf_type_flag {
>   	/* Size is known at compile time. */
>   	MEM_FIXED_SIZE		= BIT(10 + BPF_BASE_TYPE_BITS),

> +	/* DYNPTR points to sk_buff */
> +	DYNPTR_TYPE_SKB		= BIT(11 + BPF_BASE_TYPE_BITS),
> +
>   	__BPF_TYPE_FLAG_MAX,
>   	__BPF_TYPE_LAST_FLAG	= __BPF_TYPE_FLAG_MAX - 1,
>   };

> -#define DYNPTR_TYPE_FLAG_MASK	(DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF)
> -#define DYNPTR_TYPE_FLAG_MASK	(DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF)
> +#define DYNPTR_TYPE_FLAG_MASK	(DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF | DYNPTR_TYPE_SKB)

>   /* Max number of base types. */
>   #define BPF_BASE_TYPE_LIMIT	(1UL << BPF_BASE_TYPE_BITS)
> @@ -2556,12 +2559,15 @@ enum bpf_dynptr_type {
>   	BPF_DYNPTR_TYPE_LOCAL,
>   	/* Underlying data is a ringbuf record */
>   	BPF_DYNPTR_TYPE_RINGBUF,
> +	/* Underlying data is a sk_buff */
> +	BPF_DYNPTR_TYPE_SKB,
>   };

>   void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
>   		     enum bpf_dynptr_type type, u32 offset, u32 size);
>   void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
>   int bpf_dynptr_check_size(u32 size);
> +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr);

>   #ifdef CONFIG_BPF_LSM
>   void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index a5f21dc3c432..649063d9cbfd 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -1532,4 +1532,8 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind
>   	return XDP_REDIRECT;
>   }

> +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len);
> +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from,
> +			  u32 len, u64 flags);
> +
>   #endif /* __LINUX_FILTER_H__ */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 59a217ca2dfd..0730cd198a7f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5241,11 +5241,22 @@ union bpf_attr {
>    *	Description
>    *		Write *len* bytes from *src* into *dst*, starting from *offset*
>    *		into *dst*.
> - *		*flags* is currently unused.
> + *
> + *		*flags* must be 0 except for skb-type dynptrs.
> + *
> + *		For skb-type dynptrs:
> + *		    *  if *offset* + *len* extends into the skb's paged buffers, the user
> + *		       should manually pull the skb with bpf_skb_pull_data and then try again.
> + *
> + *		    *  *flags* are a combination of **BPF_F_RECOMPUTE_CSUM** (automatically
> + *			recompute the checksum for the packet after storing the bytes) and
> + *			**BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
> + *			**->swhash** and *skb*\ **->l4hash** to 0).
>    *	Return
>    *		0 on success, -E2BIG if *offset* + *len* exceeds the length
>    *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
> - *		is a read-only dynptr or if *flags* is not 0.
> + *		is a read-only dynptr or if *flags* is not correct, -EAGAIN if, for
> + *		skb-type dynptrs, the write extends into the skb's paged buffers.
>    *
>    * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
>    *	Description
> @@ -5253,10 +5264,19 @@ union bpf_attr {
>    *
>    *		*len* must be a statically known value. The returned data slice
>    *		is invalidated whenever the dynptr is invalidated.
> + *
> + *		For skb-type dynptrs:
> + *		    * if *offset* + *len* extends into the skb's paged buffers,
> + *		      the user should manually pull the skb with bpf_skb_pull_data and then
> + *		      try again.
> + *
> + *		    * the data slice is automatically invalidated anytime a
> + *		      helper call that changes the underlying packet buffer
> + *		      (eg bpf_skb_pull_data) is called.
>    *	Return
>    *		Pointer to the underlying dynptr data, NULL if the dynptr is
>    *		read-only, if the dynptr is invalid, or if the offset and length
> - *		is out of bounds.
> + *		is out of bounds or in a paged buffer for skb-type dynptrs.
>    *
>    * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th, u32 th_len)
>    *	Description
> @@ -5331,6 +5351,21 @@ union bpf_attr {
>    *		**-EACCES** if the SYN cookie is not valid.
>    *
>    *		**-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
> + *
> + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct bpf_dynptr *ptr)
> + *	Description
> + *		Get a dynptr to the data in *skb*. *skb* must be the BPF program
> + *		context. Depending on program type, the dynptr may be read-only,
> + *		in which case trying to obtain a direct data slice to it through
> + *		bpf_dynptr_data will return an error.
> + *
> + *		Calls that change the *skb*'s underlying packet buffer
> + *		(eg bpf_skb_pull_data) do not invalidate the dynptr, but they do
> + *		invalidate any data slices associated with the dynptr.
> + *
> + *		*flags* is currently unused; it must be 0 for now.
> + *	Return
> + *		0 on success or -EINVAL if flags is not 0.
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -5541,6 +5576,7 @@ union bpf_attr {
>   	FN(tcp_raw_gen_syncookie_ipv6),	\
>   	FN(tcp_raw_check_syncookie_ipv4),	\
>   	FN(tcp_raw_check_syncookie_ipv6),	\
> +	FN(dynptr_from_skb),		\
>   	/* */

>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 1f961f9982d2..21a806057e9e 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1425,11 +1425,21 @@ static bool bpf_dynptr_is_rdonly(struct bpf_dynptr_kern *ptr)
>   	return ptr->size & DYNPTR_RDONLY_BIT;
>   }

> +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr)
> +{
> +	ptr->size |= DYNPTR_RDONLY_BIT;
> +}
> +
>   static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum bpf_dynptr_type type)
>   {
>   	ptr->size |= type << DYNPTR_TYPE_SHIFT;
>   }

> +static enum bpf_dynptr_type bpf_dynptr_get_type(const struct bpf_dynptr_kern *ptr)
> +{
> +	return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
> +}
> +
>   static u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
>   {
>   	return ptr->size & DYNPTR_SIZE_MASK;
> @@ -1500,6 +1510,7 @@ static const struct bpf_func_proto bpf_dynptr_from_mem_proto = {
>   BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src,
>   	   u32, offset, u64, flags)
>   {
> +	enum bpf_dynptr_type type;
>   	int err;

>   	if (!src->data || flags)
> @@ -1509,6 +1520,11 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct bpf_dynptr_kern *, src
>   	if (err)
>   		return err;

> +	type = bpf_dynptr_get_type(src);
> +
> +	if (type == BPF_DYNPTR_TYPE_SKB)
> +		return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len);
> +
>   	memcpy(dst, src->data + src->offset + offset, len);

>   	return 0;
> @@ -1528,15 +1544,38 @@ static const struct bpf_func_proto bpf_dynptr_read_proto = {
>   BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset, void *, src,
>   	   u32, len, u64, flags)
>   {
> +	enum bpf_dynptr_type type;
>   	int err;

> -	if (!dst->data || flags || bpf_dynptr_is_rdonly(dst))
> +	if (!dst->data || bpf_dynptr_is_rdonly(dst))
>   		return -EINVAL;

>   	err = bpf_dynptr_check_off_len(dst, offset, len);
>   	if (err)
>   		return err;

> +	type = bpf_dynptr_get_type(dst);
> +
> +	if (flags) {
> +		if (type == BPF_DYNPTR_TYPE_SKB) {
> +			if (flags & ~(BPF_F_RECOMPUTE_CSUM | BPF_F_INVALIDATE_HASH))
> +				return -EINVAL;
> +		} else {
> +			return -EINVAL;
> +		}
> +	}
> +
> +	if (type == BPF_DYNPTR_TYPE_SKB) {
> +		struct sk_buff *skb = dst->data;
> +
> +		/* if the data is paged, the caller needs to pull it first */
> +		if (dst->offset + offset + len > skb->len - skb->data_len)

Use skb_headlen() instead of 'skb->len - skb->data_len'?

> +			return -EAGAIN;
> +
> +		return __bpf_skb_store_bytes(skb, dst->offset + offset, src, len,
> +					     flags);
> +	}
> +
>   	memcpy(dst->data + dst->offset + offset, src, len);

>   	return 0;
> @@ -1555,6 +1594,7 @@ static const struct bpf_func_proto bpf_dynptr_write_proto = {

>   BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len)
>   {
> +	enum bpf_dynptr_type type;
>   	int err;

>   	if (!ptr->data)
> @@ -1567,6 +1607,18 @@ BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset, u32, len
>   	if (bpf_dynptr_is_rdonly(ptr))
>   		return 0;

> +	type = bpf_dynptr_get_type(ptr);
> +
> +	if (type == BPF_DYNPTR_TYPE_SKB) {
> +		struct sk_buff *skb = ptr->data;
> +
> +		/* if the data is paged, the caller needs to pull it first */
> +		if (ptr->offset + offset + len > skb->len - skb->data_len)
> +			return 0;

Same here?

> +
> +		return (unsigned long)(skb->data + ptr->offset + offset);
> +	}
> +
>   	return (unsigned long)(ptr->data + ptr->offset + offset);
>   }

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 0d523741a543..0838653eeb4e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -263,6 +263,7 @@ struct bpf_call_arg_meta {
>   	u32 subprogno;
>   	struct bpf_map_value_off_desc *kptr_off_desc;
>   	u8 uninit_dynptr_regno;
> +	enum bpf_dynptr_type type;
>   };

>   struct btf *btf_vmlinux;
> @@ -678,6 +679,8 @@ static enum bpf_dynptr_type arg_to_dynptr_type(enum bpf_arg_type arg_type)
>   		return BPF_DYNPTR_TYPE_LOCAL;
>   	case DYNPTR_TYPE_RINGBUF:
>   		return BPF_DYNPTR_TYPE_RINGBUF;
> +	case DYNPTR_TYPE_SKB:
> +		return BPF_DYNPTR_TYPE_SKB;
>   	default:
>   		return BPF_DYNPTR_TYPE_INVALID;
>   	}
> @@ -5820,12 +5823,14 @@ int check_func_arg_reg_off(struct bpf_verifier_env *env,
>   	return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
>   }

> -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct bpf_reg_state *reg)
> +static void stack_slot_get_dynptr_info(struct bpf_verifier_env *env, struct bpf_reg_state *reg,
> +				       struct bpf_call_arg_meta *meta)
>   {
>   	struct bpf_func_state *state = func(env, reg);
>   	int spi = get_spi(reg->off);

> -	return state->stack[spi].spilled_ptr.id;
> +	meta->ref_obj_id = state->stack[spi].spilled_ptr.id;
> +	meta->type = state->stack[spi].spilled_ptr.dynptr.type;
>   }

>   static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -6052,6 +6057,9 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>   				case DYNPTR_TYPE_RINGBUF:
>   					err_extra = "ringbuf ";
>   					break;
> +				case DYNPTR_TYPE_SKB:
> +					err_extra = "skb ";
> +					break;
>   				default:
>   					break;
>   				}
> @@ -6065,8 +6073,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> 					verbose(env, "verifier internal error: multiple refcounted args in BPF_FUNC_dynptr_data");
>   					return -EFAULT;
>   				}
> -				/* Find the id of the dynptr we're tracking the reference of */
> -				meta->ref_obj_id = stack_slot_get_id(env, reg);
> +				/* Find the id and the type of the dynptr we're tracking
> +				 * the reference of.
> +				 */
> +				stack_slot_get_dynptr_info(env, reg, meta);
>   			}
>   		}
>   		break;
> @@ -7406,7 +7416,11 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
>   		regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag;
>   	} else if (base_type(ret_type) == RET_PTR_TO_ALLOC_MEM) {
>   		mark_reg_known_zero(env, regs, BPF_REG_0);
> -		regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
> +		if (func_id == BPF_FUNC_dynptr_data &&
> +		    meta.type == BPF_DYNPTR_TYPE_SKB)
> +			regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag;
> +		else
> +			regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
>   		regs[BPF_REG_0].mem_size = meta.mem_size;
>   	} else if (base_type(ret_type) == RET_PTR_TO_MEM_OR_BTF_ID) {
>   		const struct btf_type *t;
> @@ -14132,6 +14146,25 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
>   			goto patch_call_imm;
>   		}


[..]

> +		if (insn->imm == BPF_FUNC_dynptr_from_skb) {
> +			if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE))
> +				insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, true);
> +			else
> +				insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, false);
> +			insn_buf[1] = *insn;
> +			cnt = 2;
> +
> +			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> +			if (!new_prog)
> +				return -ENOMEM;
> +
> +			delta += cnt - 1;
> +			env->prog = new_prog;
> +			prog = new_prog;
> +			insn = new_prog->insnsi + i + delta;
> +			goto patch_call_imm;
> +		}

Would it be easier to have two separate helpers:
- BPF_FUNC_dynptr_from_skb
- BPF_FUNC_dynptr_from_skb_readonly

And make the verifier rewrite insn->imm to
BPF_FUNC_dynptr_from_skb_readonly when needed?

if (insn->imm == BPF_FUNC_dynptr_from_skb) {
	if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE))
		insn->imm = BPF_FUNC_dynptr_from_skb_readonly;
}

Or is that also ugly because we'd have to leak the new helper into UAPI?
(I also wonder whether that hidden 4th argument is too magical, but it's
probably fine.)

> +
>   		/* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup
>   		 * and other inlining handlers are currently limited to 64 bit
>   		 * only.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 5669248aff25..312f99deb759 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1681,8 +1681,8 @@ static inline void bpf_pull_mac_rcsum(struct sk_buff *skb)
>   		skb_postpull_rcsum(skb, skb_mac_header(skb), skb->mac_len);
>   }

> -BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
> -	   const void *, from, u32, len, u64, flags)
> +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void *from,
> +			  u32 len, u64 flags)
>   {
>   	void *ptr;

> @@ -1707,6 +1707,12 @@ BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
>   	return 0;
>   }

> +BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
> +	   const void *, from, u32, len, u64, flags)
> +{
> +	return __bpf_skb_store_bytes(skb, offset, from, len, flags);
> +}
> +
>   static const struct bpf_func_proto bpf_skb_store_bytes_proto = {
>   	.func		= bpf_skb_store_bytes,
>   	.gpl_only	= false,
> @@ -1718,8 +1724,7 @@ static const struct bpf_func_proto bpf_skb_store_bytes_proto = {
>   	.arg5_type	= ARG_ANYTHING,
>   };

> -BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
> -	   void *, to, u32, len)
> +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void *to, u32 len)
>   {
>   	void *ptr;

> @@ -1738,6 +1743,12 @@ BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
>   	return -EFAULT;
>   }

> +BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
> +	   void *, to, u32, len)
> +{
> +	return __bpf_skb_load_bytes(skb, offset, to, len);
> +}
> +
>   static const struct bpf_func_proto bpf_skb_load_bytes_proto = {
>   	.func		= bpf_skb_load_bytes,
>   	.gpl_only	= false,
> @@ -1849,6 +1860,32 @@ static const struct bpf_func_proto bpf_skb_pull_data_proto = {
>   	.arg2_type	= ARG_ANYTHING,
>   };

> +/* is_rdonly is set by the verifier */
> +BPF_CALL_4(bpf_dynptr_from_skb, struct sk_buff *, skb, u64, flags,
> +	   struct bpf_dynptr_kern *, ptr, u32, is_rdonly)
> +{
> +	if (flags) {
> +		bpf_dynptr_set_null(ptr);
> +		return -EINVAL;
> +	}
> +
> +	bpf_dynptr_init(ptr, skb, BPF_DYNPTR_TYPE_SKB, 0, skb->len);
> +
> +	if (is_rdonly)
> +		bpf_dynptr_set_rdonly(ptr);
> +
> +	return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_dynptr_from_skb_proto = {
> +	.func		= bpf_dynptr_from_skb,
> +	.gpl_only	= false,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_PTR_TO_CTX,
> +	.arg2_type	= ARG_ANYTHING,
> +	.arg3_type	= ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_SKB | MEM_UNINIT,
> +};
> +
>   BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk)
>   {
>   	return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL;
> @@ -7808,6 +7845,8 @@ sk_filter_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_socket_uid_proto;
>   	case BPF_FUNC_perf_event_output:
>   		return &bpf_skb_event_output_proto;
> +	case BPF_FUNC_dynptr_from_skb:
> +		return &bpf_dynptr_from_skb_proto;
>   	default:
>   		return bpf_sk_base_func_proto(func_id);
>   	}
> @@ -7991,6 +8030,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_tcp_raw_check_syncookie_ipv6_proto;
>   #endif
>   #endif
> +	case BPF_FUNC_dynptr_from_skb:
> +		return &bpf_dynptr_from_skb_proto;
>   	default:
>   		return bpf_sk_base_func_proto(func_id);
>   	}
> @@ -8186,6 +8227,8 @@ sk_skb_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   	case BPF_FUNC_skc_lookup_tcp:
>   		return &bpf_skc_lookup_tcp_proto;
>   #endif
> +	case BPF_FUNC_dynptr_from_skb:
> +		return &bpf_dynptr_from_skb_proto;
>   	default:
>   		return bpf_sk_base_func_proto(func_id);
>   	}
> @@ -8224,6 +8267,8 @@ lwt_out_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_get_smp_processor_id_proto;
>   	case BPF_FUNC_skb_under_cgroup:
>   		return &bpf_skb_under_cgroup_proto;
> +	case BPF_FUNC_dynptr_from_skb:
> +		return &bpf_dynptr_from_skb_proto;
>   	default:
>   		return bpf_sk_base_func_proto(func_id);
>   	}
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 59a217ca2dfd..0730cd198a7f 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5241,11 +5241,22 @@ union bpf_attr {
>    *	Description
>    *		Write *len* bytes from *src* into *dst*, starting from *offset*
>    *		into *dst*.
> - *		*flags* is currently unused.
> + *
> + *		*flags* must be 0 except for skb-type dynptrs.
> + *
> + *		For skb-type dynptrs:
> + *		    *  if *offset* + *len* extends into the skb's paged buffers, the user
> + *		       should manually pull the skb with bpf_skb_pull_data and then try again.
> + *
> + *		    *  *flags* are a combination of **BPF_F_RECOMPUTE_CSUM** (automatically
> + *			recompute the checksum for the packet after storing the bytes) and
> + *			**BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
> + *			**->swhash** and *skb*\ **->l4hash** to 0).
>    *	Return
>    *		0 on success, -E2BIG if *offset* + *len* exceeds the length
>    *		of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
> - *		is a read-only dynptr or if *flags* is not 0.
> + *		is a read-only dynptr or if *flags* is not correct, -EAGAIN if, for
> + *		skb-type dynptrs, the write extends into the skb's paged buffers.
>    *
>    * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
>    *	Description
> @@ -5253,10 +5264,19 @@ union bpf_attr {
>    *
>    *		*len* must be a statically known value. The returned data slice
>    *		is invalidated whenever the dynptr is invalidated.
> + *
> + *		For skb-type dynptrs:
> + *		    * if *offset* + *len* extends into the skb's paged buffers,
> + *		      the user should manually pull the skb with bpf_skb_pull_data and then
> + *		      try again.
> + *
> + *		    * the data slice is automatically invalidated anytime a
> + *		      helper call that changes the underlying packet buffer
> + *		      (eg bpf_skb_pull_data) is called.
>    *	Return
>    *		Pointer to the underlying dynptr data, NULL if the dynptr is
>    *		read-only, if the dynptr is invalid, or if the offset and length
> - *		is out of bounds.
> + *		is out of bounds or in a paged buffer for skb-type dynptrs.
>    *
>    * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr *th, u32 th_len)
>    *	Description
> @@ -5331,6 +5351,21 @@ union bpf_attr {
>    *		**-EACCES** if the SYN cookie is not valid.
>    *
>    *		**-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
> + *
> + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct bpf_dynptr *ptr)
> + *	Description
> + *		Get a dynptr to the data in *skb*. *skb* must be the BPF program
> + *		context. Depending on program type, the dynptr may be read-only,
> + *		in which case trying to obtain a direct data slice to it through
> + *		bpf_dynptr_data will return an error.
> + *
> + *		Calls that change the *skb*'s underlying packet buffer
> + *		(eg bpf_skb_pull_data) do not invalidate the dynptr, but they do
> + *		invalidate any data slices associated with the dynptr.
> + *
> + *		*flags* is currently unused; it must be 0 for now.
> + *	Return
> + *		0 on success or -EINVAL if flags is not 0.
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -5541,6 +5576,7 @@ union bpf_attr {
>   	FN(tcp_raw_gen_syncookie_ipv6),	\
>   	FN(tcp_raw_check_syncookie_ipv4),	\
>   	FN(tcp_raw_check_syncookie_ipv6),	\
> +	FN(dynptr_from_skb),		\
>   	/* */

>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> --
> 2.30.2

