From: sdf@google.com
To: Joanne Koong <joannelkoong@gmail.com>
Cc: bpf@vger.kernel.org, andrii@kernel.org, daniel@iogearbox.net,
ast@kernel.org
Subject: Re: [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs
Date: Wed, 27 Jul 2022 10:13:59 -0700 [thread overview]
Message-ID: <YuFyVwiFkrKjSmFN@google.com> (raw)
In-Reply-To: <20220726184706.954822-2-joannelkoong@gmail.com>
On 07/26, Joanne Koong wrote:
> Add skb dynptrs, which are dynptrs whose underlying pointer points
> to a skb. The dynptr acts on skb data. skb dynptrs have two main
> benefits. One is that they allow operations on sizes that are not
> statically known at compile-time (eg variable-sized accesses).
> Another is that parsing the packet data through dynptrs (instead of
> through direct access of skb->data and skb->data_end) can be more
> ergonomic and less brittle (eg does not need manual if checking for
> being within bounds of data_end).
> For bpf prog types that don't support writes on skb data, the dynptr is
> read-only (writes and data slices are not permitted). For reads on the
> dynptr, this includes reading into data in the non-linear paged buffers
> but for writes and data slices, if the data is in a paged buffer, the
> user must first call bpf_skb_pull_data to pull the data into the linear
> portion.
> Additionally, any helper calls that change the underlying packet buffer
> (eg bpf_skb_pull_data) invalidates any data slices of the associated
> dynptr.
> Right now, skb dynptrs can only be constructed from skbs that are
> the bpf program context - as such, there does not need to be any
> reference tracking or release on skb dynptrs.
> Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
> ---
> include/linux/bpf.h | 8 ++++-
> include/linux/filter.h | 4 +++
> include/uapi/linux/bpf.h | 42 ++++++++++++++++++++++++--
> kernel/bpf/helpers.c | 54 +++++++++++++++++++++++++++++++++-
> kernel/bpf/verifier.c | 43 +++++++++++++++++++++++----
> net/core/filter.c | 53 ++++++++++++++++++++++++++++++---
> tools/include/uapi/linux/bpf.h | 42 ++++++++++++++++++++++++--
> 7 files changed, 229 insertions(+), 17 deletions(-)
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 20c26aed7896..7fbd4324c848 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -407,11 +407,14 @@ enum bpf_type_flag {
> /* Size is known at compile time. */
> MEM_FIXED_SIZE = BIT(10 + BPF_BASE_TYPE_BITS),
> + /* DYNPTR points to sk_buff */
> + DYNPTR_TYPE_SKB = BIT(11 + BPF_BASE_TYPE_BITS),
> +
> __BPF_TYPE_FLAG_MAX,
> __BPF_TYPE_LAST_FLAG = __BPF_TYPE_FLAG_MAX - 1,
> };
> -#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF)
> +#define DYNPTR_TYPE_FLAG_MASK (DYNPTR_TYPE_LOCAL | DYNPTR_TYPE_RINGBUF |
> DYNPTR_TYPE_SKB)
> /* Max number of base types. */
> #define BPF_BASE_TYPE_LIMIT (1UL << BPF_BASE_TYPE_BITS)
> @@ -2556,12 +2559,15 @@ enum bpf_dynptr_type {
> BPF_DYNPTR_TYPE_LOCAL,
> /* Underlying data is a ringbuf record */
> BPF_DYNPTR_TYPE_RINGBUF,
> + /* Underlying data is a sk_buff */
> + BPF_DYNPTR_TYPE_SKB,
> };
> void bpf_dynptr_init(struct bpf_dynptr_kern *ptr, void *data,
> enum bpf_dynptr_type type, u32 offset, u32 size);
> void bpf_dynptr_set_null(struct bpf_dynptr_kern *ptr);
> int bpf_dynptr_check_size(u32 size);
> +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr);
> #ifdef CONFIG_BPF_LSM
> void bpf_cgroup_atype_get(u32 attach_btf_id, int cgroup_atype);
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index a5f21dc3c432..649063d9cbfd 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -1532,4 +1532,8 @@ static __always_inline int
> __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind
> return XDP_REDIRECT;
> }
> +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void
> *to, u32 len);
> +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void
> *from,
> + u32 len, u64 flags);
> +
> #endif /* __LINUX_FILTER_H__ */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 59a217ca2dfd..0730cd198a7f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5241,11 +5241,22 @@ union bpf_attr {
> * Description
> * Write *len* bytes from *src* into *dst*, starting from *offset*
> * into *dst*.
> - * *flags* is currently unused.
> + *
> + * *flags* must be 0 except for skb-type dynptrs.
> + *
> + * For skb-type dynptrs:
> + * * if *offset* + *len* extends into the skb's paged buffers, the
> user
> + * should manually pull the skb with bpf_skb_pull and then try
> again.
> + *
> + * * *flags* are a combination of **BPF_F_RECOMPUTE_CSUM**
> (automatically
> + * recompute the checksum for the packet after storing the bytes) and
> + * **BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
> + * **->swhash** and *skb*\ **->l4hash** to 0).
> * Return
> * 0 on success, -E2BIG if *offset* + *len* exceeds the length
> * of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
> - * is a read-only dynptr or if *flags* is not 0.
> + * is a read-only dynptr or if *flags* is not correct, -EAGAIN if for
> + * skb-type dynptrs the write extends into the skb's paged buffers.
> *
> * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> * Description
> @@ -5253,10 +5264,19 @@ union bpf_attr {
> *
> * *len* must be a statically known value. The returned data slice
> * is invalidated whenever the dynptr is invalidated.
> + *
> + * For skb-type dynptrs:
> + * * if *offset* + *len* extends into the skb's paged buffers,
> + * the user should manually pull the skb with bpf_skb_pull and
> then
> + * try again.
> + *
> + * * the data slice is automatically invalidated anytime a
> + * helper call that changes the underlying packet buffer
> + * (eg bpf_skb_pull) is called.
> * Return
> * Pointer to the underlying dynptr data, NULL if the dynptr is
> * read-only, if the dynptr is invalid, or if the offset and length
> - * is out of bounds.
> + * is out of bounds or in a paged buffer for skb-type dynptrs.
> *
> * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr
> *th, u32 th_len)
> * Description
> @@ -5331,6 +5351,21 @@ union bpf_attr {
> * **-EACCES** if the SYN cookie is not valid.
> *
> * **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
> + *
> + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct
> bpf_dynptr *ptr)
> + * Description
> + * Get a dynptr to the data in *skb*. *skb* must be the BPF program
> + * context. Depending on program type, the dynptr may be read-only,
> + * in which case trying to obtain a direct data slice to it through
> + * bpf_dynptr_data will return an error.
> + *
> + * Calls that change the *skb*'s underlying packet buffer
> + * (eg bpf_skb_pull_data) do not invalidate the dynptr, but they do
> + * invalidate any data slices associated with the dynptr.
> + *
> + * *flags* is currently unused, it must be 0 for now.
> + * Return
> + * 0 on success or -EINVAL if flags is not 0.
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -5541,6 +5576,7 @@ union bpf_attr {
> FN(tcp_raw_gen_syncookie_ipv6), \
> FN(tcp_raw_check_syncookie_ipv4), \
> FN(tcp_raw_check_syncookie_ipv6), \
> + FN(dynptr_from_skb), \
> /* */
> /* integer value in 'imm' field of BPF_CALL instruction selects which
> helper
> diff --git a/kernel/bpf/helpers.c b/kernel/bpf/helpers.c
> index 1f961f9982d2..21a806057e9e 100644
> --- a/kernel/bpf/helpers.c
> +++ b/kernel/bpf/helpers.c
> @@ -1425,11 +1425,21 @@ static bool bpf_dynptr_is_rdonly(struct
> bpf_dynptr_kern *ptr)
> return ptr->size & DYNPTR_RDONLY_BIT;
> }
> +void bpf_dynptr_set_rdonly(struct bpf_dynptr_kern *ptr)
> +{
> + ptr->size |= DYNPTR_RDONLY_BIT;
> +}
> +
> static void bpf_dynptr_set_type(struct bpf_dynptr_kern *ptr, enum
> bpf_dynptr_type type)
> {
> ptr->size |= type << DYNPTR_TYPE_SHIFT;
> }
> +static enum bpf_dynptr_type bpf_dynptr_get_type(const struct
> bpf_dynptr_kern *ptr)
> +{
> + return (ptr->size & ~(DYNPTR_RDONLY_BIT)) >> DYNPTR_TYPE_SHIFT;
> +}
> +
> static u32 bpf_dynptr_get_size(struct bpf_dynptr_kern *ptr)
> {
> return ptr->size & DYNPTR_SIZE_MASK;
> @@ -1500,6 +1510,7 @@ static const struct bpf_func_proto
> bpf_dynptr_from_mem_proto = {
> BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len, struct
> bpf_dynptr_kern *, src,
> u32, offset, u64, flags)
> {
> + enum bpf_dynptr_type type;
> int err;
> if (!src->data || flags)
> @@ -1509,6 +1520,11 @@ BPF_CALL_5(bpf_dynptr_read, void *, dst, u32, len,
> struct bpf_dynptr_kern *, src
> if (err)
> return err;
> + type = bpf_dynptr_get_type(src);
> +
> + if (type == BPF_DYNPTR_TYPE_SKB)
> + return __bpf_skb_load_bytes(src->data, src->offset + offset, dst, len);
> +
> memcpy(dst, src->data + src->offset + offset, len);
> return 0;
> @@ -1528,15 +1544,38 @@ static const struct bpf_func_proto
> bpf_dynptr_read_proto = {
> BPF_CALL_5(bpf_dynptr_write, struct bpf_dynptr_kern *, dst, u32, offset,
> void *, src,
> u32, len, u64, flags)
> {
> + enum bpf_dynptr_type type;
> int err;
> - if (!dst->data || flags || bpf_dynptr_is_rdonly(dst))
> + if (!dst->data || bpf_dynptr_is_rdonly(dst))
> return -EINVAL;
> err = bpf_dynptr_check_off_len(dst, offset, len);
> if (err)
> return err;
> + type = bpf_dynptr_get_type(dst);
> +
> + if (flags) {
> + if (type == BPF_DYNPTR_TYPE_SKB) {
> + if (flags & ~(BPF_F_RECOMPUTE_CSUM | BPF_F_INVALIDATE_HASH))
> + return -EINVAL;
> + } else {
> + return -EINVAL;
> + }
> + }
> +
> + if (type == BPF_DYNPTR_TYPE_SKB) {
> + struct sk_buff *skb = dst->data;
> +
> + /* if the data is paged, the caller needs to pull it first */
> + if (dst->offset + offset + len > skb->len - skb->data_len)
Use skb_headlen instead of 'skb->len - skb->data_len' ?
> + return -EAGAIN;
> +
> + return __bpf_skb_store_bytes(skb, dst->offset + offset, src, len,
> + flags);
> + }
> +
> memcpy(dst->data + dst->offset + offset, src, len);
> return 0;
> @@ -1555,6 +1594,7 @@ static const struct bpf_func_proto
> bpf_dynptr_write_proto = {
> BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern *, ptr, u32, offset,
> u32, len)
> {
> + enum bpf_dynptr_type type;
> int err;
> if (!ptr->data)
> @@ -1567,6 +1607,18 @@ BPF_CALL_3(bpf_dynptr_data, struct bpf_dynptr_kern
> *, ptr, u32, offset, u32, len
> if (bpf_dynptr_is_rdonly(ptr))
> return 0;
> + type = bpf_dynptr_get_type(ptr);
> +
> + if (type == BPF_DYNPTR_TYPE_SKB) {
> + struct sk_buff *skb = ptr->data;
> +
> + /* if the data is paged, the caller needs to pull it first */
> + if (ptr->offset + offset + len > skb->len - skb->data_len)
> + return 0;
Same here?
> +
> + return (unsigned long)(skb->data + ptr->offset + offset);
> + }
> +
> return (unsigned long)(ptr->data + ptr->offset + offset);
> }
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 0d523741a543..0838653eeb4e 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -263,6 +263,7 @@ struct bpf_call_arg_meta {
> u32 subprogno;
> struct bpf_map_value_off_desc *kptr_off_desc;
> u8 uninit_dynptr_regno;
> + enum bpf_dynptr_type type;
> };
> struct btf *btf_vmlinux;
> @@ -678,6 +679,8 @@ static enum bpf_dynptr_type arg_to_dynptr_type(enum
> bpf_arg_type arg_type)
> return BPF_DYNPTR_TYPE_LOCAL;
> case DYNPTR_TYPE_RINGBUF:
> return BPF_DYNPTR_TYPE_RINGBUF;
> + case DYNPTR_TYPE_SKB:
> + return BPF_DYNPTR_TYPE_SKB;
> default:
> return BPF_DYNPTR_TYPE_INVALID;
> }
> @@ -5820,12 +5823,14 @@ int check_func_arg_reg_off(struct
> bpf_verifier_env *env,
> return __check_ptr_off_reg(env, reg, regno, fixed_off_ok);
> }
> -static u32 stack_slot_get_id(struct bpf_verifier_env *env, struct
> bpf_reg_state *reg)
> +static void stack_slot_get_dynptr_info(struct bpf_verifier_env *env,
> struct bpf_reg_state *reg,
> + struct bpf_call_arg_meta *meta)
> {
> struct bpf_func_state *state = func(env, reg);
> int spi = get_spi(reg->off);
> - return state->stack[spi].spilled_ptr.id;
> + meta->ref_obj_id = state->stack[spi].spilled_ptr.id;
> + meta->type = state->stack[spi].spilled_ptr.dynptr.type;
> }
> static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> @@ -6052,6 +6057,9 @@ static int check_func_arg(struct bpf_verifier_env
> *env, u32 arg,
> case DYNPTR_TYPE_RINGBUF:
> err_extra = "ringbuf ";
> break;
> + case DYNPTR_TYPE_SKB:
> + err_extra = "skb ";
> + break;
> default:
> break;
> }
> @@ -6065,8 +6073,10 @@ static int check_func_arg(struct bpf_verifier_env
> *env, u32 arg,
> verbose(env, "verifier internal error: multiple refcounted args in
> BPF_FUNC_dynptr_data");
> return -EFAULT;
> }
> - /* Find the id of the dynptr we're tracking the reference of */
> - meta->ref_obj_id = stack_slot_get_id(env, reg);
> + /* Find the id and the type of the dynptr we're tracking
> + * the reference of.
> + */
> + stack_slot_get_dynptr_info(env, reg, meta);
> }
> }
> break;
> @@ -7406,7 +7416,11 @@ static int check_helper_call(struct
> bpf_verifier_env *env, struct bpf_insn *insn
> regs[BPF_REG_0].type = PTR_TO_TCP_SOCK | ret_flag;
> } else if (base_type(ret_type) == RET_PTR_TO_ALLOC_MEM) {
> mark_reg_known_zero(env, regs, BPF_REG_0);
> - regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
> + if (func_id == BPF_FUNC_dynptr_data &&
> + meta.type == BPF_DYNPTR_TYPE_SKB)
> + regs[BPF_REG_0].type = PTR_TO_PACKET | ret_flag;
> + else
> + regs[BPF_REG_0].type = PTR_TO_MEM | ret_flag;
> regs[BPF_REG_0].mem_size = meta.mem_size;
> } else if (base_type(ret_type) == RET_PTR_TO_MEM_OR_BTF_ID) {
> const struct btf_type *t;
> @@ -14132,6 +14146,25 @@ static int do_misc_fixups(struct
> bpf_verifier_env *env)
> goto patch_call_imm;
> }
[..]
> + if (insn->imm == BPF_FUNC_dynptr_from_skb) {
> + if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE))
> + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, true);
> + else
> + insn_buf[0] = BPF_MOV32_IMM(BPF_REG_4, false);
> + insn_buf[1] = *insn;
> + cnt = 2;
> +
> + new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
> + if (!new_prog)
> + return -ENOMEM;
> +
> + delta += cnt - 1;
> + env->prog = new_prog;
> + prog = new_prog;
> + insn = new_prog->insnsi + i + delta;
> + goto patch_call_imm;
> + }
Would it be easier to have two separate helpers:
- BPF_FUNC_dynptr_from_skb
- BPF_FUNC_dynptr_from_skb_readonly
And make the verifier rewrite insn->imm to
BPF_FUNC_dynptr_from_skb_readonly when needed?
if (insn->imm == BPF_FUNC_dynptr_from_skb) {
if (!may_access_direct_pkt_data(env, NULL, BPF_WRITE))
insn->imm = BPF_FUNC_dynptr_from_skb_readonly;
}
Or it's also ugly because we'd have to leak that new helper into UAPI?
(I wonder whether that hidden 4th argument is too magical, but probably
fine?)
> +
> /* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup
> * and other inlining handlers are currently limited to 64 bit
> * only.
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 5669248aff25..312f99deb759 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -1681,8 +1681,8 @@ static inline void bpf_pull_mac_rcsum(struct
> sk_buff *skb)
> skb_postpull_rcsum(skb, skb_mac_header(skb), skb->mac_len);
> }
> -BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
> - const void *, from, u32, len, u64, flags)
> +int __bpf_skb_store_bytes(struct sk_buff *skb, u32 offset, const void
> *from,
> + u32 len, u64 flags)
> {
> void *ptr;
> @@ -1707,6 +1707,12 @@ BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *,
> skb, u32, offset,
> return 0;
> }
> +BPF_CALL_5(bpf_skb_store_bytes, struct sk_buff *, skb, u32, offset,
> + const void *, from, u32, len, u64, flags)
> +{
> + return __bpf_skb_store_bytes(skb, offset, from, len, flags);
> +}
> +
> static const struct bpf_func_proto bpf_skb_store_bytes_proto = {
> .func = bpf_skb_store_bytes,
> .gpl_only = false,
> @@ -1718,8 +1724,7 @@ static const struct bpf_func_proto
> bpf_skb_store_bytes_proto = {
> .arg5_type = ARG_ANYTHING,
> };
> -BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
> - void *, to, u32, len)
> +int __bpf_skb_load_bytes(const struct sk_buff *skb, u32 offset, void
> *to, u32 len)
> {
> void *ptr;
> @@ -1738,6 +1743,12 @@ BPF_CALL_4(bpf_skb_load_bytes, const struct
> sk_buff *, skb, u32, offset,
> return -EFAULT;
> }
> +BPF_CALL_4(bpf_skb_load_bytes, const struct sk_buff *, skb, u32, offset,
> + void *, to, u32, len)
> +{
> + return __bpf_skb_load_bytes(skb, offset, to, len);
> +}
> +
> static const struct bpf_func_proto bpf_skb_load_bytes_proto = {
> .func = bpf_skb_load_bytes,
> .gpl_only = false,
> @@ -1849,6 +1860,32 @@ static const struct bpf_func_proto
> bpf_skb_pull_data_proto = {
> .arg2_type = ARG_ANYTHING,
> };
> +/* is_rdonly is set by the verifier */
> +BPF_CALL_4(bpf_dynptr_from_skb, struct sk_buff *, skb, u64, flags,
> + struct bpf_dynptr_kern *, ptr, u32, is_rdonly)
> +{
> + if (flags) {
> + bpf_dynptr_set_null(ptr);
> + return -EINVAL;
> + }
> +
> + bpf_dynptr_init(ptr, skb, BPF_DYNPTR_TYPE_SKB, 0, skb->len);
> +
> + if (is_rdonly)
> + bpf_dynptr_set_rdonly(ptr);
> +
> + return 0;
> +}
> +
> +static const struct bpf_func_proto bpf_dynptr_from_skb_proto = {
> + .func = bpf_dynptr_from_skb,
> + .gpl_only = false,
> + .ret_type = RET_INTEGER,
> + .arg1_type = ARG_PTR_TO_CTX,
> + .arg2_type = ARG_ANYTHING,
> + .arg3_type = ARG_PTR_TO_DYNPTR | DYNPTR_TYPE_SKB | MEM_UNINIT,
> +};
> +
> BPF_CALL_1(bpf_sk_fullsock, struct sock *, sk)
> {
> return sk_fullsock(sk) ? (unsigned long)sk : (unsigned long)NULL;
> @@ -7808,6 +7845,8 @@ sk_filter_func_proto(enum bpf_func_id func_id,
> const struct bpf_prog *prog)
> return &bpf_get_socket_uid_proto;
> case BPF_FUNC_perf_event_output:
> return &bpf_skb_event_output_proto;
> + case BPF_FUNC_dynptr_from_skb:
> + return &bpf_dynptr_from_skb_proto;
> default:
> return bpf_sk_base_func_proto(func_id);
> }
> @@ -7991,6 +8030,8 @@ tc_cls_act_func_proto(enum bpf_func_id func_id,
> const struct bpf_prog *prog)
> return &bpf_tcp_raw_check_syncookie_ipv6_proto;
> #endif
> #endif
> + case BPF_FUNC_dynptr_from_skb:
> + return &bpf_dynptr_from_skb_proto;
> default:
> return bpf_sk_base_func_proto(func_id);
> }
> @@ -8186,6 +8227,8 @@ sk_skb_func_proto(enum bpf_func_id func_id, const
> struct bpf_prog *prog)
> case BPF_FUNC_skc_lookup_tcp:
> return &bpf_skc_lookup_tcp_proto;
> #endif
> + case BPF_FUNC_dynptr_from_skb:
> + return &bpf_dynptr_from_skb_proto;
> default:
> return bpf_sk_base_func_proto(func_id);
> }
> @@ -8224,6 +8267,8 @@ lwt_out_func_proto(enum bpf_func_id func_id, const
> struct bpf_prog *prog)
> return &bpf_get_smp_processor_id_proto;
> case BPF_FUNC_skb_under_cgroup:
> return &bpf_skb_under_cgroup_proto;
> + case BPF_FUNC_dynptr_from_skb:
> + return &bpf_dynptr_from_skb_proto;
> default:
> return bpf_sk_base_func_proto(func_id);
> }
> diff --git a/tools/include/uapi/linux/bpf.h
> b/tools/include/uapi/linux/bpf.h
> index 59a217ca2dfd..0730cd198a7f 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5241,11 +5241,22 @@ union bpf_attr {
> * Description
> * Write *len* bytes from *src* into *dst*, starting from *offset*
> * into *dst*.
> - * *flags* is currently unused.
> + *
> + * *flags* must be 0 except for skb-type dynptrs.
> + *
> + * For skb-type dynptrs:
> + * * if *offset* + *len* extends into the skb's paged buffers, the
> user
> + * should manually pull the skb with bpf_skb_pull and then try
> again.
> + *
> + * * *flags* are a combination of **BPF_F_RECOMPUTE_CSUM**
> (automatically
> + * recompute the checksum for the packet after storing the bytes) and
> + * **BPF_F_INVALIDATE_HASH** (set *skb*\ **->hash**, *skb*\
> + * **->swhash** and *skb*\ **->l4hash** to 0).
> * Return
> * 0 on success, -E2BIG if *offset* + *len* exceeds the length
> * of *dst*'s data, -EINVAL if *dst* is an invalid dynptr or if *dst*
> - * is a read-only dynptr or if *flags* is not 0.
> + * is a read-only dynptr or if *flags* is not correct, -EAGAIN if for
> + * skb-type dynptrs the write extends into the skb's paged buffers.
> *
> * void *bpf_dynptr_data(struct bpf_dynptr *ptr, u32 offset, u32 len)
> * Description
> @@ -5253,10 +5264,19 @@ union bpf_attr {
> *
> * *len* must be a statically known value. The returned data slice
> * is invalidated whenever the dynptr is invalidated.
> + *
> + * For skb-type dynptrs:
> + * * if *offset* + *len* extends into the skb's paged buffers,
> + * the user should manually pull the skb with bpf_skb_pull and
> then
> + * try again.
> + *
> + * * the data slice is automatically invalidated anytime a
> + * helper call that changes the underlying packet buffer
> + * (eg bpf_skb_pull) is called.
> * Return
> * Pointer to the underlying dynptr data, NULL if the dynptr is
> * read-only, if the dynptr is invalid, or if the offset and length
> - * is out of bounds.
> + * is out of bounds or in a paged buffer for skb-type dynptrs.
> *
> * s64 bpf_tcp_raw_gen_syncookie_ipv4(struct iphdr *iph, struct tcphdr
> *th, u32 th_len)
> * Description
> @@ -5331,6 +5351,21 @@ union bpf_attr {
> * **-EACCES** if the SYN cookie is not valid.
> *
> * **-EPROTONOSUPPORT** if CONFIG_IPV6 is not builtin.
> + *
> + * long bpf_dynptr_from_skb(struct sk_buff *skb, u64 flags, struct
> bpf_dynptr *ptr)
> + * Description
> + * Get a dynptr to the data in *skb*. *skb* must be the BPF program
> + * context. Depending on program type, the dynptr may be read-only,
> + * in which case trying to obtain a direct data slice to it through
> + * bpf_dynptr_data will return an error.
> + *
> + * Calls that change the *skb*'s underlying packet buffer
> + * (eg bpf_skb_pull_data) do not invalidate the dynptr, but they do
> + * invalidate any data slices associated with the dynptr.
> + *
> + * *flags* is currently unused, it must be 0 for now.
> + * Return
> + * 0 on success or -EINVAL if flags is not 0.
> */
> #define __BPF_FUNC_MAPPER(FN) \
> FN(unspec), \
> @@ -5541,6 +5576,7 @@ union bpf_attr {
> FN(tcp_raw_gen_syncookie_ipv6), \
> FN(tcp_raw_check_syncookie_ipv4), \
> FN(tcp_raw_check_syncookie_ipv6), \
> + FN(dynptr_from_skb), \
> /* */
> /* integer value in 'imm' field of BPF_CALL instruction selects which
> helper
> --
> 2.30.2
next prev parent reply other threads:[~2022-07-27 18:13 UTC|newest]
Thread overview: 52+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-07-26 18:47 [PATCH bpf-next v1 0/3] Add skb + xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 1/3] bpf: Add skb dynptrs Joanne Koong
2022-07-27 17:13 ` sdf [this message]
2022-07-28 16:49 ` Joanne Koong
2022-07-28 17:28 ` Stanislav Fomichev
2022-07-28 17:45 ` Hao Luo
2022-07-28 18:36 ` Joanne Koong
2022-07-28 23:39 ` Martin KaFai Lau
2022-07-29 20:26 ` Joanne Koong
2022-07-29 21:39 ` Martin KaFai Lau
2022-08-01 17:52 ` Joanne Koong
2022-08-01 19:38 ` Martin KaFai Lau
2022-08-01 21:16 ` Joanne Koong
2022-08-01 22:14 ` Andrii Nakryiko
2022-08-01 22:32 ` Martin KaFai Lau
2022-08-01 22:58 ` Andrii Nakryiko
2022-08-01 23:23 ` Martin KaFai Lau
2022-08-02 0:56 ` Martin KaFai Lau
2022-08-02 3:51 ` Andrii Nakryiko
2022-08-02 4:53 ` Joanne Koong
2022-08-02 5:14 ` Joanne Koong
2022-08-03 20:29 ` Joanne Koong
2022-08-03 20:36 ` Andrii Nakryiko
2022-08-03 20:56 ` Martin KaFai Lau
2022-08-03 23:25 ` Jakub Kicinski
2022-08-04 1:05 ` Joanne Koong
2022-08-04 1:34 ` Jakub Kicinski
2022-08-04 3:44 ` Joanne Koong
2022-08-04 1:27 ` Martin KaFai Lau
2022-08-04 1:44 ` Jakub Kicinski
2022-08-04 22:58 ` Kumar Kartikeya Dwivedi
2022-08-05 23:25 ` Jakub Kicinski
2022-08-01 22:11 ` Andrii Nakryiko
2022-08-02 0:15 ` Joanne Koong
2022-08-01 23:33 ` Jakub Kicinski
2022-08-02 2:12 ` Joanne Koong
2022-08-04 21:55 ` Joanne Koong
2022-08-05 23:22 ` Jakub Kicinski
2022-08-03 6:37 ` Martin KaFai Lau
2022-07-26 18:47 ` [PATCH bpf-next v1 2/3] bpf: Add xdp dynptrs Joanne Koong
2022-07-26 18:47 ` [PATCH bpf-next v1 3/3] selftests/bpf: tests for using dynptrs to parse skb and xdp buffers Joanne Koong
2022-07-26 19:44 ` Zvi Effron
2022-07-26 20:06 ` Joanne Koong
2022-08-01 17:58 ` Andrii Nakryiko
2022-08-02 22:56 ` Joanne Koong
2022-08-03 0:53 ` Andrii Nakryiko
2022-08-03 16:11 ` Joanne Koong
2022-08-04 18:45 ` Alexei Starovoitov
2022-08-05 16:29 ` Joanne Koong
2022-08-01 19:12 ` Alexei Starovoitov
2022-08-02 22:21 ` Joanne Koong
2022-08-04 21:46 ` Joanne Koong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YuFyVwiFkrKjSmFN@google.com \
--to=sdf@google.com \
--cc=andrii@kernel.org \
--cc=ast@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=daniel@iogearbox.net \
--cc=joannelkoong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).