* [PATCH bpf-next 0/2] Optimize bpf_redirect_map()/xdp_do_redirect()
@ 2021-02-19 14:59 Björn Töpel
  2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel
  2021-02-19 14:59 ` [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions Björn Töpel
  0 siblings, 2 replies; 9+ messages in thread

From: Björn Töpel @ 2021-02-19 14:59 UTC (permalink / raw)
  To: ast, daniel, netdev, bpf
  Cc: Björn Töpel, bjorn.topel, maciej.fijalkowski, hawk, toke,
	magnus.karlsson, john.fastabend, kuba, davem

Hi XDP-folks,

This two-patch series contains two optimizations for the
bpf_redirect_map() helper and the xdp_do_redirect() function.

The bpf_redirect_map() optimization avoids the run-time map-lookup
dispatch. Instead of selecting the correct lookup function via a
switch-statement, we let the verifier patch the bpf_redirect_map()
call to a map-specific lookup function, so no dispatch is needed at
run time.

The xdp_do_redirect() patch restructures the code so that the map
pointer indirection can be avoided.

Performance-wise, I got a 3% improvement for XSKMAP
(sample:xdpsock/rx-drop) and 4% (sample:xdp_redirect_map) on my
machine. More details in each commit.

Changes since the RFC are outlined in each commit.

Cheers,
Björn

Björn Töpel (2):
  bpf, xdp: per-map bpf_redirect_map functions for XDP
  bpf, xdp: restructure redirect actions

 include/linux/bpf.h        |  20 ++--
 include/linux/filter.h     |  13 ++-
 include/net/xdp_sock.h     |   6 +-
 include/trace/events/xdp.h |  66 ++++++-----
 kernel/bpf/cpumap.c        |   3 +-
 kernel/bpf/devmap.c        |   5 +-
 kernel/bpf/verifier.c      |  28 +++--
 net/core/filter.c          | 219 ++++++++++++++++++-------------------
 net/xdp/xskmap.c           |   1 -
 9 files changed, 192 insertions(+), 169 deletions(-)


base-commit: 7b1e385c9a488de9291eaaa412146d3972e9dec5
--
2.27.0
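[Editorial aside: the dispatch elimination described above can be illustrated with a minimal user-space C sketch. All names below are invented for illustration and are not the kernel's symbols. The "before" shape pays for a switch on every lookup; the "after" shape resolves the map type to a concrete function once (analogous to the verifier patching the helper call at load time), so the hot path calls it directly.]

```c
#include <stddef.h>

/* Illustrative stand-ins for the per-map lookup helpers. */
enum map_type { MAP_DEVMAP, MAP_CPUMAP, MAP_XSKMAP };

typedef const char *(*lookup_fn)(unsigned int key);

static const char *dev_lookup(unsigned int key) { (void)key; return "devmap"; }
static const char *cpu_lookup(unsigned int key) { (void)key; return "cpumap"; }
static const char *xsk_lookup(unsigned int key) { (void)key; return "xskmap"; }

/* Before: every redirect pays for this switch at run time. */
static const char *lookup_dispatch(enum map_type t, unsigned int key)
{
	switch (t) {
	case MAP_DEVMAP: return dev_lookup(key);
	case MAP_CPUMAP: return cpu_lookup(key);
	case MAP_XSKMAP: return xsk_lookup(key);
	default: return NULL;
	}
}

/* After: the map type is resolved to a function once, "at load time";
 * the hot path then calls the chosen function without any dispatch. */
static lookup_fn resolve_lookup(enum map_type t)
{
	switch (t) {
	case MAP_DEVMAP: return dev_lookup;
	case MAP_CPUMAP: return cpu_lookup;
	case MAP_XSKMAP: return xsk_lookup;
	default: return NULL;
	}
}
```

The switch still exists in the "after" shape, but it runs once per program load instead of once per packet, which is where the 3-4% comes from on a per-packet hot path.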
* [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP 2021-02-19 14:59 [PATCH bpf-next 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Björn Töpel @ 2021-02-19 14:59 ` Björn Töpel 2021-02-19 17:05 ` Toke Høiland-Jørgensen ` (2 more replies) 2021-02-19 14:59 ` [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions Björn Töpel 1 sibling, 3 replies; 9+ messages in thread From: Björn Töpel @ 2021-02-19 14:59 UTC (permalink / raw) To: ast, daniel, netdev, bpf Cc: Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem From: Björn Töpel <bjorn.topel@intel.com> Currently the bpf_redirect_map() implementation dispatches to the correct map-lookup function via a switch-statement. To avoid the dispatching, this change adds one bpf_redirect_map() implementation per map. Correct function is automatically selected by the BPF verifier. rfc->v1: Get rid of the macro and use __always_inline. (Jesper) Signed-off-by: Björn Töpel <bjorn.topel@intel.com> --- include/linux/bpf.h | 20 +++++++------ include/linux/filter.h | 2 ++ include/net/xdp_sock.h | 6 ++-- kernel/bpf/cpumap.c | 2 +- kernel/bpf/devmap.c | 4 +-- kernel/bpf/verifier.c | 28 +++++++++++------- net/core/filter.c | 67 ++++++++++++++++++++++++++---------------- 7 files changed, 76 insertions(+), 53 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index cccaef1088ea..3dd186eeaf98 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -314,12 +314,14 @@ enum bpf_return_type { RET_PTR_TO_BTF_ID, /* returns a pointer to a btf_id */ }; +typedef u64 (*bpf_func_proto_func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); + /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs * to in-kernel helper functions and for adjusting imm32 field in BPF_CALL * instructions after verifying */ struct bpf_func_proto { - u64 (*func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); + bpf_func_proto_func func; bool gpl_only; bool pkt_access; enum 
bpf_return_type ret_type; @@ -1429,9 +1431,11 @@ struct btf *bpf_get_btf_vmlinux(void); /* Map specifics */ struct xdp_buff; struct sk_buff; +struct bpf_dtab_netdev; +struct bpf_cpu_map_entry; -struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key); -struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key); +void *__dev_map_lookup_elem(struct bpf_map *map, u32 key); +void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key); void __dev_flush(void); int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp, struct net_device *dev_rx); @@ -1441,7 +1445,7 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, struct bpf_prog *xdp_prog); bool dev_map_can_have_prog(struct bpf_map *map); -struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key); +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key); void __cpu_map_flush(void); int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp, struct net_device *dev_rx); @@ -1568,14 +1572,12 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags) return -EOPNOTSUPP; } -static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map, - u32 key) +static inline void *__dev_map_lookup_elem(struct bpf_map *map, u32 key) { return NULL; } -static inline struct net_device *__dev_map_hash_lookup_elem(struct bpf_map *map, - u32 key) +static inline void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) { return NULL; } @@ -1615,7 +1617,7 @@ static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, } static inline -struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) { return NULL; } diff --git a/include/linux/filter.h b/include/linux/filter.h index 3b00fc906ccd..1dedcf66b694 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1472,4 +1472,6 @@ static inline bool 
bpf_sk_lookup_run_v6(struct net *net, int protocol, } #endif /* IS_ENABLED(CONFIG_IPV6) */ +bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type); + #endif /* __LINUX_FILTER_H__ */ diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index cc17bc957548..da4139a58630 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -80,8 +80,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); void __xsk_map_flush(void); -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, - u32 key) +static inline void *__xsk_map_lookup_elem(struct bpf_map *map, u32 key) { struct xsk_map *m = container_of(map, struct xsk_map, map); struct xdp_sock *xs; @@ -109,8 +108,7 @@ static inline void __xsk_map_flush(void) { } -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, - u32 key) +static inline void *__xsk_map_lookup_elem(struct bpf_map *map, u32 key) { return NULL; } diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index 5d1469de6921..a4d2cb93cd69 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -563,7 +563,7 @@ static void cpu_map_free(struct bpf_map *map) kfree(cmap); } -struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) { struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map); struct bpf_cpu_map_entry *rcpu; diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 85d9d1b72a33..37ac4cde9713 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -258,7 +258,7 @@ static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key) return 0; } -struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) +void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) { struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); struct hlist_head *head = 
dev_map_index_hash(dtab, key); @@ -392,7 +392,7 @@ void __dev_flush(void) * update happens in parallel here a dev_put wont happen until after reading the * ifindex. */ -struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key) +void *__dev_map_lookup_elem(struct bpf_map *map, u32 key) { struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); struct bpf_dtab_netdev *obj; diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 3d34ba492d46..b5fb0c4e911a 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, func_id != BPF_FUNC_map_delete_elem && func_id != BPF_FUNC_map_push_elem && func_id != BPF_FUNC_map_pop_elem && - func_id != BPF_FUNC_map_peek_elem) + func_id != BPF_FUNC_map_peek_elem && + func_id != BPF_FUNC_redirect_map) return 0; if (map == NULL) { @@ -11860,17 +11861,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) } patch_call_imm: - fn = env->ops->get_func_proto(insn->imm, env->prog); - /* all functions that have prototype and verifier allowed - * programs to call them, must be real in-kernel functions - */ - if (!fn->func) { - verbose(env, - "kernel subsystem misconfigured func %s#%d\n", - func_id_name(insn->imm), insn->imm); - return -EFAULT; + if (insn->imm == BPF_FUNC_redirect_map) { + aux = &env->insn_aux_data[i]; + map_ptr = BPF_MAP_PTR(aux->map_ptr_state); + insn->imm = get_xdp_redirect_func(map_ptr->map_type) - __bpf_call_base; + } else { + fn = env->ops->get_func_proto(insn->imm, env->prog); + /* all functions that have prototype and verifier allowed + * programs to call them, must be real in-kernel functions + */ + if (!fn->func) { + verbose(env, "kernel subsystem misconfigured func %s#%d\n", + func_id_name(insn->imm), insn->imm); + return -EFAULT; + } + insn->imm = fn->func - __bpf_call_base; } - insn->imm = fn->func - __bpf_call_base; } /* Since poke tab is now finalized, publish aux to 
tracker. */ diff --git a/net/core/filter.c b/net/core/filter.c index adfdad234674..fd64d768e16a 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3944,22 +3944,6 @@ void xdp_do_flush(void) } EXPORT_SYMBOL_GPL(xdp_do_flush); -static inline void *__xdp_map_lookup_elem(struct bpf_map *map, u32 index) -{ - switch (map->map_type) { - case BPF_MAP_TYPE_DEVMAP: - return __dev_map_lookup_elem(map, index); - case BPF_MAP_TYPE_DEVMAP_HASH: - return __dev_map_hash_lookup_elem(map, index); - case BPF_MAP_TYPE_CPUMAP: - return __cpu_map_lookup_elem(map, index); - case BPF_MAP_TYPE_XSKMAP: - return __xsk_map_lookup_elem(map, index); - default: - return NULL; - } -} - void bpf_clear_redirect_map(struct bpf_map *map) { struct bpf_redirect_info *ri; @@ -4110,22 +4094,17 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = { .arg2_type = ARG_ANYTHING, }; -BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, - u64, flags) +static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags, + void *lookup_elem(struct bpf_map *map, + u32 key)) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - /* Lower bits of the flags are used as return code on lookup failure */ if (unlikely(flags > XDP_TX)) return XDP_ABORTED; - ri->tgt_value = __xdp_map_lookup_elem(map, ifindex); + ri->tgt_value = lookup_elem(map, ifindex); if (unlikely(!ri->tgt_value)) { - /* If the lookup fails we want to clear out the state in the - * redirect_info struct completely, so that if an eBPF program - * performs multiple lookups, the last one always takes - * precedence. 
- */ WRITE_ONCE(ri->map, NULL); return flags; } @@ -4137,8 +4116,44 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, return XDP_REDIRECT; } +BPF_CALL_3(bpf_xdp_redirect_devmap, struct bpf_map *, map, u32, ifindex, u64, flags) +{ + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem); +} + +BPF_CALL_3(bpf_xdp_redirect_devmap_hash, struct bpf_map *, map, u32, ifindex, u64, flags) +{ + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem); +} + +BPF_CALL_3(bpf_xdp_redirect_cpumap, struct bpf_map *, map, u32, ifindex, u64, flags) +{ + return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem); +} + +BPF_CALL_3(bpf_xdp_redirect_xskmap, struct bpf_map *, map, u32, ifindex, u64, flags) +{ + return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem); +} + +bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type) +{ + switch (map_type) { + case BPF_MAP_TYPE_DEVMAP: + return bpf_xdp_redirect_devmap; + case BPF_MAP_TYPE_DEVMAP_HASH: + return bpf_xdp_redirect_devmap_hash; + case BPF_MAP_TYPE_CPUMAP: + return bpf_xdp_redirect_cpumap; + case BPF_MAP_TYPE_XSKMAP: + return bpf_xdp_redirect_xskmap; + default: + return NULL; + } +} + +/* NB! .func is NULL! get_xdp_redirect_func() is used instead! */ static const struct bpf_func_proto bpf_xdp_redirect_map_proto = { - .func = bpf_xdp_redirect_map, .gpl_only = false, .ret_type = RET_INTEGER, .arg1_type = ARG_CONST_MAP_PTR, -- 2.27.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
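[Editorial aside: the patch above leans on the verifier's existing call-patching mechanism — the runtime resolves a helper call as `__bpf_call_base + insn->imm`, so steering one helper id to different concrete functions is just a matter of rewriting the 32-bit immediate. A toy user-space model of that idea follows; the names and struct layout are invented, and it assumes all functions live within a 32-bit offset of one another, as they do inside a single kernel image.]

```c
#include <stdint.h>

/* Toy model of BPF call patching: the "runtime" computes the call
 * target as base + imm, so a "verifier" can pick the concrete helper
 * by rewriting the instruction's 32-bit immediate. */

typedef long (*helper_fn)(void);

static long helper_devmap(void) { return 1; }
static long helper_xskmap(void) { return 2; }

/* Stand-in for __bpf_call_base. */
static long call_base(void) { return 0; }

struct toy_insn { int32_t imm; };

static void patch_call(struct toy_insn *insn, helper_fn fn)
{
	/* Assumes fn and call_base are within +/-2 GiB of each other,
	 * which holds within one translation unit / kernel image. */
	insn->imm = (int32_t)((intptr_t)fn - (intptr_t)call_base);
}

static long run_call(const struct toy_insn *insn)
{
	helper_fn fn = (helper_fn)((intptr_t)call_base + insn->imm);
	return fn();
}
```

This is why record_func_map() must start tracking BPF_FUNC_redirect_map: the verifier needs the map pointer recorded per call site so fixup_bpf_calls() knows which per-map function's offset to write into imm.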
* Re: [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP 2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel @ 2021-02-19 17:05 ` Toke Høiland-Jørgensen 2021-02-19 17:47 ` Björn Töpel 2021-02-19 17:08 ` kernel test robot 2021-02-19 18:13 ` kernel test robot 2 siblings, 1 reply; 9+ messages in thread From: Toke Høiland-Jørgensen @ 2021-02-19 17:05 UTC (permalink / raw) To: Björn Töpel, ast, daniel, netdev, bpf Cc: Björn Töpel, maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem Björn Töpel <bjorn.topel@gmail.com> writes: > From: Björn Töpel <bjorn.topel@intel.com> > > Currently the bpf_redirect_map() implementation dispatches to the > correct map-lookup function via a switch-statement. To avoid the > dispatching, this change adds one bpf_redirect_map() implementation per > map. Correct function is automatically selected by the BPF verifier. > > rfc->v1: Get rid of the macro and use __always_inline. (Jesper) > > Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Nice! Way better with the __always_inline. 
One small nit below, but otherwise: Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> > --- > include/linux/bpf.h | 20 +++++++------ > include/linux/filter.h | 2 ++ > include/net/xdp_sock.h | 6 ++-- > kernel/bpf/cpumap.c | 2 +- > kernel/bpf/devmap.c | 4 +-- > kernel/bpf/verifier.c | 28 +++++++++++------- > net/core/filter.c | 67 ++++++++++++++++++++++++++---------------- > 7 files changed, 76 insertions(+), 53 deletions(-) > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index cccaef1088ea..3dd186eeaf98 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -314,12 +314,14 @@ enum bpf_return_type { > RET_PTR_TO_BTF_ID, /* returns a pointer to a btf_id */ > }; > > +typedef u64 (*bpf_func_proto_func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); > + > /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs > * to in-kernel helper functions and for adjusting imm32 field in BPF_CALL > * instructions after verifying > */ > struct bpf_func_proto { > - u64 (*func)(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5); > + bpf_func_proto_func func; > bool gpl_only; > bool pkt_access; > enum bpf_return_type ret_type; > @@ -1429,9 +1431,11 @@ struct btf *bpf_get_btf_vmlinux(void); > /* Map specifics */ > struct xdp_buff; > struct sk_buff; > +struct bpf_dtab_netdev; > +struct bpf_cpu_map_entry; > > -struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key); > -struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key); > +void *__dev_map_lookup_elem(struct bpf_map *map, u32 key); > +void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key); > void __dev_flush(void); > int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp, > struct net_device *dev_rx); > @@ -1441,7 +1445,7 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, > struct bpf_prog *xdp_prog); > bool dev_map_can_have_prog(struct bpf_map *map); > > -struct bpf_cpu_map_entry 
*__cpu_map_lookup_elem(struct bpf_map *map, u32 key); > +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key); > void __cpu_map_flush(void); > int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp, > struct net_device *dev_rx); > @@ -1568,14 +1572,12 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags) > return -EOPNOTSUPP; > } > > -static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map, > - u32 key) > +static inline void *__dev_map_lookup_elem(struct bpf_map *map, u32 key) > { > return NULL; > } > > -static inline struct net_device *__dev_map_hash_lookup_elem(struct bpf_map *map, > - u32 key) > +static inline void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) > { > return NULL; > } > @@ -1615,7 +1617,7 @@ static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, > } > > static inline > -struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) > +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) > { > return NULL; > } > diff --git a/include/linux/filter.h b/include/linux/filter.h > index 3b00fc906ccd..1dedcf66b694 100644 > --- a/include/linux/filter.h > +++ b/include/linux/filter.h > @@ -1472,4 +1472,6 @@ static inline bool bpf_sk_lookup_run_v6(struct net *net, int protocol, > } > #endif /* IS_ENABLED(CONFIG_IPV6) */ > > +bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type); > + > #endif /* __LINUX_FILTER_H__ */ > diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h > index cc17bc957548..da4139a58630 100644 > --- a/include/net/xdp_sock.h > +++ b/include/net/xdp_sock.h > @@ -80,8 +80,7 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp); > int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp); > void __xsk_map_flush(void); > > -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, > - u32 key) > +static inline void *__xsk_map_lookup_elem(struct bpf_map *map, u32 key) > { > 
struct xsk_map *m = container_of(map, struct xsk_map, map); > struct xdp_sock *xs; > @@ -109,8 +108,7 @@ static inline void __xsk_map_flush(void) > { > } > > -static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map, > - u32 key) > +static inline void *__xsk_map_lookup_elem(struct bpf_map *map, u32 key) > { > return NULL; > } > diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c > index 5d1469de6921..a4d2cb93cd69 100644 > --- a/kernel/bpf/cpumap.c > +++ b/kernel/bpf/cpumap.c > @@ -563,7 +563,7 @@ static void cpu_map_free(struct bpf_map *map) > kfree(cmap); > } > > -struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) > +void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key) > { > struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map); > struct bpf_cpu_map_entry *rcpu; > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > index 85d9d1b72a33..37ac4cde9713 100644 > --- a/kernel/bpf/devmap.c > +++ b/kernel/bpf/devmap.c > @@ -258,7 +258,7 @@ static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key) > return 0; > } > > -struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) > +void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key) > { > struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); > struct hlist_head *head = dev_map_index_hash(dtab, key); > @@ -392,7 +392,7 @@ void __dev_flush(void) > * update happens in parallel here a dev_put wont happen until after reading the > * ifindex. 
> */ > -struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key) > +void *__dev_map_lookup_elem(struct bpf_map *map, u32 key) > { > struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); > struct bpf_dtab_netdev *obj; > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 3d34ba492d46..b5fb0c4e911a 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, > func_id != BPF_FUNC_map_delete_elem && > func_id != BPF_FUNC_map_push_elem && > func_id != BPF_FUNC_map_pop_elem && > - func_id != BPF_FUNC_map_peek_elem) > + func_id != BPF_FUNC_map_peek_elem && > + func_id != BPF_FUNC_redirect_map) > return 0; > > if (map == NULL) { > @@ -11860,17 +11861,22 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) > } > > patch_call_imm: > - fn = env->ops->get_func_proto(insn->imm, env->prog); > - /* all functions that have prototype and verifier allowed > - * programs to call them, must be real in-kernel functions > - */ > - if (!fn->func) { > - verbose(env, > - "kernel subsystem misconfigured func %s#%d\n", > - func_id_name(insn->imm), insn->imm); > - return -EFAULT; > + if (insn->imm == BPF_FUNC_redirect_map) { > + aux = &env->insn_aux_data[i]; > + map_ptr = BPF_MAP_PTR(aux->map_ptr_state); > + insn->imm = get_xdp_redirect_func(map_ptr->map_type) - __bpf_call_base; > + } else { > + fn = env->ops->get_func_proto(insn->imm, env->prog); > + /* all functions that have prototype and verifier allowed > + * programs to call them, must be real in-kernel functions > + */ > + if (!fn->func) { > + verbose(env, "kernel subsystem misconfigured func %s#%d\n", > + func_id_name(insn->imm), insn->imm); > + return -EFAULT; > + } > + insn->imm = fn->func - __bpf_call_base; > } > - insn->imm = fn->func - __bpf_call_base; > } > > /* Since poke tab is now finalized, publish aux to tracker. 
*/ > diff --git a/net/core/filter.c b/net/core/filter.c > index adfdad234674..fd64d768e16a 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -3944,22 +3944,6 @@ void xdp_do_flush(void) > } > EXPORT_SYMBOL_GPL(xdp_do_flush); > > -static inline void *__xdp_map_lookup_elem(struct bpf_map *map, u32 index) > -{ > - switch (map->map_type) { > - case BPF_MAP_TYPE_DEVMAP: > - return __dev_map_lookup_elem(map, index); > - case BPF_MAP_TYPE_DEVMAP_HASH: > - return __dev_map_hash_lookup_elem(map, index); > - case BPF_MAP_TYPE_CPUMAP: > - return __cpu_map_lookup_elem(map, index); > - case BPF_MAP_TYPE_XSKMAP: > - return __xsk_map_lookup_elem(map, index); > - default: > - return NULL; > - } > -} > - > void bpf_clear_redirect_map(struct bpf_map *map) > { > struct bpf_redirect_info *ri; > @@ -4110,22 +4094,17 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = { > .arg2_type = ARG_ANYTHING, > }; > > -BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, > - u64, flags) > +static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags, > + void *lookup_elem(struct bpf_map *map, > + u32 key)) > { > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > - /* Lower bits of the flags are used as return code on lookup failure */ > if (unlikely(flags > XDP_TX)) > return XDP_ABORTED; > > - ri->tgt_value = __xdp_map_lookup_elem(map, ifindex); > + ri->tgt_value = lookup_elem(map, ifindex); > if (unlikely(!ri->tgt_value)) { > - /* If the lookup fails we want to clear out the state in the > - * redirect_info struct completely, so that if an eBPF program > - * performs multiple lookups, the last one always takes > - * precedence. > - */ Why remove the comments? 
> WRITE_ONCE(ri->map, NULL); > return flags; > } > @@ -4137,8 +4116,44 @@ BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex, > return XDP_REDIRECT; > } > > +BPF_CALL_3(bpf_xdp_redirect_devmap, struct bpf_map *, map, u32, ifindex, u64, flags) > +{ > + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem); > +} > + > +BPF_CALL_3(bpf_xdp_redirect_devmap_hash, struct bpf_map *, map, u32, ifindex, u64, flags) > +{ > + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem); > +} > + > +BPF_CALL_3(bpf_xdp_redirect_cpumap, struct bpf_map *, map, u32, ifindex, u64, flags) > +{ > + return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem); > +} > + > +BPF_CALL_3(bpf_xdp_redirect_xskmap, struct bpf_map *, map, u32, ifindex, u64, flags) > +{ > + return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem); > +} > + > +bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type) > +{ > + switch (map_type) { > + case BPF_MAP_TYPE_DEVMAP: > + return bpf_xdp_redirect_devmap; > + case BPF_MAP_TYPE_DEVMAP_HASH: > + return bpf_xdp_redirect_devmap_hash; > + case BPF_MAP_TYPE_CPUMAP: > + return bpf_xdp_redirect_cpumap; > + case BPF_MAP_TYPE_XSKMAP: > + return bpf_xdp_redirect_xskmap; > + default: > + return NULL; > + } > +} > + > +/* NB! .func is NULL! get_xdp_redirect_func() is used instead! */ > static const struct bpf_func_proto bpf_xdp_redirect_map_proto = { > - .func = bpf_xdp_redirect_map, > .gpl_only = false, > .ret_type = RET_INTEGER, > .arg1_type = ARG_CONST_MAP_PTR, > -- > 2.27.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP
  2021-02-19 17:05 ` Toke Høiland-Jørgensen
@ 2021-02-19 17:47 ` Björn Töpel
  0 siblings, 0 replies; 9+ messages in thread

From: Björn Töpel @ 2021-02-19 17:47 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf
  Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On 2021-02-19 18:05, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@gmail.com> writes:
>
> [...]
>
>> @@ -4110,22 +4094,17 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = {
>>  	.arg2_type	= ARG_ANYTHING,
>>  };
>>
>> -BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex,
>> -	   u64, flags)
>> +static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags,
>> +						  void *lookup_elem(struct bpf_map *map,
>> +								    u32 key))
>>  {
>>  	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
>>
>> -	/* Lower bits of the flags are used as return code on lookup failure */
>>  	if (unlikely(flags > XDP_TX))
>>  		return XDP_ABORTED;
>>
>> -	ri->tgt_value = __xdp_map_lookup_elem(map, ifindex);
>> +	ri->tgt_value = lookup_elem(map, ifindex);
>>  	if (unlikely(!ri->tgt_value)) {
>> -		/* If the lookup fails we want to clear out the state in the
>> -		 * redirect_info struct completely, so that if an eBPF program
>> -		 * performs multiple lookups, the last one always takes
>> -		 * precedence.
>> -		 */
>
> Why remove the comments?
>

Ugh, no reason. I'll do a v2. LKP had a warning as well.


Thanks,
Björn

[...]
* Re: [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP 2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel 2021-02-19 17:05 ` Toke Høiland-Jørgensen @ 2021-02-19 17:08 ` kernel test robot 2021-02-19 18:13 ` kernel test robot 2 siblings, 0 replies; 9+ messages in thread From: kernel test robot @ 2021-02-19 17:08 UTC (permalink / raw) To: Björn Töpel, ast, daniel, netdev, bpf Cc: kbuild-all, Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend [-- Attachment #1: Type: text/plain, Size: 3314 bytes --] Hi "Björn, I love your patch! Perhaps something to improve: [auto build test WARNING on 7b1e385c9a488de9291eaaa412146d3972e9dec5] url: https://github.com/0day-ci/linux/commits/Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210219-230349 base: 7b1e385c9a488de9291eaaa412146d3972e9dec5 config: nds32-randconfig-r006-20210219 (attached as .config) compiler: nds32le-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/e784328ffb3b588155aeee02ff6a96b4a6b7cf20 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210219-230349 git checkout e784328ffb3b588155aeee02ff6a96b4a6b7cf20 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): In file included from include/linux/bpf-cgroup.h:5, from include/linux/cgroup-defs.h:22, from include/linux/cgroup.h:28, from include/linux/hugetlb.h:9, from kernel/events/core.c:31: >> include/linux/bpf.h:1629:42: warning: 'struct bpf_cpu_map_entry' declared inside 
parameter list will not be visible outside of this definition or declaration 1629 | static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, | ^~~~~~~~~~~~~~~~~ kernel/events/core.c:6539:6: warning: no previous prototype for 'perf_pmu_snapshot_aux' [-Wmissing-prototypes] 6539 | long perf_pmu_snapshot_aux(struct perf_buffer *rb, | ^~~~~~~~~~~~~~~~~~~~~ -- In file included from include/linux/bpf-cgroup.h:5, from include/linux/cgroup-defs.h:22, from include/linux/cgroup.h:28, from include/linux/perf_event.h:57, from kernel/events/ring_buffer.c:11: >> include/linux/bpf.h:1629:42: warning: 'struct bpf_cpu_map_entry' declared inside parameter list will not be visible outside of this definition or declaration 1629 | static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, | ^~~~~~~~~~~~~~~~~ vim +1629 include/linux/bpf.h 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1628 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 @1629 static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1630 struct xdp_buff *xdp, 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1631 struct net_device *dev_rx) 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1632 { 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1633 return 0; 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1634 } 040ee69226f8a9 Al Viro 2017-12-02 1635 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 28204 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP 2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel 2021-02-19 17:05 ` Toke Høiland-Jørgensen 2021-02-19 17:08 ` kernel test robot @ 2021-02-19 18:13 ` kernel test robot 2 siblings, 0 replies; 9+ messages in thread From: kernel test robot @ 2021-02-19 18:13 UTC (permalink / raw) To: Björn Töpel, ast, daniel, netdev, bpf Cc: kbuild-all, clang-built-linux, Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend [-- Attachment #1: Type: text/plain, Size: 9899 bytes --] Hi "Björn, I love your patch! Yet something to improve: [auto build test ERROR on 7b1e385c9a488de9291eaaa412146d3972e9dec5] url: https://github.com/0day-ci/linux/commits/Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210219-230349 base: 7b1e385c9a488de9291eaaa412146d3972e9dec5 config: x86_64-randconfig-r032-20210219 (attached as .config) compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project c9439ca36342fb6013187d0a69aef92736951476) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # install x86_64 cross compiling tool for clang build # apt-get install binutils-x86-64-linux-gnu # https://github.com/0day-ci/linux/commit/e784328ffb3b588155aeee02ff6a96b4a6b7cf20 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210219-230349 git checkout e784328ffb3b588155aeee02ff6a96b4a6b7cf20 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All error/warnings (new ones prefixed by >>): In file included from arch/x86/kernel/asm-offsets.c:13: In file included from 
include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:28: In file included from include/linux/cgroup-defs.h:22: In file included from include/linux/bpf-cgroup.h:5: >> include/linux/bpf.h:1629:42: warning: declaration of 'struct bpf_cpu_map_entry' will not be visible outside of this function [-Wvisibility] static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, ^ 1 warning generated. -- In file included from arch/x86/mm/ioremap.c:23: In file included from arch/x86/include/asm/efi.h:7: In file included from arch/x86/include/asm/tlb.h:12: In file included from include/asm-generic/tlb.h:15: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:28: In file included from include/linux/cgroup-defs.h:22: In file included from include/linux/bpf-cgroup.h:5: >> include/linux/bpf.h:1629:42: warning: declaration of 'struct bpf_cpu_map_entry' will not be visible outside of this function [-Wvisibility] static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, ^ arch/x86/mm/ioremap.c:737:17: warning: no previous prototype for function 'early_memremap_pgprot_adjust' [-Wmissing-prototypes] pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr, ^ arch/x86/mm/ioremap.c:737:1: note: declare 'static' if the function is not intended to be used outside of this translation unit pgprot_t __init early_memremap_pgprot_adjust(resource_size_t phys_addr, ^ static 2 warnings generated. 
-- In file included from arch/x86/mm/extable.c:9: In file included from arch/x86/include/asm/traps.h:9: In file included from arch/x86/include/asm/idtentry.h:9: In file included from include/linux/entry-common.h:5: In file included from include/linux/tracehook.h:50: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:28: In file included from include/linux/cgroup-defs.h:22: In file included from include/linux/bpf-cgroup.h:5: >> include/linux/bpf.h:1629:42: warning: declaration of 'struct bpf_cpu_map_entry' will not be visible outside of this function [-Wvisibility] static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, ^ arch/x86/mm/extable.c:27:16: warning: no previous prototype for function 'ex_handler_default' [-Wmissing-prototypes] __visible bool ex_handler_default(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:27:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_default(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:37:16: warning: no previous prototype for function 'ex_handler_fault' [-Wmissing-prototypes] __visible bool ex_handler_fault(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:37:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_fault(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:58:16: warning: no previous prototype for function 'ex_handler_fprestore' [-Wmissing-prototypes] __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:58:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:73:16: warning: no previous prototype for function 
'ex_handler_uaccess' [-Wmissing-prototypes] __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:73:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:84:16: warning: no previous prototype for function 'ex_handler_copy' [-Wmissing-prototypes] __visible bool ex_handler_copy(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:84:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_copy(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:96:16: warning: no previous prototype for function 'ex_handler_rdmsr_unsafe' [-Wmissing-prototypes] __visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:96:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:113:16: warning: no previous prototype for function 'ex_handler_wrmsr_unsafe' [-Wmissing-prototypes] __visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:113:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup, ^ static arch/x86/mm/extable.c:129:16: warning: no previous prototype for function 'ex_handler_clear_fs' [-Wmissing-prototypes] __visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup, ^ arch/x86/mm/extable.c:129:11: note: declare 'static' if the function is not intended to be used outside of this translation unit __visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup, ^ 
static 9 warnings generated. -- In file included from <built-in>:3: In file included from drivers/gpu/drm/i915/display/intel_cdclk.h:11: In file included from drivers/gpu/drm/i915/i915_drv.h:46: In file included from include/linux/perf_event.h:57: In file included from include/linux/cgroup.h:28: In file included from include/linux/cgroup-defs.h:22: In file included from include/linux/bpf-cgroup.h:5: >> include/linux/bpf.h:1629:42: error: declaration of 'struct bpf_cpu_map_entry' will not be visible outside of this function [-Werror,-Wvisibility] static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, ^ 1 error generated. -- In file included from arch/x86/kernel/asm-offsets.c:13: In file included from include/linux/suspend.h:5: In file included from include/linux/swap.h:9: In file included from include/linux/memcontrol.h:13: In file included from include/linux/cgroup.h:28: In file included from include/linux/cgroup-defs.h:22: In file included from include/linux/bpf-cgroup.h:5: >> include/linux/bpf.h:1629:42: warning: declaration of 'struct bpf_cpu_map_entry' will not be visible outside of this function [-Wvisibility] static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, ^ 1 warning generated. 
vim +1629 include/linux/bpf.h 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1628 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 @1629 static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1630 struct xdp_buff *xdp, 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1631 struct net_device *dev_rx) 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1632 { 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1633 return 0; 9c270af37bb62e Jesper Dangaard Brouer 2017-10-16 1634 } 040ee69226f8a9 Al Viro 2017-12-02 1635 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 33551 bytes --] ^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions 2021-02-19 14:59 [PATCH bpf-next 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Björn Töpel 2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel @ 2021-02-19 14:59 ` Björn Töpel 2021-02-19 17:10 ` Toke Høiland-Jørgensen 1 sibling, 1 reply; 9+ messages in thread From: Björn Töpel @ 2021-02-19 14:59 UTC (permalink / raw) To: ast, daniel, netdev, bpf Cc: Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem From: Björn Töpel <bjorn.topel@intel.com> The XDP_REDIRECT implementations for maps and non-maps are fairly similar, but obviously need to take different code paths depending on whether or not the target uses a map. Today, the redirect targets for XDP either use a map or are based on an ifindex. Here, an explicit redirect type is added to bpf_redirect_info, instead of the actual map. Redirect type, map item/ifindex, and the map_id (if any) are passed to xdp_do_redirect(). In addition to making the code easier to follow, using an explicit type in bpf_redirect_info has a slight positive performance impact by avoiding a pointer indirection for the map type lookup, and instead using the cacheline for bpf_redirect_info. Since the actual map is not passed via bpf_redirect_info anymore, the map lookup is only done in the BPF helper. This means that the bpf_clear_redirect_map() function can be removed. The actual map item is RCU protected. The bpf_redirect_info flags member is not used by XDP, and not read/written any more. The map member is only written to when required/used, and not unconditionally. rfc->v1: Use map_id, and remove bpf_clear_redirect_map().
(Toke) Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> --- include/linux/filter.h | 11 ++- include/trace/events/xdp.h | 66 +++++++++------ kernel/bpf/cpumap.c | 1 - kernel/bpf/devmap.c | 1 - net/core/filter.c | 162 ++++++++++++++++--------------------- net/xdp/xskmap.c | 1 - 6 files changed, 121 insertions(+), 121 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 1dedcf66b694..1f3cf2a1e116 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -646,11 +646,20 @@ struct bpf_redirect_info { u32 flags; u32 tgt_index; void *tgt_value; - struct bpf_map *map; + u32 map_id; + u32 tgt_type; u32 kern_flags; struct bpf_nh_params nh; }; +enum xdp_redirect_type { + XDP_REDIR_UNSET, + XDP_REDIR_DEV_IFINDEX, + XDP_REDIR_DEV_MAP, + XDP_REDIR_CPU_MAP, + XDP_REDIR_XSK_MAP, +}; + DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); /* flags for bpf_redirect_info kern_flags */ diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index 76a97176ab81..538321735447 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -86,19 +86,15 @@ struct _bpf_dtab_netdev { }; #endif /* __DEVMAP_OBJ_TYPE */ -#define devmap_ifindex(tgt, map) \ - (((map->map_type == BPF_MAP_TYPE_DEVMAP || \ - map->map_type == BPF_MAP_TYPE_DEVMAP_HASH)) ? 
\ - ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex : 0) - DECLARE_EVENT_CLASS(xdp_redirect_template, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), - TP_ARGS(dev, xdp, tgt, err, map, index), + TP_ARGS(dev, xdp, tgt, err, type, ri), TP_STRUCT__entry( __field(int, prog_id) @@ -111,14 +107,30 @@ DECLARE_EVENT_CLASS(xdp_redirect_template, ), TP_fast_assign( + u32 ifindex = 0, map_id = 0, index = ri->tgt_index; + + switch (type) { + case XDP_REDIR_DEV_MAP: + ifindex = ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex; + fallthrough; + case XDP_REDIR_CPU_MAP: + case XDP_REDIR_XSK_MAP: + map_id = ri->map_id; + break; + case XDP_REDIR_DEV_IFINDEX: + ifindex = (u32)(long)tgt; + break; + default: + break; + } + __entry->prog_id = xdp->aux->id; __entry->act = XDP_REDIRECT; __entry->ifindex = dev->ifindex; __entry->err = err; - __entry->to_ifindex = map ? devmap_ifindex(tgt, map) : - index; - __entry->map_id = map ? map->id : 0; - __entry->map_index = map ? 
index : 0; + __entry->to_ifindex = ifindex; + __entry->map_id = map_id; + __entry->map_index = index; ), TP_printk("prog_id=%d action=%s ifindex=%d to_ifindex=%d err=%d" @@ -133,45 +145,49 @@ DEFINE_EVENT(xdp_redirect_template, xdp_redirect, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); DEFINE_EVENT(xdp_redirect_template, xdp_redirect_err, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); #define _trace_xdp_redirect(dev, xdp, to) \ - trace_xdp_redirect(dev, xdp, NULL, 0, NULL, to) + trace_xdp_redirect(dev, xdp, NULL, 0, XDP_REDIR_DEV_IFINDEX, NULL) #define _trace_xdp_redirect_err(dev, xdp, to, err) \ - trace_xdp_redirect_err(dev, xdp, NULL, err, NULL, to) + trace_xdp_redirect_err(dev, xdp, NULL, err, XDP_REDIR_DEV_IFINDEX, NULL) -#define _trace_xdp_redirect_map(dev, xdp, to, map, index) \ - trace_xdp_redirect(dev, xdp, to, 0, map, index) +#define _trace_xdp_redirect_map(dev, xdp, to, type, ri) \ + trace_xdp_redirect(dev, xdp, to, 0, type, ri) -#define _trace_xdp_redirect_map_err(dev, xdp, to, map, index, err) \ - trace_xdp_redirect_err(dev, xdp, to, err, map, index) +#define _trace_xdp_redirect_map_err(dev, xdp, to, type, ri, err) \ + trace_xdp_redirect_err(dev, xdp, to, err, type, ri) /* not used anymore, but kept around so as not to break old programs */ DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum 
xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map_err, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); TRACE_EVENT(xdp_cpumap_kthread, diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index a4d2cb93cd69..b7f4d22f5c8d 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -543,7 +543,6 @@ static void cpu_map_free(struct bpf_map *map) * complete. */ - bpf_clear_redirect_map(map); synchronize_rcu(); /* For cpu_map the remote CPUs can still be using the entries diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 37ac4cde9713..b5681a98020d 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -197,7 +197,6 @@ static void dev_map_free(struct bpf_map *map) list_del_rcu(&dtab->list); spin_unlock(&dev_map_lock); - bpf_clear_redirect_map(map); synchronize_rcu(); /* Make sure prior __dev_map_entry_free() have completed. 
*/ diff --git a/net/core/filter.c b/net/core/filter.c index fd64d768e16a..56074b88d7e2 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3919,23 +3919,6 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { .arg2_type = ARG_ANYTHING, }; -static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, - struct bpf_map *map, struct xdp_buff *xdp) -{ - switch (map->map_type) { - case BPF_MAP_TYPE_DEVMAP: - case BPF_MAP_TYPE_DEVMAP_HASH: - return dev_map_enqueue(fwd, xdp, dev_rx); - case BPF_MAP_TYPE_CPUMAP: - return cpu_map_enqueue(fwd, xdp, dev_rx); - case BPF_MAP_TYPE_XSKMAP: - return __xsk_map_redirect(fwd, xdp); - default: - return -EBADRQC; - } - return 0; -} - void xdp_do_flush(void) { __dev_flush(); @@ -3944,55 +3927,45 @@ void xdp_do_flush(void) } EXPORT_SYMBOL_GPL(xdp_do_flush); -void bpf_clear_redirect_map(struct bpf_map *map) -{ - struct bpf_redirect_info *ri; - int cpu; - - for_each_possible_cpu(cpu) { - ri = per_cpu_ptr(&bpf_redirect_info, cpu); - /* Avoid polluting remote cacheline due to writes if - * not needed. Once we pass this test, we need the - * cmpxchg() to make sure it hasn't been changed in - * the meantime by remote CPU. 
- */ - if (unlikely(READ_ONCE(ri->map) == map)) - cmpxchg(&ri->map, map, NULL); - } -} - int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - struct bpf_map *map = READ_ONCE(ri->map); - u32 index = ri->tgt_index; + enum xdp_redirect_type type = ri->tgt_type; void *fwd = ri->tgt_value; int err; - ri->tgt_index = 0; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_UNSET; - if (unlikely(!map)) { - fwd = dev_get_by_index_rcu(dev_net(dev), index); + switch (type) { + case XDP_REDIR_DEV_IFINDEX: + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); if (unlikely(!fwd)) { err = -EINVAL; - goto err; + break; } - err = dev_xdp_enqueue(fwd, xdp, dev); - } else { - err = __bpf_tx_xdp_map(dev, fwd, map, xdp); + break; + case XDP_REDIR_DEV_MAP: + err = dev_map_enqueue(fwd, xdp, dev); + break; + case XDP_REDIR_CPU_MAP: + err = cpu_map_enqueue(fwd, xdp, dev); + break; + case XDP_REDIR_XSK_MAP: + err = __xsk_map_redirect(fwd, xdp); + break; + default: + err = -EBADRQC; } if (unlikely(err)) goto err; - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); return 0; err: - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); return err; } EXPORT_SYMBOL_GPL(xdp_do_redirect); @@ -4001,41 +3974,40 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog, - struct bpf_map *map) + void *fwd, + enum xdp_redirect_type type) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - u32 index = ri->tgt_index; - void *fwd = ri->tgt_value; - int err = 0; - - ri->tgt_index = 0; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); + int err; - if (map->map_type == BPF_MAP_TYPE_DEVMAP || - map->map_type == 
BPF_MAP_TYPE_DEVMAP_HASH) { + switch (type) { + case XDP_REDIR_DEV_MAP: { struct bpf_dtab_netdev *dst = fwd; err = dev_map_generic_redirect(dst, skb, xdp_prog); if (unlikely(err)) goto err; - } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { + break; + } + case XDP_REDIR_XSK_MAP: { struct xdp_sock *xs = fwd; err = xsk_generic_rcv(xs, xdp); if (err) goto err; consume_skb(skb); - } else { + break; + } + default: /* TODO: Handle BPF_MAP_TYPE_CPUMAP */ err = -EBADRQC; goto err; } - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); return 0; err: - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); return err; } @@ -4043,29 +4015,31 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - struct bpf_map *map = READ_ONCE(ri->map); - u32 index = ri->tgt_index; - struct net_device *fwd; + enum xdp_redirect_type type = ri->tgt_type; + void *fwd = ri->tgt_value; int err = 0; - if (map) - return xdp_do_generic_redirect_map(dev, skb, xdp, xdp_prog, - map); - ri->tgt_index = 0; - fwd = dev_get_by_index_rcu(dev_net(dev), index); - if (unlikely(!fwd)) { - err = -EINVAL; - goto err; - } + ri->tgt_type = XDP_REDIR_UNSET; + ri->tgt_value = NULL; - err = xdp_ok_fwd_dev(fwd, skb->len); - if (unlikely(err)) - goto err; + if (type == XDP_REDIR_DEV_IFINDEX) { + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); + if (unlikely(!fwd)) { + err = -EINVAL; + goto err; + } - skb->dev = fwd; - _trace_xdp_redirect(dev, xdp_prog, index); - generic_xdp_tx(skb, xdp_prog); - return 0; + err = xdp_ok_fwd_dev(fwd, skb->len); + if (unlikely(err)) + goto err; + + skb->dev = fwd; + _trace_xdp_redirect(dev, xdp_prog, index); + generic_xdp_tx(skb, xdp_prog); + return 0; + } + + return xdp_do_generic_redirect_map(dev, skb, 
xdp, xdp_prog, fwd, type); err: _trace_xdp_redirect_err(dev, xdp_prog, index, err); return err; @@ -4078,10 +4052,9 @@ BPF_CALL_2(bpf_xdp_redirect, u32, ifindex, u64, flags) if (unlikely(flags)) return XDP_ABORTED; - ri->flags = flags; - ri->tgt_index = ifindex; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_DEV_IFINDEX; + ri->tgt_index = 0; + ri->tgt_value = (void *)(long)ifindex; return XDP_REDIRECT; } @@ -4096,7 +4069,8 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = { static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags, void *lookup_elem(struct bpf_map *map, - u32 key)) + u32 key), + enum xdp_redirect_type type) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); @@ -4105,35 +4079,39 @@ static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind ri->tgt_value = lookup_elem(map, ifindex); if (unlikely(!ri->tgt_value)) { - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_UNSET; return flags; } - ri->flags = flags; ri->tgt_index = ifindex; - WRITE_ONCE(ri->map, map); + ri->tgt_type = type; + ri->map_id = map->id; return XDP_REDIRECT; } BPF_CALL_3(bpf_xdp_redirect_devmap, struct bpf_map *, map, u32, ifindex, u64, flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem, + XDP_REDIR_DEV_MAP); } BPF_CALL_3(bpf_xdp_redirect_devmap_hash, struct bpf_map *, map, u32, ifindex, u64, flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem, + XDP_REDIR_DEV_MAP); } BPF_CALL_3(bpf_xdp_redirect_cpumap, struct bpf_map *, map, u32, ifindex, u64, flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem, + XDP_REDIR_CPU_MAP); } 
BPF_CALL_3(bpf_xdp_redirect_xskmap, struct bpf_map *, map, u32, ifindex, u64, flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem, + XDP_REDIR_XSK_MAP); } bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type) diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c index 113fd9017203..c285d3dd04ad 100644 --- a/net/xdp/xskmap.c +++ b/net/xdp/xskmap.c @@ -87,7 +87,6 @@ static void xsk_map_free(struct bpf_map *map) { struct xsk_map *m = container_of(map, struct xsk_map, map); - bpf_clear_redirect_map(map); synchronize_net(); bpf_map_area_free(m); } -- 2.27.0 ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions 2021-02-19 14:59 ` [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions Björn Töpel @ 2021-02-19 17:10 ` Toke Høiland-Jørgensen 2021-02-19 17:49 ` Björn Töpel 0 siblings, 1 reply; 9+ messages in thread From: Toke Høiland-Jørgensen @ 2021-02-19 17:10 UTC (permalink / raw) To: Björn Töpel, ast, daniel, netdev, bpf Cc: Björn Töpel, maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem Björn Töpel <bjorn.topel@gmail.com> writes: > From: Björn Töpel <bjorn.topel@intel.com> > > The XDP_REDIRECT implementations for maps and non-maps are fairly > similar, but obviously need to take different code paths depending on > if the target is using a map or not. Today, the redirect targets for > XDP either uses a map, or is based on ifindex. > > Here, an explicit redirect type is added to bpf_redirect_info, instead > of the actual map. Redirect type, map item/ifindex, and the map_id (if > any) is passed to xdp_do_redirect(). > > In addition to making the code easier to follow, using an explicit > type in bpf_redirect_info has a slight positive performance impact by > avoiding a pointer indirection for the map type lookup, and instead > use the cacheline for bpf_redirect_info. > > Since the actual map is not passed via bpf_redirect_info anymore, the > map lookup is only done in the BPF helper. This means that the > bpf_clear_redirect_map() function can be removed. The actual map item > is RCU protected. > > The bpf_redirect_info flags member is not used by XDP, and not > read/written any more. The map member is only written to when > required/used, and not unconditionally. > > rfc->v1: Use map_id, and remove bpf_clear_redirect_map(). (Toke) > > Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com> Very cool! 
Also a small nit below, but otherwise: Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> > --- > include/linux/filter.h | 11 ++- > include/trace/events/xdp.h | 66 +++++++++------ > kernel/bpf/cpumap.c | 1 - > kernel/bpf/devmap.c | 1 - > net/core/filter.c | 162 ++++++++++++++++--------------------- > net/xdp/xskmap.c | 1 - > 6 files changed, 121 insertions(+), 121 deletions(-) > > diff --git a/include/linux/filter.h b/include/linux/filter.h > index 1dedcf66b694..1f3cf2a1e116 100644 > --- a/include/linux/filter.h > +++ b/include/linux/filter.h > @@ -646,11 +646,20 @@ struct bpf_redirect_info { > u32 flags; > u32 tgt_index; > void *tgt_value; > - struct bpf_map *map; > + u32 map_id; > + u32 tgt_type; > u32 kern_flags; > struct bpf_nh_params nh; > }; > > +enum xdp_redirect_type { > + XDP_REDIR_UNSET, > + XDP_REDIR_DEV_IFINDEX, > + XDP_REDIR_DEV_MAP, > + XDP_REDIR_CPU_MAP, > + XDP_REDIR_XSK_MAP, > +}; > + > DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); > > /* flags for bpf_redirect_info kern_flags */ > diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h > index 76a97176ab81..538321735447 100644 > --- a/include/trace/events/xdp.h > +++ b/include/trace/events/xdp.h > @@ -86,19 +86,15 @@ struct _bpf_dtab_netdev { > }; > #endif /* __DEVMAP_OBJ_TYPE */ > > -#define devmap_ifindex(tgt, map) \ > - (((map->map_type == BPF_MAP_TYPE_DEVMAP || \ > - map->map_type == BPF_MAP_TYPE_DEVMAP_HASH)) ? 
\ > - ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex : 0) > - > DECLARE_EVENT_CLASS(xdp_redirect_template, > > TP_PROTO(const struct net_device *dev, > const struct bpf_prog *xdp, > const void *tgt, int err, > - const struct bpf_map *map, u32 index), > + enum xdp_redirect_type type, > + const struct bpf_redirect_info *ri), > > - TP_ARGS(dev, xdp, tgt, err, map, index), > + TP_ARGS(dev, xdp, tgt, err, type, ri), > > TP_STRUCT__entry( > __field(int, prog_id) > @@ -111,14 +107,30 @@ DECLARE_EVENT_CLASS(xdp_redirect_template, > ), > > TP_fast_assign( > + u32 ifindex = 0, map_id = 0, index = ri->tgt_index; > + > + switch (type) { > + case XDP_REDIR_DEV_MAP: > + ifindex = ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex; > + fallthrough; > + case XDP_REDIR_CPU_MAP: > + case XDP_REDIR_XSK_MAP: > + map_id = ri->map_id; > + break; > + case XDP_REDIR_DEV_IFINDEX: > + ifindex = (u32)(long)tgt; > + break; > + default: > + break; > + } > + > __entry->prog_id = xdp->aux->id; > __entry->act = XDP_REDIRECT; > __entry->ifindex = dev->ifindex; > __entry->err = err; > - __entry->to_ifindex = map ? devmap_ifindex(tgt, map) : > - index; > - __entry->map_id = map ? map->id : 0; > - __entry->map_index = map ? 
index : 0; > + __entry->to_ifindex = ifindex; > + __entry->map_id = map_id; > + __entry->map_index = index; > ), > > TP_printk("prog_id=%d action=%s ifindex=%d to_ifindex=%d err=%d" > @@ -133,45 +145,49 @@ DEFINE_EVENT(xdp_redirect_template, xdp_redirect, > TP_PROTO(const struct net_device *dev, > const struct bpf_prog *xdp, > const void *tgt, int err, > - const struct bpf_map *map, u32 index), > - TP_ARGS(dev, xdp, tgt, err, map, index) > + enum xdp_redirect_type type, > + const struct bpf_redirect_info *ri), > + TP_ARGS(dev, xdp, tgt, err, type, ri) > ); > > DEFINE_EVENT(xdp_redirect_template, xdp_redirect_err, > TP_PROTO(const struct net_device *dev, > const struct bpf_prog *xdp, > const void *tgt, int err, > - const struct bpf_map *map, u32 index), > - TP_ARGS(dev, xdp, tgt, err, map, index) > + enum xdp_redirect_type type, > + const struct bpf_redirect_info *ri), > + TP_ARGS(dev, xdp, tgt, err, type, ri) > ); > > #define _trace_xdp_redirect(dev, xdp, to) \ > - trace_xdp_redirect(dev, xdp, NULL, 0, NULL, to) > + trace_xdp_redirect(dev, xdp, NULL, 0, XDP_REDIR_DEV_IFINDEX, NULL) > > #define _trace_xdp_redirect_err(dev, xdp, to, err) \ > - trace_xdp_redirect_err(dev, xdp, NULL, err, NULL, to) > + trace_xdp_redirect_err(dev, xdp, NULL, err, XDP_REDIR_DEV_IFINDEX, NULL) > > -#define _trace_xdp_redirect_map(dev, xdp, to, map, index) \ > - trace_xdp_redirect(dev, xdp, to, 0, map, index) > +#define _trace_xdp_redirect_map(dev, xdp, to, type, ri) \ > + trace_xdp_redirect(dev, xdp, to, 0, type, ri) > > -#define _trace_xdp_redirect_map_err(dev, xdp, to, map, index, err) \ > - trace_xdp_redirect_err(dev, xdp, to, err, map, index) > +#define _trace_xdp_redirect_map_err(dev, xdp, to, type, ri, err) \ > + trace_xdp_redirect_err(dev, xdp, to, err, type, ri) > > /* not used anymore, but kept around so as not to break old programs */ > DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map, > TP_PROTO(const struct net_device *dev, > const struct bpf_prog *xdp, > const void *tgt, 
int err, > - const struct bpf_map *map, u32 index), > - TP_ARGS(dev, xdp, tgt, err, map, index) > + enum xdp_redirect_type type, > + const struct bpf_redirect_info *ri), > + TP_ARGS(dev, xdp, tgt, err, type, ri) > ); > > DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map_err, > TP_PROTO(const struct net_device *dev, > const struct bpf_prog *xdp, > const void *tgt, int err, > - const struct bpf_map *map, u32 index), > - TP_ARGS(dev, xdp, tgt, err, map, index) > + enum xdp_redirect_type type, > + const struct bpf_redirect_info *ri), > + TP_ARGS(dev, xdp, tgt, err, type, ri) > ); > > TRACE_EVENT(xdp_cpumap_kthread, > diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c > index a4d2cb93cd69..b7f4d22f5c8d 100644 > --- a/kernel/bpf/cpumap.c > +++ b/kernel/bpf/cpumap.c > @@ -543,7 +543,6 @@ static void cpu_map_free(struct bpf_map *map) > * complete. > */ > > - bpf_clear_redirect_map(map); > synchronize_rcu(); > > /* For cpu_map the remote CPUs can still be using the entries > diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c > index 37ac4cde9713..b5681a98020d 100644 > --- a/kernel/bpf/devmap.c > +++ b/kernel/bpf/devmap.c > @@ -197,7 +197,6 @@ static void dev_map_free(struct bpf_map *map) > list_del_rcu(&dtab->list); > spin_unlock(&dev_map_lock); > > - bpf_clear_redirect_map(map); > synchronize_rcu(); > > /* Make sure prior __dev_map_entry_free() have completed. 
*/ > diff --git a/net/core/filter.c b/net/core/filter.c > index fd64d768e16a..56074b88d7e2 100644 > --- a/net/core/filter.c > +++ b/net/core/filter.c > @@ -3919,23 +3919,6 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { > .arg2_type = ARG_ANYTHING, > }; > > -static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, > - struct bpf_map *map, struct xdp_buff *xdp) > -{ > - switch (map->map_type) { > - case BPF_MAP_TYPE_DEVMAP: > - case BPF_MAP_TYPE_DEVMAP_HASH: > - return dev_map_enqueue(fwd, xdp, dev_rx); > - case BPF_MAP_TYPE_CPUMAP: > - return cpu_map_enqueue(fwd, xdp, dev_rx); > - case BPF_MAP_TYPE_XSKMAP: > - return __xsk_map_redirect(fwd, xdp); > - default: > - return -EBADRQC; > - } > - return 0; > -} > - > void xdp_do_flush(void) > { > __dev_flush(); > @@ -3944,55 +3927,45 @@ void xdp_do_flush(void) > } > EXPORT_SYMBOL_GPL(xdp_do_flush); > > -void bpf_clear_redirect_map(struct bpf_map *map) > -{ > - struct bpf_redirect_info *ri; > - int cpu; > - > - for_each_possible_cpu(cpu) { > - ri = per_cpu_ptr(&bpf_redirect_info, cpu); > - /* Avoid polluting remote cacheline due to writes if > - * not needed. Once we pass this test, we need the > - * cmpxchg() to make sure it hasn't been changed in > - * the meantime by remote CPU. 
> - */ > - if (unlikely(READ_ONCE(ri->map) == map)) > - cmpxchg(&ri->map, map, NULL); > - } > -} > - > int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, > struct bpf_prog *xdp_prog) > { > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > - struct bpf_map *map = READ_ONCE(ri->map); > - u32 index = ri->tgt_index; > + enum xdp_redirect_type type = ri->tgt_type; > void *fwd = ri->tgt_value; > int err; > > - ri->tgt_index = 0; > - ri->tgt_value = NULL; > - WRITE_ONCE(ri->map, NULL); > + ri->tgt_type = XDP_REDIR_UNSET; > > - if (unlikely(!map)) { > - fwd = dev_get_by_index_rcu(dev_net(dev), index); > + switch (type) { > + case XDP_REDIR_DEV_IFINDEX: > + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); > if (unlikely(!fwd)) { > err = -EINVAL; > - goto err; > + break; > } > - > err = dev_xdp_enqueue(fwd, xdp, dev); > - } else { > - err = __bpf_tx_xdp_map(dev, fwd, map, xdp); > + break; > + case XDP_REDIR_DEV_MAP: > + err = dev_map_enqueue(fwd, xdp, dev); > + break; > + case XDP_REDIR_CPU_MAP: > + err = cpu_map_enqueue(fwd, xdp, dev); > + break; > + case XDP_REDIR_XSK_MAP: > + err = __xsk_map_redirect(fwd, xdp); > + break; > + default: > + err = -EBADRQC; > } > > if (unlikely(err)) > goto err; > > - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); > + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); > return 0; > err: > - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); > + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); > return err; > } > EXPORT_SYMBOL_GPL(xdp_do_redirect); > @@ -4001,41 +3974,40 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, > struct sk_buff *skb, > struct xdp_buff *xdp, > struct bpf_prog *xdp_prog, > - struct bpf_map *map) > + void *fwd, > + enum xdp_redirect_type type) > { > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > - u32 index = ri->tgt_index; > - void *fwd = ri->tgt_value; > - int err = 0; > - > - ri->tgt_index = 0; > - 
ri->tgt_value = NULL; > - WRITE_ONCE(ri->map, NULL); > + int err; > > - if (map->map_type == BPF_MAP_TYPE_DEVMAP || > - map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { > + switch (type) { > + case XDP_REDIR_DEV_MAP: { > struct bpf_dtab_netdev *dst = fwd; I thought the braces around the case body looked a bit odd. I guess that's to get a local scope for the dst var (and xs var below), right? This is basically a cast, though, so I wonder if you couldn't just as well use the fwd pointer directly (with a cast) in the function call below? WDYT? (Strictly speaking I don't think the compiler will even complain if you omit the cast as well, but having it in there is nice for readability, I think, and guards against someone forgetting to update the call if the function prototype changes). > err = dev_map_generic_redirect(dst, skb, xdp_prog); > if (unlikely(err)) > goto err; > - } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { > + break; > + } > + case XDP_REDIR_XSK_MAP: { > struct xdp_sock *xs = fwd; > > err = xsk_generic_rcv(xs, xdp); > if (err) > goto err; > consume_skb(skb); > - } else { > + break; > + } > + default: > /* TODO: Handle BPF_MAP_TYPE_CPUMAP */ > err = -EBADRQC; > goto err; > } > > - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); > + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); > return 0; > err: > - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); > + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); > return err; > } > > @@ -4043,29 +4015,31 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb, > struct xdp_buff *xdp, struct bpf_prog *xdp_prog) > { > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > - struct bpf_map *map = READ_ONCE(ri->map); > - u32 index = ri->tgt_index; > - struct net_device *fwd; > + enum xdp_redirect_type type = ri->tgt_type; > + void *fwd = ri->tgt_value; > int err = 0; > > - if (map) > - return xdp_do_generic_redirect_map(dev, skb, xdp, 
xdp_prog, > - map); > - ri->tgt_index = 0; > - fwd = dev_get_by_index_rcu(dev_net(dev), index); > - if (unlikely(!fwd)) { > - err = -EINVAL; > - goto err; > - } > + ri->tgt_type = XDP_REDIR_UNSET; > + ri->tgt_value = NULL; > > - err = xdp_ok_fwd_dev(fwd, skb->len); > - if (unlikely(err)) > - goto err; > + if (type == XDP_REDIR_DEV_IFINDEX) { > + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); > + if (unlikely(!fwd)) { > + err = -EINVAL; > + goto err; > + } > > - skb->dev = fwd; > - _trace_xdp_redirect(dev, xdp_prog, index); > - generic_xdp_tx(skb, xdp_prog); > - return 0; > + err = xdp_ok_fwd_dev(fwd, skb->len); > + if (unlikely(err)) > + goto err; > + > + skb->dev = fwd; > + _trace_xdp_redirect(dev, xdp_prog, index); > + generic_xdp_tx(skb, xdp_prog); > + return 0; > + } > + > + return xdp_do_generic_redirect_map(dev, skb, xdp, xdp_prog, fwd, type); > err: > _trace_xdp_redirect_err(dev, xdp_prog, index, err); > return err; > @@ -4078,10 +4052,9 @@ BPF_CALL_2(bpf_xdp_redirect, u32, ifindex, u64, flags) > if (unlikely(flags)) > return XDP_ABORTED; > > - ri->flags = flags; > - ri->tgt_index = ifindex; > - ri->tgt_value = NULL; > - WRITE_ONCE(ri->map, NULL); > + ri->tgt_type = XDP_REDIR_DEV_IFINDEX; > + ri->tgt_index = 0; > + ri->tgt_value = (void *)(long)ifindex; > > return XDP_REDIRECT; > } > @@ -4096,7 +4069,8 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = { > > static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags, > void *lookup_elem(struct bpf_map *map, > - u32 key)) > + u32 key), > + enum xdp_redirect_type type) > { > struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); > > @@ -4105,35 +4079,39 @@ static __always_inline s64 __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind > > ri->tgt_value = lookup_elem(map, ifindex); > if (unlikely(!ri->tgt_value)) { > - WRITE_ONCE(ri->map, NULL); > + ri->tgt_type = XDP_REDIR_UNSET; > return flags; > } > > - ri->flags = flags; > ri->tgt_index = 
ifindex; > - WRITE_ONCE(ri->map, map); > + ri->tgt_type = type; > + ri->map_id = map->id; > > return XDP_REDIRECT; > } > > BPF_CALL_3(bpf_xdp_redirect_devmap, struct bpf_map *, map, u32, ifindex, u64, flags) > { > - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem); > + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem, > + XDP_REDIR_DEV_MAP); > } > > BPF_CALL_3(bpf_xdp_redirect_devmap_hash, struct bpf_map *, map, u32, ifindex, u64, flags) > { > - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem); > + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem, > + XDP_REDIR_DEV_MAP); > } > > BPF_CALL_3(bpf_xdp_redirect_cpumap, struct bpf_map *, map, u32, ifindex, u64, flags) > { > - return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem); > + return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem, > + XDP_REDIR_CPU_MAP); > } > > BPF_CALL_3(bpf_xdp_redirect_xskmap, struct bpf_map *, map, u32, ifindex, u64, flags) > { > - return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem); > + return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem, > + XDP_REDIR_XSK_MAP); > } > > bpf_func_proto_func get_xdp_redirect_func(enum bpf_map_type map_type) > diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c > index 113fd9017203..c285d3dd04ad 100644 > --- a/net/xdp/xskmap.c > +++ b/net/xdp/xskmap.c > @@ -87,7 +87,6 @@ static void xsk_map_free(struct bpf_map *map) > { > struct xsk_map *m = container_of(map, struct xsk_map, map); > > - bpf_clear_redirect_map(map); > synchronize_net(); > bpf_map_area_free(m); > } > -- > 2.27.0 ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions 2021-02-19 17:10 ` Toke Høiland-Jørgensen @ 2021-02-19 17:49 ` Björn Töpel 0 siblings, 0 replies; 9+ messages in thread From: Björn Töpel @ 2021-02-19 17:49 UTC (permalink / raw) To: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem On 2021-02-19 18:10, Toke Høiland-Jørgensen wrote: >> + case XDP_REDIR_DEV_MAP: { >> struct bpf_dtab_netdev *dst = fwd; > I thought the braces around the case body looked a bit odd. I guess > that's to get a local scope for the dst var (and xs var below), right? > This is basically a cast, though, so I wonder if you couldn't just as > well use the fwd pointer directly (with a cast) in the function call > below? WDYT? Yeah. I'll fix that in the next version! Thanks, and have a nice weekend! Björn ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2021-02-19 18:15 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-02-19 14:59 [PATCH bpf-next 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Björn Töpel 2021-02-19 14:59 ` [PATCH bpf-next 1/2] bpf, xdp: per-map bpf_redirect_map functions for XDP Björn Töpel 2021-02-19 17:05 ` Toke Høiland-Jørgensen 2021-02-19 17:47 ` Björn Töpel 2021-02-19 17:08 ` kernel test robot 2021-02-19 18:13 ` kernel test robot 2021-02-19 14:59 ` [PATCH bpf-next 2/2] bpf, xdp: restructure redirect actions Björn Töpel 2021-02-19 17:10 ` Toke Høiland-Jørgensen 2021-02-19 17:49 ` Björn Töpel