* [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect()
From: Björn Töpel @ 2021-02-26 11:23 UTC
To: ast, daniel, netdev, bpf
Cc: Björn Töpel, bjorn.topel, maciej.fijalkowski, hawk, toke,
    magnus.karlsson, john.fastabend, kuba, davem

Hi XDP-folks,

This two-patch series contains two optimizations for the
bpf_redirect_map() helper and the xdp_do_redirect() function.

The bpf_redirect_map() optimization is about avoiding the map lookup
dispatching. Instead of having a switch-statement and selecting the
correct lookup function, we let bpf_redirect_map() be a map operation,
where each map has its own bpf_redirect_map() implementation. This way
the run-time lookup is avoided.

The xdp_do_redirect() patch restructures the code, so that the map
pointer indirection can be avoided.

Performance-wise I got a 3% improvement for XSKMAP
(sample:xdpsock/rx-drop), and 4% (sample:xdp_redirect_map) on my
machine.

More details in each commit.

@Jesper/Toke I dropped your Acked-by: on the first patch, since there
was major restructuring. Please have another look! Thanks!

Changelog:
v3->v4:  Made bpf_redirect_map() a map operation. (Daniel)
v2->v3:  Fix build when CONFIG_NET is not set. (lkp)
v1->v2:  Removed warning when CONFIG_BPF_SYSCALL was not set. (lkp)
         Cleaned up case-clause in xdp_do_generic_redirect_map(). (Toke)
         Re-added comment. (Toke)
rfc->v1: Use map_id, and remove bpf_clear_redirect_map(). (Toke)
         Get rid of the macro and use __always_inline. (Jesper)

rfc: https://lore.kernel.org/bpf/87im7fy9nc.fsf@toke.dk/ (Cover not on lore!)
v1:  https://lore.kernel.org/bpf/20210219145922.63655-1-bjorn.topel@gmail.com/
v2:  https://lore.kernel.org/bpf/20210220153056.111968-1-bjorn.topel@gmail.com/
v3:  https://lore.kernel.org/bpf/20210221200954.164125-3-bjorn.topel@gmail.com/

Cheers,
Björn

Björn Töpel (2):
  bpf, xdp: make bpf_redirect_map() a map operation
  bpf, xdp: restructure redirect actions

 include/linux/bpf.h        |  26 ++----
 include/linux/filter.h     |  39 +++++++-
 include/net/xdp_sock.h     |  19 ----
 include/trace/events/xdp.h |  66 ++++++++-----
 kernel/bpf/cpumap.c        |  10 +-
 kernel/bpf/devmap.c        |  19 +++-
 kernel/bpf/verifier.c      |  11 ++-
 net/core/filter.c          | 183 ++++++++++++-------------------------
 net/xdp/xskmap.c           |  20 +++-
 9 files changed, 195 insertions(+), 198 deletions(-)

base-commit: 9c8f21e6f8856a96634e542a58ef3abf27486801
--
2.27.0
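The difference between switch-based dispatch and a per-map operation, as described in the cover letter, can be modelled in plain userspace C. This is only a rough sketch under stated assumptions: every `toy_*` name is hypothetical and none of this is kernel code. It shows one generic entry point that selects an implementation by map type, next to a map that carries its own function pointer.

```c
#include <assert.h>

/* Toy stand-ins for illustration only; these are not the kernel's types. */
enum toy_map_type { TOY_MAP_ARRAY, TOY_MAP_MODULO };

struct toy_map;

struct toy_map_ops {
	/* Per-map operation, loosely analogous to a map operation in the series. */
	int (*redirect)(struct toy_map *map, unsigned int key);
};

struct toy_map {
	enum toy_map_type type;
	const struct toy_map_ops *ops;
	int values[4];
};

static int toy_array_redirect(struct toy_map *map, unsigned int key)
{
	return key < 4 ? map->values[key] : -1;
}

static int toy_modulo_redirect(struct toy_map *map, unsigned int key)
{
	return map->values[key % 4];
}

static const struct toy_map_ops toy_array_ops = { .redirect = toy_array_redirect };
static const struct toy_map_ops toy_modulo_ops = { .redirect = toy_modulo_redirect };

/* Before: a generic helper picks the implementation with a switch on every call. */
static int toy_redirect_switch(struct toy_map *map, unsigned int key)
{
	switch (map->type) {
	case TOY_MAP_ARRAY:
		return toy_array_redirect(map, key);
	case TOY_MAP_MODULO:
		return toy_modulo_redirect(map, key);
	default:
		return -1;
	}
}

/* After: the map carries its own implementation, so no switch is needed. */
static int toy_redirect_ops(struct toy_map *map, unsigned int key)
{
	return map->ops->redirect(map, key);
}

/* Both dispatch styles must agree on the result. */
static void toy_dispatch_selfcheck(void)
{
	struct toy_map a = { TOY_MAP_ARRAY, &toy_array_ops, { 10, 20, 30, 40 } };
	struct toy_map m = { TOY_MAP_MODULO, &toy_modulo_ops, { 1, 2, 3, 4 } };

	assert(toy_redirect_switch(&a, 2) == 30);
	assert(toy_redirect_ops(&a, 2) == 30);
	assert(toy_redirect_ops(&a, 9) == -1);
	assert(toy_redirect_switch(&m, 5) == toy_redirect_ops(&m, 5));
}
```

In the series itself, according to the commit message of patch 1/2, the per-map function is selected by the BPF verifier rather than through an indirect call at run time; this userspace model does not attempt to show that step.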
* [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Björn Töpel @ 2021-02-26 11:23 UTC
To: ast, daniel, netdev, bpf
Cc: Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson,
    john.fastabend, kuba, davem

From: Björn Töpel <bjorn.topel@intel.com>

Currently the bpf_redirect_map() implementation dispatches to the
correct map-lookup function via a switch-statement. To avoid the
dispatching, this change adds bpf_redirect_map() as a map
operation. Each map provides its bpf_redirect_map() version, and the
correct function is automatically selected by the BPF verifier.

A nice side-effect of the code movement is that the map lookup
functions are now local to the map implementation files, which removes
one additional function call.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/linux/bpf.h    | 26 ++++++--------------------
 include/linux/filter.h | 27 +++++++++++++++++++++++++++
 include/net/xdp_sock.h | 19 -------------------
 kernel/bpf/cpumap.c    |  8 +++++++-
 kernel/bpf/devmap.c    | 16 ++++++++++++++--
 kernel/bpf/verifier.c  | 11 +++++++++--
 net/core/filter.c      | 39 +--------------------------------------
 net/xdp/xskmap.c       | 18 ++++++++++++++++++
 8 files changed, 82 insertions(+), 82 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cccaef1088ea..a44ba904ca37 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -117,6 +117,9 @@ struct bpf_map_ops {
 					   void *owner, u32 size);
 	struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void *owner);
 
+	/* XDP helpers.*/
+	int (*xdp_redirect_map)(struct bpf_map *map, u32 ifindex, u64 flags);
+
 	/* map_meta_equal must be implemented for maps that can be
 	 * used as an inner map.  It is a runtime check to ensure
 	 * an inner map can be inserted to an outer map.
@@ -1429,9 +1432,9 @@ struct btf *bpf_get_btf_vmlinux(void);
 /* Map specifics */
 struct xdp_buff;
 struct sk_buff;
+struct bpf_dtab_netdev;
+struct bpf_cpu_map_entry;
 
-struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key);
-struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key);
 void __dev_flush(void);
 int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
 		    struct net_device *dev_rx);
@@ -1441,7 +1444,6 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb,
 			     struct bpf_prog *xdp_prog);
 bool dev_map_can_have_prog(struct bpf_map *map);
 
-struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key);
 void __cpu_map_flush(void);
 int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu, struct xdp_buff *xdp,
 		    struct net_device *dev_rx);
@@ -1568,17 +1570,6 @@ static inline int bpf_obj_get_user(const char __user *pathname, int flags)
 	return -EOPNOTSUPP;
 }
 
-static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map,
-						       u32 key)
-{
-	return NULL;
-}
-
-static inline struct net_device *__dev_map_hash_lookup_elem(struct bpf_map *map,
-							     u32 key)
-{
-	return NULL;
-}
 static inline bool dev_map_can_have_prog(struct bpf_map *map)
 {
 	return false;
@@ -1590,6 +1581,7 @@ static inline void __dev_flush(void)
 
 struct xdp_buff;
 struct bpf_dtab_netdev;
+struct bpf_cpu_map_entry;
 
 static inline
 int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp,
@@ -1614,12 +1606,6 @@ static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst,
 	return 0;
 }
 
-static inline
-struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
-{
-	return NULL;
-}
-
 static inline void __cpu_map_flush(void)
 {
 }
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 3b00fc906ccd..008691fd3b58 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1472,4 +1472,31 @@ static inline bool bpf_sk_lookup_run_v6(struct net *net, int protocol,
 }
 #endif /* IS_ENABLED(CONFIG_IPV6) */
 
+static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags,
+						  void *lookup_elem(struct bpf_map *map, u32 key))
+{
+	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+
+	/* Lower bits of the flags are used as return code on lookup failure */
+	if (unlikely(flags > XDP_TX))
+		return XDP_ABORTED;
+
+	ri->tgt_value = lookup_elem(map, ifindex);
+	if (unlikely(!ri->tgt_value)) {
+		/* If the lookup fails we want to clear out the state in the
+		 * redirect_info struct completely, so that if an eBPF program
+		 * performs multiple lookups, the last one always takes
+		 * precedence.
+		 */
+		WRITE_ONCE(ri->map, NULL);
+		return flags;
+	}
+
+	ri->flags = flags;
+	ri->tgt_index = ifindex;
+	WRITE_ONCE(ri->map, map);
+
+	return XDP_REDIRECT;
+}
+
 #endif /* __LINUX_FILTER_H__ */
diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h
index cc17bc957548..9c0722c6d7ac 100644
--- a/include/net/xdp_sock.h
+++ b/include/net/xdp_sock.h
@@ -80,19 +80,6 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp);
 int __xsk_map_redirect(struct xdp_sock *xs, struct xdp_buff *xdp);
 void __xsk_map_flush(void);
 
-static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map,
-						     u32 key)
-{
-	struct xsk_map *m = container_of(map, struct xsk_map, map);
-	struct xdp_sock *xs;
-
-	if (key >= map->max_entries)
-		return NULL;
-
-	xs = READ_ONCE(m->xsk_map[key]);
-	return xs;
-}
-
 #else
 
 static inline int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
@@ -109,12 +96,6 @@ static inline void __xsk_map_flush(void)
 {
 }
 
-static inline struct xdp_sock *__xsk_map_lookup_elem(struct bpf_map *map,
-						     u32 key)
-{
-	return NULL;
-}
-
 #endif /* CONFIG_XDP_SOCKETS */
 
 #endif /* _LINUX_XDP_SOCK_H */
diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c
index 5d1469de6921..85a2d33fd46b 100644
--- a/kernel/bpf/cpumap.c
+++ b/kernel/bpf/cpumap.c
@@ -563,7 +563,7 @@ static void cpu_map_free(struct bpf_map *map)
 	kfree(cmap);
 }
 
-struct bpf_cpu_map_entry *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
+static void *__cpu_map_lookup_elem(struct bpf_map *map, u32 key)
 {
 	struct bpf_cpu_map *cmap = container_of(map, struct bpf_cpu_map, map);
 	struct bpf_cpu_map_entry *rcpu;
@@ -600,6 +600,11 @@ static int cpu_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 	return 0;
 }
 
+static int cpu_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags)
+{
+	return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem);
+}
+
 static int cpu_map_btf_id;
 const struct bpf_map_ops cpu_map_ops = {
 	.map_meta_equal		= bpf_map_meta_equal,
@@ -612,6 +617,7 @@ const struct bpf_map_ops cpu_map_ops = {
 	.map_check_btf		= map_check_no_btf,
 	.map_btf_name		= "bpf_cpu_map",
 	.map_btf_id		= &cpu_map_btf_id,
+	.xdp_redirect_map	= cpu_map_xdp_redirect_map,
 };
 
 static void bq_flush_to_queue(struct xdp_bulk_queue *bq)
diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 85d9d1b72a33..adf9a2517f80 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -258,7 +258,7 @@ static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key)
 	return 0;
 }
 
-struct bpf_dtab_netdev *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key)
+static void *__dev_map_hash_lookup_elem(struct bpf_map *map, u32 key)
 {
 	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
 	struct hlist_head *head = dev_map_index_hash(dtab, key);
@@ -392,7 +392,7 @@ void __dev_flush(void)
  * update happens in parallel here a dev_put wont happen until after reading the
  * ifindex.
  */
-struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key)
+static void *__dev_map_lookup_elem(struct bpf_map *map, u32 key)
 {
 	struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map);
 	struct bpf_dtab_netdev *obj;
@@ -735,6 +735,16 @@ static int dev_map_hash_update_elem(struct bpf_map *map, void *key, void *value,
 					 map, key, value, map_flags);
 }
 
+static int dev_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags)
+{
+	return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem);
+}
+
+static int dev_hash_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags)
+{
+	return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem);
+}
+
 static int dev_map_btf_id;
 const struct bpf_map_ops dev_map_ops = {
 	.map_meta_equal = bpf_map_meta_equal,
@@ -747,6 +757,7 @@ const struct bpf_map_ops dev_map_ops = {
 	.map_check_btf = map_check_no_btf,
 	.map_btf_name = "bpf_dtab",
 	.map_btf_id = &dev_map_btf_id,
+	.xdp_redirect_map = dev_map_xdp_redirect_map,
 };
 
 static int dev_map_hash_map_btf_id;
@@ -761,6 +772,7 @@ const struct bpf_map_ops dev_map_hash_ops = {
 	.map_check_btf = map_check_no_btf,
 	.map_btf_name = "bpf_dtab",
 	.map_btf_id = &dev_map_hash_map_btf_id,
+	.xdp_redirect_map = dev_hash_map_xdp_redirect_map,
 };
 
 static void dev_map_hash_remove_netdev(struct bpf_dtab *dtab,
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1dda9d81f12c..96705a49225e 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 	    func_id != BPF_FUNC_map_delete_elem &&
 	    func_id != BPF_FUNC_map_push_elem &&
 	    func_id != BPF_FUNC_map_pop_elem &&
-	    func_id != BPF_FUNC_map_peek_elem)
+	    func_id != BPF_FUNC_map_peek_elem &&
+	    func_id != BPF_FUNC_redirect_map)
 		return 0;
 
 	if (map == NULL) {
@@ -11762,7 +11763,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 		     insn->imm == BPF_FUNC_map_delete_elem ||
 		     insn->imm == BPF_FUNC_map_push_elem ||
 		     insn->imm == BPF_FUNC_map_pop_elem ||
-		     insn->imm == BPF_FUNC_map_peek_elem)) {
+		     insn->imm == BPF_FUNC_map_peek_elem ||
+		     insn->imm == BPF_FUNC_redirect_map)) {
 			aux = &env->insn_aux_data[i + delta];
 			if (bpf_map_ptr_poisoned(aux))
 				goto patch_call_imm;
@@ -11804,6 +11806,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 				     (int (*)(struct bpf_map *map, void *value))NULL));
 			BUILD_BUG_ON(!__same_type(ops->map_peek_elem,
 				     (int (*)(struct bpf_map *map, void *value))NULL));
+			BUILD_BUG_ON(!__same_type(ops->xdp_redirect_map,
+				     (int (*)(struct bpf_map *map, u32 ifindex, u64 flags))NULL));
 patch_map_ops_generic:
 			switch (insn->imm) {
 			case BPF_FUNC_map_lookup_elem:
@@ -11830,6 +11834,9 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 				insn->imm = BPF_CAST_CALL(ops->map_peek_elem) -
 					    __bpf_call_base;
 				continue;
+			case BPF_FUNC_redirect_map:
+				insn->imm = BPF_CAST_CALL(ops->xdp_redirect_map) - __bpf_call_base;
+				continue;
 			}
 
 			goto patch_call_imm;
diff --git a/net/core/filter.c b/net/core/filter.c
index adfdad234674..fdf7401f43fd 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3944,22 +3944,6 @@ void xdp_do_flush(void)
 }
 EXPORT_SYMBOL_GPL(xdp_do_flush);
 
-static inline void *__xdp_map_lookup_elem(struct bpf_map *map, u32 index)
-{
-	switch (map->map_type) {
-	case BPF_MAP_TYPE_DEVMAP:
-		return __dev_map_lookup_elem(map, index);
-	case BPF_MAP_TYPE_DEVMAP_HASH:
-		return __dev_map_hash_lookup_elem(map, index);
-	case BPF_MAP_TYPE_CPUMAP:
-		return __cpu_map_lookup_elem(map, index);
-	case BPF_MAP_TYPE_XSKMAP:
-		return __xsk_map_lookup_elem(map, index);
-	default:
-		return NULL;
-	}
-}
-
 void bpf_clear_redirect_map(struct bpf_map *map)
 {
 	struct bpf_redirect_info *ri;
@@ -4113,28 +4097,7 @@ static const struct bpf_func_proto bpf_xdp_redirect_proto = {
 BPF_CALL_3(bpf_xdp_redirect_map, struct bpf_map *, map, u32, ifindex,
 	   u64, flags)
 {
-	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
-
-	/* Lower bits of the flags are used as return code on lookup failure */
-	if (unlikely(flags > XDP_TX))
-		return XDP_ABORTED;
-
-	ri->tgt_value = __xdp_map_lookup_elem(map, ifindex);
-	if (unlikely(!ri->tgt_value)) {
-		/* If the lookup fails we want to clear out the state in the
-		 * redirect_info struct completely, so that if an eBPF program
-		 * performs multiple lookups, the last one always takes
-		 * precedence.
-		 */
-		WRITE_ONCE(ri->map, NULL);
-		return flags;
-	}
-
-	ri->flags = flags;
-	ri->tgt_index = ifindex;
-	WRITE_ONCE(ri->map, map);
-
-	return XDP_REDIRECT;
+	return map->ops->xdp_redirect_map(map, ifindex, flags);
 }
 
 static const struct bpf_func_proto bpf_xdp_redirect_map_proto = {
diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c
index 113fd9017203..92f4023d3ae2 100644
--- a/net/xdp/xskmap.c
+++ b/net/xdp/xskmap.c
@@ -125,6 +125,18 @@ static int xsk_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
 	return insn - insn_buf;
 }
 
+static void *__xsk_map_lookup_elem(struct bpf_map *map, u32 key)
+{
+	struct xsk_map *m = container_of(map, struct xsk_map, map);
+	struct xdp_sock *xs;
+
+	if (key >= map->max_entries)
+		return NULL;
+
+	xs = READ_ONCE(m->xsk_map[key]);
+	return xs;
+}
+
 static void *xsk_map_lookup_elem(struct bpf_map *map, void *key)
 {
 	WARN_ON_ONCE(!rcu_read_lock_held());
@@ -215,6 +227,11 @@ static int xsk_map_delete_elem(struct bpf_map *map, void *key)
 	return 0;
 }
 
+static int xsk_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags)
+{
+	return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem);
+}
+
 void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs,
 			     struct xdp_sock **map_entry)
 {
@@ -247,4 +264,5 @@ const struct bpf_map_ops xsk_map_ops = {
 	.map_check_btf = map_check_no_btf,
 	.map_btf_name = "xsk_map",
 	.map_btf_id = &xsk_map_btf_id,
+	.xdp_redirect_map = xsk_map_xdp_redirect_map,
 };
--
2.27.0
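The shape of the shared helper in this patch (one always-inlined function that takes the map-specific lookup as a parameter, wrapped by a small per-map function) can be tried out in userspace. The following is a minimal sketch under stated assumptions, not the kernel implementation: all `toy_*` and `TOY_*` names are hypothetical, the constants only mirror the XDP action values, and `__attribute__((always_inline))` is a GCC/Clang extension standing in for the kernel's `__always_inline`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants mirroring the XDP action values. */
#define TOY_ABORTED  0
#define TOY_TX       3
#define TOY_REDIRECT 4

struct toy_map {
	unsigned int max_entries;
	int *slots[8];
};

/* Simplified stand-in for the per-CPU redirect state. */
struct toy_redirect_info {
	void *tgt_value;
	unsigned int tgt_index;
	struct toy_map *map;
};

static struct toy_redirect_info toy_ri;

/* Shared logic; the lookup callback is known at each call site, so the
 * compiler can inline it into every per-map wrapper.
 */
static inline __attribute__((always_inline)) int
toy_redirect_map(struct toy_map *map, unsigned int index, unsigned long long flags,
		 void *lookup(struct toy_map *map, unsigned int key))
{
	/* Low flag values double as the return code on lookup failure. */
	if (flags > TOY_TX)
		return TOY_ABORTED;

	toy_ri.tgt_value = lookup(map, index);
	if (!toy_ri.tgt_value) {
		/* Clear state so that the last lookup always takes precedence. */
		toy_ri.map = NULL;
		return (int)flags;
	}

	toy_ri.tgt_index = index;
	toy_ri.map = map;
	return TOY_REDIRECT;
}

/* Map-specific lookup, local to the "map implementation". */
static void *toy_array_lookup(struct toy_map *map, unsigned int key)
{
	if (key >= map->max_entries)
		return NULL;
	return map->slots[key];
}

/* Per-map wrapper, the function an ops table would point at. */
static int toy_array_redirect(struct toy_map *map, unsigned int index,
			      unsigned long long flags)
{
	return toy_redirect_map(map, index, flags, toy_array_lookup);
}

/* Exercise the hit, miss, and bad-flags paths. */
static void toy_redirect_selfcheck(void)
{
	int target = 7;
	struct toy_map m = { .max_entries = 8 };

	m.slots[2] = &target;
	assert(toy_array_redirect(&m, 2, 0) == TOY_REDIRECT);
	assert(toy_ri.map == &m && toy_ri.tgt_index == 2);
	assert(toy_array_redirect(&m, 3, 2) == 2);
	assert(toy_ri.map == NULL);
	assert(toy_array_redirect(&m, 2, 99) == TOY_ABORTED);
}
```

Passing the lookup as a function parameter rather than as a macro argument keeps type checking intact while still letting the compiler specialize each wrapper, which appears to be the point of the "get rid of the macro and use __always_inline" item in the changelog.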
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Toke Høiland-Jørgensen @ 2021-02-26 11:37 UTC
To: Björn Töpel, ast, daniel, netdev, bpf
Cc: Björn Töpel, maciej.fijalkowski, hawk, magnus.karlsson,
    john.fastabend, kuba, davem

Björn Töpel <bjorn.topel@gmail.com> writes:

> From: Björn Töpel <bjorn.topel@intel.com>
>
> Currently the bpf_redirect_map() implementation dispatches to the
> correct map-lookup function via a switch-statement. To avoid the
> dispatching, this change adds bpf_redirect_map() as a map
> operation. Each map provides its bpf_redirect_map() version, and the
> correct function is automatically selected by the BPF verifier.
>
> A nice side-effect of the code movement is that the map lookup
> functions are now local to the map implementation files, which removes
> one additional function call.
>
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

Nice! I agree that this is a much nicer approach! :)

(That last paragraph above is why I asked if you updated the performance
numbers in the cover letter; removing an additional function call should
affect those, right?)

Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Björn Töpel @ 2021-02-26 11:40 UTC
To: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf
Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On 2021-02-26 12:37, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@gmail.com> writes:
>
>> From: Björn Töpel <bjorn.topel@intel.com>
>>
>> Currently the bpf_redirect_map() implementation dispatches to the
>> correct map-lookup function via a switch-statement. To avoid the
>> dispatching, this change adds bpf_redirect_map() as a map
>> operation. Each map provides its bpf_redirect_map() version, and the
>> correct function is automatically selected by the BPF verifier.
>>
>> A nice side-effect of the code movement is that the map lookup
>> functions are now local to the map implementation files, which removes
>> one additional function call.
>>
>> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>
> Nice! I agree that this is a much nicer approach! :)
>
> (That last paragraph above is why I asked if you updated the performance
> numbers in the cover letter; removing an additional function call should
> affect those, right?)
>

Yeah, it should. Let me spend some more time benchmarking on the DEVMAP
scenario.

@Jesper Do you have a CPUMAP benchmark that you can point me to? I just
did functional testing for CPUMAP.

> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com>
>

Thank you!
Björn
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Björn Töpel @ 2021-02-26 12:04 UTC
To: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf
Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On 2021-02-26 12:40, Björn Töpel wrote:
> On 2021-02-26 12:37, Toke Høiland-Jørgensen wrote:

[...]

>>
>> (That last paragraph above is why I asked if you updated the performance
>> numbers in the cover letter; removing an additional function call should
>> affect those, right?)
>>
>
> Yeah, it should. Let me spend some more time benchmarking on the DEVMAP
> scenario.
>

I did a re-measure using samples/xdp_redirect_map.

The setup is 64B packets blasted to an i40e. As a baseline,

  # xdp_rxq_info --dev ens801f1 --action XDP_DROP

gives 24.8 Mpps.


Now, xdp_redirect_map. Same NIC, two ports, receive from port A,
redirect to port B:

baseline:    14.3 Mpps
this series: 15.4 Mpps

which is almost 8%!


Björn
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Toke Høiland-Jørgensen @ 2021-02-26 12:26 UTC
To: Björn Töpel, Björn Töpel, ast, daniel, netdev, bpf
Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-02-26 12:40, Björn Töpel wrote:
>> On 2021-02-26 12:37, Toke Høiland-Jørgensen wrote:
>
> [...]
>
>>>
>>> (That last paragraph above is why I asked if you updated the performance
>>> numbers in the cover letter; removing an additional function call should
>>> affect those, right?)
>>>
>>
>> Yeah, it should. Let me spend some more time benchmarking on the DEVMAP
>> scenario.
>>
>
> I did a re-measure using samples/xdp_redirect_map.
>
> The setup is 64B packets blasted to an i40e. As a baseline,
>
>   # xdp_rxq_info --dev ens801f1 --action XDP_DROP
>
> gives 24.8 Mpps.
>
>
> Now, xdp_redirect_map. Same NIC, two ports, receive from port A,
> redirect to port B:
>
> baseline:    14.3 Mpps
> this series: 15.4 Mpps
>
> which is almost 8%!

Or 5 ns difference:

10**9/(14.3*10**6) - 10**9/(15.4*10**6)
4.995004995005004

Nice :)

-Toke
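The arithmetic in this exchange can be checked with a few lines of C. This is a minimal sketch: the function names are made up for illustration, and the only inputs are the two rates quoted in the thread (14.3 and 15.4 Mpps).

```c
#include <assert.h>

/* Time spent per packet, in nanoseconds, for a rate given in Mpps. */
static double ns_per_packet(double mpps)
{
	return 1e9 / (mpps * 1e6);
}

/* Relative throughput gain in percent. */
static double gain_percent(double base_mpps, double new_mpps)
{
	return (new_mpps - base_mpps) / base_mpps * 100.0;
}

/* Check the figures from the thread: roughly 5 ns, and a bit under 8%. */
static void check_thread_numbers(void)
{
	double saved_ns = ns_per_packet(14.3) - ns_per_packet(15.4);
	double gain = gain_percent(14.3, 15.4);

	assert(saved_ns > 4.99 && saved_ns < 5.00);
	assert(gain > 7.6 && gain < 7.8);
}
```

Expressing the gain as nanoseconds saved per packet makes it independent of the baseline rate, which is presumably why the per-packet figure is quoted alongside the percentage.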
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Jesper Dangaard Brouer @ 2021-02-26 14:28 UTC
To: Toke Høiland-Jørgensen
Cc: Björn Töpel, Björn Töpel, ast, daniel, netdev, bpf,
    maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On Fri, 26 Feb 2021 13:26:22 +0100
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> Björn Töpel <bjorn.topel@intel.com> writes:
>
> > On 2021-02-26 12:40, Björn Töpel wrote:
> >> On 2021-02-26 12:37, Toke Høiland-Jørgensen wrote:
> >
> > [...]
> >
> >>>
> >>> (That last paragraph above is why I asked if you updated the performance
> >>> numbers in the cover letter; removing an additional function call should
> >>> affect those, right?)
> >>>
> >>
> >> Yeah, it should. Let me spend some more time benchmarking on the DEVMAP
> >> scenario.
> >>
> >
> > I did a re-measure using samples/xdp_redirect_map.
> >
> > The setup is 64B packets blasted to an i40e. As a baseline,
> >
> >   # xdp_rxq_info --dev ens801f1 --action XDP_DROP
> >
> > gives 24.8 Mpps.
> >
> >
> > Now, xdp_redirect_map. Same NIC, two ports, receive from port A,
> > redirect to port B:
> >
> > baseline:    14.3 Mpps
> > this series: 15.4 Mpps
> >
> > which is almost 8%!
>
> Or 5 ns difference:
>
> 10**9/(14.3*10**6) - 10**9/(15.4*10**6)
> 4.995004995005004
>
> Nice :)

Yes, this is a very significant improvement at this zoom-in
benchmarking level :-)

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Jesper Dangaard Brouer @ 2021-02-26 14:27 UTC
To: Björn Töpel
Cc: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf,
    maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On Fri, 26 Feb 2021 12:40:33 +0100
Björn Töpel <bjorn.topel@intel.com> wrote:

> @Jesper Do you have a CPUMAP benchmark that you can point me to? I just
> did functional testing for CPUMAP

I usually just use the xdp_redirect_cpu samples/bpf program.

Your optimization will help the RX enqueue side, but the bottleneck for
CPUMAP is the remote CPU dequeue. You should still be able to see the
RX-side performance improve, and that should be enough (even though
packets are dropped before reaching the remote CPU). I'm not going to
ask you to test scaling out to more CPUs.

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation
From: Jesper Dangaard Brouer @ 2021-02-26 14:29 UTC
To: Toke Høiland-Jørgensen
Cc: brouer, Björn Töpel, ast, daniel, netdev, bpf, Björn Töpel,
    maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On Fri, 26 Feb 2021 12:37:40 +0100
Toke Høiland-Jørgensen <toke@redhat.com> wrote:

> Björn Töpel <bjorn.topel@gmail.com> writes:
>
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > Currently the bpf_redirect_map() implementation dispatches to the
> > correct map-lookup function via a switch-statement. To avoid the
> > dispatching, this change adds bpf_redirect_map() as a map
> > operation. Each map provides its bpf_redirect_map() version, and the
> > correct function is automatically selected by the BPF verifier.
> >
> > A nice side-effect of the code movement is that the map lookup
> > functions are now local to the map implementation files, which removes
> > one additional function call.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>
> Nice! I agree that this is a much nicer approach! :)

I agree :-)

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>

--
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation 2021-02-26 11:23 ` [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation Björn Töpel 2021-02-26 11:37 ` Toke Høiland-Jørgensen @ 2021-02-26 15:23 ` kernel test robot 2021-02-26 21:48 ` Daniel Borkmann 2 siblings, 0 replies; 17+ messages in thread From: kernel test robot @ 2021-02-26 15:23 UTC (permalink / raw) To: Björn Töpel, ast, daniel, netdev, bpf Cc: kbuild-all, Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend [-- Attachment #1: Type: text/plain, Size: 6978 bytes --] Hi "Björn, I love your patch! Perhaps something to improve: [auto build test WARNING on 9c8f21e6f8856a96634e542a58ef3abf27486801] url: https://github.com/0day-ci/linux/commits/Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210226-192840 base: 9c8f21e6f8856a96634e542a58ef3abf27486801 config: mips-randconfig-r026-20210226 (attached as .config) compiler: mips64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/0day-ci/linux/commit/1f7606274f17503baf1c0908dad3462981840749 git remote add linux-review https://github.com/0day-ci/linux git fetch --no-tags linux-review Bj-rn-T-pel/Optimize-bpf_redirect_map-xdp_do_redirect/20210226-192840 git checkout 1f7606274f17503baf1c0908dad3462981840749 # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=mips If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All warnings (new ones prefixed by >>): In file included from include/linux/bpf_verifier.h:9, from kernel/bpf/verifier.c:12: kernel/bpf/verifier.c: In function 'jit_subprogs': include/linux/filter.h:363:4: warning: cast between incompatible function types from 'unsigned int (*)(const void *, const struct bpf_insn 
*)' to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11421:16: note: in expansion of macro 'BPF_CAST_CALL' 11421 | insn->imm = BPF_CAST_CALL(func[subprog]->bpf_func) - | ^~~~~~~~~~~~~ kernel/bpf/verifier.c: In function 'fixup_bpf_calls': include/linux/filter.h:363:4: warning: cast between incompatible function types from 'void * (* const)(struct bpf_map *, void *)' to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11814:17: note: in expansion of macro 'BPF_CAST_CALL' 11814 | insn->imm = BPF_CAST_CALL(ops->map_lookup_elem) - | ^~~~~~~~~~~~~ include/linux/filter.h:363:4: warning: cast between incompatible function types from 'int (* const)(struct bpf_map *, void *, void *, u64)' {aka 'int (* const)(struct bpf_map *, void *, void *, long long unsigned int)'} to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11818:17: note: in expansion of macro 'BPF_CAST_CALL' 11818 | insn->imm = BPF_CAST_CALL(ops->map_update_elem) - | ^~~~~~~~~~~~~ include/linux/filter.h:363:4: warning: cast between incompatible function types from 'int (* const)(struct bpf_map *, void *)' to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} 
[-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11822:17: note: in expansion of macro 'BPF_CAST_CALL' 11822 | insn->imm = BPF_CAST_CALL(ops->map_delete_elem) - | ^~~~~~~~~~~~~ include/linux/filter.h:363:4: warning: cast between incompatible function types from 'int (* const)(struct bpf_map *, void *, u64)' {aka 'int (* const)(struct bpf_map *, void *, long long unsigned int)'} to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11826:17: note: in expansion of macro 'BPF_CAST_CALL' 11826 | insn->imm = BPF_CAST_CALL(ops->map_push_elem) - | ^~~~~~~~~~~~~ include/linux/filter.h:363:4: warning: cast between incompatible function types from 'int (* const)(struct bpf_map *, void *)' to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11830:17: note: in expansion of macro 'BPF_CAST_CALL' 11830 | insn->imm = BPF_CAST_CALL(ops->map_pop_elem) - | ^~~~~~~~~~~~~ include/linux/filter.h:363:4: warning: cast between incompatible function types from 'int (* const)(struct bpf_map *, void *)' to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type] 363 | ((u64 (*)(u64, u64, u64, u64, u64))(x)) | ^ kernel/bpf/verifier.c:11834:17: note: in expansion of macro 'BPF_CAST_CALL' 11834 | insn->imm = BPF_CAST_CALL(ops->map_peek_elem) - | ^~~~~~~~~~~~~ >> include/linux/filter.h:363:4: warning: cast between incompatible function 
types from 'int (* const)(struct bpf_map *, u32, u64)' {aka 'int (* const)(struct bpf_map *, unsigned int, long long unsigned int)'} to 'u64 (*)(u64, u64, u64, u64, u64)' {aka 'long long unsigned int (*)(long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int)'} [-Wcast-function-type]
     363 |  ((u64 (*)(u64, u64, u64, u64, u64))(x))
         |   ^
   kernel/bpf/verifier.c:11838:17: note: in expansion of macro 'BPF_CAST_CALL'
   11838 |     insn->imm = BPF_CAST_CALL(ops->xdp_redirect_map) - __bpf_call_base;
         |                 ^~~~~~~~~~~~~

vim +363 include/linux/filter.h

f8f6d679aaa78b Daniel Borkmann 2014-05-29  361
09772d92cd5ad9 Daniel Borkmann 2018-06-02  362  #define BPF_CAST_CALL(x) \
09772d92cd5ad9 Daniel Borkmann 2018-06-02 @363  ((u64 (*)(u64, u64, u64, u64, u64))(x))
09772d92cd5ad9 Daniel Borkmann 2018-06-02  364

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 24530 bytes --]

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation 2021-02-26 11:23 ` [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation Björn Töpel 2021-02-26 11:37 ` Toke Høiland-Jørgensen 2021-02-26 15:23 ` kernel test robot @ 2021-02-26 21:48 ` Daniel Borkmann 2021-02-27 9:04 ` Björn Töpel 2 siblings, 1 reply; 17+ messages in thread From: Daniel Borkmann @ 2021-02-26 21:48 UTC (permalink / raw) To: Björn Töpel, ast, netdev, bpf Cc: Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem On 2/26/21 12:23 PM, Björn Töpel wrote: > From: Björn Töpel <bjorn.topel@intel.com> > > Currently the bpf_redirect_map() implementation dispatches to the > correct map-lookup function via a switch-statement. To avoid the > dispatching, this change adds bpf_redirect_map() as a map > operation. Each map provides its bpf_redirect_map() version, and > correct function is automatically selected by the BPF verifier. > > A nice side-effect of the code movement is that the map lookup > functions are now local to the map implementation files, which removes > one additional function call. 
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com> > --- > include/linux/bpf.h | 26 ++++++-------------------- > include/linux/filter.h | 27 +++++++++++++++++++++++++++ > include/net/xdp_sock.h | 19 ------------------- > kernel/bpf/cpumap.c | 8 +++++++- > kernel/bpf/devmap.c | 16 ++++++++++++++-- > kernel/bpf/verifier.c | 11 +++++++++-- > net/core/filter.c | 39 +-------------------------------------- > net/xdp/xskmap.c | 18 ++++++++++++++++++ > 8 files changed, 82 insertions(+), 82 deletions(-) > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h > index cccaef1088ea..a44ba904ca37 100644 > --- a/include/linux/bpf.h > +++ b/include/linux/bpf.h > @@ -117,6 +117,9 @@ struct bpf_map_ops { > void *owner, u32 size); > struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void *owner); > > + /* XDP helpers.*/ > + int (*xdp_redirect_map)(struct bpf_map *map, u32 ifindex, u64 flags); > + > /* map_meta_equal must be implemented for maps that can be > * used as an inner map. It is a runtime check to ensure > * an inner map can be inserted to an outer map. [...] 
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c > index 1dda9d81f12c..96705a49225e 100644 > --- a/kernel/bpf/verifier.c > +++ b/kernel/bpf/verifier.c > @@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, > func_id != BPF_FUNC_map_delete_elem && > func_id != BPF_FUNC_map_push_elem && > func_id != BPF_FUNC_map_pop_elem && > - func_id != BPF_FUNC_map_peek_elem) > + func_id != BPF_FUNC_map_peek_elem && > + func_id != BPF_FUNC_redirect_map) > return 0; > > if (map == NULL) { > @@ -11762,7 +11763,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) > insn->imm == BPF_FUNC_map_delete_elem || > insn->imm == BPF_FUNC_map_push_elem || > insn->imm == BPF_FUNC_map_pop_elem || > - insn->imm == BPF_FUNC_map_peek_elem)) { > + insn->imm == BPF_FUNC_map_peek_elem || > + insn->imm == BPF_FUNC_redirect_map)) { > aux = &env->insn_aux_data[i + delta]; > if (bpf_map_ptr_poisoned(aux)) > goto patch_call_imm; > @@ -11804,6 +11806,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) > (int (*)(struct bpf_map *map, void *value))NULL)); > BUILD_BUG_ON(!__same_type(ops->map_peek_elem, > (int (*)(struct bpf_map *map, void *value))NULL)); > + BUILD_BUG_ON(!__same_type(ops->xdp_redirect_map, > + (int (*)(struct bpf_map *map, u32 ifindex, u64 flags))NULL)); > patch_map_ops_generic: > switch (insn->imm) { > case BPF_FUNC_map_lookup_elem: > @@ -11830,6 +11834,9 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) > insn->imm = BPF_CAST_CALL(ops->map_peek_elem) - > __bpf_call_base; > continue; > + case BPF_FUNC_redirect_map: > + insn->imm = BPF_CAST_CALL(ops->xdp_redirect_map) - __bpf_call_base; Small nit: I would name the generic callback ops->map_redirect so that this is in line with the general naming convention for the map ops. Otherwise this looks much better, thx! > + continue; > } > > goto patch_call_imm; ^ permalink raw reply [flat|nested] 17+ messages in thread
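For readers following along, the idea in the commit message above (a per-map callback instead of a switch on the map type) can be sketched in userspace C. This is only an illustration under stated assumptions: every toy_* name is hypothetical, the lookups are placeholders, and the return values merely mimic XDP action codes. In the kernel the verifier additionally patches the call site to the map's own function, which this sketch does not model.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-ins for kernel types; every name here is hypothetical. */
enum toy_map_type { TOY_DEVMAP, TOY_CPUMAP };
enum { TOY_ABORTED = 0, TOY_TX = 3, TOY_REDIRECT = 4 };

struct toy_map;

struct toy_map_ops {
	/* Same shape as the callback the patch adds to bpf_map_ops. */
	int (*redirect)(struct toy_map *map, uint32_t key, uint64_t flags);
};

struct toy_map {
	enum toy_map_type type;
	const struct toy_map_ops *ops;
	uint32_t max_entries;
};

/* Placeholder lookups; real maps each have their own lookup logic. */
static int toy_dev_lookup(struct toy_map *map, uint32_t key)
{
	return key < map->max_entries;
}

static int toy_cpu_lookup(struct toy_map *map, uint32_t key)
{
	return key < map->max_entries;
}

/* Shared body. The lookup is a parameter so that, once this is inlined
 * into each wrapper below, the compiler can call the lookup directly.
 */
static inline int toy_redirect_common(struct toy_map *map, uint32_t key,
				      uint64_t flags,
				      int (*lookup)(struct toy_map *map, uint32_t key))
{
	if (flags > TOY_TX)
		return TOY_ABORTED;
	if (!lookup(map, key))
		return (int)flags;	/* fallback action chosen by the caller */
	return TOY_REDIRECT;
}

/* "After": one redirect implementation per map type, reached via ops. */
static int toy_dev_redirect(struct toy_map *map, uint32_t key, uint64_t flags)
{
	return toy_redirect_common(map, key, flags, toy_dev_lookup);
}

static int toy_cpu_redirect(struct toy_map *map, uint32_t key, uint64_t flags)
{
	return toy_redirect_common(map, key, flags, toy_cpu_lookup);
}

static const struct toy_map_ops toy_dev_ops = { .redirect = toy_dev_redirect };
static const struct toy_map_ops toy_cpu_ops = { .redirect = toy_cpu_redirect };

/* "Before": a single helper that switches on the map type at run time. */
static int toy_redirect_switch(struct toy_map *map, uint32_t key, uint64_t flags)
{
	switch (map->type) {
	case TOY_DEVMAP:
		return toy_redirect_common(map, key, flags, toy_dev_lookup);
	case TOY_CPUMAP:
		return toy_redirect_common(map, key, flags, toy_cpu_lookup);
	default:
		return TOY_ABORTED;
	}
}
```

Both paths return the same action for the same input; the difference is only where the map-type decision is made. A caller would use map->ops->redirect(map, key, flags) in place of toy_redirect_switch(map, key, flags).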
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation 2021-02-26 21:48 ` Daniel Borkmann @ 2021-02-27 9:04 ` Björn Töpel 2021-02-27 10:22 ` Daniel Borkmann 0 siblings, 1 reply; 17+ messages in thread From: Björn Töpel @ 2021-02-27 9:04 UTC (permalink / raw) To: Daniel Borkmann Cc: maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem, Björn Töpel, bpf, netdev, ast On 2021-02-26 22:48, Daniel Borkmann wrote: > On 2/26/21 12:23 PM, Björn Töpel wrote: >> From: Björn Töpel <bjorn.topel@intel.com> >> >> Currently the bpf_redirect_map() implementation dispatches to the >> correct map-lookup function via a switch-statement. To avoid the >> dispatching, this change adds bpf_redirect_map() as a map >> operation. Each map provides its bpf_redirect_map() version, and >> correct function is automatically selected by the BPF verifier. >> >> A nice side-effect of the code movement is that the map lookup >> functions are now local to the map implementation files, which removes >> one additional function call. 
>> >> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> >> --- >> include/linux/bpf.h | 26 ++++++-------------------- >> include/linux/filter.h | 27 +++++++++++++++++++++++++++ >> include/net/xdp_sock.h | 19 ------------------- >> kernel/bpf/cpumap.c | 8 +++++++- >> kernel/bpf/devmap.c | 16 ++++++++++++++-- >> kernel/bpf/verifier.c | 11 +++++++++-- >> net/core/filter.c | 39 +-------------------------------------- >> net/xdp/xskmap.c | 18 ++++++++++++++++++ >> 8 files changed, 82 insertions(+), 82 deletions(-) >> >> diff --git a/include/linux/bpf.h b/include/linux/bpf.h >> index cccaef1088ea..a44ba904ca37 100644 >> --- a/include/linux/bpf.h >> +++ b/include/linux/bpf.h >> @@ -117,6 +117,9 @@ struct bpf_map_ops { >> void *owner, u32 size); >> struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void >> *owner); >> + /* XDP helpers.*/ >> + int (*xdp_redirect_map)(struct bpf_map *map, u32 ifindex, u64 >> flags); >> + >> /* map_meta_equal must be implemented for maps that can be >> * used as an inner map. It is a runtime check to ensure >> * an inner map can be inserted to an outer map. > [...] 
>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >> index 1dda9d81f12c..96705a49225e 100644 >> --- a/kernel/bpf/verifier.c >> +++ b/kernel/bpf/verifier.c >> @@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, >> struct bpf_call_arg_meta *meta, >> func_id != BPF_FUNC_map_delete_elem && >> func_id != BPF_FUNC_map_push_elem && >> func_id != BPF_FUNC_map_pop_elem && >> - func_id != BPF_FUNC_map_peek_elem) >> + func_id != BPF_FUNC_map_peek_elem && >> + func_id != BPF_FUNC_redirect_map) >> return 0; >> if (map == NULL) { >> @@ -11762,7 +11763,8 @@ static int fixup_bpf_calls(struct >> bpf_verifier_env *env) >> insn->imm == BPF_FUNC_map_delete_elem || >> insn->imm == BPF_FUNC_map_push_elem || >> insn->imm == BPF_FUNC_map_pop_elem || >> - insn->imm == BPF_FUNC_map_peek_elem)) { >> + insn->imm == BPF_FUNC_map_peek_elem || >> + insn->imm == BPF_FUNC_redirect_map)) { >> aux = &env->insn_aux_data[i + delta]; >> if (bpf_map_ptr_poisoned(aux)) >> goto patch_call_imm; >> @@ -11804,6 +11806,8 @@ static int fixup_bpf_calls(struct >> bpf_verifier_env *env) >> (int (*)(struct bpf_map *map, void *value))NULL)); >> BUILD_BUG_ON(!__same_type(ops->map_peek_elem, >> (int (*)(struct bpf_map *map, void *value))NULL)); >> + BUILD_BUG_ON(!__same_type(ops->xdp_redirect_map, >> + (int (*)(struct bpf_map *map, u32 ifindex, u64 >> flags))NULL)); >> patch_map_ops_generic: >> switch (insn->imm) { >> case BPF_FUNC_map_lookup_elem: >> @@ -11830,6 +11834,9 @@ static int fixup_bpf_calls(struct >> bpf_verifier_env *env) >> insn->imm = BPF_CAST_CALL(ops->map_peek_elem) - >> __bpf_call_base; >> continue; >> + case BPF_FUNC_redirect_map: >> + insn->imm = BPF_CAST_CALL(ops->xdp_redirect_map) - >> __bpf_call_base; > > Small nit: I would name the generic callback ops->map_redirect so that > this is in line with > the general naming convention for the map ops. Otherwise this looks much > better, thx! > I'll respin! Thanks for the input! 
I'll ignore the BPF_CAST_CALL W=1 warnings ([-Wcast-function-type]), or
do you have any thoughts on that? I don't think it's a good idea to
silence that warning for the whole verifier.c.


Björn

>> +                continue;
>>               }
>>               goto patch_call_imm;

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation 2021-02-27 9:04 ` Björn Töpel @ 2021-02-27 10:22 ` Daniel Borkmann 0 siblings, 0 replies; 17+ messages in thread From: Daniel Borkmann @ 2021-02-27 10:22 UTC (permalink / raw) To: Björn Töpel Cc: maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem, Björn Töpel, bpf, netdev, ast On 2/27/21 10:04 AM, Björn Töpel wrote: > On 2021-02-26 22:48, Daniel Borkmann wrote: >> On 2/26/21 12:23 PM, Björn Töpel wrote: >>> From: Björn Töpel <bjorn.topel@intel.com> >>> >>> Currently the bpf_redirect_map() implementation dispatches to the >>> correct map-lookup function via a switch-statement. To avoid the >>> dispatching, this change adds bpf_redirect_map() as a map >>> operation. Each map provides its bpf_redirect_map() version, and >>> correct function is automatically selected by the BPF verifier. >>> >>> A nice side-effect of the code movement is that the map lookup >>> functions are now local to the map implementation files, which removes >>> one additional function call. 
>>> >>> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> >>> --- >>> include/linux/bpf.h | 26 ++++++-------------------- >>> include/linux/filter.h | 27 +++++++++++++++++++++++++++ >>> include/net/xdp_sock.h | 19 ------------------- >>> kernel/bpf/cpumap.c | 8 +++++++- >>> kernel/bpf/devmap.c | 16 ++++++++++++++-- >>> kernel/bpf/verifier.c | 11 +++++++++-- >>> net/core/filter.c | 39 +-------------------------------------- >>> net/xdp/xskmap.c | 18 ++++++++++++++++++ >>> 8 files changed, 82 insertions(+), 82 deletions(-) >>> >>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h >>> index cccaef1088ea..a44ba904ca37 100644 >>> --- a/include/linux/bpf.h >>> +++ b/include/linux/bpf.h >>> @@ -117,6 +117,9 @@ struct bpf_map_ops { >>> void *owner, u32 size); >>> struct bpf_local_storage __rcu ** (*map_owner_storage_ptr)(void *owner); >>> + /* XDP helpers.*/ >>> + int (*xdp_redirect_map)(struct bpf_map *map, u32 ifindex, u64 flags); >>> + >>> /* map_meta_equal must be implemented for maps that can be >>> * used as an inner map. It is a runtime check to ensure >>> * an inner map can be inserted to an outer map. >> [...] 
>>> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c >>> index 1dda9d81f12c..96705a49225e 100644 >>> --- a/kernel/bpf/verifier.c >>> +++ b/kernel/bpf/verifier.c >>> @@ -5409,7 +5409,8 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta, >>> func_id != BPF_FUNC_map_delete_elem && >>> func_id != BPF_FUNC_map_push_elem && >>> func_id != BPF_FUNC_map_pop_elem && >>> - func_id != BPF_FUNC_map_peek_elem) >>> + func_id != BPF_FUNC_map_peek_elem && >>> + func_id != BPF_FUNC_redirect_map) >>> return 0; >>> if (map == NULL) { >>> @@ -11762,7 +11763,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) >>> insn->imm == BPF_FUNC_map_delete_elem || >>> insn->imm == BPF_FUNC_map_push_elem || >>> insn->imm == BPF_FUNC_map_pop_elem || >>> - insn->imm == BPF_FUNC_map_peek_elem)) { >>> + insn->imm == BPF_FUNC_map_peek_elem || >>> + insn->imm == BPF_FUNC_redirect_map)) { >>> aux = &env->insn_aux_data[i + delta]; >>> if (bpf_map_ptr_poisoned(aux)) >>> goto patch_call_imm; >>> @@ -11804,6 +11806,8 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) >>> (int (*)(struct bpf_map *map, void *value))NULL)); >>> BUILD_BUG_ON(!__same_type(ops->map_peek_elem, >>> (int (*)(struct bpf_map *map, void *value))NULL)); >>> + BUILD_BUG_ON(!__same_type(ops->xdp_redirect_map, >>> + (int (*)(struct bpf_map *map, u32 ifindex, u64 flags))NULL)); >>> patch_map_ops_generic: >>> switch (insn->imm) { >>> case BPF_FUNC_map_lookup_elem: >>> @@ -11830,6 +11834,9 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) >>> insn->imm = BPF_CAST_CALL(ops->map_peek_elem) - >>> __bpf_call_base; >>> continue; >>> + case BPF_FUNC_redirect_map: >>> + insn->imm = BPF_CAST_CALL(ops->xdp_redirect_map) - __bpf_call_base; >> >> Small nit: I would name the generic callback ops->map_redirect so that this is in line with >> the general naming convention for the map ops. Otherwise this looks much better, thx! >> > > I'll respin! Thanks for the input! 
>
> I'll ignore the BPF_CAST_CALL W=1 warnings ([-Wcast-function-type]), or
> do you have any thoughts on that? I don't think it's a good idea to
> silence that warning for the whole verifier.c

Makes sense, yes, given that these warnings are nothing new and the
existing ones are not critical either.

Thanks,
Daniel

^ permalink raw reply [flat|nested] 17+ messages in thread
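For context on the warning discussed above, here is a small userspace sketch of the kind of cast that triggers [-Wcast-function-type]. It is an illustration only: the toy_* names are hypothetical, TOY_CAST_CALL merely copies the shape of BPF_CAST_CALL from the robot report, and the sketch casts back to the real type before calling, which is not what the kernel does with the patched instruction.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;
typedef uint32_t u32;

/* Hypothetical stand-in for struct bpf_map. */
struct toy_map {
	u32 max_entries;
};

/* Generic five-argument shape, as in the BPF_CAST_CALL target type. */
typedef u64 (*toy_generic_fn)(u64, u64, u64, u64, u64);

/* Shape of the redirect callback from the patch. */
typedef int (*toy_redirect_fn)(struct toy_map *map, u32 ifindex, u64 flags);

/* Same form as the macro quoted in the robot report. */
#define TOY_CAST_CALL(x) ((toy_generic_fn)(x))

static int toy_redirect(struct toy_map *map, u32 ifindex, u64 flags)
{
	return ifindex < map->max_entries ? 4 : (int)flags;
}

static int toy_call_via_generic(struct toy_map *map, u32 ifindex, u64 flags)
{
	/* GCC reports -Wcast-function-type here when -Wextra is enabled:
	 * the source and destination function types are incompatible.
	 */
	toy_generic_fn generic = TOY_CAST_CALL(toy_redirect);

	/* Converting back to the original type before the call keeps this
	 * sketch well defined in ISO C.
	 */
	toy_redirect_fn typed = (toy_redirect_fn)generic;

	return typed(map, ifindex, flags);
}
```

Building this with gcc -Wextra should show the same class of warning as in the report. The GCC manual notes that void (*)(void) is treated as matching every function type, so an intermediate cast through it suppresses the warning for a single site rather than for a whole file.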
* [PATCH bpf-next v4 2/2] bpf, xdp: restructure redirect actions
  2021-02-26 11:23 [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Björn Töpel
  2021-02-26 11:23 ` [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation Björn Töpel
@ 2021-02-26 11:23 ` Björn Töpel
  2021-02-26 11:35 ` [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Toke Høiland-Jørgensen
  2 siblings, 0 replies; 17+ messages in thread
From: Björn Töpel @ 2021-02-26 11:23 UTC (permalink / raw)
  To: ast, daniel, netdev, bpf
  Cc: Björn Töpel, maciej.fijalkowski, hawk, toke, magnus.karlsson, john.fastabend, kuba, davem, Jesper Dangaard Brouer

From: Björn Töpel <bjorn.topel@intel.com>

The XDP_REDIRECT implementations for maps and non-maps are fairly
similar, but obviously need to take different code paths depending on
whether the target uses a map or not. Today, a redirect target for XDP
either uses a map or is based on ifindex.

Here, an explicit redirect type is added to bpf_redirect_info, instead
of the actual map. The redirect type, the map item/ifindex, and the
map_id (if any) are passed to xdp_do_redirect().

In addition to making the code easier to follow, using an explicit
type in bpf_redirect_info has a slight positive performance impact: it
avoids a pointer indirection for the map type lookup, and instead uses
the cacheline for bpf_redirect_info.

Since the actual map is not passed via bpf_redirect_info anymore, the
map lookup is only done in the BPF helper. This means that the
bpf_clear_redirect_map() function can be removed. The actual map item
is RCU protected.

The bpf_redirect_info flags member is not used by XDP, and is no
longer read or written. The map member is only written to when
required/used, and not unconditionally.

Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Acked-by: Toke Høiland-Jørgensen <toke@redhat.com> Reviewed-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com> Signed-off-by: Björn Töpel <bjorn.topel@intel.com> --- include/linux/filter.h | 20 ++++-- include/trace/events/xdp.h | 66 ++++++++++------- kernel/bpf/cpumap.c | 4 +- kernel/bpf/devmap.c | 7 +- net/core/filter.c | 144 +++++++++++++++---------------------- net/xdp/xskmap.c | 4 +- 6 files changed, 121 insertions(+), 124 deletions(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 008691fd3b58..a7752badc2ec 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -646,11 +646,20 @@ struct bpf_redirect_info { u32 flags; u32 tgt_index; void *tgt_value; - struct bpf_map *map; + u32 map_id; + u32 tgt_type; u32 kern_flags; struct bpf_nh_params nh; }; +enum xdp_redirect_type { + XDP_REDIR_UNSET, + XDP_REDIR_DEV_IFINDEX, + XDP_REDIR_DEV_MAP, + XDP_REDIR_CPU_MAP, + XDP_REDIR_XSK_MAP, +}; + DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info); /* flags for bpf_redirect_info kern_flags */ @@ -1473,7 +1482,8 @@ static inline bool bpf_sk_lookup_run_v6(struct net *net, int protocol, #endif /* IS_ENABLED(CONFIG_IPV6) */ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags, - void *lookup_elem(struct bpf_map *map, u32 key)) + void *lookup_elem(struct bpf_map *map, u32 key), + enum xdp_redirect_type type) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); @@ -1488,13 +1498,13 @@ static __always_inline int __bpf_xdp_redirect_map(struct bpf_map *map, u32 ifind * performs multiple lookups, the last one always takes * precedence. 
*/ - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_UNSET; return flags; } - ri->flags = flags; ri->tgt_index = ifindex; - WRITE_ONCE(ri->map, map); + ri->tgt_type = type; + ri->map_id = map->id; return XDP_REDIRECT; } diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index 76a97176ab81..538321735447 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -86,19 +86,15 @@ struct _bpf_dtab_netdev { }; #endif /* __DEVMAP_OBJ_TYPE */ -#define devmap_ifindex(tgt, map) \ - (((map->map_type == BPF_MAP_TYPE_DEVMAP || \ - map->map_type == BPF_MAP_TYPE_DEVMAP_HASH)) ? \ - ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex : 0) - DECLARE_EVENT_CLASS(xdp_redirect_template, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), - TP_ARGS(dev, xdp, tgt, err, map, index), + TP_ARGS(dev, xdp, tgt, err, type, ri), TP_STRUCT__entry( __field(int, prog_id) @@ -111,14 +107,30 @@ DECLARE_EVENT_CLASS(xdp_redirect_template, ), TP_fast_assign( + u32 ifindex = 0, map_id = 0, index = ri->tgt_index; + + switch (type) { + case XDP_REDIR_DEV_MAP: + ifindex = ((struct _bpf_dtab_netdev *)tgt)->dev->ifindex; + fallthrough; + case XDP_REDIR_CPU_MAP: + case XDP_REDIR_XSK_MAP: + map_id = ri->map_id; + break; + case XDP_REDIR_DEV_IFINDEX: + ifindex = (u32)(long)tgt; + break; + default: + break; + } + __entry->prog_id = xdp->aux->id; __entry->act = XDP_REDIRECT; __entry->ifindex = dev->ifindex; __entry->err = err; - __entry->to_ifindex = map ? devmap_ifindex(tgt, map) : - index; - __entry->map_id = map ? map->id : 0; - __entry->map_index = map ? 
index : 0; + __entry->to_ifindex = ifindex; + __entry->map_id = map_id; + __entry->map_index = index; ), TP_printk("prog_id=%d action=%s ifindex=%d to_ifindex=%d err=%d" @@ -133,45 +145,49 @@ DEFINE_EVENT(xdp_redirect_template, xdp_redirect, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); DEFINE_EVENT(xdp_redirect_template, xdp_redirect_err, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); #define _trace_xdp_redirect(dev, xdp, to) \ - trace_xdp_redirect(dev, xdp, NULL, 0, NULL, to) + trace_xdp_redirect(dev, xdp, NULL, 0, XDP_REDIR_DEV_IFINDEX, NULL) #define _trace_xdp_redirect_err(dev, xdp, to, err) \ - trace_xdp_redirect_err(dev, xdp, NULL, err, NULL, to) + trace_xdp_redirect_err(dev, xdp, NULL, err, XDP_REDIR_DEV_IFINDEX, NULL) -#define _trace_xdp_redirect_map(dev, xdp, to, map, index) \ - trace_xdp_redirect(dev, xdp, to, 0, map, index) +#define _trace_xdp_redirect_map(dev, xdp, to, type, ri) \ + trace_xdp_redirect(dev, xdp, to, 0, type, ri) -#define _trace_xdp_redirect_map_err(dev, xdp, to, map, index, err) \ - trace_xdp_redirect_err(dev, xdp, to, err, map, index) +#define _trace_xdp_redirect_map_err(dev, xdp, to, type, ri, err) \ + trace_xdp_redirect_err(dev, xdp, to, err, type, ri) /* not used anymore, but kept around so as not to break old programs */ DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum 
xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); DEFINE_EVENT(xdp_redirect_template, xdp_redirect_map_err, TP_PROTO(const struct net_device *dev, const struct bpf_prog *xdp, const void *tgt, int err, - const struct bpf_map *map, u32 index), - TP_ARGS(dev, xdp, tgt, err, map, index) + enum xdp_redirect_type type, + const struct bpf_redirect_info *ri), + TP_ARGS(dev, xdp, tgt, err, type, ri) ); TRACE_EVENT(xdp_cpumap_kthread, diff --git a/kernel/bpf/cpumap.c b/kernel/bpf/cpumap.c index 85a2d33fd46b..5161ccb871e7 100644 --- a/kernel/bpf/cpumap.c +++ b/kernel/bpf/cpumap.c @@ -543,7 +543,6 @@ static void cpu_map_free(struct bpf_map *map) * complete. */ - bpf_clear_redirect_map(map); synchronize_rcu(); /* For cpu_map the remote CPUs can still be using the entries @@ -602,7 +601,8 @@ static int cpu_map_get_next_key(struct bpf_map *map, void *key, void *next_key) static int cpu_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __cpu_map_lookup_elem, + XDP_REDIR_CPU_MAP); } static int cpu_map_btf_id; diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index adf9a2517f80..5a403471065b 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -197,7 +197,6 @@ static void dev_map_free(struct bpf_map *map) list_del_rcu(&dtab->list); spin_unlock(&dev_map_lock); - bpf_clear_redirect_map(map); synchronize_rcu(); /* Make sure prior __dev_map_entry_free() have completed. 
*/ @@ -737,12 +736,14 @@ static int dev_map_hash_update_elem(struct bpf_map *map, void *key, void *value, static int dev_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_lookup_elem, + XDP_REDIR_DEV_MAP); } static int dev_hash_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __dev_map_hash_lookup_elem, + XDP_REDIR_DEV_MAP); } static int dev_map_btf_id; diff --git a/net/core/filter.c b/net/core/filter.c index fdf7401f43fd..8ce21667b899 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3919,23 +3919,6 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { .arg2_type = ARG_ANYTHING, }; -static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, - struct bpf_map *map, struct xdp_buff *xdp) -{ - switch (map->map_type) { - case BPF_MAP_TYPE_DEVMAP: - case BPF_MAP_TYPE_DEVMAP_HASH: - return dev_map_enqueue(fwd, xdp, dev_rx); - case BPF_MAP_TYPE_CPUMAP: - return cpu_map_enqueue(fwd, xdp, dev_rx); - case BPF_MAP_TYPE_XSKMAP: - return __xsk_map_redirect(fwd, xdp); - default: - return -EBADRQC; - } - return 0; -} - void xdp_do_flush(void) { __dev_flush(); @@ -3944,55 +3927,45 @@ void xdp_do_flush(void) } EXPORT_SYMBOL_GPL(xdp_do_flush); -void bpf_clear_redirect_map(struct bpf_map *map) -{ - struct bpf_redirect_info *ri; - int cpu; - - for_each_possible_cpu(cpu) { - ri = per_cpu_ptr(&bpf_redirect_info, cpu); - /* Avoid polluting remote cacheline due to writes if - * not needed. Once we pass this test, we need the - * cmpxchg() to make sure it hasn't been changed in - * the meantime by remote CPU. 
- */ - if (unlikely(READ_ONCE(ri->map) == map)) - cmpxchg(&ri->map, map, NULL); - } -} - int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - struct bpf_map *map = READ_ONCE(ri->map); - u32 index = ri->tgt_index; + enum xdp_redirect_type type = ri->tgt_type; void *fwd = ri->tgt_value; int err; - ri->tgt_index = 0; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_UNSET; - if (unlikely(!map)) { - fwd = dev_get_by_index_rcu(dev_net(dev), index); + switch (type) { + case XDP_REDIR_DEV_IFINDEX: + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); if (unlikely(!fwd)) { err = -EINVAL; - goto err; + break; } - err = dev_xdp_enqueue(fwd, xdp, dev); - } else { - err = __bpf_tx_xdp_map(dev, fwd, map, xdp); + break; + case XDP_REDIR_DEV_MAP: + err = dev_map_enqueue(fwd, xdp, dev); + break; + case XDP_REDIR_CPU_MAP: + err = cpu_map_enqueue(fwd, xdp, dev); + break; + case XDP_REDIR_XSK_MAP: + err = __xsk_map_redirect(fwd, xdp); + break; + default: + err = -EBADRQC; } if (unlikely(err)) goto err; - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); return 0; err: - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); return err; } EXPORT_SYMBOL_GPL(xdp_do_redirect); @@ -4001,41 +3974,37 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog, - struct bpf_map *map) + void *fwd, + enum xdp_redirect_type type) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - u32 index = ri->tgt_index; - void *fwd = ri->tgt_value; - int err = 0; - - ri->tgt_index = 0; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); - - if (map->map_type == BPF_MAP_TYPE_DEVMAP || - map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { - 
struct bpf_dtab_netdev *dst = fwd; + int err; - err = dev_map_generic_redirect(dst, skb, xdp_prog); + switch (type) { + case XDP_REDIR_DEV_MAP: + err = dev_map_generic_redirect(fwd, skb, xdp_prog); if (unlikely(err)) goto err; - } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { + break; + case XDP_REDIR_XSK_MAP: { struct xdp_sock *xs = fwd; err = xsk_generic_rcv(xs, xdp); if (err) goto err; consume_skb(skb); - } else { + break; + } + default: /* TODO: Handle BPF_MAP_TYPE_CPUMAP */ err = -EBADRQC; goto err; } - _trace_xdp_redirect_map(dev, xdp_prog, fwd, map, index); + _trace_xdp_redirect_map(dev, xdp_prog, fwd, type, ri); return 0; err: - _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, map, index, err); + _trace_xdp_redirect_map_err(dev, xdp_prog, fwd, type, ri, err); return err; } @@ -4043,29 +4012,31 @@ int xdp_do_generic_redirect(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, struct bpf_prog *xdp_prog) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); - struct bpf_map *map = READ_ONCE(ri->map); - u32 index = ri->tgt_index; - struct net_device *fwd; + enum xdp_redirect_type type = ri->tgt_type; + void *fwd = ri->tgt_value; int err = 0; - if (map) - return xdp_do_generic_redirect_map(dev, skb, xdp, xdp_prog, - map); - ri->tgt_index = 0; - fwd = dev_get_by_index_rcu(dev_net(dev), index); - if (unlikely(!fwd)) { - err = -EINVAL; - goto err; - } + ri->tgt_type = XDP_REDIR_UNSET; + ri->tgt_value = NULL; - err = xdp_ok_fwd_dev(fwd, skb->len); - if (unlikely(err)) - goto err; + if (type == XDP_REDIR_DEV_IFINDEX) { + fwd = dev_get_by_index_rcu(dev_net(dev), (u32)(long)fwd); + if (unlikely(!fwd)) { + err = -EINVAL; + goto err; + } - skb->dev = fwd; - _trace_xdp_redirect(dev, xdp_prog, index); - generic_xdp_tx(skb, xdp_prog); - return 0; + err = xdp_ok_fwd_dev(fwd, skb->len); + if (unlikely(err)) + goto err; + + skb->dev = fwd; + _trace_xdp_redirect(dev, xdp_prog, index); + generic_xdp_tx(skb, xdp_prog); + return 0; + } + + return 
xdp_do_generic_redirect_map(dev, skb, xdp, xdp_prog, fwd, type); err: _trace_xdp_redirect_err(dev, xdp_prog, index, err); return err; @@ -4078,10 +4049,9 @@ BPF_CALL_2(bpf_xdp_redirect, u32, ifindex, u64, flags) if (unlikely(flags)) return XDP_ABORTED; - ri->flags = flags; - ri->tgt_index = ifindex; - ri->tgt_value = NULL; - WRITE_ONCE(ri->map, NULL); + ri->tgt_type = XDP_REDIR_DEV_IFINDEX; + ri->tgt_index = 0; + ri->tgt_value = (void *)(long)ifindex; return XDP_REDIRECT; } diff --git a/net/xdp/xskmap.c b/net/xdp/xskmap.c index 92f4023d3ae2..d3e80c9fa17c 100644 --- a/net/xdp/xskmap.c +++ b/net/xdp/xskmap.c @@ -87,7 +87,6 @@ static void xsk_map_free(struct bpf_map *map) { struct xsk_map *m = container_of(map, struct xsk_map, map); - bpf_clear_redirect_map(map); synchronize_net(); bpf_map_area_free(m); } @@ -229,7 +228,8 @@ static int xsk_map_delete_elem(struct bpf_map *map, void *key) static int xsk_map_xdp_redirect_map(struct bpf_map *map, u32 ifindex, u64 flags) { - return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem); + return __bpf_xdp_redirect_map(map, ifindex, flags, __xsk_map_lookup_elem, + XDP_REDIR_XSK_MAP); } void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs, -- 2.27.0 ^ permalink raw reply related [flat|nested] 17+ messages in thread
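As a reading aid for the patch above, the explicit redirect type can be sketched in userspace C. This is a simplified model under stated assumptions: all toy_* names are hypothetical, a single static struct stands in for the per-CPU bpf_redirect_info, a counter stands in for the enqueue functions, and the error values are placeholders for the kernel's -EINVAL and -EBADRQC.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the enum added by the patch; toy names are hypothetical. */
enum toy_redirect_type {
	TOY_REDIR_UNSET,
	TOY_REDIR_DEV_IFINDEX,
	TOY_REDIR_DEV_MAP,
	TOY_REDIR_CPU_MAP,
	TOY_REDIR_XSK_MAP,
};

/* Placeholder error codes for this sketch. */
enum { TOY_ERR_NODEV = -1, TOY_ERR_UNSET = -2 };

struct toy_redirect_info {
	uint32_t tgt_index;
	void *tgt_value;
	uint32_t map_id;
	enum toy_redirect_type tgt_type;
};

/* Stand-in for the per-CPU bpf_redirect_info. */
static struct toy_redirect_info toy_ri;

/* Counts how often each redirect type reached its "enqueue" step. */
static int toy_enqueued[TOY_REDIR_XSK_MAP + 1];

/* Helper side, plain ifindex: the ifindex travels inside tgt_value. */
static void toy_set_ifindex(uint32_t ifindex)
{
	toy_ri.tgt_type = TOY_REDIR_DEV_IFINDEX;
	toy_ri.tgt_index = 0;
	toy_ri.tgt_value = (void *)(long)ifindex;
}

/* Helper side, map based: the looked-up item and map id are recorded. */
static void toy_set_map(enum toy_redirect_type type, uint32_t map_id,
			uint32_t key, void *item)
{
	toy_ri.tgt_type = type;
	toy_ri.map_id = map_id;
	toy_ri.tgt_index = key;
	toy_ri.tgt_value = item;
}

/* Consumer side: switch on the recorded type, no map pointer needed. */
static int toy_do_redirect(void)
{
	enum toy_redirect_type type = toy_ri.tgt_type;
	void *fwd = toy_ri.tgt_value;

	/* Consume the request so a stale target is never reused. */
	toy_ri.tgt_type = TOY_REDIR_UNSET;

	switch (type) {
	case TOY_REDIR_DEV_IFINDEX:
		/* In this toy, ifindex 0 plays the role of "no such device". */
		if ((uint32_t)(long)fwd == 0)
			return TOY_ERR_NODEV;
		break;
	case TOY_REDIR_DEV_MAP:
	case TOY_REDIR_CPU_MAP:
	case TOY_REDIR_XSK_MAP:
		if (!fwd)
			return TOY_ERR_NODEV;
		break;
	default:
		return TOY_ERR_UNSET;
	}

	toy_enqueued[type]++;
	return 0;
}
```

The point of the design, as the commit message describes it, is that the consumer reads only fields of the redirect info itself, so it never has to dereference a map pointer to learn the map type, and no map pointer is left behind that would need clearing when a map is freed.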
* Re: [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect()
  2021-02-26 11:23 [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Björn Töpel
  2021-02-26 11:23 ` [PATCH bpf-next v4 1/2] bpf, xdp: make bpf_redirect_map() a map operation Björn Töpel
  2021-02-26 11:23 ` [PATCH bpf-next v4 2/2] bpf, xdp: restructure redirect actions Björn Töpel
@ 2021-02-26 11:35 ` Toke Høiland-Jørgensen
  2021-02-26 11:38   ` Björn Töpel
  2 siblings, 1 reply; 17+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-02-26 11:35 UTC (permalink / raw)
  To: Björn Töpel, ast, daniel, netdev, bpf
  Cc: Björn Töpel, bjorn.topel, maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

Björn Töpel <bjorn.topel@gmail.com> writes:

> Hi XDP-folks,
>
> This two-patch series contains two optimizations for the
> bpf_redirect_map() helper and the xdp_do_redirect() function.
>
> The bpf_redirect_map() optimization is about avoiding the map lookup
> dispatching. Instead of having a switch-statement and selecting the
> correct lookup function, we let bpf_redirect_map() be a map operation,
> where each map has its own bpf_redirect_map() implementation. This way
> the run-time lookup is avoided.
>
> The xdp_do_redirect() patch restructures the code, so that the map
> pointer indirection can be avoided.
>
> Performance-wise I got 3% improvement for XSKMAP
> (sample:xdpsock/rx-drop), and 4% (sample:xdp_redirect_map) on my
> machine.
>
> More details in each commit.
>
> @Jesper/Toke I dropped your Acked-by: on the first patch, since there
> was major restructuring. Please have another look! Thanks!

Will do! Did you update the performance numbers above after that change?

-Toke

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect()
  2021-02-26 11:35 ` [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect() Toke Høiland-Jørgensen
@ 2021-02-26 11:38   ` Björn Töpel
  2021-02-26 11:43     ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 17+ messages in thread
From: Björn Töpel @ 2021-02-26 11:38 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen, Björn Töpel, ast, daniel, netdev, bpf
  Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

On 2021-02-26 12:35, Toke Høiland-Jørgensen wrote:
> Björn Töpel <bjorn.topel@gmail.com> writes:
>
>> Hi XDP-folks,
>>
>> This two-patch series contains two optimizations for the
>> bpf_redirect_map() helper and the xdp_do_redirect() function.
>>
>> The bpf_redirect_map() optimization is about avoiding the map lookup
>> dispatching. Instead of having a switch-statement and selecting the
>> correct lookup function, we let bpf_redirect_map() be a map operation,
>> where each map has its own bpf_redirect_map() implementation. This way
>> the run-time lookup is avoided.
>>
>> The xdp_do_redirect() patch restructures the code, so that the map
>> pointer indirection can be avoided.
>>
>> Performance-wise I got 3% improvement for XSKMAP
>> (sample:xdpsock/rx-drop), and 4% (sample:xdp_redirect_map) on my
>> machine.
>>
>> More details in each commit.
>>
>> @Jesper/Toke I dropped your Acked-by: on the first patch, since there
>> was major restructuring. Please have another look! Thanks!
>
> Will do! Did you update the performance numbers above after that change?
>

I did. The XSKMAP performance stayed the same (no surprise, since the
code was the same). However, for DEVMAP, v4 got rid of a call, so
it *should* be a bit better, but for some reason it didn't show on my
machine.


Björn

^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH bpf-next v4 0/2] Optimize bpf_redirect_map()/xdp_do_redirect()
From: Toke Høiland-Jørgensen @ 2021-02-26 11:43 UTC
To: Björn Töpel, Björn Töpel, ast, daniel, netdev, bpf
Cc: maciej.fijalkowski, hawk, magnus.karlsson, john.fastabend, kuba, davem

Björn Töpel <bjorn.topel@intel.com> writes:

> On 2021-02-26 12:35, Toke Høiland-Jørgensen wrote:
>> Björn Töpel <bjorn.topel@gmail.com> writes:
>>
>>> Hi XDP-folks,
>>>
>>> This two patch series contain two optimizations for the
>>> bpf_redirect_map() helper and the xdp_do_redirect() function.
>>>
>>> The bpf_redirect_map() optimization is about avoiding the map lookup
>>> dispatching. Instead of having a switch-statement and selecting the
>>> correct lookup function, we let bpf_redirect_map() be a map operation,
>>> where each map has its own bpf_redirect_map() implementation. This way
>>> the run-time lookup is avoided.
>>>
>>> The xdp_do_redirect() patch restructures the code, so that the map
>>> pointer indirection can be avoided.
>>>
>>> Performance-wise I got 3% improvement for XSKMAP
>>> (sample:xdpsock/rx-drop), and 4% (sample:xdp_redirect_map) on my
>>> machine.
>>>
>>> More details in each commit.
>>>
>>> @Jesper/Toke I dropped your Acked-by: on the first patch, since there
>>> were major restucturing. Please have another look! Thanks!
>>
>> Will do! Did you update the performance numbers above after that change?
>>
>
> I did. The XSKMAP performance stayed the same (no surprise, since the
> code was the same). However, for the DEVMAP the v4 got rid of a call, so
> it *should* be a bit better, but for some reason it didn't show on my
> machine.

Alright, fair enough - pesky real world not lining up with expectations!
Maybe Jesper has additional suggestions, but I can live with the 4%
improvement ;)

-Toke
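For readers following the thread without the patches at hand, the cover
letter's central idea (replace a per-call switch on the map type with an
implementation that belongs to the map) can be sketched in plain userspace
C. This is only an illustration under assumptions: every toy_* name, the
struct layout and the return values are hypothetical and are not taken
from the kernel or from the patches, and the sketch does not show how the
series itself binds the per-map function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum toy_map_type { TOY_DEVMAP, TOY_CPUMAP, TOY_XSKMAP };

struct toy_map;

struct toy_map_ops {
	/* Per-map redirect hook, chosen once when the map is set up. */
	long (*redirect)(const struct toy_map *map, unsigned int index);
};

struct toy_map {
	enum toy_map_type type;
	const struct toy_map_ops *ops;
	unsigned int max_entries;
};

/* Made-up results (100/200/300 + index) just to tell the paths apart. */
static long toy_dev_redirect(const struct toy_map *map, unsigned int index)
{
	return index < map->max_entries ? 100 + (long)index : -1;
}

static long toy_cpu_redirect(const struct toy_map *map, unsigned int index)
{
	return index < map->max_entries ? 200 + (long)index : -1;
}

static long toy_xsk_redirect(const struct toy_map *map, unsigned int index)
{
	return index < map->max_entries ? 300 + (long)index : -1;
}

const struct toy_map_ops toy_dev_ops = { .redirect = toy_dev_redirect };
const struct toy_map_ops toy_cpu_ops = { .redirect = toy_cpu_redirect };
const struct toy_map_ops toy_xsk_ops = { .redirect = toy_xsk_redirect };

/* "Before" shape: one shared helper picks the implementation on every call. */
long toy_redirect_switch(const struct toy_map *map, unsigned int index)
{
	assert(map != NULL);
	switch (map->type) {
	case TOY_DEVMAP:
		return toy_dev_redirect(map, index);
	case TOY_CPUMAP:
		return toy_cpu_redirect(map, index);
	case TOY_XSKMAP:
		return toy_xsk_redirect(map, index);
	}
	return -1;
}

/* "After" shape: the map carries its own implementation, so the caller
 * no longer switches on the map type for each packet. */
long toy_redirect_ops(const struct toy_map *map, unsigned int index)
{
	assert(map != NULL && map->ops != NULL && map->ops->redirect != NULL);
	return map->ops->redirect(map, index);
}
```

Both entry points return the same value for a given map and index; the
difference is only where the choice of implementation is made. Whether
that shows up as a measurable gain depends on the machine, as the DEVMAP
numbers discussed above suggest.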