From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 635B8C433E1 for ; Wed, 1 Jul 2020 04:21:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 266BC206B6 for ; Wed, 1 Jul 2020 04:21:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="qjNI8nMz" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726092AbgGAEVN (ORCPT ); Wed, 1 Jul 2020 00:21:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725272AbgGAEVN (ORCPT ); Wed, 1 Jul 2020 00:21:13 -0400 Received: from mail-pf1-x442.google.com (mail-pf1-x442.google.com [IPv6:2607:f8b0:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE1B2C061755; Tue, 30 Jun 2020 21:21:12 -0700 (PDT) Received: by mail-pf1-x442.google.com with SMTP id x3so4882755pfo.9; Tue, 30 Jun 2020 21:21:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=KSUZkqGd6KUrxkx4++IKSGTTxMwn8vlVYbcu8FQWDIU=; b=qjNI8nMzA79HQjAHiqtKcEX461gjceNz+pEBABqC0admZGuwEBCiBk1Mu7CgV2tEEf 2jMBTywRPM15LggjrbbnIYZKI8xIggP3xjfwyzV5K236zstSN+/dMyC/I353X4k5RZGO AJwV614S676xSioYXLHtHYFlwwwOnd9bYtkCdTmziTRXHMieBphjPrkQmRyNG4uwE9Pp 9An88V1JlepnNF8YKIze2uU+aBvtLkVrVtwvf073hbOGw65eCa4y2QsSxrZsDToPJ9o9 1B31ufeppuMZdtf/bdNCPdQuStHMGxbg9o71IKpSAqti19fDNk39FieE5++fsyi53T4a Pf1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=KSUZkqGd6KUrxkx4++IKSGTTxMwn8vlVYbcu8FQWDIU=; b=qQzM/GVtO410An7+8uZtRei50TR5Pg4BTJ82mkVFHw0byNlUCZ17L7ItaJbXlnzcfb WvUN+EES98HlZnZ4IM+GBGkVLuRULDlHtaqgE5RHxvXoZejpzmP3m4p+wkHpxVDOOR9P K6Atab4FOIKF9n6TGBm1RO8beciLSA6DEhpKUjw/b19SA+8yntxTXD1XQMmWjGrK+ynW grzbeXI/DR9ZA+jgqAdHfqF6mh4I82/6c4CD4/Vvxtel2TRor2HWku/U1AVTnv8/+hFl ByT6XLBMVi6dvPzLjFfAvIhnfk+XesHFsT2lzWn+xhRA6u0UxmouC4feo3zt5/LkjNev 0qHw== X-Gm-Message-State: AOAM532lx/ja9ievdXsJ8XCApfhA6YYs5Gosu+8RNFgUV6GHVFaUxD2E ymnLgQsKugQiZlvGwTYyN39esTwx+ps= X-Google-Smtp-Source: ABdhPJwMQSshsPIJ2ulF3zprvrDTIUBxH4Vr3cSIxLROV3gUe4n7tlWUFdJPaNS0+rdcOjjoIopFEA== X-Received: by 2002:a63:725c:: with SMTP id c28mr17567092pgn.156.1593577271975; Tue, 30 Jun 2020 21:21:11 -0700 (PDT) Received: from dhcp-12-153.nay.redhat.com ([209.132.188.80]) by smtp.gmail.com with ESMTPSA id h9sm3420227pjs.50.2020.06.30.21.21.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Jun 2020 21:21:11 -0700 (PDT) From: Hangbin Liu To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, =?UTF-8?q?Toke=20H=C3=B8iland-J=C3=B8rgensen?= , Jiri Benc , Jesper Dangaard Brouer , Eelco Chaudron , ast@kernel.org, Daniel Borkmann , Lorenzo Bianconi , Hangbin Liu Subject: [PATCHv5 bpf-next 1/3] xdp: add a new helper for dev map multicast support Date: Wed, 1 Jul 2020 12:19:36 +0800 Message-Id: <20200701041938.862200-2-liuhangbin@gmail.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200701041938.862200-1-liuhangbin@gmail.com> References: <20200526140539.4103528-1-liuhangbin@gmail.com> <20200701041938.862200-1-liuhangbin@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org This patch is for xdp multicast support. In this implementation we add a new helper to accept two maps: forward map and exclude map. We will redirect the packet to all the interfaces in *forward map*, but exclude the interfaces that in *exclude map*. To achive this I add a new ex_map for struct bpf_redirect_info. in the helper I set tgt_value to NULL to make a difference with bpf_xdp_redirect_map() We also add a flag *BPF_F_EXCLUDE_INGRESS* incase you don't want to create a exclude map for each interface and just want to exclude the ingress interface. The general data path is kept in net/core/filter.c. The native data path is in kernel/bpf/devmap.c so we can use direct calls to get better performace. v5: a) Check devmap_get_next_key() return value. b) Pass through flags to __bpf_tx_xdp_map() instead of bool value. c) In function dev_map_enqueue_multi(), consume xdpf for the last obj instead of the first on. d) Update helper description and code comments to explain that we use NULL target value to distinguish multicast and unicast forwarding. e) Update memory model, memory id and frame_sz in xdpf_clone(). v4: Fix bpf_xdp_redirect_map_multi_proto arg2_type typo v3: Based on Toke's suggestion, do the following update a) Update bpf_redirect_map_multi() description in bpf.h. b) Fix exclude_ifindex checking order in dev_in_exclude_map(). c) Fix one more xdpf clone in dev_map_enqueue_multi(). d) Go find next one in dev_map_enqueue_multi() if the interface is not able to forward instead of abort the whole loop. e) Remove READ_ONCE/WRITE_ONCE for ex_map. v2: Add new syscall bpf_xdp_redirect_map_multi() which could accept include/exclude maps directly. Signed-off-by: Hangbin Liu --- include/linux/bpf.h | 20 +++++ include/linux/filter.h | 1 + include/net/xdp.h | 1 + include/uapi/linux/bpf.h | 25 +++++- kernel/bpf/devmap.c | 154 +++++++++++++++++++++++++++++++++ kernel/bpf/verifier.c | 6 ++ net/core/filter.c | 109 +++++++++++++++++++++-- net/core/xdp.c | 29 +++++++ tools/include/uapi/linux/bpf.h | 25 +++++- 9 files changed, 363 insertions(+), 7 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 3d2ade703a35..c77bc70dba87 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -1264,6 +1264,11 @@ int dev_xdp_enqueue(struct net_device *dev, struct xdp_buff *xdp, struct net_device *dev_rx); int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp, struct net_device *dev_rx); +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map, + int exclude_ifindex); +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx, + struct bpf_map *map, struct bpf_map *ex_map, + u32 flags); int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, struct bpf_prog *xdp_prog); bool dev_map_can_have_prog(struct bpf_map *map); @@ -1406,6 +1411,21 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp, return 0; } +static inline +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map, + int exclude_ifindex) +{ + return false; +} + +static inline +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx, + struct bpf_map *map, struct bpf_map *ex_map, + u32 flags) +{ + return 0; +} + struct sk_buff; static inline int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, diff --git a/include/linux/filter.h b/include/linux/filter.h index 259377723603..cf5b5b1d9ae5 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -612,6 +612,7 @@ struct bpf_redirect_info { u32 tgt_index; void *tgt_value; struct bpf_map *map; + struct bpf_map *ex_map; u32 kern_flags; }; diff --git a/include/net/xdp.h b/include/net/xdp.h index 609f819ed08b..deb6c104e698 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -110,6 +110,7 @@ void xdp_warn(const char *msg, const char *func, const int line); #define XDP_WARN(msg) xdp_warn(msg, __func__, __LINE__) struct xdp_frame *xdp_convert_zc_to_xdp_frame(struct xdp_buff *xdp); +struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf); static inline void xdp_convert_frame_to_buff(struct xdp_frame *frame, struct xdp_buff *xdp) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 0cb8ec948816..d7de6c0b32e4 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3285,6 +3285,23 @@ union bpf_attr { * Dynamically cast a *sk* pointer to a *udp6_sock* pointer. * Return * *sk* if casting is valid, or NULL otherwise. + * + * int bpf_redirect_map_multi(struct bpf_map *map, struct bpf_map *ex_map, u64 flags) + * Description + * This is a multicast implementation for XDP redirect. It will + * redirect the packet to ALL the interfaces in *map*, but + * exclude the interfaces in *ex_map*. + * + * Currently the *flags* only supports *BPF_F_EXCLUDE_INGRESS*, + * which additionally excludes the current ingress device. + * + * See also bpf_redirect_map() as a unicast implementation, + * which supports redirecting packet to a specific ifindex + * in the map. As both helpers use struct bpf_redirect_info + * to store the redirect info, we will use a a NULL tgt_value + * to distinguish multicast and unicast redirecting. + * Return + * **XDP_REDIRECT** on success, or **XDP_ABORTED** on error. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3427,7 +3444,8 @@ union bpf_attr { FN(skc_to_tcp_sock), \ FN(skc_to_tcp_timewait_sock), \ FN(skc_to_tcp_request_sock), \ - FN(skc_to_udp6_sock), + FN(skc_to_udp6_sock), \ + FN(redirect_map_multi), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call @@ -3588,6 +3606,11 @@ enum bpf_lwt_encap_mode { BPF_LWT_ENCAP_IP, }; +/* BPF_FUNC_redirect_map_multi flags. */ +enum { + BPF_F_EXCLUDE_INGRESS = (1ULL << 0), +}; + #define __bpf_md_ptr(type, name) \ union { \ type name; \ diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 58acc46861ef..8a45fc9e2ccb 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -510,6 +510,160 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp, return __xdp_enqueue(dev, xdp, dev_rx); } +/* Use direct call in fast path instead of map->ops->map_get_next_key() */ +static int devmap_get_next_key(struct bpf_map *map, void *key, void *next_key) +{ + + switch (map->map_type) { + case BPF_MAP_TYPE_DEVMAP: + return dev_map_get_next_key(map, key, next_key); + case BPF_MAP_TYPE_DEVMAP_HASH: + return dev_map_hash_get_next_key(map, key, next_key); + default: + break; + } + + return -ENOENT; +} + +bool dev_in_exclude_map(struct bpf_dtab_netdev *obj, struct bpf_map *map, + int exclude_ifindex) +{ + struct bpf_dtab_netdev *ex_obj = NULL; + u32 key, next_key; + int err; + + if (obj->dev->ifindex == exclude_ifindex) + return true; + + if (!map) + return false; + + err = devmap_get_next_key(map, NULL, &key); + if (err) + return false; + + for (;;) { + switch (map->map_type) { + case BPF_MAP_TYPE_DEVMAP: + ex_obj = __dev_map_lookup_elem(map, key); + break; + case BPF_MAP_TYPE_DEVMAP_HASH: + ex_obj = __dev_map_hash_lookup_elem(map, key); + break; + default: + break; + } + + if (ex_obj && ex_obj->dev->ifindex == obj->dev->ifindex) + return true; + + err = devmap_get_next_key(map, &key, &next_key); + if (err) + break; + + key = next_key; + } + + return false; +} + +struct bpf_dtab_netdev *devmap_get_next_obj(struct xdp_buff *xdp, struct bpf_map *map, + struct bpf_map *ex_map, u32 *key, + u32 *next_key, int ex_ifindex) +{ + struct bpf_dtab_netdev *obj; + struct net_device *dev; + u32 *tmp_key = key; + int err; + + err = devmap_get_next_key(map, tmp_key, next_key); + if (err) + return NULL; + + for (;;) { + switch (map->map_type) { + case BPF_MAP_TYPE_DEVMAP: + obj = __dev_map_lookup_elem(map, *next_key); + break; + case BPF_MAP_TYPE_DEVMAP_HASH: + obj = __dev_map_hash_lookup_elem(map, *next_key); + break; + default: + break; + } + + if (!obj || dev_in_exclude_map(obj, ex_map, ex_ifindex)) + goto find_next; + + dev = obj->dev; + + if (!dev->netdev_ops->ndo_xdp_xmit) + goto find_next; + + err = xdp_ok_fwd_dev(dev, xdp->data_end - xdp->data); + if (unlikely(err)) + goto find_next; + + return obj; + +find_next: + tmp_key = next_key; + err = devmap_get_next_key(map, tmp_key, next_key); + if (err) + break; + } + + return NULL; +} + +int dev_map_enqueue_multi(struct xdp_buff *xdp, struct net_device *dev_rx, + struct bpf_map *map, struct bpf_map *ex_map, + u32 flags) +{ + struct bpf_dtab_netdev *obj = NULL, *next_obj = NULL; + struct xdp_frame *xdpf, *nxdpf; + bool last_one = false; + int ex_ifindex; + u32 key, next_key; + + ex_ifindex = flags & BPF_F_EXCLUDE_INGRESS ? dev_rx->ifindex : 0; + + /* Find first available obj */ + obj = devmap_get_next_obj(xdp, map, ex_map, NULL, &key, ex_ifindex); + if (!obj) + return 0; + + xdpf = xdp_convert_buff_to_frame(xdp); + if (unlikely(!xdpf)) + return -EOVERFLOW; + + for (;;) { + /* Check if we still have one more available obj */ + next_obj = devmap_get_next_obj(xdp, map, ex_map, &key, + &next_key, ex_ifindex); + if (!next_obj) + last_one = true; + + if (last_one) { + bq_enqueue(obj->dev, xdpf, dev_rx); + return 0; + } + + nxdpf = xdpf_clone(xdpf); + if (unlikely(!nxdpf)) { + xdp_return_frame_rx_napi(xdpf); + return -ENOMEM; + } + + bq_enqueue(obj->dev, nxdpf, dev_rx); + + /* Deal with next obj */ + obj = next_obj; + key = next_key; + } +} + int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, struct bpf_prog *xdp_prog) { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 7de98906ddf4..8302b68ef953 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -4110,6 +4110,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, case BPF_MAP_TYPE_DEVMAP: case BPF_MAP_TYPE_DEVMAP_HASH: if (func_id != BPF_FUNC_redirect_map && + func_id != BPF_FUNC_redirect_map_multi && func_id != BPF_FUNC_map_lookup_elem) goto error; break; @@ -4202,6 +4203,11 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, map->map_type != BPF_MAP_TYPE_XSKMAP) goto error; break; + case BPF_FUNC_redirect_map_multi: + if (map->map_type != BPF_MAP_TYPE_DEVMAP && + map->map_type != BPF_MAP_TYPE_DEVMAP_HASH) + goto error; + break; case BPF_FUNC_sk_redirect_map: case BPF_FUNC_msg_redirect_map: case BPF_FUNC_sock_map_update: diff --git a/net/core/filter.c b/net/core/filter.c index c796e141ea8e..37df1ea747ae 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3515,12 +3515,19 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { }; static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, - struct bpf_map *map, struct xdp_buff *xdp) + struct bpf_map *map, struct xdp_buff *xdp, + struct bpf_map *ex_map, u32 flags) { switch (map->map_type) { case BPF_MAP_TYPE_DEVMAP: case BPF_MAP_TYPE_DEVMAP_HASH: - return dev_map_enqueue(fwd, xdp, dev_rx); + /* We use a NULL fwd value to distinguish multicast + * and unicast forwarding + */ + if (fwd) + return dev_map_enqueue(fwd, xdp, dev_rx); + else + return dev_map_enqueue_multi(xdp, dev_rx, map, ex_map, flags); case BPF_MAP_TYPE_CPUMAP: return cpu_map_enqueue(fwd, xdp, dev_rx); case BPF_MAP_TYPE_XSKMAP: @@ -3577,12 +3584,14 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); struct bpf_map *map = READ_ONCE(ri->map); + struct bpf_map *ex_map = ri->ex_map; u32 index = ri->tgt_index; void *fwd = ri->tgt_value; int err; ri->tgt_index = 0; ri->tgt_value = NULL; + ri->ex_map = NULL; WRITE_ONCE(ri->map, NULL); if (unlikely(!map)) { @@ -3594,7 +3603,7 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, err = dev_xdp_enqueue(fwd, xdp, dev); } else { - err = __bpf_tx_xdp_map(dev, fwd, map, xdp); + err = __bpf_tx_xdp_map(dev, fwd, map, xdp, ex_map, ri->flags); } if (unlikely(err)) @@ -3608,6 +3617,55 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, } EXPORT_SYMBOL_GPL(xdp_do_redirect); +static int dev_map_redirect_multi(struct net_device *dev, struct sk_buff *skb, + struct bpf_prog *xdp_prog, + struct bpf_map *map, struct bpf_map *ex_map, + u32 flags) + +{ + struct bpf_dtab_netdev *dst; + struct sk_buff *nskb; + bool exclude_ingress; + u32 key, next_key; + void *fwd; + int err; + + /* Get first key from forward map */ + err = map->ops->map_get_next_key(map, NULL, &key); + if (err) + return err; + + exclude_ingress = !!(flags & BPF_F_EXCLUDE_INGRESS); + + for (;;) { + fwd = __xdp_map_lookup_elem(map, key); + if (fwd) { + dst = (struct bpf_dtab_netdev *)fwd; + if (dev_in_exclude_map(dst, ex_map, + exclude_ingress ? dev->ifindex : 0)) + goto find_next; + + nskb = skb_clone(skb, GFP_ATOMIC); + if (!nskb) + return -ENOMEM; + + /* Try forword next one no mater the current forward + * succeed or not */ + dev_map_generic_redirect(dst, nskb, xdp_prog); + } + +find_next: + err = map->ops->map_get_next_key(map, &key, &next_key); + if (err) + break; + + key = next_key; + } + + consume_skb(skb); + return 0; +} + static int xdp_do_generic_redirect_map(struct net_device *dev, struct sk_buff *skb, struct xdp_buff *xdp, @@ -3615,19 +3673,30 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, struct bpf_map *map) { struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + struct bpf_map *ex_map = ri->ex_map; u32 index = ri->tgt_index; void *fwd = ri->tgt_value; int err = 0; ri->tgt_index = 0; ri->tgt_value = NULL; + ri->ex_map = NULL; WRITE_ONCE(ri->map, NULL); if (map->map_type == BPF_MAP_TYPE_DEVMAP || map->map_type == BPF_MAP_TYPE_DEVMAP_HASH) { - struct bpf_dtab_netdev *dst = fwd; + /* We use a NULL fwd value to distinguish multicast + * and unicast forwarding + */ + if (fwd) { + struct bpf_dtab_netdev *dst = fwd; + + err = dev_map_generic_redirect(dst, skb, xdp_prog); + } else { + err = dev_map_redirect_multi(dev, skb, xdp_prog, map, + ex_map, ri->flags); + } - err = dev_map_generic_redirect(dst, skb, xdp_prog); if (unlikely(err)) goto err; } else if (map->map_type == BPF_MAP_TYPE_XSKMAP) { @@ -3741,6 +3810,34 @@ static const struct bpf_func_proto bpf_xdp_redirect_map_proto = { .arg3_type = ARG_ANYTHING, }; +BPF_CALL_3(bpf_xdp_redirect_map_multi, struct bpf_map *, map, + struct bpf_map *, ex_map, u64, flags) +{ + struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); + + if (unlikely(!map || flags > BPF_F_EXCLUDE_INGRESS)) + return XDP_ABORTED; + + ri->tgt_index = 0; + /* Set the tgt_value to NULL to distinguish with bpf_xdp_redirect_map */ + ri->tgt_value = NULL; + ri->flags = flags; + ri->ex_map = ex_map; + + WRITE_ONCE(ri->map, map); + + return XDP_REDIRECT; +} + +static const struct bpf_func_proto bpf_xdp_redirect_map_multi_proto = { + .func = bpf_xdp_redirect_map_multi, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_CONST_MAP_PTR, + .arg2_type = ARG_CONST_MAP_PTR, + .arg3_type = ARG_ANYTHING, +}; + static unsigned long bpf_skb_copy(void *dst_buff, const void *skb, unsigned long off, unsigned long len) { @@ -6464,6 +6561,8 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) return &bpf_xdp_redirect_proto; case BPF_FUNC_redirect_map: return &bpf_xdp_redirect_map_proto; + case BPF_FUNC_redirect_map_multi: + return &bpf_xdp_redirect_map_multi_proto; case BPF_FUNC_xdp_adjust_tail: return &bpf_xdp_adjust_tail_proto; case BPF_FUNC_fib_lookup: diff --git a/net/core/xdp.c b/net/core/xdp.c index 90f44f382115..7e291f1015d7 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -475,3 +475,32 @@ void xdp_warn(const char *msg, const char *func, const int line) WARN(1, "XDP_WARN: %s(line:%d): %s\n", func, line, msg); }; EXPORT_SYMBOL_GPL(xdp_warn); + +struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf) +{ + unsigned int headroom, totalsize; + struct xdp_frame *nxdpf; + struct page *page; + void *addr; + + headroom = xdpf->headroom + sizeof(*xdpf); + totalsize = headroom + xdpf->len; + + if (unlikely(totalsize > PAGE_SIZE)) + return NULL; + page = dev_alloc_page(); + if (!page) + return NULL; + addr = page_to_virt(page); + + memcpy(addr, xdpf, totalsize); + + nxdpf = addr; + nxdpf->data = addr + headroom; + nxdpf->frame_sz = PAGE_SIZE; + nxdpf->mem.type = MEM_TYPE_PAGE_ORDER0; + nxdpf->mem.id = 0; + + return nxdpf; +} +EXPORT_SYMBOL_GPL(xdpf_clone); diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 0cb8ec948816..d7de6c0b32e4 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3285,6 +3285,23 @@ union bpf_attr { * Dynamically cast a *sk* pointer to a *udp6_sock* pointer. * Return * *sk* if casting is valid, or NULL otherwise. + * + * int bpf_redirect_map_multi(struct bpf_map *map, struct bpf_map *ex_map, u64 flags) + * Description + * This is a multicast implementation for XDP redirect. It will + * redirect the packet to ALL the interfaces in *map*, but + * exclude the interfaces in *ex_map*. + * + * Currently the *flags* only supports *BPF_F_EXCLUDE_INGRESS*, + * which additionally excludes the current ingress device. + * + * See also bpf_redirect_map() as a unicast implementation, + * which supports redirecting packet to a specific ifindex + * in the map. As both helpers use struct bpf_redirect_info + * to store the redirect info, we will use a a NULL tgt_value + * to distinguish multicast and unicast redirecting. + * Return + * **XDP_REDIRECT** on success, or **XDP_ABORTED** on error. */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3427,7 +3444,8 @@ union bpf_attr { FN(skc_to_tcp_sock), \ FN(skc_to_tcp_timewait_sock), \ FN(skc_to_tcp_request_sock), \ - FN(skc_to_udp6_sock), + FN(skc_to_udp6_sock), \ + FN(redirect_map_multi), /* integer value in 'imm' field of BPF_CALL instruction selects which helper * function eBPF program intends to call @@ -3588,6 +3606,11 @@ enum bpf_lwt_encap_mode { BPF_LWT_ENCAP_IP, }; +/* BPF_FUNC_redirect_map_multi flags. */ +enum { + BPF_F_EXCLUDE_INGRESS = (1ULL << 0), +}; + #define __bpf_md_ptr(type, name) \ union { \ type name; \ -- 2.25.4