From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.7 required=3.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E2FCEC433E4 for ; Sat, 11 Jul 2020 21:29:00 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id B750420720 for ; Sat, 11 Jul 2020 21:29:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728093AbgGKV3A (ORCPT ); Sat, 11 Jul 2020 17:29:00 -0400 Received: from mail-il-dmz.mellanox.com ([193.47.165.129]:47976 "EHLO mellanox.co.il" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1726779AbgGKV26 (ORCPT ); Sat, 11 Jul 2020 17:28:58 -0400 Received: from Internal Mail-Server by MTLPINE1 (envelope-from lariel@mellanox.com) with SMTP; 12 Jul 2020 00:28:49 +0300 Received: from gen-l-vrt-029.mtl.labs.mlnx (gen-l-vrt-029.mtl.labs.mlnx [10.237.29.1]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id 06BLSn23028048; Sun, 12 Jul 2020 00:28:49 +0300 From: Ariel Levkovich To: netdev@vger.kernel.org Cc: jiri@resnulli.us, kuba@kernel.org, jhs@mojatatu.com, xiyou.wangcong@gmail.com, ast@kernel.org, daniel@iogearbox.net, Ariel Levkovich , Jiri Pirko Subject: [PATCH net-next v3 2/4] net/sched: Introduce action hash Date: Sun, 12 Jul 2020 00:28:46 +0300 Message-Id: <20200711212848.20914-3-lariel@mellanox.com> X-Mailer: git-send-email 2.25.2 In-Reply-To: <20200711212848.20914-1-lariel@mellanox.com> References: <20200711212848.20914-1-lariel@mellanox.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Allow user to set a packet's hash value using a bpf program. The user provided BPF program is required to compute and return a hash value for the packet which is then stored in skb->hash. Using this action to set the skb->hash is an alternative to setting it with act_skbedit and can be useful for future HW offload support when the HW hash function is different then the kernel's hash function. By using a bpg program that emulates the HW hash function user can ensure hash consistency between the SW and the HW. Usage is as follows: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto tcp \ action hash object-file section \ action goto chain 2 Matching on the result: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x0/0xf \ action mirred egress redirect dev ens1f0_1 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x1/0xf \ action mirred egress redirect dev ens1f0_2 v1 -> v2: *Handle egress case for bpf hash properly. *Check for valid bpf fd before referencing it. *Fixed missing unlocking of tcf_lock. v2 -> v3: *Move hash algorithm asym_l4 to act_skbedit. This action only supports bpf option now. Signed-off-by: Ariel Levkovich Reviewed-by: Jiri Pirko --- include/net/act_api.h | 2 + include/net/tc_act/tc_hash.h | 20 ++ include/uapi/linux/pkt_cls.h | 1 + include/uapi/linux/tc_act/tc_hash.h | 25 ++ net/sched/Kconfig | 11 + net/sched/Makefile | 1 + net/sched/act_hash.c | 348 ++++++++++++++++++++++++++++ net/sched/cls_api.c | 1 + 8 files changed, 409 insertions(+) create mode 100644 include/net/tc_act/tc_hash.h create mode 100644 include/uapi/linux/tc_act/tc_hash.h create mode 100644 net/sched/act_hash.c diff --git a/include/net/act_api.h b/include/net/act_api.h index 8c3934880670..b7e5d060bd2f 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -12,6 +12,8 @@ #include #include +#define ACT_BPF_NAME_LEN 256 + struct tcf_idrinfo { struct mutex lock; struct idr action_idr; diff --git a/include/net/tc_act/tc_hash.h b/include/net/tc_act/tc_hash.h new file mode 100644 index 000000000000..4a9e36664813 --- /dev/null +++ b/include/net/tc_act/tc_hash.h @@ -0,0 +1,20 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef __NET_TC_HASH_H +#define __NET_TC_HASH_H + +#include +#include + +struct tcf_hash_params { + struct bpf_prog *prog; + const char *bpf_name; + struct rcu_head rcu; +}; + +struct tcf_hash { + struct tc_action common; + struct tcf_hash_params __rcu *hash_p; +}; +#define to_hash(a) ((struct tcf_hash *)a) + +#endif /* __NET_TC_HASH_H */ diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 7576209d96f9..2fd93389d091 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -135,6 +135,7 @@ enum tca_id { TCA_ID_MPLS, TCA_ID_CT, TCA_ID_GATE, + TCA_ID_HASH, /* other actions go here */ __TCA_ID_MAX = 255 }; diff --git a/include/uapi/linux/tc_act/tc_hash.h b/include/uapi/linux/tc_act/tc_hash.h new file mode 100644 index 000000000000..08937f097ed7 --- /dev/null +++ b/include/uapi/linux/tc_act/tc_hash.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +#ifndef __LINUX_TC_HASH_H +#define __LINUX_TC_HASH_H + +#include + +enum { + TCA_HASH_UNSPEC, + TCA_HASH_TM, + TCA_HASH_PARMS, + TCA_HASH_BPF_FD, + TCA_HASH_BPF_NAME, + TCA_HASH_BPF_ID, + TCA_HASH_BPF_TAG, + TCA_HASH_PAD, + __TCA_HASH_MAX, +}; +#define TCA_HASH_MAX (__TCA_HASH_MAX - 1) + +struct tc_hash { + tc_gen; +}; + +#endif diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 84badf00647e..e9725bb40f4f 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -993,6 +993,17 @@ config NET_ACT_GATE To compile this code as a module, choose M here: the module will be called act_gate. +config NET_ACT_HASH + tristate "Hash calculation action" + depends on NET_CLS_ACT + help + Say Y here to perform hash calculation on packet headers. + + If unsure, say N. + + To compile this code as a module, choose M here: the + module will be called act_hash. + config NET_IFE_SKBMARK tristate "Support to encoding decoding skb mark on IFE action" depends on NET_ACT_IFE diff --git a/net/sched/Makefile b/net/sched/Makefile index 66bbf9a98f9e..2d1415fb57db 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -25,6 +25,7 @@ obj-$(CONFIG_NET_ACT_CONNMARK) += act_connmark.o obj-$(CONFIG_NET_ACT_CTINFO) += act_ctinfo.o obj-$(CONFIG_NET_ACT_SKBMOD) += act_skbmod.o obj-$(CONFIG_NET_ACT_IFE) += act_ife.o +obj-$(CONFIG_NET_ACT_HASH) += act_hash.o obj-$(CONFIG_NET_IFE_SKBMARK) += act_meta_mark.o obj-$(CONFIG_NET_IFE_SKBPRIO) += act_meta_skbprio.o obj-$(CONFIG_NET_IFE_SKBTCINDEX) += act_meta_skbtcindex.o diff --git a/net/sched/act_hash.c b/net/sched/act_hash.c new file mode 100644 index 000000000000..88e7c9f9a2d3 --- /dev/null +++ b/net/sched/act_hash.c @@ -0,0 +1,348 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* - + * net/sched/act_hash.c Hash calculation action + * + * Author: Ariel Levkovich + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#define ACT_HASH_BPF_NAME_LEN 256 + +static unsigned int hash_net_id; +static struct tc_action_ops act_hash_ops; + +static int tcf_hash_act(struct sk_buff *skb, const struct tc_action *a, + struct tcf_result *res) +{ + bool at_ingress = skb_at_tc_ingress(skb); + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + int action; + u32 hash; + + tcf_lastuse_update(&h->tcf_tm); + tcf_action_update_bstats(&h->common, skb); + + p = rcu_dereference_bh(h->hash_p); + + action = READ_ONCE(h->tcf_action); + + if (at_ingress) { + __skb_push(skb, skb->mac_len); + bpf_compute_data_pointers(skb); + hash = BPF_PROG_RUN(p->prog, skb); + __skb_pull(skb, skb->mac_len); + } else { + bpf_compute_data_pointers(skb); + hash = BPF_PROG_RUN(p->prog, skb); + } + + /* The BPF program hash function type is + * unknown so only the sw hash bit is set. + */ + __skb_set_sw_hash(skb, hash, false); + + return action; +} + +static const struct nla_policy hash_policy[TCA_HASH_MAX + 1] = { + [TCA_HASH_PARMS] = { .type = NLA_EXACT_LEN, .len = sizeof(struct tc_hash) }, + [TCA_HASH_BPF_FD] = { .type = NLA_U32 }, + [TCA_HASH_BPF_NAME] = { .type = NLA_NUL_STRING, + .len = ACT_HASH_BPF_NAME_LEN }, +}; + +static int tcf_hash_bpf_init(struct nlattr **tb, struct tcf_hash_params *params) +{ + struct bpf_prog *fp; + char *name = NULL; + u32 bpf_fd; + + bpf_fd = nla_get_u32(tb[TCA_HASH_BPF_FD]); + + fp = bpf_prog_get_type(bpf_fd, BPF_PROG_TYPE_SCHED_ACT); + if (IS_ERR(fp)) + return PTR_ERR(fp); + + if (tb[TCA_HASH_BPF_NAME]) { + name = nla_memdup(tb[TCA_HASH_BPF_NAME], GFP_KERNEL); + if (!name) { + bpf_prog_put(fp); + return -ENOMEM; + } + } + + params->bpf_name = name; + params->prog = fp; + + return 0; +} + +static void tcf_hash_bpf_cleanup(struct tcf_hash_params *params) +{ + if (params->prog) + bpf_prog_put(params->prog); + + kfree(params->bpf_name); +} + +static int tcf_hash_init(struct net *net, struct nlattr *nla, + struct nlattr *est, struct tc_action **a, + int replace, int bind, bool rtnl_held, + struct tcf_proto *tp, u32 flags, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + struct tcf_hash_params *params, old; + struct nlattr *tb[TCA_HASH_MAX + 1]; + struct tcf_chain *goto_ch = NULL; + struct tcf_hash_params *p = NULL; + struct tc_hash *parm; + struct tcf_hash *h; + int err, res = 0; + u32 index; + + if (!nla) { + NL_SET_ERR_MSG_MOD(extack, "Hash requires attributes to be passed"); + return -EINVAL; + } + + err = nla_parse_nested(tb, TCA_HASH_MAX, nla, hash_policy, extack); + if (err < 0) + return err; + + if (!tb[TCA_HASH_PARMS]) { + NL_SET_ERR_MSG_MOD(extack, "Missing required hash parameters"); + return -EINVAL; + } + parm = nla_data(tb[TCA_HASH_PARMS]); + index = parm->index; + + err = tcf_idr_check_alloc(tn, &index, a, bind); + if (err < 0) + return err; + + if (!err) { + err = tcf_idr_create_from_flags(tn, index, est, a, + &act_hash_ops, bind, flags); + if (err) { + tcf_idr_cleanup(tn, index); + return err; + } + res = ACT_P_CREATED; + } else { + if (bind) + return 0; + + if (!replace) { + tcf_idr_release(*a, bind); + return -EEXIST; + } + } + err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack); + if (err < 0) + goto release_idr; + + h = to_hash(*a); + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (unlikely(!p)) { + err = -ENOMEM; + goto cleanup; + } + + if (!tb[TCA_HASH_BPF_FD]) { + NL_SET_ERR_MSG_MOD(extack, "Missing hash bpf fd"); + err = -EINVAL; + goto cleanup; + } + + spin_lock_bh(&h->tcf_lock); + + if (res != ACT_P_CREATED) { + params = rcu_dereference_protected(h->hash_p, 1); + old.prog = params->prog; + old.bpf_name = params->bpf_name; + } + + err = tcf_hash_bpf_init(tb, p); + if (err) + goto unlock; + + goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch); + p = rcu_replace_pointer(h->hash_p, p, + lockdep_is_held(&h->tcf_lock)); + spin_unlock_bh(&h->tcf_lock); + + if (goto_ch) + tcf_chain_put_by_act(goto_ch); + if (p) + kfree_rcu(p, rcu); + + if (res == ACT_P_CREATED) { + tcf_idr_insert(tn, *a); + } else { + synchronize_rcu(); + tcf_hash_bpf_cleanup(&old); + } + + return res; + +unlock: + spin_unlock_bh(&h->tcf_lock); + +cleanup: + if (goto_ch) + tcf_chain_put_by_act(goto_ch); + kfree(p); + +release_idr: + tcf_idr_release(*a, bind); + return err; +} + +static void tcf_hash_cleanup(struct tc_action *a) +{ + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + + p = rcu_dereference_protected(h->hash_p, 1); + if (p) { + tcf_hash_bpf_cleanup(p); + kfree_rcu(p, rcu); + } +} + +static int tcf_hash_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) +{ + unsigned char *tp = skb_tail_pointer(skb); + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + struct tc_hash opt = { + .index = h->tcf_index, + .refcnt = refcount_read(&h->tcf_refcnt) - ref, + .bindcnt = atomic_read(&h->tcf_bindcnt) - bind, + }; + struct nlattr *nla; + struct tcf_t tm; + + spin_lock_bh(&h->tcf_lock); + opt.action = h->tcf_action; + p = rcu_dereference_protected(h->hash_p, lockdep_is_held(&h->tcf_lock)); + + if (nla_put(skb, TCA_HASH_PARMS, sizeof(opt), &opt)) + goto nla_put_failure; + + if (p->bpf_name && nla_put_string(skb, TCA_HASH_BPF_NAME, p->bpf_name)) + goto nla_put_failure; + + if (nla_put_u32(skb, TCA_HASH_BPF_ID, p->prog->aux->id)) + goto nla_put_failure; + + nla = nla_reserve(skb, TCA_HASH_BPF_TAG, sizeof(p->prog->tag)); + if (!nla) + goto nla_put_failure; + + memcpy(nla_data(nla), p->prog->tag, nla_len(nla)); + + tcf_tm_dump(&tm, &h->tcf_tm); + if (nla_put_64bit(skb, TCA_HASH_TM, sizeof(tm), &tm, + TCA_HASH_PAD)) + goto nla_put_failure; + + spin_unlock_bh(&h->tcf_lock); + return skb->len; + +nla_put_failure: + spin_unlock_bh(&h->tcf_lock); + nlmsg_trim(skb, tp); + return -1; +} + +static int tcf_hash_walker(struct net *net, struct sk_buff *skb, + struct netlink_callback *cb, int type, + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tcf_generic_walker(tn, skb, cb, type, ops, extack); +} + +static int tcf_hash_search(struct net *net, struct tc_action **a, u32 index) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tcf_idr_search(tn, a, index); +} + +static void tcf_hash_stats_update(struct tc_action *a, u64 bytes, u32 packets, + u64 lastuse, bool hw) +{ + struct tcf_hash *h = to_hash(a); + + tcf_action_update_stats(a, bytes, packets, false, hw); + h->tcf_tm.lastuse = max_t(u64, h->tcf_tm.lastuse, lastuse); +} + +static struct tc_action_ops act_hash_ops = { + .kind = "hash", + .id = TCA_ID_HASH, + .owner = THIS_MODULE, + .act = tcf_hash_act, + .dump = tcf_hash_dump, + .init = tcf_hash_init, + .cleanup = tcf_hash_cleanup, + .walk = tcf_hash_walker, + .lookup = tcf_hash_search, + .stats_update = tcf_hash_stats_update, + .size = sizeof(struct tcf_hash), +}; + +static __net_init int hash_init_net(struct net *net) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tc_action_net_init(net, tn, &act_hash_ops); +} + +static void __net_exit hash_exit_net(struct list_head *net_list) +{ + tc_action_net_exit(net_list, hash_net_id); +} + +static struct pernet_operations hash_net_ops = { + .init = hash_init_net, + .exit_batch = hash_exit_net, + .id = &hash_net_id, + .size = sizeof(struct tc_action_net), +}; + +static int __init hash_init_module(void) +{ + return tcf_register_action(&act_hash_ops, &hash_net_ops); +} + +static void __exit hash_cleanup_module(void) +{ + tcf_unregister_action(&act_hash_ops, &hash_net_ops); +} + +module_init(hash_init_module); +module_exit(hash_cleanup_module); + +MODULE_AUTHOR("Ariel Levkovich "); +MODULE_DESCRIPTION("Packet hash action"); +MODULE_LICENSE("GPL v2"); diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index a00a203b2ef5..6d7eb249e557 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -40,6 +40,7 @@ #include #include #include +#include #include extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1]; -- 2.25.2