* [PATCH net-next v2 0/3] ] TC datapath hash api @ 2020-07-01 18:47 Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich ` (3 more replies) 0 siblings, 4 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-01 18:47 UTC (permalink / raw) To: netdev; +Cc: jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Ariel Levkovich Supporting datapath hash allows user to set up rules that provide load balancing of traffic across multiple vports and for ECMP path selection while keeping the number of rule at minimum. Instead of matching on exact flow spec, which requires a rule per flow, user can define rules based on hashing on the packet headers and distribute the flows to different buckets. The number of rules in this case will be constant and equal to the number of buckets. The datapath hash functionality is achieved in two steps - performing the hash action and then matching on the result, as part of the packet's classification. The api allows user to define a filter with a tc hash action where the hash function can be standard asymetric hashing that Linux offers or alternatively user can provide a bpf program that performs hash calculation on a packet. Usage is as follows: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto tcp \ action hash bpf object-file <file> \ action goto chain 2 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto udp \ action hash bpf asym_l4 basis <basis> \ action goto chain 2 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x0/0xf \ action mirred egress redirect dev ens1f0_1 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x1/0xf \ action mirred egress redirect dev ens1f0_2 Ariel Levkovich (3): net/sched: Introduce action hash net/flow_dissector: add packet hash dissection net/sched: cls_flower: Add hash info to flow classification include/linux/skbuff.h | 4 + include/net/act_api.h | 2 + include/net/flow_dissector.h | 9 + include/net/tc_act/tc_hash.h | 22 ++ include/uapi/linux/pkt_cls.h | 4 + include/uapi/linux/tc_act/tc_hash.h | 32 +++ net/core/flow_dissector.c | 17 ++ net/sched/Kconfig | 11 + net/sched/Makefile | 1 + net/sched/act_hash.c | 389 ++++++++++++++++++++++++++++ net/sched/cls_api.c | 1 + net/sched/cls_flower.c | 16 ++ 12 files changed, 508 insertions(+) create mode 100644 include/net/tc_act/tc_hash.h create mode 100644 include/uapi/linux/tc_act/tc_hash.h create mode 100644 net/sched/act_hash.c -- 2.25.2 ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net-next v2 1/3] net/sched: Introduce action hash 2020-07-01 18:47 [PATCH net-next v2 0/3] ] TC datapath hash api Ariel Levkovich @ 2020-07-01 18:47 ` Ariel Levkovich 2020-07-01 22:20 ` kernel test robot 2020-07-02 20:52 ` Cong Wang 2020-07-01 18:47 ` [PATCH net-next v2 2/3] net/flow_dissector: add packet hash dissection Ariel Levkovich ` (2 subsequent siblings) 3 siblings, 2 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-01 18:47 UTC (permalink / raw) To: netdev Cc: jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Ariel Levkovich, Jiri Pirko Allow setting a hash value to a packet for a future match. The action will determine the packet's hash result according to the selected hash type. The first option is to select a basic asymmetric l4 hash calculation on the packet headers which will either use the skb->hash value as if such was already calculated and set by the device driver, or it will perform the kernel jenkins hash function on the packet which will generate the result otherwise. The other option is for user to provide an BPF program which is dedicated to calculate the hash. In such case the program is loaded and used by tc to perform the hash calculation and provide it to the hash action to be stored in skb->hash field. The BPF option can be useful for future HW offload support of the hash calculation by emulating the HW hash function when it's different than the kernel's but yet we want to maintain consistency between the SW and the HW. Usage is as follows: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto tcp \ action hash bpf object-file <bpf file> \ action goto chain 2 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto udp \ action hash asym_l4 basis <basis> \ action goto chain 2 Matching on the result: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x0/0xf \ action mirred egress redirect dev ens1f0_1 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ flower hash 0x1/0xf \ action mirred egress redirect dev ens1f0_2 v1 -> v2: *Handle egress case for bpf hash properly. *Check for valid bpf fd before referencing it. *Fixed missing unlocking of tcf_lock. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> --- include/net/act_api.h | 2 + include/net/tc_act/tc_hash.h | 22 ++ include/uapi/linux/pkt_cls.h | 1 + include/uapi/linux/tc_act/tc_hash.h | 32 +++ net/sched/Kconfig | 11 + net/sched/Makefile | 1 + net/sched/act_hash.c | 389 ++++++++++++++++++++++++++++ net/sched/cls_api.c | 1 + 8 files changed, 459 insertions(+) create mode 100644 include/net/tc_act/tc_hash.h create mode 100644 include/uapi/linux/tc_act/tc_hash.h create mode 100644 net/sched/act_hash.c diff --git a/include/net/act_api.h b/include/net/act_api.h index 8c3934880670..b7e5d060bd2f 100644 --- a/include/net/act_api.h +++ b/include/net/act_api.h @@ -12,6 +12,8 @@ #include <net/net_namespace.h> #include <net/netns/generic.h> +#define ACT_BPF_NAME_LEN 256 + struct tcf_idrinfo { struct mutex lock; struct idr action_idr; diff --git a/include/net/tc_act/tc_hash.h b/include/net/tc_act/tc_hash.h new file mode 100644 index 000000000000..f03bb19709e5 --- /dev/null +++ b/include/net/tc_act/tc_hash.h @@ -0,0 +1,22 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef __NET_TC_HASH_H +#define __NET_TC_HASH_H + +#include <net/act_api.h> +#include <uapi/linux/tc_act/tc_hash.h> + +struct tcf_hash_params { + enum tc_hash_alg alg; + u32 basis; + struct bpf_prog *prog; + const char *bpf_name; + struct rcu_head rcu; +}; + +struct tcf_hash { + struct tc_action common; + struct tcf_hash_params __rcu *hash_p; +}; +#define to_hash(a) ((struct tcf_hash *)a) + +#endif /* __NET_TC_HASH_H */ diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 7576209d96f9..2fd93389d091 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -135,6 +135,7 @@ enum tca_id { TCA_ID_MPLS, TCA_ID_CT, TCA_ID_GATE, + TCA_ID_HASH, /* other actions go here */ __TCA_ID_MAX = 255 }; diff --git a/include/uapi/linux/tc_act/tc_hash.h b/include/uapi/linux/tc_act/tc_hash.h new file mode 100644 index 000000000000..ff3063f70ee0 --- /dev/null +++ b/include/uapi/linux/tc_act/tc_hash.h @@ -0,0 +1,32 @@ +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ + +#ifndef __LINUX_TC_HASH_H +#define __LINUX_TC_HASH_H + +#include <linux/pkt_cls.h> + +enum tc_hash_alg { + TCA_HASH_ALG_L4, + TCA_HASH_ALG_BPF, +}; + +enum { + TCA_HASH_UNSPEC, + TCA_HASH_TM, + TCA_HASH_PARMS, + TCA_HASH_ALG, + TCA_HASH_BASIS, + TCA_HASH_BPF_FD, + TCA_HASH_BPF_NAME, + TCA_HASH_BPF_ID, + TCA_HASH_BPF_TAG, + TCA_HASH_PAD, + __TCA_HASH_MAX, +}; +#define TCA_HASH_MAX (__TCA_HASH_MAX - 1) + +struct tc_hash { + tc_gen; +}; + +#endif diff --git a/net/sched/Kconfig b/net/sched/Kconfig index 84badf00647e..e9725bb40f4f 100644 --- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -993,6 +993,17 @@ config NET_ACT_GATE To compile this code as a module, choose M here: the module will be called act_gate. +config NET_ACT_HASH + tristate "Hash calculation action" + depends on NET_CLS_ACT + help + Say Y here to perform hash calculation on packet headers. + + If unsure, say N. + + To compile this code as a module, choose M here: the + module will be called act_hash. + config NET_IFE_SKBMARK tristate "Support to encoding decoding skb mark on IFE action" depends on NET_ACT_IFE diff --git a/net/sched/Makefile b/net/sched/Makefile index 66bbf9a98f9e..2d1415fb57db 100644 --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -25,6 +25,7 @@ obj-$(CONFIG_NET_ACT_CONNMARK) += act_connmark.o obj-$(CONFIG_NET_ACT_CTINFO) += act_ctinfo.o obj-$(CONFIG_NET_ACT_SKBMOD) += act_skbmod.o obj-$(CONFIG_NET_ACT_IFE) += act_ife.o +obj-$(CONFIG_NET_ACT_HASH) += act_hash.o obj-$(CONFIG_NET_IFE_SKBMARK) += act_meta_mark.o obj-$(CONFIG_NET_IFE_SKBPRIO) += act_meta_skbprio.o obj-$(CONFIG_NET_IFE_SKBTCINDEX) += act_meta_skbtcindex.o diff --git a/net/sched/act_hash.c b/net/sched/act_hash.c new file mode 100644 index 000000000000..72dac9deb1ab --- /dev/null +++ b/net/sched/act_hash.c @@ -0,0 +1,389 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* - + * net/sched/act_hash.c Hash calculation action + * + * Author: Ariel Levkovich <lariel@mellanox.com> + */ + +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/rtnetlink.h> +#include <linux/skbuff.h> +#include <linux/filter.h> +#include <net/netlink.h> +#include <net/pkt_sched.h> +#include <net/pkt_cls.h> +#include <linux/tc_act/tc_hash.h> +#include <net/tc_act/tc_hash.h> + +#define ACT_HASH_BPF_NAME_LEN 256 + +static unsigned int hash_net_id; +static struct tc_action_ops act_hash_ops; + +static int tcf_hash_act(struct sk_buff *skb, const struct tc_action *a, + struct tcf_result *res) +{ + bool at_ingress = skb_at_tc_ingress(skb); + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + int action; + u32 hash; + + tcf_lastuse_update(&h->tcf_tm); + tcf_action_update_bstats(&h->common, skb); + + p = rcu_dereference_bh(h->hash_p); + + action = READ_ONCE(h->tcf_action); + + switch (p->alg) { + case TCA_HASH_ALG_L4: + hash = skb_get_hash(skb); + /* If a hash basis was provided, add it into + * hash calculation here and re-set skb->hash + * to the new result with sw_hash indication + * and keeping the l4 hash indication. + */ + hash = jhash_1word(hash, p->basis); + __skb_set_sw_hash(skb, hash, skb->l4_hash); + break; + case TCA_HASH_ALG_BPF: + if (at_ingress) { + __skb_push(skb, skb->mac_len); + bpf_compute_data_pointers(skb); + hash = BPF_PROG_RUN(p->prog, skb); + __skb_pull(skb, skb->mac_len); + } else { + bpf_compute_data_pointers(skb); + hash = BPF_PROG_RUN(p->prog, skb); + } + + /* The BPF program hash function type is + * unknown so only the sw hash bit is set. + */ + __skb_set_sw_hash(skb, hash, false); + break; + } + + return action; +} + +static const struct nla_policy hash_policy[TCA_HASH_MAX + 1] = { + [TCA_HASH_PARMS] = { .type = NLA_EXACT_LEN, .len = sizeof(struct tc_hash) }, + [TCA_HASH_ALG] = { .type = NLA_U32 }, + [TCA_HASH_BASIS] = { .type = NLA_U32 }, + [TCA_HASH_BPF_FD] = { .type = NLA_U32 }, + [TCA_HASH_BPF_NAME] = { .type = NLA_NUL_STRING, + .len = ACT_HASH_BPF_NAME_LEN }, +}; + +static int tcf_hash_bpf_init(struct nlattr **tb, struct tcf_hash_params *params) +{ + struct bpf_prog *fp; + char *name = NULL; + u32 bpf_fd; + + if (!tb[TCA_HASH_BPF_FD]) + return -EINVAL; + + bpf_fd = nla_get_u32(tb[TCA_HASH_BPF_FD]); + + fp = bpf_prog_get_type(bpf_fd, BPF_PROG_TYPE_SCHED_ACT); + if (IS_ERR(fp)) + return PTR_ERR(fp); + + if (tb[TCA_HASH_BPF_NAME]) { + name = nla_memdup(tb[TCA_HASH_BPF_NAME], GFP_KERNEL); + if (!name) { + bpf_prog_put(fp); + return -ENOMEM; + } + } + + params->bpf_name = name; + params->prog = fp; + + return 0; +} + +static void tcf_hash_bpf_cleanup(struct tcf_hash_params *params) +{ + if (params->prog) + bpf_prog_put(params->prog); + + kfree(params->bpf_name); +} + +static int tcf_hash_init(struct net *net, struct nlattr *nla, + struct nlattr *est, struct tc_action **a, + int replace, int bind, bool rtnl_held, + struct tcf_proto *tp, u32 flags, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + struct tcf_hash_params *params, old; + struct nlattr *tb[TCA_HASH_MAX + 1]; + struct tcf_chain *goto_ch = NULL; + struct tcf_hash_params *p = NULL; + struct tc_hash *parm; + struct tcf_hash *h; + int err, res = 0; + u32 index; + + if (!nla) { + NL_SET_ERR_MSG_MOD(extack, "Hash requires attributes to be passed"); + return -EINVAL; + } + + err = nla_parse_nested(tb, TCA_HASH_MAX, nla, hash_policy, extack); + if (err < 0) + return err; + + if (!tb[TCA_HASH_PARMS]) { + NL_SET_ERR_MSG_MOD(extack, "Missing required hash parameters"); + return -EINVAL; + } + parm = nla_data(tb[TCA_HASH_PARMS]); + index = parm->index; + + err = tcf_idr_check_alloc(tn, &index, a, bind); + if (err < 0) + return err; + + if (!err) { + err = tcf_idr_create_from_flags(tn, index, est, a, + &act_hash_ops, bind, flags); + if (err) { + tcf_idr_cleanup(tn, index); + return err; + } + res = ACT_P_CREATED; + } else { + if (bind) + return 0; + + if (!replace) { + tcf_idr_release(*a, bind); + return -EEXIST; + } + } + err = tcf_action_check_ctrlact(parm->action, tp, &goto_ch, extack); + if (err < 0) + goto release_idr; + + h = to_hash(*a); + + p = kzalloc(sizeof(*p), GFP_KERNEL); + if (unlikely(!p)) { + err = -ENOMEM; + goto cleanup; + } + + if (!tb[TCA_HASH_ALG]) { + NL_SET_ERR_MSG_MOD(extack, "Missing hash algorithm selection"); + err = -EINVAL; + goto cleanup; + } + + p->alg = nla_get_u32(tb[TCA_HASH_ALG]); + + spin_lock_bh(&h->tcf_lock); + + switch (p->alg) { + case TCA_HASH_ALG_L4: + break; + case TCA_HASH_ALG_BPF: + if (res != ACT_P_CREATED) { + params = rcu_dereference_protected(h->hash_p, 1); + old.prog = params->prog; + old.bpf_name = params->bpf_name; + } + + err = tcf_hash_bpf_init(tb, p); + if (err) + goto unlock; + + break; + default: + NL_SET_ERR_MSG_MOD(extack, "Hash type not supported"); + err = -EINVAL; + goto unlock; + } + + if (tb[TCA_HASH_BASIS]) + p->basis = nla_get_u32(tb[TCA_HASH_BASIS]); + + goto_ch = tcf_action_set_ctrlact(*a, parm->action, goto_ch); + p = rcu_replace_pointer(h->hash_p, p, + lockdep_is_held(&h->tcf_lock)); + spin_unlock_bh(&h->tcf_lock); + + if (goto_ch) + tcf_chain_put_by_act(goto_ch); + if (p) + kfree_rcu(p, rcu); + + if (res == ACT_P_CREATED) { + tcf_idr_insert(tn, *a); + } else { + synchronize_rcu(); + tcf_hash_bpf_cleanup(&old); + } + + return res; + +unlock: + spin_unlock_bh(&h->tcf_lock); + +cleanup: + if (goto_ch) + tcf_chain_put_by_act(goto_ch); + kfree(p); + +release_idr: + tcf_idr_release(*a, bind); + return err; +} + +static void tcf_hash_cleanup(struct tc_action *a) +{ + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + + p = rcu_dereference_protected(h->hash_p, 1); + if (p) { + tcf_hash_bpf_cleanup(p); + kfree_rcu(p, rcu); + } +} + +static int tcf_hash_dump(struct sk_buff *skb, struct tc_action *a, + int bind, int ref) +{ + unsigned char *tp = skb_tail_pointer(skb); + struct tcf_hash *h = to_hash(a); + struct tcf_hash_params *p; + struct tc_hash opt = { + .index = h->tcf_index, + .refcnt = refcount_read(&h->tcf_refcnt) - ref, + .bindcnt = atomic_read(&h->tcf_bindcnt) - bind, + }; + struct nlattr *nla; + struct tcf_t tm; + + spin_lock_bh(&h->tcf_lock); + opt.action = h->tcf_action; + p = rcu_dereference_protected(h->hash_p, lockdep_is_held(&h->tcf_lock)); + + if (nla_put(skb, TCA_HASH_PARMS, sizeof(opt), &opt)) + goto nla_put_failure; + + if (nla_put_u32(skb, TCA_HASH_ALG, p->alg)) + goto nla_put_failure; + + if (nla_put_u32(skb, TCA_HASH_BASIS, p->basis)) + goto nla_put_failure; + + if (p->alg == TCA_HASH_ALG_BPF && p->bpf_name) { + if (nla_put_string(skb, TCA_HASH_BPF_NAME, p->bpf_name)) + goto nla_put_failure; + if (nla_put_u32(skb, TCA_HASH_BPF_ID, p->prog->aux->id)) + goto nla_put_failure; + nla = nla_reserve(skb, TCA_HASH_BPF_TAG, sizeof(p->prog->tag)); + if (!nla) + goto nla_put_failure; + + memcpy(nla_data(nla), p->prog->tag, nla_len(nla)); + } + + tcf_tm_dump(&tm, &h->tcf_tm); + if (nla_put_64bit(skb, TCA_HASH_TM, sizeof(tm), &tm, + TCA_HASH_PAD)) + goto nla_put_failure; + + spin_unlock_bh(&h->tcf_lock); + return skb->len; + +nla_put_failure: + spin_unlock_bh(&h->tcf_lock); + nlmsg_trim(skb, tp); + return -1; +} + +static int tcf_hash_walker(struct net *net, struct sk_buff *skb, + struct netlink_callback *cb, int type, + const struct tc_action_ops *ops, + struct netlink_ext_ack *extack) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tcf_generic_walker(tn, skb, cb, type, ops, extack); +} + +static int tcf_hash_search(struct net *net, struct tc_action **a, u32 index) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tcf_idr_search(tn, a, index); +} + +static void tcf_hash_stats_update(struct tc_action *a, u64 bytes, u32 packets, + u64 lastuse, bool hw) +{ + struct tcf_hash *h = to_hash(a); + + tcf_action_update_stats(a, bytes, packets, false, hw); + h->tcf_tm.lastuse = max_t(u64, h->tcf_tm.lastuse, lastuse); +} + +static struct tc_action_ops act_hash_ops = { + .kind = "hash", + .id = TCA_ID_HASH, + .owner = THIS_MODULE, + .act = tcf_hash_act, + .dump = tcf_hash_dump, + .init = tcf_hash_init, + .cleanup = tcf_hash_cleanup, + .walk = tcf_hash_walker, + .lookup = tcf_hash_search, + .stats_update = tcf_hash_stats_update, + .size = sizeof(struct tcf_hash), +}; + +static __net_init int hash_init_net(struct net *net) +{ + struct tc_action_net *tn = net_generic(net, hash_net_id); + + return tc_action_net_init(net, tn, &act_hash_ops); +} + +static void __net_exit hash_exit_net(struct list_head *net_list) +{ + tc_action_net_exit(net_list, hash_net_id); +} + +static struct pernet_operations hash_net_ops = { + .init = hash_init_net, + .exit_batch = hash_exit_net, + .id = &hash_net_id, + .size = sizeof(struct tc_action_net), +}; + +static int __init hash_init_module(void) +{ + return tcf_register_action(&act_hash_ops, &hash_net_ops); +} + +static void __exit hash_cleanup_module(void) +{ + tcf_unregister_action(&act_hash_ops, &hash_net_ops); +} + +module_init(hash_init_module); +module_exit(hash_cleanup_module); + +MODULE_AUTHOR("Ariel Levkovich <lariel@mellanox.com>"); +MODULE_DESCRIPTION("Packet hash action"); +MODULE_LICENSE("GPL v2"); diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index a00a203b2ef5..6d7eb249e557 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -40,6 +40,7 @@ #include <net/tc_act/tc_ct.h> #include <net/tc_act/tc_mpls.h> #include <net/tc_act/tc_gate.h> +#include <net/tc_act/tc_hash.h> #include <net/flow_offload.h> extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1]; -- 2.25.2 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 1/3] net/sched: Introduce action hash 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich @ 2020-07-01 22:20 ` kernel test robot 2020-07-02 20:52 ` Cong Wang 1 sibling, 0 replies; 20+ messages in thread From: kernel test robot @ 2020-07-01 22:20 UTC (permalink / raw) To: Ariel Levkovich, netdev Cc: kbuild-all, jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Ariel Levkovich, Jiri Pirko [-- Attachment #1: Type: text/plain, Size: 2285 bytes --] Hi Ariel, Thank you for the patch! Yet something to improve: [auto build test ERROR on net-next/master] url: https://github.com/0day-ci/linux/commits/Ariel-Levkovich/TC-datapath-hash-api/20200702-025138 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 2b04a66156159592156a97553057e8c36de2ee70 config: riscv-allyesconfig (attached as .config) compiler: riscv64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> net/sched/act_hash.c:351:24: error: initialization of 'void (*)(struct tc_action *, u64, u64, u64, u64, bool)' {aka 'void (*)(struct tc_action *, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, _Bool)'} from incompatible pointer type 'void (*)(struct tc_action *, u64, u32, u64, bool)' {aka 'void (*)(struct tc_action *, long long unsigned int, unsigned int, long long unsigned int, _Bool)'} [-Werror=incompatible-pointer-types] 351 | .stats_update = tcf_hash_stats_update, | ^~~~~~~~~~~~~~~~~~~~~ net/sched/act_hash.c:351:24: note: (near initialization for 'act_hash_ops.stats_update') cc1: some warnings being treated as errors vim +351 net/sched/act_hash.c 340 341 static struct tc_action_ops act_hash_ops = { 342 .kind = "hash", 343 .id = TCA_ID_HASH, 344 .owner = THIS_MODULE, 345 .act = tcf_hash_act, 346 .dump = tcf_hash_dump, 347 .init = tcf_hash_init, 348 .cleanup = tcf_hash_cleanup, 349 .walk = tcf_hash_walker, 350 .lookup = tcf_hash_search, > 351 .stats_update = tcf_hash_stats_update, 352 .size = sizeof(struct tcf_hash), 353 }; 354 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org [-- Attachment #2: .config.gz --] [-- Type: application/gzip, Size: 65092 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 1/3] net/sched: Introduce action hash @ 2020-07-01 22:20 ` kernel test robot 0 siblings, 0 replies; 20+ messages in thread From: kernel test robot @ 2020-07-01 22:20 UTC (permalink / raw) To: kbuild-all [-- Attachment #1: Type: text/plain, Size: 2335 bytes --] Hi Ariel, Thank you for the patch! Yet something to improve: [auto build test ERROR on net-next/master] url: https://github.com/0day-ci/linux/commits/Ariel-Levkovich/TC-datapath-hash-api/20200702-025138 base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 2b04a66156159592156a97553057e8c36de2ee70 config: riscv-allyesconfig (attached as .config) compiler: riscv64-linux-gcc (GCC) 9.3.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=riscv If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> net/sched/act_hash.c:351:24: error: initialization of 'void (*)(struct tc_action *, u64, u64, u64, u64, bool)' {aka 'void (*)(struct tc_action *, long long unsigned int, long long unsigned int, long long unsigned int, long long unsigned int, _Bool)'} from incompatible pointer type 'void (*)(struct tc_action *, u64, u32, u64, bool)' {aka 'void (*)(struct tc_action *, long long unsigned int, unsigned int, long long unsigned int, _Bool)'} [-Werror=incompatible-pointer-types] 351 | .stats_update = tcf_hash_stats_update, | ^~~~~~~~~~~~~~~~~~~~~ net/sched/act_hash.c:351:24: note: (near initialization for 'act_hash_ops.stats_update') cc1: some warnings being treated as errors vim +351 net/sched/act_hash.c 340 341 static struct tc_action_ops act_hash_ops = { 342 .kind = "hash", 343 .id = TCA_ID_HASH, 344 .owner = THIS_MODULE, 345 .act = tcf_hash_act, 346 .dump = tcf_hash_dump, 347 .init = tcf_hash_init, 348 .cleanup = tcf_hash_cleanup, 349 .walk = tcf_hash_walker, 350 .lookup = tcf_hash_search, > 351 .stats_update = tcf_hash_stats_update, 352 .size = sizeof(struct tcf_hash), 353 }; 354 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org [-- Attachment #2: config.gz --] [-- Type: application/gzip, Size: 65092 bytes --] ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 1/3] net/sched: Introduce action hash 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich 2020-07-01 22:20 ` kernel test robot @ 2020-07-02 20:52 ` Cong Wang 1 sibling, 0 replies; 20+ messages in thread From: Cong Wang @ 2020-07-02 20:52 UTC (permalink / raw) To: Ariel Levkovich Cc: Linux Kernel Network Developers, Jiri Pirko, Jakub Kicinski, Jamal Hadi Salim, Alexei Starovoitov, Daniel Borkmann, Jiri Pirko On Wed, Jul 1, 2020 at 11:47 AM Ariel Levkovich <lariel@mellanox.com> wrote: > > Allow setting a hash value to a packet for a future match. > > The action will determine the packet's hash result according to > the selected hash type. > > The first option is to select a basic asymmetric l4 hash calculation > on the packet headers which will either use the skb->hash value as > if such was already calculated and set by the device driver, or it > will perform the kernel jenkins hash function on the packet which will > generate the result otherwise. > > The other option is for user to provide an BPF program which is > dedicated to calculate the hash. In such case the program is loaded > and used by tc to perform the hash calculation and provide it to > the hash action to be stored in skb->hash field. > > The BPF option can be useful for future HW offload support of the hash > calculation by emulating the HW hash function when it's different than > the kernel's but yet we want to maintain consistency between the SW and > the HW. In previous discussion, people mentioned act_skbedit. If you have a legitimate reason for adding a new action instead of simply extending act_skbedit, you have to mention it here or in the cover letter. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* [PATCH net-next v2 2/3] net/flow_dissector: add packet hash dissection 2020-07-01 18:47 [PATCH net-next v2 0/3] ] TC datapath hash api Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich @ 2020-07-01 18:47 ` Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 3/3] net/sched: cls_flower: Add hash info to flow classification Ariel Levkovich 2020-07-03 11:22 ` [PATCH net-next v2 0/3] ] TC datapath hash api Jamal Hadi Salim 3 siblings, 0 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-01 18:47 UTC (permalink / raw) To: netdev Cc: jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Ariel Levkovich, Jiri Pirko Retreive a hash value from the SKB and store it in the dissector key for future matching. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> --- include/linux/skbuff.h | 4 ++++ include/net/flow_dissector.h | 9 +++++++++ net/core/flow_dissector.c | 17 +++++++++++++++++ 3 files changed, 30 insertions(+) diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 0c0377fc00c2..beb7fe2c7809 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -1342,6 +1342,10 @@ skb_flow_dissect_tunnel_info(const struct sk_buff *skb, struct flow_dissector *flow_dissector, void *target_container); +void skb_flow_dissect_hash(const struct sk_buff *skb, + struct flow_dissector *flow_dissector, + void *target_container); + static inline __u32 skb_get_hash(struct sk_buff *skb) { if (!skb->l4_hash && !skb->sw_hash) diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index a7eba43fe4e4..5cc0540ce3f7 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -243,6 +243,14 @@ struct flow_dissector_key_ct { u32 ct_labels[4]; }; +/** + * struct flow_dissector_key_hash: + * @hash: hash value + */ +struct flow_dissector_key_hash { + u32 hash; +}; + enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_CONTROL, /* struct flow_dissector_key_control */ FLOW_DISSECTOR_KEY_BASIC, /* struct flow_dissector_key_basic */ @@ -271,6 +279,7 @@ enum flow_dissector_key_id { FLOW_DISSECTOR_KEY_ENC_OPTS, /* struct flow_dissector_key_enc_opts */ FLOW_DISSECTOR_KEY_META, /* struct flow_dissector_key_meta */ FLOW_DISSECTOR_KEY_CT, /* struct flow_dissector_key_ct */ + FLOW_DISSECTOR_KEY_HASH, /* struct flow_dissector_key_hash */ FLOW_DISSECTOR_KEY_MAX, }; diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index d02df0b6d0d9..c114f0e3ef4f 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -392,6 +392,23 @@ skb_flow_dissect_tunnel_info(const struct sk_buff *skb, } EXPORT_SYMBOL(skb_flow_dissect_tunnel_info); +void skb_flow_dissect_hash(const struct sk_buff *skb, + struct flow_dissector *flow_dissector, + void *target_container) +{ + struct flow_dissector_key_hash *key; + + if (!dissector_uses_key(flow_dissector, FLOW_DISSECTOR_KEY_HASH)) + return; + + key = skb_flow_dissector_target(flow_dissector, + FLOW_DISSECTOR_KEY_HASH, + target_container); + + key->hash = skb_get_hash_raw(skb); +} +EXPORT_SYMBOL(skb_flow_dissect_hash); + static enum flow_dissect_ret __skb_flow_dissect_mpls(const struct sk_buff *skb, struct flow_dissector *flow_dissector, -- 2.25.2 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* [PATCH net-next v2 3/3] net/sched: cls_flower: Add hash info to flow classification 2020-07-01 18:47 [PATCH net-next v2 0/3] ] TC datapath hash api Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 2/3] net/flow_dissector: add packet hash dissection Ariel Levkovich @ 2020-07-01 18:47 ` Ariel Levkovich 2020-07-03 11:22 ` [PATCH net-next v2 0/3] ] TC datapath hash api Jamal Hadi Salim 3 siblings, 0 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-01 18:47 UTC (permalink / raw) To: netdev Cc: jiri, kuba, jhs, xiyou.wangcong, ast, daniel, Ariel Levkovich, Jiri Pirko Adding new cls flower keys for hash value and hash mask and dissect the hash info from the skb into the flow key towards flow classication. Signed-off-by: Ariel Levkovich <lariel@mellanox.com> Reviewed-by: Jiri Pirko <jiri@mellanox.com> --- include/uapi/linux/pkt_cls.h | 3 +++ net/sched/cls_flower.c | 16 ++++++++++++++++ 2 files changed, 19 insertions(+) diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h index 2fd93389d091..ef145320ee99 100644 --- a/include/uapi/linux/pkt_cls.h +++ b/include/uapi/linux/pkt_cls.h @@ -579,6 +579,9 @@ enum { TCA_FLOWER_KEY_MPLS_OPTS, + TCA_FLOWER_KEY_HASH, /* u32 */ + TCA_FLOWER_KEY_HASH_MASK, /* u32 */ + __TCA_FLOWER_MAX, }; diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index b2da37286082..ff739e0d86fc 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -64,6 +64,7 @@ struct fl_flow_key { }; } tp_range; struct flow_dissector_key_ct ct; + struct flow_dissector_key_hash hash; } __aligned(BITS_PER_LONG / 8); /* Ensure that we can do comparisons as longs. */ struct fl_flow_mask_range { @@ -318,6 +319,7 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp, skb_flow_dissect_ct(skb, &mask->dissector, &skb_key, fl_ct_info_to_flower_map, ARRAY_SIZE(fl_ct_info_to_flower_map)); + skb_flow_dissect_hash(skb, &mask->dissector, &skb_key); skb_flow_dissect(skb, &mask->dissector, &skb_key, 0); f = fl_mask_lookup(mask, &skb_key); @@ -694,6 +696,9 @@ static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = { [TCA_FLOWER_KEY_CT_LABELS_MASK] = { .type = NLA_BINARY, .len = 128 / BITS_PER_BYTE }, [TCA_FLOWER_FLAGS] = { .type = NLA_U32 }, + [TCA_FLOWER_KEY_HASH] = { .type = NLA_U32 }, + [TCA_FLOWER_KEY_HASH_MASK] = { .type = NLA_U32 }, + }; static const struct nla_policy @@ -1625,6 +1630,10 @@ static int fl_set_key(struct net *net, struct nlattr **tb, fl_set_key_ip(tb, true, &key->enc_ip, &mask->enc_ip); + fl_set_key_val(tb, &key->hash.hash, TCA_FLOWER_KEY_HASH, + &mask->hash.hash, TCA_FLOWER_KEY_HASH_MASK, + sizeof(key->hash.hash)); + if (tb[TCA_FLOWER_KEY_ENC_OPTS]) { ret = fl_set_enc_opt(tb, key, mask, extack); if (ret) @@ -1739,6 +1748,8 @@ static void fl_init_dissector(struct flow_dissector *dissector, FLOW_DISSECTOR_KEY_ENC_OPTS, enc_opts); FL_KEY_SET_IF_MASKED(mask, keys, cnt, FLOW_DISSECTOR_KEY_CT, ct); + FL_KEY_SET_IF_MASKED(mask, keys, cnt, + FLOW_DISSECTOR_KEY_HASH, hash); skb_flow_dissector_init(dissector, keys, cnt); } @@ -2959,6 +2970,11 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, if (fl_dump_key_flags(skb, key->control.flags, mask->control.flags)) goto nla_put_failure; + if (fl_dump_key_val(skb, &key->hash.hash, TCA_FLOWER_KEY_HASH, + &mask->hash.hash, TCA_FLOWER_KEY_HASH_MASK, + sizeof(key->hash.hash))) + goto nla_put_failure; + return 0; nla_put_failure: -- 2.25.2 ^ permalink raw reply related [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-01 18:47 [PATCH net-next v2 0/3] ] TC datapath hash api Ariel Levkovich ` (2 preceding siblings ...) 2020-07-01 18:47 ` [PATCH net-next v2 3/3] net/sched: cls_flower: Add hash info to flow classification Ariel Levkovich @ 2020-07-03 11:22 ` Jamal Hadi Salim 2020-07-05 17:26 ` Ariel Levkovich 2020-07-07 10:05 ` Jiri Pirko 3 siblings, 2 replies; 20+ messages in thread From: Jamal Hadi Salim @ 2020-07-03 11:22 UTC (permalink / raw) To: Ariel Levkovich, netdev; +Cc: jiri, kuba, xiyou.wangcong, ast, daniel Hi, Several comments: 1) I agree with previous comments that you should look at incorporating this into skbedit. Unless incorporating into skbedit introduces huge complexity, IMO it belongs there. 2) I think it would make sense to create a skb hash classifier instead of tying this entirely to flower i.e i should not have to change u32 just so i can support hash classification. So policy would be something of the sort: $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 0 proto ip \ flower ip_proto tcp \ action skbedit hash bpf object-file <file> \ action goto chain 2 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ handle 0x0 skbhash flowid 1:11 mask 0xf \ action mirred egress redirect dev ens1f0_1 $ tc filter add dev ens1f0_0 ingress \ prio 1 chain 2 proto ip \ handle 0x1 skbhash flowid 1:11 mask 0xf \ action mirred egress redirect dev ens1f0_2 IOW, we maintain current modularity as opposed to dumping everything into flower. Ive always wanted to write the skbhash classifier but time was scarce. At one point i had some experiment where I would copy skb hash into mark in the driver and use fw classifier for further processing. It was ugly. cheers, jamal On 2020-07-01 2:47 p.m., Ariel Levkovich wrote: > Supporting datapath hash allows user to set up rules that provide > load balancing of traffic across multiple vports and for ECMP path > selection while keeping the number of rule at minimum. > > Instead of matching on exact flow spec, which requires a rule per > flow, user can define rules based on hashing on the packet headers > and distribute the flows to different buckets. The number of rules > in this case will be constant and equal to the number of buckets. > > The datapath hash functionality is achieved in two steps - > performing the hash action and then matching on the result, as > part of the packet's classification. > > The api allows user to define a filter with a tc hash action > where the hash function can be standard asymetric hashing that Linux > offers or alternatively user can provide a bpf program that > performs hash calculation on a packet. > > Usage is as follows: > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 0 proto ip \ > flower ip_proto tcp \ > action hash bpf object-file <file> \ > action goto chain 2 > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 0 proto ip \ > flower ip_proto udp \ > action hash bpf asym_l4 basis <basis> \ > action goto chain 2 > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 2 proto ip \ > flower hash 0x0/0xf \ > action mirred egress redirect dev ens1f0_1 > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 2 proto ip \ > flower hash 0x1/0xf \ > action mirred egress redirect dev ens1f0_2 > > Ariel Levkovich (3): > net/sched: Introduce action hash > net/flow_dissector: add packet hash dissection > net/sched: cls_flower: Add hash info to flow classification > > include/linux/skbuff.h | 4 + > include/net/act_api.h | 2 + > include/net/flow_dissector.h | 9 + > include/net/tc_act/tc_hash.h | 22 ++ > include/uapi/linux/pkt_cls.h | 4 + > include/uapi/linux/tc_act/tc_hash.h | 32 +++ > net/core/flow_dissector.c | 17 ++ > net/sched/Kconfig | 11 + > net/sched/Makefile | 1 + > net/sched/act_hash.c | 389 ++++++++++++++++++++++++++++ > net/sched/cls_api.c | 1 + > net/sched/cls_flower.c | 16 ++ > 12 files changed, 508 insertions(+) > create mode 100644 include/net/tc_act/tc_hash.h > create mode 100644 include/uapi/linux/tc_act/tc_hash.h > create mode 100644 net/sched/act_hash.c > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-03 11:22 ` [PATCH net-next v2 0/3] ] TC datapath hash api Jamal Hadi Salim @ 2020-07-05 17:26 ` Ariel Levkovich 2020-07-05 21:50 ` Jamal Hadi Salim 2020-07-06 0:23 ` Cong Wang 2020-07-07 10:05 ` Jiri Pirko 1 sibling, 2 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-05 17:26 UTC (permalink / raw) To: Jamal Hadi Salim, netdev; +Cc: jiri, jiri, kuba, xiyou.wangcong, ast, daniel On 7/3/20 7:22 AM, Jamal Hadi Salim wrote: > Hi, > > Several comments: > 1) I agree with previous comments that you should > look at incorporating this into skbedit. > Unless incorporating into skbedit introduces huge > complexity, IMO it belongs there. Hi Jamal, I agree that using skbedit makes some sense and can provide the same functionality. However I believe that from a concept point of view, using it is wrong. In my honest opinion, the concept here is to perform some calculation on the packet itself and its headers while the skb->hash field is the storage location of the calculation result (in SW). Furthermore, looking forward to HW offload support, the HW devices will be offloading the hash calculation and not rewriting skb metadata fields. Therefore the action should be the hash, not skbedit. Another thing that I can mention, which is kind of related to what I wrote above, is that for all existing skbedit supported fields, user typically provides a desired value of his choosing to set to a skb metadata field. Here, the value is unknown and probably not a real concern to the user. To sum it up, I look at this as performing some operation on the packet rather then just setting an skb metadata field and therefore it requires an explicit, new action. What do you think? > > 2) I think it would make sense to create a skb hash classifier > instead of tying this entirely to flower i.e i should not > have to change u32 just so i can support hash classification. > So policy would be something of the sort: > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 0 proto ip \ > flower ip_proto tcp \ > action skbedit hash bpf object-file <file> \ > action goto chain 2 > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 2 proto ip \ > handle 0x0 skbhash flowid 1:11 mask 0xf \ > action mirred egress redirect dev ens1f0_1 > > $ tc filter add dev ens1f0_0 ingress \ > prio 1 chain 2 proto ip \ > handle 0x1 skbhash flowid 1:11 mask 0xf \ > action mirred egress redirect dev ens1f0_2 > > IOW, we maintain current modularity as opposed > to dumping everything into flower. > Ive always wanted to write the skbhash classifier but > time was scarce. At one point i had some experiment > where I would copy skb hash into mark in the driver > and use fw classifier for further processing. > It was ugly. I agree but perhaps we should make it a separate effort and not block this one. I still think we should have support via flower. This is the HW offload path eventually. Regards, Ariel > cheers, > jamal > > On 2020-07-01 2:47 p.m., Ariel Levkovich wrote: >> Supporting datapath hash allows user to set up rules that provide >> load balancing of traffic across multiple vports and for ECMP path >> selection while keeping the number of rule at minimum. >> >> Instead of matching on exact flow spec, which requires a rule per >> flow, user can define rules based on hashing on the packet headers >> and distribute the flows to different buckets. The number of rules >> in this case will be constant and equal to the number of buckets. >> >> The datapath hash functionality is achieved in two steps - >> performing the hash action and then matching on the result, as >> part of the packet's classification. >> >> The api allows user to define a filter with a tc hash action >> where the hash function can be standard asymetric hashing that Linux >> offers or alternatively user can provide a bpf program that >> performs hash calculation on a packet. >> >> Usage is as follows: >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 0 proto ip \ >> flower ip_proto tcp \ >> action hash bpf object-file <file> \ >> action goto chain 2 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 0 proto ip \ >> flower ip_proto udp \ >> action hash bpf asym_l4 basis <basis> \ >> action goto chain 2 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> flower hash 0x0/0xf \ >> action mirred egress redirect dev ens1f0_1 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> flower hash 0x1/0xf \ >> action mirred egress redirect dev ens1f0_2 >> >> Ariel Levkovich (3): >> net/sched: Introduce action hash >> net/flow_dissector: add packet hash dissection >> net/sched: cls_flower: Add hash info to flow classification >> >> include/linux/skbuff.h | 4 + >> include/net/act_api.h | 2 + >> include/net/flow_dissector.h | 9 + >> include/net/tc_act/tc_hash.h | 22 ++ >> include/uapi/linux/pkt_cls.h | 4 + >> include/uapi/linux/tc_act/tc_hash.h | 32 +++ >> net/core/flow_dissector.c | 17 ++ >> net/sched/Kconfig | 11 + >> net/sched/Makefile | 1 + >> net/sched/act_hash.c | 389 ++++++++++++++++++++++++++++ >> net/sched/cls_api.c | 1 + >> net/sched/cls_flower.c | 16 ++ >> 12 files changed, 508 insertions(+) >> create mode 100644 include/net/tc_act/tc_hash.h >> create mode 100644 include/uapi/linux/tc_act/tc_hash.h >> create mode 100644 net/sched/act_hash.c >> > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-05 17:26 ` Ariel Levkovich @ 2020-07-05 21:50 ` Jamal Hadi Salim 2020-07-06 0:28 ` Cong Wang 2020-07-06 0:23 ` Cong Wang 1 sibling, 1 reply; 20+ messages in thread From: Jamal Hadi Salim @ 2020-07-05 21:50 UTC (permalink / raw) To: Ariel Levkovich, netdev; +Cc: jiri, jiri, kuba, xiyou.wangcong, ast, daniel Hi Ariel, On 2020-07-05 1:26 p.m., Ariel Levkovich wrote: > > On 7/3/20 7:22 AM, Jamal Hadi Salim wrote: [..] > Hi Jamal, > > I agree that using skbedit makes some sense and can provide the same > functionality. > > However I believe that from a concept point of view, using it is wrong. > > In my honest opinion, the concept here is to perform some calculation on > the packet itself and its headers while the skb->hash field > > is the storage location of the calculation result (in SW). > > Furthermore, looking forward to HW offload support, the HW devices will > be offloading the hash calculation and > > not rewriting skb metadata fields. Therefore the action should be the > hash, not skbedit. > > Another thing that I can mention, which is kind of related to what I > wrote above, is that for all existing skbedit supported fields, > > user typically provides a desired value of his choosing to set to a skb > metadata field. > > Here, the value is unknown and probably not a real concern to the user. > > > To sum it up, I look at this as performing some operation on the packet > rather then just > > setting an skb metadata field and therefore it requires an explicit, new > action. > > > What do you think? skbedit is generally the action for any skb metadata modification (of which hash is one). Note: We already have skbedit offload for skbmark today. The hash feature is useful for software as well (as your use case showed). I agree with you that the majority of the cases are going to be a computation of some form that results in dynamic skb->hash. But the hash should be possible to be statically set by a policy. BTW, nothing in skbedit is against computing what the new metadata should be. IMO: A good arguement to not make it part of skbedit is if it adds unnecessary complexity to skbedit or policy definitions. > >> >> 2) I think it would make sense to create a skb hash classifier >> instead of tying this entirely to flower i.e i should not >> have to change u32 just so i can support hash classification. >> So policy would be something of the sort: >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 0 proto ip \ >> flower ip_proto tcp \ >> action skbedit hash bpf object-file <file> \ >> action goto chain 2 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> handle 0x0 skbhash flowid 1:11 mask 0xf \ >> action mirred egress redirect dev ens1f0_1 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> handle 0x1 skbhash flowid 1:11 mask 0xf \ >> action mirred egress redirect dev ens1f0_2 >> >> IOW, we maintain current modularity as opposed >> to dumping everything into flower. >> Ive always wanted to write the skbhash classifier but >> time was scarce. At one point i had some experiment >> where I would copy skb hash into mark in the driver >> and use fw classifier for further processing. >> It was ugly. > > I agree but perhaps we should make it a separate effort and not block > this one. > > I still think we should have support via flower. This is the HW offload > path eventually. > My main concern is modularity and the tc principle of doing small things (and in principle doing them well). Flower is becoming the sink for everything hardware offload but f.e u32 also does h/w offload as well and we dont want to limit it to just those two classifiers for the future... Note: Flower is not very good performance-wise in the ingress in s/ware. Something that is more specialized like the way skb mark fw classifier is will be a lot more efficient. One good reason to make hardware[1] do the hard work is to save the cyles in the host. So to me adding to flower does not help that cause. cheers, jamal [1] Whether the hash is set by RSS or an offloaded classifier or shows up in some simple pkt header (IFE original patch had skb hash being set in one machine and transported across machines for use in a remote machine - that setup is in use in production). cheers, jamal ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-05 21:50 ` Jamal Hadi Salim @ 2020-07-06 0:28 ` Cong Wang 2020-07-09 13:52 ` Ariel Levkovich 0 siblings, 1 reply; 20+ messages in thread From: Cong Wang @ 2020-07-06 0:28 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Ariel Levkovich, Linux Kernel Network Developers, Jiri Pirko, Jiri Pirko, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann On Sun, Jul 5, 2020 at 2:50 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote: > BTW, nothing in skbedit is against computing what the new metadata > should be. Yup. > > IMO: A good arguement to not make it part of skbedit is if it adds > unnecessary complexity to skbedit or policy definitions. > TCA_HASH_ALG_L4 literally has 4 lines of code, has no way to add any unnecessary complexity to skbedit. (The BPF algorithm does not belong to skbedit, obviously.) Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-06 0:28 ` Cong Wang @ 2020-07-09 13:52 ` Ariel Levkovich 0 siblings, 0 replies; 20+ messages in thread From: Ariel Levkovich @ 2020-07-09 13:52 UTC (permalink / raw) To: Cong Wang, Jamal Hadi Salim Cc: Linux Kernel Network Developers, Jiri Pirko, Jiri Pirko, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann Not sure it was sent so trying again... On 7/5/20 8:28 PM, Cong Wang wrote: > On Sun, Jul 5, 2020 at 2:50 PM Jamal Hadi Salim <jhs@mojatatu.com> wrote: >> BTW, nothing in skbedit is against computing what the new metadata >> should be. > Yup. > >> IMO: A good arguement to not make it part of skbedit is if it adds >> unnecessary complexity to skbedit or policy definitions. >> > TCA_HASH_ALG_L4 literally has 4 lines of code, has no way > to add any unnecessary complexity to skbedit. (The BPF algorithm > does not belong to skbedit, obviously.) > > Thanks. Moving TCA_HASH_ALG_L4 to skbedit is very simple, I agree. However, supporting the bpf option via act_bpf is still problematic as it is not offloadable. We need some kind of indication that this is a hash computation program and therefore it requires specific identifier in the form of a new action. Ariel ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-05 17:26 ` Ariel Levkovich 2020-07-05 21:50 ` Jamal Hadi Salim @ 2020-07-06 0:23 ` Cong Wang 1 sibling, 0 replies; 20+ messages in thread From: Cong Wang @ 2020-07-06 0:23 UTC (permalink / raw) To: Ariel Levkovich Cc: Jamal Hadi Salim, Linux Kernel Network Developers, Jiri Pirko, Jiri Pirko, Jakub Kicinski, Alexei Starovoitov, Daniel Borkmann On Sun, Jul 5, 2020 at 10:26 AM Ariel Levkovich <lariel@mellanox.com> wrote: > However I believe that from a concept point of view, using it is wrong. > > In my honest opinion, the concept here is to perform some calculation on > the packet itself and its headers while the skb->hash field > > is the storage location of the calculation result (in SW). With skbedit, you don't have to pass a value either, whatever you pass to your act_hash, you can pass it to skbedit too. In your case, it seems to be an algorithm name. You can take a look at SKBEDIT_F_INHERITDSFIELD, it calculates skb->priority from headers, not passed from user-space. > > Furthermore, looking forward to HW offload support, the HW devices will > be offloading the hash calculation and > > not rewriting skb metadata fields. Therefore the action should be the > hash, not skbedit. Not sure if this makes sense, whatever your code under case TCA_HASH_ALG_L4 can be just moved to skbedit. I don't see how making it standalone could be different for HW offloading. > > Another thing that I can mention, which is kind of related to what I > wrote above, is that for all existing skbedit supported fields, > > user typically provides a desired value of his choosing to set to a skb > metadata field. Again, no one forces this rule. Please feel free to adjust it for your needs. Thanks. ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-03 11:22 ` [PATCH net-next v2 0/3] ] TC datapath hash api Jamal Hadi Salim 2020-07-05 17:26 ` Ariel Levkovich @ 2020-07-07 10:05 ` Jiri Pirko 2020-07-08 13:54 ` Jamal Hadi Salim 1 sibling, 1 reply; 20+ messages in thread From: Jiri Pirko @ 2020-07-07 10:05 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel Fri, Jul 03, 2020 at 01:22:47PM CEST, jhs@mojatatu.com wrote: >Hi, > >Several comments: >1) I agree with previous comments that you should >look at incorporating this into skbedit. >Unless incorporating into skbedit introduces huge >complexity, IMO it belongs there. > >2) I think it would make sense to create a skb hash classifier >instead of tying this entirely to flower i.e i should not >have to change u32 just so i can support hash classification. Well, we don't have multiple classifiers for each flower match, we have them all in one classifier. It turned out to be very convenient and intuitive for people to use one classifier to do the job for them. Modularity is nice, but useability is I think more important in this case. Flower turned out to do good job there. + Nothing stops you from creating separate classifier to match on hash as you wanted to :) >So policy would be something of the sort: > >$ tc filter add dev ens1f0_0 ingress \ >prio 1 chain 0 proto ip \ >flower ip_proto tcp \ >action skbedit hash bpf object-file <file> \ >action goto chain 2 > >$ tc filter add dev ens1f0_0 ingress \ >prio 1 chain 2 proto ip \ >handle 0x0 skbhash flowid 1:11 mask 0xf \ >action mirred egress redirect dev ens1f0_1 > >$ tc filter add dev ens1f0_0 ingress \ >prio 1 chain 2 proto ip \ >handle 0x1 skbhash flowid 1:11 mask 0xf \ >action mirred egress redirect dev ens1f0_2 > >IOW, we maintain current modularity as opposed >to dumping everything into flower. >Ive always wanted to write the skbhash classifier but >time was scarce. At one point i had some experiment >where I would copy skb hash into mark in the driver >and use fw classifier for further processing. >It was ugly. > >cheers, >jamal > >On 2020-07-01 2:47 p.m., Ariel Levkovich wrote: >> Supporting datapath hash allows user to set up rules that provide >> load balancing of traffic across multiple vports and for ECMP path >> selection while keeping the number of rule at minimum. >> >> Instead of matching on exact flow spec, which requires a rule per >> flow, user can define rules based on hashing on the packet headers >> and distribute the flows to different buckets. The number of rules >> in this case will be constant and equal to the number of buckets. >> >> The datapath hash functionality is achieved in two steps - >> performing the hash action and then matching on the result, as >> part of the packet's classification. >> >> The api allows user to define a filter with a tc hash action >> where the hash function can be standard asymetric hashing that Linux >> offers or alternatively user can provide a bpf program that >> performs hash calculation on a packet. >> >> Usage is as follows: >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 0 proto ip \ >> flower ip_proto tcp \ >> action hash bpf object-file <file> \ >> action goto chain 2 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 0 proto ip \ >> flower ip_proto udp \ >> action hash bpf asym_l4 basis <basis> \ >> action goto chain 2 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> flower hash 0x0/0xf \ >> action mirred egress redirect dev ens1f0_1 >> >> $ tc filter add dev ens1f0_0 ingress \ >> prio 1 chain 2 proto ip \ >> flower hash 0x1/0xf \ >> action mirred egress redirect dev ens1f0_2 >> >> Ariel Levkovich (3): >> net/sched: Introduce action hash >> net/flow_dissector: add packet hash dissection >> net/sched: cls_flower: Add hash info to flow classification >> >> include/linux/skbuff.h | 4 + >> include/net/act_api.h | 2 + >> include/net/flow_dissector.h | 9 + >> include/net/tc_act/tc_hash.h | 22 ++ >> include/uapi/linux/pkt_cls.h | 4 + >> include/uapi/linux/tc_act/tc_hash.h | 32 +++ >> net/core/flow_dissector.c | 17 ++ >> net/sched/Kconfig | 11 + >> net/sched/Makefile | 1 + >> net/sched/act_hash.c | 389 ++++++++++++++++++++++++++++ >> net/sched/cls_api.c | 1 + >> net/sched/cls_flower.c | 16 ++ >> 12 files changed, 508 insertions(+) >> create mode 100644 include/net/tc_act/tc_hash.h >> create mode 100644 include/uapi/linux/tc_act/tc_hash.h >> create mode 100644 net/sched/act_hash.c >> > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-07 10:05 ` Jiri Pirko @ 2020-07-08 13:54 ` Jamal Hadi Salim 2020-07-08 14:45 ` Jiri Pirko 0 siblings, 1 reply; 20+ messages in thread From: Jamal Hadi Salim @ 2020-07-08 13:54 UTC (permalink / raw) To: Jiri Pirko; +Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel On 2020-07-07 6:05 a.m., Jiri Pirko wrote: > Fri, Jul 03, 2020 at 01:22:47PM CEST, jhs@mojatatu.com wrote: >> Hi, >> >> Several comments: >> 1) I agree with previous comments that you should >> look at incorporating this into skbedit. >> Unless incorporating into skbedit introduces huge >> complexity, IMO it belongs there. >> >> 2) I think it would make sense to create a skb hash classifier >> instead of tying this entirely to flower i.e i should not >> have to change u32 just so i can support hash classification. > > Well, we don't have multiple classifiers for each flower match, we have > them all in one classifier. Packet data matches, yes - makes sense. You could argue the same for the other classifiers. > It turned out to be very convenient and > intuitive for people to use one classifier to do the job for them. IMO: For this specific case where _offload_ is the main use case i think it is not a good idea because flower on ingress is slow. The goal of offloading classifiers to hardware is so one can reduce consumed cpu cycles on the host. If the hardware has done the classification for me, a simple hash lookup of the 32 bit skbhash(similar to fw) in the host would be a lot less compute intensive than running flower's algorithm. I think there is a line for adding everything in one place, my main concern is that this feature this is needed by all classifiers and not just flower. > Modularity is nice, but useability is I think more important in this > case. Flower turned out to do good job there. > For humans, agreed everything in one place is convinient. Note: your arguement could be used for "ls" to include "grep" functionality because in my scripts I do both most of the time. cheers, jamal > + Nothing stops you from creating separate classifier to match on hash > as you wanted to :) > > ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-08 13:54 ` Jamal Hadi Salim @ 2020-07-08 14:45 ` Jiri Pirko 2020-07-09 11:00 ` Jamal Hadi Salim 0 siblings, 1 reply; 20+ messages in thread From: Jiri Pirko @ 2020-07-08 14:45 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel Wed, Jul 08, 2020 at 03:54:14PM CEST, jhs@mojatatu.com wrote: >On 2020-07-07 6:05 a.m., Jiri Pirko wrote: >> Fri, Jul 03, 2020 at 01:22:47PM CEST, jhs@mojatatu.com wrote: >> > Hi, >> > >> > Several comments: >> > 1) I agree with previous comments that you should >> > look at incorporating this into skbedit. >> > Unless incorporating into skbedit introduces huge >> > complexity, IMO it belongs there. >> > >> > 2) I think it would make sense to create a skb hash classifier >> > instead of tying this entirely to flower i.e i should not >> > have to change u32 just so i can support hash classification. >> >> Well, we don't have multiple classifiers for each flower match, we have >> them all in one classifier. > >Packet data matches, yes - makes sense. You could argue the same for >the other classifiers. > >> It turned out to be very convenient and >> intuitive for people to use one classifier to do the job for them. > >IMO: >For this specific case where _offload_ is the main use case i think >it is not a good idea because flower on ingress is slow. Eh? What do you mean by that? >The goal of offloading classifiers to hardware is so one can reduce >consumed cpu cycles on the host. If the hardware >has done the classification for me, a simple hash lookup of the >32 bit skbhash(similar to fw) in the host would be a lot less >compute intensive than running flower's algorithm. It is totally up to the driver/fw how they decide to offload flower. There are multiple ways. So I don't really follow what do you mean by "flower's algorithm" > >I think there is a line for adding everything in one place, >my main concern is that this feature this is needed >by all classifiers and not just flower. "All" is effectively only flower. Let's be clear about that. > > >> Modularity is nice, but useability is I think more important in this >> case. Flower turned out to do good job there. >> > >For humans, agreed everything in one place is convinient. >Note: your arguement could be used for "ls" to include "grep" >functionality because in my scripts I do both most of the time. > >cheers, >jamal > > > >> + Nothing stops you from creating separate classifier to match on hash >> as you wanted to :) >> >> ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-08 14:45 ` Jiri Pirko @ 2020-07-09 11:00 ` Jamal Hadi Salim 2020-07-09 12:19 ` Jiri Pirko 0 siblings, 1 reply; 20+ messages in thread From: Jamal Hadi Salim @ 2020-07-09 11:00 UTC (permalink / raw) To: Jiri Pirko; +Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel On 2020-07-08 10:45 a.m., Jiri Pirko wrote: > Wed, Jul 08, 2020 at 03:54:14PM CEST, jhs@mojatatu.com wrote: >> On 2020-07-07 6:05 a.m., Jiri Pirko wrote: [..] >> IMO: >> For this specific case where _offload_ is the main use case i think >> it is not a good idea because flower on ingress is slow. > > Eh? What do you mean by that? > > >> The goal of offloading classifiers to hardware is so one can reduce >> consumed cpu cycles on the host. If the hardware >> has done the classification for me, a simple hash lookup of the >> 32 bit skbhash(similar to fw) in the host would be a lot less >> compute intensive than running flower's algorithm. > > It is totally up to the driver/fw how they decide to offload flower. > There are multiple ways. So I don't really follow what do you mean by > "flower's algorithm" > Nothing to do with how a driver offloads. That part is fine. By "flower's algorithm" I mean the fact you have to parse and create the flow cache from scratch on ingress - that slows down the ingress path. Compare, from cpu cycles pov, to say fw classifier which dereferences skbmark and uses it as a key to lookup a hash table. An skbhash classifier would look the same as fw in its approach. subtle point i was making was: if your goal was to save cpu cycles by offloading the lookup(whose result you then use to do less work on the host) then you need all the cycles you can save. Main point is: classifying based on hash(and for that matter any other metadata like mark) is needed as a general utility for the system and should not be only available for flower. The one big reason we allow all kinds of classifiers in tc is in the name of "do one thing and do it well". It is impossible for any one classifier to classify everything and do a good job at it. For example, I hope you are NEVER going to add string classification in flower. Note, what i am describing has precendence: I can do the same thing with skbmark offloading today. On ingress I use fw classifier (not u32 or flower). > >> >> I think there is a line for adding everything in one place, >> my main concern is that this feature this is needed >> by all classifiers and not just flower. > > "All" is effectively only flower. Let's be clear about that. > Unless I am misunderstanding, why is it just about flower? u32 does offload to some hardware. RSS can set this value and various existing things like XDP, tc ebpf and the action posted by Ariel. cheers, jamal ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-09 11:00 ` Jamal Hadi Salim @ 2020-07-09 12:19 ` Jiri Pirko 2020-07-10 12:04 ` Jamal Hadi Salim 0 siblings, 1 reply; 20+ messages in thread From: Jiri Pirko @ 2020-07-09 12:19 UTC (permalink / raw) To: Jamal Hadi Salim Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel Thu, Jul 09, 2020 at 01:00:26PM CEST, jhs@mojatatu.com wrote: >On 2020-07-08 10:45 a.m., Jiri Pirko wrote: >> Wed, Jul 08, 2020 at 03:54:14PM CEST, jhs@mojatatu.com wrote: >> > On 2020-07-07 6:05 a.m., Jiri Pirko wrote: > > >[..] >> > IMO: >> > For this specific case where _offload_ is the main use case i think >> > it is not a good idea because flower on ingress is slow. >> >> Eh? What do you mean by that? >> >> >> > The goal of offloading classifiers to hardware is so one can reduce >> > consumed cpu cycles on the host. If the hardware >> > has done the classification for me, a simple hash lookup of the >> > 32 bit skbhash(similar to fw) in the host would be a lot less >> > compute intensive than running flower's algorithm. >> >> It is totally up to the driver/fw how they decide to offload flower. >> There are multiple ways. So I don't really follow what do you mean by >> "flower's algorithm" >> > >Nothing to do with how a driver offloads. That part is fine. > >By "flower's algorithm" I mean the fact you have to parse and >create the flow cache from scratch on ingress - that slows down >the ingress path. Compare, from cpu cycles pov, to say fw Could you point to the specific code please? The skb->hash is only accessed if the user sets it up for matching. I don't understand what slowdown you are talking about :/ >classifier which dereferences skbmark and uses it as a key >to lookup a hash table. >An skbhash classifier would look the same as fw in its >approach. >subtle point i was making was: if your goal was to save cpu cycles >by offloading the lookup(whose result you then use to do >less work on the host) then you need all the cycles you can >save. > >Main point is: classifying based on hash(and for that >matter any other metadata like mark) is needed as a general >utility for the system and should not be only available for >flower. The one big reason we allow all kinds of classifiers >in tc is in the name of "do one thing and do it well". Sure. That classifier can exist, no problem. At the same time, flower can match on it as well. There are already multiple examples of classifiers matching on the same thing. I don't see any problem there. >It is impossible for any one classifier to classify everything >and do a good job at it. For example, I hope you are NEVER >going to add string classification in flower. > >Note, what i am describing has precendence: >I can do the same thing with skbmark offloading today. >On ingress I use fw classifier (not u32 or flower). > >> >> > >> > I think there is a line for adding everything in one place, >> > my main concern is that this feature this is needed >> > by all classifiers and not just flower. >> >> "All" is effectively only flower. Let's be clear about that. >> > >Unless I am misunderstanding, why is it just about flower? >u32 does offload to some hardware. RSS can set this value >and various existing things like XDP, tc ebpf and the action >posted by Ariel. > >cheers, >jamal ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-09 12:19 ` Jiri Pirko @ 2020-07-10 12:04 ` Jamal Hadi Salim 2020-08-07 10:41 ` Jamal Hadi Salim 0 siblings, 1 reply; 20+ messages in thread From: Jamal Hadi Salim @ 2020-07-10 12:04 UTC (permalink / raw) To: Jiri Pirko; +Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel On 2020-07-09 8:19 a.m., Jiri Pirko wrote: > Thu, Jul 09, 2020 at 01:00:26PM CEST, jhs@mojatatu.com wrote: >> On 2020-07-08 10:45 a.m., Jiri Pirko wrote: >>> Wed, Jul 08, 2020 at 03:54:14PM CEST, jhs@mojatatu.com wrote: >>>> On 2020-07-07 6:05 a.m., Jiri Pirko wrote: >> Nothing to do with how a driver offloads. That part is fine. >> >> By "flower's algorithm" I mean the fact you have to parse and >> create the flow cache from scratch on ingress - that slows down >> the ingress path. Compare, from cpu cycles pov, to say fw > > Could you point to the specific code please? > > The skb->hash is only accessed if the user sets it up for matching. > I don't understand what slowdown you are talking about :/ > Compare the lookup approach taken by flower in ->classify vs fw. Then add a few hundred(or thousands of) rules. > >> classifier which dereferences skbmark and uses it as a key >> to lookup a hash table. >> An skbhash classifier would look the same as fw in its >> approach. >> subtle point i was making was: if your goal was to save cpu cycles >> by offloading the lookup(whose result you then use to do >> less work on the host) then you need all the cycles you can >> save. >> >> Main point is: classifying based on hash(and for that >> matter any other metadata like mark) is needed as a general >> utility for the system and should not be only available for >> flower. The one big reason we allow all kinds of classifiers >> in tc is in the name of "do one thing and do it well". > > Sure. That classifier can exist, no problem. At the same time, flower > can match on it as well. There are already multiple examples of > classifiers matching on the same thing. I don't see any problem there. > I keep pointing to the issues and we keep circling back to your desire to add it to flower. I emphatize with the desire to have flower as a one stop shop for all things classification but this is at the expense of other classifiers. I too need this for offloading as well as getting the RSS proper feature i described.ets make progress. You go ahead - i will submit a version to add it as a separate hash classifier. cheers, jamal ^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH net-next v2 0/3] ] TC datapath hash api 2020-07-10 12:04 ` Jamal Hadi Salim @ 2020-08-07 10:41 ` Jamal Hadi Salim 0 siblings, 0 replies; 20+ messages in thread From: Jamal Hadi Salim @ 2020-08-07 10:41 UTC (permalink / raw) To: Jiri Pirko; +Cc: Ariel Levkovich, netdev, kuba, xiyou.wangcong, ast, daniel On 2020-07-10 8:04 a.m., Jamal Hadi Salim wrote: > On 2020-07-09 8:19 a.m., Jiri Pirko wrote: >> Thu, Jul 09, 2020 at 01:00:26PM CEST, jhs@mojatatu.com wrote: >>> >>> Main point is: classifying based on hash(and for that >>> matter any other metadata like mark) is needed as a general >>> utility for the system and should not be only available for >>> flower. The one big reason we allow all kinds of classifiers >>> in tc is in the name of "do one thing and do it well". >> >> Sure. That classifier can exist, no problem. At the same time, flower >> can match on it as well. There are already multiple examples of >> classifiers matching on the same thing. I don't see any problem there. >> > > I keep pointing to the issues and we keep circling back > to your desire to add it to flower. I emphatize with the > desire to have flower as a one stop shop for all things classification > but this is at the expense of other classifiers. I too need this for > offloading as well as getting the RSS proper feature i described.ets > make progress. > You go ahead - i will submit a version to add it as a separate > hash classifier. > Some cycles opened up - I will work on this in the next day or two now that your patches are in... cheers, jamal ^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2020-08-07 10:41 UTC | newest] Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-07-01 18:47 [PATCH net-next v2 0/3] ] TC datapath hash api Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 1/3] net/sched: Introduce action hash Ariel Levkovich 2020-07-01 22:20 ` kernel test robot 2020-07-01 22:20 ` kernel test robot 2020-07-02 20:52 ` Cong Wang 2020-07-01 18:47 ` [PATCH net-next v2 2/3] net/flow_dissector: add packet hash dissection Ariel Levkovich 2020-07-01 18:47 ` [PATCH net-next v2 3/3] net/sched: cls_flower: Add hash info to flow classification Ariel Levkovich 2020-07-03 11:22 ` [PATCH net-next v2 0/3] ] TC datapath hash api Jamal Hadi Salim 2020-07-05 17:26 ` Ariel Levkovich 2020-07-05 21:50 ` Jamal Hadi Salim 2020-07-06 0:28 ` Cong Wang 2020-07-09 13:52 ` Ariel Levkovich 2020-07-06 0:23 ` Cong Wang 2020-07-07 10:05 ` Jiri Pirko 2020-07-08 13:54 ` Jamal Hadi Salim 2020-07-08 14:45 ` Jiri Pirko 2020-07-09 11:00 ` Jamal Hadi Salim 2020-07-09 12:19 ` Jiri Pirko 2020-07-10 12:04 ` Jamal Hadi Salim 2020-08-07 10:41 ` Jamal Hadi Salim
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.