* [PATCH net-next v4 0/3] flow_offload: add indr-block in nf_table_offload
@ 2019-07-28  6:52 wenxu
  2019-07-28  6:52 ` [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload wenxu
From: wenxu @ 2019-07-28  6:52 UTC (permalink / raw)
  To: pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev

From: wenxu <wenxu@ucloud.cn>

This patch series makes nftables offload support vlan and tunnel
devices through the indr-block architecture.

The first patch moves the tc indirect block infrastructure to
flow_offload and renames it to flow-indr-block.
Because the new flow-indr-block can't get the tcf_block directly,
the second patch provides a callback to get the tcf_block
immediately when a device registers and already has an ingress block.
The third patch makes nf_tables_offload support flow-indr-block.
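
For a driver, only the names change; the usage pattern stays the same.
A minimal sketch (the foo_* names are hypothetical, mirroring the
mlx5/nfp conversions in patch 1):

static int foo_indr_setup_tc_cb(struct net_device *dev, void *cb_priv,
				enum tc_setup_type type, void *type_data)
{
	switch (type) {
	case TC_SETUP_BLOCK:
		/* bind/unbind the tunnel/vlan device's block */
		return foo_setup_indr_block(dev, cb_priv, type_data);
	default:
		return -EOPNOTSUPP;
	}
}

/* on NETDEV_REGISTER of the tunnel/vlan device: */
err = __flow_indr_block_cb_register(netdev, priv,
				    foo_indr_setup_tc_cb, priv);

/* on NETDEV_UNREGISTER: */
__flow_indr_block_cb_unregister(netdev, foo_indr_setup_tc_cb, priv);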

wenxu (3):
  flow_offload: move tc indirect block to flow offload
  flow_offload: Support get default block from tc immediately
  netfilter: nf_tables_offload: support indr block call

 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  10 +-
 include/net/flow_offload.h                         |  39 ++++
 include/net/pkt_cls.h                              |  42 +---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 181 +++++++++++++++
 net/netfilter/nf_tables_offload.c                  | 131 +++++++++--
 net/sched/cls_api.c                                | 246 ++++-----------------
 8 files changed, 385 insertions(+), 277 deletions(-)

-- 
1.8.3.1



* [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload
  2019-07-28  6:52 [PATCH net-next v4 0/3] flow_offload: add indr-block in nf_table_offload wenxu
@ 2019-07-28  6:52 ` wenxu
  2019-07-29 11:13   ` Jiri Pirko
  2019-07-29 11:15   ` Jiri Pirko
  2019-07-28  6:52 ` [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately wenxu
  2019-07-28  6:52 ` [PATCH net-next v4 3/3] netfilter: nf_tables_offload: support indr block call wenxu
From: wenxu @ 2019-07-28  6:52 UTC (permalink / raw)
  To: pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev

From: wenxu <wenxu@ucloud.cn>

move tc indirect block to flow_offload and rename
it to flow indirect block.The nf_tables can use the
indr block architecture.

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v3: subsys_initcall for init_flow_indr_rhashtable
v4: no change

 drivers/net/ethernet/mellanox/mlx5/core/en_rep.c   |  10 +-
 .../net/ethernet/netronome/nfp/flower/offload.c    |  10 +-
 include/net/flow_offload.h                         |  39 ++++
 include/net/pkt_cls.h                              |  35 ---
 include/net/sch_generic.h                          |   3 -
 net/core/flow_offload.c                            | 179 ++++++++++++++++
 net/sched/cls_api.c                                | 235 ++-------------------
 7 files changed, 247 insertions(+), 264 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 7f747cb..074573b 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -785,9 +785,9 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 {
 	int err;
 
-	err = __tc_indr_block_cb_register(netdev, rpriv,
-					  mlx5e_rep_indr_setup_tc_cb,
-					  rpriv);
+	err = __flow_indr_block_cb_register(netdev, rpriv,
+					    mlx5e_rep_indr_setup_tc_cb,
+					    rpriv);
 	if (err) {
 		struct mlx5e_priv *priv = netdev_priv(rpriv->netdev);
 
@@ -800,8 +800,8 @@ static int mlx5e_rep_indr_register_block(struct mlx5e_rep_priv *rpriv,
 static void mlx5e_rep_indr_unregister_block(struct mlx5e_rep_priv *rpriv,
 					    struct net_device *netdev)
 {
-	__tc_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
-				      rpriv);
+	__flow_indr_block_cb_unregister(netdev, mlx5e_rep_indr_setup_tc_cb,
+					rpriv);
 }
 
 static int mlx5e_nic_rep_netdevice_event(struct notifier_block *nb,
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index e209f15..6a0f034 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1479,16 +1479,16 @@ int nfp_flower_reg_indir_block_handler(struct nfp_app *app,
 		return NOTIFY_OK;
 
 	if (event == NETDEV_REGISTER) {
-		err = __tc_indr_block_cb_register(netdev, app,
-						  nfp_flower_indr_setup_tc_cb,
-						  app);
+		err = __flow_indr_block_cb_register(netdev, app,
+						    nfp_flower_indr_setup_tc_cb,
+						    app);
 		if (err)
 			nfp_flower_cmsg_warn(app,
 					     "Indirect block reg failed - %s\n",
 					     netdev->name);
 	} else if (event == NETDEV_UNREGISTER) {
-		__tc_indr_block_cb_unregister(netdev,
-					      nfp_flower_indr_setup_tc_cb, app);
+		__flow_indr_block_cb_unregister(netdev,
+						nfp_flower_indr_setup_tc_cb, app);
 	}
 
 	return NOTIFY_OK;
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 00b9aab..66f89bc 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -4,6 +4,7 @@
 #include <linux/kernel.h>
 #include <linux/list.h>
 #include <net/flow_dissector.h>
+#include <linux/rhashtable.h>
 
 struct flow_match {
 	struct flow_dissector	*dissector;
@@ -366,4 +367,42 @@ static inline void flow_block_init(struct flow_block *flow_block)
 	INIT_LIST_HEAD(&flow_block->cb_list);
 }
 
+typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
+				      enum tc_setup_type type, void *type_data);
+
+struct flow_indr_block_cb {
+	struct list_head list;
+	void *cb_priv;
+	flow_indr_block_bind_cb_t *cb;
+	void *cb_ident;
+};
+
+typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
+				       struct flow_block *flow_block,
+				       struct flow_indr_block_cb *indr_block_cb,
+				       enum flow_block_command command);
+
+struct flow_indr_block_dev {
+	struct rhash_head ht_node;
+	struct net_device *dev;
+	unsigned int refcnt;
+	struct list_head cb_list;
+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
+	struct flow_block *flow_block;
+};
+
+struct flow_indr_block_dev *flow_indr_block_dev_lookup(struct net_device *dev);
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb, void *cb_ident);
+
 #endif /* _NET_FLOW_OFFLOAD_H */
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809..0790a4e 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -70,15 +70,6 @@ static inline struct Qdisc *tcf_block_q(struct tcf_block *block)
 	return block->q;
 }
 
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident);
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident);
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident);
-
 int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode);
 
@@ -137,32 +128,6 @@ void tc_setup_cb_block_unregister(struct tcf_block *block, flow_setup_cb_t *cb,
 {
 }
 
-static inline
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	return 0;
-}
-
-static inline
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
-static inline
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-}
-
 static inline int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 			       struct tcf_result *res, bool compat_mode)
 {
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 6b6b012..d9f359a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -23,9 +23,6 @@
 struct module;
 struct bpf_flow_keys;
 
-typedef int tc_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
-				    enum tc_setup_type type, void *type_data);
-
 struct qdisc_rate_table {
 	struct tc_ratespec rate;
 	u32		data[256];
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index d63b970..9f1ae67 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -2,6 +2,7 @@
 #include <linux/kernel.h>
 #include <linux/slab.h>
 #include <net/flow_offload.h>
+#include <linux/rtnetlink.h>
 
 struct flow_rule *flow_rule_alloc(unsigned int num_actions)
 {
@@ -280,3 +281,181 @@ int flow_block_cb_setup_simple(struct flow_block_offload *f,
 	}
 }
 EXPORT_SYMBOL(flow_block_cb_setup_simple);
+
+static struct rhashtable indr_setup_block_ht;
+
+static const struct rhashtable_params flow_indr_setup_block_ht_params = {
+	.key_offset	= offsetof(struct flow_indr_block_dev, dev),
+	.head_offset	= offsetof(struct flow_indr_block_dev, ht_node),
+	.key_len	= sizeof(struct net_device *),
+};
+
+struct flow_indr_block_dev *
+flow_indr_block_dev_lookup(struct net_device *dev)
+{
+	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
+				      flow_indr_setup_block_ht_params);
+}
+EXPORT_SYMBOL(flow_indr_block_dev_lookup);
+
+static struct flow_indr_block_dev *flow_indr_block_dev_get(struct net_device *dev)
+{
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (indr_dev)
+		goto inc_ref;
+
+	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
+	if (!indr_dev)
+		return NULL;
+
+	INIT_LIST_HEAD(&indr_dev->cb_list);
+	indr_dev->dev = dev;
+	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+				   flow_indr_setup_block_ht_params)) {
+		kfree(indr_dev);
+		return NULL;
+	}
+
+inc_ref:
+	indr_dev->refcnt++;
+	return indr_dev;
+}
+
+static void flow_indr_block_dev_put(struct flow_indr_block_dev *indr_dev)
+{
+	if (--indr_dev->refcnt)
+		return;
+
+	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
+			       flow_indr_setup_block_ht_params);
+	kfree(indr_dev);
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_lookup(struct flow_indr_block_dev *indr_dev,
+			  flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		if (indr_block_cb->cb == cb &&
+		    indr_block_cb->cb_ident == cb_ident)
+			return indr_block_cb;
+	return NULL;
+}
+
+static struct flow_indr_block_cb *
+flow_indr_block_cb_add(struct flow_indr_block_dev *indr_dev, void *cb_priv,
+		       flow_indr_block_bind_cb_t *cb, void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (indr_block_cb)
+		return ERR_PTR(-EEXIST);
+
+	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
+	if (!indr_block_cb)
+		return ERR_PTR(-ENOMEM);
+
+	indr_block_cb->cb_priv = cb_priv;
+	indr_block_cb->cb = cb;
+	indr_block_cb->cb_ident = cb_ident;
+	list_add(&indr_block_cb->list, &indr_dev->cb_list);
+
+	return indr_block_cb;
+}
+
+static void flow_indr_block_cb_del(struct flow_indr_block_cb *indr_block_cb)
+{
+	list_del(&indr_block_cb->list);
+	kfree(indr_block_cb);
+}
+
+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				  flow_indr_block_bind_cb_t *cb,
+				  void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+	int err;
+
+	indr_dev = flow_indr_block_dev_get(dev);
+	if (!indr_dev)
+		return -ENOMEM;
+
+	indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
+	err = PTR_ERR_OR_ZERO(indr_block_cb);
+	if (err)
+		goto err_dev_put;
+
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block, indr_block_cb,
+				     FLOW_BLOCK_BIND);
+
+	return 0;
+
+err_dev_put:
+	flow_indr_block_dev_put(indr_dev);
+	return err;
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);
+
+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
+				flow_indr_block_bind_cb_t *cb,
+				void *cb_ident)
+{
+	int err;
+
+	rtnl_lock();
+	err = __flow_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
+	rtnl_unlock();
+
+	return err;
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_register);
+
+void __flow_indr_block_cb_unregister(struct net_device *dev,
+				     flow_indr_block_bind_cb_t *cb,
+				     void *cb_ident)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return;
+
+	indr_block_cb = flow_indr_block_cb_lookup(indr_dev, cb, cb_ident);
+	if (!indr_block_cb)
+		return;
+
+	/* Send unbind message if required to free any block cbs. */
+	if (indr_dev->ing_cmd_cb)
+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
+				     indr_block_cb,
+				     FLOW_BLOCK_UNBIND);
+
+	flow_indr_block_cb_del(indr_block_cb);
+	flow_indr_block_dev_put(indr_dev);
+}
+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_unregister);
+
+void flow_indr_block_cb_unregister(struct net_device *dev,
+				   flow_indr_block_bind_cb_t *cb,
+				   void *cb_ident)
+{
+	rtnl_lock();
+	__flow_indr_block_cb_unregister(dev, cb, cb_ident);
+	rtnl_unlock();
+}
+EXPORT_SYMBOL_GPL(flow_indr_block_cb_unregister);
+
+static int __init init_flow_indr_rhashtable(void)
+{
+	return rhashtable_init(&indr_setup_block_ht,
+			       &flow_indr_setup_block_ht_params);
+}
+subsys_initcall(init_flow_indr_rhashtable);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 3565d9a..d551c56 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -37,6 +37,7 @@
 #include <net/tc_act/tc_skbedit.h>
 #include <net/tc_act/tc_ct.h>
 #include <net/tc_act/tc_mpls.h>
+#include <net/flow_offload.h>
 
 extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
 
@@ -545,235 +546,43 @@ static void tcf_chain_flush(struct tcf_chain *chain, bool rtnl_held)
 	}
 }
 
-static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
-{
-	const struct Qdisc_class_ops *cops;
-	struct Qdisc *qdisc;
-
-	if (!dev_ingress_queue(dev))
-		return NULL;
-
-	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
-	if (!qdisc)
-		return NULL;
-
-	cops = qdisc->ops->cl_ops;
-	if (!cops)
-		return NULL;
-
-	if (!cops->tcf_block)
-		return NULL;
-
-	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
-}
-
-static struct rhashtable indr_setup_block_ht;
-
-struct tc_indr_block_dev {
-	struct rhash_head ht_node;
-	struct net_device *dev;
-	unsigned int refcnt;
-	struct list_head cb_list;
-	struct tcf_block *block;
-};
-
-struct tc_indr_block_cb {
-	struct list_head list;
-	void *cb_priv;
-	tc_indr_block_bind_cb_t *cb;
-	void *cb_ident;
-};
-
-static const struct rhashtable_params tc_indr_setup_block_ht_params = {
-	.key_offset	= offsetof(struct tc_indr_block_dev, dev),
-	.head_offset	= offsetof(struct tc_indr_block_dev, ht_node),
-	.key_len	= sizeof(struct net_device *),
-};
-
-static struct tc_indr_block_dev *
-tc_indr_block_dev_lookup(struct net_device *dev)
-{
-	return rhashtable_lookup_fast(&indr_setup_block_ht, &dev,
-				      tc_indr_setup_block_ht_params);
-}
-
-static struct tc_indr_block_dev *tc_indr_block_dev_get(struct net_device *dev)
-{
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (indr_dev)
-		goto inc_ref;
-
-	indr_dev = kzalloc(sizeof(*indr_dev), GFP_KERNEL);
-	if (!indr_dev)
-		return NULL;
-
-	INIT_LIST_HEAD(&indr_dev->cb_list);
-	indr_dev->dev = dev;
-	indr_dev->block = tc_dev_ingress_block(dev);
-	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-				   tc_indr_setup_block_ht_params)) {
-		kfree(indr_dev);
-		return NULL;
-	}
-
-inc_ref:
-	indr_dev->refcnt++;
-	return indr_dev;
-}
-
-static void tc_indr_block_dev_put(struct tc_indr_block_dev *indr_dev)
-{
-	if (--indr_dev->refcnt)
-		return;
-
-	rhashtable_remove_fast(&indr_setup_block_ht, &indr_dev->ht_node,
-			       tc_indr_setup_block_ht_params);
-	kfree(indr_dev);
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_lookup(struct tc_indr_block_dev *indr_dev,
-			tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
-		if (indr_block_cb->cb == cb &&
-		    indr_block_cb->cb_ident == cb_ident)
-			return indr_block_cb;
-	return NULL;
-}
-
-static struct tc_indr_block_cb *
-tc_indr_block_cb_add(struct tc_indr_block_dev *indr_dev, void *cb_priv,
-		     tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (indr_block_cb)
-		return ERR_PTR(-EEXIST);
-
-	indr_block_cb = kzalloc(sizeof(*indr_block_cb), GFP_KERNEL);
-	if (!indr_block_cb)
-		return ERR_PTR(-ENOMEM);
-
-	indr_block_cb->cb_priv = cb_priv;
-	indr_block_cb->cb = cb;
-	indr_block_cb->cb_ident = cb_ident;
-	list_add(&indr_block_cb->list, &indr_dev->cb_list);
-
-	return indr_block_cb;
-}
-
-static void tc_indr_block_cb_del(struct tc_indr_block_cb *indr_block_cb)
-{
-	list_del(&indr_block_cb->list);
-	kfree(indr_block_cb);
-}
-
 static int tcf_block_setup(struct tcf_block *block,
 			   struct flow_block_offload *bo);
 
-static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
-				  struct tc_indr_block_cb *indr_block_cb,
+static void tc_indr_block_ing_cmd(struct net_device *dev,
+				  struct flow_block *flow_block,
+				  struct flow_indr_block_cb *indr_block_cb,
 				  enum flow_block_command command)
 {
+	struct tcf_block *block = flow_block ?
+				  container_of(flow_block,
+					       struct tcf_block,
+					       flow_block) : NULL;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
-		.net		= dev_net(indr_dev->dev),
-		.block_shared	= tcf_block_non_null_shared(indr_dev->block),
+		.net		= dev_net(dev),
+		.block_shared	= tcf_block_non_null_shared(block),
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	if (!indr_dev->block)
-		return;
-
-	bo.block = &indr_dev->block->flow_block;
-
-	indr_block_cb->cb(indr_dev->dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
-			  &bo);
-	tcf_block_setup(indr_dev->block, &bo);
-}
-
-int __tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-				tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-	int err;
-
-	indr_dev = tc_indr_block_dev_get(dev);
-	if (!indr_dev)
-		return -ENOMEM;
-
-	indr_block_cb = tc_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
-	err = PTR_ERR_OR_ZERO(indr_block_cb);
-	if (err)
-		goto err_dev_put;
-
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_BIND);
-	return 0;
-
-err_dev_put:
-	tc_indr_block_dev_put(indr_dev);
-	return err;
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_register);
-
-int tc_indr_block_cb_register(struct net_device *dev, void *cb_priv,
-			      tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	int err;
-
-	rtnl_lock();
-	err = __tc_indr_block_cb_register(dev, cb_priv, cb, cb_ident);
-	rtnl_unlock();
-
-	return err;
-}
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_register);
-
-void __tc_indr_block_cb_unregister(struct net_device *dev,
-				   tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
-
-	indr_dev = tc_indr_block_dev_lookup(dev);
-	if (!indr_dev)
+	if (!block)
 		return;
 
-	indr_block_cb = tc_indr_block_cb_lookup(indr_dev, cb, cb_ident);
-	if (!indr_block_cb)
-		return;
+	bo.block = flow_block;
 
-	/* Send unbind message if required to free any block cbs. */
-	tc_indr_block_ing_cmd(indr_dev, indr_block_cb, FLOW_BLOCK_UNBIND);
-	tc_indr_block_cb_del(indr_block_cb);
-	tc_indr_block_dev_put(indr_dev);
-}
-EXPORT_SYMBOL_GPL(__tc_indr_block_cb_unregister);
+	indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK, &bo);
 
-void tc_indr_block_cb_unregister(struct net_device *dev,
-				 tc_indr_block_bind_cb_t *cb, void *cb_ident)
-{
-	rtnl_lock();
-	__tc_indr_block_cb_unregister(dev, cb, cb_ident);
-	rtnl_unlock();
+	tcf_block_setup(block, &bo);
 }
-EXPORT_SYMBOL_GPL(tc_indr_block_cb_unregister);
 
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
 			       struct netlink_ext_ack *extack)
 {
-	struct tc_indr_block_cb *indr_block_cb;
-	struct tc_indr_block_dev *indr_dev;
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
 	struct flow_block_offload bo = {
 		.command	= command,
 		.binder_type	= ei->binder_type,
@@ -784,11 +593,12 @@ static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 	};
 	INIT_LIST_HEAD(&bo.cb_list);
 
-	indr_dev = tc_indr_block_dev_lookup(dev);
+	indr_dev = flow_indr_block_dev_lookup(dev);
 	if (!indr_dev)
 		return;
 
-	indr_dev->block = command == FLOW_BLOCK_BIND ? block : NULL;
+	indr_dev->flow_block = command == FLOW_BLOCK_BIND ? &block->flow_block : NULL;
+	indr_dev->ing_cmd_cb = command == FLOW_BLOCK_BIND ? tc_indr_block_ing_cmd : NULL;
 
 	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
 		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
@@ -3358,11 +3168,6 @@ static int __init tc_filter_init(void)
 	if (err)
 		goto err_register_pernet_subsys;
 
-	err = rhashtable_init(&indr_setup_block_ht,
-			      &tc_indr_setup_block_ht_params);
-	if (err)
-		goto err_rhash_setup_block_ht;
-
 	rtnl_register(PF_UNSPEC, RTM_NEWTFILTER, tc_new_tfilter, NULL,
 		      RTNL_FLAG_DOIT_UNLOCKED);
 	rtnl_register(PF_UNSPEC, RTM_DELTFILTER, tc_del_tfilter, NULL,
@@ -3376,8 +3181,6 @@ static int __init tc_filter_init(void)
 
 	return 0;
 
-err_rhash_setup_block_ht:
-	unregister_pernet_subsys(&tcf_net_ops);
 err_register_pernet_subsys:
 	destroy_workqueue(tc_filter_wq);
 	return err;
-- 
1.8.3.1



* [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-28  6:52 [PATCH net-next v4 0/3] flow_offload: add indr-block in nf_table_offload wenxu
  2019-07-28  6:52 ` [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload wenxu
@ 2019-07-28  6:52 ` wenxu
  2019-07-28 20:16   ` Jakub Kicinski
  2019-07-28  6:52 ` [PATCH net-next v4 3/3] netfilter: nf_tables_offload: support indr block call wenxu
From: wenxu @ 2019-07-28  6:52 UTC (permalink / raw)
  To: pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev

From: wenxu <wenxu@ucloud.cn>

When the indr device registers, it can get the default block
from tc immediately if the block exists.
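
A sketch of the resulting flow in flow_indr_block_dev_get() (see the
diff below):

	indr_dev->dev = dev;
	tc_indr_get_default_block(indr_dev);	/* sets ->flow_block and
						 * ->ing_cmd_cb if the dev
						 * already has an ingress
						 * tcf_block */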

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v3: no change
v4: get tc default block without callback

 include/net/pkt_cls.h   |  7 +++++++
 net/core/flow_offload.c |  2 ++
 net/sched/cls_api.c     | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 42 insertions(+)

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 0790a4e..77c3a42 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -54,6 +54,8 @@ int tcf_block_get_ext(struct tcf_block **p_block, struct Qdisc *q,
 void tcf_block_put_ext(struct tcf_block *block, struct Qdisc *q,
 		       struct tcf_block_ext_info *ei);
 
+void tc_indr_get_default_block(struct flow_indr_block_dev *indr_dev);
+
 static inline bool tcf_block_shared(struct tcf_block *block)
 {
 	return block->index;
@@ -74,6 +76,11 @@ int tcf_classify(struct sk_buff *skb, const struct tcf_proto *tp,
 		 struct tcf_result *res, bool compat_mode);
 
 #else
+static inline
+void tc_indr_get_default_block(struct flow_indr_block_dev *indr_dev)
+{
+}
+
 static inline bool tcf_block_shared(struct tcf_block *block)
 {
 	return false;
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 9f1ae67..0ca3d51 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -3,6 +3,7 @@
 #include <linux/slab.h>
 #include <net/flow_offload.h>
 #include <linux/rtnetlink.h>
+#include <net/pkt_cls.h>
 
 struct flow_rule *flow_rule_alloc(unsigned int num_actions)
 {
@@ -312,6 +313,7 @@ static struct flow_indr_block_dev *flow_indr_block_dev_get(struct net_device *de
 
 	INIT_LIST_HEAD(&indr_dev->cb_list);
 	indr_dev->dev = dev;
+	tc_indr_get_default_block(indr_dev);
 	if (rhashtable_insert_fast(&indr_setup_block_ht, &indr_dev->ht_node,
 				   flow_indr_setup_block_ht_params)) {
 		kfree(indr_dev);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index d551c56..59e9572 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -576,6 +576,39 @@ static void tc_indr_block_ing_cmd(struct net_device *dev,
 	tcf_block_setup(block, &bo);
 }
 
+static struct tcf_block *tc_dev_ingress_block(struct net_device *dev)
+{
+	const struct Qdisc_class_ops *cops;
+	struct Qdisc *qdisc;
+
+	if (!dev_ingress_queue(dev))
+		return NULL;
+
+	qdisc = dev_ingress_queue(dev)->qdisc_sleeping;
+	if (!qdisc)
+		return NULL;
+
+	cops = qdisc->ops->cl_ops;
+	if (!cops)
+		return NULL;
+
+	if (!cops->tcf_block)
+		return NULL;
+
+	return cops->tcf_block(qdisc, TC_H_MIN_INGRESS, NULL);
+}
+
+void tc_indr_get_default_block(struct flow_indr_block_dev *indr_dev)
+{
+	struct tcf_block *block = tc_dev_ingress_block(indr_dev->dev);
+
+	if (block) {
+		indr_dev->flow_block = &block->flow_block;
+		indr_dev->ing_cmd_cb = tc_indr_block_ing_cmd;
+	}
+}
+EXPORT_SYMBOL(tc_indr_get_default_block);
+
 static void tc_indr_block_call(struct tcf_block *block, struct net_device *dev,
 			       struct tcf_block_ext_info *ei,
 			       enum flow_block_command command,
-- 
1.8.3.1



* [PATCH net-next v4 3/3] netfilter: nf_tables_offload: support indr block call
  2019-07-28  6:52 [PATCH net-next v4 0/3] flow_offload: add indr-block in nf_table_offload wenxu
  2019-07-28  6:52 ` [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload wenxu
  2019-07-28  6:52 ` [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately wenxu
@ 2019-07-28  6:52 ` wenxu
From: wenxu @ 2019-07-28  6:52 UTC (permalink / raw)
  To: pablo, fw, jakub.kicinski; +Cc: netfilter-devel, netdev

From: wenxu <wenxu@ucloud.cn>

Add indr-block call support to nftables. This lets nftables offload
rules on vlan and tunnel devices.

nft add table netdev firewall
nft add chain netdev firewall aclout { type filter hook ingress offload device mlx_pf0vf0 priority - 300 \; }
nft add rule netdev firewall aclout ip daddr 10.0.0.1 fwd to vlan0
nft add chain netdev firewall aclin { type filter hook ingress device vlan0 priority - 300 \; }
nft add rule netdev firewall aclin ip daddr 10.0.0.7 fwd to mlx_pf0vf0
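
With this patch nft_flow_offload_chain() dispatches on whether the
device can offload directly (the tail of the function, from the diff
below):

	if (dev->netdev_ops->ndo_setup_tc)
		return nft_block_offload_cmd(basechain, dev, cmd);
	else
		return nft_indr_block_offload_cmd(basechain, dev, cmd);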

Signed-off-by: wenxu <wenxu@ucloud.cn>
---
v3: subsys_initcall for init_flow_indr_rhashtable
v4: guarantee only one offload base chain is used per indr dev.
If the indr_block_cmd bind fails, return unsupported.

 net/netfilter/nf_tables_offload.c | 131 +++++++++++++++++++++++++++++++-------
 1 file changed, 107 insertions(+), 24 deletions(-)

diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 64f5fd5..19214ad 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -171,24 +171,123 @@ static int nft_flow_offload_unbind(struct flow_block_offload *bo,
 	return 0;
 }
 
+static int nft_block_setup(struct nft_base_chain *basechain,
+			   struct flow_block_offload *bo,
+			   enum flow_block_command cmd)
+{
+	int err;
+
+	switch (cmd) {
+	case FLOW_BLOCK_BIND:
+		err = nft_flow_offload_bind(bo, basechain);
+		break;
+	case FLOW_BLOCK_UNBIND:
+		err = nft_flow_offload_unbind(bo, basechain);
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		err = -EOPNOTSUPP;
+	}
+
+	return err;
+}
+
+static int nft_block_offload_cmd(struct nft_base_chain *chain,
+				 struct net_device *dev,
+				 enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	int err;
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
+	if (err < 0)
+		return err;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
+static void nft_indr_block_ing_cmd(struct net_device *dev,
+				   struct flow_block *flow_block,
+				   struct flow_indr_block_cb *indr_block_cb,
+				   enum flow_block_command cmd)
+{
+	struct netlink_ext_ack extack = {};
+	struct flow_block_offload bo = {};
+	struct nft_base_chain *chain;
+
+	if (!flow_block)
+		return;
+
+	chain = container_of(flow_block, struct nft_base_chain, flow_block);
+
+	bo.net = dev_net(dev);
+	bo.block = flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK, &bo);
+
+	nft_block_setup(chain, &bo, cmd);
+}
+
+static int nft_indr_block_offload_cmd(struct nft_base_chain *chain,
+				      struct net_device *dev,
+				      enum flow_block_command cmd)
+{
+	struct flow_indr_block_cb *indr_block_cb;
+	struct flow_indr_block_dev *indr_dev;
+	struct flow_block_offload bo = {};
+	struct netlink_ext_ack extack = {};
+
+	bo.net = dev_net(dev);
+	bo.block = &chain->flow_block;
+	bo.command = cmd;
+	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
+	bo.extack = &extack;
+	INIT_LIST_HEAD(&bo.cb_list);
+
+	indr_dev = flow_indr_block_dev_lookup(dev);
+	if (!indr_dev)
+		return -EOPNOTSUPP;
+
+	indr_dev->flow_block = cmd == FLOW_BLOCK_BIND ? &chain->flow_block : NULL;
+	indr_dev->ing_cmd_cb = cmd == FLOW_BLOCK_BIND ? nft_indr_block_ing_cmd : NULL;
+
+	list_for_each_entry(indr_block_cb, &indr_dev->cb_list, list)
+		indr_block_cb->cb(dev, indr_block_cb->cb_priv, TC_SETUP_BLOCK,
+				  &bo);
+
+	if (list_empty(&bo.cb_list))
+		return -EOPNOTSUPP;
+
+	return nft_block_setup(chain, &bo, cmd);
+}
+
 #define FLOW_SETUP_BLOCK TC_SETUP_BLOCK
 
 static int nft_flow_offload_chain(struct nft_trans *trans,
 				  enum flow_block_command cmd)
 {
 	struct nft_chain *chain = trans->ctx.chain;
-	struct netlink_ext_ack extack = {};
-	struct flow_block_offload bo = {};
 	struct nft_base_chain *basechain;
 	struct net_device *dev;
-	int err;
 
 	if (!nft_is_base_chain(chain))
 		return -EOPNOTSUPP;
 
 	basechain = nft_base_chain(chain);
 	dev = basechain->ops.dev;
-	if (!dev || !dev->netdev_ops->ndo_setup_tc)
+	if (!dev)
 		return -EOPNOTSUPP;
 
 	/* Only default policy to accept is supported for now. */
@@ -197,26 +296,10 @@ static int nft_flow_offload_chain(struct nft_trans *trans,
 	    nft_trans_chain_policy(trans) != NF_ACCEPT)
 		return -EOPNOTSUPP;
 
-	bo.command = cmd;
-	bo.block = &basechain->flow_block;
-	bo.binder_type = FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS;
-	bo.extack = &extack;
-	INIT_LIST_HEAD(&bo.cb_list);
-
-	err = dev->netdev_ops->ndo_setup_tc(dev, FLOW_SETUP_BLOCK, &bo);
-	if (err < 0)
-		return err;
-
-	switch (cmd) {
-	case FLOW_BLOCK_BIND:
-		err = nft_flow_offload_bind(&bo, basechain);
-		break;
-	case FLOW_BLOCK_UNBIND:
-		err = nft_flow_offload_unbind(&bo, basechain);
-		break;
-	}
-
-	return err;
+	if (dev->netdev_ops->ndo_setup_tc)
+		return nft_block_offload_cmd(basechain, dev, cmd);
+	else
+		return nft_indr_block_offload_cmd(basechain, dev, cmd);
 }
 
 int nft_flow_rule_offload_commit(struct net *net)
-- 
1.8.3.1



* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-28  6:52 ` [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately wenxu
@ 2019-07-28 20:16   ` Jakub Kicinski
  2019-07-29  2:43     ` wenxu
From: Jakub Kicinski @ 2019-07-28 20:16 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, netfilter-devel, netdev

On Sun, 28 Jul 2019 14:52:48 +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> When the indr device registers, it can get the default block
> from tc immediately if the block exists.
> 
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
> v3: no change
> v4: get tc default block without callback

Please stop reposting new versions of the patches while discussion is
ongoing, it makes it harder to follow.

The TC default block is there because the indirect registration may
happen _after_ the block is installed and populated.  It's the device
driver that usually does the indirect registration; the tunnel device
and its rules may already be set up when the device driver is loaded
or reloaded.
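
That is what the bind-time replay in patch 1 is for; from
__flow_indr_block_cb_register():

	if (indr_dev->ing_cmd_cb)
		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block,
				     indr_block_cb, FLOW_BLOCK_BIND);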

I don't know the nft code, but it seems unlikely it wouldn't have the
same problem/need..

Please explain.


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-28 20:16   ` Jakub Kicinski
@ 2019-07-29  2:43     ` wenxu
  2019-07-29  4:42       ` Jakub Kicinski
From: wenxu @ 2019-07-29  2:43 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: pablo, fw, netfilter-devel, netdev


On 7/29/2019 4:16 AM, Jakub Kicinski wrote:
> .
> The TC default block is there because the indirect registration may
> happen _after_ the block is installed and populated.  It's the device
> driver that usually does the indirect registration; the tunnel device
> and its rules may already be set up when the device driver is loaded
> or reloaded.
Yes, I know this scenario.
> I don't know the nft code, but it seems unlikely it wouldn't have the
> same problem/need..

nft doesn't have the same problem.  The offload rule can only be
attached to an offload base chain.

The offload base chain is created after the device driver is loaded
(the device exists).

>


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29  2:43     ` wenxu
@ 2019-07-29  4:42       ` Jakub Kicinski
  2019-07-29  7:05         ` wenxu
  2019-07-29  7:18         ` wenxu
From: Jakub Kicinski @ 2019-07-29  4:42 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, netfilter-devel, netdev

On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:
> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:
> > I don't know the nft code, but it seems unlikely it wouldn't have the
> > same problem/need..  
> 
> nft doesn't have the same problem.  The offload rule can only be
> attached to an offload base chain.
>
> The offload base chain is created after the device driver is loaded
> (the device exists).

For indirect blocks the block is on the tunnel device and the offload
target is another device. E.g. you offload rules from a VXLAN device
onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
device is created.

So I feel like either the chain somehow directly references the offload
target (in which case the indirect infrastructure with hash lookup etc
is not needed for nft), or indirect infra is needed, and we need to take
care of replays.


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29  4:42       ` Jakub Kicinski
@ 2019-07-29  7:05         ` wenxu
  2019-07-29 16:51           ` Jakub Kicinski
  2019-07-29  7:18         ` wenxu
From: wenxu @ 2019-07-29  7:05 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: pablo, fw, netfilter-devel, netdev


On 7/29/2019 12:42 PM, Jakub Kicinski wrote:
> On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:
>> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:
>>> I don't know the nft code, but it seems unlikely it wouldn't have the
>>> same problem/need..  
>> nft doesn't have the same problem.  The offload rule can only be
>> attached to an offload base chain.
>>
>> The offload base chain is created after the device driver is loaded
>> (the device exists).
> For indirect blocks the block is on the tunnel device and the offload
> target is another device. E.g. you offload rules from a VXLAN device
> onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
> device is created.
>
> So I feel like either the chain somehow directly references the offload
> target (in which case the indirect infrastructure with hash lookup etc
> is not needed for nft), or indirect infra is needed, and we need to take
> care of replays.

So you mean the case where there are two cards, A and B, that can both
offload vxlan.

First the vxlan device is offloaded with A, and then the B driver is
loaded, so the rules should be replayed to device B?





* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29  4:42       ` Jakub Kicinski
  2019-07-29  7:05         ` wenxu
@ 2019-07-29  7:18         ` wenxu
  2019-07-29 16:55           ` Jakub Kicinski
From: wenxu @ 2019-07-29  7:18 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: pablo, fw, netfilter-devel, netdev


On 7/29/2019 12:42 PM, Jakub Kicinski wrote:
> On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:
>> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:
>>> I don't know the nft code, but it seems unlikely it wouldn't have the
>>> same problem/need..  
>> nft doesn't have the same problem.  The offload rule can only be
>> attached to an offload base chain.
>>
>> The offload base chain is created after the device driver is loaded
>> (the device exists).
> For indirect blocks the block is on the tunnel device and the offload
> target is another device. E.g. you offload rules from a VXLAN device
> onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
> device is created.
>
> So I feel like either the chain somehow directly references the offload
> target (in which case the indirect infrastructure with hash lookup etc
> is not needed for nft), or indirect infra is needed, and we need to take
> care of replays.

I think nft is different from tc.

In the tc case we can create a vxlan device and add an ingress qdisc
with a block successfully.  Then the ASIC driver is loaded, registers
the vxlan indr-dev, gets the block and replays it to hardware.

But in the nft case, the base chain is flagged with offload.  Creating
an offload netdev base chain on a vxlan device will fail if there is
no indr-device to offload to.
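
That is the early -EOPNOTSUPP path in nft_indr_block_offload_cmd()
from patch 3:

	indr_dev = flow_indr_block_dev_lookup(dev);
	if (!indr_dev)
		return -EOPNOTSUPP;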



* Re: [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload
  2019-07-28  6:52 ` [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload wenxu
@ 2019-07-29 11:13   ` Jiri Pirko
  2019-07-29 12:47     ` wenxu
  2019-07-29 11:15   ` Jiri Pirko
From: Jiri Pirko @ 2019-07-29 11:13 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev

Sun, Jul 28, 2019 at 08:52:47AM CEST, wenxu@ucloud.cn wrote:
>From: wenxu <wenxu@ucloud.cn>
>
>move tc indirect block to flow_offload and rename
>it to flow indirect block.The nf_tables can use the
>indr block architecture.
>
>Signed-off-by: wenxu <wenxu@ucloud.cn>
>---
>v3: subsys_initcall for init_flow_indr_rhashtable
>v4: no change
>

[...]


>diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>index 00b9aab..66f89bc 100644
>--- a/include/net/flow_offload.h
>+++ b/include/net/flow_offload.h
>@@ -4,6 +4,7 @@
> #include <linux/kernel.h>
> #include <linux/list.h>
> #include <net/flow_dissector.h>
>+#include <linux/rhashtable.h>
> 
> struct flow_match {
> 	struct flow_dissector	*dissector;
>@@ -366,4 +367,42 @@ static inline void flow_block_init(struct flow_block *flow_block)
> 	INIT_LIST_HEAD(&flow_block->cb_list);
> }
> 
>+typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
>+				      enum tc_setup_type type, void *type_data);
>+
>+struct flow_indr_block_cb {
>+	struct list_head list;
>+	void *cb_priv;
>+	flow_indr_block_bind_cb_t *cb;
>+	void *cb_ident;
>+};

I don't understand why you are pushing this struct out of the c file to
the header. Please don't.


>+
>+typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
>+				       struct flow_block *flow_block,
>+				       struct flow_indr_block_cb *indr_block_cb,
>+				       enum flow_block_command command);
>+
>+struct flow_indr_block_dev {
>+	struct rhash_head ht_node;
>+	struct net_device *dev;
>+	unsigned int refcnt;
>+	struct list_head cb_list;
>+	flow_indr_block_ing_cmd_t *ing_cmd_cb;
>+	struct flow_block *flow_block;

I don't understand why you are pushing this struct out of the c file to
the header. Please don't.


>+};
>+
>+struct flow_indr_block_dev *flow_indr_block_dev_lookup(struct net_device *dev);
>+
>+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
>+				  flow_indr_block_bind_cb_t *cb, void *cb_ident);
>+
>+void __flow_indr_block_cb_unregister(struct net_device *dev,
>+				     flow_indr_block_bind_cb_t *cb, void *cb_ident);
>+
>+int flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
>+				flow_indr_block_bind_cb_t *cb, void *cb_ident);
>+
>+void flow_indr_block_cb_unregister(struct net_device *dev,
>+				   flow_indr_block_bind_cb_t *cb, void *cb_ident);
>+
	
[...]

	
>+
>+int __flow_indr_block_cb_register(struct net_device *dev, void *cb_priv,
>+				  flow_indr_block_bind_cb_t *cb,
>+				  void *cb_ident)
>+{
>+	struct flow_indr_block_cb *indr_block_cb;
>+	struct flow_indr_block_dev *indr_dev;
>+	int err;
>+
>+	indr_dev = flow_indr_block_dev_get(dev);
>+	if (!indr_dev)
>+		return -ENOMEM;
>+
>+	indr_block_cb = flow_indr_block_cb_add(indr_dev, cb_priv, cb, cb_ident);
>+	err = PTR_ERR_OR_ZERO(indr_block_cb);
>+	if (err)
>+		goto err_dev_put;
>+
>+	if (indr_dev->ing_cmd_cb)
>+		indr_dev->ing_cmd_cb(indr_dev->dev, indr_dev->flow_block, indr_block_cb,

This line is over 80cols. Please run checkpatch script for your patch
and obey the warnings.


>+				     FLOW_BLOCK_BIND);
>+
>+	return 0;
>+
>+err_dev_put:
>+	flow_indr_block_dev_put(indr_dev);
>+	return err;
>+}
>+EXPORT_SYMBOL_GPL(__flow_indr_block_cb_register);

[...]


> 
>-static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
>-				  struct tc_indr_block_cb *indr_block_cb,
>+static void tc_indr_block_ing_cmd(struct net_device *dev,

I don't understand why you change struct tc_indr_block_dev * to
struct net_device * here. If you want to do that, please do that in a
separate patch, not in this one where only "the move" should happen.


>+				  struct flow_block *flow_block,
>+				  struct flow_indr_block_cb *indr_block_cb,
> 				  enum flow_block_command command)
> {
>+	struct tcf_block *block = flow_block ?
>+				  container_of(flow_block,
>+					       struct tcf_block,
>+					       flow_block) : NULL;
> 	struct flow_block_offload bo = {
> 		.command	= command,
> 		.binder_type	= FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS,
>-		.net		= dev_net(indr_dev->dev),
>-		.block_shared	= tcf_block_non_null_shared(indr_dev->block),
>+		.net		= dev_net(dev),
>+		.block_shared	= tcf_block_non_null_shared(block),
> 	};
> 	INIT_LIST_HEAD(&bo.cb_list);
> 

[...]


* Re: [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload
  2019-07-28  6:52 ` [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload wenxu
  2019-07-29 11:13   ` Jiri Pirko
@ 2019-07-29 11:15   ` Jiri Pirko
From: Jiri Pirko @ 2019-07-29 11:15 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev

Sun, Jul 28, 2019 at 08:52:47AM CEST, wenxu@ucloud.cn wrote:
>From: wenxu <wenxu@ucloud.cn>
>
>move tc indirect block to flow_offload and rename

A sentence should start with a capital letter.


>it to flow indirect block.The nf_tables can use the

There should be a space between "." and the first letter of the next
sentence.


>indr block architecture.
>

[...]


* Re: [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload
  2019-07-29 11:13   ` Jiri Pirko
@ 2019-07-29 12:47     ` wenxu
  2019-07-29 13:13       ` Jiri Pirko
From: wenxu @ 2019-07-29 12:47 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev


在 2019/7/29 19:13, Jiri Pirko 写道:
> Sun, Jul 28, 2019 at 08:52:47AM CEST, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> move tc indirect block to flow_offload and rename
>> it to flow indirect block.The nf_tables can use the
>> indr block architecture.
>>
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>> ---
>> v3: subsys_initcall for init_flow_indr_rhashtable
>> v4: no change
>>
> [...]
>
>
>> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>> index 00b9aab..66f89bc 100644
>> --- a/include/net/flow_offload.h
>> +++ b/include/net/flow_offload.h
>> @@ -4,6 +4,7 @@
>> #include <linux/kernel.h>
>> #include <linux/list.h>
>> #include <net/flow_dissector.h>
>> +#include <linux/rhashtable.h>
>>
>> struct flow_match {
>> 	struct flow_dissector	*dissector;
>> @@ -366,4 +367,42 @@ static inline void flow_block_init(struct flow_block *flow_block)
>> 	INIT_LIST_HEAD(&flow_block->cb_list);
>> }
>>
>> +typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
>> +				      enum tc_setup_type type, void *type_data);
>> +
>> +struct flow_indr_block_cb {
>> +	struct list_head list;
>> +	void *cb_priv;
>> +	flow_indr_block_bind_cb_t *cb;
>> +	void *cb_ident;
>> +};
> I don't understand why you are pushing this struct out of the c file to
> the header. Please don't.
>
>
>> +
>> +typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
>> +				       struct flow_block *flow_block,
>> +				       struct flow_indr_block_cb *indr_block_cb,
>> +				       enum flow_block_command command);
>> +
>> +struct flow_indr_block_dev {
>> +	struct rhash_head ht_node;
>> +	struct net_device *dev;
>> +	unsigned int refcnt;
>> +	struct list_head cb_list;
>> +	flow_indr_block_ing_cmd_t *ing_cmd_cb;
>> +	struct flow_block *flow_block;
> I don't understand why you are pushing this struct out of the c file to
> the header. Please don't.

the flow_indr_block_dev and flow_indr_block_cb structs in the .h file are used by the function

tc_indr_block_ing_cmd() in cls_api.c

>> -static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
>> -				  struct tc_indr_block_cb *indr_block_cb,
>> +static void tc_indr_block_ing_cmd(struct net_device *dev,
> I don't understand why you change struct tc_indr_block_dev * to
> struct net_device * here. If you want to do that, please do that in a
> separate patch, not in this one where only "the move" should happen.
>


* Re: [PATCH net-next v4 1/3] flow_offload: move tc indirect block to flow offload
  2019-07-29 12:47     ` wenxu
@ 2019-07-29 13:13       ` Jiri Pirko
From: Jiri Pirko @ 2019-07-29 13:13 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, jakub.kicinski, netfilter-devel, netdev

Mon, Jul 29, 2019 at 02:47:07PM CEST, wenxu@ucloud.cn wrote:
>
>在 2019/7/29 19:13, Jiri Pirko 写道:
>> Sun, Jul 28, 2019 at 08:52:47AM CEST, wenxu@ucloud.cn wrote:
>>> From: wenxu <wenxu@ucloud.cn>
>>>
>>> move tc indirect block to flow_offload and rename
>>> it to flow indirect block.The nf_tables can use the
>>> indr block architecture.
>>>
>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>> ---
>>> v3: subsys_initcall for init_flow_indr_rhashtable
>>> v4: no change
>>>
>> [...]
>>
>>
>>> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
>>> index 00b9aab..66f89bc 100644
>>> --- a/include/net/flow_offload.h
>>> +++ b/include/net/flow_offload.h
>>> @@ -4,6 +4,7 @@
>>> #include <linux/kernel.h>
>>> #include <linux/list.h>
>>> #include <net/flow_dissector.h>
>>> +#include <linux/rhashtable.h>
>>>
>>> struct flow_match {
>>> 	struct flow_dissector	*dissector;
>>> @@ -366,4 +367,42 @@ static inline void flow_block_init(struct flow_block *flow_block)
>>> 	INIT_LIST_HEAD(&flow_block->cb_list);
>>> }
>>>
>>> +typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
>>> +				      enum tc_setup_type type, void *type_data);
>>> +
>>> +struct flow_indr_block_cb {
>>> +	struct list_head list;
>>> +	void *cb_priv;
>>> +	flow_indr_block_bind_cb_t *cb;
>>> +	void *cb_ident;
>>> +};
>> I don't understand why you are pushing this struct out of the c file to
>> the header. Please don't.
>>
>>
>>> +
>>> +typedef void flow_indr_block_ing_cmd_t(struct net_device *dev,
>>> +				       struct flow_block *flow_block,
>>> +				       struct flow_indr_block_cb *indr_block_cb,
>>> +				       enum flow_block_command command);
>>> +
>>> +struct flow_indr_block_dev {
>>> +	struct rhash_head ht_node;
>>> +	struct net_device *dev;
>>> +	unsigned int refcnt;
>>> +	struct list_head cb_list;
>>> +	flow_indr_block_ing_cmd_t *ing_cmd_cb;
>>> +	struct flow_block *flow_block;
>> I don't understand why you are pushing this struct out of the c file to
>> the header. Please don't.
>
>the flow_indr_block_dev and flow_indr_block_cb structs in the .h file are used by the function

You don't need it, same as before. Please don't expose this struct.


>
>tc_indr_block_ing_cmd() in cls_api.c
>
>>> -static void tc_indr_block_ing_cmd(struct tc_indr_block_dev *indr_dev,
>>> -				  struct tc_indr_block_cb *indr_block_cb,
>>> +static void tc_indr_block_ing_cmd(struct net_device *dev,
>> I don't understand why you change struct tc_indr_block_dev * to
>> struct net_device * here. If you want to do that, please do that in a
>> separate patch, not in this one where only "the move" should happen.


Did you see the rest of my comments???



>>


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29  7:05         ` wenxu
@ 2019-07-29 16:51           ` Jakub Kicinski
From: Jakub Kicinski @ 2019-07-29 16:51 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, netfilter-devel, netdev

On Mon, 29 Jul 2019 15:05:34 +0800, wenxu wrote:
> On 7/29/2019 12:42 PM, Jakub Kicinski wrote:
> > On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:  
> >> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:  
> >>> I don't know the nft code, but it seems unlikely it wouldn't have the
> >>> same problem/need..    
> >> nft doesn't have the same problem.  The offload rule can only be
> >> attached to an offload base chain.
> >>
> >> The offload base chain is created after the device driver is loaded
> >> (the device exists).
> > For indirect blocks the block is on the tunnel device and the offload
> > target is another device. E.g. you offload rules from a VXLAN device
> > onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
> > device is created.
> >
> > So I feel like either the chain somehow directly references the offload
> > target (in which case the indirect infrastructure with hash lookup etc
> > is not needed for nft), or indirect infra is needed, and we need to take
> > care of replays.  
> 
> So you mean the case where there are two cards, A and B, that can both
> offload vxlan.
>
> First the vxlan device is offloaded with A, and then the B driver is
> loaded, so the rules should be replayed to device B?

That'd be one example, yes.


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29  7:18         ` wenxu
@ 2019-07-29 16:55           ` Jakub Kicinski
  2019-07-30  0:10             ` wenxu
From: Jakub Kicinski @ 2019-07-29 16:55 UTC (permalink / raw)
  To: wenxu; +Cc: pablo, fw, netfilter-devel, netdev

On Mon, 29 Jul 2019 15:18:03 +0800, wenxu wrote:
> On 7/29/2019 12:42 PM, Jakub Kicinski wrote:
> > On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:  
> >> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:  
> >>> I don't know the nft code, but it seems unlikely it wouldn't have the
> >>> same problem/need..    
> >> nft doesn't have the same problem.  The offload rule can only be
> >> attached to an offload base chain.
> >>
> >> The offload base chain is created after the device driver is loaded
> >> (the device exists).
> > For indirect blocks the block is on the tunnel device and the offload
> > target is another device. E.g. you offload rules from a VXLAN device
> > onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
> > device is created.
> >
> > So I feel like either the chain somehow directly references the offload
> > target (in which case the indirect infrastructure with hash lookup etc
> > is not needed for nft), or indirect infra is needed, and we need to take
> > care of replays.  
> 
> I think nft is different from tc.
>
> In the tc case we can create a vxlan device and add an ingress qdisc
> with a block successfully.  Then the ASIC driver is loaded, registers
> the vxlan indr-dev, gets the block and replays it to hardware.
>
> But in the nft case, the base chain is flagged with offload.  Creating
> an offload netdev base chain on a vxlan device will fail if there is
> no indr-device to offload to.

Can you show us the offload chain spec? Does it specify offload to the
vxlan device or the ASIC device?

Indir-devs can come and go, how do you handle a situation where offload
chain was installed with indir listener present, but then the ASIC
driver got removed?


* Re: [PATCH net-next v4 2/3] flow_offload: Support get default block from tc immediately
  2019-07-29 16:55           ` Jakub Kicinski
@ 2019-07-30  0:10             ` wenxu
From: wenxu @ 2019-07-30  0:10 UTC (permalink / raw)
  To: Jakub Kicinski; +Cc: pablo, fw, netfilter-devel, netdev


在 2019/7/30 0:55, Jakub Kicinski 写道:
> On Mon, 29 Jul 2019 15:18:03 +0800, wenxu wrote:
>> On 7/29/2019 12:42 PM, Jakub Kicinski wrote:
>>> On Mon, 29 Jul 2019 10:43:56 +0800, wenxu wrote:  
>>>> On 7/29/2019 4:16 AM, Jakub Kicinski wrote:  
>>>>> I don't know the nft code, but it seems unlikely it wouldn't have the
>>>>> same problem/need..    
>>>> nft don't have the same problem.  The offload rule can only attached
>>>> to offload base chain.
>>>>
>>>> Th  offload base chain is created after the device driver loaded (the
>>>> device exist).  
>>> For indirect blocks the block is on the tunnel device and the offload
>>> target is another device. E.g. you offload rules from a VXLAN device
>>> onto the ASIC. The ASICs driver does not have to be loaded when VXLAN
>>> device is created.
>>>
>>> So I feel like either the chain somehow directly references the offload
>>> target (in which case the indirect infrastructure with hash lookup etc
>>> is not needed for nft), or indirect infra is needed, and we need to take
>>> care of replays.  
>> I think the nft is different with tc. 
>>
>> In tc case we can create vxlan device add a ingress qdisc with a block success
>>
>> Then the ASIC driver loaded,  then register the vxlan indr-dev and get the block
>> adn replay it to hardware
>>
>> But in the nft case,  The base chain flags with offload. Create an offload netdev
>> base chain on vxlan device will fail if there is no indr-device to offload.
> Can you show us the offload chain spec? Does it specify offload to the
> vxlan device or the ASIC device?

nft add chain netdev firewall aclout { type filter hook ingress offload device vxlan0 priority - 300 \; }

>
> Indir-devs can come and go, how do you handle a situation where offload
> chain was installed with indir listener present, but then the ASIC
> driver got removed?

Yes, I think nft also needs to get the default block in the
indr-register-cb, for the case where the driver goes away and is
reloaded again.

>
