netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net v3 0/4] several fixes for indirect flow_blocks offload
@ 2020-06-16  3:19 wenxu
  2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
                   ` (3 more replies)
  0 siblings, 4 replies; 24+ messages in thread
From: wenxu @ 2020-06-16  3:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, pablo, vladbu

From: wenxu <wenxu@ucloud.cn>

v2:
patch2: store the cb_priv of representor to the flow_block_cb->indr.cb_priv
in the driver. And make the correct check with the statments
this->indr.cb_priv == cb_priv

patch4: del the driver list only in the indriect cleanup callbacks

v3:
add the cover letter and changlogs.

This series fixes commit 1fac52da5942 ("net: flow_offload: consolidate
indirect flow_block infrastructure") that revists the flow_block
infrastructure.

The first patch fix the miss cleanup for flowtable indirect flow_block. 
The cleanup operation based on the setup callback. But in the mlx5e
driver there are tc and flowtable indrict setup callback and shared
the same release callbacks. So when the representor is removed,
then identify the indirect flow_blocks that need to be removed by 
the release callback.

The second patch fix the incorrect cb_priv check in flow_block_cb.
In the function __flow_block_indr_cleanup, stataments
this->cb_priv == cb_priv is always false(the flow_block_cb->cb_priv
is totally different data with the flow_indr_dev->cb_priv). So there
will always miss cleanup when the HW goaway and lead the memory leak.

After fix the first two patches. When the HW goaway, the indirect
flow_block can be cleanup. But It takes another two problem.


The third patch fix block->nooffloaddevcnt warning dmesg log.
When a indr device add in offload success. The block->nooffloaddevcnt
should be 0. After the representor go away. When the dir device go away
the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
demesg log. 
The block->nooffloaddevcnt should always count for indr block.
even the indr block offload successful. The representor maybe
gone away and the ingress qdisc can work in software mode.

The last patch fix the list_del corruption in the driver list.
When a indr device add in offload success. After the representor
go away. All the flow_block_cb cleanup but miss del form driver
list.

All the problem can be reproduce through making real hw go away
after setup the block offoaded.

wenxu (4):
  flow_offload: fix incorrect cleanup for indirect flow_blocks
  flow_offload: fix incorrect cb_priv check for flow_block_cb
  net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  flow_offload: fix the list_del corruption in the driver list

 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c       |  3 ++-
 .../net/ethernet/mellanox/mlx5/core/en/rep/tc.c    |  4 ++--
 drivers/net/ethernet/netronome/nfp/flower/main.c   |  2 +-
 drivers/net/ethernet/netronome/nfp/flower/main.h   |  3 +--
 .../net/ethernet/netronome/nfp/flower/offload.c    |  7 +++---
 include/net/flow_offload.h                         |  3 ++-
 net/core/flow_offload.c                            | 11 +++++-----
 net/netfilter/nf_flow_table_offload.c              |  1 +
 net/netfilter/nf_tables_offload.c                  |  1 +
 net/sched/cls_api.c                                | 25 +++++++++++++---------
 10 files changed, 35 insertions(+), 25 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks
  2020-06-16  3:19 [PATCH net v3 0/4] several fixes for indirect flow_blocks offload wenxu
@ 2020-06-16  3:19 ` wenxu
  2020-06-16 15:55   ` Simon Horman
  2020-06-16 20:11   ` Pablo Neira Ayuso
  2020-06-16  3:19 ` [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb wenxu
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 24+ messages in thread
From: wenxu @ 2020-06-16  3:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, pablo, vladbu

From: wenxu <wenxu@ucloud.cn>

The cleanup operation based on the setup callback. But in the mlx5e
driver there are tc and flowtable indrict setup callback and shared
the same release callbacks. So when the representor is removed,
then identify the indirect flow_blocks that need to be removed by  
the release callback.

Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 2 +-
 drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/main.c    | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/main.h    | 3 +--
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 6 +++---
 include/net/flow_offload.h                          | 2 +-
 net/core/flow_offload.c                             | 9 +++++----
 7 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 0eef4f5..ef7f6bc 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -2074,7 +2074,7 @@ void bnxt_shutdown_tc(struct bnxt *bp)
 		return;
 
 	flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
-				 bnxt_tc_setup_indr_block_cb);
+				 bnxt_tc_setup_indr_rel);
 	rhashtable_destroy(&tc_info->flow_table);
 	rhashtable_destroy(&tc_info->l2_table);
 	rhashtable_destroy(&tc_info->decap_l2_table);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index 80713123..a62bcf0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -496,7 +496,7 @@ int mlx5e_rep_tc_netdevice_event_register(struct mlx5e_rep_priv *rpriv)
 void mlx5e_rep_tc_netdevice_event_unregister(struct mlx5e_rep_priv *rpriv)
 {
 	flow_indr_dev_unregister(mlx5e_rep_indr_setup_cb, rpriv,
-				 mlx5e_rep_indr_setup_tc_cb);
+				 mlx5e_rep_indr_block_unbind);
 }
 
 #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index c393276..bb448c8 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -861,7 +861,7 @@ static void nfp_flower_clean(struct nfp_app *app)
 	flush_work(&app_priv->cmsg_work);
 
 	flow_indr_dev_unregister(nfp_flower_indr_setup_tc_cb, app,
-				 nfp_flower_setup_indr_block_cb);
+				 nfp_flower_setup_indr_tc_release);
 
 	if (app_priv->flower_ext_feats & NFP_FL_FEATS_VF_RLIM)
 		nfp_flower_qos_cleanup(app);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index 6c3dc3b..c983337 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -460,8 +460,7 @@ int nfp_flower_setup_qos_offload(struct nfp_app *app, struct net_device *netdev,
 void nfp_flower_stats_rlim_reply(struct nfp_app *app, struct sk_buff *skb);
 int nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
 				enum tc_setup_type type, void *type_data);
-int nfp_flower_setup_indr_block_cb(enum tc_setup_type type, void *type_data,
-				   void *cb_priv);
+void nfp_flower_setup_indr_tc_release(void *cb_priv);
 
 void
 __nfp_flower_non_repr_priv_get(struct nfp_flower_non_repr_priv *non_repr_priv);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 695d24b9..28de905 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1619,8 +1619,8 @@ struct nfp_flower_indr_block_cb_priv {
 	return NULL;
 }
 
-int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
-				   void *type_data, void *cb_priv)
+static int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
+					  void *type_data, void *cb_priv)
 {
 	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
 	struct flow_cls_offload *flower = type_data;
@@ -1637,7 +1637,7 @@ int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
 	}
 }
 
-static void nfp_flower_setup_indr_tc_release(void *cb_priv)
+void nfp_flower_setup_indr_tc_release(void *cb_priv)
 {
 	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
 
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index f2c8311..3a2d6b4 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -536,7 +536,7 @@ typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
 
 int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv);
 void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      flow_setup_cb_t *setup_cb);
+			      void (*release)(void *cb_priv));
 int flow_indr_dev_setup_offload(struct net_device *dev,
 				enum tc_setup_type type, void *data,
 				struct flow_block_offload *bo,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 0cfc35e..b288d2f 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -372,13 +372,14 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
 }
 EXPORT_SYMBOL(flow_indr_dev_register);
 
-static void __flow_block_indr_cleanup(flow_setup_cb_t *setup_cb, void *cb_priv,
+static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
+				      void *cb_priv,
 				      struct list_head *cleanup_list)
 {
 	struct flow_block_cb *this, *next;
 
 	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
-		if (this->cb == setup_cb &&
+		if (this->release == release &&
 		    this->cb_priv == cb_priv) {
 			list_move(&this->indr.list, cleanup_list);
 			return;
@@ -397,7 +398,7 @@ static void flow_block_indr_notify(struct list_head *cleanup_list)
 }
 
 void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      flow_setup_cb_t *setup_cb)
+			      void (*release)(void *cb_priv))
 {
 	struct flow_indr_dev *this, *next, *indr_dev = NULL;
 	LIST_HEAD(cleanup_list);
@@ -418,7 +419,7 @@ void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
 		return;
 	}
 
-	__flow_block_indr_cleanup(setup_cb, cb_priv, &cleanup_list);
+	__flow_block_indr_cleanup(release, cb_priv, &cleanup_list);
 	mutex_unlock(&flow_indr_block_lock);
 
 	flow_block_indr_notify(&cleanup_list);
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16  3:19 [PATCH net v3 0/4] several fixes for indirect flow_blocks offload wenxu
  2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
@ 2020-06-16  3:19 ` wenxu
  2020-06-16 10:51   ` Simon Horman
  2020-06-16 20:13   ` Pablo Neira Ayuso
  2020-06-16  3:19 ` [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log wenxu
  2020-06-16  3:19 ` [PATCH net v3 4/4] flow_offload: fix the list_del corruption in the driver list wenxu
  3 siblings, 2 replies; 24+ messages in thread
From: wenxu @ 2020-06-16  3:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, pablo, vladbu

From: wenxu <wenxu@ucloud.cn>

In the function __flow_block_indr_cleanup, The match stataments
this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
is totally different data with the flow_indr_dev->cb_priv.

Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
the driver.

Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 1 +
 drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
 drivers/net/ethernet/netronome/nfp/flower/offload.c | 1 +
 include/net/flow_offload.h                          | 1 +
 net/core/flow_offload.c                             | 2 +-
 5 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index ef7f6bc..042c285 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1918,6 +1918,7 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
 
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &bnxt_block_cb_list);
+		block_cb->indr.cb_priv = bp;
 		break;
 	case FLOW_BLOCK_UNBIND:
 		cb_priv = bnxt_tc_indr_block_cb_lookup(bp, netdev);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index a62bcf0..187f84c 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -447,7 +447,7 @@ static void mlx5e_rep_indr_block_unbind(void *cb_priv)
 		}
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &mlx5e_block_cb_list);
-
+		block_cb->indr.cb_priv = rpriv;
 		return 0;
 	case FLOW_BLOCK_UNBIND:
 		indr_priv = mlx5e_rep_indr_block_priv_lookup(rpriv, netdev);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 28de905..ca2f01a 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1687,6 +1687,7 @@ void nfp_flower_setup_indr_tc_release(void *cb_priv)
 
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &nfp_block_cb_list);
+		block_cb->indr.cb_priv = app;
 		return 0;
 	case FLOW_BLOCK_UNBIND:
 		cb_priv = nfp_flower_indr_block_cb_priv_lookup(app, netdev);
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index 3a2d6b4..ef4d8b0 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -450,6 +450,7 @@ struct flow_block_indr {
 	struct net_device		*dev;
 	enum flow_block_binder_type	binder_type;
 	void				*data;
+	void				*cb_priv;
 	void				(*cleanup)(struct flow_block_cb *block_cb);
 };
 
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index b288d2f..6614351 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -380,7 +380,7 @@ static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
 
 	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
 		if (this->release == release &&
-		    this->cb_priv == cb_priv) {
+		    this->indr.cb_priv == cb_priv) {
 			list_move(&this->indr.list, cleanup_list);
 			return;
 		}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  2020-06-16  3:19 [PATCH net v3 0/4] several fixes for indirect flow_blocks offload wenxu
  2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
  2020-06-16  3:19 ` [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb wenxu
@ 2020-06-16  3:19 ` wenxu
  2020-06-16 20:17   ` Pablo Neira Ayuso
  2020-06-16  3:19 ` [PATCH net v3 4/4] flow_offload: fix the list_del corruption in the driver list wenxu
  3 siblings, 1 reply; 24+ messages in thread
From: wenxu @ 2020-06-16  3:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, pablo, vladbu

From: wenxu <wenxu@ucloud.cn>

When a indr device add in offload success. The block->nooffloaddevcnt
should be 0. After the representor go away. When the dir device go away
the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
dmesg log. 

The block->nooffloaddevcnt should always count for indr block.
even the indr block offload successful. The representor maybe
gone away and the ingress qdisc can work in software mode.

block->nooffloaddevcnt warning with following dmesg log:

[  760.667058] #####################################################
[  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
[  760.669179] #####################################################
[  761.780655] :test: Fedora 30 (Thirty)
[  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
[  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
[  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
[  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
[  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
[  762.234341] :test: unbind vfs of ens1f0
[  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
[  762.291363] :test: unbind vfs of ens1f1
[  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
[  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
[  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
[  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
[  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  763.552429] ens1f1_0: renamed from eth0
[  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
[  763.629694] ens1f1_1: renamed from eth1
[  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
[  764.670841] :test: unbind vfs of ens1f0
[  764.681966] :test: unbind vfs of ens1f1
[  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
[  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
[  764.797325] :test: Add multipath vxlan encap rule and disable sriov
[  764.798544] :test: config multipath route
[  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
[  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
[  765.603681] :test: OK
[  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
[  765.675085] :test: verify rule in hw
[  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
[  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
[  766.979230] :test: OK
[  768.125419] :test: OK
[  768.127519] :test: - disable sriov ens1f1
[  768.131160] pci 0000:81:02.2: Removing from iommu group 75
[  768.132646] pci 0000:81:02.3: Removing from iommu group 76
[  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
[  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
[  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
[  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
[  769.990022] :test: - disable sriov ens1f0
[  769.994922] pci 0000:81:00.2: Removing from iommu group 73
[  769.997048] pci 0000:81:00.3: Removing from iommu group 74
[  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
[  771.339091] ------------[ cut here ]------------
[  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
[  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_c
ore ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
[  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
[  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
[  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
[  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
[  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
[  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
[  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
[  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
[  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
[  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
[  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
[  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
[  771.360825] Call Trace:
[  771.361764]  __tcf_block_put+0x84/0x150
[  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
[  771.363658]  qdisc_destroy+0x3e/0xc0
[  771.364594]  dev_shutdown+0x7a/0xa5
[  771.365522]  rollback_registered_many+0x20d/0x530
[  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
[  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
[  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
[  771.369454]  notifier_call_chain+0x4c/0x70
[  771.370579]  rollback_registered_many+0x2f5/0x530
[  771.371719]  rollback_registered+0x56/0x90
[  771.372843]  unregister_netdevice_queue+0x73/0xb0
[  771.373982]  unregister_netdev+0x18/0x20
[  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
[  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
[  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
[  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
[  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
[  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
[  771.382087]  sriov_numvfs_store+0xfc/0x130
[  771.383195]  kernfs_fop_write+0xce/0x1b0
[  771.384302]  vfs_write+0xb6/0x1a0
[  771.385410]  ksys_write+0x5f/0xe0
[  771.386500]  do_syscall_64+0x5b/0x1d0
[  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9

Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
 net/sched/cls_api.c | 24 ++++++++++++++----------
 1 file changed, 14 insertions(+), 10 deletions(-)

diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a00a203..86c3937 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -671,25 +671,29 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
 				 struct netlink_ext_ack *extack)
 {
 	struct flow_block_offload bo = {};
-	int err;
 
 	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
 			       &block->flow_block, tcf_block_shared(block),
 			       extack);
 
-	if (dev->netdev_ops->ndo_setup_tc)
+	if (dev->netdev_ops->ndo_setup_tc) {
+		int err;
+
 		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
-	else
-		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
-						  &bo, tc_block_indr_cleanup);
+		if (err < 0) {
+			if (err != -EOPNOTSUPP)
+				NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
+			return err;
+		}
 
-	if (err < 0) {
-		if (err != -EOPNOTSUPP)
-			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
-		return err;
+		return tcf_block_setup(block, &bo);
 	}
 
-	return tcf_block_setup(block, &bo);
+	flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block, &bo,
+				    tc_block_indr_cleanup);
+	tcf_block_setup(block, &bo);
+
+	return -EOPNOTSUPP;
 }
 
 static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH net v3 4/4] flow_offload: fix the list_del corruption in the driver list
  2020-06-16  3:19 [PATCH net v3 0/4] several fixes for indirect flow_blocks offload wenxu
                   ` (2 preceding siblings ...)
  2020-06-16  3:19 ` [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log wenxu
@ 2020-06-16  3:19 ` wenxu
  3 siblings, 0 replies; 24+ messages in thread
From: wenxu @ 2020-06-16  3:19 UTC (permalink / raw)
  To: netdev; +Cc: davem, pablo, vladbu

From: wenxu <wenxu@ucloud.cn>

When a indr device add in offload success. After the representor
go away. All the flow_block_cb cleanup but miss del form driver
list.

Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
Signed-off-by: wenxu <wenxu@ucloud.cn>
---
 net/netfilter/nf_flow_table_offload.c | 1 +
 net/netfilter/nf_tables_offload.c     | 1 +
 net/sched/cls_api.c                   | 1 +
 3 files changed, 3 insertions(+)

diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 62651e6..5fff1e0 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -950,6 +950,7 @@ static void nf_flow_table_indr_cleanup(struct flow_block_cb *block_cb)
 	nf_flow_table_gc_cleanup(flowtable, dev);
 	down_write(&flowtable->flow_block_lock);
 	list_del(&block_cb->list);
+	list_del(&block_cb->driver_list);
 	flow_block_cb_free(block_cb);
 	up_write(&flowtable->flow_block_lock);
 }
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 185fc82..c7cf1cd 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -296,6 +296,7 @@ static void nft_indr_block_cleanup(struct flow_block_cb *block_cb)
 	nft_flow_block_offload_init(&bo, dev_net(dev), FLOW_BLOCK_UNBIND,
 				    basechain, &extack);
 	mutex_lock(&net->nft.commit_mutex);
+	list_del(&block_cb->driver_list);
 	list_move(&block_cb->list, &bo.cb_list);
 	nft_flow_offload_unbind(&bo, basechain);
 	mutex_unlock(&net->nft.commit_mutex);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 86c3937..faa78b7 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -652,6 +652,7 @@ static void tc_block_indr_cleanup(struct flow_block_cb *block_cb)
 			       &block->flow_block, tcf_block_shared(block),
 			       &extack);
 	down_write(&block->cb_lock);
+	list_del(&block_cb->driver_list);
 	list_move(&block_cb->list, &bo.cb_list);
 	up_write(&block->cb_lock);
 	rtnl_lock();
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16  3:19 ` [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb wenxu
@ 2020-06-16 10:51   ` Simon Horman
  2020-06-16 14:20     ` wenxu
  2020-06-16 20:13   ` Pablo Neira Ayuso
  1 sibling, 1 reply; 24+ messages in thread
From: Simon Horman @ 2020-06-16 10:51 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, pablo, vladbu

On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> In the function __flow_block_indr_cleanup, The match stataments
> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> is totally different data with the flow_indr_dev->cb_priv.
> 
> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> the driver.
> 
> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> Signed-off-by: wenxu <wenxu@ucloud.cn>

Hi Wenxu,

I wonder if this can be resolved by using the cb_ident field of struct
flow_block_cb.

I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
where the value of the cb_ident parameter of flow_block_cb_alloc() is
per-block rather than per-device. So part of my proposal is to change
that.

The other part of my proposal is to make use of cb_ident in
__flow_block_indr_cleanup(). Which does seem to match the intended
purpose of cb_ident. Perhaps it would also be good to document what
the intended purpose of cb_ident (and the other fields of struct
flow_block_cb) is.

Compile tested only.

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index a62bcf0cf512..4de6fcae5252 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -438,7 +438,7 @@ mlx5e_rep_indr_setup_block(struct net_device *netdev,
 		list_add(&indr_priv->list,
 			 &rpriv->uplink_priv.tc_indr_block_priv_list);
 
-		block_cb = flow_block_cb_alloc(setup_cb, indr_priv, indr_priv,
+		block_cb = flow_block_cb_alloc(setup_cb, rpriv, indr_priv,
 					       mlx5e_rep_indr_block_unbind);
 		if (IS_ERR(block_cb)) {
 			list_del(&indr_priv->list);
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index b288d2f03789..d281fb182894 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -373,14 +373,13 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
 EXPORT_SYMBOL(flow_indr_dev_register);
 
 static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
-				      void *cb_priv,
+				      void *cb_ident,
 				      struct list_head *cleanup_list)
 {
 	struct flow_block_cb *this, *next;
 
 	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
-		if (this->release == release &&
-		    this->cb_priv == cb_priv) {
+		if (this->release == release && this->cb_ident == cb_ident) {
 			list_move(&this->indr.list, cleanup_list);
 			return;
 		}

^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 10:51   ` Simon Horman
@ 2020-06-16 14:20     ` wenxu
  2020-06-16 14:34       ` Simon Horman
  0 siblings, 1 reply; 24+ messages in thread
From: wenxu @ 2020-06-16 14:20 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, davem, pablo, vladbu


在 2020/6/16 18:51, Simon Horman 写道:
> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> In the function __flow_block_indr_cleanup, The match stataments
>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>> is totally different data with the flow_indr_dev->cb_priv.
>>
>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>> the driver.
>>
>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> Hi Wenxu,
>
> I wonder if this can be resolved by using the cb_ident field of struct
> flow_block_cb.
>
> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
> where the value of the cb_ident parameter of flow_block_cb_alloc() is
> per-block rather than per-device. So part of my proposal is to change
> that.

I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of

flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block

and bnxt_tc_setup_indr_block.


nfp_flower_setup_indr_tc_block:

struct nfp_flower_indr_block_cb_priv *cb_priv;

block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
                                               cb_priv, cb_priv,
                                               nfp_flower_setup_indr_tc_release);


bnxt_tc_setup_indr_block:

struct bnxt_flower_indr_block_cb_priv *cb_priv;

block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
                                               cb_priv, cb_priv,
                                               bnxt_tc_setup_indr_rel);


And the function flow_block_cb_is_busy called in most place. Pass the

parameter as cb_priv but not cb_indent .







>
> The other part of my proposal is to make use of cb_ident in
> __flow_block_indr_cleanup(). Which does seem to match the intended
> purpose of cb_ident. Perhaps it would also be good to document what
> the intended purpose of cb_ident (and the other fields of struct
> flow_block_cb) is.
>
> Compile tested only.
>
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> index a62bcf0cf512..4de6fcae5252 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> @@ -438,7 +438,7 @@ mlx5e_rep_indr_setup_block(struct net_device *netdev,
>  		list_add(&indr_priv->list,
>  			 &rpriv->uplink_priv.tc_indr_block_priv_list);
>  
> -		block_cb = flow_block_cb_alloc(setup_cb, indr_priv, indr_priv,
> +		block_cb = flow_block_cb_alloc(setup_cb, rpriv, indr_priv,
>  					       mlx5e_rep_indr_block_unbind);
>  		if (IS_ERR(block_cb)) {
>  			list_del(&indr_priv->list);
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index b288d2f03789..d281fb182894 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -373,14 +373,13 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
>  EXPORT_SYMBOL(flow_indr_dev_register);
>  
>  static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
> -				      void *cb_priv,
> +				      void *cb_ident,
>  				      struct list_head *cleanup_list)
>  {
>  	struct flow_block_cb *this, *next;
>  
>  	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
> -		if (this->release == release &&
> -		    this->cb_priv == cb_priv) {
> +		if (this->release == release && this->cb_ident == cb_ident) {
>  			list_move(&this->indr.list, cleanup_list);
>  			return;
>  		}
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 14:20     ` wenxu
@ 2020-06-16 14:34       ` Simon Horman
  2020-06-16 15:18         ` wenxu
  0 siblings, 1 reply; 24+ messages in thread
From: Simon Horman @ 2020-06-16 14:34 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, pablo, vladbu

On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
> 
> 在 2020/6/16 18:51, Simon Horman 写道:
> > On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> >> From: wenxu <wenxu@ucloud.cn>
> >>
> >> In the function __flow_block_indr_cleanup, The match stataments
> >> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> >> is totally different data with the flow_indr_dev->cb_priv.
> >>
> >> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> >> the driver.
> >>
> >> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> >> Signed-off-by: wenxu <wenxu@ucloud.cn>
> > Hi Wenxu,
> >
> > I wonder if this can be resolved by using the cb_ident field of struct
> > flow_block_cb.
> >
> > I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
> > where the value of the cb_ident parameter of flow_block_cb_alloc() is
> > per-block rather than per-device. So part of my proposal is to change
> > that.
> 
> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
> 
> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
> 
> and bnxt_tc_setup_indr_block.
> 
> 
> nfp_flower_setup_indr_tc_block:
> 
> struct nfp_flower_indr_block_cb_priv *cb_priv;
> 
> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
>                                                cb_priv, cb_priv,
>                                                nfp_flower_setup_indr_tc_release);
> 
> 
> bnxt_tc_setup_indr_block:
> 
> struct bnxt_flower_indr_block_cb_priv *cb_priv;
> 
> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
>                                                cb_priv, cb_priv,
>                                                bnxt_tc_setup_indr_rel);
> 
> 
> And the function flow_block_cb_is_busy called in most place. Pass the
> 
> parameter as cb_priv but not cb_indent .

Thanks, I see that now. But I still think it would be useful to understand
the purpose of cb_ident. It feels like it would lead to a clean solution
to the problem you have highlighted.

> > The other part of my proposal is to make use of cb_ident in
> > __flow_block_indr_cleanup(). Which does seem to match the intended
> > purpose of cb_ident. Perhaps it would also be good to document what
> > the intended purpose of cb_ident (and the other fields of struct
> > flow_block_cb) is.
> >
> > Compile tested only.
> >
> > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> > index a62bcf0cf512..4de6fcae5252 100644
> > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> > @@ -438,7 +438,7 @@ mlx5e_rep_indr_setup_block(struct net_device *netdev,
> >  		list_add(&indr_priv->list,
> >  			 &rpriv->uplink_priv.tc_indr_block_priv_list);
> >  
> > -		block_cb = flow_block_cb_alloc(setup_cb, indr_priv, indr_priv,
> > +		block_cb = flow_block_cb_alloc(setup_cb, rpriv, indr_priv,
> >  					       mlx5e_rep_indr_block_unbind);
> >  		if (IS_ERR(block_cb)) {
> >  			list_del(&indr_priv->list);
> > diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> > index b288d2f03789..d281fb182894 100644
> > --- a/net/core/flow_offload.c
> > +++ b/net/core/flow_offload.c
> > @@ -373,14 +373,13 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
> >  EXPORT_SYMBOL(flow_indr_dev_register);
> >  
> >  static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
> > -				      void *cb_priv,
> > +				      void *cb_ident,
> >  				      struct list_head *cleanup_list)
> >  {
> >  	struct flow_block_cb *this, *next;
> >  
> >  	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
> > -		if (this->release == release &&
> > -		    this->cb_priv == cb_priv) {
> > +		if (this->release == release && this->cb_ident == cb_ident) {
> >  			list_move(&this->indr.list, cleanup_list);
> >  			return;
> >  		}
> >

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 14:34       ` Simon Horman
@ 2020-06-16 15:18         ` wenxu
  2020-06-16 15:47           ` Simon Horman
  0 siblings, 1 reply; 24+ messages in thread
From: wenxu @ 2020-06-16 15:18 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, davem, pablo, vladbu


在 2020/6/16 22:34, Simon Horman 写道:
> On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
>> 在 2020/6/16 18:51, Simon Horman 写道:
>>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>>>> From: wenxu <wenxu@ucloud.cn>
>>>>
>>>> In the function __flow_block_indr_cleanup, The match stataments
>>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>>>> is totally different data with the flow_indr_dev->cb_priv.
>>>>
>>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>>>> the driver.
>>>>
>>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>> Hi Wenxu,
>>>
>>> I wonder if this can be resolved by using the cb_ident field of struct
>>> flow_block_cb.
>>>
>>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
>>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
>>> per-block rather than per-device. So part of my proposal is to change
>>> that.
>> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
>>
>> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
>>
>> and bnxt_tc_setup_indr_block.
>>
>>
>> nfp_flower_setup_indr_tc_block:
>>
>> struct nfp_flower_indr_block_cb_priv *cb_priv;
>>
>> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
>>                                                cb_priv, cb_priv,
>>                                                nfp_flower_setup_indr_tc_release);
>>
>>
>> bnxt_tc_setup_indr_block:
>>
>> struct bnxt_flower_indr_block_cb_priv *cb_priv;
>>
>> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
>>                                                cb_priv, cb_priv,
>>                                                bnxt_tc_setup_indr_rel);
>>
>>
>> And the function flow_block_cb_is_busy called in most place. Pass the
>>
>> parameter as cb_priv but not cb_indent .
> Thanks, I see that now. But I still think it would be useful to understand
> the purpose of cb_ident. It feels like it would lead to a clean solution
> to the problem you have highlighted.

I think The cb_ident means identify.  It is used to identify the each flow block cb.

In the both flow_block_cb_is_busy and flow_block_cb_lookup function check

the block_cb->cb_ident == cb_ident.





^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 15:18         ` wenxu
@ 2020-06-16 15:47           ` Simon Horman
  2020-06-16 20:38             ` Pablo Neira Ayuso
  2020-06-17  2:47             ` wenxu
  0 siblings, 2 replies; 24+ messages in thread
From: Simon Horman @ 2020-06-16 15:47 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, pablo, vladbu

On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
> 
> 在 2020/6/16 22:34, Simon Horman 写道:
> > On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
> >> 在 2020/6/16 18:51, Simon Horman 写道:
> >>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> >>>> From: wenxu <wenxu@ucloud.cn>
> >>>>
> >>>> In the function __flow_block_indr_cleanup, The match stataments
> >>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> >>>> is totally different data with the flow_indr_dev->cb_priv.
> >>>>
> >>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> >>>> the driver.
> >>>>
> >>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> >>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> >>> Hi Wenxu,
> >>>
> >>> I wonder if this can be resolved by using the cb_ident field of struct
> >>> flow_block_cb.
> >>>
> >>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
> >>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
> >>> per-block rather than per-device. So part of my proposal is to change
> >>> that.
> >> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
> >>
> >> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
> >>
> >> and bnxt_tc_setup_indr_block.
> >>
> >>
> >> nfp_flower_setup_indr_tc_block:
> >>
> >> struct nfp_flower_indr_block_cb_priv *cb_priv;
> >>
> >> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
> >>                                                cb_priv, cb_priv,
> >>                                                nfp_flower_setup_indr_tc_release);
> >>
> >>
> >> bnxt_tc_setup_indr_block:
> >>
> >> struct bnxt_flower_indr_block_cb_priv *cb_priv;
> >>
> >> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
> >>                                                cb_priv, cb_priv,
> >>                                                bnxt_tc_setup_indr_rel);
> >>
> >>
> >> And the function flow_block_cb_is_busy called in most place. Pass the
> >>
> >> parameter as cb_priv but not cb_indent .
> > Thanks, I see that now. But I still think it would be useful to understand
> > the purpose of cb_ident. It feels like it would lead to a clean solution
> > to the problem you have highlighted.
> 
> I think The cb_ident means identify.  It is used to identify the each flow block cb.
> 
> In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
> 
> the block_cb->cb_ident == cb_ident.

Thanks, I think that I now see what you mean about the different scope of
cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().

I do, however, still wonder if there is a nicer way than reaching into
the structure and manually setting block_cb->indr.cb_priv
at each call-site.

Perhaps a variant of flow_block_cb_alloc() for indirect blocks
would be nicer?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks
  2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
@ 2020-06-16 15:55   ` Simon Horman
  2020-06-16 20:11   ` Pablo Neira Ayuso
  1 sibling, 0 replies; 24+ messages in thread
From: Simon Horman @ 2020-06-16 15:55 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, pablo, vladbu

On Tue, Jun 16, 2020 at 11:19:37AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> The cleanup operation based on the setup callback. But in the mlx5e
> driver there are tc and flowtable indrict setup callback and shared
> the same release callbacks. So when the representor is removed,
> then identify the indirect flow_blocks that need to be removed by  
> the release callback.
> 
> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> Signed-off-by: wenxu <wenxu@ucloud.cn>

Reviewed-by: Simon Horman <simon.horman@netronome.com>


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks
  2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
  2020-06-16 15:55   ` Simon Horman
@ 2020-06-16 20:11   ` Pablo Neira Ayuso
  1 sibling, 0 replies; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-16 20:11 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, vladbu

On Tue, Jun 16, 2020 at 11:19:37AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> The cleanup operation based on the setup callback. But in the mlx5e
> driver there are tc and flowtable indrict setup callback and shared
> the same release callbacks. So when the representor is removed,
> then identify the indirect flow_blocks that need to be removed by  
> the release callback.
> 
> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 2 +-
>  drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
>  drivers/net/ethernet/netronome/nfp/flower/main.c    | 2 +-
>  drivers/net/ethernet/netronome/nfp/flower/main.h    | 3 +--
>  drivers/net/ethernet/netronome/nfp/flower/offload.c | 6 +++---
>  include/net/flow_offload.h                          | 2 +-
>  net/core/flow_offload.c                             | 9 +++++----
>  7 files changed, 13 insertions(+), 13 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> index 0eef4f5..ef7f6bc 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> @@ -2074,7 +2074,7 @@ void bnxt_shutdown_tc(struct bnxt *bp)
>  		return;
>  
>  	flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
> -				 bnxt_tc_setup_indr_block_cb);
> +				 bnxt_tc_setup_indr_rel);
>  	rhashtable_destroy(&tc_info->flow_table);
>  	rhashtable_destroy(&tc_info->l2_table);
>  	rhashtable_destroy(&tc_info->decap_l2_table);
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> index 80713123..a62bcf0 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
> @@ -496,7 +496,7 @@ int mlx5e_rep_tc_netdevice_event_register(struct mlx5e_rep_priv *rpriv)
>  void mlx5e_rep_tc_netdevice_event_unregister(struct mlx5e_rep_priv *rpriv)
>  {
>  	flow_indr_dev_unregister(mlx5e_rep_indr_setup_cb, rpriv,
> -				 mlx5e_rep_indr_setup_tc_cb);
> +				 mlx5e_rep_indr_block_unbind);
>  }
>  
>  #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
> diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
> index c393276..bb448c8 100644
> --- a/drivers/net/ethernet/netronome/nfp/flower/main.c
> +++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
> @@ -861,7 +861,7 @@ static void nfp_flower_clean(struct nfp_app *app)
>  	flush_work(&app_priv->cmsg_work);
>  
>  	flow_indr_dev_unregister(nfp_flower_indr_setup_tc_cb, app,
> -				 nfp_flower_setup_indr_block_cb);
> +				 nfp_flower_setup_indr_tc_release);
>  
>  	if (app_priv->flower_ext_feats & NFP_FL_FEATS_VF_RLIM)
>  		nfp_flower_qos_cleanup(app);
> diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
> index 6c3dc3b..c983337 100644
> --- a/drivers/net/ethernet/netronome/nfp/flower/main.h
> +++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
> @@ -460,8 +460,7 @@ int nfp_flower_setup_qos_offload(struct nfp_app *app, struct net_device *netdev,
>  void nfp_flower_stats_rlim_reply(struct nfp_app *app, struct sk_buff *skb);
>  int nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
>  				enum tc_setup_type type, void *type_data);
> -int nfp_flower_setup_indr_block_cb(enum tc_setup_type type, void *type_data,
> -				   void *cb_priv);
> +void nfp_flower_setup_indr_tc_release(void *cb_priv);
>  
>  void
>  __nfp_flower_non_repr_priv_get(struct nfp_flower_non_repr_priv *non_repr_priv);
> diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
> index 695d24b9..28de905 100644
> --- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
> +++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
> @@ -1619,8 +1619,8 @@ struct nfp_flower_indr_block_cb_priv {
>  	return NULL;
>  }
>  
> -int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
> -				   void *type_data, void *cb_priv)
> +static int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
> +					  void *type_data, void *cb_priv)
>  {
>  	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
>  	struct flow_cls_offload *flower = type_data;
> @@ -1637,7 +1637,7 @@ int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
>  	}
>  }
>  
> -static void nfp_flower_setup_indr_tc_release(void *cb_priv)
> +void nfp_flower_setup_indr_tc_release(void *cb_priv)
>  {
>  	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
>  
> diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
> index f2c8311..3a2d6b4 100644
> --- a/include/net/flow_offload.h
> +++ b/include/net/flow_offload.h
> @@ -536,7 +536,7 @@ typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
>  
>  int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv);
>  void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
> -			      flow_setup_cb_t *setup_cb);
> +			      void (*release)(void *cb_priv));
>  int flow_indr_dev_setup_offload(struct net_device *dev,
>  				enum tc_setup_type type, void *data,
>  				struct flow_block_offload *bo,
> diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
> index 0cfc35e..b288d2f 100644
> --- a/net/core/flow_offload.c
> +++ b/net/core/flow_offload.c
> @@ -372,13 +372,14 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
>  }
>  EXPORT_SYMBOL(flow_indr_dev_register);
>  
> -static void __flow_block_indr_cleanup(flow_setup_cb_t *setup_cb, void *cb_priv,
> +static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
> +				      void *cb_priv,
>  				      struct list_head *cleanup_list)
>  {
>  	struct flow_block_cb *this, *next;
>  
>  	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
> -		if (this->cb == setup_cb &&
> +		if (this->release == release &&
>  		    this->cb_priv == cb_priv) {
>  			list_move(&this->indr.list, cleanup_list);
>  			return;
> @@ -397,7 +398,7 @@ static void flow_block_indr_notify(struct list_head *cleanup_list)
>  }
>  
>  void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
> -			      flow_setup_cb_t *setup_cb)
> +			      void (*release)(void *cb_priv))

If you use cb_priv to identify the callback, then this should be
instead cb_ident?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16  3:19 ` [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb wenxu
  2020-06-16 10:51   ` Simon Horman
@ 2020-06-16 20:13   ` Pablo Neira Ayuso
  2020-06-17  2:42     ` wenxu
  1 sibling, 1 reply; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-16 20:13 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, vladbu

On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> In the function __flow_block_indr_cleanup, The match stataments
> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> is totally different data with the flow_indr_dev->cb_priv.
> 
> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> the driver.
> 
> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
>  drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 1 +
>  drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
>  drivers/net/ethernet/netronome/nfp/flower/offload.c | 1 +
>  include/net/flow_offload.h                          | 1 +
>  net/core/flow_offload.c                             | 2 +-
>  5 files changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> index ef7f6bc..042c285 100644
> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> @@ -1918,6 +1918,7 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
>  
>  		flow_block_cb_add(block_cb, f);
>  		list_add_tail(&block_cb->driver_list, &bnxt_block_cb_list);
> +		block_cb->indr.cb_priv = bp;

cb_indent ?

Why are you splitting the fix in multiple patches? This makes it
harder to review.

I think patch 1/4, 2/4 and 4/4 belong to the same logical change?
Collapse them.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  2020-06-16  3:19 ` [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log wenxu
@ 2020-06-16 20:17   ` Pablo Neira Ayuso
  2020-06-16 20:30     ` Pablo Neira Ayuso
  2020-06-17  2:29     ` wenxu
  0 siblings, 2 replies; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-16 20:17 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, vladbu

On Tue, Jun 16, 2020 at 11:19:39AM +0800, wenxu@ucloud.cn wrote:
> From: wenxu <wenxu@ucloud.cn>
> 
> When a indr device add in offload success. The block->nooffloaddevcnt
> should be 0. After the representor go away. When the dir device go away
> the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
> dmesg log. 
> 
> The block->nooffloaddevcnt should always count for indr block.
> even the indr block offload successful. The representor maybe
> gone away and the ingress qdisc can work in software mode.
> 
> block->nooffloaddevcnt warning with following dmesg log:
> 
> [  760.667058] #####################################################
> [  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
> [  760.669179] #####################################################
> [  761.780655] :test: Fedora 30 (Thirty)
> [  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
> [  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
> [  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
> [  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
> [  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
> [  762.234341] :test: unbind vfs of ens1f0
> [  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
> [  762.291363] :test: unbind vfs of ens1f1
> [  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
> [  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
> [  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
> [  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> [  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
> [  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> [  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> [  763.552429] ens1f1_0: renamed from eth0
> [  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
> [  763.629694] ens1f1_1: renamed from eth1
> [  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
> [  764.670841] :test: unbind vfs of ens1f0
> [  764.681966] :test: unbind vfs of ens1f1
> [  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
> [  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
> [  764.797325] :test: Add multipath vxlan encap rule and disable sriov
> [  764.798544] :test: config multipath route
> [  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
> [  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
> [  765.603681] :test: OK
> [  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
> [  765.675085] :test: verify rule in hw
> [  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
> [  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
> [  766.979230] :test: OK
> [  768.125419] :test: OK
> [  768.127519] :test: - disable sriov ens1f1
> [  768.131160] pci 0000:81:02.2: Removing from iommu group 75
> [  768.132646] pci 0000:81:02.3: Removing from iommu group 76
> [  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
> [  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
> [  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> [  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
> [  769.990022] :test: - disable sriov ens1f0
> [  769.994922] pci 0000:81:00.2: Removing from iommu group 73
> [  769.997048] pci 0000:81:00.3: Removing from iommu group 74
> [  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
> [  771.339091] ------------[ cut here ]------------
> [  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
> [  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_c
> ore ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
> [  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
> [  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
> [  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
> [  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
> [  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
> [  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
> [  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
> [  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
> [  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
> [  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
> [  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
> [  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
> [  771.360825] Call Trace:
> [  771.361764]  __tcf_block_put+0x84/0x150
> [  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
> [  771.363658]  qdisc_destroy+0x3e/0xc0
> [  771.364594]  dev_shutdown+0x7a/0xa5
> [  771.365522]  rollback_registered_many+0x20d/0x530
> [  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
> [  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
> [  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
> [  771.369454]  notifier_call_chain+0x4c/0x70
> [  771.370579]  rollback_registered_many+0x2f5/0x530
> [  771.371719]  rollback_registered+0x56/0x90
> [  771.372843]  unregister_netdevice_queue+0x73/0xb0
> [  771.373982]  unregister_netdev+0x18/0x20
> [  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
> [  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
> [  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
> [  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
> [  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
> [  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
> [  771.382087]  sriov_numvfs_store+0xfc/0x130
> [  771.383195]  kernfs_fop_write+0xce/0x1b0
> [  771.384302]  vfs_write+0xb6/0x1a0
> [  771.385410]  ksys_write+0x5f/0xe0
> [  771.386500]  do_syscall_64+0x5b/0x1d0
> [  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> 
> Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
> Signed-off-by: wenxu <wenxu@ucloud.cn>
> ---
>  net/sched/cls_api.c | 24 ++++++++++++++----------
>  1 file changed, 14 insertions(+), 10 deletions(-)
> 
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index a00a203..86c3937 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -671,25 +671,29 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
>  				 struct netlink_ext_ack *extack)
>  {
>  	struct flow_block_offload bo = {};
> -	int err;
>  
>  	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
>  			       &block->flow_block, tcf_block_shared(block),
>  			       extack);
>  
> -	if (dev->netdev_ops->ndo_setup_tc)
> +	if (dev->netdev_ops->ndo_setup_tc) {
> +		int err;
> +
>  		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> -	else
> -		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
> -						  &bo, tc_block_indr_cleanup);
> +		if (err < 0) {
> +			if (err != -EOPNOTSUPP)
> +				NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
> +			return err;
> +		}
>  
> -	if (err < 0) {
> -		if (err != -EOPNOTSUPP)
> -			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
> -		return err;
> +		return tcf_block_setup(block, &bo);
>  	}
>  
> -	return tcf_block_setup(block, &bo);
> +	flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block, &bo,
> +				    tc_block_indr_cleanup);
> +	tcf_block_setup(block, &bo);
> +
> +	return -EOPNOTSUPP;

So tcf_block_offload_cmd() always return -EOPNOTSUPP for _BIND and
_UNBIND operations after this patch ?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  2020-06-16 20:17   ` Pablo Neira Ayuso
@ 2020-06-16 20:30     ` Pablo Neira Ayuso
  2020-06-17  2:34       ` wenxu
  2020-06-17  2:29     ` wenxu
  1 sibling, 1 reply; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-16 20:30 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, vladbu

On Tue, Jun 16, 2020 at 10:17:50PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Jun 16, 2020 at 11:19:39AM +0800, wenxu@ucloud.cn wrote:
> > From: wenxu <wenxu@ucloud.cn>
> > 
> > When a indr device add in offload success. The block->nooffloaddevcnt
> > should be 0. After the representor go away. When the dir device go away
> > the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
> > dmesg log. 
> > 
> > The block->nooffloaddevcnt should always count for indr block.
> > even the indr block offload successful. The representor maybe
> > gone away and the ingress qdisc can work in software mode.
> > 
> > block->nooffloaddevcnt warning with following dmesg log:
> > 
> > [  760.667058] #####################################################
> > [  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
> > [  760.669179] #####################################################
> > [  761.780655] :test: Fedora 30 (Thirty)
> > [  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
> > [  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
> > [  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
> > [  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
> > [  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
> > [  762.234341] :test: unbind vfs of ens1f0
> > [  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
> > [  762.291363] :test: unbind vfs of ens1f1
> > [  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
> > [  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
> > [  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
> > [  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> > [  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
> > [  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> > [  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> > [  763.552429] ens1f1_0: renamed from eth0
> > [  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
> > [  763.629694] ens1f1_1: renamed from eth1
> > [  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
> > [  764.670841] :test: unbind vfs of ens1f0
> > [  764.681966] :test: unbind vfs of ens1f1
> > [  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
> > [  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
> > [  764.797325] :test: Add multipath vxlan encap rule and disable sriov
> > [  764.798544] :test: config multipath route
> > [  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
> > [  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
> > [  765.603681] :test: OK
> > [  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
> > [  765.675085] :test: verify rule in hw
> > [  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
> > [  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
> > [  766.979230] :test: OK
> > [  768.125419] :test: OK
> > [  768.127519] :test: - disable sriov ens1f1
> > [  768.131160] pci 0000:81:02.2: Removing from iommu group 75
> > [  768.132646] pci 0000:81:02.3: Removing from iommu group 76
> > [  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
> > [  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
> > [  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
> > [  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
> > [  769.990022] :test: - disable sriov ens1f0
> > [  769.994922] pci 0000:81:00.2: Removing from iommu group 73
> > [  769.997048] pci 0000:81:00.3: Removing from iommu group 74
> > [  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
> > [  771.339091] ------------[ cut here ]------------
> > [  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
> > [  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_c
> > ore ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
> > [  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
> > [  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
> > [  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
> > [  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
> > [  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
> > [  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
> > [  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
> > [  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
> > [  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
> > [  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
> > [  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
> > [  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
> > [  771.360825] Call Trace:
> > [  771.361764]  __tcf_block_put+0x84/0x150
> > [  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
> > [  771.363658]  qdisc_destroy+0x3e/0xc0
> > [  771.364594]  dev_shutdown+0x7a/0xa5
> > [  771.365522]  rollback_registered_many+0x20d/0x530
> > [  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
> > [  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
> > [  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
> > [  771.369454]  notifier_call_chain+0x4c/0x70
> > [  771.370579]  rollback_registered_many+0x2f5/0x530
> > [  771.371719]  rollback_registered+0x56/0x90
> > [  771.372843]  unregister_netdevice_queue+0x73/0xb0
> > [  771.373982]  unregister_netdev+0x18/0x20
> > [  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
> > [  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
> > [  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
> > [  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
> > [  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
> > [  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
> > [  771.382087]  sriov_numvfs_store+0xfc/0x130
> > [  771.383195]  kernfs_fop_write+0xce/0x1b0
> > [  771.384302]  vfs_write+0xb6/0x1a0
> > [  771.385410]  ksys_write+0x5f/0xe0
> > [  771.386500]  do_syscall_64+0x5b/0x1d0
> > [  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> > 
> > Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
> > Signed-off-by: wenxu <wenxu@ucloud.cn>
> > ---
> >  net/sched/cls_api.c | 24 ++++++++++++++----------
> >  1 file changed, 14 insertions(+), 10 deletions(-)
> > 
> > diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> > index a00a203..86c3937 100644
> > --- a/net/sched/cls_api.c
> > +++ b/net/sched/cls_api.c
> > @@ -671,25 +671,29 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
> >  				 struct netlink_ext_ack *extack)
> >  {
> >  	struct flow_block_offload bo = {};
> > -	int err;
> >  
> >  	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
> >  			       &block->flow_block, tcf_block_shared(block),
> >  			       extack);
> >  
> > -	if (dev->netdev_ops->ndo_setup_tc)
> > +	if (dev->netdev_ops->ndo_setup_tc) {
> > +		int err;
> > +
> >  		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
> > -	else
> > -		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
> > -						  &bo, tc_block_indr_cleanup);
> > +		if (err < 0) {
> > +			if (err != -EOPNOTSUPP)
> > +				NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
> > +			return err;
> > +		}
> >  
> > -	if (err < 0) {
> > -		if (err != -EOPNOTSUPP)
> > -			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
> > -		return err;
> > +		return tcf_block_setup(block, &bo);
> >  	}
> >  
> > -	return tcf_block_setup(block, &bo);
> > +	flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block, &bo,
> > +				    tc_block_indr_cleanup);
> > +	tcf_block_setup(block, &bo);
> > +
> > +	return -EOPNOTSUPP;
> 
> So tcf_block_offload_cmd() always return -EOPNOTSUPP for _BIND and
> _UNBIND operations after this patch ?

tcf_block_offload_unbind() is not called from the
tc_block_indr_cleanup() path.

How is this patch related to 1/4, 2/4 and 4/4 ?

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 15:47           ` Simon Horman
@ 2020-06-16 20:38             ` Pablo Neira Ayuso
  2020-06-17  3:36               ` wenxu
  2020-06-17  2:47             ` wenxu
  1 sibling, 1 reply; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-16 20:38 UTC (permalink / raw)
  To: Simon Horman; +Cc: wenxu, netdev, davem, vladbu

On Tue, Jun 16, 2020 at 05:47:17PM +0200, Simon Horman wrote:
> On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
> > 
> > 在 2020/6/16 22:34, Simon Horman 写道:
> > > On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
> > >> 在 2020/6/16 18:51, Simon Horman 写道:
> > >>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> > >>>> From: wenxu <wenxu@ucloud.cn>
> > >>>>
> > >>>> In the function __flow_block_indr_cleanup, The match stataments
> > >>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> > >>>> is totally different data with the flow_indr_dev->cb_priv.
> > >>>>
> > >>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> > >>>> the driver.
> > >>>>
> > >>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> > >>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> > >>> Hi Wenxu,
> > >>>
> > >>> I wonder if this can be resolved by using the cb_ident field of struct
> > >>> flow_block_cb.
> > >>>
> > >>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
> > >>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
> > >>> per-block rather than per-device. So part of my proposal is to change
> > >>> that.
> > >> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
> > >>
> > >> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
> > >>
> > >> and bnxt_tc_setup_indr_block.
> > >>
> > >>
> > >> nfp_flower_setup_indr_tc_block:
> > >>
> > >> struct nfp_flower_indr_block_cb_priv *cb_priv;
> > >>
> > >> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
> > >>                                                cb_priv, cb_priv,
> > >>                                                nfp_flower_setup_indr_tc_release);
> > >>
> > >>
> > >> bnxt_tc_setup_indr_block:
> > >>
> > >> struct bnxt_flower_indr_block_cb_priv *cb_priv;
> > >>
> > >> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
> > >>                                                cb_priv, cb_priv,
> > >>                                                bnxt_tc_setup_indr_rel);
> > >>
> > >>
> > >> And the function flow_block_cb_is_busy called in most place. Pass the
> > >>
> > >> parameter as cb_priv but not cb_indent .
> > > Thanks, I see that now. But I still think it would be useful to understand
> > > the purpose of cb_ident. It feels like it would lead to a clean solution
> > > to the problem you have highlighted.
> > 
> > I think The cb_ident means identify.  It is used to identify the each flow block cb.
> > 
> > In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
> > 
> > the block_cb->cb_ident == cb_ident.
> 
> Thanks, I think that I now see what you mean about the different scope of
> cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().
> 
> I do, however, still wonder if there is a nicer way than reaching into
> the structure and manually setting block_cb->indr.cb_priv
> at each call-site.
> 
> Perhaps a variant of flow_block_cb_alloc() for indirect blocks
> would be nicer?

A follow up patch to add this new variant would be good. Probably
__flow_block_indr_binding() can go away with this new variant to set
up the indirect flow block.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  2020-06-16 20:17   ` Pablo Neira Ayuso
  2020-06-16 20:30     ` Pablo Neira Ayuso
@ 2020-06-17  2:29     ` wenxu
  1 sibling, 0 replies; 24+ messages in thread
From: wenxu @ 2020-06-17  2:29 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, davem, vladbu


On 6/17/2020 4:17 AM, Pablo Neira Ayuso wrote:
> On Tue, Jun 16, 2020 at 11:19:39AM +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> When a indr device add in offload success. The block->nooffloaddevcnt
>> should be 0. After the representor go away. When the dir device go away
>> the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
>> dmesg log. 
>>
>> The block->nooffloaddevcnt should always count for indr block.
>> even the indr block offload successful. The representor maybe
>> gone away and the ingress qdisc can work in software mode.
>>
>> block->nooffloaddevcnt warning with following dmesg log:
>>
>> [  760.667058] #####################################################
>> [  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
>> [  760.669179] #####################################################
>> [  761.780655] :test: Fedora 30 (Thirty)
>> [  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
>> [  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
>> [  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
>> [  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
>> [  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
>> [  762.234341] :test: unbind vfs of ens1f0
>> [  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
>> [  762.291363] :test: unbind vfs of ens1f1
>> [  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
>> [  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
>> [  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
>> [  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>> [  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
>> [  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>> [  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>> [  763.552429] ens1f1_0: renamed from eth0
>> [  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
>> [  763.629694] ens1f1_1: renamed from eth1
>> [  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
>> [  764.670841] :test: unbind vfs of ens1f0
>> [  764.681966] :test: unbind vfs of ens1f1
>> [  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
>> [  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
>> [  764.797325] :test: Add multipath vxlan encap rule and disable sriov
>> [  764.798544] :test: config multipath route
>> [  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
>> [  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
>> [  765.603681] :test: OK
>> [  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
>> [  765.675085] :test: verify rule in hw
>> [  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
>> [  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
>> [  766.979230] :test: OK
>> [  768.125419] :test: OK
>> [  768.127519] :test: - disable sriov ens1f1
>> [  768.131160] pci 0000:81:02.2: Removing from iommu group 75
>> [  768.132646] pci 0000:81:02.3: Removing from iommu group 76
>> [  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
>> [  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
>> [  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>> [  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
>> [  769.990022] :test: - disable sriov ens1f0
>> [  769.994922] pci 0000:81:00.2: Removing from iommu group 73
>> [  769.997048] pci 0000:81:00.3: Removing from iommu group 74
>> [  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
>> [  771.339091] ------------[ cut here ]------------
>> [  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
>> [  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_c
>> ore ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
>> [  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
>> [  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
>> [  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
>> [  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
>> [  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
>> [  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
>> [  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
>> [  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
>> [  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
>> [  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
>> [  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
>> [  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
>> [  771.360825] Call Trace:
>> [  771.361764]  __tcf_block_put+0x84/0x150
>> [  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
>> [  771.363658]  qdisc_destroy+0x3e/0xc0
>> [  771.364594]  dev_shutdown+0x7a/0xa5
>> [  771.365522]  rollback_registered_many+0x20d/0x530
>> [  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
>> [  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
>> [  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
>> [  771.369454]  notifier_call_chain+0x4c/0x70
>> [  771.370579]  rollback_registered_many+0x2f5/0x530
>> [  771.371719]  rollback_registered+0x56/0x90
>> [  771.372843]  unregister_netdevice_queue+0x73/0xb0
>> [  771.373982]  unregister_netdev+0x18/0x20
>> [  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
>> [  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
>> [  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
>> [  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
>> [  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
>> [  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
>> [  771.382087]  sriov_numvfs_store+0xfc/0x130
>> [  771.383195]  kernfs_fop_write+0xce/0x1b0
>> [  771.384302]  vfs_write+0xb6/0x1a0
>> [  771.385410]  ksys_write+0x5f/0xe0
>> [  771.386500]  do_syscall_64+0x5b/0x1d0
>> [  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>
>> Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>> ---
>>  net/sched/cls_api.c | 24 ++++++++++++++----------
>>  1 file changed, 14 insertions(+), 10 deletions(-)
>>
>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>> index a00a203..86c3937 100644
>> --- a/net/sched/cls_api.c
>> +++ b/net/sched/cls_api.c
>> @@ -671,25 +671,29 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
>>  				 struct netlink_ext_ack *extack)
>>  {
>>  	struct flow_block_offload bo = {};
>> -	int err;
>>  
>>  	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
>>  			       &block->flow_block, tcf_block_shared(block),
>>  			       extack);
>>  
>> -	if (dev->netdev_ops->ndo_setup_tc)
>> +	if (dev->netdev_ops->ndo_setup_tc) {
>> +		int err;
>> +
>>  		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
>> -	else
>> -		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
>> -						  &bo, tc_block_indr_cleanup);
>> +		if (err < 0) {
>> +			if (err != -EOPNOTSUPP)
>> +				NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
>> +			return err;
>> +		}
>>  
>> -	if (err < 0) {
>> -		if (err != -EOPNOTSUPP)
>> -			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
>> -		return err;
>> +		return tcf_block_setup(block, &bo);
>>  	}
>>  
>> -	return tcf_block_setup(block, &bo);
>> +	flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block, &bo,
>> +				    tc_block_indr_cleanup);
>> +	tcf_block_setup(block, &bo);
>> +
>> +	return -EOPNOTSUPP;
> So tcf_block_offload_cmd() always return -EOPNOTSUPP for _BIND and
> _UNBIND operations after this patch ?

yes, The original design for tc subsystem block->nooffloaddevcnt should

always inc for indr tunnel.

>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log
  2020-06-16 20:30     ` Pablo Neira Ayuso
@ 2020-06-17  2:34       ` wenxu
  0 siblings, 0 replies; 24+ messages in thread
From: wenxu @ 2020-06-17  2:34 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, davem, vladbu


On 6/17/2020 4:30 AM, Pablo Neira Ayuso wrote:
> On Tue, Jun 16, 2020 at 10:17:50PM +0200, Pablo Neira Ayuso wrote:
>> On Tue, Jun 16, 2020 at 11:19:39AM +0800, wenxu@ucloud.cn wrote:
>>> From: wenxu <wenxu@ucloud.cn>
>>>
>>> When a indr device add in offload success. The block->nooffloaddevcnt
>>> should be 0. After the representor go away. When the dir device go away
>>> the flow_block UNBIND operation with -EOPNOTSUPP which lead the warning
>>> dmesg log. 
>>>
>>> The block->nooffloaddevcnt should always count for indr block.
>>> even the indr block offload successful. The representor maybe
>>> gone away and the ingress qdisc can work in software mode.
>>>
>>> block->nooffloaddevcnt warning with following dmesg log:
>>>
>>> [  760.667058] #####################################################
>>> [  760.668186] ## TEST test-ecmp-add-vxlan-encap-disable-sriov.sh ##
>>> [  760.669179] #####################################################
>>> [  761.780655] :test: Fedora 30 (Thirty)
>>> [  761.783794] :test: Linux reg-r-vrt-018-180 5.7.0+
>>> [  761.822890] :test: NIC ens1f0 FW 16.26.6000 PCI 0000:81:00.0 DEVICE 0x1019 ConnectX-5 Ex
>>> [  761.860244] mlx5_core 0000:81:00.0 ens1f0: Link up
>>> [  761.880693] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
>>> [  762.059732] mlx5_core 0000:81:00.1 ens1f1: Link up
>>> [  762.234341] :test: unbind vfs of ens1f0
>>> [  762.257825] :test: Change ens1f0 eswitch (0000:81:00.0) mode to switchdev
>>> [  762.291363] :test: unbind vfs of ens1f1
>>> [  762.306914] :test: Change ens1f1 eswitch (0000:81:00.1) mode to switchdev
>>> [  762.309237] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(LEGACY), nvfs(2), active vports(3)
>>> [  763.282598] mlx5_core 0000:81:00.1: E-Switch: Supported tc offload range - chains: 4294967294, prios: 4294967295
>>> [  763.362825] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>>> [  763.444465] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
>>> [  763.460088] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>>> [  763.502586] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>>> [  763.552429] ens1f1_0: renamed from eth0
>>> [  763.569569] mlx5_core 0000:81:00.1: E-Switch: Enable: mode(OFFLOADS), nvfs(2), active vports(3)
>>> [  763.629694] ens1f1_1: renamed from eth1
>>> [  764.631552] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_0: link becomes ready
>>> [  764.670841] :test: unbind vfs of ens1f0
>>> [  764.681966] :test: unbind vfs of ens1f1
>>> [  764.726762] mlx5_core 0000:81:00.0 ens1f0: Link up
>>> [  764.766511] mlx5_core 0000:81:00.1 ens1f1: Link up
>>> [  764.797325] :test: Add multipath vxlan encap rule and disable sriov
>>> [  764.798544] :test: config multipath route
>>> [  764.812732] mlx5_core 0000:81:00.0: lag map port 1:2 port 2:2
>>> [  764.874556] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:2
>>> [  765.603681] :test: OK
>>> [  765.659048] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1_1: link becomes ready
>>> [  765.675085] :test: verify rule in hw
>>> [  765.694237] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f0: link becomes ready
>>> [  765.711892] IPv6: ADDRCONF(NETDEV_CHANGE): ens1f1: link becomes ready
>>> [  766.979230] :test: OK
>>> [  768.125419] :test: OK
>>> [  768.127519] :test: - disable sriov ens1f1
>>> [  768.131160] pci 0000:81:02.2: Removing from iommu group 75
>>> [  768.132646] pci 0000:81:02.3: Removing from iommu group 76
>>> [  769.179749] mlx5_core 0000:81:00.1: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
>>> [  769.455627] mlx5_core 0000:81:00.0: modify lag map port 1:1 port 2:1
>>> [  769.703990] mlx5_core 0000:81:00.1: MLX5E: StrdRq(1) RqSz(8) StrdSz(2048) RxCqeCmprss(0)
>>> [  769.988637] mlx5_core 0000:81:00.1 ens1f1: renamed from eth0
>>> [  769.990022] :test: - disable sriov ens1f0
>>> [  769.994922] pci 0000:81:00.2: Removing from iommu group 73
>>> [  769.997048] pci 0000:81:00.3: Removing from iommu group 74
>>> [  771.035813] mlx5_core 0000:81:00.0: E-Switch: Disable: mode(OFFLOADS), nvfs(2), active vports(3)
>>> [  771.339091] ------------[ cut here ]------------
>>> [  771.340812] WARNING: CPU: 6 PID: 3448 at net/sched/cls_api.c:749 tcf_block_offload_unbind.isra.0+0x5c/0x60
>>> [  771.341728] Modules linked in: act_mirred act_tunnel_key cls_flower dummy vxlan ip6_udp_tunnel udp_tunnel sch_ingress nfsv3 nfs_acl nfs lockd grace fscache tun bridge stp llc sunrpc rdma_ucm rdma_cm iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core intel_rapl_msr intel_rapl_common sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp mlxfw act_ct nf_flow_table kvm_intel nf_nat kvm nf_conntrack irqbypass crct10dif_pclmul igb crc32_pclmul nf_defrag_ipv6 libcrc32c nf_defrag_ipv4 crc32c_intel ghash_clmulni_intel ptp ipmi_ssif intel_cstate pps_c
>>> ore ses intel_uncore mei_me iTCO_wdt joydev ipmi_si iTCO_vendor_support i2c_i801 enclosure mei ioatdma dca lpc_ich wmi ipmi_devintf pcspkr acpi_power_meter ipmi_msghandler acpi_pad ast i2c_algo_bit drm_vram_helper drm_kms_helper drm_ttm_helper ttm drm mpt3sas raid_class scsi_transport_sas
>>> [  771.347818] CPU: 6 PID: 3448 Comm: test-ecmp-add-v Not tainted 5.7.0+ #1146
>>> [  771.348727] Hardware name: Supermicro SYS-2028TP-DECR/X10DRT-P, BIOS 2.0b 03/30/2017
>>> [  771.349646] RIP: 0010:tcf_block_offload_unbind.isra.0+0x5c/0x60
>>> [  771.350553] Code: 4a fd ff ff 83 f8 a1 74 0e 5b 4c 89 e7 5d 41 5c 41 5d e9 07 93 89 ff 8b 83 a0 00 00 00 8d 50 ff 89 93 a0 00 00 00 85 c0 75 df <0f> 0b eb db 0f 1f 44 00 00 41 57 41 56 41 55 41 89 cd 41 54 49 89
>>> [  771.352420] RSP: 0018:ffffb33144cd3b00 EFLAGS: 00010246
>>> [  771.353353] RAX: 0000000000000000 RBX: ffff8b37cf4b2800 RCX: 0000000000000000
>>> [  771.354294] RDX: 00000000ffffffff RSI: ffff8b3b9aad0000 RDI: ffffffff8d5c6e20
>>> [  771.355245] RBP: ffff8b37eb546948 R08: ffffffffc0b7a348 R09: ffff8b3b9aad0000
>>> [  771.356189] R10: 0000000000000001 R11: ffff8b3ba7a0a1c0 R12: ffff8b37cf4b2850
>>> [  771.357123] R13: ffff8b3b9aad0000 R14: ffff8b37cf4b2820 R15: ffff8b37cf4b2820
>>> [  771.358039] FS:  00007f8a19b6e740(0000) GS:ffff8b3befa00000(0000) knlGS:0000000000000000
>>> [  771.358965] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [  771.359885] CR2: 00007f3afb91c1a0 CR3: 000000045133c004 CR4: 00000000001606e0
>>> [  771.360825] Call Trace:
>>> [  771.361764]  __tcf_block_put+0x84/0x150
>>> [  771.362712]  ingress_destroy+0x1b/0x20 [sch_ingress]
>>> [  771.363658]  qdisc_destroy+0x3e/0xc0
>>> [  771.364594]  dev_shutdown+0x7a/0xa5
>>> [  771.365522]  rollback_registered_many+0x20d/0x530
>>> [  771.366458]  ? netdev_upper_dev_unlink+0x15d/0x1c0
>>> [  771.367387]  unregister_netdevice_many.part.0+0xf/0x70
>>> [  771.368310]  vxlan_netdevice_event+0xa4/0x110 [vxlan]
>>> [  771.369454]  notifier_call_chain+0x4c/0x70
>>> [  771.370579]  rollback_registered_many+0x2f5/0x530
>>> [  771.371719]  rollback_registered+0x56/0x90
>>> [  771.372843]  unregister_netdevice_queue+0x73/0xb0
>>> [  771.373982]  unregister_netdev+0x18/0x20
>>> [  771.375168]  mlx5e_vport_rep_unload+0x56/0xc0 [mlx5_core]
>>> [  771.376327]  esw_offloads_disable+0x81/0x90 [mlx5_core]
>>> [  771.377512]  mlx5_eswitch_disable_locked.cold+0xcb/0x1af [mlx5_core]
>>> [  771.378679]  mlx5_eswitch_disable+0x44/0x60 [mlx5_core]
>>> [  771.379822]  mlx5_device_disable_sriov+0xad/0xb0 [mlx5_core]
>>> [  771.380968]  mlx5_core_sriov_configure+0xc1/0xe0 [mlx5_core]
>>> [  771.382087]  sriov_numvfs_store+0xfc/0x130
>>> [  771.383195]  kernfs_fop_write+0xce/0x1b0
>>> [  771.384302]  vfs_write+0xb6/0x1a0
>>> [  771.385410]  ksys_write+0x5f/0xe0
>>> [  771.386500]  do_syscall_64+0x5b/0x1d0
>>> [  771.387569]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>
>>> Fixes: 0fdcf78d5973 ("net: use flow_indr_dev_setup_offload()")
>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>> ---
>>>  net/sched/cls_api.c | 24 ++++++++++++++----------
>>>  1 file changed, 14 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
>>> index a00a203..86c3937 100644
>>> --- a/net/sched/cls_api.c
>>> +++ b/net/sched/cls_api.c
>>> @@ -671,25 +671,29 @@ static int tcf_block_offload_cmd(struct tcf_block *block,
>>>  				 struct netlink_ext_ack *extack)
>>>  {
>>>  	struct flow_block_offload bo = {};
>>> -	int err;
>>>  
>>>  	tcf_block_offload_init(&bo, dev, command, ei->binder_type,
>>>  			       &block->flow_block, tcf_block_shared(block),
>>>  			       extack);
>>>  
>>> -	if (dev->netdev_ops->ndo_setup_tc)
>>> +	if (dev->netdev_ops->ndo_setup_tc) {
>>> +		int err;
>>> +
>>>  		err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo);
>>> -	else
>>> -		err = flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block,
>>> -						  &bo, tc_block_indr_cleanup);
>>> +		if (err < 0) {
>>> +			if (err != -EOPNOTSUPP)
>>> +				NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
>>> +			return err;
>>> +		}
>>>  
>>> -	if (err < 0) {
>>> -		if (err != -EOPNOTSUPP)
>>> -			NL_SET_ERR_MSG(extack, "Driver ndo_setup_tc failed");
>>> -		return err;
>>> +		return tcf_block_setup(block, &bo);
>>>  	}
>>>  
>>> -	return tcf_block_setup(block, &bo);
>>> +	flow_indr_dev_setup_offload(dev, TC_SETUP_BLOCK, block, &bo,
>>> +				    tc_block_indr_cleanup);
>>> +	tcf_block_setup(block, &bo);
>>> +
>>> +	return -EOPNOTSUPP;
>> So tcf_block_offload_cmd() always return -EOPNOTSUPP for _BIND and
>> _UNBIND operations after this patch ?
> tcf_block_offload_unbind() is not called from the
> tc_block_indr_cleanup() path.
>
> How is this patch related to 1/4, 2/4 and 4/4 ?
The bug will be triggered only with the problem described in patch1/2 are fixed.
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 20:13   ` Pablo Neira Ayuso
@ 2020-06-17  2:42     ` wenxu
  2020-06-17  9:03       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 24+ messages in thread
From: wenxu @ 2020-06-17  2:42 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netdev, davem, vladbu


On 6/17/2020 4:13 AM, Pablo Neira Ayuso wrote:
> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>> From: wenxu <wenxu@ucloud.cn>
>>
>> In the function __flow_block_indr_cleanup, The match stataments
>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>> is totally different data with the flow_indr_dev->cb_priv.
>>
>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>> the driver.
>>
>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>> ---
>>  drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        | 1 +
>>  drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c | 2 +-
>>  drivers/net/ethernet/netronome/nfp/flower/offload.c | 1 +
>>  include/net/flow_offload.h                          | 1 +
>>  net/core/flow_offload.c                             | 2 +-
>>  5 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
>> index ef7f6bc..042c285 100644
>> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
>> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
>> @@ -1918,6 +1918,7 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
>>  
>>  		flow_block_cb_add(block_cb, f);
>>  		list_add_tail(&block_cb->driver_list, &bnxt_block_cb_list);
>> +		block_cb->indr.cb_priv = bp;
> cb_indent ?
>
> Why are you splitting the fix in multiple patches? This makes it
> harder to review.
>
> I think patch 1/4, 2/4 and 4/4 belong to the same logical change?
> Collapse them.

I think patch 1/4, 2/4, 4/4 are different bugs, Although they are all in the new indirect

flow_block infrastructure.

patch1 fix the miss cleanup for flow_block_cb of flowtable

patch2 fix the incorrect check the cb_priv of flow_block_cb

patch4 fix the miss driver_list del in the cleanup callback

So maybe make them alone is better?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 15:47           ` Simon Horman
  2020-06-16 20:38             ` Pablo Neira Ayuso
@ 2020-06-17  2:47             ` wenxu
  1 sibling, 0 replies; 24+ messages in thread
From: wenxu @ 2020-06-17  2:47 UTC (permalink / raw)
  To: Simon Horman; +Cc: netdev, davem, pablo, vladbu


On 6/16/2020 11:47 PM, Simon Horman wrote:
> On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
>> 在 2020/6/16 22:34, Simon Horman 写道:
>>> On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
>>>> 在 2020/6/16 18:51, Simon Horman 写道:
>>>>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>>>>>> From: wenxu <wenxu@ucloud.cn>
>>>>>>
>>>>>> In the function __flow_block_indr_cleanup, The match stataments
>>>>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>>>>>> is totally different data with the flow_indr_dev->cb_priv.
>>>>>>
>>>>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>>>>>> the driver.
>>>>>>
>>>>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>>>>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>>>> Hi Wenxu,
>>>>>
>>>>> I wonder if this can be resolved by using the cb_ident field of struct
>>>>> flow_block_cb.
>>>>>
>>>>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
>>>>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
>>>>> per-block rather than per-device. So part of my proposal is to change
>>>>> that.
>>>> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
>>>>
>>>> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
>>>>
>>>> and bnxt_tc_setup_indr_block.
>>>>
>>>>
>>>> nfp_flower_setup_indr_tc_block:
>>>>
>>>> struct nfp_flower_indr_block_cb_priv *cb_priv;
>>>>
>>>> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
>>>>                                                cb_priv, cb_priv,
>>>>                                                nfp_flower_setup_indr_tc_release);
>>>>
>>>>
>>>> bnxt_tc_setup_indr_block:
>>>>
>>>> struct bnxt_flower_indr_block_cb_priv *cb_priv;
>>>>
>>>> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
>>>>                                                cb_priv, cb_priv,
>>>>                                                bnxt_tc_setup_indr_rel);
>>>>
>>>>
>>>> And the function flow_block_cb_is_busy called in most place. Pass the
>>>>
>>>> parameter as cb_priv but not cb_indent .
>>> Thanks, I see that now. But I still think it would be useful to understand
>>> the purpose of cb_ident. It feels like it would lead to a clean solution
>>> to the problem you have highlighted.
>> I think The cb_ident means identify.  It is used to identify the each flow block cb.
>>
>> In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
>>
>> the block_cb->cb_ident == cb_ident.
> Thanks, I think that I now see what you mean about the different scope of
> cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().
>
> I do, however, still wonder if there is a nicer way than reaching into
> the structure and manually setting block_cb->indr.cb_priv
> at each call-site.
>
> Perhaps a variant of flow_block_cb_alloc() for indirect blocks
> would be nicer?
Yes, It seems a variant of flow_block_cb_alloc() for indirect blocks is better, Thanks.
>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-16 20:38             ` Pablo Neira Ayuso
@ 2020-06-17  3:36               ` wenxu
  2020-06-17  8:38                 ` Pablo Neira Ayuso
  0 siblings, 1 reply; 24+ messages in thread
From: wenxu @ 2020-06-17  3:36 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Simon Horman; +Cc: netdev, davem, vladbu


On 6/17/2020 4:38 AM, Pablo Neira Ayuso wrote:
> On Tue, Jun 16, 2020 at 05:47:17PM +0200, Simon Horman wrote:
>> On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
>>> 在 2020/6/16 22:34, Simon Horman 写道:
>>>> On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
>>>>> 在 2020/6/16 18:51, Simon Horman 写道:
>>>>>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>>>>>>> From: wenxu <wenxu@ucloud.cn>
>>>>>>>
>>>>>>> In the function __flow_block_indr_cleanup, The match stataments
>>>>>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>>>>>>> is totally different data with the flow_indr_dev->cb_priv.
>>>>>>>
>>>>>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>>>>>>> the driver.
>>>>>>>
>>>>>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>>>>>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>>>>> Hi Wenxu,
>>>>>>
>>>>>> I wonder if this can be resolved by using the cb_ident field of struct
>>>>>> flow_block_cb.
>>>>>>
>>>>>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
>>>>>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
>>>>>> per-block rather than per-device. So part of my proposal is to change
>>>>>> that.
>>>>> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
>>>>>
>>>>> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
>>>>>
>>>>> and bnxt_tc_setup_indr_block.
>>>>>
>>>>>
>>>>> nfp_flower_setup_indr_tc_block:
>>>>>
>>>>> struct nfp_flower_indr_block_cb_priv *cb_priv;
>>>>>
>>>>> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
>>>>>                                                cb_priv, cb_priv,
>>>>>                                                nfp_flower_setup_indr_tc_release);
>>>>>
>>>>>
>>>>> bnxt_tc_setup_indr_block:
>>>>>
>>>>> struct bnxt_flower_indr_block_cb_priv *cb_priv;
>>>>>
>>>>> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
>>>>>                                                cb_priv, cb_priv,
>>>>>                                                bnxt_tc_setup_indr_rel);
>>>>>
>>>>>
>>>>> And the function flow_block_cb_is_busy called in most place. Pass the
>>>>>
>>>>> parameter as cb_priv but not cb_indent .
>>>> Thanks, I see that now. But I still think it would be useful to understand
>>>> the purpose of cb_ident. It feels like it would lead to a clean solution
>>>> to the problem you have highlighted.
>>> I think The cb_ident means identify.  It is used to identify the each flow block cb.
>>>
>>> In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
>>>
>>> the block_cb->cb_ident == cb_ident.
>> Thanks, I think that I now see what you mean about the different scope of
>> cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().
>>
>> I do, however, still wonder if there is a nicer way than reaching into
>> the structure and manually setting block_cb->indr.cb_priv
>> at each call-site.
>>
>> Perhaps a variant of flow_block_cb_alloc() for indirect blocks
>> would be nicer?
> A follow up patch to add this new variant would be good. Probably
> __flow_block_indr_binding() can go away with this new variant to set
> up the indirect flow block.


Maybe __flow_block_indr_binding() can't go away. The data and cleanup callback which should

init the flow_block_indr is only in the flow_indr_dev_setup_offload. This can't be gotten in

flow_indr_block_cb_alloc.

>

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-17  3:36               ` wenxu
@ 2020-06-17  8:38                 ` Pablo Neira Ayuso
  2020-06-17 10:09                   ` wenxu
  0 siblings, 1 reply; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-17  8:38 UTC (permalink / raw)
  To: wenxu; +Cc: Simon Horman, netdev, davem, vladbu

On Wed, Jun 17, 2020 at 11:36:19AM +0800, wenxu wrote:
> 
> On 6/17/2020 4:38 AM, Pablo Neira Ayuso wrote:
> > On Tue, Jun 16, 2020 at 05:47:17PM +0200, Simon Horman wrote:
> >> On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
> >>> 在 2020/6/16 22:34, Simon Horman 写道:
> >>>> On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
> >>>>> 在 2020/6/16 18:51, Simon Horman 写道:
> >>>>>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
> >>>>>>> From: wenxu <wenxu@ucloud.cn>
> >>>>>>>
> >>>>>>> In the function __flow_block_indr_cleanup, The match stataments
> >>>>>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
> >>>>>>> is totally different data with the flow_indr_dev->cb_priv.
> >>>>>>>
> >>>>>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
> >>>>>>> the driver.
> >>>>>>>
> >>>>>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
> >>>>>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
> >>>>>> Hi Wenxu,
> >>>>>>
> >>>>>> I wonder if this can be resolved by using the cb_ident field of struct
> >>>>>> flow_block_cb.
> >>>>>>
> >>>>>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
> >>>>>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
> >>>>>> per-block rather than per-device. So part of my proposal is to change
> >>>>>> that.
> >>>>> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
> >>>>>
> >>>>> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
> >>>>>
> >>>>> and bnxt_tc_setup_indr_block.
> >>>>>
> >>>>>
> >>>>> nfp_flower_setup_indr_tc_block:
> >>>>>
> >>>>> struct nfp_flower_indr_block_cb_priv *cb_priv;
> >>>>>
> >>>>> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
> >>>>>                                                cb_priv, cb_priv,
> >>>>>                                                nfp_flower_setup_indr_tc_release);
> >>>>>
> >>>>>
> >>>>> bnxt_tc_setup_indr_block:
> >>>>>
> >>>>> struct bnxt_flower_indr_block_cb_priv *cb_priv;
> >>>>>
> >>>>> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
> >>>>>                                                cb_priv, cb_priv,
> >>>>>                                                bnxt_tc_setup_indr_rel);
> >>>>>
> >>>>>
> >>>>> And the function flow_block_cb_is_busy called in most place. Pass the
> >>>>>
> >>>>> parameter as cb_priv but not cb_indent .
> >>>> Thanks, I see that now. But I still think it would be useful to understand
> >>>> the purpose of cb_ident. It feels like it would lead to a clean solution
> >>>> to the problem you have highlighted.
> >>> I think The cb_ident means identify.  It is used to identify the each flow block cb.
> >>>
> >>> In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
> >>>
> >>> the block_cb->cb_ident == cb_ident.
> >> Thanks, I think that I now see what you mean about the different scope of
> >> cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().
> >>
> >> I do, however, still wonder if there is a nicer way than reaching into
> >> the structure and manually setting block_cb->indr.cb_priv
> >> at each call-site.
> >>
> >> Perhaps a variant of flow_block_cb_alloc() for indirect blocks
> >> would be nicer?
> > A follow up patch to add this new variant would be good. Probably
> > __flow_block_indr_binding() can go away with this new variant to set
> > up the indirect flow block.
> 
> 
> Maybe __flow_block_indr_binding() can't go away. The data and cleanup callback which should
> init the flow_block_indr is only in the flow_indr_dev_setup_offload. This can't be gotten in
> flow_indr_block_cb_alloc.

Probably flow_indr_block_bind_cb_t can be updated to include the data
and the cleanup callback.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-17  2:42     ` wenxu
@ 2020-06-17  9:03       ` Pablo Neira Ayuso
  0 siblings, 0 replies; 24+ messages in thread
From: Pablo Neira Ayuso @ 2020-06-17  9:03 UTC (permalink / raw)
  To: wenxu; +Cc: netdev, davem, vladbu

[-- Attachment #1: Type: text/plain, Size: 1579 bytes --]

On Wed, Jun 17, 2020 at 10:42:19AM +0800, wenxu wrote:
[...]
> >> diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> >> index ef7f6bc..042c285 100644
> >> --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> >> +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
> >> @@ -1918,6 +1918,7 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
> >>  
> >>  		flow_block_cb_add(block_cb, f);
> >>  		list_add_tail(&block_cb->driver_list, &bnxt_block_cb_list);
> >> +		block_cb->indr.cb_priv = bp;
> > cb_indent ?
> >
> > Why are you splitting the fix in multiple patches? This makes it
> > harder to review.
> >
> > I think patch 1/4, 2/4 and 4/4 belong to the same logical change?
> > Collapse them.
> 
> I think patch 1/4, 2/4, 4/4 are different bugs, Although they are all in the new indirect
> flow_block infrastructure.
> 
> patch1 fix the miss cleanup for flow_block_cb of flowtable
> 
> patch2 fix the incorrect check the cb_priv of flow_block_cb
> 
> patch4 fix the miss driver_list del in the cleanup callback
> 
> So maybe make them alone is better?

Maybe.

I'm attaching the collapsed patch for preview.

        10 files changed, 21 insertions(+), 15 deletions(-)

This one single collapsed patch is small.

And it looks like it is part of the same logic fix for the .cleanup
path? You had to update three aspects to make the cleanup path work.

3/4 is a different code path, leaving it standalone makes sense to me,
on top of this one probably.

This is just a proposal.

Thank you.

[-- Attachment #2: 0001-net-flow_offload-fix-flow_indr_dev_unregister-path.patch --]
[-- Type: text/x-diff, Size: 9820 bytes --]

From 8a17ca6fa7852e359aa5fad01ecdda4f5ae33eeb Mon Sep 17 00:00:00 2001
From: wenxu <wenxu@ucloud.cn>
Date: Sat, 13 Jun 2020 17:25:59 +0800
Subject: [PATCH RFC] net: flow_offload: fix flow_indr_dev_unregister path

If the representor is removed, then identify the indirect flow_blocks
that need to be removed by the release callback and the port representor
structure. To identify the port representor structure, a new
indr.cb_priv field needs to be introduced. The flow_block also needs to
be removed from the driver list from the cleanup path.

Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c        |  3 ++-
 drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c |  4 ++--
 drivers/net/ethernet/netronome/nfp/flower/main.c    |  2 +-
 drivers/net/ethernet/netronome/nfp/flower/main.h    |  3 +--
 drivers/net/ethernet/netronome/nfp/flower/offload.c |  7 ++++---
 include/net/flow_offload.h                          |  3 ++-
 net/core/flow_offload.c                             | 11 ++++++-----
 net/netfilter/nf_flow_table_offload.c               |  1 +
 net/netfilter/nf_tables_offload.c                   |  1 +
 net/sched/cls_api.c                                 |  1 +
 10 files changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 0eef4f5e4a46..042c2850fcff 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -1918,6 +1918,7 @@ static int bnxt_tc_setup_indr_block(struct net_device *netdev, struct bnxt *bp,
 
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &bnxt_block_cb_list);
+		block_cb->indr.cb_priv = bp;
 		break;
 	case FLOW_BLOCK_UNBIND:
 		cb_priv = bnxt_tc_indr_block_cb_lookup(bp, netdev);
@@ -2074,7 +2075,7 @@ void bnxt_shutdown_tc(struct bnxt *bp)
 		return;
 
 	flow_indr_dev_unregister(bnxt_tc_setup_indr_cb, bp,
-				 bnxt_tc_setup_indr_block_cb);
+				 bnxt_tc_setup_indr_rel);
 	rhashtable_destroy(&tc_info->flow_table);
 	rhashtable_destroy(&tc_info->l2_table);
 	rhashtable_destroy(&tc_info->decap_l2_table);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
index 80713123de5c..187f84c2ec23 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/rep/tc.c
@@ -447,7 +447,7 @@ mlx5e_rep_indr_setup_block(struct net_device *netdev,
 		}
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &mlx5e_block_cb_list);
-
+		block_cb->indr.cb_priv = rpriv;
 		return 0;
 	case FLOW_BLOCK_UNBIND:
 		indr_priv = mlx5e_rep_indr_block_priv_lookup(rpriv, netdev);
@@ -496,7 +496,7 @@ int mlx5e_rep_tc_netdevice_event_register(struct mlx5e_rep_priv *rpriv)
 void mlx5e_rep_tc_netdevice_event_unregister(struct mlx5e_rep_priv *rpriv)
 {
 	flow_indr_dev_unregister(mlx5e_rep_indr_setup_cb, rpriv,
-				 mlx5e_rep_indr_setup_tc_cb);
+				 mlx5e_rep_indr_block_unbind);
 }
 
 #if IS_ENABLED(CONFIG_NET_TC_SKB_EXT)
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.c b/drivers/net/ethernet/netronome/nfp/flower/main.c
index c39327677a7d..bb448c82cdc2 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.c
@@ -861,7 +861,7 @@ static void nfp_flower_clean(struct nfp_app *app)
 	flush_work(&app_priv->cmsg_work);
 
 	flow_indr_dev_unregister(nfp_flower_indr_setup_tc_cb, app,
-				 nfp_flower_setup_indr_block_cb);
+				 nfp_flower_setup_indr_tc_release);
 
 	if (app_priv->flower_ext_feats & NFP_FL_FEATS_VF_RLIM)
 		nfp_flower_qos_cleanup(app);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/main.h b/drivers/net/ethernet/netronome/nfp/flower/main.h
index 6c3dc3baf387..c98333799f89 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/main.h
+++ b/drivers/net/ethernet/netronome/nfp/flower/main.h
@@ -460,8 +460,7 @@ int nfp_flower_setup_qos_offload(struct nfp_app *app, struct net_device *netdev,
 void nfp_flower_stats_rlim_reply(struct nfp_app *app, struct sk_buff *skb);
 int nfp_flower_indr_setup_tc_cb(struct net_device *netdev, void *cb_priv,
 				enum tc_setup_type type, void *type_data);
-int nfp_flower_setup_indr_block_cb(enum tc_setup_type type, void *type_data,
-				   void *cb_priv);
+void nfp_flower_setup_indr_tc_release(void *cb_priv);
 
 void
 __nfp_flower_non_repr_priv_get(struct nfp_flower_non_repr_priv *non_repr_priv);
diff --git a/drivers/net/ethernet/netronome/nfp/flower/offload.c b/drivers/net/ethernet/netronome/nfp/flower/offload.c
index 695d24b9dd92..ca2f01a3418d 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/offload.c
@@ -1619,8 +1619,8 @@ nfp_flower_indr_block_cb_priv_lookup(struct nfp_app *app,
 	return NULL;
 }
 
-int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
-				   void *type_data, void *cb_priv)
+static int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
+					  void *type_data, void *cb_priv)
 {
 	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
 	struct flow_cls_offload *flower = type_data;
@@ -1637,7 +1637,7 @@ int nfp_flower_setup_indr_block_cb(enum tc_setup_type type,
 	}
 }
 
-static void nfp_flower_setup_indr_tc_release(void *cb_priv)
+void nfp_flower_setup_indr_tc_release(void *cb_priv)
 {
 	struct nfp_flower_indr_block_cb_priv *priv = cb_priv;
 
@@ -1687,6 +1687,7 @@ nfp_flower_setup_indr_tc_block(struct net_device *netdev, struct nfp_app *app,
 
 		flow_block_cb_add(block_cb, f);
 		list_add_tail(&block_cb->driver_list, &nfp_block_cb_list);
+		block_cb->indr.cb_priv = app;
 		return 0;
 	case FLOW_BLOCK_UNBIND:
 		cb_priv = nfp_flower_indr_block_cb_priv_lookup(app, netdev);
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index f2c8311a0433..ef4d8b0e304c 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -450,6 +450,7 @@ struct flow_block_indr {
 	struct net_device		*dev;
 	enum flow_block_binder_type	binder_type;
 	void				*data;
+	void				*cb_priv;
 	void				(*cleanup)(struct flow_block_cb *block_cb);
 };
 
@@ -536,7 +537,7 @@ typedef int flow_indr_block_bind_cb_t(struct net_device *dev, void *cb_priv,
 
 int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv);
 void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      flow_setup_cb_t *setup_cb);
+			      void (*release)(void *cb_priv));
 int flow_indr_dev_setup_offload(struct net_device *dev,
 				enum tc_setup_type type, void *data,
 				struct flow_block_offload *bo,
diff --git a/net/core/flow_offload.c b/net/core/flow_offload.c
index 0cfc35e6be28..66143518f1d4 100644
--- a/net/core/flow_offload.c
+++ b/net/core/flow_offload.c
@@ -372,14 +372,15 @@ int flow_indr_dev_register(flow_indr_block_bind_cb_t *cb, void *cb_priv)
 }
 EXPORT_SYMBOL(flow_indr_dev_register);
 
-static void __flow_block_indr_cleanup(flow_setup_cb_t *setup_cb, void *cb_priv,
+static void __flow_block_indr_cleanup(void (*release)(void *cb_priv),
+				      void *cb_priv,
 				      struct list_head *cleanup_list)
 {
 	struct flow_block_cb *this, *next;
 
 	list_for_each_entry_safe(this, next, &flow_block_indr_list, indr.list) {
-		if (this->cb == setup_cb &&
-		    this->cb_priv == cb_priv) {
+		if (this->release == release &&
+		    this->indr.cb_priv == cb_priv) {
 			list_move(&this->indr.list, cleanup_list);
 			return;
 		}
@@ -397,7 +398,7 @@ static void flow_block_indr_notify(struct list_head *cleanup_list)
 }
 
 void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
-			      flow_setup_cb_t *setup_cb)
+			      void (*release)(void *cb_priv))
 {
 	struct flow_indr_dev *this, *next, *indr_dev = NULL;
 	LIST_HEAD(cleanup_list);
@@ -418,7 +419,7 @@ void flow_indr_dev_unregister(flow_indr_block_bind_cb_t *cb, void *cb_priv,
 		return;
 	}
 
-	__flow_block_indr_cleanup(setup_cb, cb_priv, &cleanup_list);
+	__flow_block_indr_cleanup(release, cb_priv, &cleanup_list);
 	mutex_unlock(&flow_indr_block_lock);
 
 	flow_block_indr_notify(&cleanup_list);
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 62651e6683f6..5fff1e040168 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -950,6 +950,7 @@ static void nf_flow_table_indr_cleanup(struct flow_block_cb *block_cb)
 	nf_flow_table_gc_cleanup(flowtable, dev);
 	down_write(&flowtable->flow_block_lock);
 	list_del(&block_cb->list);
+	list_del(&block_cb->driver_list);
 	flow_block_cb_free(block_cb);
 	up_write(&flowtable->flow_block_lock);
 }
diff --git a/net/netfilter/nf_tables_offload.c b/net/netfilter/nf_tables_offload.c
index 185fc82c99aa..c7cf1cde46de 100644
--- a/net/netfilter/nf_tables_offload.c
+++ b/net/netfilter/nf_tables_offload.c
@@ -296,6 +296,7 @@ static void nft_indr_block_cleanup(struct flow_block_cb *block_cb)
 	nft_flow_block_offload_init(&bo, dev_net(dev), FLOW_BLOCK_UNBIND,
 				    basechain, &extack);
 	mutex_lock(&net->nft.commit_mutex);
+	list_del(&block_cb->driver_list);
 	list_move(&block_cb->list, &bo.cb_list);
 	nft_flow_offload_unbind(&bo, basechain);
 	mutex_unlock(&net->nft.commit_mutex);
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index a00a203b2ef5..f8028d73edf1 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -652,6 +652,7 @@ static void tc_block_indr_cleanup(struct flow_block_cb *block_cb)
 			       &block->flow_block, tcf_block_shared(block),
 			       &extack);
 	down_write(&block->cb_lock);
+	list_del(&block_cb->driver_list);
 	list_move(&block_cb->list, &bo.cb_list);
 	up_write(&block->cb_lock);
 	rtnl_lock();
-- 
2.20.1


^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb
  2020-06-17  8:38                 ` Pablo Neira Ayuso
@ 2020-06-17 10:09                   ` wenxu
  0 siblings, 0 replies; 24+ messages in thread
From: wenxu @ 2020-06-17 10:09 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Simon Horman, netdev, davem, vladbu


On 6/17/2020 4:38 PM, Pablo Neira Ayuso wrote:
> On Wed, Jun 17, 2020 at 11:36:19AM +0800, wenxu wrote:
>> On 6/17/2020 4:38 AM, Pablo Neira Ayuso wrote:
>>> On Tue, Jun 16, 2020 at 05:47:17PM +0200, Simon Horman wrote:
>>>> On Tue, Jun 16, 2020 at 11:18:16PM +0800, wenxu wrote:
>>>>> 在 2020/6/16 22:34, Simon Horman 写道:
>>>>>> On Tue, Jun 16, 2020 at 10:20:46PM +0800, wenxu wrote:
>>>>>>> 在 2020/6/16 18:51, Simon Horman 写道:
>>>>>>>> On Tue, Jun 16, 2020 at 11:19:38AM +0800, wenxu@ucloud.cn wrote:
>>>>>>>>> From: wenxu <wenxu@ucloud.cn>
>>>>>>>>>
>>>>>>>>> In the function __flow_block_indr_cleanup, The match stataments
>>>>>>>>> this->cb_priv == cb_priv is always false, the flow_block_cb->cb_priv
>>>>>>>>> is totally different data with the flow_indr_dev->cb_priv.
>>>>>>>>>
>>>>>>>>> Store the representor cb_priv to the flow_block_cb->indr.cb_priv in
>>>>>>>>> the driver.
>>>>>>>>>
>>>>>>>>> Fixes: 1fac52da5942 ("net: flow_offload: consolidate indirect flow_block infrastructure")
>>>>>>>>> Signed-off-by: wenxu <wenxu@ucloud.cn>
>>>>>>>> Hi Wenxu,
>>>>>>>>
>>>>>>>> I wonder if this can be resolved by using the cb_ident field of struct
>>>>>>>> flow_block_cb.
>>>>>>>>
>>>>>>>> I observe that mlx5e_rep_indr_setup_block() seems to be the only call-site
>>>>>>>> where the value of the cb_ident parameter of flow_block_cb_alloc() is
>>>>>>>> per-block rather than per-device. So part of my proposal is to change
>>>>>>>> that.
>>>>>>> I check all the xxdriver_indr_setup_block. It seems all the cb_ident parameter of
>>>>>>>
>>>>>>> flow_block_cb_alloc is per-block. Both in the nfp_flower_setup_indr_tc_block
>>>>>>>
>>>>>>> and bnxt_tc_setup_indr_block.
>>>>>>>
>>>>>>>
>>>>>>> nfp_flower_setup_indr_tc_block:
>>>>>>>
>>>>>>> struct nfp_flower_indr_block_cb_priv *cb_priv;
>>>>>>>
>>>>>>> block_cb = flow_block_cb_alloc(nfp_flower_setup_indr_block_cb,
>>>>>>>                                                cb_priv, cb_priv,
>>>>>>>                                                nfp_flower_setup_indr_tc_release);
>>>>>>>
>>>>>>>
>>>>>>> bnxt_tc_setup_indr_block:
>>>>>>>
>>>>>>> struct bnxt_flower_indr_block_cb_priv *cb_priv;
>>>>>>>
>>>>>>> block_cb = flow_block_cb_alloc(bnxt_tc_setup_indr_block_cb,
>>>>>>>                                                cb_priv, cb_priv,
>>>>>>>                                                bnxt_tc_setup_indr_rel);
>>>>>>>
>>>>>>>
>>>>>>> And the function flow_block_cb_is_busy called in most place. Pass the
>>>>>>>
>>>>>>> parameter as cb_priv but not cb_indent .
>>>>>> Thanks, I see that now. But I still think it would be useful to understand
>>>>>> the purpose of cb_ident. It feels like it would lead to a clean solution
>>>>>> to the problem you have highlighted.
>>>>> I think The cb_ident means identify.  It is used to identify the each flow block cb.
>>>>>
>>>>> In the both flow_block_cb_is_busy and flow_block_cb_lookup function check
>>>>>
>>>>> the block_cb->cb_ident == cb_ident.
>>>> Thanks, I think that I now see what you mean about the different scope of
>>>> cb_ident and your proposal to allow cleanup by flow_indr_dev_unregister().
>>>>
>>>> I do, however, still wonder if there is a nicer way than reaching into
>>>> the structure and manually setting block_cb->indr.cb_priv
>>>> at each call-site.
>>>>
>>>> Perhaps a variant of flow_block_cb_alloc() for indirect blocks
>>>> would be nicer?
>>> A follow up patch to add this new variant would be good. Probably
>>> __flow_block_indr_binding() can go away with this new variant to set
>>> up the indirect flow block.
>>
>> Maybe __flow_block_indr_binding() can't go away. The data and cleanup callback which should
>> init the flow_block_indr is only in the flow_indr_dev_setup_offload. This can't be gotten in
>> flow_indr_block_cb_alloc.
> Probably flow_indr_block_bind_cb_t can be updated to include the data
> and the cleanup callback.

Yes this can setup the indr_block info in the flow_indr_block_cb_alloc.

it also needs a flow_indr_block_cb_remove to handle the UNBIND setup.

>

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2020-06-17 10:10 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16  3:19 [PATCH net v3 0/4] several fixes for indirect flow_blocks offload wenxu
2020-06-16  3:19 ` [PATCH net v3 1/4] flow_offload: fix incorrect cleanup for flowtable indirect flow_blocks wenxu
2020-06-16 15:55   ` Simon Horman
2020-06-16 20:11   ` Pablo Neira Ayuso
2020-06-16  3:19 ` [PATCH net v3 2/4] flow_offload: fix incorrect cb_priv check for flow_block_cb wenxu
2020-06-16 10:51   ` Simon Horman
2020-06-16 14:20     ` wenxu
2020-06-16 14:34       ` Simon Horman
2020-06-16 15:18         ` wenxu
2020-06-16 15:47           ` Simon Horman
2020-06-16 20:38             ` Pablo Neira Ayuso
2020-06-17  3:36               ` wenxu
2020-06-17  8:38                 ` Pablo Neira Ayuso
2020-06-17 10:09                   ` wenxu
2020-06-17  2:47             ` wenxu
2020-06-16 20:13   ` Pablo Neira Ayuso
2020-06-17  2:42     ` wenxu
2020-06-17  9:03       ` Pablo Neira Ayuso
2020-06-16  3:19 ` [PATCH net v3 3/4] net/sched: cls_api: fix nooffloaddevcnt warning dmesg log wenxu
2020-06-16 20:17   ` Pablo Neira Ayuso
2020-06-16 20:30     ` Pablo Neira Ayuso
2020-06-17  2:34       ` wenxu
2020-06-17  2:29     ` wenxu
2020-06-16  3:19 ` [PATCH net v3 4/4] flow_offload: fix the list_del corruption in the driver list wenxu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).