linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH for-next v2 0/3] RDMA/bnxt_re driver update
@ 2020-02-24 10:49 Selvin Xavier
  2020-02-24 10:49 ` [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path Selvin Xavier
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-02-24 10:49 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Selvin Xavier

Includes code refactoring in the device init/deinit path and
use the new driver unregistration APIs.

Please apply to for-next.

v1-> v2:
 - Remove the patches 1,2 and 6 from the v1 series.
   They are already merged.
 - Added ASSERT_RTNL instead of comment in Patch 2
 - For Patch 3, explicitly queue the removal of the VF devices
   before calling ib_unregister_driver. This can avoid command
   timeouts seen, if the PFs gets removed before the VFs.
   Previous discussion - https://patchwork.kernel.org/patch/11260013/

Selvin Xavier (3):
  RDMA/bnxt_re: Add more flags in device init and uninit path
  RDMA/bnxt_re: Refactor device add/remove functionalities
  RDMA/bnxt_re: Use driver_unregister and unregistration API

 drivers/infiniband/hw/bnxt_re/bnxt_re.h |  16 +-
 drivers/infiniband/hw/bnxt_re/main.c    | 249 ++++++++++++++++----------------
 2 files changed, 137 insertions(+), 128 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path
  2020-02-24 10:49 [PATCH for-next v2 0/3] RDMA/bnxt_re driver update Selvin Xavier
@ 2020-02-24 10:49 ` Selvin Xavier
  2020-02-24 13:48   ` Jason Gunthorpe
  2020-02-24 10:49 ` [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities Selvin Xavier
  2020-02-24 10:49 ` [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API Selvin Xavier
  2 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-02-24 10:49 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Selvin Xavier

Add more flags for better granularity in in device init/uninit path

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/bnxt_re.h | 16 +++++++++----
 drivers/infiniband/hw/bnxt_re/main.c    | 41 ++++++++++++++++-----------------
 2 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
index c736e82..0babc66 100644
--- a/drivers/infiniband/hw/bnxt_re/bnxt_re.h
+++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
@@ -135,11 +135,17 @@ struct bnxt_re_dev {
 #define BNXT_RE_FLAG_IBDEV_REGISTERED		1
 #define BNXT_RE_FLAG_GOT_MSIX			2
 #define BNXT_RE_FLAG_HAVE_L2_REF		3
-#define BNXT_RE_FLAG_RCFW_CHANNEL_EN		4
-#define BNXT_RE_FLAG_QOS_WORK_REG		5
-#define BNXT_RE_FLAG_RESOURCES_ALLOCATED	7
-#define BNXT_RE_FLAG_RESOURCES_INITIALIZED	8
-#define BNXT_RE_FLAG_ISSUE_ROCE_STATS          29
+#define BNXT_RE_FLAG_ALLOC_RCFW			4
+#define BNXT_RE_FLAG_NET_RING_ALLOC		5
+#define BNXT_RE_FLAG_RCFW_CHANNEL_EN		6
+#define BNXT_RE_FLAG_ALLOC_CTX			7
+#define BNXT_RE_FLAG_STATS_CTX_ALLOC		8
+#define BNXT_RE_FLAG_STATS_CTX2_ALLOC		9
+#define BNXT_RE_FLAG_RCFW_CHANNEL_INIT		10
+#define BNXT_RE_FLAG_QOS_WORK_REG		11
+#define BNXT_RE_FLAG_RESOURCES_ALLOCATED	12
+#define BNXT_RE_FLAG_RESOURCES_INITIALIZED	13
+#define BNXT_RE_FLAG_ISSUE_ROCE_STATS		29
 	struct net_device		*netdev;
 	unsigned int			version, major, minor;
 	struct bnxt_qplib_chip_ctx	*chip_ctx;
diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index b5128cc..1402130 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -1335,18 +1335,23 @@ static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev)
 	if (test_and_clear_bit(BNXT_RE_FLAG_RESOURCES_ALLOCATED, &rdev->flags))
 		bnxt_re_free_res(rdev);
 
-	if (test_and_clear_bit(BNXT_RE_FLAG_RCFW_CHANNEL_EN, &rdev->flags)) {
+	if (test_and_clear_bit(BNXT_RE_FLAG_RCFW_CHANNEL_INIT, &rdev->flags)) {
 		rc = bnxt_qplib_deinit_rcfw(&rdev->rcfw);
 		if (rc)
 			ibdev_warn(&rdev->ibdev,
 				   "Failed to deinitialize RCFW: %#x", rc);
+	}
+	if (test_and_clear_bit(BNXT_RE_FLAG_STATS_CTX_ALLOC, &rdev->flags))
 		bnxt_re_net_stats_ctx_free(rdev, rdev->qplib_ctx.stats.fw_id);
+	if (test_and_clear_bit(BNXT_RE_FLAG_ALLOC_CTX, &rdev->flags))
 		bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
+	if (test_and_clear_bit(BNXT_RE_FLAG_RCFW_CHANNEL_EN, &rdev->flags))
 		bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
-		type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
+	type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
+	if (test_and_clear_bit(BNXT_RE_FLAG_NET_RING_ALLOC, &rdev->flags))
 		bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
+	if (test_and_clear_bit(BNXT_RE_FLAG_ALLOC_RCFW, &rdev->flags))
 		bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
-	}
 	if (test_and_clear_bit(BNXT_RE_FLAG_GOT_MSIX, &rdev->flags)) {
 		rc = bnxt_re_free_msix(rdev);
 		if (rc)
@@ -1430,6 +1435,7 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 		goto fail;
 	}
 
+	set_bit(BNXT_RE_FLAG_ALLOC_RCFW, &rdev->flags);
 	type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
 	creq = &rdev->rcfw.creq;
 	rattr.dma_arr = creq->hwq.pbl[PBL_LVL_0].pg_map_arr;
@@ -1441,8 +1447,9 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 	rc = bnxt_re_net_ring_alloc(rdev, &rattr, &creq->ring_id);
 	if (rc) {
 		ibdev_err(&rdev->ibdev, "Failed to allocate CREQ: %#x\n", rc);
-		goto free_rcfw;
+		goto fail;
 	}
+	set_bit(BNXT_RE_FLAG_NET_RING_ALLOC, &rdev->flags);
 	db_offt = bnxt_re_get_nqdb_offset(rdev, BNXT_RE_AEQ_IDX);
 	vid = rdev->msix_entries[BNXT_RE_AEQ_IDX].vector;
 	rc = bnxt_qplib_enable_rcfw_channel(&rdev->rcfw,
@@ -1451,13 +1458,14 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 	if (rc) {
 		ibdev_err(&rdev->ibdev, "Failed to enable RCFW channel: %#x\n",
 			  rc);
-		goto free_ring;
+		goto fail;
 	}
 
+	set_bit(BNXT_RE_FLAG_RCFW_CHANNEL_EN, &rdev->flags);
 	rc = bnxt_qplib_get_dev_attr(&rdev->rcfw, &rdev->dev_attr,
 				     rdev->is_virtfn);
 	if (rc)
-		goto disable_rcfw;
+		goto fail;
 
 	bnxt_re_set_resource_limits(rdev);
 
@@ -1466,25 +1474,27 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 	if (rc) {
 		ibdev_err(&rdev->ibdev,
 			  "Failed to allocate QPLIB context: %#x\n", rc);
-		goto disable_rcfw;
+		goto fail;
 	}
+	set_bit(BNXT_RE_FLAG_ALLOC_CTX, &rdev->flags);
 	rc = bnxt_re_net_stats_ctx_alloc(rdev,
 					 rdev->qplib_ctx.stats.dma_map,
 					 &rdev->qplib_ctx.stats.fw_id);
 	if (rc) {
 		ibdev_err(&rdev->ibdev,
 			  "Failed to allocate stats context: %#x\n", rc);
-		goto free_ctx;
+		goto fail;
 	}
 
+	set_bit(BNXT_RE_FLAG_STATS_CTX_ALLOC, &rdev->flags);
 	rc = bnxt_qplib_init_rcfw(&rdev->rcfw, &rdev->qplib_ctx,
 				  rdev->is_virtfn);
 	if (rc) {
 		ibdev_err(&rdev->ibdev,
 			  "Failed to initialize RCFW: %#x\n", rc);
-		goto free_sctx;
+		goto fail;
 	}
-	set_bit(BNXT_RE_FLAG_RCFW_CHANNEL_EN, &rdev->flags);
+	set_bit(BNXT_RE_FLAG_RCFW_CHANNEL_INIT, &rdev->flags);
 
 	/* Resources based on the 'new' device caps */
 	rc = bnxt_re_alloc_res(rdev);
@@ -1532,17 +1542,6 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_PORT_ACTIVE);
 
 	return 0;
-free_sctx:
-	bnxt_re_net_stats_ctx_free(rdev, rdev->qplib_ctx.stats.fw_id);
-free_ctx:
-	bnxt_qplib_free_ctx(&rdev->qplib_res, &rdev->qplib_ctx);
-disable_rcfw:
-	bnxt_qplib_disable_rcfw_channel(&rdev->rcfw);
-free_ring:
-	type = bnxt_qplib_get_ring_type(rdev->chip_ctx);
-	bnxt_re_net_ring_free(rdev, rdev->rcfw.creq.ring_id, type);
-free_rcfw:
-	bnxt_qplib_free_rcfw_channel(&rdev->rcfw);
 fail:
 	if (!locked)
 		rtnl_lock();
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities
  2020-02-24 10:49 [PATCH for-next v2 0/3] RDMA/bnxt_re driver update Selvin Xavier
  2020-02-24 10:49 ` [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path Selvin Xavier
@ 2020-02-24 10:49 ` Selvin Xavier
  2020-02-24 13:53   ` Jason Gunthorpe
  2020-02-24 10:49 ` [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API Selvin Xavier
  2 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-02-24 10:49 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Selvin Xavier

 - bnxt_re_ib_reg() handles two main functionalities - initializing
   the device and registering with the IB stack.  Split it into 2
   functions i.e. bnxt_re_dev_init() and bnxt_re_ib_init()  to account
   for the same thereby improve modularity. Do the same for
   bnxt_re_ib_unreg()i.e. split into two functions - bnxt_re_dev_uninit()
   and  bnxt_re_ib_uninit().
 - Simplify the code by combining the different steps to add and
   remove the device into two functions.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/main.c | 135 ++++++++++++++++++++---------------
 1 file changed, 78 insertions(+), 57 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 1402130..3a548b1 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -78,7 +78,8 @@ static struct list_head bnxt_re_dev_list = LIST_HEAD_INIT(bnxt_re_dev_list);
 /* Mutex to protect the list of bnxt_re devices added */
 static DEFINE_MUTEX(bnxt_re_dev_lock);
 static struct workqueue_struct *bnxt_re_wq;
-static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev);
+static void bnxt_re_remove_device(struct bnxt_re_dev *rdev);
+static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev);
 
 static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev)
 {
@@ -237,7 +238,9 @@ static void bnxt_re_shutdown(void *p)
 	if (!rdev)
 		return;
 
-	bnxt_re_ib_unreg(rdev);
+	bnxt_re_ib_uninit(rdev);
+	ASSERT_RTNL();
+	bnxt_re_remove_device(rdev);
 }
 
 static void bnxt_re_stop_irq(void *handle)
@@ -1317,7 +1320,37 @@ static void bnxt_re_query_hwrm_intf_version(struct bnxt_re_dev *rdev)
 		le16_to_cpu(resp.hwrm_intf_patch);
 }
 
-static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev)
+static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev)
+{
+	/* Cleanup ib dev */
+	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
+		ib_unregister_device(&rdev->ibdev);
+		clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
+	}
+}
+
+int bnxt_re_ib_init(struct bnxt_re_dev *rdev)
+{
+	int rc = 0;
+
+	/* Register ib dev */
+	rc = bnxt_re_register_ib(rdev);
+	if (rc) {
+		pr_err("Failed to register with IB: %#x\n", rc);
+		return rc;
+	}
+	set_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
+	dev_info(rdev_to_dev(rdev), "Device registered successfully");
+	ib_get_eth_speed(&rdev->ibdev, 1, &rdev->active_speed,
+			 &rdev->active_width);
+	set_bit(BNXT_RE_FLAG_ISSUE_ROCE_STATS, &rdev->flags);
+	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_PORT_ACTIVE);
+	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_GID_CHANGE);
+
+	return rc;
+}
+
+static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev)
 {
 	u8 type;
 	int rc;
@@ -1378,20 +1411,15 @@ static void bnxt_re_worker(struct work_struct *work)
 	schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000));
 }
 
-static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
+static int bnxt_re_dev_init(struct bnxt_re_dev *rdev)
 {
 	struct bnxt_qplib_creq_ctx *creq;
 	struct bnxt_re_ring_attr rattr;
 	u32 db_offt;
-	bool locked;
 	int vid;
 	u8 type;
 	int rc;
 
-	/* Acquire rtnl lock through out this function */
-	rtnl_lock();
-	locked = true;
-
 	/* Registered a new RoCE device instance to netdev */
 	memset(&rattr, 0, sizeof(rattr));
 	rc = bnxt_re_register_netdev(rdev);
@@ -1524,30 +1552,10 @@ static int bnxt_re_ib_reg(struct bnxt_re_dev *rdev)
 		schedule_delayed_work(&rdev->worker, msecs_to_jiffies(30000));
 	}
 
-	rtnl_unlock();
-	locked = false;
-
-	/* Register ib dev */
-	rc = bnxt_re_register_ib(rdev);
-	if (rc) {
-		ibdev_err(&rdev->ibdev,
-			  "Failed to register with IB: %#x\n", rc);
-		goto fail;
-	}
-	set_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
-	ibdev_info(&rdev->ibdev, "Device registered successfully");
-	ib_get_eth_speed(&rdev->ibdev, 1, &rdev->active_speed,
-			 &rdev->active_width);
-	set_bit(BNXT_RE_FLAG_ISSUE_ROCE_STATS, &rdev->flags);
-	bnxt_re_dispatch_event(&rdev->ibdev, NULL, 1, IB_EVENT_PORT_ACTIVE);
-
 	return 0;
 fail:
-	if (!locked)
-		rtnl_lock();
-	bnxt_re_ib_unreg(rdev);
-	rtnl_unlock();
 
+	bnxt_re_dev_uninit(rdev);
 	return rc;
 }
 
@@ -1588,9 +1596,35 @@ static int bnxt_re_dev_reg(struct bnxt_re_dev **rdev, struct net_device *netdev)
 	return rc;
 }
 
-static void bnxt_re_remove_one(struct bnxt_re_dev *rdev)
+static void bnxt_re_remove_device(struct bnxt_re_dev *rdev)
 {
+	bnxt_re_dev_uninit(rdev);
 	pci_dev_put(rdev->en_dev->pdev);
+	bnxt_re_dev_unreg(rdev);
+}
+
+static int bnxt_re_add_device(struct bnxt_re_dev **rdev,
+			      struct net_device *netdev)
+{
+	int rc;
+
+	rc = bnxt_re_dev_reg(rdev, netdev);
+	if (rc == -ENODEV)
+		return rc;
+	if (rc) {
+		pr_err("Failed to register with the device %s: %#x\n",
+		       netdev->name, rc);
+		return rc;
+	}
+
+	pci_dev_get((*rdev)->en_dev->pdev);
+	rc = bnxt_re_dev_init(*rdev);
+	if (rc) {
+		pci_dev_put((*rdev)->en_dev->pdev);
+		bnxt_re_dev_unreg(*rdev);
+	}
+
+	return rc;
 }
 
 /* Handle all deferred netevents tasks */
@@ -1605,16 +1639,17 @@ static void bnxt_re_task(struct work_struct *work)
 
 	if (re_work->event != NETDEV_REGISTER &&
 	    !test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags))
-		return;
+		goto done;
 
 	switch (re_work->event) {
 	case NETDEV_REGISTER:
-		rc = bnxt_re_ib_reg(rdev);
+		rc = bnxt_re_ib_init(rdev);
 		if (rc) {
 			ibdev_err(&rdev->ibdev,
 				  "Failed to register with IB: %#x", rc);
-			bnxt_re_remove_one(rdev);
-			bnxt_re_dev_unreg(rdev);
+			rtnl_lock();
+			bnxt_re_remove_device(rdev);
+			rtnl_unlock();
 			goto exit;
 		}
 		break;
@@ -1637,17 +1672,13 @@ static void bnxt_re_task(struct work_struct *work)
 	default:
 		break;
 	}
+done:
 	smp_mb__before_atomic();
 	atomic_dec(&rdev->sched_count);
 exit:
 	kfree(re_work);
 }
 
-static void bnxt_re_init_one(struct bnxt_re_dev *rdev)
-{
-	pci_dev_get(rdev->en_dev->pdev);
-}
-
 /*
  * "Notifier chain callback can be invoked for the same chain from
  * different CPUs at the same time".
@@ -1685,17 +1716,9 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 	case NETDEV_REGISTER:
 		if (rdev)
 			break;
-		rc = bnxt_re_dev_reg(&rdev, real_dev);
-		if (rc == -ENODEV)
-			break;
-		if (rc) {
-			ibdev_err(&rdev->ibdev,
-				  "Failed to register with the device %s: %#x\n",
-				  real_dev->name, rc);
-			break;
-		}
-		bnxt_re_init_one(rdev);
-		sch_work = true;
+		rc = bnxt_re_add_device(&rdev, real_dev);
+		if (!rc)
+			sch_work = true;
 		break;
 
 	case NETDEV_UNREGISTER:
@@ -1704,9 +1727,8 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 		 */
 		if (atomic_read(&rdev->sched_count) > 0)
 			goto exit;
-		bnxt_re_ib_unreg(rdev);
-		bnxt_re_remove_one(rdev);
-		bnxt_re_dev_unreg(rdev);
+		bnxt_re_ib_uninit(rdev);
+		bnxt_re_remove_device(rdev);
 		break;
 
 	default:
@@ -1783,12 +1805,11 @@ static void __exit bnxt_re_mod_exit(void)
 		 */
 		flush_workqueue(bnxt_re_wq);
 		bnxt_re_dev_stop(rdev);
+		bnxt_re_ib_uninit(rdev);
 		/* Acquire the rtnl_lock as the L2 resources are freed here */
 		rtnl_lock();
-		bnxt_re_ib_unreg(rdev);
+		bnxt_re_remove_device(rdev);
 		rtnl_unlock();
-		bnxt_re_remove_one(rdev);
-		bnxt_re_dev_unreg(rdev);
 	}
 	unregister_netdevice_notifier(&bnxt_re_netdev_notifier);
 	if (bnxt_re_wq)
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API
  2020-02-24 10:49 [PATCH for-next v2 0/3] RDMA/bnxt_re driver update Selvin Xavier
  2020-02-24 10:49 ` [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path Selvin Xavier
  2020-02-24 10:49 ` [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities Selvin Xavier
@ 2020-02-24 10:49 ` Selvin Xavier
  2020-02-24 13:43   ` Jason Gunthorpe
  2 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-02-24 10:49 UTC (permalink / raw)
  To: dledford, jgg; +Cc: linux-rdma, Selvin Xavier

Using the new unregister APIs provided by the core.
Provide the dealloc_driver hook for the core to callback at the time
of device un-registration.

bnxt_re VF resources are created by the corresponding PF driver.
During ib_unregister_driver, PF might get removed before VF
and this could cause failure when VFs are removed. Driver
is explicitly queuing the removal of VF devices before
calling ib_unregister_driver. Commands to any VFs after
the PF removal will timeout and driver will proceed with
unregisteration and free up the host resources.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
---
 drivers/infiniband/hw/bnxt_re/main.c | 105 +++++++++++++++--------------------
 1 file changed, 44 insertions(+), 61 deletions(-)

diff --git a/drivers/infiniband/hw/bnxt_re/main.c b/drivers/infiniband/hw/bnxt_re/main.c
index 3a548b1..6dec09d 100644
--- a/drivers/infiniband/hw/bnxt_re/main.c
+++ b/drivers/infiniband/hw/bnxt_re/main.c
@@ -79,7 +79,8 @@ static struct list_head bnxt_re_dev_list = LIST_HEAD_INIT(bnxt_re_dev_list);
 static DEFINE_MUTEX(bnxt_re_dev_lock);
 static struct workqueue_struct *bnxt_re_wq;
 static void bnxt_re_remove_device(struct bnxt_re_dev *rdev);
-static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev);
+static void bnxt_re_dealloc_driver(struct ib_device *ib_dev);
+static void bnxt_re_stop_irq(void *handle);
 
 static void bnxt_re_destroy_chip_ctx(struct bnxt_re_dev *rdev)
 {
@@ -237,10 +238,10 @@ static void bnxt_re_shutdown(void *p)
 
 	if (!rdev)
 		return;
-
-	bnxt_re_ib_uninit(rdev);
 	ASSERT_RTNL();
-	bnxt_re_remove_device(rdev);
+	/* Release the MSIx vectors before queuing unregister */
+	bnxt_re_stop_irq(rdev);
+	ib_unregister_device_queued(&rdev->ibdev);
 }
 
 static void bnxt_re_stop_irq(void *handle)
@@ -542,17 +543,12 @@ static bool is_bnxt_re_dev(struct net_device *netdev)
 
 static struct bnxt_re_dev *bnxt_re_from_netdev(struct net_device *netdev)
 {
-	struct bnxt_re_dev *rdev;
+	struct ib_device *ibdev =
+		ib_device_get_by_netdev(netdev, RDMA_DRIVER_BNXT_RE);
+	if (!ibdev)
+		return NULL;
 
-	rcu_read_lock();
-	list_for_each_entry_rcu(rdev, &bnxt_re_dev_list, list) {
-		if (rdev->netdev == netdev) {
-			rcu_read_unlock();
-			return rdev;
-		}
-	}
-	rcu_read_unlock();
-	return NULL;
+	return container_of(ibdev, struct bnxt_re_dev, ibdev);
 }
 
 static void bnxt_re_dev_unprobe(struct net_device *netdev,
@@ -626,11 +622,6 @@ static const struct attribute_group bnxt_re_dev_attr_group = {
 	.attrs = bnxt_re_attributes,
 };
 
-static void bnxt_re_unregister_ib(struct bnxt_re_dev *rdev)
-{
-	ib_unregister_device(&rdev->ibdev);
-}
-
 static const struct ib_device_ops bnxt_re_dev_ops = {
 	.owner = THIS_MODULE,
 	.driver_id = RDMA_DRIVER_BNXT_RE,
@@ -645,6 +636,7 @@ static const struct ib_device_ops bnxt_re_dev_ops = {
 	.create_cq = bnxt_re_create_cq,
 	.create_qp = bnxt_re_create_qp,
 	.create_srq = bnxt_re_create_srq,
+	.dealloc_driver = bnxt_re_dealloc_driver,
 	.dealloc_pd = bnxt_re_dealloc_pd,
 	.dealloc_ucontext = bnxt_re_dealloc_ucontext,
 	.del_gid = bnxt_re_del_gid,
@@ -741,15 +733,11 @@ static void bnxt_re_dev_remove(struct bnxt_re_dev *rdev)
 {
 	dev_put(rdev->netdev);
 	rdev->netdev = NULL;
-
 	mutex_lock(&bnxt_re_dev_lock);
 	list_del_rcu(&rdev->list);
 	mutex_unlock(&bnxt_re_dev_lock);
 
 	synchronize_rcu();
-
-	ib_dealloc_device(&rdev->ibdev);
-	/* rdev is gone */
 }
 
 static struct bnxt_re_dev *bnxt_re_dev_add(struct net_device *netdev,
@@ -1320,15 +1308,6 @@ static void bnxt_re_query_hwrm_intf_version(struct bnxt_re_dev *rdev)
 		le16_to_cpu(resp.hwrm_intf_patch);
 }
 
-static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev)
-{
-	/* Cleanup ib dev */
-	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
-		ib_unregister_device(&rdev->ibdev);
-		clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
-	}
-}
-
 int bnxt_re_ib_init(struct bnxt_re_dev *rdev)
 {
 	int rc = 0;
@@ -1355,10 +1334,6 @@ static void bnxt_re_dev_uninit(struct bnxt_re_dev *rdev)
 	u8 type;
 	int rc;
 
-	if (test_and_clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
-		/* Cleanup ib dev */
-		bnxt_re_unregister_ib(rdev);
-	}
 	if (test_and_clear_bit(BNXT_RE_FLAG_QOS_WORK_REG, &rdev->flags))
 		cancel_delayed_work_sync(&rdev->worker);
 
@@ -1627,6 +1602,19 @@ static int bnxt_re_add_device(struct bnxt_re_dev **rdev,
 	return rc;
 }
 
+static void bnxt_re_dealloc_driver(struct ib_device *ib_dev)
+{
+	struct bnxt_re_dev *rdev =
+		container_of(ib_dev, struct bnxt_re_dev, ibdev);
+
+	clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
+	dev_info(rdev_to_dev(rdev), "Unregistering Device");
+
+	rtnl_lock();
+	bnxt_re_remove_device(rdev);
+	rtnl_unlock();
+}
+
 /* Handle all deferred netevents tasks */
 static void bnxt_re_task(struct work_struct *work)
 {
@@ -1701,6 +1689,7 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 	struct bnxt_re_dev *rdev;
 	int rc = 0;
 	bool sch_work = false;
+	bool release = true;
 
 	real_dev = rdma_vlan_dev_real_dev(netdev);
 	if (!real_dev)
@@ -1708,7 +1697,8 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 
 	rdev = bnxt_re_from_netdev(real_dev);
 	if (!rdev && event != NETDEV_REGISTER)
-		goto exit;
+		return NOTIFY_OK;
+
 	if (real_dev != netdev)
 		goto exit;
 
@@ -1719,6 +1709,7 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 		rc = bnxt_re_add_device(&rdev, real_dev);
 		if (!rc)
 			sch_work = true;
+		release = false;
 		break;
 
 	case NETDEV_UNREGISTER:
@@ -1727,8 +1718,7 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 		 */
 		if (atomic_read(&rdev->sched_count) > 0)
 			goto exit;
-		bnxt_re_ib_uninit(rdev);
-		bnxt_re_remove_device(rdev);
+		ib_unregister_device_queued(&rdev->ibdev);
 		break;
 
 	default:
@@ -1750,6 +1740,8 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
 	}
 
 exit:
+	if (rdev && release)
+		ib_device_put(&rdev->ibdev);
 	return NOTIFY_DONE;
 }
 
@@ -1785,32 +1777,23 @@ static int __init bnxt_re_mod_init(void)
 
 static void __exit bnxt_re_mod_exit(void)
 {
-	struct bnxt_re_dev *rdev, *next;
-	LIST_HEAD(to_be_deleted);
+	struct bnxt_re_dev *rdev;
 
+	flush_workqueue(bnxt_re_wq);
 	mutex_lock(&bnxt_re_dev_lock);
-	/* Free all adapter allocated resources */
-	if (!list_empty(&bnxt_re_dev_list))
-		list_splice_init(&bnxt_re_dev_list, &to_be_deleted);
-	mutex_unlock(&bnxt_re_dev_lock);
-       /*
-	* Cleanup the devices in reverse order so that the VF device
-	* cleanup is done before PF cleanup
-	*/
-	list_for_each_entry_safe_reverse(rdev, next, &to_be_deleted, list) {
-		ibdev_info(&rdev->ibdev, "Unregistering Device");
-		/*
-		 * Flush out any scheduled tasks before destroying the
-		 * resources
+	list_for_each_entry(rdev, &bnxt_re_dev_list, list) {
+		/* VF device removal should be called before the removal
+		 * of PF device. Queue VFs unregister first, so that VFs
+		 * shall be removed before the PF during the call of
+		 * ib_unregister_driver. Commands to any VFs after the
+		 * PF removal will timeout and driver will proceed with
+		 * unregisteration and free up the host resources.
 		 */
-		flush_workqueue(bnxt_re_wq);
-		bnxt_re_dev_stop(rdev);
-		bnxt_re_ib_uninit(rdev);
-		/* Acquire the rtnl_lock as the L2 resources are freed here */
-		rtnl_lock();
-		bnxt_re_remove_device(rdev);
-		rtnl_unlock();
+		if (rdev->is_virtfn)
+			ib_unregister_device_queued(&rdev->ibdev);
 	}
+	mutex_unlock(&bnxt_re_dev_lock);
+	ib_unregister_driver(RDMA_DRIVER_BNXT_RE);
 	unregister_netdevice_notifier(&bnxt_re_netdev_notifier);
 	if (bnxt_re_wq)
 		destroy_workqueue(bnxt_re_wq);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API
  2020-02-24 10:49 ` [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API Selvin Xavier
@ 2020-02-24 13:43   ` Jason Gunthorpe
  2020-02-25  4:10     ` Selvin Xavier
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-02-24 13:43 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: dledford, linux-rdma

On Mon, Feb 24, 2020 at 02:49:55AM -0800, Selvin Xavier wrote:
>  
> @@ -1785,32 +1777,23 @@ static int __init bnxt_re_mod_init(void)
>  
>  static void __exit bnxt_re_mod_exit(void)
>  {
> -	struct bnxt_re_dev *rdev, *next;
> -	LIST_HEAD(to_be_deleted);
> +	struct bnxt_re_dev *rdev;
>  
> +	flush_workqueue(bnxt_re_wq);

What is this for? Usually flushing a work queue before preventing new
work from queueing (ie via unregister) is racy.

>  	mutex_lock(&bnxt_re_dev_lock);
> -	/* Free all adapter allocated resources */
> -	if (!list_empty(&bnxt_re_dev_list))
> -		list_splice_init(&bnxt_re_dev_list, &to_be_deleted);
> -	mutex_unlock(&bnxt_re_dev_lock);
> -       /*
> -	* Cleanup the devices in reverse order so that the VF device
> -	* cleanup is done before PF cleanup
> -	*/
> -	list_for_each_entry_safe_reverse(rdev, next, &to_be_deleted, list) {
> -		ibdev_info(&rdev->ibdev, "Unregistering Device");
> -		/*
> -		 * Flush out any scheduled tasks before destroying the
> -		 * resources
> +	list_for_each_entry(rdev, &bnxt_re_dev_list, list) {
> +		/* VF device removal should be called before the removal
> +		 * of PF device. Queue VFs unregister first, so that VFs
> +		 * shall be removed before the PF during the call of
> +		 * ib_unregister_driver. Commands to any VFs after the
> +		 * PF removal will timeout and driver will proceed with
> +		 * unregisteration and free up the host resources.
>  		 */
> -		flush_workqueue(bnxt_re_wq);
> -		bnxt_re_dev_stop(rdev);
> -		bnxt_re_ib_uninit(rdev);
> -		/* Acquire the rtnl_lock as the L2 resources are freed here */
> -		rtnl_lock();
> -		bnxt_re_remove_device(rdev);
> -		rtnl_unlock();
> +		if (rdev->is_virtfn)
> +			ib_unregister_device_queued(&rdev->ibdev);

Why do it queued? Just call ib_unregister_device(). Otherwise it won't
order reliably.

But be very careful about lifetime. All the other drivers had problems
managing the lifetime of the pointers in their device lists.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path
  2020-02-24 10:49 ` [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path Selvin Xavier
@ 2020-02-24 13:48   ` Jason Gunthorpe
  2020-02-25 10:14     ` Selvin Xavier
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-02-24 13:48 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: dledford, linux-rdma

On Mon, Feb 24, 2020 at 02:49:53AM -0800, Selvin Xavier wrote:
> Add more flags for better granularity in in device init/uninit path
> 
> Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
>  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 16 +++++++++----
>  drivers/infiniband/hw/bnxt_re/main.c    | 41 ++++++++++++++++-----------------
>  2 files changed, 31 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> index c736e82..0babc66 100644
> +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> @@ -135,11 +135,17 @@ struct bnxt_re_dev {
>  #define BNXT_RE_FLAG_IBDEV_REGISTERED		1
>  #define BNXT_RE_FLAG_GOT_MSIX			2
>  #define BNXT_RE_FLAG_HAVE_L2_REF		3
> -#define BNXT_RE_FLAG_RCFW_CHANNEL_EN		4
> -#define BNXT_RE_FLAG_QOS_WORK_REG		5
> -#define BNXT_RE_FLAG_RESOURCES_ALLOCATED	7
> -#define BNXT_RE_FLAG_RESOURCES_INITIALIZED	8
> -#define BNXT_RE_FLAG_ISSUE_ROCE_STATS          29
> +#define BNXT_RE_FLAG_ALLOC_RCFW			4
> +#define BNXT_RE_FLAG_NET_RING_ALLOC		5
> +#define BNXT_RE_FLAG_RCFW_CHANNEL_EN		6
> +#define BNXT_RE_FLAG_ALLOC_CTX			7
> +#define BNXT_RE_FLAG_STATS_CTX_ALLOC		8
> +#define BNXT_RE_FLAG_STATS_CTX2_ALLOC		9
> +#define BNXT_RE_FLAG_RCFW_CHANNEL_INIT		10
> +#define BNXT_RE_FLAG_QOS_WORK_REG		11
> +#define BNXT_RE_FLAG_RESOURCES_ALLOCATED	12
> +#define BNXT_RE_FLAG_RESOURCES_INITIALIZED	13
> +#define BNXT_RE_FLAG_ISSUE_ROCE_STATS		29

The error unwind is better than adding these endless flags.

It is hard to understand the flow with all this needless conditional,
really if you want to change this I'd delete the test_bit scheme and
stick to the normal goto error unwind.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities
  2020-02-24 10:49 ` [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities Selvin Xavier
@ 2020-02-24 13:53   ` Jason Gunthorpe
  2020-02-25  4:22     ` Selvin Xavier
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-02-24 13:53 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: dledford, linux-rdma

>  static void bnxt_re_stop_irq(void *handle)
> @@ -1317,7 +1320,37 @@ static void bnxt_re_query_hwrm_intf_version(struct bnxt_re_dev *rdev)
>  		le16_to_cpu(resp.hwrm_intf_patch);
>  }
>  
> -static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev)
> +static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev)
> +{
> +	/* Cleanup ib dev */
> +	if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
> +		ib_unregister_device(&rdev->ibdev);
> +		clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
> +	}

The reason ib_unregister_device_queued exists is because you can't
call unregistration while holding RTNL.

>  	case NETDEV_UNREGISTER:
> @@ -1704,9 +1727,8 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
>  		 */
>  		if (atomic_read(&rdev->sched_count) > 0)
>  			goto exit;
> -		bnxt_re_ib_unreg(rdev);
> -		bnxt_re_remove_one(rdev);
> -		bnxt_re_dev_unreg(rdev);
> +		bnxt_re_ib_uninit(rdev);
> +		bnxt_re_remove_device(rdev);
>  		break;

ie here.

This *must* simply be a call to ib_unregister_device_queued() and all
this other stuff has to go into the dealloc.

As written this is all kinds of deadlocking and racy

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API
  2020-02-24 13:43   ` Jason Gunthorpe
@ 2020-02-25  4:10     ` Selvin Xavier
  2020-02-25 13:16       ` Jason Gunthorpe
  0 siblings, 1 reply; 12+ messages in thread
From: Selvin Xavier @ 2020-02-25  4:10 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Mon, Feb 24, 2020 at 7:13 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> On Mon, Feb 24, 2020 at 02:49:55AM -0800, Selvin Xavier wrote:
> >
> > @@ -1785,32 +1777,23 @@ static int __init bnxt_re_mod_init(void)
> >
> >  static void __exit bnxt_re_mod_exit(void)
> >  {
> > -     struct bnxt_re_dev *rdev, *next;
> > -     LIST_HEAD(to_be_deleted);
> > +     struct bnxt_re_dev *rdev;
> >
> > +     flush_workqueue(bnxt_re_wq);
>
> What is this for? Usually flushing a work queue before preventing new
> work from queueing (ie via unregister) is racy.
To flush out any netdev events scheduled to be handled by
bnxt_re_task. Mainly to wait for
case where we are in the middle of NETDEV_REGISTER event handled from
bnxt_re_task.
>
> >       mutex_lock(&bnxt_re_dev_lock);
> > -     /* Free all adapter allocated resources */
> > -     if (!list_empty(&bnxt_re_dev_list))
> > -             list_splice_init(&bnxt_re_dev_list, &to_be_deleted);
> > -     mutex_unlock(&bnxt_re_dev_lock);
> > -       /*
> > -     * Cleanup the devices in reverse order so that the VF device
> > -     * cleanup is done before PF cleanup
> > -     */
> > -     list_for_each_entry_safe_reverse(rdev, next, &to_be_deleted, list) {
> > -             ibdev_info(&rdev->ibdev, "Unregistering Device");
> > -             /*
> > -              * Flush out any scheduled tasks before destroying the
> > -              * resources
> > +     list_for_each_entry(rdev, &bnxt_re_dev_list, list) {
> > +             /* VF device removal should be called before the removal
> > +              * of PF device. Queue VFs unregister first, so that VFs
> > +              * shall be removed before the PF during the call of
> > +              * ib_unregister_driver. Commands to any VFs after the
> > +              * PF removal will timeout and driver will proceed with
> > +              * unregisteration and free up the host resources.
> >                */
> > -             flush_workqueue(bnxt_re_wq);
> > -             bnxt_re_dev_stop(rdev);
> > -             bnxt_re_ib_uninit(rdev);
> > -             /* Acquire the rtnl_lock as the L2 resources are freed here */
> > -             rtnl_lock();
> > -             bnxt_re_remove_device(rdev);
> > -             rtnl_unlock();
> > +             if (rdev->is_virtfn)
> > +                     ib_unregister_device_queued(&rdev->ibdev);
>
> Why do it queued? Just call ib_unregister_device(). Otherwise it won't
> order reliably.
Sure. Will change it.
>
> But be very careful about lifetime. All the other drivers had problems
> managing the lifetime of the pointers in their device lists.
>
> Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities
  2020-02-24 13:53   ` Jason Gunthorpe
@ 2020-02-25  4:22     ` Selvin Xavier
  0 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-02-25  4:22 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Mon, Feb 24, 2020 at 7:24 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> >  static void bnxt_re_stop_irq(void *handle)
> > @@ -1317,7 +1320,37 @@ static void bnxt_re_query_hwrm_intf_version(struct bnxt_re_dev *rdev)
> >               le16_to_cpu(resp.hwrm_intf_patch);
> >  }
> >
> > -static void bnxt_re_ib_unreg(struct bnxt_re_dev *rdev)
> > +static void bnxt_re_ib_uninit(struct bnxt_re_dev *rdev)
> > +{
> > +     /* Cleanup ib dev */
> > +     if (test_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags)) {
> > +             ib_unregister_device(&rdev->ibdev);
> > +             clear_bit(BNXT_RE_FLAG_IBDEV_REGISTERED, &rdev->flags);
> > +     }
>
> The reason ib_unregister_device_queued exists is because you can't
> call unregistration while holding RTNL.
>
> >       case NETDEV_UNREGISTER:
> > @@ -1704,9 +1727,8 @@ static int bnxt_re_netdev_event(struct notifier_block *notifier,
> >                */
> >               if (atomic_read(&rdev->sched_count) > 0)
> >                       goto exit;
> > -             bnxt_re_ib_unreg(rdev);
> > -             bnxt_re_remove_one(rdev);
> > -             bnxt_re_dev_unreg(rdev);
> > +             bnxt_re_ib_uninit(rdev);
> > +             bnxt_re_remove_device(rdev);
> >               break;
>
> ie here.
>
> This *must* simply be a call to ib_unregister_device_queued() and all
> this other stuff has to go into the dealloc.
>
> As written this is all kinds of deadlocking and racy
>

This patch (patch 2 of this series) was to refactor the existing code and
group the reg/unreg operations as bnxt_re_add_device/bnxt_re_remove_device.  The
third patch in this series is introducing the dealloc_driver hook and
changing the code as
you suggested by calling ib_unregister_device_queued/ ib_unregister_driver.
I didn't want to squash these two patches together.

-Selvin
> Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path
  2020-02-24 13:48   ` Jason Gunthorpe
@ 2020-02-25 10:14     ` Selvin Xavier
  0 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-02-25 10:14 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Mon, Feb 24, 2020 at 7:18 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> On Mon, Feb 24, 2020 at 02:49:53AM -0800, Selvin Xavier wrote:
> > Add more flags for better granularity in in device init/uninit path
> >
> > Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
> >  drivers/infiniband/hw/bnxt_re/bnxt_re.h | 16 +++++++++----
> >  drivers/infiniband/hw/bnxt_re/main.c    | 41 ++++++++++++++++-----------------
> >  2 files changed, 31 insertions(+), 26 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/bnxt_re/bnxt_re.h b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > index c736e82..0babc66 100644
> > +++ b/drivers/infiniband/hw/bnxt_re/bnxt_re.h
> > @@ -135,11 +135,17 @@ struct bnxt_re_dev {
> >  #define BNXT_RE_FLAG_IBDEV_REGISTERED                1
> >  #define BNXT_RE_FLAG_GOT_MSIX                        2
> >  #define BNXT_RE_FLAG_HAVE_L2_REF             3
> > -#define BNXT_RE_FLAG_RCFW_CHANNEL_EN         4
> > -#define BNXT_RE_FLAG_QOS_WORK_REG            5
> > -#define BNXT_RE_FLAG_RESOURCES_ALLOCATED     7
> > -#define BNXT_RE_FLAG_RESOURCES_INITIALIZED   8
> > -#define BNXT_RE_FLAG_ISSUE_ROCE_STATS          29
> > +#define BNXT_RE_FLAG_ALLOC_RCFW                      4
> > +#define BNXT_RE_FLAG_NET_RING_ALLOC          5
> > +#define BNXT_RE_FLAG_RCFW_CHANNEL_EN         6
> > +#define BNXT_RE_FLAG_ALLOC_CTX                       7
> > +#define BNXT_RE_FLAG_STATS_CTX_ALLOC         8
> > +#define BNXT_RE_FLAG_STATS_CTX2_ALLOC                9
> > +#define BNXT_RE_FLAG_RCFW_CHANNEL_INIT               10
> > +#define BNXT_RE_FLAG_QOS_WORK_REG            11
> > +#define BNXT_RE_FLAG_RESOURCES_ALLOCATED     12
> > +#define BNXT_RE_FLAG_RESOURCES_INITIALIZED   13
> > +#define BNXT_RE_FLAG_ISSUE_ROCE_STATS                29
>
> The error unwind is better than adding these endless flags.
>
> It is hard to understand the flow with all this needless conditional,
> really if you want to change this I'd delete the test_bit scheme and
> stick to the normal goto error unwind.
The existing code was doing few  error unwinding. I thought it would
be simple to collect everything under bnxt_re_ib_unreg function and
cleanup with these flags. I am ok with both the approaches. Maybe,
I can drop this patch and post the remaining series.

>
> Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API
  2020-02-25  4:10     ` Selvin Xavier
@ 2020-02-25 13:16       ` Jason Gunthorpe
  2020-02-25 15:52         ` Selvin Xavier
  0 siblings, 1 reply; 12+ messages in thread
From: Jason Gunthorpe @ 2020-02-25 13:16 UTC (permalink / raw)
  To: Selvin Xavier; +Cc: Doug Ledford, linux-rdma

On Tue, Feb 25, 2020 at 09:40:42AM +0530, Selvin Xavier wrote:
> On Mon, Feb 24, 2020 at 7:13 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
> >
> > On Mon, Feb 24, 2020 at 02:49:55AM -0800, Selvin Xavier wrote:
> > >
> > > @@ -1785,32 +1777,23 @@ static int __init bnxt_re_mod_init(void)
> > >
> > >  static void __exit bnxt_re_mod_exit(void)
> > >  {
> > > -     struct bnxt_re_dev *rdev, *next;
> > > -     LIST_HEAD(to_be_deleted);
> > > +     struct bnxt_re_dev *rdev;
> > >
> > > +     flush_workqueue(bnxt_re_wq);
> >
> > What is this for? Usually flushing a work queue before preventing new
> > work from queueing (ie via unregister) is racy.
> To flush out any netdev events scheduled to be handled by
> bnxt_re_task. Mainly to wait for
> case where we are in the middle of NETDEV_REGISTER event handled from
> bnxt_re_task.

Then the netdev notifer should be unregistered before doing the flush.

Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API
  2020-02-25 13:16       ` Jason Gunthorpe
@ 2020-02-25 15:52         ` Selvin Xavier
  0 siblings, 0 replies; 12+ messages in thread
From: Selvin Xavier @ 2020-02-25 15:52 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Tue, Feb 25, 2020 at 6:46 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
>
> On Tue, Feb 25, 2020 at 09:40:42AM +0530, Selvin Xavier wrote:
> > On Mon, Feb 24, 2020 at 7:13 PM Jason Gunthorpe <jgg@mellanox.com> wrote:
> > >
> > > On Mon, Feb 24, 2020 at 02:49:55AM -0800, Selvin Xavier wrote:
> > > >
> > > > @@ -1785,32 +1777,23 @@ static int __init bnxt_re_mod_init(void)
> > > >
> > > >  static void __exit bnxt_re_mod_exit(void)
> > > >  {
> > > > -     struct bnxt_re_dev *rdev, *next;
> > > > -     LIST_HEAD(to_be_deleted);
> > > > +     struct bnxt_re_dev *rdev;
> > > >
> > > > +     flush_workqueue(bnxt_re_wq);
> > >
> > > What is this for? Usually flushing a work queue before preventing new
> > > work from queueing (ie via unregister) is racy.
> > To flush out any netdev events scheduled to be handled by
> > bnxt_re_task. Mainly to wait for
> > case where we are in the middle of NETDEV_REGISTER event handled from
> > bnxt_re_task.
>
> Then the netdev notifer should be unregistered before doing the flush.
>
Agree that the netdev notifier should be unregistered before invoking
ib_unregister_driver
so that new device additions are prevented. I will move the unregister
code to the beginning of the mod_exit.
 But still we have to flush the bnxt_re_wq work queue.  bnxt_re_task
work is scheduled from netdev notifier. We need to make sure that
there are no pending work before proceeding with unregister.
 I will modify the code to  unregister notifier and then  delete
bnxt_re_wq workqueue before calling
ib_unregister_driver.
> Jason

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-02-25 15:52 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-02-24 10:49 [PATCH for-next v2 0/3] RDMA/bnxt_re driver update Selvin Xavier
2020-02-24 10:49 ` [PATCH for-next v2 1/3] RDMA/bnxt_re: Add more flags in device init and uninit path Selvin Xavier
2020-02-24 13:48   ` Jason Gunthorpe
2020-02-25 10:14     ` Selvin Xavier
2020-02-24 10:49 ` [PATCH for-next v2 2/3] RDMA/bnxt_re: Refactor device add/remove functionalities Selvin Xavier
2020-02-24 13:53   ` Jason Gunthorpe
2020-02-25  4:22     ` Selvin Xavier
2020-02-24 10:49 ` [PATCH for-next v2 3/3] RDMA/bnxt_re: Use driver_unregister and unregistration API Selvin Xavier
2020-02-24 13:43   ` Jason Gunthorpe
2020-02-25  4:10     ` Selvin Xavier
2020-02-25 13:16       ` Jason Gunthorpe
2020-02-25 15:52         ` Selvin Xavier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).