All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH rdma-next v1 00/10] Restore failure of destroy commands
@ 2020-08-30  8:40 Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 01/10] RDMA: Restore ability to fail on PD deallocate Leon Romanovsky
                   ` (9 more replies)
  0 siblings, 10 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Christian Benvenuti, Dennis Dalessandro, Devesh Sharma,
	Eli Cohen, Faisal Latif, Gal Pressman, Jack Morgenstein,
	Leon Romanovsky, Lijun Ou, linux-kernel, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Nelson Escobar, Or Gerlitz, Parvi Kaustubhi, Potnuri Bharat Teja,
	Roland Dreier, Selvin Xavier, Shiraz Saleem, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Yuval Shaia, Zhu Yanjun

From: Leon Romanovsky <leonro@nvidia.com>

Changelog:
v1:
 * Changed returned value in efa_destroy_ah() from EINVAL to EOPNOTSUPP
v0:
 * https://lore.kernel.org/lkml/20200824103247.1088464-1-leon@kernel.org

-----------------------------------------------------------------------------
Hi,

This series restores the ability to fail on destroy commands, due to the
fact that mlx5_ib DEVX implementation interleaved ib_core objects
with FW objects without sharing reference counters.

In a retrospective, every part of the mlx5_ib flow is correct.

It started from IBTA which was written by HW engineers with HW in mind and
they allowed to fail in destruction. FW implemented it with symmetrical
interface like any other command and propagated error back to the kernel,
which forwarded it to the libibverbs and kernel ULPs.

Libibverbs was designed with IBTA spec in hand putting destroy errors in
stone. Up till mlx5_ib DEVX, it worked well, because the IB verbs objects
are counted by the kernel and ib_core ensures that FW destroy will success
by managing various reference counters on such objects.

The extension of the mlx5 driver changed this flow when allowed DEVX objects
that are not managed by ib_core to be interleaved with the ones under ib_core
responsibility.

The drivers that want to implement DEVX flows, must ensure that FW/HW
destroys are performed as early as possible before any other internal
cleanup. After HW destroys, drivers are not allowed to fail.

This series includes two patches (WQ and "potential race") that will
require extra work in mlx5_ib, they both theoretical. WQ is not in use
in DEVX, but is needed to make interface symmetrical to other objects.
"Potential race" is in ULP flow that ensures that SRQ is destoyed in
proper order.

Thanks

Leon Romanovsky (10):
  RDMA: Restore ability to fail on PD deallocate
  RDMA: Restore ability to fail on AH destroy
  RDMA/mlx5: Issue FW command to destroy SRQ on reentry
  RDMA/mlx5: Fix potential race between destroy and CQE poll
  RDMA: Restore ability to fail on SRQ destroy
  RDMA/core: Delete function indirection for alloc/free kernel CQ
  RDMA: Allow fail of destroy CQ
  RDMA: Change XRCD destroy return value
  RDMA: Restore ability to return error for destroy WQ
  RDMA: Make counters destroy symmetrical

 drivers/infiniband/core/cq.c                  |  36 +++---
 drivers/infiniband/core/uverbs_std_types.c    |   3 +-
 .../core/uverbs_std_types_counters.c          |   4 +-
 drivers/infiniband/core/uverbs_std_types_wq.c |   2 +-
 drivers/infiniband/core/verbs.c               |  56 +++++++---
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      |  12 +-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h      |   8 +-
 drivers/infiniband/hw/cxgb4/cq.c              |   3 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h        |   4 +-
 drivers/infiniband/hw/cxgb4/provider.c        |   3 +-
 drivers/infiniband/hw/cxgb4/qp.c              |   3 +-
 drivers/infiniband/hw/efa/efa.h               |   6 +-
 drivers/infiniband/hw/efa/efa_verbs.c         |  11 +-
 drivers/infiniband/hw/hns/hns_roce_ah.c       |   5 -
 drivers/infiniband/hw/hns/hns_roce_cq.c       |   3 +-
 drivers/infiniband/hw/hns/hns_roce_device.h   |  13 ++-
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c    |   3 +-
 drivers/infiniband/hw/hns/hns_roce_pd.c       |   3 +-
 drivers/infiniband/hw/hns/hns_roce_srq.c      |   3 +-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c     |   6 +-
 drivers/infiniband/hw/mlx4/ah.c               |   5 -
 drivers/infiniband/hw/mlx4/cq.c               |   3 +-
 drivers/infiniband/hw/mlx4/main.c             |   6 +-
 drivers/infiniband/hw/mlx4/mlx4_ib.h          |  11 +-
 drivers/infiniband/hw/mlx4/qp.c               |   3 +-
 drivers/infiniband/hw/mlx4/srq.c              |   3 +-
 drivers/infiniband/hw/mlx5/ah.c               |   5 -
 drivers/infiniband/hw/mlx5/cmd.c              |   4 +-
 drivers/infiniband/hw/mlx5/cmd.h              |   2 +-
 drivers/infiniband/hw/mlx5/counters.c         |   3 +-
 drivers/infiniband/hw/mlx5/cq.c               |  21 ++--
 drivers/infiniband/hw/mlx5/main.c             |   4 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  13 ++-
 drivers/infiniband/hw/mlx5/qp.c               |  12 +-
 drivers/infiniband/hw/mlx5/qp.h               |   4 +-
 drivers/infiniband/hw/mlx5/qpc.c              |   5 +-
 drivers/infiniband/hw/mlx5/srq.c              |  26 ++---
 drivers/infiniband/hw/mlx5/srq.h              |   2 +-
 drivers/infiniband/hw/mlx5/srq_cmd.c          |  22 +++-
 drivers/infiniband/hw/mthca/mthca_provider.c  |  12 +-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c      |   3 +-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.h      |   2 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |  11 +-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h   |   6 +-
 drivers/infiniband/hw/qedr/verbs.c            |  14 ++-
 drivers/infiniband/hw/qedr/verbs.h            |   8 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c  |   7 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h  |   4 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c  |   3 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c |   3 +-
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.c   |   8 +-
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h   |   8 +-
 drivers/infiniband/sw/rdmavt/ah.c             |   3 +-
 drivers/infiniband/sw/rdmavt/ah.h             |   2 +-
 drivers/infiniband/sw/rdmavt/cq.c             |   3 +-
 drivers/infiniband/sw/rdmavt/cq.h             |   2 +-
 drivers/infiniband/sw/rdmavt/pd.c             |   3 +-
 drivers/infiniband/sw/rdmavt/pd.h             |   2 +-
 drivers/infiniband/sw/rdmavt/srq.c            |   3 +-
 drivers/infiniband/sw/rdmavt/srq.h            |   2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c         |  12 +-
 drivers/infiniband/sw/siw/siw_verbs.c         |   9 +-
 drivers/infiniband/sw/siw/siw_verbs.h         |   6 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c       |   6 +-
 include/rdma/ib_verbs.h                       | 105 +++++-------------
 65 files changed, 313 insertions(+), 275 deletions(-)

--
2.26.2


^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 01/10] RDMA: Restore ability to fail on PD deallocate
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 02/10] RDMA: Restore ability to fail on AH destroy Leon Romanovsky
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Christian Benvenuti, Dennis Dalessandro, Devesh Sharma,
	Faisal Latif, Gal Pressman, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Nelson Escobar, Parvi Kaustubhi, Potnuri Bharat Teja,
	Selvin Xavier, Shiraz Saleem, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

From: Leon Romanovsky <leonro@mellanox.com>

The IB verbs objects are counted by the kernel and ib_core ensures
that deallocate PD will success so it will be called once all other
objects that depends on PD will be released. This is achieved
by managing various reference counters on such objects.

The mlx5 driver didn't follow this standard flow when allowed DEVX
objects that are not managed by ib_core to be interleaved with the
ones under ib_core responsibility.

In such interleaved scenarios deallocate command can fail and ib_core
will leave uobject in internal DB and attempt to clean it later to
free resources anyway.

This change partially restores returned value from dealloc_pd() for
all drivers, but keeping in mind that non-DEVX devices and kernel verbs
paths shouldn't fail.

Fixes: 21a428a019c9 ("RDMA: Handle PD allocations by IB/core")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_std_types.c      |  3 +--
 drivers/infiniband/core/verbs.c                 |  8 ++++++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c        |  3 ++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h        |  2 +-
 drivers/infiniband/hw/cxgb4/provider.c          |  3 ++-
 drivers/infiniband/hw/efa/efa.h                 |  2 +-
 drivers/infiniband/hw/efa/efa_verbs.c           |  3 ++-
 drivers/infiniband/hw/hns/hns_roce_device.h     |  2 +-
 drivers/infiniband/hw/hns/hns_roce_pd.c         |  3 ++-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c       |  3 ++-
 drivers/infiniband/hw/mlx4/main.c               |  3 ++-
 drivers/infiniband/hw/mlx5/cmd.c                |  4 ++--
 drivers/infiniband/hw/mlx5/cmd.h                |  2 +-
 drivers/infiniband/hw/mlx5/main.c               |  4 ++--
 drivers/infiniband/hw/mthca/mthca_provider.c    |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c     |  5 +++--
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h     |  2 +-
 drivers/infiniband/hw/qedr/verbs.c              |  3 ++-
 drivers/infiniband/hw/qedr/verbs.h              |  2 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c    |  3 ++-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h    |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c |  5 +++--
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h |  2 +-
 drivers/infiniband/sw/rdmavt/pd.c               |  3 ++-
 drivers/infiniband/sw/rdmavt/pd.h               |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c           |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.c           |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.h           |  2 +-
 include/rdma/ib_verbs.h                         | 13 +++++--------
 29 files changed, 56 insertions(+), 42 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_std_types.c b/drivers/infiniband/core/uverbs_std_types.c
index 08c39cfb1bd9..2932e832f48f 100644
--- a/drivers/infiniband/core/uverbs_std_types.c
+++ b/drivers/infiniband/core/uverbs_std_types.c
@@ -122,8 +122,7 @@ static int uverbs_free_pd(struct ib_uobject *uobject,
 	if (ret)
 		return ret;
 
-	ib_dealloc_pd_user(pd, &attrs->driver_udata);
-	return 0;
+	return ib_dealloc_pd_user(pd, &attrs->driver_udata);
 }
 
 void ib_uverbs_free_event_queue(struct ib_uverbs_event_queue *event_queue)
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 3096e73797b7..688b34bf0268 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -329,7 +329,7 @@ EXPORT_SYMBOL(__ib_alloc_pd);
  * exist.  The caller is responsible to synchronously destroy them and
  * guarantee no new allocations will happen.
  */
-void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
+int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
 {
 	int ret;
 
@@ -343,9 +343,13 @@ void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata)
 	   requires the caller to guarantee we can't race here. */
 	WARN_ON(atomic_read(&pd->usecnt));
 
+	ret = pd->device->ops.dealloc_pd(pd, udata);
+	if (ret && udata)
+		return ret;
+
 	rdma_restrack_del(&pd->res);
-	pd->device->ops.dealloc_pd(pd, udata);
 	kfree(pd);
+	return ret;
 }
 EXPORT_SYMBOL(ib_dealloc_pd_user);
 
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 3f18efc0c297..05abce09b23c 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -532,7 +532,7 @@ static int bnxt_re_create_fence_mr(struct bnxt_re_pd *pd)
 }
 
 /* Protection Domains */
-void bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata)
+int bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata)
 {
 	struct bnxt_re_pd *pd = container_of(ib_pd, struct bnxt_re_pd, ib_pd);
 	struct bnxt_re_dev *rdev = pd->rdev;
@@ -542,6 +542,7 @@ void bnxt_re_dealloc_pd(struct ib_pd *ib_pd, struct ib_udata *udata)
 	if (pd->qplib_pd.id)
 		bnxt_qplib_dealloc_pd(&rdev->qplib_res, &rdev->qplib_res.pd_tbl,
 				      &pd->qplib_pd);
+	return 0;
 }
 
 int bnxt_re_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 1daeb30e06fd..d9e2e406f66a 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -163,7 +163,7 @@ int bnxt_re_query_gid(struct ib_device *ibdev, u8 port_num,
 enum rdma_link_layer bnxt_re_get_link_layer(struct ib_device *ibdev,
 					    u8 port_num);
 int bnxt_re_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int bnxt_re_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		      struct ib_udata *udata);
 int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
diff --git a/drivers/infiniband/hw/cxgb4/provider.c b/drivers/infiniband/hw/cxgb4/provider.c
index 6c579d2d3997..5f2b30624512 100644
--- a/drivers/infiniband/hw/cxgb4/provider.c
+++ b/drivers/infiniband/hw/cxgb4/provider.c
@@ -190,7 +190,7 @@ static int c4iw_mmap(struct ib_ucontext *context, struct vm_area_struct *vma)
 	return ret;
 }
 
-static void c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	struct c4iw_dev *rhp;
 	struct c4iw_pd *php;
@@ -202,6 +202,7 @@ static void c4iw_deallocate_pd(struct ib_pd *pd, struct ib_udata *udata)
 	mutex_lock(&rhp->rdev.stats.lock);
 	rhp->rdev.stats.pd.cur--;
 	mutex_unlock(&rhp->rdev.stats.lock);
+	return 0;
 }
 
 static int c4iw_allocate_pd(struct ib_pd *pd, struct ib_udata *udata)
diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
index 1889dd172a25..8547f9d543df 100644
--- a/drivers/infiniband/hw/efa/efa.h
+++ b/drivers/infiniband/hw/efa/efa.h
@@ -134,7 +134,7 @@ int efa_query_gid(struct ib_device *ibdev, u8 port, int index,
 int efa_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 		   u16 *pkey);
 int efa_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
-void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
+int efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
 int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
 struct ib_qp *efa_create_qp(struct ib_pd *ibpd,
 			    struct ib_qp_init_attr *init_attr,
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index de9a22f0fcc2..383ce126d82f 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -383,13 +383,14 @@ int efa_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return err;
 }
 
-void efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+int efa_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct efa_dev *dev = to_edev(ibpd->device);
 	struct efa_pd *pd = to_epd(ibpd);
 
 	ibdev_dbg(&dev->ibdev, "Dealloc pd[%d]\n", pd->pdn);
 	efa_pd_dealloc(dev, pd->pdn);
+	return 0;
 }
 
 static int efa_destroy_qp_handle(struct efa_dev *dev, u32 qp_handle)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index da9888deff8c..e2d544a76bb3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -1183,7 +1183,7 @@ int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
 void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags);
 
 int hns_roce_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 
 struct ib_mr *hns_roce_get_dma_mr(struct ib_pd *pd, int acc);
 struct ib_mr *hns_roce_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
diff --git a/drivers/infiniband/hw/hns/hns_roce_pd.c b/drivers/infiniband/hw/hns/hns_roce_pd.c
index b10c50b8736e..98f69496adb4 100644
--- a/drivers/infiniband/hw/hns/hns_roce_pd.c
+++ b/drivers/infiniband/hw/hns/hns_roce_pd.c
@@ -82,9 +82,10 @@ int hns_roce_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-void hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	hns_roce_pd_free(to_hr_dev(pd->device), to_hr_pd(pd)->pdn);
+	return 0;
 }
 
 int hns_roce_uar_alloc(struct hns_roce_dev *hr_dev, struct hns_roce_uar *uar)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 6957e4f3404b..360b037f90e7 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -328,12 +328,13 @@ static int i40iw_alloc_pd(struct ib_pd *pd, struct ib_udata *udata)
  * @ibpd: ptr of pd to be deallocated
  * @udata: user data or null for kernel object
  */
-static void i40iw_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+static int i40iw_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct i40iw_pd *iwpd = to_iwpd(ibpd);
 	struct i40iw_device *iwdev = to_iwdev(ibpd->device);
 
 	i40iw_rem_pdusecount(iwpd, iwdev);
+	return 0;
 }
 
 /**
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 2543062c0cb0..1be0108db992 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -1214,9 +1214,10 @@ static int mlx4_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-static void mlx4_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int mlx4_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	mlx4_pd_free(to_mdev(pd->device)->dev, to_mpd(pd)->pdn);
+	return 0;
 }
 
 static int mlx4_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
diff --git a/drivers/infiniband/hw/mlx5/cmd.c b/drivers/infiniband/hw/mlx5/cmd.c
index ebb2f108b64f..f5aac53cebf0 100644
--- a/drivers/infiniband/hw/mlx5/cmd.c
+++ b/drivers/infiniband/hw/mlx5/cmd.c
@@ -209,14 +209,14 @@ void mlx5_cmd_dealloc_transport_domain(struct mlx5_core_dev *dev, u32 tdn,
 	mlx5_cmd_exec_in(dev, dealloc_transport_domain, in);
 }
 
-void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid)
+int mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid)
 {
 	u32 in[MLX5_ST_SZ_DW(dealloc_pd_in)] = {};
 
 	MLX5_SET(dealloc_pd_in, in, opcode, MLX5_CMD_OP_DEALLOC_PD);
 	MLX5_SET(dealloc_pd_in, in, pd, pdn);
 	MLX5_SET(dealloc_pd_in, in, uid, uid);
-	mlx5_cmd_exec_in(dev, dealloc_pd, in);
+	return mlx5_cmd_exec_in(dev, dealloc_pd, in);
 }
 
 int mlx5_cmd_attach_mcg(struct mlx5_core_dev *dev, union ib_gid *mgid,
diff --git a/drivers/infiniband/hw/mlx5/cmd.h b/drivers/infiniband/hw/mlx5/cmd.h
index 1d192a8ca87d..ca3afa7d73a3 100644
--- a/drivers/infiniband/hw/mlx5/cmd.h
+++ b/drivers/infiniband/hw/mlx5/cmd.h
@@ -44,7 +44,7 @@ int mlx5_cmd_query_cong_params(struct mlx5_core_dev *dev, int cong_point,
 int mlx5_cmd_alloc_memic(struct mlx5_dm *dm, phys_addr_t *addr,
 			 u64 length, u32 alignment);
 void mlx5_cmd_dealloc_memic(struct mlx5_dm *dm, phys_addr_t addr, u64 length);
-void mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
+int mlx5_cmd_dealloc_pd(struct mlx5_core_dev *dev, u32 pdn, u16 uid);
 void mlx5_cmd_destroy_tir(struct mlx5_core_dev *dev, u32 tirn, u16 uid);
 void mlx5_cmd_destroy_tis(struct mlx5_core_dev *dev, u32 tisn, u16 uid);
 void mlx5_cmd_destroy_rqt(struct mlx5_core_dev *dev, u32 rqtn, u16 uid);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index fbc45a5e76c5..4960e3b31c94 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -2569,12 +2569,12 @@ static int mlx5_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-static void mlx5_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int mlx5_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *mdev = to_mdev(pd->device);
 	struct mlx5_ib_pd *mpd = to_mpd(pd);
 
-	mlx5_cmd_dealloc_pd(mdev->mdev, mpd->pdn, mpd->uid);
+	return mlx5_cmd_dealloc_pd(mdev->mdev, mpd->pdn, mpd->uid);
 }
 
 static int mlx5_ib_mcg_attach(struct ib_qp *ibqp, union ib_gid *gid, u16 lid)
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 9fa2f9164a47..d3ed7c19b2ef 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -373,9 +373,10 @@ static int mthca_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-static void mthca_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+static int mthca_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	mthca_pd_free(to_mdev(pd->device), to_mpd(pd));
+	return 0;
 }
 
 static int mthca_ah_create(struct ib_ah *ibah,
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 6cdbec13756a..4b9295c8d4f3 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -664,7 +664,7 @@ int ocrdma_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return status;
 }
 
-void ocrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+int ocrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct ocrdma_pd *pd = get_ocrdma_pd(ibpd);
 	struct ocrdma_dev *dev = get_ocrdma_dev(ibpd->device);
@@ -682,10 +682,11 @@ void ocrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 
 		if (is_ucontext_pd(uctx, pd)) {
 			ocrdma_release_ucontext_pd(uctx);
-			return;
+			return 0;
 		}
 	}
 	_ocrdma_dealloc_pd(dev, pd);
+	return 0;
 }
 
 static int ocrdma_alloc_lkey(struct ocrdma_dev *dev, struct ocrdma_mr *mr,
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
index df8e3b923a44..4c85be43507c 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
@@ -67,7 +67,7 @@ void ocrdma_dealloc_ucontext(struct ib_ucontext *uctx);
 int ocrdma_mmap(struct ib_ucontext *, struct vm_area_struct *vma);
 
 int ocrdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 
 int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		     struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 4ce4e2eef6cc..93ee1740a2d2 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -471,13 +471,14 @@ int qedr_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-void qedr_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+int qedr_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct qedr_dev *dev = get_qedr_dev(ibpd->device);
 	struct qedr_pd *pd = get_qedr_pd(ibpd);
 
 	DP_DEBUG(dev, QEDR_MSG_INIT, "Deallocating PD %d\n", pd->pd_id);
 	dev->ops->rdma_dealloc_pd(dev->rdma_ctx, pd->pd_id);
+	return 0;
 }
 
 static void qedr_free_pbl(struct qedr_dev *dev,
diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h
index 39dd6286ba39..1b450919ba9c 100644
--- a/drivers/infiniband/hw/qedr/verbs.h
+++ b/drivers/infiniband/hw/qedr/verbs.h
@@ -47,7 +47,7 @@ void qedr_dealloc_ucontext(struct ib_ucontext *uctx);
 int qedr_mmap(struct ib_ucontext *ucontext, struct vm_area_struct *vma);
 void qedr_mmap_free(struct rdma_user_mmap_entry *rdma_entry);
 int qedr_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 
 int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		   struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 02a49f661c8d..8af3212101be 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -449,9 +449,10 @@ int usnic_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return 0;
 }
 
-void usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	usnic_uiom_dealloc_pd((to_upd(pd))->umem_pd);
+	return 0;
 }
 
 struct ib_qp *usnic_ib_create_qp(struct ib_pd *pd,
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
index 9195f2b901ce..f8911c0330e2 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
@@ -49,7 +49,7 @@ int usnic_ib_query_qp(struct ib_qp *qp, struct ib_qp_attr *qp_attr,
 int usnic_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
 				union ib_gid *gid);
 int usnic_ib_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
-void usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
+int usnic_ib_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 struct ib_qp *usnic_ib_create_qp(struct ib_pd *pd,
 					struct ib_qp_init_attr *init_attr,
 					struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
index 65ac3693ad12..678c94531e68 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
@@ -479,9 +479,9 @@ int pvrdma_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
  * @pd: the protection domain to be released
  * @udata: user data or null for kernel object
  *
- * @return: 0 on success, otherwise errno.
+ * @return: Always 0
  */
-void pvrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int pvrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	struct pvrdma_dev *dev = to_vdev(pd->device);
 	union pvrdma_cmd_req req = {};
@@ -498,6 +498,7 @@ void pvrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 			 ret);
 
 	atomic_dec(&dev->num_pds);
+	return 0;
 }
 
 /**
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index 699b20849a7e..7bf33a654275 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -399,7 +399,7 @@ int pvrdma_mmap(struct ib_ucontext *context, struct vm_area_struct *vma);
 int pvrdma_alloc_ucontext(struct ib_ucontext *uctx, struct ib_udata *udata);
 void pvrdma_dealloc_ucontext(struct ib_ucontext *context);
 int pvrdma_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void pvrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
+int pvrdma_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
 struct ib_mr *pvrdma_get_dma_mr(struct ib_pd *pd, int acc);
 struct ib_mr *pvrdma_reg_user_mr(struct ib_pd *pd, u64 start, u64 length,
 				 u64 virt_addr, int access_flags,
diff --git a/drivers/infiniband/sw/rdmavt/pd.c b/drivers/infiniband/sw/rdmavt/pd.c
index a403718f0b5e..01b7abf91520 100644
--- a/drivers/infiniband/sw/rdmavt/pd.c
+++ b/drivers/infiniband/sw/rdmavt/pd.c
@@ -95,11 +95,12 @@ int rvt_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
  *
  * Return: always 0
  */
-void rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+int rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct rvt_dev_info *dev = ib_to_rvt(ibpd->device);
 
 	spin_lock(&dev->n_pds_lock);
 	dev->n_pds_allocated--;
 	spin_unlock(&dev->n_pds_lock);
+	return 0;
 }
diff --git a/drivers/infiniband/sw/rdmavt/pd.h b/drivers/infiniband/sw/rdmavt/pd.h
index 71ba76d72b1d..06a6a38beedc 100644
--- a/drivers/infiniband/sw/rdmavt/pd.h
+++ b/drivers/infiniband/sw/rdmavt/pd.h
@@ -51,6 +51,6 @@
 #include <rdma/rdma_vt.h>
 
 int rvt_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
-void rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
+int rvt_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata);
 
 #endif          /* DEF_RDMAVTPD_H */
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index bb61e534e468..d23bc6aa30e2 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -175,11 +175,12 @@ static int rxe_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 	return rxe_add_to_pool(&rxe->pd_pool, &pd->pelem);
 }
 
-static void rxe_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
+static int rxe_dealloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 {
 	struct rxe_pd *pd = to_rpd(ibpd);
 
 	rxe_drop_ref(pd);
+	return 0;
 }
 
 static int rxe_create_ah(struct ib_ah *ibah,
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index adafa1b8bebe..2d2b6df0b027 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -234,12 +234,13 @@ int siw_alloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 	return 0;
 }
 
-void siw_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
+int siw_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata)
 {
 	struct siw_device *sdev = to_siw_dev(pd->device);
 
 	siw_dbg_pd(pd, "free PD\n");
 	atomic_dec(&sdev->num_pd);
+	return 0;
 }
 
 void siw_qp_get_ref(struct ib_qp *base_qp)
diff --git a/drivers/infiniband/sw/siw/siw_verbs.h b/drivers/infiniband/sw/siw/siw_verbs.h
index d9572275a6b6..3dbab78579cb 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.h
+++ b/drivers/infiniband/sw/siw/siw_verbs.h
@@ -49,7 +49,7 @@ int siw_query_port(struct ib_device *base_dev, u8 port,
 int siw_query_gid(struct ib_device *base_dev, u8 port, int idx,
 		  union ib_gid *gid);
 int siw_alloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
-void siw_dealloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
+int siw_dealloc_pd(struct ib_pd *base_pd, struct ib_udata *udata);
 struct ib_qp *siw_create_qp(struct ib_pd *base_pd,
 			    struct ib_qp_init_attr *attr,
 			    struct ib_udata *udata);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 969561f6e70e..98e9ea1f693b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2404,7 +2404,7 @@ struct ib_device_ops {
 	void (*mmap_free)(struct rdma_user_mmap_entry *entry);
 	void (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
 	int (*alloc_pd)(struct ib_pd *pd, struct ib_udata *udata);
-	void (*dealloc_pd)(struct ib_pd *pd, struct ib_udata *udata);
+	int (*dealloc_pd)(struct ib_pd *pd, struct ib_udata *udata);
 	int (*create_ah)(struct ib_ah *ah, struct rdma_ah_init_attr *attr,
 			 struct ib_udata *udata);
 	int (*modify_ah)(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
@@ -3462,12 +3462,7 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
 #define ib_alloc_pd(device, flags) \
 	__ib_alloc_pd((device), (flags), KBUILD_MODNAME)
 
-/**
- * ib_dealloc_pd_user - Deallocate kernel/user PD
- * @pd: The protection domain
- * @udata: Valid user data or NULL for kernel objects
- */
-void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata);
+int ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata);
 
 /**
  * ib_dealloc_pd - Deallocate kernel PD
@@ -3477,7 +3472,9 @@ void ib_dealloc_pd_user(struct ib_pd *pd, struct ib_udata *udata);
  */
 static inline void ib_dealloc_pd(struct ib_pd *pd)
 {
-	ib_dealloc_pd_user(pd, NULL);
+	int ret = ib_dealloc_pd_user(pd, NULL);
+
+	WARN_ONCE(ret, "Destroy of kernel PD shouldn't fail");
 }
 
 enum rdma_create_ah_flags {
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 02/10] RDMA: Restore ability to fail on AH destroy
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 01/10] RDMA: Restore ability to fail on PD deallocate Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry Leon Romanovsky
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Adit Ranadive, Ariel Elior, Dennis Dalessandro,
	Devesh Sharma, Gal Pressman, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Selvin Xavier, Somnath Kotur, Sriharsha Basavapatna,
	VMware PV-Drivers, Weihang Li, Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

From: Leon Romanovsky <leonro@mellanox.com>

Like any other IB verbs objects, AH are refcounted by ib_core. The release
of those objects are controlled by ib_core with promise that AH destroy
can't fail.

Being SW object for now, this change makes dealloc_ah() to behave like
any other destroy IB flows.

Fixes: d345691471b4 ("RDMA: Handle AH allocations by IB/core")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/verbs.c                 | 8 ++++++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c        | 3 ++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h        | 2 +-
 drivers/infiniband/hw/efa/efa.h                 | 2 +-
 drivers/infiniband/hw/efa/efa_verbs.c           | 5 +++--
 drivers/infiniband/hw/hns/hns_roce_ah.c         | 5 -----
 drivers/infiniband/hw/hns/hns_roce_device.h     | 5 ++++-
 drivers/infiniband/hw/mlx4/ah.c                 | 5 -----
 drivers/infiniband/hw/mlx4/mlx4_ib.h            | 5 ++++-
 drivers/infiniband/hw/mlx5/ah.c                 | 5 -----
 drivers/infiniband/hw/mlx5/mlx5_ib.h            | 5 ++++-
 drivers/infiniband/hw/mthca/mthca_provider.c    | 3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.c        | 3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_ah.h        | 2 +-
 drivers/infiniband/hw/qedr/verbs.c              | 3 ++-
 drivers/infiniband/hw/qedr/verbs.h              | 2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c | 3 ++-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h | 2 +-
 drivers/infiniband/sw/rdmavt/ah.c               | 3 ++-
 drivers/infiniband/sw/rdmavt/ah.h               | 2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c           | 3 ++-
 include/rdma/ib_verbs.h                         | 8 +++++---
 22 files changed, 46 insertions(+), 38 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 688b34bf0268..5b4ad90ef6e6 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -968,18 +968,22 @@ int rdma_destroy_ah_user(struct ib_ah *ah, u32 flags, struct ib_udata *udata)
 {
 	const struct ib_gid_attr *sgid_attr = ah->sgid_attr;
 	struct ib_pd *pd;
+	int ret;
 
 	might_sleep_if(flags & RDMA_DESTROY_AH_SLEEPABLE);
 
 	pd = ah->pd;
 
-	ah->device->ops.destroy_ah(ah, flags);
+	ret = ah->device->ops.destroy_ah(ah, flags);
+	if (ret && udata)
+		return ret;
+
 	atomic_dec(&pd->usecnt);
 	if (sgid_attr)
 		rdma_put_gid_attr(sgid_attr);
 
 	kfree(ah);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(rdma_destroy_ah_user);
 
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 05abce09b23c..e950a0792518 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -602,13 +602,14 @@ int bnxt_re_alloc_pd(struct ib_pd *ibpd, struct ib_udata *udata)
 }
 
 /* Address Handles */
-void bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags)
+int bnxt_re_destroy_ah(struct ib_ah *ib_ah, u32 flags)
 {
 	struct bnxt_re_ah *ah = container_of(ib_ah, struct bnxt_re_ah, ib_ah);
 	struct bnxt_re_dev *rdev = ah->rdev;
 
 	bnxt_qplib_destroy_ah(&rdev->qplib_res, &ah->qplib_ah,
 			      !(flags & RDMA_DESTROY_AH_SLEEPABLE));
+	return 0;
 }
 
 static u8 bnxt_re_stack_to_dev_nw_type(enum rdma_network_type ntype)
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index d9e2e406f66a..b6b56a92b78e 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -168,7 +168,7 @@ int bnxt_re_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		      struct ib_udata *udata);
 int bnxt_re_modify_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
 int bnxt_re_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
-void bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags);
+int bnxt_re_destroy_ah(struct ib_ah *ah, u32 flags);
 int bnxt_re_create_srq(struct ib_srq *srq,
 		       struct ib_srq_init_attr *srq_init_attr,
 		       struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
index 8547f9d543df..6b06ce87fbfc 100644
--- a/drivers/infiniband/hw/efa/efa.h
+++ b/drivers/infiniband/hw/efa/efa.h
@@ -156,7 +156,7 @@ void efa_mmap_free(struct rdma_user_mmap_entry *rdma_entry);
 int efa_create_ah(struct ib_ah *ibah,
 		  struct rdma_ah_init_attr *init_attr,
 		  struct ib_udata *udata);
-void efa_destroy_ah(struct ib_ah *ibah, u32 flags);
+int efa_destroy_ah(struct ib_ah *ibah, u32 flags);
 int efa_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *qp_attr,
 		  int qp_attr_mask, struct ib_udata *udata);
 enum rdma_link_layer efa_port_link_layer(struct ib_device *ibdev,
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index 383ce126d82f..a03e3514bd8a 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -1873,7 +1873,7 @@ int efa_create_ah(struct ib_ah *ibah,
 	return err;
 }
 
-void efa_destroy_ah(struct ib_ah *ibah, u32 flags)
+int efa_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct efa_dev *dev = to_edev(ibah->pd->device);
 	struct efa_ah *ah = to_eah(ibah);
@@ -1883,10 +1883,11 @@ void efa_destroy_ah(struct ib_ah *ibah, u32 flags)
 	if (!(flags & RDMA_DESTROY_AH_SLEEPABLE)) {
 		ibdev_dbg(&dev->ibdev,
 			  "Destroy address handle is not supported in atomic context\n");
-		return;
+		return -EOPNOTSUPP;
 	}
 
 	efa_ah_destroy(dev, ah);
+	return 0;
 }
 
 struct rdma_hw_stats *efa_alloc_hw_stats(struct ib_device *ibdev, u8 port_num)
diff --git a/drivers/infiniband/hw/hns/hns_roce_ah.c b/drivers/infiniband/hw/hns/hns_roce_ah.c
index 5b2f9314edd3..8495b148a443 100644
--- a/drivers/infiniband/hw/hns/hns_roce_ah.c
+++ b/drivers/infiniband/hw/hns/hns_roce_ah.c
@@ -98,8 +98,3 @@ int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
 
 	return 0;
 }
-
-void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
-}
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index e2d544a76bb3..7100127c7d1c 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -1180,7 +1180,10 @@ void hns_roce_bitmap_free_range(struct hns_roce_bitmap *bitmap,
 int hns_roce_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		       struct ib_udata *udata);
 int hns_roce_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
-void hns_roce_destroy_ah(struct ib_ah *ah, u32 flags);
+static inline int hns_roce_destroy_ah(struct ib_ah *ah, u32 flags)
+{
+	return 0;
+}
 
 int hns_roce_alloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 int hns_roce_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/mlx4/ah.c b/drivers/infiniband/hw/mlx4/ah.c
index 5f8f8d5c0ce0..7321d6ab5fe1 100644
--- a/drivers/infiniband/hw/mlx4/ah.c
+++ b/drivers/infiniband/hw/mlx4/ah.c
@@ -232,8 +232,3 @@ int mlx4_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
 
 	return 0;
 }
-
-void mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
-}
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index bcac8fc50317..6d51653edaf8 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -753,7 +753,10 @@ int mlx4_ib_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 int mlx4_ib_create_ah_slave(struct ib_ah *ah, struct rdma_ah_attr *ah_attr,
 			    int slave_sgid_index, u8 *s_mac, u16 vlan_tag);
 int mlx4_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
-void mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags);
+static inline int mlx4_ib_destroy_ah(struct ib_ah *ah, u32 flags)
+{
+	return 0;
+}
 
 int mlx4_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 		       struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/mlx5/ah.c b/drivers/infiniband/hw/mlx5/ah.c
index 4a60e693a04d..505bc47fd575 100644
--- a/drivers/infiniband/hw/mlx5/ah.c
+++ b/drivers/infiniband/hw/mlx5/ah.c
@@ -147,8 +147,3 @@ int mlx5_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr)
 
 	return 0;
 }
-
-void mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags)
-{
-	return;
-}
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 5287fc868662..1e5f77d3e86b 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1119,7 +1119,10 @@ void mlx5_ib_free_srq_wqe(struct mlx5_ib_srq *srq, int wqe_index);
 int mlx5_ib_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		      struct ib_udata *udata);
 int mlx5_ib_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
-void mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags);
+static inline int mlx5_ib_destroy_ah(struct ib_ah *ah, u32 flags)
+{
+	return 0;
+}
 int mlx5_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 		       struct ib_udata *udata);
 int mlx5_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index d3ed7c19b2ef..12b7c5349004 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -390,9 +390,10 @@ static int mthca_ah_create(struct ib_ah *ibah,
 			       init_attr->ah_attr, ah);
 }
 
-static void mthca_ah_destroy(struct ib_ah *ah, u32 flags)
+static int mthca_ah_destroy(struct ib_ah *ah, u32 flags)
 {
 	mthca_destroy_ah(to_mdev(ah->device), to_mah(ah));
+	return 0;
 }
 
 static int mthca_create_srq(struct ib_srq *ibsrq,
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
index 6eea02b18968..699a8b719ed6 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.c
@@ -215,12 +215,13 @@ int ocrdma_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
 	return status;
 }
 
-void ocrdma_destroy_ah(struct ib_ah *ibah, u32 flags)
+int ocrdma_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct ocrdma_ah *ah = get_ocrdma_ah(ibah);
 	struct ocrdma_dev *dev = get_ocrdma_dev(ibah->device);
 
 	ocrdma_free_av(dev, ah);
+	return 0;
 }
 
 int ocrdma_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *attr)
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_ah.h b/drivers/infiniband/hw/ocrdma/ocrdma_ah.h
index 8b73b3489f3a..35cf2e2ff391 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_ah.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_ah.h
@@ -53,7 +53,7 @@ enum {
 
 int ocrdma_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		     struct ib_udata *udata);
-void ocrdma_destroy_ah(struct ib_ah *ah, u32 flags);
+int ocrdma_destroy_ah(struct ib_ah *ah, u32 flags);
 int ocrdma_query_ah(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
 
 int ocrdma_process_mad(struct ib_device *dev, int process_mad_flags,
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 93ee1740a2d2..53ac0a4d32f2 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -2767,11 +2767,12 @@ int qedr_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
 	return 0;
 }
 
-void qedr_destroy_ah(struct ib_ah *ibah, u32 flags)
+int qedr_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct qedr_ah *ah = get_qedr_ah(ibah);
 
 	rdma_destroy_ah_attr(&ah->attr);
+	return 0;
 }
 
 static void free_mr_info(struct qedr_dev *dev, struct mr_info *info)
diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h
index 1b450919ba9c..1b4ed8d37f5e 100644
--- a/drivers/infiniband/hw/qedr/verbs.h
+++ b/drivers/infiniband/hw/qedr/verbs.h
@@ -72,7 +72,7 @@ int qedr_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
 		       const struct ib_recv_wr **bad_recv_wr);
 int qedr_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
 		   struct ib_udata *udata);
-void qedr_destroy_ah(struct ib_ah *ibah, u32 flags);
+int qedr_destroy_ah(struct ib_ah *ibah, u32 flags);
 
 int qedr_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata);
 struct ib_mr *qedr_get_dma_mr(struct ib_pd *, int acc);
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
index 678c94531e68..fc412cbfd042 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.c
@@ -548,9 +548,10 @@ int pvrdma_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
  * @flags: destroy address handle flags (see enum rdma_destroy_ah_flags)
  *
  */
-void pvrdma_destroy_ah(struct ib_ah *ah, u32 flags)
+int pvrdma_destroy_ah(struct ib_ah *ah, u32 flags)
 {
 	struct pvrdma_dev *dev = to_vdev(ah->device);
 
 	atomic_dec(&dev->num_ahs);
+	return 0;
 }
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index 7bf33a654275..58b41a3e8b7e 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -416,7 +416,7 @@ int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int pvrdma_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 int pvrdma_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		     struct ib_udata *udata);
-void pvrdma_destroy_ah(struct ib_ah *ah, u32 flags);
+int pvrdma_destroy_ah(struct ib_ah *ah, u32 flags);
 
 int pvrdma_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 		      struct ib_udata *udata);
diff --git a/drivers/infiniband/sw/rdmavt/ah.c b/drivers/infiniband/sw/rdmavt/ah.c
index 75a04b1497c4..b938c4ffa99a 100644
--- a/drivers/infiniband/sw/rdmavt/ah.c
+++ b/drivers/infiniband/sw/rdmavt/ah.c
@@ -132,7 +132,7 @@ int rvt_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
  *
  * Return: 0 on success
  */
-void rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags)
+int rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags)
 {
 	struct rvt_dev_info *dev = ib_to_rvt(ibah->device);
 	struct rvt_ah *ah = ibah_to_rvtah(ibah);
@@ -143,6 +143,7 @@ void rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags)
 	spin_unlock_irqrestore(&dev->n_ahs_lock, flags);
 
 	rdma_destroy_ah_attr(&ah->attr);
+	return 0;
 }
 
 /**
diff --git a/drivers/infiniband/sw/rdmavt/ah.h b/drivers/infiniband/sw/rdmavt/ah.h
index 40b7123fec76..5a85edd06491 100644
--- a/drivers/infiniband/sw/rdmavt/ah.h
+++ b/drivers/infiniband/sw/rdmavt/ah.h
@@ -52,7 +52,7 @@
 
 int rvt_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
 		  struct ib_udata *udata);
-void rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags);
+int rvt_destroy_ah(struct ib_ah *ibah, u32 destroy_flags);
 int rvt_modify_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
 int rvt_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *ah_attr);
 
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index d23bc6aa30e2..4c7df057da54 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -228,11 +228,12 @@ static int rxe_query_ah(struct ib_ah *ibah, struct rdma_ah_attr *attr)
 	return 0;
 }
 
-static void rxe_destroy_ah(struct ib_ah *ibah, u32 flags)
+static int rxe_destroy_ah(struct ib_ah *ibah, u32 flags)
 {
 	struct rxe_ah *ah = to_rah(ibah);
 
 	rxe_drop_ref(ah);
+	return 0;
 }
 
 static int post_one_recv(struct rxe_rq *rq, const struct ib_recv_wr *ibwr)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 98e9ea1f693b..38d34b85138f 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2409,7 +2409,7 @@ struct ib_device_ops {
 			 struct ib_udata *udata);
 	int (*modify_ah)(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
 	int (*query_ah)(struct ib_ah *ah, struct rdma_ah_attr *ah_attr);
-	void (*destroy_ah)(struct ib_ah *ah, u32 flags);
+	int (*destroy_ah)(struct ib_ah *ah, u32 flags);
 	int (*create_srq)(struct ib_srq *srq,
 			  struct ib_srq_init_attr *srq_init_attr,
 			  struct ib_udata *udata);
@@ -3602,9 +3602,11 @@ int rdma_destroy_ah_user(struct ib_ah *ah, u32 flags, struct ib_udata *udata);
  *
  * NOTE: for user ah use rdma_destroy_ah_user with valid udata!
  */
-static inline int rdma_destroy_ah(struct ib_ah *ah, u32 flags)
+static inline void rdma_destroy_ah(struct ib_ah *ah, u32 flags)
 {
-	return rdma_destroy_ah_user(ah, flags, NULL);
+	int ret = rdma_destroy_ah_user(ah, flags, NULL);
+
+	WARN_ONCE(ret, "Destroy of kernel AH shouldn't fail");
 }
 
 struct ib_srq *ib_create_srq_user(struct ib_pd *pd,
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 01/10] RDMA: Restore ability to fail on PD deallocate Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 02/10] RDMA: Restore ability to fail on AH destroy Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-09-03  0:31   ` Jason Gunthorpe
  2020-08-30  8:40 ` [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll Leon Romanovsky
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma

From: Leon Romanovsky <leonro@mellanox.com>

The HW release can fail and leave the system in limbo state,
where SRQ is removed from the table, but can't be destroyed later.
In every reentry, the initial xa_erase_irq() check will fail.

Rewrite the erase logic to keep index, but don't store the entry
itself. By doing it, we can safely reinsert entry back in the case
of destroy failure and be safe from any xa_store_irq() error.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/srq_cmd.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c
index 37aaacebd3f2..c6d807f04d9d 100644
--- a/drivers/infiniband/hw/mlx5/srq_cmd.c
+++ b/drivers/infiniband/hw/mlx5/srq_cmd.c
@@ -596,13 +596,22 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
 	struct mlx5_core_srq *tmp;
 	int err;
 
-	tmp = xa_erase_irq(&table->array, srq->srqn);
-	if (!tmp || tmp != srq)
+	/* Delete entry, but leave index occupied */
+	tmp = xa_store_irq(&table->array, srq->srqn, NULL, 0);
+	if (WARN_ON(!tmp || tmp != srq))
 		return;
 
 	err = destroy_srq_split(dev, srq);
-	if (err)
+	if (err) {
+		/*
+		 * We don't need to check returned result for an error,
+		 * because  we are storing in pre-allocated space xarray
+		 * entry and it can't fail at this stage.
+		 */
+		xa_store_irq(&table->array, srq->srqn, srq, 0);
 		return;
+	}
+	xa_erase_irq(&table->array, srq->srqn);
 
 	mlx5_core_res_put(&srq->common);
 	wait_for_completion(&srq->common.free);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (2 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-09-03 13:42   ` Jason Gunthorpe
  2020-08-30  8:40 ` [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Eli Cohen, Jack Morgenstein, linux-rdma,
	Or Gerlitz, Roland Dreier

From: Leon Romanovsky <leonro@mellanox.com>

The SRQ can be destroyed right before mlx5_cmd_get_srq is called.
In such case the latter will return NULL instead of expected SRQ.

Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/hw/mlx5/cq.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 34c2cd57d994..2bac246b9fef 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -168,7 +168,7 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe,
 {
 	enum rdma_link_layer ll = rdma_port_get_link_layer(qp->ibqp.device, 1);
 	struct mlx5_ib_dev *dev = to_mdev(qp->ibqp.device);
-	struct mlx5_ib_srq *srq;
+	struct mlx5_ib_srq *srq = NULL;
 	struct mlx5_ib_wq *wq;
 	u16 wqe_ctr;
 	u8  roce_packet_type;
@@ -180,7 +180,8 @@ static void handle_responder(struct ib_wc *wc, struct mlx5_cqe64 *cqe,
 
 		if (qp->ibqp.xrcd) {
 			msrq = mlx5_cmd_get_srq(dev, be32_to_cpu(cqe->srqn));
-			srq = to_mibsrq(msrq);
+			if (msrq)
+				srq = to_mibsrq(msrq);
 		} else {
 			srq = to_msrq(qp->ibqp.srq);
 		}
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (3 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-09-03  0:08   ` Jason Gunthorpe
  2020-09-03  0:18   ` Jason Gunthorpe
  2020-08-30  8:40 ` [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ Leon Romanovsky
                   ` (4 subsequent siblings)
  9 siblings, 2 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

From: Leon Romanovsky <leonro@mellanox.com>

In similar way to other IB objects, restore ability to return
error on SRQ destroy. Strictly saying, this change is not necessary,
and provided here to ensure symmetrical interface to be like any other
destroy command.

Fixes: 68e326dea1db ("RDMA: Handle SRQ allocations by IB/core")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/verbs.c               |  8 ++++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      |  3 ++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h      |  2 +-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h        |  2 +-
 drivers/infiniband/hw/cxgb4/qp.c              |  3 ++-
 drivers/infiniband/hw/hns/hns_roce_device.h   |  2 +-
 drivers/infiniband/hw/hns/hns_roce_srq.c      |  3 ++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h          |  2 +-
 drivers/infiniband/hw/mlx4/srq.c              |  3 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  2 +-
 drivers/infiniband/hw/mlx5/srq.c              | 26 +++++++++----------
 drivers/infiniband/hw/mlx5/srq.h              |  2 +-
 drivers/infiniband/hw/mlx5/srq_cmd.c          |  7 ++---
 drivers/infiniband/hw/mthca/mthca_provider.c  |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c   |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h   |  2 +-
 drivers/infiniband/hw/qedr/verbs.c            |  3 ++-
 drivers/infiniband/hw/qedr/verbs.h            |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c |  3 ++-
 .../infiniband/hw/vmw_pvrdma/pvrdma_verbs.h   |  2 +-
 drivers/infiniband/sw/rdmavt/srq.c            |  3 ++-
 drivers/infiniband/sw/rdmavt/srq.h            |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c         |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.c         |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.h         |  2 +-
 drivers/infiniband/ulp/ipoib/ipoib_cm.c       |  6 +----
 include/rdma/ib_verbs.h                       |  8 +++---
 27 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 5b4ad90ef6e6..8ce6fc14677a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -1068,10 +1068,14 @@ EXPORT_SYMBOL(ib_query_srq);
 
 int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata)
 {
+	int ret;
+
 	if (atomic_read(&srq->usecnt))
 		return -EBUSY;
 
-	srq->device->ops.destroy_srq(srq, udata);
+	ret = srq->device->ops.destroy_srq(srq, udata);
+	if (ret && udata)
+		return ret;
 
 	atomic_dec(&srq->pd->usecnt);
 	if (srq->srq_type == IB_SRQT_XRC)
@@ -1080,7 +1084,7 @@ int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata)
 		atomic_dec(&srq->ext.cq->usecnt);
 	kfree(srq);
 
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(ib_destroy_srq_user);
 
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index e950a0792518..13460fd31c8d 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -1570,7 +1570,7 @@ static enum ib_mtu __to_ib_mtu(u32 mtu)
 }
 
 /* Shared Receive Queues */
-void bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata)
+int bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata)
 {
 	struct bnxt_re_srq *srq = container_of(ib_srq, struct bnxt_re_srq,
 					       ib_srq);
@@ -1585,6 +1585,7 @@ void bnxt_re_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata)
 	atomic_dec(&rdev->srq_count);
 	if (nq)
 		nq->budget--;
+	return 0;
 }
 
 static int bnxt_re_init_user_srq(struct bnxt_re_dev *rdev,
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index b6b56a92b78e..7ca232809466 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -176,7 +176,7 @@ int bnxt_re_modify_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr,
 		       enum ib_srq_attr_mask srq_attr_mask,
 		       struct ib_udata *udata);
 int bnxt_re_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
-void bnxt_re_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
+int bnxt_re_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
 int bnxt_re_post_srq_recv(struct ib_srq *srq, const struct ib_recv_wr *recv_wr,
 			  const struct ib_recv_wr **bad_recv_wr);
 struct ib_qp *bnxt_re_create_qp(struct ib_pd *pd,
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index 2b2b009b371a..fa91e80869c0 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -999,7 +999,7 @@ int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 int c4iw_modify_srq(struct ib_srq *ib_srq, struct ib_srq_attr *attr,
 		    enum ib_srq_attr_mask srq_attr_mask,
 		    struct ib_udata *udata);
-void c4iw_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata);
+int c4iw_destroy_srq(struct ib_srq *ib_srq, struct ib_udata *udata);
 int c4iw_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *attrs,
 		    struct ib_udata *udata);
 int c4iw_destroy_qp(struct ib_qp *ib_qp, struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/cxgb4/qp.c b/drivers/infiniband/hw/cxgb4/qp.c
index ac48012c992f..dbee730342af 100644
--- a/drivers/infiniband/hw/cxgb4/qp.c
+++ b/drivers/infiniband/hw/cxgb4/qp.c
@@ -2797,7 +2797,7 @@ int c4iw_create_srq(struct ib_srq *ib_srq, struct ib_srq_init_attr *attrs,
 	return ret;
 }
 
-void c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct c4iw_dev *rhp;
 	struct c4iw_srq *srq;
@@ -2813,4 +2813,5 @@ void c4iw_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 		       srq->wr_waitp);
 	c4iw_free_srq_idx(&rhp->rdev, srq->idx);
 	c4iw_put_wr_wait(srq->wr_waitp);
+	return 0;
 }
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index 7100127c7d1c..ce0bec4a73c2 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -1225,7 +1225,7 @@ int hns_roce_create_srq(struct ib_srq *srq,
 int hns_roce_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr,
 			enum ib_srq_attr_mask srq_attr_mask,
 			struct ib_udata *udata);
-void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 
 struct ib_qp *hns_roce_create_qp(struct ib_pd *ib_pd,
 				 struct ib_qp_init_attr *init_attr,
diff --git a/drivers/infiniband/hw/hns/hns_roce_srq.c b/drivers/infiniband/hw/hns/hns_roce_srq.c
index f40a000e94ee..17585b127d3a 100644
--- a/drivers/infiniband/hw/hns/hns_roce_srq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_srq.c
@@ -363,7 +363,7 @@ int hns_roce_create_srq(struct ib_srq *ib_srq,
 	return ret;
 }
 
-void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibsrq->device);
 	struct hns_roce_srq *srq = to_hr_srq(ibsrq);
@@ -372,6 +372,7 @@ void hns_roce_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 	free_srq_idx(hr_dev, srq);
 	free_srq_wrid(srq);
 	free_srq_buf(hr_dev, srq);
+	return 0;
 }
 
 int hns_roce_init_srq_table(struct hns_roce_dev *hr_dev)
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 6d51653edaf8..392a5a7c2a31 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -763,7 +763,7 @@ int mlx4_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 int mlx4_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		       enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
 int mlx4_ib_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
-void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
+int mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
 void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index);
 int mlx4_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
 			  const struct ib_recv_wr **bad_wr);
diff --git a/drivers/infiniband/hw/mlx4/srq.c b/drivers/infiniband/hw/mlx4/srq.c
index 8f9d5035142d..2651b68a1c04 100644
--- a/drivers/infiniband/hw/mlx4/srq.c
+++ b/drivers/infiniband/hw/mlx4/srq.c
@@ -260,7 +260,7 @@ int mlx4_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr)
 	return 0;
 }
 
-void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+int mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(srq->device);
 	struct mlx4_ib_srq *msrq = to_msrq(srq);
@@ -282,6 +282,7 @@ void mlx4_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 		mlx4_db_free(dev->dev, &msrq->db);
 	}
 	ib_umem_release(msrq->umem);
+	return 0;
 }
 
 void mlx4_ib_free_srq_wqe(struct mlx4_ib_srq *srq, int wqe_index)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 1e5f77d3e86b..b7b00e9e180b 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1128,7 +1128,7 @@ int mlx5_ib_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 int mlx5_ib_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		       enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
 int mlx5_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr);
-void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
+int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
 int mlx5_ib_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
 			  const struct ib_recv_wr **bad_wr);
 int mlx5_ib_enable_lb(struct mlx5_ib_dev *dev, bool td, bool qp);
diff --git a/drivers/infiniband/hw/mlx5/srq.c b/drivers/infiniband/hw/mlx5/srq.c
index 1b54fe4ea21d..3d7561f37742 100644
--- a/drivers/infiniband/hw/mlx5/srq.c
+++ b/drivers/infiniband/hw/mlx5/srq.c
@@ -389,24 +389,24 @@ int mlx5_ib_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr)
 	return ret;
 }
 
-void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(srq->device);
 	struct mlx5_ib_srq *msrq = to_msrq(srq);
+	int ret;
+
+	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
+	if (ret && udata)
+		return ret;
 
-	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
-
-	if (srq->uobject) {
-		mlx5_ib_db_unmap_user(
-			rdma_udata_to_drv_context(
-				udata,
-				struct mlx5_ib_ucontext,
-				ibucontext),
-			&msrq->db);
-		ib_umem_release(msrq->umem);
-	} else {
-		destroy_srq_kernel(dev, msrq);
+	if (udata) {
+		destroy_srq_user(srq->pd, msrq, udata);
+		return 0;
 	}
+
+	/* We are cleaning kernel resources anyway */
+	destroy_srq_kernel(dev, msrq);
+	return ret;
 }
 
 void mlx5_ib_free_srq_wqe(struct mlx5_ib_srq *srq, int wqe_index)
diff --git a/drivers/infiniband/hw/mlx5/srq.h b/drivers/infiniband/hw/mlx5/srq.h
index af197c36d757..2c3627b2509d 100644
--- a/drivers/infiniband/hw/mlx5/srq.h
+++ b/drivers/infiniband/hw/mlx5/srq.h
@@ -56,7 +56,7 @@ struct mlx5_srq_table {
 
 int mlx5_cmd_create_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
 			struct mlx5_srq_attr *in);
-void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq);
+int mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq);
 int mlx5_cmd_query_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
 		       struct mlx5_srq_attr *out);
 int mlx5_cmd_arm_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c
index c6d807f04d9d..d590bac684eb 100644
--- a/drivers/infiniband/hw/mlx5/srq_cmd.c
+++ b/drivers/infiniband/hw/mlx5/srq_cmd.c
@@ -590,7 +590,7 @@ int mlx5_cmd_create_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
 	return err;
 }
 
-void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
+int mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
 {
 	struct mlx5_srq_table *table = &dev->srq_table;
 	struct mlx5_core_srq *tmp;
@@ -599,7 +599,7 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
 	/* Delete entry, but leave index occupied */
 	tmp = xa_store_irq(&table->array, srq->srqn, NULL, 0);
 	if (WARN_ON(!tmp || tmp != srq))
-		return;
+		return -EINVAL;
 
 	err = destroy_srq_split(dev, srq);
 	if (err) {
@@ -609,12 +609,13 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
 		 * entry and it can't fail at this stage.
 		 */
 		xa_store_irq(&table->array, srq->srqn, srq, 0);
-		return;
+		return err;
 	}
 	xa_erase_irq(&table->array, srq->srqn);
 
 	mlx5_core_res_put(&srq->common);
 	wait_for_completion(&srq->common.free);
+	return 0;
 }
 
 int mlx5_cmd_query_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq,
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 12b7c5349004..5d1e17214f0c 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -442,7 +442,7 @@ static int mthca_create_srq(struct ib_srq *ibsrq,
 	return 0;
 }
 
-static void mthca_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+static int mthca_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	if (udata) {
 		struct mthca_ucontext *context =
@@ -456,6 +456,7 @@ static void mthca_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 	}
 
 	mthca_free_srq(to_mdev(srq->device), to_msrq(srq));
+	return 0;
 }
 
 static struct ib_qp *mthca_create_qp(struct ib_pd *pd,
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 4b9295c8d4f3..220bb09d6431 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -1858,7 +1858,7 @@ int ocrdma_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *srq_attr)
 	return status;
 }
 
-void ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct ocrdma_srq *srq;
 	struct ocrdma_dev *dev = get_ocrdma_dev(ibsrq->device);
@@ -1873,6 +1873,7 @@ void ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 
 	kfree(srq->idx_bit_fields);
 	kfree(srq->rqe_wr_id_tbl);
+	return 0;
 }
 
 /* unprivileged verbs and their support functions. */
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
index 4c85be43507c..4f6806f16e61 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
@@ -92,7 +92,7 @@ int ocrdma_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *attr,
 int ocrdma_modify_srq(struct ib_srq *, struct ib_srq_attr *,
 		      enum ib_srq_attr_mask, struct ib_udata *);
 int ocrdma_query_srq(struct ib_srq *, struct ib_srq_attr *);
-void ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int ocrdma_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 int ocrdma_post_srq_recv(struct ib_srq *, const struct ib_recv_wr *,
 			 const struct ib_recv_wr **bad_recv_wr);
 
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index 53ac0a4d32f2..d75300d7df95 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -1592,7 +1592,7 @@ int qedr_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *init_attr,
 	return -EFAULT;
 }
 
-void qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct qed_rdma_destroy_srq_in_params in_params = {};
 	struct qedr_dev *dev = get_qedr_dev(ibsrq->device);
@@ -1610,6 +1610,7 @@ void qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 	DP_DEBUG(dev, QEDR_MSG_SRQ,
 		 "destroy srq: destroyed srq with srq_id=0x%0x\n",
 		 srq->srq_id);
+	return 0;
 }
 
 int qedr_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h
index 1b4ed8d37f5e..a78b206d8b5a 100644
--- a/drivers/infiniband/hw/qedr/verbs.h
+++ b/drivers/infiniband/hw/qedr/verbs.h
@@ -67,7 +67,7 @@ int qedr_create_srq(struct ib_srq *ibsrq, struct ib_srq_init_attr *attr,
 int qedr_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		    enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
 int qedr_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr);
-void qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int qedr_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 int qedr_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
 		       const struct ib_recv_wr **bad_recv_wr);
 int qedr_create_ah(struct ib_ah *ibah, struct rdma_ah_init_attr *init_attr,
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
index f60a8e81bddd..f6802276fc04 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_srq.c
@@ -240,7 +240,7 @@ static void pvrdma_free_srq(struct pvrdma_dev *dev, struct pvrdma_srq *srq)
  *
  * @return: 0 for success.
  */
-void pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
+int pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 {
 	struct pvrdma_srq *vsrq = to_vsrq(srq);
 	union pvrdma_cmd_req req;
@@ -259,6 +259,7 @@ void pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
 			 ret);
 
 	pvrdma_free_srq(dev, vsrq);
+	return 0;
 }
 
 /**
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index 58b41a3e8b7e..f9edce71b79b 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -423,7 +423,7 @@ int pvrdma_create_srq(struct ib_srq *srq, struct ib_srq_init_attr *init_attr,
 int pvrdma_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		      enum ib_srq_attr_mask attr_mask, struct ib_udata *udata);
 int pvrdma_query_srq(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
-void pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
+int pvrdma_destroy_srq(struct ib_srq *srq, struct ib_udata *udata);
 
 struct ib_qp *pvrdma_create_qp(struct ib_pd *pd,
 			       struct ib_qp_init_attr *init_attr,
diff --git a/drivers/infiniband/sw/rdmavt/srq.c b/drivers/infiniband/sw/rdmavt/srq.c
index f547c115af03..64d98bf238ab 100644
--- a/drivers/infiniband/sw/rdmavt/srq.c
+++ b/drivers/infiniband/sw/rdmavt/srq.c
@@ -332,7 +332,7 @@ int rvt_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr)
  * @ibsrq: srq object to destroy
  *
  */
-void rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+int rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct rvt_srq *srq = ibsrq_to_rvtsrq(ibsrq);
 	struct rvt_dev_info *dev = ib_to_rvt(ibsrq->device);
@@ -343,4 +343,5 @@ void rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 	if (srq->ip)
 		kref_put(&srq->ip->ref, rvt_release_mmap_info);
 	kvfree(srq->rq.kwq);
+	return 0;
 }
diff --git a/drivers/infiniband/sw/rdmavt/srq.h b/drivers/infiniband/sw/rdmavt/srq.h
index 6427d7d62a9a..d5a1a053b1b9 100644
--- a/drivers/infiniband/sw/rdmavt/srq.h
+++ b/drivers/infiniband/sw/rdmavt/srq.h
@@ -56,6 +56,6 @@ int rvt_modify_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr,
 		   enum ib_srq_attr_mask attr_mask,
 		   struct ib_udata *udata);
 int rvt_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr);
-void rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
+int rvt_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata);
 
 #endif          /* DEF_RVTSRQ_H */
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 4c7df057da54..7303edbc293b 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -367,7 +367,7 @@ static int rxe_query_srq(struct ib_srq *ibsrq, struct ib_srq_attr *attr)
 	return 0;
 }
 
-static void rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
+static int rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 {
 	struct rxe_srq *srq = to_rsrq(ibsrq);
 
@@ -376,6 +376,7 @@ static void rxe_destroy_srq(struct ib_srq *ibsrq, struct ib_udata *udata)
 
 	rxe_drop_ref(srq->pd);
 	rxe_drop_ref(srq);
+	return 0;
 }
 
 static int rxe_post_srq_recv(struct ib_srq *ibsrq, const struct ib_recv_wr *wr,
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index 2d2b6df0b027..a6ec1e968fb4 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1691,7 +1691,7 @@ int siw_query_srq(struct ib_srq *base_srq, struct ib_srq_attr *attrs)
  * QP anymore - the code trusts the RDMA core environment to keep track
  * of QP references.
  */
-void siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata)
+int siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata)
 {
 	struct siw_srq *srq = to_siw_srq(base_srq);
 	struct siw_device *sdev = to_siw_dev(base_srq->device);
@@ -1703,6 +1703,7 @@ void siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata)
 		rdma_user_mmap_entry_remove(srq->srq_entry);
 	vfree(srq->recvq);
 	atomic_dec(&sdev->num_srq);
+	return 0;
 }
 
 /*
diff --git a/drivers/infiniband/sw/siw/siw_verbs.h b/drivers/infiniband/sw/siw/siw_verbs.h
index 3dbab78579cb..ed2d8ac2f967 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.h
+++ b/drivers/infiniband/sw/siw/siw_verbs.h
@@ -78,7 +78,7 @@ int siw_create_srq(struct ib_srq *base_srq, struct ib_srq_init_attr *attr,
 int siw_modify_srq(struct ib_srq *base_srq, struct ib_srq_attr *attr,
 		   enum ib_srq_attr_mask mask, struct ib_udata *udata);
 int siw_query_srq(struct ib_srq *base_srq, struct ib_srq_attr *attr);
-void siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata);
+int siw_destroy_srq(struct ib_srq *base_srq, struct ib_udata *udata);
 int siw_post_srq_recv(struct ib_srq *base_srq, const struct ib_recv_wr *wr,
 		      const struct ib_recv_wr **bad_wr);
 int siw_mmap(struct ib_ucontext *ctx, struct vm_area_struct *vma);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 9bf0fa30df28..dca86abb3012 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1647,17 +1647,13 @@ int ipoib_cm_dev_init(struct net_device *dev)
 void ipoib_cm_dev_cleanup(struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
-	int ret;
 
 	if (!priv->cm.srq)
 		return;
 
 	ipoib_dbg(priv, "Cleanup ipoib connected mode.\n");
 
-	ret = ib_destroy_srq(priv->cm.srq);
-	if (ret)
-		ipoib_warn(priv, "ib_destroy_srq failed: %d\n", ret);
-
+	ib_destroy_srq(priv->cm.srq);
 	priv->cm.srq = NULL;
 	if (!priv->cm.srq_ring)
 		return;
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 38d34b85138f..2ea278b0494a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2417,7 +2417,7 @@ struct ib_device_ops {
 			  enum ib_srq_attr_mask srq_attr_mask,
 			  struct ib_udata *udata);
 	int (*query_srq)(struct ib_srq *srq, struct ib_srq_attr *srq_attr);
-	void (*destroy_srq)(struct ib_srq *srq, struct ib_udata *udata);
+	int (*destroy_srq)(struct ib_srq *srq, struct ib_udata *udata);
 	struct ib_qp *(*create_qp)(struct ib_pd *pd,
 				   struct ib_qp_init_attr *qp_init_attr,
 				   struct ib_udata *udata);
@@ -3660,9 +3660,11 @@ int ib_destroy_srq_user(struct ib_srq *srq, struct ib_udata *udata);
  *
  * NOTE: for user srq use ib_destroy_srq_user with valid udata!
  */
-static inline int ib_destroy_srq(struct ib_srq *srq)
+static inline void ib_destroy_srq(struct ib_srq *srq)
 {
-	return ib_destroy_srq_user(srq, NULL);
+	int ret = ib_destroy_srq_user(srq, NULL);
+
+	WARN_ONCE(ret, "Destroy of kernel SRQ shouldn't fail");
 }
 
 /**
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (4 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-09-03  0:20   ` Jason Gunthorpe
  2020-08-30  8:40 ` [PATCH rdma-next v1 07/10] RDMA: Allow fail of destroy CQ Leon Romanovsky
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma

From: Leon Romanovsky <leonro@mellanox.com>

The ib_alloc_cq*() and ib_free_cq*() are solely kernel verbs to manage
CQs and doesn't need extra indirection just to call same functions with
constant parameter NULL as udata.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cq.c | 33 ++++++++-----------
 include/rdma/ib_verbs.h      | 62 ++++--------------------------------
 2 files changed, 20 insertions(+), 75 deletions(-)

diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index 513825e424bf..49c2d71e309a 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -197,24 +197,22 @@ static void ib_cq_completion_workqueue(struct ib_cq *cq, void *private)
 }
 
 /**
- * __ib_alloc_cq_user - allocate a completion queue
+ * __ib_alloc_cq        allocate a completion queue
  * @dev:		device to allocate the CQ for
  * @private:		driver private data, accessible from cq->cq_context
  * @nr_cqe:		number of CQEs to allocate
  * @comp_vector:	HCA completion vectors for this CQ
  * @poll_ctx:		context to poll the CQ from.
  * @caller:		module owner name.
- * @udata:		Valid user data or NULL for kernel object
  *
  * This is the proper interface to allocate a CQ for in-kernel users. A
  * CQ allocated with this interface will automatically be polled from the
  * specified context. The ULP must use wr->wr_cqe instead of wr->wr_id
  * to use this CQ abstraction.
  */
-struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
-				 int nr_cqe, int comp_vector,
-				 enum ib_poll_context poll_ctx,
-				 const char *caller, struct ib_udata *udata)
+struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe,
+			    int comp_vector, enum ib_poll_context poll_ctx,
+			    const char *caller)
 {
 	struct ib_cq_init_attr cq_attr = {
 		.cqe		= nr_cqe,
@@ -277,7 +275,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
 out_destroy_cq:
 	rdma_dim_destroy(cq);
 	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, udata);
+	cq->device->ops.destroy_cq(cq, NULL);
 out_free_wc:
 	kfree(cq->wc);
 out_free_cq:
@@ -285,7 +283,7 @@ struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
 	trace_cq_alloc_error(nr_cqe, comp_vector, poll_ctx, ret);
 	return ERR_PTR(ret);
 }
-EXPORT_SYMBOL(__ib_alloc_cq_user);
+EXPORT_SYMBOL(__ib_alloc_cq);
 
 /**
  * __ib_alloc_cq_any - allocate a completion queue
@@ -310,22 +308,19 @@ struct ib_cq *__ib_alloc_cq_any(struct ib_device *dev, void *private,
 			atomic_inc_return(&counter) %
 			min_t(int, dev->num_comp_vectors, num_online_cpus());
 
-	return __ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx,
-				  caller, NULL);
+	return __ib_alloc_cq(dev, private, nr_cqe, comp_vector, poll_ctx,
+			     caller);
 }
 EXPORT_SYMBOL(__ib_alloc_cq_any);
 
 /**
- * ib_free_cq_user - free a completion queue
+ * ib_free_cq - free a completion queue
  * @cq:		completion queue to free.
- * @udata:	User data or NULL for kernel object
  */
-void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
+void ib_free_cq(struct ib_cq *cq)
 {
-	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
-		return;
-	if (WARN_ON_ONCE(cq->cqe_used))
-		return;
+	WARN_ON_ONCE(atomic_read(&cq->usecnt));
+	WARN_ON_ONCE(cq->cqe_used);
 
 	switch (cq->poll_ctx) {
 	case IB_POLL_DIRECT:
@@ -344,11 +339,11 @@ void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
 	rdma_dim_destroy(cq);
 	trace_cq_free(cq);
 	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, udata);
+	cq->device->ops.destroy_cq(cq, NULL);
 	kfree(cq->wc);
 	kfree(cq);
 }
-EXPORT_SYMBOL(ib_free_cq_user);
+EXPORT_SYMBOL(ib_free_cq);
 
 void ib_cq_pool_init(struct ib_device *dev)
 {
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 2ea278b0494a..a3eb363674df 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -3808,46 +3808,15 @@ static inline int ib_post_recv(struct ib_qp *qp,
 	return qp->device->ops.post_recv(qp, recv_wr, bad_recv_wr ? : &dummy);
 }
 
-struct ib_cq *__ib_alloc_cq_user(struct ib_device *dev, void *private,
-				 int nr_cqe, int comp_vector,
-				 enum ib_poll_context poll_ctx,
-				 const char *caller, struct ib_udata *udata);
-
-/**
- * ib_alloc_cq_user: Allocate kernel/user CQ
- * @dev: The IB device
- * @private: Private data attached to the CQE
- * @nr_cqe: Number of CQEs in the CQ
- * @comp_vector: Completion vector used for the IRQs
- * @poll_ctx: Context used for polling the CQ
- * @udata: Valid user data or NULL for kernel objects
- */
-static inline struct ib_cq *ib_alloc_cq_user(struct ib_device *dev,
-					     void *private, int nr_cqe,
-					     int comp_vector,
-					     enum ib_poll_context poll_ctx,
-					     struct ib_udata *udata)
-{
-	return __ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx,
-				  KBUILD_MODNAME, udata);
-}
-
-/**
- * ib_alloc_cq: Allocate kernel CQ
- * @dev: The IB device
- * @private: Private data attached to the CQE
- * @nr_cqe: Number of CQEs in the CQ
- * @comp_vector: Completion vector used for the IRQs
- * @poll_ctx: Context used for polling the CQ
- *
- * NOTE: for user cq use ib_alloc_cq_user with valid udata!
- */
+struct ib_cq *__ib_alloc_cq(struct ib_device *dev, void *private, int nr_cqe,
+			    int comp_vector, enum ib_poll_context poll_ctx,
+			    const char *caller);
 static inline struct ib_cq *ib_alloc_cq(struct ib_device *dev, void *private,
 					int nr_cqe, int comp_vector,
 					enum ib_poll_context poll_ctx)
 {
-	return ib_alloc_cq_user(dev, private, nr_cqe, comp_vector, poll_ctx,
-				NULL);
+	return __ib_alloc_cq(dev, private, nr_cqe, comp_vector, poll_ctx,
+			     KBUILD_MODNAME);
 }
 
 struct ib_cq *__ib_alloc_cq_any(struct ib_device *dev, void *private,
@@ -3869,26 +3838,7 @@ static inline struct ib_cq *ib_alloc_cq_any(struct ib_device *dev,
 				 KBUILD_MODNAME);
 }
 
-/**
- * ib_free_cq_user - Free kernel/user CQ
- * @cq: The CQ to free
- * @udata: Valid user data or NULL for kernel objects
- *
- * NOTE: This function shouldn't be called on shared CQs.
- */
-void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata);
-
-/**
- * ib_free_cq - Free kernel CQ
- * @cq: The CQ to free
- *
- * NOTE: for user cq use ib_free_cq_user with valid udata!
- */
-static inline void ib_free_cq(struct ib_cq *cq)
-{
-	ib_free_cq_user(cq, NULL);
-}
-
+void ib_free_cq(struct ib_cq *cq);
 int ib_process_cq_direct(struct ib_cq *cq, int budget);
 
 /**
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 07/10] RDMA: Allow fail of destroy CQ
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (5 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 08/10] RDMA: Change XRCD destroy return value Leon Romanovsky
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Christian Benvenuti, Dennis Dalessandro, Devesh Sharma,
	Faisal Latif, Gal Pressman, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Nelson Escobar, Parvi Kaustubhi, Potnuri Bharat Teja,
	Selvin Xavier, Shiraz Saleem, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

From: Leon Romanovsky <leonro@mellanox.com>

Like any other verbs object, CQ shouldn't fail during destroy,
but mlx5_ib didn't follow this contract while mixed IB verbs objects
with DEVX. Such mix causes to the situation where FW and kernel
are fully dependent on the reference counting of another side.

Kernel verbs and drivers that don't have DEVX flows shouldn't fail.

Fixes: e39afe3d6dbd ("RDMA: Convert CQ allocations to be under core responsibility")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/cq.c                    |  5 ++++-
 drivers/infiniband/core/verbs.c                 |  9 +++++++--
 drivers/infiniband/hw/bnxt_re/ib_verbs.c        |  3 ++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.h        |  2 +-
 drivers/infiniband/hw/cxgb4/cq.c                |  3 ++-
 drivers/infiniband/hw/cxgb4/iw_cxgb4.h          |  2 +-
 drivers/infiniband/hw/efa/efa.h                 |  2 +-
 drivers/infiniband/hw/efa/efa_verbs.c           |  3 ++-
 drivers/infiniband/hw/hns/hns_roce_cq.c         |  3 ++-
 drivers/infiniband/hw/hns/hns_roce_device.h     |  4 ++--
 drivers/infiniband/hw/hns/hns_roce_hw_v1.c      |  3 ++-
 drivers/infiniband/hw/i40iw/i40iw_verbs.c       |  3 ++-
 drivers/infiniband/hw/mlx4/cq.c                 |  3 ++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h            |  2 +-
 drivers/infiniband/hw/mlx5/cq.c                 | 16 +++++++++++-----
 drivers/infiniband/hw/mlx5/mlx5_ib.h            |  2 +-
 drivers/infiniband/hw/mthca/mthca_provider.c    |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.c     |  3 ++-
 drivers/infiniband/hw/ocrdma/ocrdma_verbs.h     |  2 +-
 drivers/infiniband/hw/qedr/verbs.c              |  5 +++--
 drivers/infiniband/hw/qedr/verbs.h              |  2 +-
 drivers/infiniband/hw/usnic/usnic_ib_verbs.c    |  4 ++--
 drivers/infiniband/hw/usnic/usnic_ib_verbs.h    |  2 +-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c    |  3 ++-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h |  2 +-
 drivers/infiniband/sw/rdmavt/cq.c               |  3 ++-
 drivers/infiniband/sw/rdmavt/cq.h               |  2 +-
 drivers/infiniband/sw/rxe/rxe_verbs.c           |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.c           |  3 ++-
 drivers/infiniband/sw/siw/siw_verbs.h           |  2 +-
 include/rdma/ib_verbs.h                         |  6 ++++--
 31 files changed, 70 insertions(+), 40 deletions(-)

diff --git a/drivers/infiniband/core/cq.c b/drivers/infiniband/core/cq.c
index 49c2d71e309a..9d86513fed81 100644
--- a/drivers/infiniband/core/cq.c
+++ b/drivers/infiniband/core/cq.c
@@ -319,6 +319,8 @@ EXPORT_SYMBOL(__ib_alloc_cq_any);
  */
 void ib_free_cq(struct ib_cq *cq)
 {
+	int ret;
+
 	WARN_ON_ONCE(atomic_read(&cq->usecnt));
 	WARN_ON_ONCE(cq->cqe_used);

@@ -338,8 +340,9 @@ void ib_free_cq(struct ib_cq *cq)

 	rdma_dim_destroy(cq);
 	trace_cq_free(cq);
+	ret = cq->device->ops.destroy_cq(cq, NULL);
+	WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail");
 	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, NULL);
 	kfree(cq->wc);
 	kfree(cq);
 }
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 8ce6fc14677a..80154be9390a 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2023,16 +2023,21 @@ EXPORT_SYMBOL(rdma_set_cq_moderation);

 int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata)
 {
+	int ret;
+
 	if (WARN_ON_ONCE(cq->shared))
 		return -EOPNOTSUPP;

 	if (atomic_read(&cq->usecnt))
 		return -EBUSY;

+	ret = cq->device->ops.destroy_cq(cq, udata);
+	if (ret && udata)
+		return ret;
+
 	rdma_restrack_del(&cq->res);
-	cq->device->ops.destroy_cq(cq, udata);
 	kfree(cq);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(ib_destroy_cq_user);

diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.c b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
index 13460fd31c8d..c983a48e9aed 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.c
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.c
@@ -2803,7 +2803,7 @@ int bnxt_re_post_recv(struct ib_qp *ib_qp, const struct ib_recv_wr *wr,
 }

 /* Completion Queues */
-void bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+int bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct bnxt_re_cq *cq;
 	struct bnxt_qplib_nq *nq;
@@ -2819,6 +2819,7 @@ void bnxt_re_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 	atomic_dec(&rdev->cq_count);
 	nq->budget--;
 	kfree(cq->cql);
+	return 0;
 }

 int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
diff --git a/drivers/infiniband/hw/bnxt_re/ib_verbs.h b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
index 7ca232809466..9a8130b79256 100644
--- a/drivers/infiniband/hw/bnxt_re/ib_verbs.h
+++ b/drivers/infiniband/hw/bnxt_re/ib_verbs.h
@@ -193,7 +193,7 @@ int bnxt_re_post_recv(struct ib_qp *qp, const struct ib_recv_wr *recv_wr,
 		      const struct ib_recv_wr **bad_recv_wr);
 int bnxt_re_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata);
-void bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int bnxt_re_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 int bnxt_re_poll_cq(struct ib_cq *cq, int num_entries, struct ib_wc *wc);
 int bnxt_re_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 struct ib_mr *bnxt_re_get_dma_mr(struct ib_pd *pd, int mr_access_flags);
diff --git a/drivers/infiniband/hw/cxgb4/cq.c b/drivers/infiniband/hw/cxgb4/cq.c
index 352b8af1998a..28349ed50885 100644
--- a/drivers/infiniband/hw/cxgb4/cq.c
+++ b/drivers/infiniband/hw/cxgb4/cq.c
@@ -967,7 +967,7 @@ int c4iw_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc)
 	return !err || err == -ENODATA ? npolled : err;
 }

-void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct c4iw_cq *chp;
 	struct c4iw_ucontext *ucontext;
@@ -985,6 +985,7 @@ void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 		   ucontext ? &ucontext->uctx : &chp->cq.rdev->uctx,
 		   chp->destroy_skb, chp->wr_waitp);
 	c4iw_put_wr_wait(chp->wr_waitp);
+	return 0;
 }

 int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
diff --git a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
index fa91e80869c0..dc65811e6a93 100644
--- a/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
+++ b/drivers/infiniband/hw/cxgb4/iw_cxgb4.h
@@ -992,7 +992,7 @@ struct ib_mr *c4iw_reg_user_mr(struct ib_pd *pd, u64 start,
 					   struct ib_udata *udata);
 struct ib_mr *c4iw_get_dma_mr(struct ib_pd *pd, int acc);
 int c4iw_dereg_mr(struct ib_mr *ib_mr, struct ib_udata *udata);
-void c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
+int c4iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
 int c4iw_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		   struct ib_udata *udata);
 int c4iw_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
diff --git a/drivers/infiniband/hw/efa/efa.h b/drivers/infiniband/hw/efa/efa.h
index 6b06ce87fbfc..64ae8ba6a7f6 100644
--- a/drivers/infiniband/hw/efa/efa.h
+++ b/drivers/infiniband/hw/efa/efa.h
@@ -139,7 +139,7 @@ int efa_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata);
 struct ib_qp *efa_create_qp(struct ib_pd *ibpd,
 			    struct ib_qp_init_attr *init_attr,
 			    struct ib_udata *udata);
-void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 int efa_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		  struct ib_udata *udata);
 struct ib_mr *efa_reg_mr(struct ib_pd *ibpd, u64 start, u64 length,
diff --git a/drivers/infiniband/hw/efa/efa_verbs.c b/drivers/infiniband/hw/efa/efa_verbs.c
index a03e3514bd8a..57910bcfc572 100644
--- a/drivers/infiniband/hw/efa/efa_verbs.c
+++ b/drivers/infiniband/hw/efa/efa_verbs.c
@@ -973,7 +973,7 @@ static int efa_destroy_cq_idx(struct efa_dev *dev, int cq_idx)
 	return efa_com_destroy_cq(&dev->edev, &params);
 }

-void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+int efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct efa_dev *dev = to_edev(ibcq->device);
 	struct efa_cq *cq = to_ecq(ibcq);
@@ -986,6 +986,7 @@ void efa_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	efa_destroy_cq_idx(dev, cq->cq_idx);
 	efa_free_mapped(dev, cq->cpu_addr, cq->dma_addr, cq->size,
 			DMA_FROM_DEVICE);
+	return 0;
 }

 static int cq_mmap_entries_setup(struct efa_dev *dev, struct efa_cq *cq,
diff --git a/drivers/infiniband/hw/hns/hns_roce_cq.c b/drivers/infiniband/hw/hns/hns_roce_cq.c
index e87d616f7988..c5acf3332519 100644
--- a/drivers/infiniband/hw/hns/hns_roce_cq.c
+++ b/drivers/infiniband/hw/hns/hns_roce_cq.c
@@ -311,7 +311,7 @@ int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr,
 	return ret;
 }

-void hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+int hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ib_cq->device);
 	struct hns_roce_cq *hr_cq = to_hr_cq(ib_cq);
@@ -322,6 +322,7 @@ void hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 	free_cq_buf(hr_dev, hr_cq);
 	free_cq_db(hr_dev, hr_cq, udata);
 	free_cqc(hr_dev, hr_cq);
+	return 0;
 }

 void hns_roce_cq_completion(struct hns_roce_dev *hr_dev, u32 cqn)
diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index ce0bec4a73c2..f8d70c3a5bf8 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -932,7 +932,7 @@ struct hns_roce_hw {
 	int (*poll_cq)(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 	int (*dereg_mr)(struct hns_roce_dev *hr_dev, struct hns_roce_mr *mr,
 			struct ib_udata *udata);
-	void (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata);
+	int (*destroy_cq)(struct ib_cq *ibcq, struct ib_udata *udata);
 	int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period);
 	int (*init_eq)(struct hns_roce_dev *hr_dev);
 	void (*cleanup_eq)(struct hns_roce_dev *hr_dev);
@@ -1252,7 +1252,7 @@ int to_hr_qp_type(int qp_type);
 int hns_roce_create_cq(struct ib_cq *ib_cq, const struct ib_cq_init_attr *attr,
 		       struct ib_udata *udata);

-void hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
+int hns_roce_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata);
 int hns_roce_db_map_user(struct hns_roce_ucontext *context,
 			 struct ib_udata *udata, unsigned long virt,
 			 struct hns_roce_db *db);
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
index 07b4c85d341d..86a4b1421f82 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v1.c
@@ -3572,7 +3572,7 @@ int hns_roce_v1_destroy_qp(struct ib_qp *ibqp, struct ib_udata *udata)
 	return 0;
 }

-static void hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+static int hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibcq->device);
 	struct hns_roce_cq *hr_cq = to_hr_cq(ibcq);
@@ -3603,6 +3603,7 @@ static void hns_roce_v1_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 		}
 		wait_time++;
 	}
+	return 0;
 }

 static void set_eq_cons_index_v1(struct hns_roce_eq *eq, int req_not)
diff --git a/drivers/infiniband/hw/i40iw/i40iw_verbs.c b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
index 360b037f90e7..d863b28a71f0 100644
--- a/drivers/infiniband/hw/i40iw/i40iw_verbs.c
+++ b/drivers/infiniband/hw/i40iw/i40iw_verbs.c
@@ -1053,7 +1053,7 @@ void i40iw_cq_wq_destroy(struct i40iw_device *iwdev, struct i40iw_sc_cq *cq)
  * @ib_cq: cq pointer
  * @udata: user data or NULL for kernel object
  */
-static void i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
+static int i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 {
 	struct i40iw_cq *iwcq;
 	struct i40iw_device *iwdev;
@@ -1065,6 +1065,7 @@ static void i40iw_destroy_cq(struct ib_cq *ib_cq, struct ib_udata *udata)
 	i40iw_cq_wq_destroy(iwdev, cq);
 	cq_free_resources(iwdev, iwcq);
 	i40iw_rem_devusecount(iwdev);
+	return 0;
 }

 /**
diff --git a/drivers/infiniband/hw/mlx4/cq.c b/drivers/infiniband/hw/mlx4/cq.c
index f8b936b76dcd..3851316407ce 100644
--- a/drivers/infiniband/hw/mlx4/cq.c
+++ b/drivers/infiniband/hw/mlx4/cq.c
@@ -475,7 +475,7 @@ int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata)
 	return err;
 }

-void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(cq->device);
 	struct mlx4_ib_cq *mcq = to_mcq(cq);
@@ -495,6 +495,7 @@ void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 		mlx4_db_free(dev->dev, &mcq->db);
 	}
 	ib_umem_release(mcq->umem);
+	return 0;
 }

 static void dump_cqe(void *cqe)
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 392a5a7c2a31..32a024f765ea 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -742,7 +742,7 @@ int mlx4_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
 int mlx4_ib_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *udata);
 int mlx4_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata);
-void mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int mlx4_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 int mlx4_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int mlx4_ib_arm_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 void __mlx4_ib_cq_clean(struct mlx4_ib_cq *cq, u32 qpn, struct mlx4_ib_srq *srq);
diff --git a/drivers/infiniband/hw/mlx5/cq.c b/drivers/infiniband/hw/mlx5/cq.c
index 2bac246b9fef..cb418fdcc9f8 100644
--- a/drivers/infiniband/hw/mlx5/cq.c
+++ b/drivers/infiniband/hw/mlx5/cq.c
@@ -1024,16 +1024,22 @@ int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	return err;
 }

-void mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(cq->device);
 	struct mlx5_ib_cq *mcq = to_mcq(cq);
+	int ret;

-	mlx5_core_destroy_cq(dev->mdev, &mcq->mcq);
-	if (udata)
+	ret = mlx5_core_destroy_cq(dev->mdev, &mcq->mcq);
+	if (ret && udata)
+		return ret;
+
+	if (udata) {
 		destroy_cq_user(mcq, udata);
-	else
-		destroy_cq_kernel(dev, mcq);
+		return 0;
+	}
+	destroy_cq_kernel(dev, mcq);
+	return ret;
 }

 static int is_equal_rsn(struct mlx5_cqe64 *cqe64, u32 rsn)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index b7b00e9e180b..0a65f7ba40c4 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1151,7 +1151,7 @@ int mlx5_ib_read_wqe_srq(struct mlx5_ib_srq *srq, int wqe_index, void *buffer,
 			 size_t buflen, size_t *bc);
 int mlx5_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		      struct ib_udata *udata);
-void mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int mlx5_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 int mlx5_ib_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int mlx5_ib_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 int mlx5_ib_modify_cq(struct ib_cq *cq, u16 cq_count, u16 cq_period);
diff --git a/drivers/infiniband/hw/mthca/mthca_provider.c b/drivers/infiniband/hw/mthca/mthca_provider.c
index 5d1e17214f0c..4624b975fee2 100644
--- a/drivers/infiniband/hw/mthca/mthca_provider.c
+++ b/drivers/infiniband/hw/mthca/mthca_provider.c
@@ -792,7 +792,7 @@ static int mthca_resize_cq(struct ib_cq *ibcq, int entries, struct ib_udata *uda
 	return ret;
 }

-static void mthca_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+static int mthca_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	if (udata) {
 		struct mthca_ucontext *context =
@@ -811,6 +811,7 @@ static void mthca_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 				    to_mcq(cq)->set_ci_db_index);
 	}
 	mthca_free_cq(to_mdev(cq->device), to_mcq(cq));
+	return 0;
 }

 static inline u32 convert_access(int acc)
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
index 220bb09d6431..6f15dbf39939 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.c
@@ -1057,7 +1057,7 @@ static void ocrdma_flush_cq(struct ocrdma_cq *cq)
 	spin_unlock_irqrestore(&cq->cq_lock, flags);
 }

-void ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+int ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct ocrdma_cq *cq = get_ocrdma_cq(ibcq);
 	struct ocrdma_eq *eq = NULL;
@@ -1082,6 +1082,7 @@ void ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 				ocrdma_get_db_addr(dev, pdid),
 				dev->nic_info.db_page_size);
 	}
+	return 0;
 }

 static int ocrdma_add_qpn_map(struct ocrdma_dev *dev, struct ocrdma_qp *qp)
diff --git a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
index 4f6806f16e61..425d554e7f3f 100644
--- a/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
+++ b/drivers/infiniband/hw/ocrdma/ocrdma_verbs.h
@@ -72,7 +72,7 @@ int ocrdma_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 int ocrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		     struct ib_udata *udata);
 int ocrdma_resize_cq(struct ib_cq *, int cqe, struct ib_udata *);
-void ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int ocrdma_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);

 struct ib_qp *ocrdma_create_qp(struct ib_pd *,
 			       struct ib_qp_init_attr *attrs,
diff --git a/drivers/infiniband/hw/qedr/verbs.c b/drivers/infiniband/hw/qedr/verbs.c
index d75300d7df95..e3640064a401 100644
--- a/drivers/infiniband/hw/qedr/verbs.c
+++ b/drivers/infiniband/hw/qedr/verbs.c
@@ -1052,7 +1052,7 @@ int qedr_resize_cq(struct ib_cq *ibcq, int new_cnt, struct ib_udata *udata)
 #define QEDR_DESTROY_CQ_MAX_ITERATIONS		(10)
 #define QEDR_DESTROY_CQ_ITER_DURATION		(10)

-void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+int qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct qedr_dev *dev = get_qedr_dev(ibcq->device);
 	struct qed_rdma_destroy_cq_out_params oparams;
@@ -1067,7 +1067,7 @@ void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	/* GSIs CQs are handled by driver, so they don't exist in the FW */
 	if (cq->cq_type == QEDR_CQ_TYPE_GSI) {
 		qedr_db_recovery_del(dev, cq->db_addr, &cq->db.data);
-		return;
+		return 0;
 	}

 	iparams.icid = cq->icid;
@@ -1115,6 +1115,7 @@ void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 	 * Since the destroy CQ ramrod has also been received on the EQ we can
 	 * be certain that there's no event handler in process.
 	 */
+	return 0;
 }

 static inline int get_gid_info_from_table(struct ib_qp *ibqp,
diff --git a/drivers/infiniband/hw/qedr/verbs.h b/drivers/infiniband/hw/qedr/verbs.h
index a78b206d8b5a..4620fba34d5f 100644
--- a/drivers/infiniband/hw/qedr/verbs.h
+++ b/drivers/infiniband/hw/qedr/verbs.h
@@ -52,7 +52,7 @@ int qedr_dealloc_pd(struct ib_pd *pd, struct ib_udata *udata);
 int qedr_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		   struct ib_udata *udata);
 int qedr_resize_cq(struct ib_cq *, int cqe, struct ib_udata *);
-void qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int qedr_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 int qedr_arm_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags flags);
 struct ib_qp *qedr_create_qp(struct ib_pd *, struct ib_qp_init_attr *attrs,
 			     struct ib_udata *);
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
index 8af3212101be..9e961f8ffa10 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.c
@@ -586,9 +586,9 @@ int usnic_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	return 0;
 }

-void usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
-	return;
+	return 0;
 }

 struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length,
diff --git a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
index f8911c0330e2..11fe1ba6bbc9 100644
--- a/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
+++ b/drivers/infiniband/hw/usnic/usnic_ib_verbs.h
@@ -58,7 +58,7 @@ int usnic_ib_modify_qp(struct ib_qp *ibqp, struct ib_qp_attr *attr,
 				int attr_mask, struct ib_udata *udata);
 int usnic_ib_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		       struct ib_udata *udata);
-void usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int usnic_ib_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 struct ib_mr *usnic_ib_reg_mr(struct ib_pd *pd, u64 start, u64 length,
 				u64 virt_addr, int access_flags,
 				struct ib_udata *udata);
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
index 01cd122a8b69..32aede5a3381 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_cq.c
@@ -235,7 +235,7 @@ static void pvrdma_free_cq(struct pvrdma_dev *dev, struct pvrdma_cq *cq)
  * @cq: the completion queue to destroy.
  * @udata: user data or null for kernel object
  */
-void pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
+int pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)
 {
 	struct pvrdma_cq *vcq = to_vcq(cq);
 	union pvrdma_cmd_req req;
@@ -261,6 +261,7 @@ void pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata)

 	pvrdma_free_cq(dev, vcq);
 	atomic_dec(&dev->num_cqs);
+	return 0;
 }

 static inline struct pvrdma_cqe *get_cqe(struct pvrdma_cq *cq, int i)
diff --git a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
index f9edce71b79b..97ed8f952f6e 100644
--- a/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
+++ b/drivers/infiniband/hw/vmw_pvrdma/pvrdma_verbs.h
@@ -411,7 +411,7 @@ int pvrdma_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg,
 		     int sg_nents, unsigned int *sg_offset);
 int pvrdma_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		     struct ib_udata *udata);
-void pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
+int pvrdma_destroy_cq(struct ib_cq *cq, struct ib_udata *udata);
 int pvrdma_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *wc);
 int pvrdma_req_notify_cq(struct ib_cq *cq, enum ib_cq_notify_flags flags);
 int pvrdma_create_ah(struct ib_ah *ah, struct rdma_ah_init_attr *init_attr,
diff --git a/drivers/infiniband/sw/rdmavt/cq.c b/drivers/infiniband/sw/rdmavt/cq.c
index 04d2e72017fe..19248be14093 100644
--- a/drivers/infiniband/sw/rdmavt/cq.c
+++ b/drivers/infiniband/sw/rdmavt/cq.c
@@ -315,7 +315,7 @@ int rvt_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
  *
  * Called by ib_destroy_cq() in the generic verbs code.
  */
-void rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+int rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct rvt_cq *cq = ibcq_to_rvtcq(ibcq);
 	struct rvt_dev_info *rdi = cq->rdi;
@@ -328,6 +328,7 @@ void rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 		kref_put(&cq->ip->ref, rvt_release_mmap_info);
 	else
 		vfree(cq->kqueue);
+	return 0;
 }

 /**
diff --git a/drivers/infiniband/sw/rdmavt/cq.h b/drivers/infiniband/sw/rdmavt/cq.h
index 5e26a2eb19a4..feb01e7ee004 100644
--- a/drivers/infiniband/sw/rdmavt/cq.h
+++ b/drivers/infiniband/sw/rdmavt/cq.h
@@ -53,7 +53,7 @@

 int rvt_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 		  struct ib_udata *udata);
-void rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
+int rvt_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata);
 int rvt_req_notify_cq(struct ib_cq *ibcq, enum ib_cq_notify_flags notify_flags);
 int rvt_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata);
 int rvt_poll_cq(struct ib_cq *ibcq, int num_entries, struct ib_wc *entry);
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 7303edbc293b..576aa60fdd27 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -806,13 +806,14 @@ static int rxe_create_cq(struct ib_cq *ibcq, const struct ib_cq_init_attr *attr,
 	return rxe_add_to_pool(&rxe->cq_pool, &cq->pelem);
 }

-static void rxe_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
+static int rxe_destroy_cq(struct ib_cq *ibcq, struct ib_udata *udata)
 {
 	struct rxe_cq *cq = to_rcq(ibcq);

 	rxe_cq_disable(cq);

 	rxe_drop_ref(cq);
+	return 0;
 }

 static int rxe_resize_cq(struct ib_cq *ibcq, int cqe, struct ib_udata *udata)
diff --git a/drivers/infiniband/sw/siw/siw_verbs.c b/drivers/infiniband/sw/siw/siw_verbs.c
index a6ec1e968fb4..7cf3242ffb41 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.c
+++ b/drivers/infiniband/sw/siw/siw_verbs.c
@@ -1056,7 +1056,7 @@ int siw_post_receive(struct ib_qp *base_qp, const struct ib_recv_wr *wr,
 	return rv > 0 ? 0 : rv;
 }

-void siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata)
+int siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata)
 {
 	struct siw_cq *cq = to_siw_cq(base_cq);
 	struct siw_device *sdev = to_siw_dev(base_cq->device);
@@ -1074,6 +1074,7 @@ void siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata)
 	atomic_dec(&sdev->num_cq);

 	vfree(cq->queue);
+	return 0;
 }

 /*
diff --git a/drivers/infiniband/sw/siw/siw_verbs.h b/drivers/infiniband/sw/siw/siw_verbs.h
index ed2d8ac2f967..637454529357 100644
--- a/drivers/infiniband/sw/siw/siw_verbs.h
+++ b/drivers/infiniband/sw/siw/siw_verbs.h
@@ -62,7 +62,7 @@ int siw_post_send(struct ib_qp *base_qp, const struct ib_send_wr *wr,
 		  const struct ib_send_wr **bad_wr);
 int siw_post_receive(struct ib_qp *base_qp, const struct ib_recv_wr *wr,
 		     const struct ib_recv_wr **bad_wr);
-void siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata);
+int siw_destroy_cq(struct ib_cq *base_cq, struct ib_udata *udata);
 int siw_poll_cq(struct ib_cq *base_cq, int num_entries, struct ib_wc *wc);
 int siw_req_notify_cq(struct ib_cq *base_cq, enum ib_cq_notify_flags flags);
 struct ib_mr *siw_reg_user_mr(struct ib_pd *base_pd, u64 start, u64 len,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index a3eb363674df..335189bddbef 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2429,7 +2429,7 @@ struct ib_device_ops {
 	int (*create_cq)(struct ib_cq *cq, const struct ib_cq_init_attr *attr,
 			 struct ib_udata *udata);
 	int (*modify_cq)(struct ib_cq *cq, u16 cq_count, u16 cq_period);
-	void (*destroy_cq)(struct ib_cq *cq, struct ib_udata *udata);
+	int (*destroy_cq)(struct ib_cq *cq, struct ib_udata *udata);
 	int (*resize_cq)(struct ib_cq *cq, int cqe, struct ib_udata *udata);
 	struct ib_mr *(*get_dma_mr)(struct ib_pd *pd, int mr_access_flags);
 	struct ib_mr *(*reg_user_mr)(struct ib_pd *pd, u64 start, u64 length,
@@ -3896,7 +3896,9 @@ int ib_destroy_cq_user(struct ib_cq *cq, struct ib_udata *udata);
  */
 static inline void ib_destroy_cq(struct ib_cq *cq)
 {
-	ib_destroy_cq_user(cq, NULL);
+	int ret = ib_destroy_cq_user(cq, NULL);
+
+	WARN_ONCE(ret, "Destroy of kernel CQ shouldn't fail");
 }

 /**
--
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 08/10] RDMA: Change XRCD destroy return value
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (6 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 07/10] RDMA: Allow fail of destroy CQ Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 09/10] RDMA: Restore ability to return error for destroy WQ Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 10/10] RDMA: Make counters destroy symmetrical Leon Romanovsky
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma, Yishai Hadas

From: Leon Romanovsky <leonro@mellanox.com>

Update XRCD destroy flow to allow command failure.

Fixes: 28ad5f65c314 ("RDMA: Move XRCD to be under ib_core responsibility")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/verbs.c      | 8 ++++++--
 drivers/infiniband/hw/mlx4/main.c    | 3 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h | 2 +-
 drivers/infiniband/hw/mlx5/qp.c      | 4 ++--
 include/rdma/ib_verbs.h              | 2 +-
 5 files changed, 12 insertions(+), 7 deletions(-)

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 80154be9390a..0dec1c8d6744 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2345,13 +2345,17 @@ EXPORT_SYMBOL(ib_alloc_xrcd_user);
  */
 int ib_dealloc_xrcd_user(struct ib_xrcd *xrcd, struct ib_udata *udata)
 {
+	int ret;
+
 	if (atomic_read(&xrcd->usecnt))
 		return -EBUSY;
 
 	WARN_ON(!xa_empty(&xrcd->tgt_qps));
-	xrcd->device->ops.dealloc_xrcd(xrcd, udata);
+	ret = xrcd->device->ops.dealloc_xrcd(xrcd, udata);
+	if (ret && udata)
+		return ret;
 	kfree(xrcd);
-	return 0;
+	return ret;
 }
 EXPORT_SYMBOL(ib_dealloc_xrcd_user);
 
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 1be0108db992..8e1e3b8a5a0d 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -1256,11 +1256,12 @@ static int mlx4_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
 	return err;
 }
 
-static void mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
+static int mlx4_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
 {
 	ib_destroy_cq(to_mxrcd(xrcd)->cq);
 	ib_dealloc_pd(to_mxrcd(xrcd)->pd);
 	mlx4_xrcd_free(to_mdev(xrcd->device)->dev, to_mxrcd(xrcd)->xrcdn);
+	return 0;
 }
 
 static int add_gid_entry(struct ib_qp *ibqp, union ib_gid *gid)
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 0a65f7ba40c4..041f9d1d696b 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1196,7 +1196,7 @@ int mlx5_ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 			const struct ib_mad *in, struct ib_mad *out,
 			size_t *out_mad_size, u16 *out_mad_pkey_index);
 int mlx5_ib_alloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata);
-void mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata);
+int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata);
 int mlx5_ib_get_buf_offset(u64 addr, int page_shift, u32 *offset);
 int mlx5_query_ext_port_caps(struct mlx5_ib_dev *dev, u8 port);
 int mlx5_query_mad_ifc_smp_attr_node_info(struct ib_device *ibdev,
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 98abb8606f0e..fea837cb2ec4 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4745,12 +4745,12 @@ int mlx5_ib_alloc_xrcd(struct ib_xrcd *ibxrcd, struct ib_udata *udata)
 	return mlx5_cmd_xrcd_alloc(dev->mdev, &xrcd->xrcdn, 0);
 }
 
-void mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
+int mlx5_ib_dealloc_xrcd(struct ib_xrcd *xrcd, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(xrcd->device);
 	u32 xrcdn = to_mxrcd(xrcd)->xrcdn;
 
-	mlx5_cmd_xrcd_dealloc(dev->mdev, xrcdn, 0);
+	return mlx5_cmd_xrcd_dealloc(dev->mdev, xrcdn, 0);
 }
 
 static void mlx5_ib_wq_event(struct mlx5_core_qp *core_qp, int type)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 335189bddbef..1161e18d948a 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2458,7 +2458,7 @@ struct ib_device_ops {
 	int (*attach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	int (*detach_mcast)(struct ib_qp *qp, union ib_gid *gid, u16 lid);
 	int (*alloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata);
-	void (*dealloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata);
+	int (*dealloc_xrcd)(struct ib_xrcd *xrcd, struct ib_udata *udata);
 	struct ib_flow *(*create_flow)(struct ib_qp *qp,
 				       struct ib_flow_attr *flow_attr,
 				       struct ib_udata *udata);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 09/10] RDMA: Restore ability to return error for destroy WQ
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (7 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 08/10] RDMA: Change XRCD destroy return value Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  2020-08-30  8:40 ` [PATCH rdma-next v1 10/10] RDMA: Make counters destroy symmetrical Leon Romanovsky
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe
  Cc: Leon Romanovsky, linux-rdma, Yishai Hadas, Yuval Shaia

From: Leon Romanovsky <leonro@mellanox.com>

Make this interface symmetrical to other destroy paths.

Fixes: a49b1dc7ae44 ("RDMA: Convert destroy_wq to be void")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_std_types_wq.c |  2 +-
 drivers/infiniband/core/verbs.c               | 15 +++++++++------
 drivers/infiniband/hw/mlx4/mlx4_ib.h          |  2 +-
 drivers/infiniband/hw/mlx4/qp.c               |  3 ++-
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |  2 +-
 drivers/infiniband/hw/mlx5/qp.c               |  8 ++++++--
 drivers/infiniband/hw/mlx5/qp.h               |  4 ++--
 drivers/infiniband/hw/mlx5/qpc.c              |  5 +++--
 include/rdma/ib_verbs.h                       |  4 ++--
 9 files changed, 27 insertions(+), 18 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_std_types_wq.c b/drivers/infiniband/core/uverbs_std_types_wq.c
index cad842ede077..f2e6a625724a 100644
--- a/drivers/infiniband/core/uverbs_std_types_wq.c
+++ b/drivers/infiniband/core/uverbs_std_types_wq.c
@@ -16,7 +16,7 @@ static int uverbs_free_wq(struct ib_uobject *uobject,
 		container_of(uobject, struct ib_uwq_object, uevent.uobject);
 	int ret;
 
-	ret = ib_destroy_wq(wq, &attrs->driver_udata);
+	ret = ib_destroy_wq_user(wq, &attrs->driver_udata);
 	if (ib_is_destroy_retryable(ret, why, uobject))
 		return ret;
 
diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 0dec1c8d6744..bab99f326cce 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -2399,25 +2399,28 @@ struct ib_wq *ib_create_wq(struct ib_pd *pd,
 EXPORT_SYMBOL(ib_create_wq);
 
 /**
- * ib_destroy_wq - Destroys the specified user WQ.
+ * ib_destroy_wq_user - Destroys the specified user WQ.
  * @wq: The WQ to destroy.
  * @udata: Valid user data
  */
-int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
+int ib_destroy_wq_user(struct ib_wq *wq, struct ib_udata *udata)
 {
 	struct ib_cq *cq = wq->cq;
 	struct ib_pd *pd = wq->pd;
+	int ret;
 
 	if (atomic_read(&wq->usecnt))
 		return -EBUSY;
 
-	wq->device->ops.destroy_wq(wq, udata);
+	ret = wq->device->ops.destroy_wq(wq, udata);
+	if (ret && udata)
+		return ret;
+
 	atomic_dec(&pd->usecnt);
 	atomic_dec(&cq->usecnt);
-
-	return 0;
+	return ret;
 }
-EXPORT_SYMBOL(ib_destroy_wq);
+EXPORT_SYMBOL(ib_destroy_wq_user);
 
 /**
  * ib_modify_wq - Modifies the specified WQ.
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 32a024f765ea..8f5467c2309a 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -899,7 +899,7 @@ void mlx4_ib_sl2vl_update(struct mlx4_ib_dev *mdev, int port);
 struct ib_wq *mlx4_ib_create_wq(struct ib_pd *pd,
 				struct ib_wq_init_attr *init_attr,
 				struct ib_udata *udata);
-void mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
+int mlx4_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
 int mlx4_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      u32 wq_attr_mask, struct ib_udata *udata);
 
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index f9ca6e000a81..1db1e25a436b 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -4327,7 +4327,7 @@ int mlx4_ib_modify_wq(struct ib_wq *ibwq, struct ib_wq_attr *wq_attr,
 	return err;
 }
 
-void mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
+int mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
 {
 	struct mlx4_ib_dev *dev = to_mdev(ibwq->device);
 	struct mlx4_ib_qp *qp = to_mqp((struct ib_qp *)ibwq);
@@ -4338,6 +4338,7 @@ void mlx4_ib_destroy_wq(struct ib_wq *ibwq, struct ib_udata *udata)
 	destroy_qp_common(dev, qp, MLX4_IB_RWQ_SRC, udata);
 
 	kfree(qp);
+	return 0;
 }
 
 struct ib_rwq_ind_table
diff --git a/drivers/infiniband/hw/mlx5/mlx5_ib.h b/drivers/infiniband/hw/mlx5/mlx5_ib.h
index 041f9d1d696b..0a3681463a62 100644
--- a/drivers/infiniband/hw/mlx5/mlx5_ib.h
+++ b/drivers/infiniband/hw/mlx5/mlx5_ib.h
@@ -1241,7 +1241,7 @@ int mlx5_ib_check_mr_status(struct ib_mr *ibmr, u32 check_mask,
 struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
 				struct ib_wq_init_attr *init_attr,
 				struct ib_udata *udata);
-void mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
+int mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
 int mlx5_ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *wq_attr,
 		      u32 wq_attr_mask, struct ib_udata *udata);
 struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index fea837cb2ec4..74d3f4c39dc5 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -5085,14 +5085,18 @@ struct ib_wq *mlx5_ib_create_wq(struct ib_pd *pd,
 	return ERR_PTR(err);
 }
 
-void mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
+int mlx5_ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata)
 {
 	struct mlx5_ib_dev *dev = to_mdev(wq->device);
 	struct mlx5_ib_rwq *rwq = to_mrwq(wq);
+	int ret;
 
-	mlx5_core_destroy_rq_tracked(dev, &rwq->core_qp);
+	ret = mlx5_core_destroy_rq_tracked(dev, &rwq->core_qp);
+	if (ret && udata)
+		return ret;
 	destroy_user_rq(dev, wq->pd, rwq, udata);
 	kfree(rwq);
+	return ret;
 }
 
 struct ib_rwq_ind_table *mlx5_ib_create_rwq_ind_table(struct ib_device *device,
diff --git a/drivers/infiniband/hw/mlx5/qp.h b/drivers/infiniband/hw/mlx5/qp.h
index ba899df44c5b..5d4e140db99c 100644
--- a/drivers/infiniband/hw/mlx5/qp.h
+++ b/drivers/infiniband/hw/mlx5/qp.h
@@ -26,8 +26,8 @@ int mlx5_core_dct_query(struct mlx5_ib_dev *dev, struct mlx5_core_dct *dct,
 
 int mlx5_core_set_delay_drop(struct mlx5_ib_dev *dev, u32 timeout_usec);
 
-void mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
-				  struct mlx5_core_qp *rq);
+int mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
+				 struct mlx5_core_qp *rq);
 int mlx5_core_create_sq_tracked(struct mlx5_ib_dev *dev, u32 *in, int inlen,
 				struct mlx5_core_qp *sq);
 void mlx5_core_destroy_sq_tracked(struct mlx5_ib_dev *dev,
diff --git a/drivers/infiniband/hw/mlx5/qpc.c b/drivers/infiniband/hw/mlx5/qpc.c
index 7c3968ef9cd1..c683d7000168 100644
--- a/drivers/infiniband/hw/mlx5/qpc.c
+++ b/drivers/infiniband/hw/mlx5/qpc.c
@@ -576,11 +576,12 @@ int mlx5_core_create_rq_tracked(struct mlx5_ib_dev *dev, u32 *in, int inlen,
 	return err;
 }
 
-void mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
-				  struct mlx5_core_qp *rq)
+int mlx5_core_destroy_rq_tracked(struct mlx5_ib_dev *dev,
+				 struct mlx5_core_qp *rq)
 {
 	destroy_resource_common(dev, rq);
 	destroy_rq_tracked(dev, rq->qpn, rq->uid);
+	return 0;
 }
 
 static void destroy_sq_tracked(struct mlx5_ib_dev *dev, u32 sqn, u16 uid)
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 1161e18d948a..58fdaaf4f67b 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2486,7 +2486,7 @@ struct ib_device_ops {
 	struct ib_wq *(*create_wq)(struct ib_pd *pd,
 				   struct ib_wq_init_attr *init_attr,
 				   struct ib_udata *udata);
-	void (*destroy_wq)(struct ib_wq *wq, struct ib_udata *udata);
+	int (*destroy_wq)(struct ib_wq *wq, struct ib_udata *udata);
 	int (*modify_wq)(struct ib_wq *wq, struct ib_wq_attr *attr,
 			 u32 wq_attr_mask, struct ib_udata *udata);
 	struct ib_rwq_ind_table *(*create_rwq_ind_table)(
@@ -4322,7 +4322,7 @@ struct net_device *ib_device_netdev(struct ib_device *dev, u8 port);
 
 struct ib_wq *ib_create_wq(struct ib_pd *pd,
 			   struct ib_wq_init_attr *init_attr);
-int ib_destroy_wq(struct ib_wq *wq, struct ib_udata *udata);
+int ib_destroy_wq_user(struct ib_wq *wq, struct ib_udata *udata);
 int ib_modify_wq(struct ib_wq *wq, struct ib_wq_attr *attr,
 		 u32 wq_attr_mask);
 int ib_destroy_rwq_ind_table(struct ib_rwq_ind_table *wq_ind_table);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH rdma-next v1 10/10] RDMA: Make counters destroy symmetrical
  2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
                   ` (8 preceding siblings ...)
  2020-08-30  8:40 ` [PATCH rdma-next v1 09/10] RDMA: Restore ability to return error for destroy WQ Leon Romanovsky
@ 2020-08-30  8:40 ` Leon Romanovsky
  9 siblings, 0 replies; 25+ messages in thread
From: Leon Romanovsky @ 2020-08-30  8:40 UTC (permalink / raw)
  To: Doug Ledford, Jason Gunthorpe; +Cc: Leon Romanovsky, linux-rdma

From: Leon Romanovsky <leonro@mellanox.com>

Change counters to return failure like any other verbs
destroy, however this flow shouldn't return error at all.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
---
 drivers/infiniband/core/uverbs_std_types_counters.c | 4 +++-
 drivers/infiniband/hw/mlx5/counters.c               | 3 ++-
 include/rdma/ib_verbs.h                             | 2 +-
 3 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/drivers/infiniband/core/uverbs_std_types_counters.c b/drivers/infiniband/core/uverbs_std_types_counters.c
index c7e7438752bc..b3c6c066b601 100644
--- a/drivers/infiniband/core/uverbs_std_types_counters.c
+++ b/drivers/infiniband/core/uverbs_std_types_counters.c
@@ -46,7 +46,9 @@ static int uverbs_free_counters(struct ib_uobject *uobject,
 	if (ret)
 		return ret;
 
-	counters->device->ops.destroy_counters(counters);
+	ret = counters->device->ops.destroy_counters(counters);
+	if (ret)
+		return ret;
 	kfree(counters);
 	return 0;
 }
diff --git a/drivers/infiniband/hw/mlx5/counters.c b/drivers/infiniband/hw/mlx5/counters.c
index 145f3cb40ccb..8d77fea0eb48 100644
--- a/drivers/infiniband/hw/mlx5/counters.c
+++ b/drivers/infiniband/hw/mlx5/counters.c
@@ -117,7 +117,7 @@ static int mlx5_ib_read_counters(struct ib_counters *counters,
 	return ret;
 }
 
-static void mlx5_ib_destroy_counters(struct ib_counters *counters)
+static int mlx5_ib_destroy_counters(struct ib_counters *counters)
 {
 	struct mlx5_ib_mcounters *mcounters = to_mcounters(counters);
 
@@ -125,6 +125,7 @@ static void mlx5_ib_destroy_counters(struct ib_counters *counters)
 	if (mcounters->hw_cntrs_hndl)
 		mlx5_fc_destroy(to_mdev(counters->device)->mdev,
 				mcounters->hw_cntrs_hndl);
+	return 0;
 }
 
 static int mlx5_ib_create_counters(struct ib_counters *counters,
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 58fdaaf4f67b..25c5180f5a79 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -2504,7 +2504,7 @@ struct ib_device_ops {
 				   struct uverbs_attr_bundle *attrs);
 	int (*create_counters)(struct ib_counters *counters,
 			       struct uverbs_attr_bundle *attrs);
-	void (*destroy_counters)(struct ib_counters *counters);
+	int (*destroy_counters)(struct ib_counters *counters);
 	int (*read_counters)(struct ib_counters *counters,
 			     struct ib_counters_read_attr *counters_read_attr,
 			     struct uverbs_attr_bundle *attrs);
-- 
2.26.2


^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-08-30  8:40 ` [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
@ 2020-09-03  0:08   ` Jason Gunthorpe
  2020-09-03  5:11     ` Leon Romanovsky
  2020-09-03  0:18   ` Jason Gunthorpe
  1 sibling, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03  0:08 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, Adit Ranadive, Ariel Elior,
	Bernard Metzler, Dennis Dalessandro, Devesh Sharma, Lijun Ou,
	linux-rdma, Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
> -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
>  {
>  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
>  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> +	int ret;
> +
> +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> +	if (ret && udata)
> +		return ret;
>  
> -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> -
> -	if (srq->uobject) {
> -		mlx5_ib_db_unmap_user(
> -			rdma_udata_to_drv_context(
> -				udata,
> -				struct mlx5_ib_ucontext,
> -				ibucontext),
> -			&msrq->db);
> -		ib_umem_release(msrq->umem);
> -	} else {
> -		destroy_srq_kernel(dev, msrq);
> +	if (udata) {
> +		destroy_srq_user(srq->pd, msrq, udata);
> +		return 0;
>  	}

It is not a big deal but I would like to remove udata as an argument
to the driver destroy functions, it is completely nonsensical and
never used.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-08-30  8:40 ` [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
  2020-09-03  0:08   ` Jason Gunthorpe
@ 2020-09-03  0:18   ` Jason Gunthorpe
  2020-09-03  5:28     ` Leon Romanovsky
  1 sibling, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03  0:18 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, Adit Ranadive, Ariel Elior,
	Bernard Metzler, Dennis Dalessandro, Devesh Sharma, Lijun Ou,
	linux-rdma, Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
  
> -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
>  {
>  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
>  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> +	int ret;
> +
> +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> +	if (ret && udata)
> +		return ret;
>  
> -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> -
> -	if (srq->uobject) {
> -		mlx5_ib_db_unmap_user(
> -			rdma_udata_to_drv_context(
> -				udata,
> -				struct mlx5_ib_ucontext,
> -				ibucontext),
> -			&msrq->db);
> -		ib_umem_release(msrq->umem);
> -	} else {
> -		destroy_srq_kernel(dev, msrq);
> +	if (udata) {
> +		destroy_srq_user(srq->pd, msrq, udata);
> +		return 0;
>  	}
> +
> +	/* We are cleaning kernel resources anyway */
> +	destroy_srq_kernel(dev, msrq);

Oh, and this isn't right.. If we are going to leak things then we have
to leak anything exposed for DMA as well, eg the fragbuf under the SRQ
has to be leaked.

If the HW can't guarentee it stopped doing DMA then we can't return
memory under potentially active DMA back to the system.

IHMO mlx5, and all the other drivers, get this wrong. Failing to
eventually destroy an object is a catastrophic failure of the
device. In the case of a kernel object it must always be destroyed on
the first attempt.

In this case the device should be killed. Disable memory access at the
PCI config space, trigger a device reset, disassociate the device, and
allow all destroy commands to fake-succeed.

Since drivers need help to get this right, I'm wonder if we should fix
this at the core level by introducing a 'your device is screwed up,
kill it' callback.

Then all the destroys can return failures as Gal wanted.

The core logic would be something like

ret = dev->ops.destroy_foo()
if (is_kernel_object())
   dev->ops.device_is_broken()
   ret = dev->ops.destroy_foo()
   WARN_ON(ret);

Ie after 'device_is_broken' the driver must always succeed future
destroys.

Then we have a chance to make this work properly... mlx5 at least
already has an implementation of 'device_is_broken' that does trigger
success for future destroys.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ
  2020-08-30  8:40 ` [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ Leon Romanovsky
@ 2020-09-03  0:20   ` Jason Gunthorpe
  2020-09-03  5:35     ` Leon Romanovsky
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03  0:20 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Leon Romanovsky, linux-rdma

On Sun, Aug 30, 2020 at 11:40:06AM +0300, Leon Romanovsky wrote:
>  /**
> - * ib_free_cq_user - free a completion queue
> + * ib_free_cq - free a completion queue
>   * @cq:		completion queue to free.
> - * @udata:	User data or NULL for kernel object
>   */
> -void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> +void ib_free_cq(struct ib_cq *cq)
>  {
> -	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> -		return;
> -	if (WARN_ON_ONCE(cq->cqe_used))
> -		return;
> +	WARN_ON_ONCE(atomic_read(&cq->usecnt));

In this case we expect ops.destroy_cq to fail, so no sense in
continuing, leak everything, the ULP is buggy.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry
  2020-08-30  8:40 ` [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry Leon Romanovsky
@ 2020-09-03  0:31   ` Jason Gunthorpe
  2020-09-03  5:08     ` Leon Romanovsky
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03  0:31 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, Leon Romanovsky, linux-rdma

On Sun, Aug 30, 2020 at 11:40:03AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> The HW release can fail and leave the system in limbo state,
> where SRQ is removed from the table, but can't be destroyed later.
> In every reentry, the initial xa_erase_irq() check will fail.
> 
> Rewrite the erase logic to keep index, but don't store the entry
> itself. By doing it, we can safely reinsert entry back in the case
> of destroy failure and be safe from any xa_store_irq() error.
> 
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
>  drivers/infiniband/hw/mlx5/srq_cmd.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c
> index 37aaacebd3f2..c6d807f04d9d 100644
> +++ b/drivers/infiniband/hw/mlx5/srq_cmd.c
> @@ -596,13 +596,22 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
>  	struct mlx5_core_srq *tmp;
>  	int err;
>  
> -	tmp = xa_erase_irq(&table->array, srq->srqn);
> -	if (!tmp || tmp != srq)
> +	/* Delete entry, but leave index occupied */
> +	tmp = xa_store_irq(&table->array, srq->srqn, NULL, 0);
> +	if (WARN_ON(!tmp || tmp != srq))
>  		return;

This isn't an allocating xarray:

	xa_init_flags(&table->array, XA_FLAGS_LOCK_IRQ);

So storing NULL actually does delete the entry and clean up the memory
and the store below could fail.

I think this should be written as

   xa_cmpxchg_irq(&table->array, srq->srqn, srq, XA_ZERO_ENTRY, 0);

And the undo below would be

   xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, srq 0);

> +	xa_erase_irq(&table->array, srq->srqn);

And this is racy since the FW could have reallocated the same srqn and
already set it in the xarray.

It needs to be xa_release_irq(), which looks like it needs to be
added to match xa_reserve_irq()

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry
  2020-09-03  0:31   ` Jason Gunthorpe
@ 2020-09-03  5:08     ` Leon Romanovsky
  2020-09-03 11:54       ` Jason Gunthorpe
  0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-09-03  5:08 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Wed, Sep 02, 2020 at 09:31:15PM -0300, Jason Gunthorpe wrote:
> On Sun, Aug 30, 2020 at 11:40:03AM +0300, Leon Romanovsky wrote:
> > From: Leon Romanovsky <leonro@mellanox.com>
> >
> > The HW release can fail and leave the system in limbo state,
> > where SRQ is removed from the table, but can't be destroyed later.
> > In every reentry, the initial xa_erase_irq() check will fail.
> >
> > Rewrite the erase logic to keep index, but don't store the entry
> > itself. By doing it, we can safely reinsert entry back in the case
> > of destroy failure and be safe from any xa_store_irq() error.
> >
> > Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> >  drivers/infiniband/hw/mlx5/srq_cmd.c | 15 ++++++++++++---
> >  1 file changed, 12 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c
> > index 37aaacebd3f2..c6d807f04d9d 100644
> > +++ b/drivers/infiniband/hw/mlx5/srq_cmd.c
> > @@ -596,13 +596,22 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq)
> >  	struct mlx5_core_srq *tmp;
> >  	int err;
> >
> > -	tmp = xa_erase_irq(&table->array, srq->srqn);
> > -	if (!tmp || tmp != srq)
> > +	/* Delete entry, but leave index occupied */
> > +	tmp = xa_store_irq(&table->array, srq->srqn, NULL, 0);
> > +	if (WARN_ON(!tmp || tmp != srq))
> >  		return;
>
> This isn't an allocating xarray:
>
> 	xa_init_flags(&table->array, XA_FLAGS_LOCK_IRQ);
>
> So storing NULL actually does delete the entry and clean up the memory
> and the store below could fail.
>
> I think this should be written as
>
>    xa_cmpxchg_irq(&table->array, srq->srqn, srq, XA_ZERO_ENTRY, 0);
>
> And the undo below would be
>
>    xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, srq 0);

Thanks, it is better.

>
> > +	xa_erase_irq(&table->array, srq->srqn);
>
> And this is racy since the FW could have reallocated the same srqn and
> already set it in the xarray.

Not in mlx5 devices, srqn is allocated in round-robin order.

>
> It needs to be xa_release_irq(), which looks like it needs to be
> added to match xa_reserve_irq()

We can't call to xa_lock_irq*() while issuing FW commands.

Thanks

>
> Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-09-03  0:08   ` Jason Gunthorpe
@ 2020-09-03  5:11     ` Leon Romanovsky
  2020-09-03 11:55       ` Jason Gunthorpe
  0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-09-03  5:11 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Wed, Sep 02, 2020 at 09:08:50PM -0300, Jason Gunthorpe wrote:
> On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
> > -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> >  {
> >  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
> >  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> > +	int ret;
> > +
> > +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > +	if (ret && udata)
> > +		return ret;
> >
> > -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > -
> > -	if (srq->uobject) {
> > -		mlx5_ib_db_unmap_user(
> > -			rdma_udata_to_drv_context(
> > -				udata,
> > -				struct mlx5_ib_ucontext,
> > -				ibucontext),
> > -			&msrq->db);
> > -		ib_umem_release(msrq->umem);
> > -	} else {
> > -		destroy_srq_kernel(dev, msrq);
> > +	if (udata) {
> > +		destroy_srq_user(srq->pd, msrq, udata);
> > +		return 0;
> >  	}
>
> It is not a big deal but I would like to remove udata as an argument
> to the driver destroy functions, it is completely nonsensical and
> never used.

  197 static void destroy_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
  198                              struct ib_udata *udata)
  199 {
  200         mlx5_ib_db_unmap_user(
  201                 rdma_udata_to_drv_context(
  202                         udata,
                              ^^^^^^ in use

  203                         struct mlx5_ib_ucontext,
  204                         ibucontext),
  205                 &srq->db);
  206         ib_umem_release(srq->umem);
  207 }
  208


>
> Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-09-03  0:18   ` Jason Gunthorpe
@ 2020-09-03  5:28     ` Leon Romanovsky
  2020-09-03 12:22       ` Jason Gunthorpe
  0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-09-03  5:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Doug Ledford, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Wed, Sep 02, 2020 at 09:18:27PM -0300, Jason Gunthorpe wrote:
> On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
>
> > -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> >  {
> >  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
> >  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> > +	int ret;
> > +
> > +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > +	if (ret && udata)
> > +		return ret;
> >
> > -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > -
> > -	if (srq->uobject) {
> > -		mlx5_ib_db_unmap_user(
> > -			rdma_udata_to_drv_context(
> > -				udata,
> > -				struct mlx5_ib_ucontext,
> > -				ibucontext),
> > -			&msrq->db);
> > -		ib_umem_release(msrq->umem);
> > -	} else {
> > -		destroy_srq_kernel(dev, msrq);
> > +	if (udata) {
> > +		destroy_srq_user(srq->pd, msrq, udata);
> > +		return 0;
> >  	}
> > +
> > +	/* We are cleaning kernel resources anyway */
> > +	destroy_srq_kernel(dev, msrq);
>
> Oh, and this isn't right.. If we are going to leak things then we have
> to leak anything exposed for DMA as well, eg the fragbuf under the SRQ
> has to be leaked.

We are leaking for ULPs only, from their perspective everything was
released and WARN_ON() will be the sign of the problem.

>
> If the HW can't guarentee it stopped doing DMA then we can't return
> memory under potentially active DMA back to the system.

ULPs are supposed to guarantee that all operations stopped.

>
> IHMO mlx5, and all the other drivers, get this wrong. Failing to
> eventually destroy an object is a catastrophic failure of the
> device. In the case of a kernel object it must always be destroyed on
> the first attempt.

This is current flow, we are destroying kernel object anyway.

>
> In this case the device should be killed. Disable memory access at the
> PCI config space, trigger a device reset, disassociate the device, and
> allow all destroy commands to fake-succeed.
>
> Since drivers need help to get this right, I'm wonder if we should fix
> this at the core level by introducing a 'your device is screwed up,
> kill it' callback.
>
> Then all the destroys can return failures as Gal wanted.
>
> The core logic would be something like
>
> ret = dev->ops.destroy_foo()
> if (is_kernel_object())
>    dev->ops.device_is_broken()
>    ret = dev->ops.destroy_foo()
>    WARN_ON(ret);
>
> Ie after 'device_is_broken' the driver must always succeed future
> destroys.
>
> Then we have a chance to make this work properly... mlx5 at least
> already has an implementation of 'device_is_broken' that does trigger
> success for future destroys.

I don't know, all those years we relied on the ULPs to do destroy
properly and it worked well. I didn't hear any complain from the field
that HW destroy failed in proper ULP flow.

It looks to me over-engineering.

Thanks

>
> Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ
  2020-09-03  0:20   ` Jason Gunthorpe
@ 2020-09-03  5:35     ` Leon Romanovsky
  2020-09-03 12:24       ` Jason Gunthorpe
  0 siblings, 1 reply; 25+ messages in thread
From: Leon Romanovsky @ 2020-09-03  5:35 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Doug Ledford, linux-rdma

On Wed, Sep 02, 2020 at 09:20:48PM -0300, Jason Gunthorpe wrote:
> On Sun, Aug 30, 2020 at 11:40:06AM +0300, Leon Romanovsky wrote:
> >  /**
> > - * ib_free_cq_user - free a completion queue
> > + * ib_free_cq - free a completion queue
> >   * @cq:		completion queue to free.
> > - * @udata:	User data or NULL for kernel object
> >   */
> > -void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > +void ib_free_cq(struct ib_cq *cq)
> >  {
> > -	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> > -		return;
> > -	if (WARN_ON_ONCE(cq->cqe_used))
> > -		return;
> > +	WARN_ON_ONCE(atomic_read(&cq->usecnt));
>
> In this case we expect ops.destroy_cq to fail, so no sense in
> continuing, leak everything, the ULP is buggy.

I disagree, we should clean as much as possible and leave the HW object.

Thanks

>
> Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry
  2020-09-03  5:08     ` Leon Romanovsky
@ 2020-09-03 11:54       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 11:54 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, linux-rdma

On Thu, Sep 03, 2020 at 08:08:11AM +0300, Leon Romanovsky wrote:

> > It needs to be xa_release_irq(), which looks like it needs to be
> > added to match xa_reserve_irq()
> 
> We can't call to xa_lock_irq*() while issuing FW commands.

It is done in place of the erase_irq above, no lock held while calling
FW

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-09-03  5:11     ` Leon Romanovsky
@ 2020-09-03 11:55       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 11:55 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Thu, Sep 03, 2020 at 08:11:12AM +0300, Leon Romanovsky wrote:

> > It is not a big deal but I would like to remove udata as an argument
> > to the driver destroy functions, it is completely nonsensical and
> > never used.
> 
>   197 static void destroy_srq_user(struct ib_pd *pd, struct mlx5_ib_srq *srq,
>   198                              struct ib_udata *udata)
>   199 {
>   200         mlx5_ib_db_unmap_user(
>   201                 rdma_udata_to_drv_context(
>   202                         udata,
>                               ^^^^^^ in use
> 
>   203                         struct mlx5_ib_ucontext,
>   204                         ibucontext),
>   205                 &srq->db);

The ucontext should come from pd->uobject->context

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-09-03  5:28     ` Leon Romanovsky
@ 2020-09-03 12:22       ` Jason Gunthorpe
  2020-09-03 13:12         ` Jason Gunthorpe
  0 siblings, 1 reply; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 12:22 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Thu, Sep 03, 2020 at 08:28:26AM +0300, Leon Romanovsky wrote:
> On Wed, Sep 02, 2020 at 09:18:27PM -0300, Jason Gunthorpe wrote:
> > On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
> >
> > > -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > > +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > >  {
> > >  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
> > >  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> > > +	int ret;
> > > +
> > > +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > > +	if (ret && udata)
> > > +		return ret;
> > >
> > > -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > > -
> > > -	if (srq->uobject) {
> > > -		mlx5_ib_db_unmap_user(
> > > -			rdma_udata_to_drv_context(
> > > -				udata,
> > > -				struct mlx5_ib_ucontext,
> > > -				ibucontext),
> > > -			&msrq->db);
> > > -		ib_umem_release(msrq->umem);
> > > -	} else {
> > > -		destroy_srq_kernel(dev, msrq);
> > > +	if (udata) {
> > > +		destroy_srq_user(srq->pd, msrq, udata);
> > > +		return 0;
> > >  	}
> > > +
> > > +	/* We are cleaning kernel resources anyway */
> > > +	destroy_srq_kernel(dev, msrq);
> >
> > Oh, and this isn't right.. If we are going to leak things then we have
> > to leak anything exposed for DMA as well, eg the fragbuf under the SRQ
> > has to be leaked.
> 
> We are leaking for ULPs only, from their perspective everything was
> released and WARN_ON() will be the sign of the problem.

If we are going to add back in error handling, then it needs to be
done right, there is no different between kernel and user, everything
should be leaked.

> > If the HW can't guarentee it stopped doing DMA then we can't return
> > memory under potentially active DMA back to the system.
> 
> ULPs are supposed to guarantee that all operations stopped.

ULP should never trigger this, only broken HW can cause this kind of
problem.

> I don't know, all those years we relied on the ULPs to do destroy
> properly and it worked well. I didn't hear any complain from the field
> that HW destroy failed in proper ULP flow.
> 
> It looks to me over-engineering.

Given mlx5 already has the fatal error handling it seems a reasonable
way to re-introduce the error code without just delcaring drivers are
buggy to use it..

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ
  2020-09-03  5:35     ` Leon Romanovsky
@ 2020-09-03 12:24       ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 12:24 UTC (permalink / raw)
  To: Leon Romanovsky; +Cc: Doug Ledford, linux-rdma

On Thu, Sep 03, 2020 at 08:35:17AM +0300, Leon Romanovsky wrote:
> On Wed, Sep 02, 2020 at 09:20:48PM -0300, Jason Gunthorpe wrote:
> > On Sun, Aug 30, 2020 at 11:40:06AM +0300, Leon Romanovsky wrote:
> > >  /**
> > > - * ib_free_cq_user - free a completion queue
> > > + * ib_free_cq - free a completion queue
> > >   * @cq:		completion queue to free.
> > > - * @udata:	User data or NULL for kernel object
> > >   */
> > > -void ib_free_cq_user(struct ib_cq *cq, struct ib_udata *udata)
> > > +void ib_free_cq(struct ib_cq *cq)
> > >  {
> > > -	if (WARN_ON_ONCE(atomic_read(&cq->usecnt)))
> > > -		return;
> > > -	if (WARN_ON_ONCE(cq->cqe_used))
> > > -		return;
> > > +	WARN_ON_ONCE(atomic_read(&cq->usecnt));
> >
> > In this case we expect ops.destroy_cq to fail, so no sense in
> > continuing, leak everything, the ULP is buggy.
> 
> I disagree, we should clean as much as possible and leave the HW object.

There is no reason, this just creates possible data corruption.

Once these WARN_ONs are hit, the system is leaking memory. It doesn't
really matter how much is leaked, and there is no reason to try to
minimize the leaking.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy
  2020-09-03 12:22       ` Jason Gunthorpe
@ 2020-09-03 13:12         ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 13:12 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Adit Ranadive, Ariel Elior, Bernard Metzler,
	Dennis Dalessandro, Devesh Sharma, Lijun Ou, linux-rdma,
	Michal Kalderon, Mike Marciniszyn, Naresh Kumar PBS,
	Potnuri Bharat Teja, Selvin Xavier, Somnath Kotur,
	Sriharsha Basavapatna, VMware PV-Drivers, Weihang Li,
	Wei Hu(Xavier),
	Yishai Hadas, Zhu Yanjun

On Thu, Sep 03, 2020 at 09:22:56AM -0300, Jason Gunthorpe wrote:
> On Thu, Sep 03, 2020 at 08:28:26AM +0300, Leon Romanovsky wrote:
> > On Wed, Sep 02, 2020 at 09:18:27PM -0300, Jason Gunthorpe wrote:
> > > On Sun, Aug 30, 2020 at 11:40:05AM +0300, Leon Romanovsky wrote:
> > >
> > > > -void mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > > > +int mlx5_ib_destroy_srq(struct ib_srq *srq, struct ib_udata *udata)
> > > >  {
> > > >  	struct mlx5_ib_dev *dev = to_mdev(srq->device);
> > > >  	struct mlx5_ib_srq *msrq = to_msrq(srq);
> > > > +	int ret;
> > > > +
> > > > +	ret = mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > > > +	if (ret && udata)
> > > > +		return ret;
> > > >
> > > > -	mlx5_cmd_destroy_srq(dev, &msrq->msrq);
> > > > -
> > > > -	if (srq->uobject) {
> > > > -		mlx5_ib_db_unmap_user(
> > > > -			rdma_udata_to_drv_context(
> > > > -				udata,
> > > > -				struct mlx5_ib_ucontext,
> > > > -				ibucontext),
> > > > -			&msrq->db);
> > > > -		ib_umem_release(msrq->umem);
> > > > -	} else {
> > > > -		destroy_srq_kernel(dev, msrq);
> > > > +	if (udata) {
> > > > +		destroy_srq_user(srq->pd, msrq, udata);
> > > > +		return 0;
> > > >  	}
> > > > +
> > > > +	/* We are cleaning kernel resources anyway */
> > > > +	destroy_srq_kernel(dev, msrq);
> > >
> > > Oh, and this isn't right.. If we are going to leak things then we have
> > > to leak anything exposed for DMA as well, eg the fragbuf under the SRQ
> > > has to be leaked.
> > 
> > We are leaking for ULPs only, from their perspective everything was
> > released and WARN_ON() will be the sign of the problem.
> 
> If we are going to add back in error handling, then it needs to be
> done right, there is no different between kernel and user, everything
> should be leaked.
> 
> > > If the HW can't guarentee it stopped doing DMA then we can't return
> > > memory under potentially active DMA back to the system.
> > 
> > ULPs are supposed to guarantee that all operations stopped.
> 
> ULP should never trigger this, only broken HW can cause this kind of
> problem.
> 
> > I don't know, all those years we relied on the ULPs to do destroy
> > properly and it worked well. I didn't hear any complain from the field
> > that HW destroy failed in proper ULP flow.
> > 
> > It looks to me over-engineering.
> 
> Given mlx5 already has the fatal error handling it seems a reasonable
> way to re-introduce the error code without just delcaring drivers are
> buggy to use it..

That said, it doesn't have to be done in this series, however lets
just keep the basic principle that if destroy fails then the HW object
remains untouched and fully operational. None of this partial destroy
only if in kernel mode stuff.

Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll
  2020-08-30  8:40 ` [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll Leon Romanovsky
@ 2020-09-03 13:42   ` Jason Gunthorpe
  0 siblings, 0 replies; 25+ messages in thread
From: Jason Gunthorpe @ 2020-09-03 13:42 UTC (permalink / raw)
  To: Leon Romanovsky
  Cc: Doug Ledford, Leon Romanovsky, Eli Cohen, Jack Morgenstein,
	linux-rdma, Or Gerlitz, Roland Dreier

On Sun, Aug 30, 2020 at 11:40:04AM +0300, Leon Romanovsky wrote:
> From: Leon Romanovsky <leonro@mellanox.com>
> 
> The SRQ can be destroyed right before mlx5_cmd_get_srq is called.
> In such case the latter will return NULL instead of expected SRQ.
> 
> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox Connect-IB adapters")
> Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
> ---
>  drivers/infiniband/hw/mlx5/cq.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)

Applied to for-next

Thanks,
Jason

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2020-09-03 15:13 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-30  8:40 [PATCH rdma-next v1 00/10] Restore failure of destroy commands Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 01/10] RDMA: Restore ability to fail on PD deallocate Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 02/10] RDMA: Restore ability to fail on AH destroy Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 03/10] RDMA/mlx5: Issue FW command to destroy SRQ on reentry Leon Romanovsky
2020-09-03  0:31   ` Jason Gunthorpe
2020-09-03  5:08     ` Leon Romanovsky
2020-09-03 11:54       ` Jason Gunthorpe
2020-08-30  8:40 ` [PATCH rdma-next v1 04/10] RDMA/mlx5: Fix potential race between destroy and CQE poll Leon Romanovsky
2020-09-03 13:42   ` Jason Gunthorpe
2020-08-30  8:40 ` [PATCH rdma-next v1 05/10] RDMA: Restore ability to fail on SRQ destroy Leon Romanovsky
2020-09-03  0:08   ` Jason Gunthorpe
2020-09-03  5:11     ` Leon Romanovsky
2020-09-03 11:55       ` Jason Gunthorpe
2020-09-03  0:18   ` Jason Gunthorpe
2020-09-03  5:28     ` Leon Romanovsky
2020-09-03 12:22       ` Jason Gunthorpe
2020-09-03 13:12         ` Jason Gunthorpe
2020-08-30  8:40 ` [PATCH rdma-next v1 06/10] RDMA/core: Delete function indirection for alloc/free kernel CQ Leon Romanovsky
2020-09-03  0:20   ` Jason Gunthorpe
2020-09-03  5:35     ` Leon Romanovsky
2020-09-03 12:24       ` Jason Gunthorpe
2020-08-30  8:40 ` [PATCH rdma-next v1 07/10] RDMA: Allow fail of destroy CQ Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 08/10] RDMA: Change XRCD destroy return value Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 09/10] RDMA: Restore ability to return error for destroy WQ Leon Romanovsky
2020-08-30  8:40 ` [PATCH rdma-next v1 10/10] RDMA: Make counters destroy symmetrical Leon Romanovsky

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.