* [PATCH 00/16] Fix SDMA/TID caching code
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

This series fixes a number of bugs found while debugging the driver's use of
the mm_struct pointer "current->mm".  When PSM jobs were stopped unexpectedly
(for example, by Ctrl-C) it was possible for the wrong mm_struct to be passed
to mmu_notifier_unregister(), causing memory corruption.  In addition, a
number of might-sleep conditions were found where mmap_sem needed to be taken
while a spinlock was held.
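
At a high level, the fix pattern the series adopts (a condensed sketch of
the later patches, not the exact driver code) is:

	/* at open time, remember the opener's mm */
	fd->mm = current->mm;

	/* register and later unregister against that saved mm, never
	 * against current->mm, since close may run from a different
	 * context (SIGKILL, OOM killer):
	 */
	ret = mmu_notifier_register(&handler->mn, handler->mm);
	/* ... */
	mmu_notifier_unregister(&handler->mn, handler->mm);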

The following are standalone fixes, which can be reordered if necessary:

	IB/hfi1: Prevent null pointer dereference
	IB/hfi1: Use the same capability state for all shared contexts
	IB/hfi1: Validate SDMA user request index
	IB/hfi1: Validate SDMA user iovector count
	IB/hfi1: Release node on insert failure
	IB/hfi1: Fix error condition that needs to clean up
	IB/hfi1: Fix user SDMA racy user request claim

The mm usage and locking issues are fixed by the following commits (they
depend on the previous commits and must stay in this order):

	IB/hfi1: Make use of mm consistent
	IB/hfi1: Make the cache handler own its rb tree root
	IB/hfi1: Fix TID caching actions
	IB/hfi1: Add evict operation to the mmu rb handler
	IB/hfi1: Use evict mmu rb operation
	IB/hfi1: Consistently call ops->remove outside spinlock
	IB/hfi1: Remove unneeded mm argument in remove function
	IB/hfi1: Fix memory leak during unexpected shutdown
	IB/hfi1: Add cache evict LRU list

Special thanks to Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org> for suggestions on how to fix
the main issues.

This series requires the clean-up patch series sent previously.

Dean Luick (13):
  IB/hfi1: Use the same capability state for all shared contexts
  IB/hfi1: Validate SDMA user request index
  IB/hfi1: Validate SDMA user iovector count
  IB/hfi1: Release node on insert failure
  IB/hfi1: Fix error condition that needs to clean up
  IB/hfi1: Fix user SDMA racy user request claim
  IB/hfi1: Make the cache handler own its rb tree root
  IB/hfi1: Fix TID caching actions
  IB/hfi1: Add evict operation to the mmu rb handler
  IB/hfi1: Use evict mmu rb operation
  IB/hfi1: Consistently call ops->remove outside spinlock
  IB/hfi1: Remove unneeded mm argument in remove function
  IB/hfi1: Add cache evict LRU list

Ira Weiny (3):
  IB/hfi1: Prevent null pointer dereference
  IB/hfi1: Make use of mm consistent
  IB/hfi1: Fix memory leak during unexpected shutdown

 drivers/infiniband/hw/hfi1/chip.c         |   8 --
 drivers/infiniband/hw/hfi1/chip.h         |   2 -
 drivers/infiniband/hw/hfi1/file_ops.c     |  31 ++--
 drivers/infiniband/hw/hfi1/hfi.h          |  13 +-
 drivers/infiniband/hw/hfi1/mmu_rb.c       | 214 ++++++++++++++++++---------
 drivers/infiniband/hw/hfi1/mmu_rb.h       |  32 +++--
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 115 ++++++++-------
 drivers/infiniband/hw/hfi1/user_pages.c   |  19 +--
 drivers/infiniband/hw/hfi1/user_sdma.c    | 230 +++++++++++++++---------------
 drivers/infiniband/hw/hfi1/user_sdma.h    |   8 +-
 10 files changed, 386 insertions(+), 286 deletions(-)

-- 
1.8.2

* [PATCH 01/16] IB/hfi1: Prevent null pointer dereference
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If a context has not been assigned, or if the assignment failed, pq may be
NULL.  Move the unregister call inside the protection of the NULL check.

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 586f07807b27..6b8d1e8b6286 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -472,8 +472,8 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd)
 	hfi1_cdbg(SDMA, "[%u:%u:%u] Freeing user SDMA queues", uctxt->dd->unit,
 		  uctxt->ctxt, fd->subctxt);
 	pq = fd->pq;
-	hfi1_mmu_rb_unregister(&pq->sdma_rb_root);
 	if (pq) {
+		hfi1_mmu_rb_unregister(&pq->sdma_rb_root);
 		spin_lock_irqsave(&uctxt->sdma_qlock, flags);
 		if (!list_empty(&pq->list))
 			list_del_init(&pq->list);
-- 
1.8.2

* [PATCH 02/16] IB/hfi1: Use the same capability state for all shared contexts
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Save the current capability state at user context creation time and report
this saved value for all shared contexts, so that every context sharing the
hardware context sees the same state.

Also remove the now-unnecessary hfi1_get_base_kinfo() function.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/chip.c     |  8 --------
 drivers/infiniband/hw/hfi1/chip.h     |  2 --
 drivers/infiniband/hw/hfi1/file_ops.c | 21 +++++++++++----------
 drivers/infiniband/hw/hfi1/hfi.h      |  2 +-
 4 files changed, 12 insertions(+), 21 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index c93683496614..b32638d58ae8 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -9648,14 +9648,6 @@ void hfi1_clear_tids(struct hfi1_ctxtdata *rcd)
 		hfi1_put_tid(dd, i, PT_INVALID, 0, 0);
 }
 
-int hfi1_get_base_kinfo(struct hfi1_ctxtdata *rcd,
-			struct hfi1_ctxt_info *kinfo)
-{
-	kinfo->runtime_flags = (HFI1_MISC_GET() << HFI1_CAP_USER_SHIFT) |
-		HFI1_CAP_UGET(MASK) | HFI1_CAP_KGET(K2U);
-	return 0;
-}
-
 struct hfi1_message_header *hfi1_get_msgheader(
 				struct hfi1_devdata *dd, __le32 *rhf_addr)
 {
diff --git a/drivers/infiniband/hw/hfi1/chip.h b/drivers/infiniband/hw/hfi1/chip.h
index f07bc4ccc468..ed11107c50fe 100644
--- a/drivers/infiniband/hw/hfi1/chip.h
+++ b/drivers/infiniband/hw/hfi1/chip.h
@@ -1337,8 +1337,6 @@ void hfi1_start_cleanup(struct hfi1_devdata *dd);
 void hfi1_clear_tids(struct hfi1_ctxtdata *rcd);
 struct hfi1_message_header *hfi1_get_msgheader(
 				struct hfi1_devdata *dd, __le32 *rhf_addr);
-int hfi1_get_base_kinfo(struct hfi1_ctxtdata *rcd,
-			struct hfi1_ctxt_info *kinfo);
 int hfi1_init_ctxt(struct send_context *sc);
 void hfi1_put_tid(struct hfi1_devdata *dd, u32 index,
 		  u32 type, unsigned long pa, u16 order);
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 0522bafb190b..1f4cd5aa2071 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -981,7 +981,7 @@ static int allocate_ctxt(struct file *fp, struct hfi1_devdata *dd,
 			return ret;
 	}
 	uctxt->userversion = uinfo->userversion;
-	uctxt->flags = HFI1_CAP_UGET(MASK);
+	uctxt->flags = hfi1_cap_mask; /* save current flag state */
 	init_waitqueue_head(&uctxt->wait);
 	strlcpy(uctxt->comm, current->comm, sizeof(uctxt->comm));
 	memcpy(uctxt->uuid, uinfo->uuid, sizeof(uctxt->uuid));
@@ -1084,18 +1084,18 @@ static int user_init(struct file *fp)
 	hfi1_set_ctxt_jkey(uctxt->dd, uctxt->ctxt, uctxt->jkey);
 
 	rcvctrl_ops = HFI1_RCVCTRL_CTXT_ENB;
-	if (HFI1_CAP_KGET_MASK(uctxt->flags, HDRSUPP))
+	if (HFI1_CAP_UGET_MASK(uctxt->flags, HDRSUPP))
 		rcvctrl_ops |= HFI1_RCVCTRL_TIDFLOW_ENB;
 	/*
 	 * Ignore the bit in the flags for now until proper
 	 * support for multiple packet per rcv array entry is
 	 * added.
 	 */
-	if (!HFI1_CAP_KGET_MASK(uctxt->flags, MULTI_PKT_EGR))
+	if (!HFI1_CAP_UGET_MASK(uctxt->flags, MULTI_PKT_EGR))
 		rcvctrl_ops |= HFI1_RCVCTRL_ONE_PKT_EGR_ENB;
-	if (HFI1_CAP_KGET_MASK(uctxt->flags, NODROP_EGR_FULL))
+	if (HFI1_CAP_UGET_MASK(uctxt->flags, NODROP_EGR_FULL))
 		rcvctrl_ops |= HFI1_RCVCTRL_NO_EGR_DROP_ENB;
-	if (HFI1_CAP_KGET_MASK(uctxt->flags, NODROP_RHQ_FULL))
+	if (HFI1_CAP_UGET_MASK(uctxt->flags, NODROP_RHQ_FULL))
 		rcvctrl_ops |= HFI1_RCVCTRL_NO_RHQ_DROP_ENB;
 	/*
 	 * The RcvCtxtCtrl.TailUpd bit has to be explicitly written.
@@ -1103,7 +1103,7 @@ static int user_init(struct file *fp)
 	 * uses of the chip or ctxt. Therefore, add the rcvctrl op
 	 * for both cases.
 	 */
-	if (HFI1_CAP_KGET_MASK(uctxt->flags, DMA_RTAIL))
+	if (HFI1_CAP_UGET_MASK(uctxt->flags, DMA_RTAIL))
 		rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_ENB;
 	else
 		rcvctrl_ops |= HFI1_RCVCTRL_TAILUPD_DIS;
@@ -1126,9 +1126,10 @@ static int get_ctxt_info(struct file *fp, void __user *ubase, __u32 len)
 	int ret = 0;
 
 	memset(&cinfo, 0, sizeof(cinfo));
-	ret = hfi1_get_base_kinfo(uctxt, &cinfo);
-	if (ret < 0)
-		goto done;
+	cinfo.runtime_flags = (((uctxt->flags >> HFI1_CAP_MISC_SHIFT) &
+				HFI1_CAP_MISC_MASK) << HFI1_CAP_USER_SHIFT) |
+			HFI1_CAP_UGET_MASK(uctxt->flags, MASK) |
+			HFI1_CAP_KGET_MASK(uctxt->flags, K2U);
 	cinfo.num_active = hfi1_count_active_units();
 	cinfo.unit = uctxt->dd->unit;
 	cinfo.ctxt = uctxt->ctxt;
@@ -1150,7 +1151,7 @@ static int get_ctxt_info(struct file *fp, void __user *ubase, __u32 len)
 	trace_hfi1_ctxt_info(uctxt->dd, uctxt->ctxt, fd->subctxt, cinfo);
 	if (copy_to_user(ubase, &cinfo, sizeof(cinfo)))
 		ret = -EFAULT;
-done:
+
 	return ret;
 }
 
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 6fb86fee0701..36e6b8e0c735 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -255,7 +255,7 @@ struct hfi1_ctxtdata {
 	/* chip offset of PIO buffers for this ctxt */
 	u32 piobufs;
 	/* per-context configuration flags */
-	u32 flags;
+	unsigned long flags;
 	/* per-context event flags for fileops/intr communication */
 	unsigned long event_flags;
 	/* WAIT_RCV that timed out, no interrupt */
-- 
1.8.2

* [PATCH 03/16] IB/hfi1: Validate SDMA user request index
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Validate the user-supplied completion index against the size of the
completion ring before using it to index the request and completion arrays.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 6b8d1e8b6286..0a0281ae35f1 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -552,6 +552,14 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 
 	trace_hfi1_sdma_user_reqinfo(dd, uctxt->ctxt, fd->subctxt,
 				     (u16 *)&info);
+
+	if (info.comp_idx >= hfi1_sdma_comp_ring_size) {
+		hfi1_cdbg(SDMA,
+			  "[%u:%u:%u:%u] Invalid comp index",
+			  dd->unit, uctxt->ctxt, fd->subctxt, info.comp_idx);
+		return -EINVAL;
+	}
+
 	if (cq->comps[info.comp_idx].status == QUEUED ||
 	    test_bit(SDMA_REQ_IN_USE, &pq->reqs[info.comp_idx].flags)) {
 		hfi1_cdbg(SDMA, "[%u:%u:%u] Entry %u is in QUEUED state",
-- 
1.8.2

* [PATCH 04/16] IB/hfi1: Validate SDMA user iovector count
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Sanity check the user-supplied iovector count: a request needs at least one
vector (the header) and cannot reference more vectors than were actually
passed in.  An expected (TID) request additionally needs a TID information
vector plus at least one data vector.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 0a0281ae35f1..42cc371cdf95 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -560,6 +560,18 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 		return -EINVAL;
 	}
 
+	/*
+	 * Sanity check the header io vector count.  Need at least 1 vector
+	 * (header) and cannot be larger than the actual io vector count.
+	 */
+	if (req_iovcnt(info.ctrl) < 1 || req_iovcnt(info.ctrl) > dim) {
+		hfi1_cdbg(SDMA,
+			  "[%u:%u:%u:%u] Invalid iov count %d, dim %ld",
+			  dd->unit, uctxt->ctxt, fd->subctxt, info.comp_idx,
+			  req_iovcnt(info.ctrl), dim);
+		return -EINVAL;
+	}
+
 	if (cq->comps[info.comp_idx].status == QUEUED ||
 	    test_bit(SDMA_REQ_IN_USE, &pq->reqs[info.comp_idx].flags)) {
 		hfi1_cdbg(SDMA, "[%u:%u:%u] Entry %u is in QUEUED state",
@@ -583,7 +595,7 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 	memset(req, 0, sizeof(*req));
 	/* Mark the request as IN_USE before we start filling it in. */
 	set_bit(SDMA_REQ_IN_USE, &req->flags);
-	req->data_iovs = req_iovcnt(info.ctrl) - 1;
+	req->data_iovs = req_iovcnt(info.ctrl) - 1; /* subtract header vector */
 	req->pq = pq;
 	req->cq = cq;
 	req->status = -1;
@@ -591,8 +603,16 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 
 	memcpy(&req->info, &info, sizeof(info));
 
-	if (req_opcode(info.ctrl) == EXPECTED)
+	if (req_opcode(info.ctrl) == EXPECTED) {
+		/* expected must have a TID info and at least one data vector */
+		if (req->data_iovs < 2) {
+			SDMA_DBG(req,
+				 "Not enough vectors for expected request");
+			ret = -EINVAL;
+			goto free_req;
+		}
 		req->data_iovs--;
+	}
 
 	if (!info.npkts || req->data_iovs > MAX_VECTORS_PER_REQ) {
 		SDMA_DBG(req, "Too many vectors (%u/%u)", req->data_iovs,
-- 
1.8.2

* [PATCH 05/16] IB/hfi1: Release node on insert failure
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If the node cannot be inserted into the RB tree cache, it is freed before
the function returns.  Null out the iovec's pointer to the node so the
iovec does not try to free it again later.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 42cc371cdf95..ff03e1dad5b9 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -1239,6 +1239,7 @@ retry:
 			list_del(&node->list);
 		pq->n_locked -= node->npages;
 		spin_unlock(&pq->evict_lock);
+		iovec->node = NULL;
 		goto bail;
 	}
 	return 0;
-- 
1.8.2

* [PATCH 06/16] IB/hfi1: Fix error condition that needs to clean up
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

If input validation fails, properly free the request before returning.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index ff03e1dad5b9..5c1322428065 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -617,7 +617,8 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 	if (!info.npkts || req->data_iovs > MAX_VECTORS_PER_REQ) {
 		SDMA_DBG(req, "Too many vectors (%u/%u)", req->data_iovs,
 			 MAX_VECTORS_PER_REQ);
-		return -EINVAL;
+		ret = -EINVAL;
+		goto free_req;
 	}
 	/* Copy the header from the user buffer */
 	ret = copy_from_user(&req->hdr, iovec[idx].iov_base + sizeof(info),
-- 
1.8.2

* [PATCH 07/16] IB/hfi1: Fix user SDMA racy user request claim
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The user SDMA in-use claim bit lives in the structure that gets zeroed out
once the claim is made, and the claim itself is a separate test and set
rather than a single atomic operation.  Move the request in-use flag into
its own bit array and use test_and_set_bit() for atomic claims.  This
cleans up the claim code and removes any race possibility.
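
Condensed, the claim scheme looks like this (a sketch using the names from
this patch):

	/* the claim bit lives in pq->req_in_use, which is never wiped */
	if (test_and_set_bit(info.comp_idx, pq->req_in_use))
		return -EBADSLT;	/* slot already claimed */

	req = pq->reqs + info.comp_idx;
	memset(req, 0, sizeof(*req));	/* cannot clear the claim bit */

	/* ... on completion, release the slot: */
	clear_bit(req->info.comp_idx, req->pq->req_in_use);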

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 32 +++++++++++++++++++-------------
 drivers/infiniband/hw/hfi1/user_sdma.h |  1 +
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 5c1322428065..e88d555389f4 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -145,7 +145,7 @@ MODULE_PARM_DESC(sdma_comp_size, "Size of User SDMA completion ring. Default: 12
 /* Last packet in the request */
 #define TXREQ_FLAGS_REQ_LAST_PKT BIT(0)
 
-#define SDMA_REQ_IN_USE     0
+/* SDMA request flag bits */
 #define SDMA_REQ_FOR_THREAD 1
 #define SDMA_REQ_SEND_DONE  2
 #define SDMA_REQ_HAVE_AHG   3
@@ -397,6 +397,11 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	if (!pq->reqs)
 		goto pq_reqs_nomem;
 
+	memsize = BITS_TO_LONGS(hfi1_sdma_comp_ring_size) * sizeof(long);
+	pq->req_in_use = kzalloc(memsize, GFP_KERNEL);
+	if (!pq->req_in_use)
+		goto pq_reqs_no_in_use;
+
 	INIT_LIST_HEAD(&pq->list);
 	pq->dd = dd;
 	pq->ctxt = uctxt->ctxt;
@@ -453,6 +458,8 @@ cq_comps_nomem:
 cq_nomem:
 	kmem_cache_destroy(pq->txreq_cache);
 pq_txreq_nomem:
+	kfree(pq->req_in_use);
+pq_reqs_no_in_use:
 	kfree(pq->reqs);
 pq_reqs_nomem:
 	kfree(pq);
@@ -484,6 +491,7 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd)
 			pq->wait,
 			(ACCESS_ONCE(pq->state) == SDMA_PKT_Q_INACTIVE));
 		kfree(pq->reqs);
+		kfree(pq->req_in_use);
 		kmem_cache_destroy(pq->txreq_cache);
 		kfree(pq);
 		fd->pq = NULL;
@@ -572,29 +580,27 @@ int hfi1_user_sdma_process_request(struct file *fp, struct iovec *iovec,
 		return -EINVAL;
 	}
 
-	if (cq->comps[info.comp_idx].status == QUEUED ||
-	    test_bit(SDMA_REQ_IN_USE, &pq->reqs[info.comp_idx].flags)) {
-		hfi1_cdbg(SDMA, "[%u:%u:%u] Entry %u is in QUEUED state",
-			  dd->unit, uctxt->ctxt, fd->subctxt,
-			  info.comp_idx);
-		return -EBADSLT;
-	}
 	if (!info.fragsize) {
 		hfi1_cdbg(SDMA,
 			  "[%u:%u:%u:%u] Request does not specify fragsize",
 			  dd->unit, uctxt->ctxt, fd->subctxt, info.comp_idx);
 		return -EINVAL;
 	}
+
+	/* Try to claim the request. */
+	if (test_and_set_bit(info.comp_idx, pq->req_in_use)) {
+		hfi1_cdbg(SDMA, "[%u:%u:%u] Entry %u is in use",
+			  dd->unit, uctxt->ctxt, fd->subctxt,
+			  info.comp_idx);
+		return -EBADSLT;
+	}
 	/*
-	 * We've done all the safety checks that we can up to this point,
-	 * "allocate" the request entry.
+	 * All safety checks have been done and this request has been claimed.
 	 */
 	hfi1_cdbg(SDMA, "[%u:%u:%u] Using req/comp entry %u\n", dd->unit,
 		  uctxt->ctxt, fd->subctxt, info.comp_idx);
 	req = pq->reqs + info.comp_idx;
 	memset(req, 0, sizeof(*req));
-	/* Mark the request as IN_USE before we start filling it in. */
-	set_bit(SDMA_REQ_IN_USE, &req->flags);
 	req->data_iovs = req_iovcnt(info.ctrl) - 1; /* subtract header vector */
 	req->pq = pq;
 	req->cq = cq;
@@ -1612,7 +1618,7 @@ static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
 		}
 	}
 	kfree(req->tids);
-	clear_bit(SDMA_REQ_IN_USE, &req->flags);
+	clear_bit(req->info.comp_idx, req->pq->req_in_use);
 }
 
 static inline void set_comp_state(struct hfi1_user_sdma_pkt_q *pq,
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index b9240e351161..20ff846f318b 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -63,6 +63,7 @@ struct hfi1_user_sdma_pkt_q {
 	struct hfi1_devdata *dd;
 	struct kmem_cache *txreq_cache;
 	struct user_sdma_request *reqs;
+	unsigned long *req_in_use;
 	struct iowait busy;
 	unsigned state;
 	wait_queue_head_t wait;
-- 
1.8.2

* [PATCH 08/16] IB/hfi1: Make use of mm consistent
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The hfi1 driver registers a mmu_notifier callback when /dev/hfi1_* is
opened, and unregisters it when the device is closed.  The driver
incorrectly assumes that the close will always happen from the same
context as the open.  In particular, closes due to SIGKILL or OOM killer
activity may happen from a different context.  In these cases, the wrong
mm is passed to mmu_notifier_unregister(), which causes improper reference
counting for the victim mm, and eventual memory corruption.

Preserve the mm for all open file descriptors and use this saved mm, rather
than current->mm, for memory operations for the lifetime of that fd.  Note:
this patch leaves one use of current->mm in place; it is removed in a
follow-on patch because other functional changes are required before that
use can be removed.

If registration fails, there is no reason to keep the handler object
around.  Free the handler object rather than add it to the list to
prevent any mmu_notifier operations, including unregister, when
registration fails.
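
In code, that ordering is (condensed from the mmu_rb.c hunk below):

	ret = mmu_notifier_register(&handlr->mn, handlr->mm);
	if (ret) {
		kfree(handlr);	/* never published, nothing to unregister */
		return ret;
	}
	/* only a successfully registered handler goes on the list */
	list_add_tail_rcu(&handlr->list, &mmu_rb_handlers);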

Suggested-by: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/file_ops.c     |  6 ++++--
 drivers/infiniband/hw/hfi1/hfi.h          |  8 +++++---
 drivers/infiniband/hw/hfi1/mmu_rb.c       | 18 ++++++++++++++----
 drivers/infiniband/hw/hfi1/mmu_rb.h       |  3 ++-
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 15 ++++++++-------
 drivers/infiniband/hw/hfi1/user_pages.c   | 19 ++++++++++---------
 drivers/infiniband/hw/hfi1/user_sdma.c    | 11 ++++++-----
 drivers/infiniband/hw/hfi1/user_sdma.h    |  1 +
 8 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 1f4cd5aa2071..302f0cdd8119 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -180,8 +180,10 @@ static int hfi1_file_open(struct inode *inode, struct file *fp)
 
 	fd = kzalloc(sizeof(*fd), GFP_KERNEL);
 
-	if (fd) /* no cpu affinity by default */
-		fd->rec_cpu_num = -1;
+	if (fd) {
+		fd->rec_cpu_num = -1; /* no cpu affinity by default */
+		fd->mm = current->mm;
+	}
 
 	fp->private_data = fd;
 
diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 36e6b8e0c735..67f37c9ea960 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1205,6 +1205,7 @@ struct hfi1_filedata {
 	u32 invalid_tid_idx;
 	/* protect invalid_tids array and invalid_tid_idx */
 	spinlock_t invalid_lock;
+	struct mm_struct *mm;
 };
 
 extern struct list_head hfi1_dev_list;
@@ -1700,9 +1701,10 @@ void shutdown_led_override(struct hfi1_pportdata *ppd);
  */
 #define DEFAULT_RCVHDR_ENTSIZE 32
 
-bool hfi1_can_pin_pages(struct hfi1_devdata *dd, u32 nlocked, u32 npages);
-int hfi1_acquire_user_pages(unsigned long vaddr, size_t npages, bool writable,
-			    struct page **pages);
+bool hfi1_can_pin_pages(struct hfi1_devdata *dd, struct mm_struct *mm,
+			u32 nlocked, u32 npages);
+int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr,
+			    size_t npages, bool writable, struct page **pages);
 void hfi1_release_user_pages(struct mm_struct *mm, struct page **p,
 			     size_t npages, bool dirty);
 
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 1c7e25b90a2c..e5c5ef4cf06c 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -58,6 +58,7 @@ struct mmu_rb_handler {
 	struct rb_root *root;
 	spinlock_t lock;        /* protect the RB tree */
 	struct mmu_rb_ops *ops;
+	struct mm_struct *mm;
 };
 
 static LIST_HEAD(mmu_rb_handlers);
@@ -95,9 +96,11 @@ static unsigned long mmu_node_last(struct mmu_rb_node *node)
 	return PAGE_ALIGN(node->addr + node->len) - 1;
 }
 
-int hfi1_mmu_rb_register(struct rb_root *root, struct mmu_rb_ops *ops)
+int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
+			 struct mmu_rb_ops *ops)
 {
 	struct mmu_rb_handler *handlr;
+	int ret;
 
 	handlr = kmalloc(sizeof(*handlr), GFP_KERNEL);
 	if (!handlr)
@@ -108,11 +111,19 @@ int hfi1_mmu_rb_register(struct rb_root *root, struct mmu_rb_ops *ops)
 	INIT_HLIST_NODE(&handlr->mn.hlist);
 	spin_lock_init(&handlr->lock);
 	handlr->mn.ops = &mn_opts;
+	handlr->mm = mm;
+
+	ret = mmu_notifier_register(&handlr->mn, handlr->mm);
+	if (ret) {
+		kfree(handlr);
+		return ret;
+	}
+
 	spin_lock(&mmu_rb_lock);
 	list_add_tail_rcu(&handlr->list, &mmu_rb_handlers);
 	spin_unlock(&mmu_rb_lock);
 
-	return mmu_notifier_register(&handlr->mn, current->mm);
+	return ret;
 }
 
 void hfi1_mmu_rb_unregister(struct rb_root *root)
@@ -126,8 +137,7 @@ void hfi1_mmu_rb_unregister(struct rb_root *root)
 		return;
 
 	/* Unregister first so we don't get any more notifications. */
-	if (current->mm)
-		mmu_notifier_unregister(&handler->mn, current->mm);
+	mmu_notifier_unregister(&handler->mn, handler->mm);
 
 	spin_lock(&mmu_rb_lock);
 	list_del_rcu(&handler->list);
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 45e7245d813b..489a691856e5 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -65,7 +65,8 @@ struct mmu_rb_ops {
 	int (*invalidate)(struct rb_root *root, struct mmu_rb_node *node);
 };
 
-int hfi1_mmu_rb_register(struct rb_root *root, struct mmu_rb_ops *ops);
+int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
+			 struct mmu_rb_ops *ops);
 void hfi1_mmu_rb_unregister(struct rb_root *);
 int hfi1_mmu_rb_insert(struct rb_root *, struct mmu_rb_node *);
 void hfi1_mmu_rb_remove(struct rb_root *, struct mmu_rb_node *);
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 8283a6a2bb15..a2f7e719dc4d 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -211,7 +211,8 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 		 * fails, continue but turn off the TID caching for
 		 * all user contexts.
 		 */
-		ret = hfi1_mmu_rb_register(&fd->tid_rb_root, &tid_rb_ops);
+		ret = hfi1_mmu_rb_register(fd->mm, &fd->tid_rb_root,
+					   &tid_rb_ops);
 		if (ret) {
 			dd_dev_info(dd,
 				    "Failed MMU notifier registration %d\n",
@@ -399,12 +400,12 @@ int hfi1_user_exp_rcv_setup(struct file *fp, struct hfi1_tid_info *tinfo)
 	 * pages, accept the amount pinned so far and program only that.
 	 * User space knows how to deal with partially programmed buffers.
 	 */
-	if (!hfi1_can_pin_pages(dd, fd->tid_n_pinned, npages)) {
+	if (!hfi1_can_pin_pages(dd, fd->mm, fd->tid_n_pinned, npages)) {
 		ret = -ENOMEM;
 		goto bail;
 	}
 
-	pinned = hfi1_acquire_user_pages(vaddr, npages, true, pages);
+	pinned = hfi1_acquire_user_pages(fd->mm, vaddr, npages, true, pages);
 	if (pinned <= 0) {
 		ret = pinned;
 		goto bail;
@@ -559,7 +560,7 @@ nomem:
 	 * for example), unpin all unmapped pages so we can pin them nex time.
 	 */
 	if (mapped_pages != pinned) {
-		hfi1_release_user_pages(current->mm, &pages[mapped_pages],
+		hfi1_release_user_pages(fd->mm, &pages[mapped_pages],
 					pinned - mapped_pages,
 					false);
 		fd->tid_n_pinned -= pinned - mapped_pages;
@@ -905,7 +906,7 @@ static int unprogram_rcvarray(struct file *fp, u32 tidinfo,
 	if (!node || node->rcventry != (uctxt->expected_base + rcventry))
 		return -EBADF;
 	if (HFI1_CAP_IS_USET(TID_UNMAP))
-		tid_rb_remove(&fd->tid_rb_root, &node->mmu, NULL);
+		tid_rb_remove(&fd->tid_rb_root, &node->mmu, fd->mm);
 	else
 		hfi1_mmu_rb_remove(&fd->tid_rb_root, &node->mmu);
 
@@ -933,7 +934,7 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 
 	pci_unmap_single(dd->pcidev, node->dma_addr, node->mmu.len,
 			 PCI_DMA_FROMDEVICE);
-	hfi1_release_user_pages(current->mm, node->pages, node->npages, true);
+	hfi1_release_user_pages(fd->mm, node->pages, node->npages, true);
 	fd->tid_n_pinned -= node->npages;
 
 	node->grp->used--;
@@ -970,7 +971,7 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 					continue;
 				if (HFI1_CAP_IS_USET(TID_UNMAP))
 					tid_rb_remove(&fd->tid_rb_root,
-						      &node->mmu, NULL);
+						      &node->mmu, fd->mm);
 				else
 					hfi1_mmu_rb_remove(&fd->tid_rb_root,
 							   &node->mmu);
diff --git a/drivers/infiniband/hw/hfi1/user_pages.c b/drivers/infiniband/hw/hfi1/user_pages.c
index 88e10b5f55f1..20f4ddcac3b0 100644
--- a/drivers/infiniband/hw/hfi1/user_pages.c
+++ b/drivers/infiniband/hw/hfi1/user_pages.c
@@ -68,7 +68,8 @@ MODULE_PARM_DESC(cache_size, "Send and receive side cache size limit (in MB)");
  * could keeping caching buffers.
  *
  */
-bool hfi1_can_pin_pages(struct hfi1_devdata *dd, u32 nlocked, u32 npages)
+bool hfi1_can_pin_pages(struct hfi1_devdata *dd, struct mm_struct *mm,
+			u32 nlocked, u32 npages)
 {
 	unsigned long ulimit = rlimit(RLIMIT_MEMLOCK), pinned, cache_limit,
 		size = (cache_size * (1UL << 20)); /* convert to bytes */
@@ -89,9 +90,9 @@ bool hfi1_can_pin_pages(struct hfi1_devdata *dd, u32 nlocked, u32 npages)
 	/* Convert to number of pages */
 	size = DIV_ROUND_UP(size, PAGE_SIZE);
 
-	down_read(&current->mm->mmap_sem);
-	pinned = current->mm->pinned_vm;
-	up_read(&current->mm->mmap_sem);
+	down_read(&mm->mmap_sem);
+	pinned = mm->pinned_vm;
+	up_read(&mm->mmap_sem);
 
 	/* First, check the absolute limit against all pinned pages. */
 	if (pinned + npages >= ulimit && !can_lock)
@@ -100,8 +101,8 @@ bool hfi1_can_pin_pages(struct hfi1_devdata *dd, u32 nlocked, u32 npages)
 	return ((nlocked + npages) <= size) || can_lock;
 }
 
-int hfi1_acquire_user_pages(unsigned long vaddr, size_t npages, bool writable,
-			    struct page **pages)
+int hfi1_acquire_user_pages(struct mm_struct *mm, unsigned long vaddr, size_t npages,
+			    bool writable, struct page **pages)
 {
 	int ret;
 
@@ -109,9 +110,9 @@ int hfi1_acquire_user_pages(unsigned long vaddr, size_t npages, bool writable,
 	if (ret < 0)
 		return ret;
 
-	down_write(&current->mm->mmap_sem);
-	current->mm->pinned_vm += ret;
-	up_write(&current->mm->mmap_sem);
+	down_write(&mm->mmap_sem);
+	mm->pinned_vm += ret;
+	up_write(&mm->mmap_sem);
 
 	return ret;
 }
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index e88d555389f4..640c244b665b 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -413,6 +413,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	pq->sdma_rb_root = RB_ROOT;
 	INIT_LIST_HEAD(&pq->evict);
 	spin_lock_init(&pq->evict_lock);
+	pq->mm = fd->mm;
 
 	iowait_init(&pq->busy, 0, NULL, defer_packet_queue,
 		    activate_packet_queue, NULL);
@@ -442,7 +443,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	cq->nentries = hfi1_sdma_comp_ring_size;
 	fd->cq = cq;
 
-	ret = hfi1_mmu_rb_register(&pq->sdma_rb_root, &sdma_rb_ops);
+	ret = hfi1_mmu_rb_register(pq->mm, &pq->sdma_rb_root, &sdma_rb_ops);
 	if (ret) {
 		dd_dev_err(dd, "Failed to register with MMU %d", ret);
 		goto done;
@@ -1205,12 +1206,12 @@ static int pin_vector_pages(struct user_sdma_request *req,
 			spin_unlock(&pq->evict_lock);
 		}
 retry:
-		if (!hfi1_can_pin_pages(pq->dd, pq->n_locked, npages)) {
+		if (!hfi1_can_pin_pages(pq->dd, pq->mm, pq->n_locked, npages)) {
 			cleared = sdma_cache_evict(pq, npages);
 			if (cleared >= npages)
 				goto retry;
 		}
-		pinned = hfi1_acquire_user_pages(
+		pinned = hfi1_acquire_user_pages(pq->mm,
 			((unsigned long)iovec->iov.iov_base +
 			 (node->npages * PAGE_SIZE)), npages, 0,
 			pages + node->npages);
@@ -1220,7 +1221,7 @@ retry:
 			goto bail;
 		}
 		if (pinned != npages) {
-			unpin_vector_pages(current->mm, pages, node->npages,
+			unpin_vector_pages(pq->mm, pages, node->npages,
 					   pinned);
 			ret = -EFAULT;
 			goto bail;
@@ -1252,7 +1253,7 @@ retry:
 	return 0;
 bail:
 	if (rb_node)
-		unpin_vector_pages(current->mm, node->pages, 0, node->npages);
+		unpin_vector_pages(pq->mm, node->pages, 0, node->npages);
 	kfree(node);
 	return ret;
 }
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index 20ff846f318b..ff49f74f43f4 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -72,6 +72,7 @@ struct hfi1_user_sdma_pkt_q {
 	u32 n_locked;
 	struct list_head evict;
 	spinlock_t evict_lock; /* protect evict and n_locked */
+	struct mm_struct *mm;
 };
 
 struct hfi1_user_sdma_comp_q {
-- 
1.8.2

* [PATCH 09/16] IB/hfi1: Make the cache handler own its rb tree root
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The objects which use cache handling should reference their own handler
object, not the internal data structure the handler uses to track the nodes.

Have the "users" of the mmu notifier code pass opaque objects which can
then be properly used in the mmu callbacks depending on the owner's needs.

This patch has the additional benefit that operations no longer require a
list lookup to find the handler.
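
The resulting shape (condensed from the mmu_rb.c hunk below):

	struct mmu_rb_handler {
		struct mmu_notifier mn;
		struct rb_root root;	/* the handler owns its tree */
		void *ops_arg;		/* owner's object (fd, pkt queue) */
		spinlock_t lock;	/* protect the RB tree */
		struct mmu_rb_ops *ops;
		struct mm_struct *mm;
	};

	/* callbacks get the owner object back directly, no list search */
	ret = handler->ops->insert(handler->ops_arg, mnode);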

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/hfi.h          |  3 +-
 drivers/infiniband/hw/hfi1/mmu_rb.c       | 98 ++++++++++---------------------
 drivers/infiniband/hw/hfi1/mmu_rb.h       | 23 ++++----
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 54 +++++++----------
 drivers/infiniband/hw/hfi1/user_sdma.c    | 26 ++++----
 drivers/infiniband/hw/hfi1/user_sdma.h    |  2 +-
 6 files changed, 81 insertions(+), 125 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/hfi.h b/drivers/infiniband/hw/hfi1/hfi.h
index 67f37c9ea960..ba9083602cbd 100644
--- a/drivers/infiniband/hw/hfi1/hfi.h
+++ b/drivers/infiniband/hw/hfi1/hfi.h
@@ -1186,6 +1186,7 @@ struct hfi1_devdata {
 
 struct tid_rb_node;
 struct mmu_rb_node;
+struct mmu_rb_handler;
 
 /* Private data for file operations */
 struct hfi1_filedata {
@@ -1196,7 +1197,7 @@ struct hfi1_filedata {
 	/* for cpu affinity; -1 if none */
 	int rec_cpu_num;
 	u32 tid_n_pinned;
-	struct rb_root tid_rb_root;
+	struct mmu_rb_handler *handler;
 	struct tid_rb_node **entry_to_rb;
 	spinlock_t tid_lock; /* protect tid_[limit,used] counters */
 	u32 tid_limit;
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index e5c5ef4cf06c..9fbcfed4d34c 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -53,20 +53,16 @@
 #include "trace.h"
 
 struct mmu_rb_handler {
-	struct list_head list;
 	struct mmu_notifier mn;
-	struct rb_root *root;
+	struct rb_root root;
+	void *ops_arg;
 	spinlock_t lock;        /* protect the RB tree */
 	struct mmu_rb_ops *ops;
 	struct mm_struct *mm;
 };
 
-static LIST_HEAD(mmu_rb_handlers);
-static DEFINE_SPINLOCK(mmu_rb_lock); /* protect mmu_rb_handlers list */
-
 static unsigned long mmu_node_start(struct mmu_rb_node *);
 static unsigned long mmu_node_last(struct mmu_rb_node *);
-static struct mmu_rb_handler *find_mmu_handler(struct rb_root *);
 static inline void mmu_notifier_page(struct mmu_notifier *, struct mm_struct *,
 				     unsigned long);
 static inline void mmu_notifier_range_start(struct mmu_notifier *,
@@ -96,8 +92,9 @@ static unsigned long mmu_node_last(struct mmu_rb_node *node)
 	return PAGE_ALIGN(node->addr + node->len) - 1;
 }
 
-int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
-			 struct mmu_rb_ops *ops)
+int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
+			 struct mmu_rb_ops *ops,
+			 struct mmu_rb_handler **handler)
 {
 	struct mmu_rb_handler *handlr;
 	int ret;
@@ -106,8 +103,9 @@ int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
 	if (!handlr)
 		return -ENOMEM;
 
-	handlr->root = root;
+	handlr->root = RB_ROOT;
 	handlr->ops = ops;
+	handlr->ops_arg = ops_arg;
 	INIT_HLIST_NODE(&handlr->mn.hlist);
 	spin_lock_init(&handlr->lock);
 	handlr->mn.ops = &mn_opts;
@@ -119,52 +117,38 @@ int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
 		return ret;
 	}
 
-	spin_lock(&mmu_rb_lock);
-	list_add_tail_rcu(&handlr->list, &mmu_rb_handlers);
-	spin_unlock(&mmu_rb_lock);
-
-	return ret;
+	*handler = handlr;
+	return 0;
 }
 
-void hfi1_mmu_rb_unregister(struct rb_root *root)
+void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
 {
-	struct mmu_rb_handler *handler = find_mmu_handler(root);
 	struct mmu_rb_node *rbnode;
 	struct rb_node *node;
 	unsigned long flags;
 
-	if (!handler)
-		return;
-
 	/* Unregister first so we don't get any more notifications. */
 	mmu_notifier_unregister(&handler->mn, handler->mm);
 
-	spin_lock(&mmu_rb_lock);
-	list_del_rcu(&handler->list);
-	spin_unlock(&mmu_rb_lock);
-	synchronize_rcu();
-
 	spin_lock_irqsave(&handler->lock, flags);
-	while ((node = rb_first(root))) {
+	while ((node = rb_first(&handler->root))) {
 		rbnode = rb_entry(node, struct mmu_rb_node, node);
-		rb_erase(node, root);
-		handler->ops->remove(root, rbnode, NULL);
+		rb_erase(node, &handler->root);
+		handler->ops->remove(handler->ops_arg, rbnode,
+				     NULL);
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
 
 	kfree(handler);
 }
 
-int hfi1_mmu_rb_insert(struct rb_root *root, struct mmu_rb_node *mnode)
+int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
+		       struct mmu_rb_node *mnode)
 {
-	struct mmu_rb_handler *handler = find_mmu_handler(root);
 	struct mmu_rb_node *node;
 	unsigned long flags;
 	int ret = 0;
 
-	if (!handler)
-		return -EINVAL;
-
 	spin_lock_irqsave(&handler->lock, flags);
 	hfi1_cdbg(MMU, "Inserting node addr 0x%llx, len %u", mnode->addr,
 		  mnode->len);
@@ -173,11 +157,11 @@ int hfi1_mmu_rb_insert(struct rb_root *root, struct mmu_rb_node *mnode)
 		ret = -EINVAL;
 		goto unlock;
 	}
-	__mmu_int_rb_insert(mnode, root);
+	__mmu_int_rb_insert(mnode, &handler->root);
 
-	ret = handler->ops->insert(root, mnode);
+	ret = handler->ops->insert(handler->ops_arg, mnode);
 	if (ret)
-		__mmu_int_rb_remove(mnode, root);
+		__mmu_int_rb_remove(mnode, &handler->root);
 unlock:
 	spin_unlock_irqrestore(&handler->lock, flags);
 	return ret;
@@ -192,10 +176,10 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 
 	hfi1_cdbg(MMU, "Searching for addr 0x%llx, len %u", addr, len);
 	if (!handler->ops->filter) {
-		node = __mmu_int_rb_iter_first(handler->root, addr,
+		node = __mmu_int_rb_iter_first(&handler->root, addr,
 					       (addr + len) - 1);
 	} else {
-		for (node = __mmu_int_rb_iter_first(handler->root, addr,
+		for (node = __mmu_int_rb_iter_first(&handler->root, addr,
 						    (addr + len) - 1);
 		     node;
 		     node = __mmu_int_rb_iter_next(node, addr,
@@ -207,56 +191,34 @@ static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *handler,
 	return node;
 }
 
-struct mmu_rb_node *hfi1_mmu_rb_extract(struct rb_root *root,
+struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
 					unsigned long addr, unsigned long len)
 {
-	struct mmu_rb_handler *handler = find_mmu_handler(root);
 	struct mmu_rb_node *node;
 	unsigned long flags;
 
-	if (!handler)
-		return ERR_PTR(-EINVAL);
-
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, addr, len);
 	if (node)
-		__mmu_int_rb_remove(node, handler->root);
+		__mmu_int_rb_remove(node, &handler->root);
 	spin_unlock_irqrestore(&handler->lock, flags);
 
 	return node;
 }
 
-void hfi1_mmu_rb_remove(struct rb_root *root, struct mmu_rb_node *node)
+void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
+			struct mmu_rb_node *node)
 {
 	unsigned long flags;
-	struct mmu_rb_handler *handler = find_mmu_handler(root);
-
-	if (!handler || !node)
-		return;
 
 	/* Validity of handler and node pointers has been checked by caller. */
 	hfi1_cdbg(MMU, "Removing node addr 0x%llx, len %u", node->addr,
 		  node->len);
 	spin_lock_irqsave(&handler->lock, flags);
-	__mmu_int_rb_remove(node, handler->root);
+	__mmu_int_rb_remove(node, &handler->root);
 	spin_unlock_irqrestore(&handler->lock, flags);
 
-	handler->ops->remove(handler->root, node, NULL);
-}
-
-static struct mmu_rb_handler *find_mmu_handler(struct rb_root *root)
-{
-	struct mmu_rb_handler *handler;
-
-	rcu_read_lock();
-	list_for_each_entry_rcu(handler, &mmu_rb_handlers, list) {
-		if (handler->root == root)
-			goto unlock;
-	}
-	handler = NULL;
-unlock:
-	rcu_read_unlock();
-	return handler;
+	handler->ops->remove(handler->ops_arg, node, NULL);
 }
 
 static inline void mmu_notifier_page(struct mmu_notifier *mn,
@@ -279,7 +241,7 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 {
 	struct mmu_rb_handler *handler =
 		container_of(mn, struct mmu_rb_handler, mn);
-	struct rb_root *root = handler->root;
+	struct rb_root *root = &handler->root;
 	struct mmu_rb_node *node, *ptr = NULL;
 	unsigned long flags;
 
@@ -290,9 +252,9 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 		ptr = __mmu_int_rb_iter_next(node, start, end - 1);
 		hfi1_cdbg(MMU, "Invalidating node addr 0x%llx, len %u",
 			  node->addr, node->len);
-		if (handler->ops->invalidate(root, node)) {
+		if (handler->ops->invalidate(handler->ops_arg, node)) {
 			__mmu_int_rb_remove(node, root);
-			handler->ops->remove(root, node, mm);
+			handler->ops->remove(handler->ops_arg, node, mm);
 		}
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 489a691856e5..2cedfbe2189e 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -59,18 +59,21 @@ struct mmu_rb_node {
 struct mmu_rb_ops {
 	bool (*filter)(struct mmu_rb_node *node, unsigned long addr,
 		       unsigned long len);
-	int (*insert)(struct rb_root *root, struct mmu_rb_node *mnode);
-	void (*remove)(struct rb_root *root, struct mmu_rb_node *mnode,
+	int (*insert)(void *ops_arg, struct mmu_rb_node *mnode);
+	void (*remove)(void *ops_arg, struct mmu_rb_node *mnode,
 		       struct mm_struct *mm);
-	int (*invalidate)(struct rb_root *root, struct mmu_rb_node *node);
+	int (*invalidate)(void *ops_arg, struct mmu_rb_node *node);
 };
 
-int hfi1_mmu_rb_register(struct mm_struct *mm, struct rb_root *root,
-			 struct mmu_rb_ops *ops);
-void hfi1_mmu_rb_unregister(struct rb_root *);
-int hfi1_mmu_rb_insert(struct rb_root *, struct mmu_rb_node *);
-void hfi1_mmu_rb_remove(struct rb_root *, struct mmu_rb_node *);
-struct mmu_rb_node *hfi1_mmu_rb_extract(struct rb_root *, unsigned long,
-					unsigned long);
+int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
+			 struct mmu_rb_ops *ops,
+			 struct mmu_rb_handler **handler);
+void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler);
+int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
+		       struct mmu_rb_node *mnode);
+void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
+			struct mmu_rb_node *mnode);
+struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
+					unsigned long addr, unsigned long len);
 
 #endif /* _HFI1_MMU_RB_H */
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index a2f7e719dc4d..269a948189e0 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -82,14 +82,14 @@ struct tid_pageset {
 	       ((unsigned long)vaddr & PAGE_MASK)) >> PAGE_SHIFT))
 
 static void unlock_exp_tids(struct hfi1_ctxtdata *, struct exp_tid_set *,
-			    struct rb_root *);
+			    struct hfi1_filedata *);
 static u32 find_phys_blocks(struct page **, unsigned, struct tid_pageset *);
 static int set_rcvarray_entry(struct file *, unsigned long, u32,
 			      struct tid_group *, struct page **, unsigned);
-static int tid_rb_insert(struct rb_root *, struct mmu_rb_node *);
-static void tid_rb_remove(struct rb_root *, struct mmu_rb_node *,
+static int tid_rb_insert(void *, struct mmu_rb_node *);
+static void tid_rb_remove(void *, struct mmu_rb_node *,
 			  struct mm_struct *);
-static int tid_rb_invalidate(struct rb_root *, struct mmu_rb_node *);
+static int tid_rb_invalidate(void *, struct mmu_rb_node *);
 static int program_rcvarray(struct file *, unsigned long, struct tid_group *,
 			    struct tid_pageset *, unsigned, u16, struct page **,
 			    u32 *, unsigned *, unsigned *);
@@ -162,7 +162,6 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 
 	spin_lock_init(&fd->tid_lock);
 	spin_lock_init(&fd->invalid_lock);
-	fd->tid_rb_root = RB_ROOT;
 
 	if (!uctxt->subctxt_cnt || !fd->subctxt) {
 		exp_tid_group_init(&uctxt->tid_group_list);
@@ -211,8 +210,7 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 		 * fails, continue but turn off the TID caching for
 		 * all user contexts.
 		 */
-		ret = hfi1_mmu_rb_register(fd->mm, &fd->tid_rb_root,
-					   &tid_rb_ops);
+		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops, &fd->handler);
 		if (ret) {
 			dd_dev_info(dd,
 				    "Failed MMU notifier registration %d\n",
@@ -263,17 +261,15 @@ int hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 	 * was freed.
 	 */
 	if (!HFI1_CAP_IS_USET(TID_UNMAP))
-		hfi1_mmu_rb_unregister(&fd->tid_rb_root);
+		hfi1_mmu_rb_unregister(fd->handler);
 
 	kfree(fd->invalid_tids);
 
 	if (!uctxt->cnt) {
 		if (!EXP_TID_SET_EMPTY(uctxt->tid_full_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_full_list,
-					&fd->tid_rb_root);
+			unlock_exp_tids(uctxt, &uctxt->tid_full_list, fd);
 		if (!EXP_TID_SET_EMPTY(uctxt->tid_used_list))
-			unlock_exp_tids(uctxt, &uctxt->tid_used_list,
-					&fd->tid_rb_root);
+			unlock_exp_tids(uctxt, &uctxt->tid_used_list, fd);
 		list_for_each_entry_safe(grp, gptr, &uctxt->tid_group_list.list,
 					 list) {
 			list_del_init(&grp->list);
@@ -830,7 +826,6 @@ static int set_rcvarray_entry(struct file *fp, unsigned long vaddr,
 	struct hfi1_ctxtdata *uctxt = fd->uctxt;
 	struct tid_rb_node *node;
 	struct hfi1_devdata *dd = uctxt->dd;
-	struct rb_root *root = &fd->tid_rb_root;
 	dma_addr_t phys;
 
 	/*
@@ -863,9 +858,9 @@ static int set_rcvarray_entry(struct file *fp, unsigned long vaddr,
 	memcpy(node->pages, pages, sizeof(struct page *) * npages);
 
 	if (HFI1_CAP_IS_USET(TID_UNMAP))
-		ret = tid_rb_insert(root, &node->mmu);
+		ret = tid_rb_insert(fd, &node->mmu);
 	else
-		ret = hfi1_mmu_rb_insert(root, &node->mmu);
+		ret = hfi1_mmu_rb_insert(fd->handler, &node->mmu);
 
 	if (ret) {
 		hfi1_cdbg(TID, "Failed to insert RB node %u 0x%lx, 0x%lx %d",
@@ -906,9 +901,9 @@ static int unprogram_rcvarray(struct file *fp, u32 tidinfo,
 	if (!node || node->rcventry != (uctxt->expected_base + rcventry))
 		return -EBADF;
 	if (HFI1_CAP_IS_USET(TID_UNMAP))
-		tid_rb_remove(&fd->tid_rb_root, &node->mmu, fd->mm);
+		tid_rb_remove(fd, &node->mmu, fd->mm);
 	else
-		hfi1_mmu_rb_remove(&fd->tid_rb_root, &node->mmu);
+		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
 
 	if (grp)
 		*grp = node->grp;
@@ -950,11 +945,10 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 }
 
 static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
-			    struct exp_tid_set *set, struct rb_root *root)
+			    struct exp_tid_set *set,
+			    struct hfi1_filedata *fd)
 {
 	struct tid_group *grp, *ptr;
-	struct hfi1_filedata *fd = container_of(root, struct hfi1_filedata,
-						tid_rb_root);
 	int i;
 
 	list_for_each_entry_safe(grp, ptr, &set->list, list) {
@@ -970,10 +964,9 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 				if (!node || node->rcventry != rcventry)
 					continue;
 				if (HFI1_CAP_IS_USET(TID_UNMAP))
-					tid_rb_remove(&fd->tid_rb_root,
-						      &node->mmu, fd->mm);
+					tid_rb_remove(fd, &node->mmu, fd->mm);
 				else
-					hfi1_mmu_rb_remove(&fd->tid_rb_root,
+					hfi1_mmu_rb_remove(fd->handler,
 							   &node->mmu);
 				clear_tid_node(fd, node);
 			}
@@ -981,10 +974,9 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 	}
 }
 
-static int tid_rb_invalidate(struct rb_root *root, struct mmu_rb_node *mnode)
+static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
 {
-	struct hfi1_filedata *fdata =
-		container_of(root, struct hfi1_filedata, tid_rb_root);
+	struct hfi1_filedata *fdata = arg;
 	struct hfi1_ctxtdata *uctxt = fdata->uctxt;
 	struct tid_rb_node *node =
 		container_of(mnode, struct tid_rb_node, mmu);
@@ -1025,10 +1017,9 @@ static int tid_rb_invalidate(struct rb_root *root, struct mmu_rb_node *mnode)
 	return 0;
 }
 
-static int tid_rb_insert(struct rb_root *root, struct mmu_rb_node *node)
+static int tid_rb_insert(void *arg, struct mmu_rb_node *node)
 {
-	struct hfi1_filedata *fdata =
-		container_of(root, struct hfi1_filedata, tid_rb_root);
+	struct hfi1_filedata *fdata = arg;
 	struct tid_rb_node *tnode =
 		container_of(node, struct tid_rb_node, mmu);
 	u32 base = fdata->uctxt->expected_base;
@@ -1037,11 +1028,10 @@ static int tid_rb_insert(struct rb_root *root, struct mmu_rb_node *node)
 	return 0;
 }
 
-static void tid_rb_remove(struct rb_root *root, struct mmu_rb_node *node,
+static void tid_rb_remove(void *arg, struct mmu_rb_node *node,
 			  struct mm_struct *mm)
 {
-	struct hfi1_filedata *fdata =
-		container_of(root, struct hfi1_filedata, tid_rb_root);
+	struct hfi1_filedata *fdata = arg;
 	struct tid_rb_node *tnode =
 		container_of(node, struct tid_rb_node, mmu);
 	u32 base = fdata->uctxt->expected_base;
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 640c244b665b..8be095e1a538 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -305,10 +305,10 @@ static int defer_packet_queue(
 	unsigned seq);
 static void activate_packet_queue(struct iowait *, int);
 static bool sdma_rb_filter(struct mmu_rb_node *, unsigned long, unsigned long);
-static int sdma_rb_insert(struct rb_root *, struct mmu_rb_node *);
-static void sdma_rb_remove(struct rb_root *, struct mmu_rb_node *,
+static int sdma_rb_insert(void *, struct mmu_rb_node *);
+static void sdma_rb_remove(void *, struct mmu_rb_node *,
 			   struct mm_struct *);
-static int sdma_rb_invalidate(struct rb_root *, struct mmu_rb_node *);
+static int sdma_rb_invalidate(void *, struct mmu_rb_node *);
 
 static struct mmu_rb_ops sdma_rb_ops = {
 	.filter = sdma_rb_filter,
@@ -410,7 +410,6 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	pq->state = SDMA_PKT_Q_INACTIVE;
 	atomic_set(&pq->n_reqs, 0);
 	init_waitqueue_head(&pq->wait);
-	pq->sdma_rb_root = RB_ROOT;
 	INIT_LIST_HEAD(&pq->evict);
 	spin_lock_init(&pq->evict_lock);
 	pq->mm = fd->mm;
@@ -443,7 +442,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	cq->nentries = hfi1_sdma_comp_ring_size;
 	fd->cq = cq;
 
-	ret = hfi1_mmu_rb_register(pq->mm, &pq->sdma_rb_root, &sdma_rb_ops);
+	ret = hfi1_mmu_rb_register(pq, pq->mm, &sdma_rb_ops, &pq->handler);
 	if (ret) {
 		dd_dev_err(dd, "Failed to register with MMU %d", ret);
 		goto done;
@@ -481,7 +480,8 @@ int hfi1_user_sdma_free_queues(struct hfi1_filedata *fd)
 		  uctxt->ctxt, fd->subctxt);
 	pq = fd->pq;
 	if (pq) {
-		hfi1_mmu_rb_unregister(&pq->sdma_rb_root);
+		if (pq->handler)
+			hfi1_mmu_rb_unregister(pq->handler);
 		spin_lock_irqsave(&uctxt->sdma_qlock, flags);
 		if (!list_empty(&pq->list))
 			list_del_init(&pq->list);
@@ -1145,7 +1145,7 @@ static u32 sdma_cache_evict(struct hfi1_user_sdma_pkt_q *pq, u32 npages)
 	spin_unlock(&pq->evict_lock);
 
 	list_for_each_entry_safe(node, ptr, &to_evict, list)
-		hfi1_mmu_rb_remove(&pq->sdma_rb_root, &node->rb);
+		hfi1_mmu_rb_remove(pq->handler, &node->rb);
 
 	return cleared;
 }
@@ -1159,7 +1159,7 @@ static int pin_vector_pages(struct user_sdma_request *req,
 	struct sdma_mmu_node *node = NULL;
 	struct mmu_rb_node *rb_node;
 
-	rb_node = hfi1_mmu_rb_extract(&pq->sdma_rb_root,
+	rb_node = hfi1_mmu_rb_extract(pq->handler,
 				      (unsigned long)iovec->iov.iov_base,
 				      iovec->iov.iov_len);
 	if (rb_node && !IS_ERR(rb_node))
@@ -1240,7 +1240,7 @@ retry:
 	iovec->npages = npages;
 	iovec->node = node;
 
-	ret = hfi1_mmu_rb_insert(&req->pq->sdma_rb_root, &node->rb);
+	ret = hfi1_mmu_rb_insert(req->pq->handler, &node->rb);
 	if (ret) {
 		spin_lock(&pq->evict_lock);
 		if (!list_empty(&node->list))
@@ -1612,7 +1612,7 @@ static void user_sdma_free_request(struct user_sdma_request *req, bool unpin)
 				continue;
 
 			if (unpin)
-				hfi1_mmu_rb_remove(&req->pq->sdma_rb_root,
+				hfi1_mmu_rb_remove(req->pq->handler,
 						   &node->rb);
 			else
 				atomic_dec(&node->refcount);
@@ -1642,7 +1642,7 @@ static bool sdma_rb_filter(struct mmu_rb_node *node, unsigned long addr,
 	return (bool)(node->addr == addr);
 }
 
-static int sdma_rb_insert(struct rb_root *root, struct mmu_rb_node *mnode)
+static int sdma_rb_insert(void *arg, struct mmu_rb_node *mnode)
 {
 	struct sdma_mmu_node *node =
 		container_of(mnode, struct sdma_mmu_node, rb);
@@ -1651,7 +1651,7 @@ static int sdma_rb_insert(struct rb_root *root, struct mmu_rb_node *mnode)
 	return 0;
 }
 
-static void sdma_rb_remove(struct rb_root *root, struct mmu_rb_node *mnode,
+static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode,
 			   struct mm_struct *mm)
 {
 	struct sdma_mmu_node *node =
@@ -1692,7 +1692,7 @@ static void sdma_rb_remove(struct rb_root *root, struct mmu_rb_node *mnode,
 	kfree(node);
 }
 
-static int sdma_rb_invalidate(struct rb_root *root, struct mmu_rb_node *mnode)
+static int sdma_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
 {
 	struct sdma_mmu_node *node =
 		container_of(mnode, struct sdma_mmu_node, rb);
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index ff49f74f43f4..bcdc9e8ae1f0 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -68,7 +68,7 @@ struct hfi1_user_sdma_pkt_q {
 	unsigned state;
 	wait_queue_head_t wait;
 	unsigned long unpinned;
-	struct rb_root sdma_rb_root;
+	struct mmu_rb_handler *handler;
 	u32 n_locked;
 	struct list_head evict;
 	spinlock_t evict_lock; /* protect evict and n_locked */
-- 
1.8.2


* [PATCH 10/16] IB/hfi1: Fix TID caching actions
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (8 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 09/16] IB/hfi1: Make the cache handler own its rb tree root ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 11/16] IB/hfi1: Add evict operation to the mmu rb handler ira.weiny-ral2JQCrhuEAvxtiuMwx3w
                     ` (6 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Per-file-descriptor TID caching actions depend on a global capability
flag that can change midway through the lifetime of that file
descriptor.

Make the use of caching consistent for the life of the file descriptor
by using the presence of the cache handler, rather than the global
flag, to decide when to use the cache functions.
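
A minimal sketch of the resulting pattern (the identifiers are the
driver's own; see the user_exp_rcv.c hunks in this patch): every
caching decision keys off the handler pointer that was, or was not,
set when the fd was initialized, so the behavior cannot flip mid-life:

	/* fd->handler is fixed at init time and never changes */
	if (!fd->handler)
		ret = tid_rb_insert(fd, &node->mmu);	/* no caching */
	else
		ret = hfi1_mmu_rb_insert(fd->handler, &node->mmu);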

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/file_ops.c     |  4 ++++
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 16 +++++++---------
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 302f0cdd8119..4f39bffad74a 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -1132,6 +1132,10 @@ static int get_ctxt_info(struct file *fp, void __user *ubase, __u32 len)
 				HFI1_CAP_MISC_MASK) << HFI1_CAP_USER_SHIFT) |
 			HFI1_CAP_UGET_MASK(uctxt->flags, MASK) |
 			HFI1_CAP_KGET_MASK(uctxt->flags, K2U);
+	/* adjust flag if this fd is not able to cache */
+	if (!fd->handler)
+		cinfo.runtime_flags |= HFI1_CAP_TID_UNMAP; /* no caching */
+
 	cinfo.num_active = hfi1_count_active_units();
 	cinfo.unit = uctxt->dd->unit;
 	cinfo.ctxt = uctxt->ctxt;
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 269a948189e0..9b740db34963 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -196,7 +196,7 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 	if (!fd->entry_to_rb)
 		return -ENOMEM;
 
-	if (!HFI1_CAP_IS_USET(TID_UNMAP)) {
+	if (!HFI1_CAP_UGET_MASK(uctxt->flags, TID_UNMAP)) {
 		fd->invalid_tid_idx = 0;
 		fd->invalid_tids = kzalloc(uctxt->expected_count *
 					   sizeof(u32), GFP_KERNEL);
@@ -207,15 +207,13 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 
 		/*
 		 * Register MMU notifier callbacks. If the registration
-		 * fails, continue but turn off the TID caching for
-		 * all user contexts.
+		 * fails, continue without TID caching for this context.
 		 */
 		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops, &fd->handler);
 		if (ret) {
 			dd_dev_info(dd,
 				    "Failed MMU notifier registration %d\n",
 				    ret);
-			HFI1_CAP_USET(TID_UNMAP);
 			ret = 0;
 		}
 	}
@@ -234,7 +232,7 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 	 * init.
 	 */
 	spin_lock(&fd->tid_lock);
-	if (uctxt->subctxt_cnt && !HFI1_CAP_IS_USET(TID_UNMAP)) {
+	if (uctxt->subctxt_cnt && fd->handler) {
 		u16 remainder;
 
 		fd->tid_limit = uctxt->expected_count / uctxt->subctxt_cnt;
@@ -260,7 +258,7 @@ int hfi1_user_exp_rcv_free(struct hfi1_filedata *fd)
 	 * The notifier would have been removed when the process'es mm
 	 * was freed.
 	 */
-	if (!HFI1_CAP_IS_USET(TID_UNMAP))
+	if (fd->handler)
 		hfi1_mmu_rb_unregister(fd->handler);
 
 	kfree(fd->invalid_tids);
@@ -857,7 +855,7 @@ static int set_rcvarray_entry(struct file *fp, unsigned long vaddr,
 	node->freed = false;
 	memcpy(node->pages, pages, sizeof(struct page *) * npages);
 
-	if (HFI1_CAP_IS_USET(TID_UNMAP))
+	if (!fd->handler)
 		ret = tid_rb_insert(fd, &node->mmu);
 	else
 		ret = hfi1_mmu_rb_insert(fd->handler, &node->mmu);
@@ -900,7 +898,7 @@ static int unprogram_rcvarray(struct file *fp, u32 tidinfo,
 	node = fd->entry_to_rb[rcventry];
 	if (!node || node->rcventry != (uctxt->expected_base + rcventry))
 		return -EBADF;
-	if (HFI1_CAP_IS_USET(TID_UNMAP))
+	if (!fd->handler)
 		tid_rb_remove(fd, &node->mmu, fd->mm);
 	else
 		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
@@ -963,7 +961,7 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 							  uctxt->expected_base];
 				if (!node || node->rcventry != rcventry)
 					continue;
-				if (HFI1_CAP_IS_USET(TID_UNMAP))
+				if (!fd->handler)
 					tid_rb_remove(fd, &node->mmu, fd->mm);
 				else
 					hfi1_mmu_rb_remove(fd->handler,
-- 
1.8.2


* [PATCH 11/16] IB/hfi1: Add evict operation to the mmu rb handler
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (9 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 10/16] IB/hfi1: Fix TID caching actions ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 12/16] IB/hfi1: Use evict mmu rb operation ira.weiny-ral2JQCrhuEAvxtiuMwx3w
                     ` (5 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Allow users of the mmu rb handler to clear nodes from the rb tree
based on a new per-node evict callback.
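
A minimal sketch of what a user's evict callback might look like.  The
names my_target, node_busy(), and node_pages() are hypothetical; the
contract (return nonzero to remove the node, set *stop once enough has
been reclaimed, and never sleep, since the rb tree lock is held)
follows the hfi1_mmu_rb_evict() code added below:

	struct my_target {
		u32 cleared;	/* pages reclaimed so far */
		u32 target;	/* pages we want back */
	};

	/* node_busy() and node_pages() are hypothetical helpers */
	static int my_evict(void *ops_arg, struct mmu_rb_node *mnode,
			    void *evict_arg, bool *stop)
	{
		struct my_target *tgt = evict_arg;

		if (node_busy(mnode))
			return 0;		/* keep this node */

		tgt->cleared += node_pages(mnode);
		if (tgt->cleared >= tgt->target)
			*stop = true;		/* stop the walk */
		return 1;			/* remove this node */
	}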

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c | 34 ++++++++++++++++++++++++++++++++++
 drivers/infiniband/hw/hfi1/mmu_rb.h |  4 ++++
 2 files changed, 38 insertions(+)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 9fbcfed4d34c..97f2d3680751 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -206,6 +206,40 @@ struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
 	return node;
 }
 
+void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
+{
+	struct mmu_rb_node *rbnode;
+	struct rb_node *node, *next;
+	struct list_head del_list;
+	unsigned long flags;
+	bool stop = false;
+
+	INIT_LIST_HEAD(&del_list);
+
+	spin_lock_irqsave(&handler->lock, flags);
+	for (node = rb_first(&handler->root); node; node = next) {
+		next = rb_next(node);
+		rbnode = rb_entry(node, struct mmu_rb_node, node);
+		if (handler->ops->evict(handler->ops_arg, rbnode, evict_arg,
+					&stop)) {
+			__mmu_int_rb_remove(rbnode, &handler->root);
+			list_add(&rbnode->list, &del_list);
+		}
+		if (stop)
+			break;
+	}
+	spin_unlock_irqrestore(&handler->lock, flags);
+
+	down_write(&handler->mm->mmap_sem);
+	while (!list_empty(&del_list)) {
+		rbnode = list_first_entry(&del_list, struct mmu_rb_node, list);
+		list_del(&rbnode->list);
+		handler->ops->remove(handler->ops_arg, rbnode,
+				     handler->mm);
+	}
+	up_write(&handler->mm->mmap_sem);
+}
+
 void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 			struct mmu_rb_node *node)
 {
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 2cedfbe2189e..09e5888c0818 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -54,6 +54,7 @@ struct mmu_rb_node {
 	unsigned long len;
 	unsigned long __last;
 	struct rb_node node;
+	struct list_head list;
 };
 
 struct mmu_rb_ops {
@@ -63,6 +64,8 @@ struct mmu_rb_ops {
 	void (*remove)(void *ops_arg, struct mmu_rb_node *mnode,
 		       struct mm_struct *mm);
 	int (*invalidate)(void *ops_arg, struct mmu_rb_node *node);
+	int (*evict)(void *ops_arg, struct mmu_rb_node *mnode,
+		     void *evict_arg, bool *stop);
 };
 
 int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
@@ -71,6 +74,7 @@ int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
 void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler);
 int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 		       struct mmu_rb_node *mnode);
+void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg);
 void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 			struct mmu_rb_node *mnode);
 struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
-- 
1.8.2


* [PATCH 12/16] IB/hfi1: Use evict mmu rb operation
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (10 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 11/16] IB/hfi1: Add evict operation to the mmu rb handler ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 13/16] IB/hfi1: Consistently call ops->remove outside spinlock ira.weiny-ral2JQCrhuEAvxtiuMwx3w
                     ` (4 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

Use the new cache evict operation in the SDMA code.  This allows the
cache to properly coordinate evictions and removals, preventing races
between them.  With this change, the separate eviction list, lock, and
race flag are no longer needed.
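
The caller side collapses to a few lines; this sketch simply mirrors
the new sdma_cache_evict() in the diff below:

	struct evict_data evict_data;

	evict_data.cleared = 0;		/* pages freed so far */
	evict_data.target = npages;	/* pages we need back */
	hfi1_mmu_rb_evict(pq->handler, &evict_data);
	return evict_data.cleared;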

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_sdma.c | 116 +++++++++++++--------------------
 drivers/infiniband/hw/hfi1/user_sdma.h |   4 +-
 2 files changed, 47 insertions(+), 73 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 8be095e1a538..3d76222d1aac 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -183,16 +183,18 @@ struct user_sdma_iovec {
 	struct sdma_mmu_node *node;
 };
 
-#define SDMA_CACHE_NODE_EVICT 0
-
 struct sdma_mmu_node {
 	struct mmu_rb_node rb;
-	struct list_head list;
 	struct hfi1_user_sdma_pkt_q *pq;
 	atomic_t refcount;
 	struct page **pages;
 	unsigned npages;
-	unsigned long flags;
+};
+
+/* evict operation argument */
+struct evict_data {
+	u32 cleared;	/* count evicted so far */
+	u32 target;	/* target count to evict */
 };
 
 struct user_sdma_request {
@@ -306,6 +308,8 @@ static int defer_packet_queue(
 static void activate_packet_queue(struct iowait *, int);
 static bool sdma_rb_filter(struct mmu_rb_node *, unsigned long, unsigned long);
 static int sdma_rb_insert(void *, struct mmu_rb_node *);
+static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
+			 void *arg2, bool *stop);
 static void sdma_rb_remove(void *, struct mmu_rb_node *,
 			   struct mm_struct *);
 static int sdma_rb_invalidate(void *, struct mmu_rb_node *);
@@ -313,6 +317,7 @@ static int sdma_rb_invalidate(void *, struct mmu_rb_node *);
 static struct mmu_rb_ops sdma_rb_ops = {
 	.filter = sdma_rb_filter,
 	.insert = sdma_rb_insert,
+	.evict = sdma_rb_evict,
 	.remove = sdma_rb_remove,
 	.invalidate = sdma_rb_invalidate
 };
@@ -410,8 +415,7 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	pq->state = SDMA_PKT_Q_INACTIVE;
 	atomic_set(&pq->n_reqs, 0);
 	init_waitqueue_head(&pq->wait);
-	INIT_LIST_HEAD(&pq->evict);
-	spin_lock_init(&pq->evict_lock);
+	atomic_set(&pq->n_locked, 0);
 	pq->mm = fd->mm;
 
 	iowait_init(&pq->busy, 0, NULL, defer_packet_queue,
@@ -1126,28 +1130,12 @@ static inline int num_user_pages(const struct iovec *iov)
 
 static u32 sdma_cache_evict(struct hfi1_user_sdma_pkt_q *pq, u32 npages)
 {
-	u32 cleared = 0;
-	struct sdma_mmu_node *node, *ptr;
-	struct list_head to_evict = LIST_HEAD_INIT(to_evict);
-
-	spin_lock(&pq->evict_lock);
-	list_for_each_entry_safe_reverse(node, ptr, &pq->evict, list) {
-		/* Make sure that no one is still using the node. */
-		if (!atomic_read(&node->refcount)) {
-			set_bit(SDMA_CACHE_NODE_EVICT, &node->flags);
-			list_del_init(&node->list);
-			list_add(&node->list, &to_evict);
-			cleared += node->npages;
-			if (cleared >= npages)
-				break;
-		}
-	}
-	spin_unlock(&pq->evict_lock);
+	struct evict_data evict_data;
 
-	list_for_each_entry_safe(node, ptr, &to_evict, list)
-		hfi1_mmu_rb_remove(pq->handler, &node->rb);
-
-	return cleared;
+	evict_data.cleared = 0;
+	evict_data.target = npages;
+	hfi1_mmu_rb_evict(pq->handler, &evict_data);
+	return evict_data.cleared;
 }
 
 static int pin_vector_pages(struct user_sdma_request *req,
@@ -1175,7 +1163,6 @@ static int pin_vector_pages(struct user_sdma_request *req,
 		node->rb.addr = (unsigned long)iovec->iov.iov_base;
 		node->pq = pq;
 		atomic_set(&node->refcount, 0);
-		INIT_LIST_HEAD(&node->list);
 	}
 
 	npages = num_user_pages(&iovec->iov);
@@ -1190,23 +1177,9 @@ static int pin_vector_pages(struct user_sdma_request *req,
 
 		npages -= node->npages;
 
-		/*
-		 * If rb_node is NULL, it means that this is brand new node
-		 * and, therefore not on the eviction list.
-		 * If, however, the rb_node is non-NULL, it means that the
-		 * node is already in RB tree and, therefore on the eviction
-		 * list (nodes are unconditionally inserted in the eviction
-		 * list). In that case, we have to remove the node prior to
-		 * calling the eviction function in order to prevent it from
-		 * freeing this node.
-		 */
-		if (rb_node) {
-			spin_lock(&pq->evict_lock);
-			list_del_init(&node->list);
-			spin_unlock(&pq->evict_lock);
-		}
 retry:
-		if (!hfi1_can_pin_pages(pq->dd, pq->mm, pq->n_locked, npages)) {
+		if (!hfi1_can_pin_pages(pq->dd, pq->mm,
+					atomic_read(&pq->n_locked), npages)) {
 			cleared = sdma_cache_evict(pq, npages);
 			if (cleared >= npages)
 				goto retry;
@@ -1231,10 +1204,7 @@ retry:
 		node->pages = pages;
 		node->npages += pinned;
 		npages = node->npages;
-		spin_lock(&pq->evict_lock);
-		list_add(&node->list, &pq->evict);
-		pq->n_locked += pinned;
-		spin_unlock(&pq->evict_lock);
+		atomic_add(pinned, &pq->n_locked);
 	}
 	iovec->pages = node->pages;
 	iovec->npages = npages;
@@ -1242,11 +1212,7 @@ retry:
 
 	ret = hfi1_mmu_rb_insert(req->pq->handler, &node->rb);
 	if (ret) {
-		spin_lock(&pq->evict_lock);
-		if (!list_empty(&node->list))
-			list_del(&node->list);
-		pq->n_locked -= node->npages;
-		spin_unlock(&pq->evict_lock);
+		atomic_sub(node->npages, &pq->n_locked);
 		iovec->node = NULL;
 		goto bail;
 	}
@@ -1651,29 +1617,39 @@ static int sdma_rb_insert(void *arg, struct mmu_rb_node *mnode)
 	return 0;
 }
 
+/*
+ * Return 1 to remove the node from the rb tree and call the remove op.
+ *
+ * Called with the rb tree lock held.
+ */
+static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
+			 void *evict_arg, bool *stop)
+{
+	struct sdma_mmu_node *node =
+		container_of(mnode, struct sdma_mmu_node, rb);
+	struct evict_data *evict_data = evict_arg;
+
+	/* is this node still being used? */
+	if (atomic_read(&node->refcount))
+		return 0; /* keep this node */
+
+	/* this node will be evicted, add its pages to our count */
+	evict_data->cleared += node->npages;
+
+	/* have enough pages been cleared? */
+	if (evict_data->cleared >= evict_data->target)
+		*stop = true;
+
+	return 1; /* remove this node */
+}
+
 static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode,
 			   struct mm_struct *mm)
 {
 	struct sdma_mmu_node *node =
 		container_of(mnode, struct sdma_mmu_node, rb);
 
-	spin_lock(&node->pq->evict_lock);
-	/*
-	 * We've been called by the MMU notifier but this node has been
-	 * scheduled for eviction. The eviction function will take care
-	 * of freeing this node.
-	 * We have to take the above lock first because we are racing
-	 * against the setting of the bit in the eviction function.
-	 */
-	if (mm && test_bit(SDMA_CACHE_NODE_EVICT, &node->flags)) {
-		spin_unlock(&node->pq->evict_lock);
-		return;
-	}
-
-	if (!list_empty(&node->list))
-		list_del(&node->list);
-	node->pq->n_locked -= node->npages;
-	spin_unlock(&node->pq->evict_lock);
+	atomic_sub(node->npages, &node->pq->n_locked);
 
 	/*
 	 * If mm is set, we are being called by the MMU notifier and we
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.h b/drivers/infiniband/hw/hfi1/user_sdma.h
index bcdc9e8ae1f0..39001714f551 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.h
+++ b/drivers/infiniband/hw/hfi1/user_sdma.h
@@ -69,9 +69,7 @@ struct hfi1_user_sdma_pkt_q {
 	wait_queue_head_t wait;
 	unsigned long unpinned;
 	struct mmu_rb_handler *handler;
-	u32 n_locked;
-	struct list_head evict;
-	spinlock_t evict_lock; /* protect evict and n_locked */
+	atomic_t n_locked;
 	struct mm_struct *mm;
 };
 
-- 
1.8.2


* [PATCH 13/16] IB/hfi1: Consistently call ops->remove outside spinlock
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (11 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 12/16] IB/hfi1: Use evict mmu rb operation ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 14/16] IB/hfi1: Remove unneeded mm argument in remove function ira.weiny-ral2JQCrhuEAvxtiuMwx3w
                     ` (3 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The ops->remove() callback was called by hfi1_mmu_rb_unregister() with
a NULL mm argument while holding a spinlock.  In the case of
sdma_rb_remove() this caused it to pass current->mm to
hfi1_release_user_pages().

This had two problems.  First, it would attempt to acquire the mmap_sem
under a spin lock.  Second, current->mm is not guaranteed to be the
proper mm when the fd is being closed.

Rather than depend on this implicit behavior, we move all calls to
ops->remove outside of the spinlock.  This also allows the correct
mm to be used in the remove callback without fear of deadlock.

Because the MMU notifier usually, but not always, holds mm->mmap_sem,
we must delay all remove callbacks until we are out of the notifier,
where the callbacks can take the mmap_sem if they need to.

Code comments were added to clarify the expectations for users of the
mmu rb tree.
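
A condensed sketch of the deferral pattern (simplified from the
mmu_rb.c hunks below): the notifier only queues nodes while holding
the spinlock, and the workqueue later calls ops->remove() with no
locks held, so the callback may take mmap_sem if it needs to.

	/* under handler->lock in the invalidate notifier: queue only */
	list_add(&node->list, &handler->del_list);

	/* after dropping the lock: kick the workqueue */
	queue_work(handler->wq, &handler->del_work);

	/* in the work item, with no locks held: remove may now sleep */
	handler->ops->remove(handler->ops_arg, node, handler->mm);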

Suggested-by: Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org>
Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c       | 76 +++++++++++++++++++++++++++++--
 drivers/infiniband/hw/hfi1/mmu_rb.h       |  5 ++
 drivers/infiniband/hw/hfi1/user_exp_rcv.c |  4 +-
 drivers/infiniband/hw/hfi1/user_sdma.c    | 22 ++-------
 4 files changed, 84 insertions(+), 23 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 97f2d3680751..76f7b0403207 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -59,6 +59,9 @@ struct mmu_rb_handler {
 	spinlock_t lock;        /* protect the RB tree */
 	struct mmu_rb_ops *ops;
 	struct mm_struct *mm;
+	struct work_struct del_work;
+	struct list_head del_list;
+	struct workqueue_struct *wq;
 };
 
 static unsigned long mmu_node_start(struct mmu_rb_node *);
@@ -73,6 +76,9 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *,
 					unsigned long, unsigned long);
 static struct mmu_rb_node *__mmu_rb_search(struct mmu_rb_handler *,
 					   unsigned long, unsigned long);
+static void do_remove(struct mmu_rb_handler *handler,
+		      struct list_head *del_list);
+static void handle_remove(struct work_struct *work);
 
 static struct mmu_notifier_ops mn_opts = {
 	.invalidate_page = mmu_notifier_page,
@@ -94,6 +100,7 @@ static unsigned long mmu_node_last(struct mmu_rb_node *node)
 
 int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
 			 struct mmu_rb_ops *ops,
+			 struct workqueue_struct *wq,
 			 struct mmu_rb_handler **handler)
 {
 	struct mmu_rb_handler *handlr;
@@ -110,6 +117,9 @@ int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
 	spin_lock_init(&handlr->lock);
 	handlr->mn.ops = &mn_opts;
 	handlr->mm = mm;
+	INIT_WORK(&handlr->del_work, handle_remove);
+	INIT_LIST_HEAD(&handlr->del_list);
+	handlr->wq = wq;
 
 	ret = mmu_notifier_register(&handlr->mn, handlr->mm);
 	if (ret) {
@@ -126,19 +136,29 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
 	struct mmu_rb_node *rbnode;
 	struct rb_node *node;
 	unsigned long flags;
+	struct list_head del_list;
 
 	/* Unregister first so we don't get any more notifications. */
 	mmu_notifier_unregister(&handler->mn, handler->mm);
 
+	/*
+	 * Make sure the wq delete handler is finished running.  It will not
+	 * be triggered once the mmu notifiers are unregistered above.
+	 */
+	flush_work(&handler->del_work);
+
+	INIT_LIST_HEAD(&del_list);
+
 	spin_lock_irqsave(&handler->lock, flags);
 	while ((node = rb_first(&handler->root))) {
 		rbnode = rb_entry(node, struct mmu_rb_node, node);
 		rb_erase(node, &handler->root);
-		handler->ops->remove(handler->ops_arg, rbnode,
-				     NULL);
+		list_add(&rbnode->list, &del_list);
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
 
+	do_remove(handler, &del_list);
+
 	kfree(handler);
 }
 
@@ -230,16 +250,19 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
 
-	down_write(&handler->mm->mmap_sem);
 	while (!list_empty(&del_list)) {
 		rbnode = list_first_entry(&del_list, struct mmu_rb_node, list);
 		list_del(&rbnode->list);
 		handler->ops->remove(handler->ops_arg, rbnode,
 				     handler->mm);
 	}
-	up_write(&handler->mm->mmap_sem);
 }
 
+/*
+ * It is up to the caller to ensure that this function does not race with the
+ * mmu invalidate notifier which may be calling the users remove callback on
+ * 'node'.
+ */
 void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 			struct mmu_rb_node *node)
 {
@@ -278,6 +301,7 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 	struct rb_root *root = &handler->root;
 	struct mmu_rb_node *node, *ptr = NULL;
 	unsigned long flags;
+	bool added = false;
 
 	spin_lock_irqsave(&handler->lock, flags);
 	for (node = __mmu_int_rb_iter_first(root, start, end - 1);
@@ -288,8 +312,50 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 			  node->addr, node->len);
 		if (handler->ops->invalidate(handler->ops_arg, node)) {
 			__mmu_int_rb_remove(node, root);
-			handler->ops->remove(handler->ops_arg, node, mm);
+			list_add(&node->list, &handler->del_list);
+			added = true;
 		}
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
+
+	if (added)
+		queue_work(handler->wq, &handler->del_work);
+}
+
+/*
+ * Call the remove function for the given handler and the list.  This
+ * is expected to be called with a delete list extracted from handler.
+ * The caller should not be holding the handler lock.
+ */
+static void do_remove(struct mmu_rb_handler *handler,
+		      struct list_head *del_list)
+{
+	struct mmu_rb_node *node;
+
+	while (!list_empty(del_list)) {
+		node = list_first_entry(del_list, struct mmu_rb_node, list);
+		list_del(&node->list);
+		handler->ops->remove(handler->ops_arg, node, handler->mm);
+	}
+}
+
+/*
+ * Work queue function to remove all nodes that have been queued up to
+ * be removed.  The key feature is that mm->mmap_sem is not being held
+ * and the remove callback can sleep while taking it, if needed.
+ */
+static void handle_remove(struct work_struct *work)
+{
+	struct mmu_rb_handler *handler = container_of(work,
+						struct mmu_rb_handler,
+						del_work);
+	struct list_head del_list;
+	unsigned long flags;
+
+	/* remove anything that is queued to get removed */
+	spin_lock_irqsave(&handler->lock, flags);
+	list_replace_init(&handler->del_list, &del_list);
+	spin_unlock_irqrestore(&handler->lock, flags);
+
+	do_remove(handler, &del_list);
 }
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index 09e5888c0818..e4f853fa91e6 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -57,6 +57,10 @@ struct mmu_rb_node {
 	struct list_head list;
 };
 
+/*
+ * NOTE: filter, insert, invalidate, and evict must not sleep.  Only remove is
+ * allowed to sleep.
+ */
 struct mmu_rb_ops {
 	bool (*filter)(struct mmu_rb_node *node, unsigned long addr,
 		       unsigned long len);
@@ -70,6 +74,7 @@ struct mmu_rb_ops {
 
 int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
 			 struct mmu_rb_ops *ops,
+			 struct workqueue_struct *wq,
 			 struct mmu_rb_handler **handler);
 void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler);
 int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 9b740db34963..3bcda22e7b87 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -209,7 +209,9 @@ int hfi1_user_exp_rcv_init(struct file *fp)
 		 * Register MMU notifier callbacks. If the registration
 		 * fails, continue without TID caching for this context.
 		 */
-		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops, &fd->handler);
+		ret = hfi1_mmu_rb_register(fd, fd->mm, &tid_rb_ops,
+					   dd->pport->hfi1_wq,
+					   &fd->handler);
 		if (ret) {
 			dd_dev_info(dd,
 				    "Failed MMU notifier registration %d\n",
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 3d76222d1aac..751aa2260c1c 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -310,8 +310,7 @@ static bool sdma_rb_filter(struct mmu_rb_node *, unsigned long, unsigned long);
 static int sdma_rb_insert(void *, struct mmu_rb_node *);
 static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
 			 void *arg2, bool *stop);
-static void sdma_rb_remove(void *, struct mmu_rb_node *,
-			   struct mm_struct *);
+static void sdma_rb_remove(void *, struct mmu_rb_node *, struct mm_struct *);
 static int sdma_rb_invalidate(void *, struct mmu_rb_node *);
 
 static struct mmu_rb_ops sdma_rb_ops = {
@@ -446,7 +445,8 @@ int hfi1_user_sdma_alloc_queues(struct hfi1_ctxtdata *uctxt, struct file *fp)
 	cq->nentries = hfi1_sdma_comp_ring_size;
 	fd->cq = cq;
 
-	ret = hfi1_mmu_rb_register(pq, pq->mm, &sdma_rb_ops, &pq->handler);
+	ret = hfi1_mmu_rb_register(pq, pq->mm, &sdma_rb_ops, dd->pport->hfi1_wq,
+				   &pq->handler);
 	if (ret) {
 		dd_dev_err(dd, "Failed to register with MMU %d", ret);
 		goto done;
@@ -1651,20 +1651,8 @@ static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode,
 
 	atomic_sub(node->npages, &node->pq->n_locked);
 
-	/*
-	 * If mm is set, we are being called by the MMU notifier and we
-	 * should not pass a mm_struct to unpin_vector_page(). This is to
-	 * prevent a deadlock when hfi1_release_user_pages() attempts to
-	 * take the mmap_sem, which the MMU notifier has already taken.
-	 */
-	unpin_vector_pages(mm ? NULL : current->mm, node->pages, 0,
-			   node->npages);
-	/*
-	 * If called by the MMU notifier, we have to adjust the pinned
-	 * page count ourselves.
-	 */
-	if (mm)
-		mm->pinned_vm -= node->npages;
+	unpin_vector_pages(node->pq->mm, node->pages, 0, node->npages);
+
 	kfree(node);
 }
 
-- 
1.8.2


* [PATCH 14/16] IB/hfi1: Remove unneeded mm argument in remove function
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (12 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 13/16] IB/hfi1: Consistently call ops->remove outside spinlock ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 15/16] IB/hfi1: Fix memory leak during unexpected shutdown ira.weiny-ral2JQCrhuEAvxtiuMwx3w
                     ` (2 subsequent siblings)
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The reworked mmu_rb interface allows the unused mm argument to be removed.

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c       |  7 +++----
 drivers/infiniband/hw/hfi1/mmu_rb.h       |  3 +--
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 10 ++++------
 drivers/infiniband/hw/hfi1/user_sdma.c    |  5 ++---
 4 files changed, 10 insertions(+), 15 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index 76f7b0403207..a02344a9d746 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -253,8 +253,7 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	while (!list_empty(&del_list)) {
 		rbnode = list_first_entry(&del_list, struct mmu_rb_node, list);
 		list_del(&rbnode->list);
-		handler->ops->remove(handler->ops_arg, rbnode,
-				     handler->mm);
+		handler->ops->remove(handler->ops_arg, rbnode);
 	}
 }
 
@@ -275,7 +274,7 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 	__mmu_int_rb_remove(node, &handler->root);
 	spin_unlock_irqrestore(&handler->lock, flags);
 
-	handler->ops->remove(handler->ops_arg, node, NULL);
+	handler->ops->remove(handler->ops_arg, node);
 }
 
 static inline void mmu_notifier_page(struct mmu_notifier *mn,
@@ -335,7 +334,7 @@ static void do_remove(struct mmu_rb_handler *handler,
 	while (!list_empty(del_list)) {
 		node = list_first_entry(del_list, struct mmu_rb_node, list);
 		list_del(&node->list);
-		handler->ops->remove(handler->ops_arg, node, handler->mm);
+		handler->ops->remove(handler->ops_arg, node);
 	}
 }
 
diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.h b/drivers/infiniband/hw/hfi1/mmu_rb.h
index e4f853fa91e6..754f6ebf13fb 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.h
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.h
@@ -65,8 +65,7 @@ struct mmu_rb_ops {
 	bool (*filter)(struct mmu_rb_node *node, unsigned long addr,
 		       unsigned long len);
 	int (*insert)(void *ops_arg, struct mmu_rb_node *mnode);
-	void (*remove)(void *ops_arg, struct mmu_rb_node *mnode,
-		       struct mm_struct *mm);
+	void (*remove)(void *ops_arg, struct mmu_rb_node *mnode);
 	int (*invalidate)(void *ops_arg, struct mmu_rb_node *node);
 	int (*evict)(void *ops_arg, struct mmu_rb_node *mnode,
 		     void *evict_arg, bool *stop);
diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 3bcda22e7b87..8717e11fe3f5 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -87,8 +87,7 @@ static u32 find_phys_blocks(struct page **, unsigned, struct tid_pageset *);
 static int set_rcvarray_entry(struct file *, unsigned long, u32,
 			      struct tid_group *, struct page **, unsigned);
 static int tid_rb_insert(void *, struct mmu_rb_node *);
-static void tid_rb_remove(void *, struct mmu_rb_node *,
-			  struct mm_struct *);
+static void tid_rb_remove(void *, struct mmu_rb_node *);
 static int tid_rb_invalidate(void *, struct mmu_rb_node *);
 static int program_rcvarray(struct file *, unsigned long, struct tid_group *,
 			    struct tid_pageset *, unsigned, u16, struct page **,
@@ -901,7 +900,7 @@ static int unprogram_rcvarray(struct file *fp, u32 tidinfo,
 	if (!node || node->rcventry != (uctxt->expected_base + rcventry))
 		return -EBADF;
 	if (!fd->handler)
-		tid_rb_remove(fd, &node->mmu, fd->mm);
+		tid_rb_remove(fd, &node->mmu);
 	else
 		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
 
@@ -964,7 +963,7 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 				if (!node || node->rcventry != rcventry)
 					continue;
 				if (!fd->handler)
-					tid_rb_remove(fd, &node->mmu, fd->mm);
+					tid_rb_remove(fd, &node->mmu);
 				else
 					hfi1_mmu_rb_remove(fd->handler,
 							   &node->mmu);
@@ -1028,8 +1027,7 @@ static int tid_rb_insert(void *arg, struct mmu_rb_node *node)
 	return 0;
 }
 
-static void tid_rb_remove(void *arg, struct mmu_rb_node *node,
-			  struct mm_struct *mm)
+static void tid_rb_remove(void *arg, struct mmu_rb_node *node)
 {
 	struct hfi1_filedata *fdata = arg;
 	struct tid_rb_node *tnode =
diff --git a/drivers/infiniband/hw/hfi1/user_sdma.c b/drivers/infiniband/hw/hfi1/user_sdma.c
index 751aa2260c1c..0ecf27903dc2 100644
--- a/drivers/infiniband/hw/hfi1/user_sdma.c
+++ b/drivers/infiniband/hw/hfi1/user_sdma.c
@@ -310,7 +310,7 @@ static bool sdma_rb_filter(struct mmu_rb_node *, unsigned long, unsigned long);
 static int sdma_rb_insert(void *, struct mmu_rb_node *);
 static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
 			 void *arg2, bool *stop);
-static void sdma_rb_remove(void *, struct mmu_rb_node *, struct mm_struct *);
+static void sdma_rb_remove(void *, struct mmu_rb_node *);
 static int sdma_rb_invalidate(void *, struct mmu_rb_node *);
 
 static struct mmu_rb_ops sdma_rb_ops = {
@@ -1643,8 +1643,7 @@ static int sdma_rb_evict(void *arg, struct mmu_rb_node *mnode,
 	return 1; /* remove this node */
 }
 
-static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode,
-			   struct mm_struct *mm)
+static void sdma_rb_remove(void *arg, struct mmu_rb_node *mnode)
 {
 	struct sdma_mmu_node *node =
 		container_of(mnode, struct sdma_mmu_node, rb);
-- 
1.8.2


* [PATCH 15/16] IB/hfi1: Fix memory leak during unexpected shutdown
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (13 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 14/16] IB/hfi1: Remove unneeded mm argument in remove function ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-07-28 19:21   ` [PATCH 16/16] IB/hfi1: Add cache evict LRU list ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-08-03  3:04   ` [PATCH 00/16] Fix SDMA/TID caching code Doug Ledford
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Ira Weiny

From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

During an unexpected shutdown, references to tid_rb_node were NULL'ed
out without the nodes themselves being released, leaking the memory
they held.

Fix this by calling clear_tid_node from the mmu notifier remove
callback rather than after the callbacks have run.
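
The shape of the fix, condensed from the diff below: every removal
path now funnels through one helper that both drops the entry_to_rb
reference and releases the node, so neither step can be skipped:

	static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
					    struct tid_rb_node *tnode)
	{
		u32 base = fdata->uctxt->expected_base;

		fdata->entry_to_rb[tnode->rcventry - base] = NULL;
		clear_tid_node(fdata, tnode);	/* unpin and free */
	}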

Reviewed-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/user_exp_rcv.c | 44 ++++++++++++++++++++++---------
 1 file changed, 31 insertions(+), 13 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/user_exp_rcv.c b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
index 8717e11fe3f5..64d26525435a 100644
--- a/drivers/infiniband/hw/hfi1/user_exp_rcv.c
+++ b/drivers/infiniband/hw/hfi1/user_exp_rcv.c
@@ -87,13 +87,15 @@ static u32 find_phys_blocks(struct page **, unsigned, struct tid_pageset *);
 static int set_rcvarray_entry(struct file *, unsigned long, u32,
 			      struct tid_group *, struct page **, unsigned);
 static int tid_rb_insert(void *, struct mmu_rb_node *);
+static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
+				    struct tid_rb_node *tnode);
 static void tid_rb_remove(void *, struct mmu_rb_node *);
 static int tid_rb_invalidate(void *, struct mmu_rb_node *);
 static int program_rcvarray(struct file *, unsigned long, struct tid_group *,
 			    struct tid_pageset *, unsigned, u16, struct page **,
 			    u32 *, unsigned *, unsigned *);
 static int unprogram_rcvarray(struct file *, u32, struct tid_group **);
-static void clear_tid_node(struct hfi1_filedata *, struct tid_rb_node *);
+static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node);
 
 static struct mmu_rb_ops tid_rb_ops = {
 	.insert = tid_rb_insert,
@@ -899,14 +901,15 @@ static int unprogram_rcvarray(struct file *fp, u32 tidinfo,
 	node = fd->entry_to_rb[rcventry];
 	if (!node || node->rcventry != (uctxt->expected_base + rcventry))
 		return -EBADF;
+
+	if (grp)
+		*grp = node->grp;
+
 	if (!fd->handler)
-		tid_rb_remove(fd, &node->mmu);
+		cacheless_tid_rb_remove(fd, node);
 	else
 		hfi1_mmu_rb_remove(fd->handler, &node->mmu);
 
-	if (grp)
-		*grp = node->grp;
-	clear_tid_node(fd, node);
 	return 0;
 }
 
@@ -943,6 +946,10 @@ static void clear_tid_node(struct hfi1_filedata *fd, struct tid_rb_node *node)
 	kfree(node);
 }
 
+/*
+ * As a simple helper for hfi1_user_exp_rcv_free, this function deals with
+ * clearing nodes in the non-cached case.
+ */
 static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 			    struct exp_tid_set *set,
 			    struct hfi1_filedata *fd)
@@ -962,17 +969,20 @@ static void unlock_exp_tids(struct hfi1_ctxtdata *uctxt,
 							  uctxt->expected_base];
 				if (!node || node->rcventry != rcventry)
 					continue;
-				if (!fd->handler)
-					tid_rb_remove(fd, &node->mmu);
-				else
-					hfi1_mmu_rb_remove(fd->handler,
-							   &node->mmu);
-				clear_tid_node(fd, node);
+
+				cacheless_tid_rb_remove(fd, node);
 			}
 		}
 	}
 }
 
+/*
+ * Always return 0 from this function.  A non-zero return indicates that the
+ * remove operation will be called and that memory should be unpinned.
+ * However, the driver cannot unpin out from under PSM.  Instead, retain the
+ * memory (by returning 0) and inform PSM that the memory is going away.  PSM
+ * will call back later when it has removed the memory from its list.
+ */
 static int tid_rb_invalidate(void *arg, struct mmu_rb_node *mnode)
 {
 	struct hfi1_filedata *fdata = arg;
@@ -1027,12 +1037,20 @@ static int tid_rb_insert(void *arg, struct mmu_rb_node *node)
 	return 0;
 }
 
+static void cacheless_tid_rb_remove(struct hfi1_filedata *fdata,
+				    struct tid_rb_node *tnode)
+{
+	u32 base = fdata->uctxt->expected_base;
+
+	fdata->entry_to_rb[tnode->rcventry - base] = NULL;
+	clear_tid_node(fdata, tnode);
+}
+
 static void tid_rb_remove(void *arg, struct mmu_rb_node *node)
 {
 	struct hfi1_filedata *fdata = arg;
 	struct tid_rb_node *tnode =
 		container_of(node, struct tid_rb_node, mmu);
-	u32 base = fdata->uctxt->expected_base;
 
-	fdata->entry_to_rb[tnode->rcventry - base] = NULL;
+	cacheless_tid_rb_remove(fdata, tnode);
 }
-- 
1.8.2


* [PATCH 16/16] IB/hfi1: Add cache evict LRU list
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (14 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 15/16] IB/hfi1: Fix memory leak during unexpected shutdown ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-07-28 19:21   ` ira.weiny-ral2JQCrhuEAvxtiuMwx3w
  2016-08-03  3:04   ` [PATCH 00/16] Fix SDMA/TID caching code Doug Ledford
  16 siblings, 0 replies; 18+ messages in thread
From: ira.weiny-ral2JQCrhuEAvxtiuMwx3w @ 2016-07-28 19:21 UTC (permalink / raw)
  To: dledford-H+wXaHxf7aLQT0dZR+AlfA
  Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA, Dean Luick

From: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>

The original code used an LRU list to evict the least recently used
nodes.  For correctness, the evict code was moved under the
handler->lock; now add the LRU list back.
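
A condensed sketch of the LRU discipline restored here (taken from the
mmu_rb.c hunks below): inserts go on the head of lru_list, eviction
walks from the tail so the oldest nodes are considered first, and
list_move() keeps a node from sitting on both the LRU and delete lists:

	/* insert: newest node at the head of the LRU list */
	list_add(&mnode->list, &handler->lru_list);

	/* evict: walk from the tail, oldest first */
	list_for_each_entry_safe_reverse(rbnode, ptr,
					 &handler->lru_list, list) {
		if (handler->ops->evict(handler->ops_arg, rbnode,
					evict_arg, &stop)) {
			__mmu_int_rb_remove(rbnode, &handler->root);
			list_move(&rbnode->list, &del_list);
		}
		if (stop)
			break;
	}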

Reviewed-by: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
Signed-off-by: Dean Luick <dean.luick-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 drivers/infiniband/hw/hfi1/mmu_rb.c | 29 +++++++++++++++++++----------
 1 file changed, 19 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/hw/hfi1/mmu_rb.c b/drivers/infiniband/hw/hfi1/mmu_rb.c
index a02344a9d746..7ad30898fc19 100644
--- a/drivers/infiniband/hw/hfi1/mmu_rb.c
+++ b/drivers/infiniband/hw/hfi1/mmu_rb.c
@@ -59,6 +59,7 @@ struct mmu_rb_handler {
 	spinlock_t lock;        /* protect the RB tree */
 	struct mmu_rb_ops *ops;
 	struct mm_struct *mm;
+	struct list_head lru_list;
 	struct work_struct del_work;
 	struct list_head del_list;
 	struct workqueue_struct *wq;
@@ -119,6 +120,7 @@ int hfi1_mmu_rb_register(void *ops_arg, struct mm_struct *mm,
 	handlr->mm = mm;
 	INIT_WORK(&handlr->del_work, handle_remove);
 	INIT_LIST_HEAD(&handlr->del_list);
+	INIT_LIST_HEAD(&handlr->lru_list);
 	handlr->wq = wq;
 
 	ret = mmu_notifier_register(&handlr->mn, handlr->mm);
@@ -153,7 +155,8 @@ void hfi1_mmu_rb_unregister(struct mmu_rb_handler *handler)
 	while ((node = rb_first(&handler->root))) {
 		rbnode = rb_entry(node, struct mmu_rb_node, node);
 		rb_erase(node, &handler->root);
-		list_add(&rbnode->list, &del_list);
+		/* move from LRU list to delete list */
+		list_move(&rbnode->list, &del_list);
 	}
 	spin_unlock_irqrestore(&handler->lock, flags);
 
@@ -178,10 +181,13 @@ int hfi1_mmu_rb_insert(struct mmu_rb_handler *handler,
 		goto unlock;
 	}
 	__mmu_int_rb_insert(mnode, &handler->root);
+	list_add(&mnode->list, &handler->lru_list);
 
 	ret = handler->ops->insert(handler->ops_arg, mnode);
-	if (ret)
+	if (ret) {
 		__mmu_int_rb_remove(mnode, &handler->root);
+		list_del(&mnode->list); /* remove from LRU list */
+	}
 unlock:
 	spin_unlock_irqrestore(&handler->lock, flags);
 	return ret;
@@ -219,8 +225,10 @@ struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
 
 	spin_lock_irqsave(&handler->lock, flags);
 	node = __mmu_rb_search(handler, addr, len);
-	if (node)
+	if (node) {
 		__mmu_int_rb_remove(node, &handler->root);
+		list_del(&node->list); /* remove from LRU list */
+	}
 	spin_unlock_irqrestore(&handler->lock, flags);
 
 	return node;
@@ -228,8 +236,7 @@ struct mmu_rb_node *hfi1_mmu_rb_extract(struct mmu_rb_handler *handler,
 
 void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 {
-	struct mmu_rb_node *rbnode;
-	struct rb_node *node, *next;
+	struct mmu_rb_node *rbnode, *ptr;
 	struct list_head del_list;
 	unsigned long flags;
 	bool stop = false;
@@ -237,13 +244,13 @@ void hfi1_mmu_rb_evict(struct mmu_rb_handler *handler, void *evict_arg)
 	INIT_LIST_HEAD(&del_list);
 
 	spin_lock_irqsave(&handler->lock, flags);
-	for (node = rb_first(&handler->root); node; node = next) {
-		next = rb_next(node);
-		rbnode = rb_entry(node, struct mmu_rb_node, node);
+	list_for_each_entry_safe_reverse(rbnode, ptr, &handler->lru_list,
+					 list) {
 		if (handler->ops->evict(handler->ops_arg, rbnode, evict_arg,
 					&stop)) {
 			__mmu_int_rb_remove(rbnode, &handler->root);
-			list_add(&rbnode->list, &del_list);
+			/* move from LRU list to delete list */
+			list_move(&rbnode->list, &del_list);
 		}
 		if (stop)
 			break;
@@ -272,6 +279,7 @@ void hfi1_mmu_rb_remove(struct mmu_rb_handler *handler,
 		  node->len);
 	spin_lock_irqsave(&handler->lock, flags);
 	__mmu_int_rb_remove(node, &handler->root);
+	list_del(&node->list); /* remove from LRU list */
 	spin_unlock_irqrestore(&handler->lock, flags);
 
 	handler->ops->remove(handler->ops_arg, node);
@@ -311,7 +319,8 @@ static void mmu_notifier_mem_invalidate(struct mmu_notifier *mn,
 			  node->addr, node->len);
 		if (handler->ops->invalidate(handler->ops_arg, node)) {
 			__mmu_int_rb_remove(node, root);
-			list_add(&node->list, &handler->del_list);
+			/* move from LRU list to delete list */
+			list_move(&node->list, &handler->del_list);
 			added = true;
 		}
 	}
-- 
1.8.2


* Re: [PATCH 00/16] Fix SDMA/TID caching code
       [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                     ` (15 preceding siblings ...)
  2016-07-28 19:21   ` [PATCH 16/16] IB/hfi1: Add cache evict LRU list ira.weiny-ral2JQCrhuEAvxtiuMwx3w
@ 2016-08-03  3:04   ` Doug Ledford
  16 siblings, 0 replies; 18+ messages in thread
From: Doug Ledford @ 2016-08-03  3:04 UTC (permalink / raw)
  To: ira.weiny-ral2JQCrhuEAvxtiuMwx3w; +Cc: linux-rdma-u79uwXL29TY76Z2rM5mHXA


On Thu, 2016-07-28 at 15:21 -0400, ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org wrote:
> From: Ira Weiny <ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
> 
> This series fixes a number of bugs found while debugging the use of
> the
> mm_struct "current->mm".  When PSM jobs were stopped unexpectedly
> (for example
> control-c) it was possible for the wrong mm struct to be used in
> mmu_notifier_unregister.  This was causing memory corruption.  In
> addition, a
> number of might sleep conditions were found when mmap_sem needed to
> be taken
> while a spinlock was held.
> 
> The following are stand alone fixes which can probably be reordered
> if
> necessary:
> 
> 	IB/hfi1: Prevent null pointer dereference
> 	IB/hfi1: Use the same capability state for all shared contexts
> 	IB/hfi1: Validate SDMA user request index
> 	IB/hfi1: Validate SDMA user iovector count
> 	IB/hfi1: Release node on insert failure
> 	IB/hfi1: Fix error condition that needs to clean up
> 	IB/hfi1: Fix user SDMA racy user request claim
> 
> The use of mm and locking issues are fixed with the following commits
> (they
> depend on the previous commits and must be ordered among themselves):
> 
> 	IB/hfi1: Make use of mm consistent
> 	IB/hfi1: Make the cache handler own its rb tree root
> 	IB/hfi1: Fix TID caching actions
> 	IB/hfi1: Add evict operation to the mmu rb handler
> 	IB/hfi1: Use evict mmu rb operation
> 	IB/hfi1: Consistently call ops->remove outside spinlock
> 	IB/hfi1: Remove unneeded mm argument in remove function
> 	IB/hfi1: Fix memory leak during unexpected shutdown
> 	IB/hfi1: Add cache evict LRU list
> 
> Special thanks to Jim Foraker <foraker1-i2BcT+NCU+M@public.gmane.org> for suggestions on
> how to fix
> the main issues.
> 
> This series requires the clean up patch series sent previously.

Thanks Ira, series applied.

-- 
Doug Ledford <dledford-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
              GPG KeyID: 0E572FDD



end of thread, other threads:[~2016-08-03  3:04 UTC | newest]

Thread overview: 18+ messages
2016-07-28 19:21 [PATCH 00/16] Fix SDMA/TID caching code ira.weiny-ral2JQCrhuEAvxtiuMwx3w
     [not found] ` <1469733687-31738-1-git-send-email-ira.weiny-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2016-07-28 19:21   ` [PATCH 01/16] IB/hfi1: Prevent null pointer dereference ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 02/16] IB/hfi1: Use the same capability state for all shared contexts ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 03/16] IB/hfi1: Validate SDMA user request index ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 04/16] IB/hfi1: Validate SDMA user iovector count ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 05/16] IB/hfi1: Release node on insert failure ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 06/16] IB/hfi1: Fix error condition that needs to clean up ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 07/16] IB/hfi1: Fix user SDMA racy user request claim ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 08/16] IB/hfi1: Make use of mm consistent ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 09/16] IB/hfi1: Make the cache handler own its rb tree root ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 10/16] IB/hfi1: Fix TID caching actions ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 11/16] IB/hfi1: Add evict operation to the mmu rb handler ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 12/16] IB/hfi1: Use evict mmu rb operation ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 13/16] IB/hfi1: Consistently call ops->remove outside spinlock ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 14/16] IB/hfi1: Remove unneeded mm argument in remove function ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 15/16] IB/hfi1: Fix memory leak during unexpected shutdown ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-07-28 19:21   ` [PATCH 16/16] IB/hfi1: Add cache evict LRU list ira.weiny-ral2JQCrhuEAvxtiuMwx3w
2016-08-03  3:04   ` [PATCH 00/16] Fix SDMA/TID caching code Doug Ledford
