dri-devel.lists.freedesktop.org archive mirror
* [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review
@ 2019-06-14  0:44 Jason Gunthorpe
  2019-06-14  0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
                   ` (11 more replies)
  0 siblings, 12 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, linux-rdma, amd-gfx, linux-mm, Jason Gunthorpe,
	dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

This patch series arose out of discussions with Jerome when looking at the
ODP changes, particularly informed by the use-after-free races we have
already found and fixed in the ODP code (thanks to syzkaller) when working
with mmu notifiers, and the discussion with Ralph on how to resolve the
lifetime model.

Overall this brings in a simplified locking scheme and an easy-to-explain
lifetime model:

 If a hmm_range is valid, then the hmm is valid; if a hmm is valid, then
 the mm is allocated memory.

 If the mm needs to still be alive (i.e. to lock the mmap_sem, find a vma,
 etc.) then the mmget must be obtained via mmget_not_zero().

The unlocked reads of 'hmm->dead' are also eliminated in favour of using
standard mmget() locking to prevent the mm from being released. Many of the
debugging checks of !range->hmm and !hmm->mm are dropped in favour of
poison values, which make the lifetime intent much clearer.

The trailing patches are just some random cleanups I noticed when reviewing
this code.

I would like to run some testing with the ODP patch, but haven't
yet. Otherwise I think this is reviewed enough, and if there is nothing
more to say I hope to apply it next week.

I plan to continue to work on the idea with CH to move more of this mirror
code into mmu notifiers and other places, but this will take some time and
research.

Thanks to everyone who took time to look at this!

Jason Gunthorpe (12):
  mm/hmm: fix use after free with struct hmm in the mmu notifiers
  mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
  mm/hmm: Hold a mmgrab from hmm to mm
  mm/hmm: Simplify hmm_get_or_create and make it reliable
  mm/hmm: Remove duplicate condition test before wait_event_timeout
  mm/hmm: Hold on to the mmget for the lifetime of the range
  mm/hmm: Use lockdep instead of comments
  mm/hmm: Remove racy protection against double-unregistration
  mm/hmm: Poison hmm_range during unregister
  mm/hmm: Do not use list*_rcu() for hmm->ranges
  mm/hmm: Remove confusing comment and logic from hmm_release
  mm/hmm: Fix error flows in hmm_invalidate_range_start

 drivers/gpu/drm/nouveau/nouveau_svm.c |   2 +-
 include/linux/hmm.h                   |  52 +----
 kernel/fork.c                         |   1 -
 mm/hmm.c                              | 286 ++++++++++++--------------
 4 files changed, 140 insertions(+), 201 deletions(-)

-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 45+ messages in thread

* [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

mmu_notifier_unregister_no_release() is not a fence and the mmu_notifier
system will continue to reference hmm->mn until the srcu grace period
expires.

This results in use-after-free races like this:

         CPU0                                     CPU1
                                               __mmu_notifier_invalidate_range_start()
                                                 srcu_read_lock
                                                 hlist_for_each ()
                                                   // mn == hmm->mn
hmm_mirror_unregister()
  hmm_put()
    hmm_free()
      mmu_notifier_unregister_no_release()
         hlist_del_init_rcu(hmm-mn->list)
			                           mn->ops->invalidate_range_start(mn, range);
					             mm_get_hmm()
      mm->hmm = NULL;
      kfree(hmm)
                                                     mutex_lock(&hmm->lock);

Use SRCU to kfree the hmm memory so that the notifiers can rely on hmm
existing. Get the now-safe hmm struct through container_of and directly
check kref_get_unless_zero to lock it against free.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
- Spell 'free' properly (Jerome/Ralph)
v3:
- Have only one clearer comment about kref_get_unless_zero (John)
---
 include/linux/hmm.h |  1 +
 mm/hmm.c            | 23 +++++++++++++++++------
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 7007123842ba76..cb01cf1fa3c08b 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -93,6 +93,7 @@ struct hmm {
 	struct mmu_notifier	mmu_notifier;
 	struct rw_semaphore	mirrors_sem;
 	wait_queue_head_t	wq;
+	struct rcu_head		rcu;
 	long			notifiers;
 	bool			dead;
 };
diff --git a/mm/hmm.c b/mm/hmm.c
index 826816ab237799..f6956d78e3cb25 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -104,6 +104,11 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	return NULL;
 }
 
+static void hmm_free_rcu(struct rcu_head *rcu)
+{
+	kfree(container_of(rcu, struct hmm, rcu));
+}
+
 static void hmm_free(struct kref *kref)
 {
 	struct hmm *hmm = container_of(kref, struct hmm, kref);
@@ -116,7 +121,7 @@ static void hmm_free(struct kref *kref)
 		mm->hmm = NULL;
 	spin_unlock(&mm->page_table_lock);
 
-	kfree(hmm);
+	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
 }
 
 static inline void hmm_put(struct hmm *hmm)
@@ -144,10 +149,14 @@ void hmm_mm_destroy(struct mm_struct *mm)
 
 static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
-	struct hmm *hmm = mm_get_hmm(mm);
+	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
 	struct hmm_range *range;
 
+	/* Bail out if hmm is in the process of being freed */
+	if (!kref_get_unless_zero(&hmm->kref))
+		return;
+
 	/* Report this HMM as dying. */
 	hmm->dead = true;
 
@@ -185,13 +194,14 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
-	struct hmm *hmm = mm_get_hmm(nrange->mm);
+	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
 	struct hmm_update update;
 	struct hmm_range *range;
 	int ret = 0;
 
-	VM_BUG_ON(!hmm);
+	if (!kref_get_unless_zero(&hmm->kref))
+		return 0;
 
 	update.start = nrange->start;
 	update.end = nrange->end;
@@ -236,9 +246,10 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
-	struct hmm *hmm = mm_get_hmm(nrange->mm);
+	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 
-	VM_BUG_ON(!hmm);
+	if (!kref_get_unless_zero(&hmm->kref))
+		return;
 
 	mutex_lock(&hmm->lock);
 	hmm->notifiers--;
-- 
2.21.0


* [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
  2019-06-14  0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

Ralph observes that hmm_range_register() can only be called by a driver
while a mirror is registered. Make this clear in the API by passing in the
mirror structure as a parameter.

This also simplifies understanding the lifetime model for struct hmm, as
the hmm pointer must be valid as part of a registered mirror, so all we
need in hmm_range_register() is a simple kref_get.

Suggested-by: Ralph Campbell <rcampbell@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2
- Include the oneline patch to nouveau_svm.c
---
 drivers/gpu/drm/nouveau/nouveau_svm.c |  2 +-
 include/linux/hmm.h                   |  7 ++++---
 mm/hmm.c                              | 13 ++++---------
 3 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_svm.c b/drivers/gpu/drm/nouveau/nouveau_svm.c
index 93ed43c413f0bb..8c92374afcf227 100644
--- a/drivers/gpu/drm/nouveau/nouveau_svm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_svm.c
@@ -649,7 +649,7 @@ nouveau_svm_fault(struct nvif_notify *notify)
 		range.values = nouveau_svm_pfn_values;
 		range.pfn_shift = NVIF_VMM_PFNMAP_V0_ADDR_SHIFT;
 again:
-		ret = hmm_vma_fault(&range, true);
+		ret = hmm_vma_fault(&svmm->mirror, &range, true);
 		if (ret == 0) {
 			mutex_lock(&svmm->mutex);
 			if (!hmm_vma_range_done(&range)) {
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index cb01cf1fa3c08b..1fba6979adf460 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -496,7 +496,7 @@ static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
  * Please see Documentation/vm/hmm.rst for how to use the range API.
  */
 int hmm_range_register(struct hmm_range *range,
-		       struct mm_struct *mm,
+		       struct hmm_mirror *mirror,
 		       unsigned long start,
 		       unsigned long end,
 		       unsigned page_shift);
@@ -532,7 +532,8 @@ static inline bool hmm_vma_range_done(struct hmm_range *range)
 }
 
 /* This is a temporary helper to avoid merge conflict between trees. */
-static inline int hmm_vma_fault(struct hmm_range *range, bool block)
+static inline int hmm_vma_fault(struct hmm_mirror *mirror,
+				struct hmm_range *range, bool block)
 {
 	long ret;
 
@@ -545,7 +546,7 @@ static inline int hmm_vma_fault(struct hmm_range *range, bool block)
 	range->default_flags = 0;
 	range->pfn_flags_mask = -1UL;
 
-	ret = hmm_range_register(range, range->vma->vm_mm,
+	ret = hmm_range_register(range, mirror,
 				 range->start, range->end,
 				 PAGE_SHIFT);
 	if (ret)
diff --git a/mm/hmm.c b/mm/hmm.c
index f6956d78e3cb25..22a97ada108b4e 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
  * Track updates to the CPU page table see include/linux/hmm.h
  */
 int hmm_range_register(struct hmm_range *range,
-		       struct mm_struct *mm,
+		       struct hmm_mirror *mirror,
 		       unsigned long start,
 		       unsigned long end,
 		       unsigned page_shift)
 {
 	unsigned long mask = ((1UL << page_shift) - 1UL);
-	struct hmm *hmm;
+	struct hmm *hmm = mirror->hmm;
 
 	range->valid = false;
 	range->hmm = NULL;
@@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range,
 	range->start = start;
 	range->end = end;
 
-	hmm = hmm_get_or_create(mm);
-	if (!hmm)
-		return -EFAULT;
-
 	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead) {
-		hmm_put(hmm);
+	if (hmm->mm == NULL || hmm->dead)
 		return -EFAULT;
-	}
 
 	/* Initialize range to track CPU page table updates. */
 	mutex_lock(&hmm->lock);
 
 	range->hmm = hmm;
+	kref_get(&hmm->kref);
 	list_add_rcu(&range->list, &hmm->ranges);
 
 	/*
-- 
2.21.0


* [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
  2019-06-14  0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
  2019-06-14  0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

So long as a struct hmm pointer exists, so should the struct mm it is
linked to. Hold the mmgrab() as soon as a hmm is created, and mmdrop() it
once the hmm refcount goes to zero.

Since mmdrop() (i.e. a 0 kref on struct mm) is now impossible with a !NULL
mm->hmm, delete hmm_mm_destroy().

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
 - Fix error unwind paths in hmm_get_or_create (Jerome/Jason)
---
 include/linux/hmm.h |  3 ---
 kernel/fork.c       |  1 -
 mm/hmm.c            | 22 ++++------------------
 3 files changed, 4 insertions(+), 22 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 1fba6979adf460..1d97b6d62c5bcf 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -577,14 +577,11 @@ static inline int hmm_vma_fault(struct hmm_mirror *mirror,
 }
 
 /* Below are for HMM internal use only! Not to be used by device driver! */
-void hmm_mm_destroy(struct mm_struct *mm);
-
 static inline void hmm_mm_init(struct mm_struct *mm)
 {
 	mm->hmm = NULL;
 }
 #else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-static inline void hmm_mm_destroy(struct mm_struct *mm) {}
 static inline void hmm_mm_init(struct mm_struct *mm) {}
 #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
 
diff --git a/kernel/fork.c b/kernel/fork.c
index 75675b9bf6dfd3..c704c3cedee78d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -673,7 +673,6 @@ void __mmdrop(struct mm_struct *mm)
 	WARN_ON_ONCE(mm == current->active_mm);
 	mm_free_pgd(mm);
 	destroy_context(mm);
-	hmm_mm_destroy(mm);
 	mmu_notifier_mm_destroy(mm);
 	check_mm(mm);
 	put_user_ns(mm->user_ns);
diff --git a/mm/hmm.c b/mm/hmm.c
index 22a97ada108b4e..080b17a2e87e2d 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -20,6 +20,7 @@
 #include <linux/swapops.h>
 #include <linux/hugetlb.h>
 #include <linux/memremap.h>
+#include <linux/sched/mm.h>
 #include <linux/jump_label.h>
 #include <linux/dma-mapping.h>
 #include <linux/mmu_notifier.h>
@@ -73,6 +74,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	hmm->notifiers = 0;
 	hmm->dead = false;
 	hmm->mm = mm;
+	mmgrab(hmm->mm);
 
 	spin_lock(&mm->page_table_lock);
 	if (!mm->hmm)
@@ -100,6 +102,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 		mm->hmm = NULL;
 	spin_unlock(&mm->page_table_lock);
 error:
+	mmdrop(hmm->mm);
 	kfree(hmm);
 	return NULL;
 }
@@ -121,6 +124,7 @@ static void hmm_free(struct kref *kref)
 		mm->hmm = NULL;
 	spin_unlock(&mm->page_table_lock);
 
+	mmdrop(hmm->mm);
 	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
 }
 
@@ -129,24 +133,6 @@ static inline void hmm_put(struct hmm *hmm)
 	kref_put(&hmm->kref, hmm_free);
 }
 
-void hmm_mm_destroy(struct mm_struct *mm)
-{
-	struct hmm *hmm;
-
-	spin_lock(&mm->page_table_lock);
-	hmm = mm_get_hmm(mm);
-	mm->hmm = NULL;
-	if (hmm) {
-		hmm->mm = NULL;
-		hmm->dead = true;
-		spin_unlock(&mm->page_table_lock);
-		hmm_put(hmm);
-		return;
-	}
-
-	spin_unlock(&mm->page_table_lock);
-}
-
 static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
-- 
2.21.0


* [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

As coded, this function can falsely fail in various racy situations. Make
it reliable and simpler by running under the write side of the mmap_sem
and avoiding the false-failing compare/exchange pattern. Due to the
mmap_sem this no longer has to avoid racing with a second parallel
hmm_get_or_create().

Unfortunately this still has to use the page_table_lock as the
non-sleeping lock protecting mm->hmm, since the contexts where we free the
hmm are incompatible with mmap_sem.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
- Fix error unwind of mmgrab (Jerome)
- Use hmm local instead of 2nd container_of (Jerome)
v3:
- Can't use mmap_sem in the SRCU callback, keep using the
  page_table_lock (Philip)
---
 mm/hmm.c | 84 ++++++++++++++++++++++++--------------------------------
 1 file changed, 36 insertions(+), 48 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 080b17a2e87e2d..4c64d4c32f4825 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -31,16 +31,6 @@
 #if IS_ENABLED(CONFIG_HMM_MIRROR)
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
 
-static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
-{
-	struct hmm *hmm = READ_ONCE(mm->hmm);
-
-	if (hmm && kref_get_unless_zero(&hmm->kref))
-		return hmm;
-
-	return NULL;
-}
-
 /**
  * hmm_get_or_create - register HMM against an mm (HMM internal)
  *
@@ -55,11 +45,19 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
  */
 static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 {
-	struct hmm *hmm = mm_get_hmm(mm);
-	bool cleanup = false;
+	struct hmm *hmm;
 
-	if (hmm)
-		return hmm;
+	lockdep_assert_held_exclusive(&mm->mmap_sem);
+
+	/* Abuse the page_table_lock to also protect mm->hmm. */
+	spin_lock(&mm->page_table_lock);
+	if (mm->hmm) {
+		if (kref_get_unless_zero(&mm->hmm->kref)) {
+			spin_unlock(&mm->page_table_lock);
+			return mm->hmm;
+		}
+	}
+	spin_unlock(&mm->page_table_lock);
 
 	hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
 	if (!hmm)
@@ -74,57 +72,47 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	hmm->notifiers = 0;
 	hmm->dead = false;
 	hmm->mm = mm;
-	mmgrab(hmm->mm);
 
-	spin_lock(&mm->page_table_lock);
-	if (!mm->hmm)
-		mm->hmm = hmm;
-	else
-		cleanup = true;
-	spin_unlock(&mm->page_table_lock);
+	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
+	if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
+		kfree(hmm);
+		return NULL;
+	}
 
-	if (cleanup)
-		goto error;
+	mmgrab(hmm->mm);
 
 	/*
-	 * We should only get here if hold the mmap_sem in write mode ie on
-	 * registration of first mirror through hmm_mirror_register()
+	 * We hold the exclusive mmap_sem here so we know that mm->hmm is
+	 * still NULL or 0 kref, and is safe to update.
 	 */
-	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
-	if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
-		goto error_mm;
-
-	return hmm;
-
-error_mm:
 	spin_lock(&mm->page_table_lock);
-	if (mm->hmm == hmm)
-		mm->hmm = NULL;
+	mm->hmm = hmm;
 	spin_unlock(&mm->page_table_lock);
-error:
-	mmdrop(hmm->mm);
-	kfree(hmm);
-	return NULL;
+	return hmm;
 }
 
 static void hmm_free_rcu(struct rcu_head *rcu)
 {
-	kfree(container_of(rcu, struct hmm, rcu));
+	struct hmm *hmm = container_of(rcu, struct hmm, rcu);
+
+	/*
+	 * The mm->hmm pointer is kept valid while notifier ops can be running
+	 * so they don't have to deal with a NULL mm->hmm value
+	 */
+	spin_lock(&hmm->mm->page_table_lock);
+	if (hmm->mm->hmm == hmm)
+		hmm->mm->hmm = NULL;
+	spin_unlock(&hmm->mm->page_table_lock);
+	mmdrop(hmm->mm);
+
+	kfree(hmm);
 }
 
 static void hmm_free(struct kref *kref)
 {
 	struct hmm *hmm = container_of(kref, struct hmm, kref);
-	struct mm_struct *mm = hmm->mm;
-
-	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
 
-	spin_lock(&mm->page_table_lock);
-	if (mm->hmm == hmm)
-		mm->hmm = NULL;
-	spin_unlock(&mm->page_table_lock);
-
-	mmdrop(hmm->mm);
+	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
 	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
 }
 
-- 
2.21.0


* [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

The wait_event_timeout macro already tests the condition as its first
action, so there is no reason to open code another version of this; all
that does is skip the might_sleep() debugging in common cases, which is
not helpful.

Further, based on prior patches, we can now simplify the required condition
test:
 - If range is valid memory then so is range->hmm
 - If hmm_release() has run then range->valid is set to false
   at the same time as dead, so no reason to check both.
 - A valid hmm has a valid hmm->mm.

Allow the return value of wait_event_timeout() (along with its internal
barriers) to compute the result of the function.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v3
- Simplify the wait_event_timeout to not check valid
---
 include/linux/hmm.h | 13 ++-----------
 1 file changed, 2 insertions(+), 11 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 1d97b6d62c5bcf..26e7c477490c4e 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -209,17 +209,8 @@ static inline unsigned long hmm_range_page_size(const struct hmm_range *range)
 static inline bool hmm_range_wait_until_valid(struct hmm_range *range,
 					      unsigned long timeout)
 {
-	/* Check if mm is dead ? */
-	if (range->hmm == NULL || range->hmm->dead || range->hmm->mm == NULL) {
-		range->valid = false;
-		return false;
-	}
-	if (range->valid)
-		return true;
-	wait_event_timeout(range->hmm->wq, range->valid || range->hmm->dead,
-			   msecs_to_jiffies(timeout));
-	/* Return current valid status just in case we get lucky */
-	return range->valid;
+	return wait_event_timeout(range->hmm->wq, range->valid,
+				  msecs_to_jiffies(timeout)) != 0;
 }
 
 /*
-- 
2.21.0


* [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

Range functions like hmm_range_snapshot() and hmm_range_fault() call
find_vma, which requires holding the mmget() and the mmap_sem for the mm.

Make this simpler for the callers by holding the mmget() inside the range
for the lifetime of the range. Other functions that accept a range should
only be called if the range is registered.

This has the side effect of directly preventing hmm_release() from
happening while a range is registered. That means hmm->dead can never
become true during the lifetime of the range, so remove dead and
hmm_mirror_mm_is_alive() entirely.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
 - Use Jerome's idea of just holding the mmget() for the range lifetime,
   rework the patch to use that as as simplification to remove dead in
   one step
---
 include/linux/hmm.h | 26 --------------------------
 mm/hmm.c            | 28 ++++++++++------------------
 2 files changed, 10 insertions(+), 44 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 26e7c477490c4e..bf013e96525771 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -82,7 +82,6 @@
  * @mirrors_sem: read/write semaphore protecting the mirrors list
  * @wq: wait queue for user waiting on a range invalidation
  * @notifiers: count of active mmu notifiers
- * @dead: is the mm dead ?
  */
 struct hmm {
 	struct mm_struct	*mm;
@@ -95,7 +94,6 @@ struct hmm {
 	wait_queue_head_t	wq;
 	struct rcu_head		rcu;
 	long			notifiers;
-	bool			dead;
 };
 
 /*
@@ -459,30 +457,6 @@ struct hmm_mirror {
 int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm);
 void hmm_mirror_unregister(struct hmm_mirror *mirror);
 
-/*
- * hmm_mirror_mm_is_alive() - test if mm is still alive
- * @mirror: the HMM mm mirror for which we want to lock the mmap_sem
- * Return: false if the mm is dead, true otherwise
- *
- * This is an optimization, it will not always accurately return false if the
- * mm is dead; i.e., there can be false negatives (process is being killed but
- * HMM is not yet informed of that). It is only intended to be used to optimize
- * out cases where the driver is about to do something time consuming and it
- * would be better to skip it if the mm is dead.
- */
-static inline bool hmm_mirror_mm_is_alive(struct hmm_mirror *mirror)
-{
-	struct mm_struct *mm;
-
-	if (!mirror || !mirror->hmm)
-		return false;
-	mm = READ_ONCE(mirror->hmm->mm);
-	if (mirror->hmm->dead || !mm)
-		return false;
-
-	return true;
-}
-
 /*
  * Please see Documentation/vm/hmm.rst for how to use the range API.
  */
diff --git a/mm/hmm.c b/mm/hmm.c
index 4c64d4c32f4825..58712d74edd585 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -70,7 +70,6 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	mutex_init(&hmm->lock);
 	kref_init(&hmm->kref);
 	hmm->notifiers = 0;
-	hmm->dead = false;
 	hmm->mm = mm;
 
 	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
@@ -125,20 +124,17 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
-	struct hmm_range *range;
 
 	/* Bail out if hmm is in the process of being freed */
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	/* Report this HMM as dying. */
-	hmm->dead = true;
-
-	/* Wake-up everyone waiting on any range. */
 	mutex_lock(&hmm->lock);
-	list_for_each_entry(range, &hmm->ranges, list)
-		range->valid = false;
-	wake_up_all(&hmm->wq);
+	/*
+	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
+	 * prevented as long as a range exists.
+	 */
+	WARN_ON(!list_empty(&hmm->ranges));
 	mutex_unlock(&hmm->lock);
 
 	down_write(&hmm->mirrors_sem);
@@ -908,8 +904,8 @@ int hmm_range_register(struct hmm_range *range,
 	range->start = start;
 	range->end = end;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
+	/* Prevent hmm_release() from running while the range is valid */
+	if (!mmget_not_zero(hmm->mm))
 		return -EFAULT;
 
 	/* Initialize range to track CPU page table updates. */
@@ -952,6 +948,7 @@ void hmm_range_unregister(struct hmm_range *range)
 
 	/* Drop reference taken by hmm_range_register() */
 	range->valid = false;
+	mmput(hmm->mm);
 	hmm_put(hmm);
 	range->hmm = NULL;
 }
@@ -979,10 +976,7 @@ long hmm_range_snapshot(struct hmm_range *range)
 	struct vm_area_struct *vma;
 	struct mm_walk mm_walk;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
-		return -EFAULT;
-
+	lockdep_assert_held(&hmm->mm->mmap_sem);
 	do {
 		/* If range is no longer valid force retry. */
 		if (!range->valid)
@@ -1077,9 +1071,7 @@ long hmm_range_fault(struct hmm_range *range, bool block)
 	struct mm_walk mm_walk;
 	int ret;
 
-	/* Check if hmm_mm_destroy() was call. */
-	if (hmm->mm == NULL || hmm->dead)
-		return -EFAULT;
+	lockdep_assert_held(&hmm->mm->mmap_sem);
 
 	do {
 		/* If range is no longer valid force retry. */
-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx,
	Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel,
	Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

So we can check locking at runtime.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2
- Fix missing & in lockdeps (Jason)
---
 mm/hmm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 58712d74edd585..c0f622f86223c2 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -253,11 +253,11 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
  *
  * To start mirroring a process address space, the device driver must register
  * an HMM mirror struct.
- *
- * THE mm->mmap_sem MUST BE HELD IN WRITE MODE !
  */
 int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
 {
+	lockdep_assert_held_exclusive(&mm->mmap_sem);
+
 	/* Sanity check */
 	if (!mm || !mirror || !mirror->ops)
 		return -EINVAL;
-- 
2.21.0


* [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

No other register/unregister kernel API attempts to provide this kind of
protection as it is inherently racy, so just drop it.

Callers should provide their own protection; it appears nouveau already
does, but just in case drop in a debugging POISON.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
 mm/hmm.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index c0f622f86223c2..e3e0a811a3a774 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -283,18 +283,13 @@ EXPORT_SYMBOL(hmm_mirror_register);
  */
 void hmm_mirror_unregister(struct hmm_mirror *mirror)
 {
-	struct hmm *hmm = READ_ONCE(mirror->hmm);
-
-	if (hmm == NULL)
-		return;
+	struct hmm *hmm = mirror->hmm;
 
 	down_write(&hmm->mirrors_sem);
 	list_del_init(&mirror->list);
-	/* To protect us against double unregister ... */
-	mirror->hmm = NULL;
 	up_write(&hmm->mirrors_sem);
-
 	hmm_put(hmm);
+	memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
 }
 EXPORT_SYMBOL(hmm_mirror_unregister);
 
-- 
2.21.0


* [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx,
	Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel,
	Ira Weiny, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

Trying to misuse a range outside its lifetime is a kernel bug. Use poison
bytes to help detect this condition. Double unregister will reliably crash.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2
- Keep range start/end valid after unregistration (Jerome)
v3
- Revise some comments (John)
- Remove start/end WARN_ON (Souptick)
---
 mm/hmm.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index e3e0a811a3a774..e214668cba3474 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -933,19 +933,21 @@ void hmm_range_unregister(struct hmm_range *range)
 {
 	struct hmm *hmm = range->hmm;
 
-	/* Sanity check this really should not happen. */
-	if (hmm == NULL || range->end <= range->start)
-		return;
-
 	mutex_lock(&hmm->lock);
 	list_del_rcu(&range->list);
 	mutex_unlock(&hmm->lock);
 
 	/* Drop reference taken by hmm_range_register() */
-	range->valid = false;
 	mmput(hmm->mm);
 	hmm_put(hmm);
-	range->hmm = NULL;
+
+	/*
+	 * The range is now invalid and the ref on the hmm is dropped, so
+         * poison the pointer.  Leave other fields in place, for the caller's
+         * use.
+         */
+	range->valid = false;
+	memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
 }
 EXPORT_SYMBOL(hmm_range_unregister);
 
-- 
2.21.0


* [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
  2019-06-14  0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, Ira Weiny, amd-gfx,
	Souptick Joarder, linux-mm, Jason Gunthorpe, dri-devel,
	Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

This list is always read and written while holding hmm->lock so there is
no need for the confusing _rcu annotations.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Jérôme Glisse <jglisse@redhat.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <iweiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
 mm/hmm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index e214668cba3474..26af511cbdd075 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -908,7 +908,7 @@ int hmm_range_register(struct hmm_range *range,
 
 	range->hmm = hmm;
 	kref_get(&hmm->kref);
-	list_add_rcu(&range->list, &hmm->ranges);
+	list_add(&range->list, &hmm->ranges);
 
 	/*
 	 * If there are any concurrent notifiers we have to wait for them for
@@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range)
 	struct hmm *hmm = range->hmm;
 
 	mutex_lock(&hmm->lock);
-	list_del_rcu(&range->list);
+	list_del(&range->list);
 	mutex_unlock(&hmm->lock);
 
 	/* Drop reference taken by hmm_range_register() */
-- 
2.21.0


* [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org>
  2019-06-14  0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

hmm_release() is called exactly once per hmm. ops->release() cannot
accidentally trigger any action that would recurse back onto
hmm->mirrors_sem.

This fixes a use after-free race of the form:

       CPU0                                   CPU1
                                           hmm_release()
                                             up_write(&hmm->mirrors_sem);
 hmm_mirror_unregister(mirror)
  down_write(&hmm->mirrors_sem);
  up_write(&hmm->mirrors_sem);
  kfree(mirror)
                                             mirror->ops->release(mirror)

The only user we have today for ops->release is an empty function, so this
is unambiguously safe.

As a consequence of plugging this race, drivers are not allowed to
register/unregister mirrors from within a release op.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
 mm/hmm.c | 28 +++++++++-------------------
 1 file changed, 9 insertions(+), 19 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 26af511cbdd075..c0d43302fd6b2f 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -137,26 +137,16 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	WARN_ON(!list_empty(&hmm->ranges));
 	mutex_unlock(&hmm->lock);
 
-	down_write(&hmm->mirrors_sem);
-	mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
-					  list);
-	while (mirror) {
-		list_del_init(&mirror->list);
-		if (mirror->ops->release) {
-			/*
-			 * Drop mirrors_sem so the release callback can wait
-			 * on any pending work that might itself trigger a
-			 * mmu_notifier callback and thus would deadlock with
-			 * us.
-			 */
-			up_write(&hmm->mirrors_sem);
+	down_read(&hmm->mirrors_sem);
+	list_for_each_entry(mirror, &hmm->mirrors, list) {
+		/*
+		 * Note: The driver is not allowed to trigger
+		 * hmm_mirror_unregister() from this thread.
+		 */
+		if (mirror->ops->release)
 			mirror->ops->release(mirror);
-			down_write(&hmm->mirrors_sem);
-		}
-		mirror = list_first_entry_or_null(&hmm->mirrors,
-						  struct hmm_mirror, list);
 	}
-	up_write(&hmm->mirrors_sem);
+	up_read(&hmm->mirrors_sem);
 
 	hmm_put(hmm);
 }
@@ -286,7 +276,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror)
 	struct hmm *hmm = mirror->hmm;
 
 	down_write(&hmm->mirrors_sem);
-	list_del_init(&mirror->list);
+	list_del(&mirror->list);
 	up_write(&hmm->mirrors_sem);
 	hmm_put(hmm);
 	memset(&mirror->hmm, POISON_INUSE, sizeof(mirror->hmm));
-- 
2.21.0


* [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start
  2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2019-06-14  0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
@ 2019-06-14  0:44 ` Jason Gunthorpe
       [not found]   ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org>
  11 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-14  0:44 UTC (permalink / raw)
  To: Jerome Glisse, Ralph Campbell, John Hubbard, Felix.Kuehling
  Cc: Andrea Arcangeli, Philip Yang, linux-rdma, amd-gfx, linux-mm,
	Jason Gunthorpe, dri-devel, Ben Skeggs

From: Jason Gunthorpe <jgg@mellanox.com>

If the trylock on the hmm->mirrors_sem fails, the function will return
without decrementing the notifiers count that was previously incremented. Since
the caller will not call invalidate_range_end() on EAGAIN, this will result
in the notifiers count becoming permanently incremented, leading to deadlock.

If sync_cpu_device_pagetables() required blocking, the function will
not return EAGAIN even though the device continues to touch the
pages. This is a violation of the mmu notifier contract.

Switch the lock to a spinlock, renamed to ranges_lock, so we can
reliably obtain it without blocking during error unwind.

The error unwind is necessary since the notifiers count must be held
incremented across the call to sync_cpu_device_pagetables() as we cannot
allow the range to become marked valid by a parallel
invalidate_start/end() pair while doing sync_cpu_device_pagetables().

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
 include/linux/hmm.h |  2 +-
 mm/hmm.c            | 77 +++++++++++++++++++++++++++------------------
 2 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index bf013e96525771..0fa8ea34ccef6d 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -86,7 +86,7 @@
 struct hmm {
 	struct mm_struct	*mm;
 	struct kref		kref;
-	struct mutex		lock;
+	spinlock_t		ranges_lock;
 	struct list_head	ranges;
 	struct list_head	mirrors;
 	struct mmu_notifier	mmu_notifier;
diff --git a/mm/hmm.c b/mm/hmm.c
index c0d43302fd6b2f..1172a4f0206963 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -67,7 +67,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	init_rwsem(&hmm->mirrors_sem);
 	hmm->mmu_notifier.ops = NULL;
 	INIT_LIST_HEAD(&hmm->ranges);
-	mutex_init(&hmm->lock);
+	spin_lock_init(&hmm->ranges_lock);
 	kref_init(&hmm->kref);
 	hmm->notifiers = 0;
 	hmm->mm = mm;
@@ -124,18 +124,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
+	unsigned long flags;
 
 	/* Bail out if hmm is in the process of being freed */
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	/*
 	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
 	 * prevented as long as a range exists.
 	 */
 	WARN_ON(!list_empty(&hmm->ranges));
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	down_read(&hmm->mirrors_sem);
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
@@ -151,6 +152,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	hmm_put(hmm);
 }
 
+static void notifiers_decrement(struct hmm *hmm)
+{
+	lockdep_assert_held(&hmm->ranges_lock);
+
+	hmm->notifiers--;
+	if (!hmm->notifiers) {
+		struct hmm_range *range;
+
+		list_for_each_entry(range, &hmm->ranges, list) {
+			if (range->valid)
+				continue;
+			range->valid = true;
+		}
+		wake_up_all(&hmm->wq);
+	}
+}
+
 static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
@@ -158,6 +176,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	struct hmm_mirror *mirror;
 	struct hmm_update update;
 	struct hmm_range *range;
+	unsigned long flags;
 	int ret = 0;
 
 	if (!kref_get_unless_zero(&hmm->kref))
@@ -168,12 +187,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	update.event = HMM_UPDATE_INVALIDATE;
 	update.blockable = mmu_notifier_range_blockable(nrange);
 
-	if (mmu_notifier_range_blockable(nrange))
-		mutex_lock(&hmm->lock);
-	else if (!mutex_trylock(&hmm->lock)) {
-		ret = -EAGAIN;
-		goto out;
-	}
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	hmm->notifiers++;
 	list_for_each_entry(range, &hmm->ranges, list) {
 		if (update.end < range->start || update.start >= range->end)
@@ -181,7 +195,7 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 
 		range->valid = false;
 	}
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	if (mmu_notifier_range_blockable(nrange))
 		down_read(&hmm->mirrors_sem);
@@ -189,16 +203,26 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 		ret = -EAGAIN;
 		goto out;
 	}
+
 	list_for_each_entry(mirror, &hmm->mirrors, list) {
-		int ret;
+		int rc;
 
-		ret = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
-		if (!update.blockable && ret == -EAGAIN)
+		rc = mirror->ops->sync_cpu_device_pagetables(mirror, &update);
+		if (rc) {
+			if (WARN_ON(update.blockable || rc != -EAGAIN))
+				continue;
+			ret = -EAGAIN;
 			break;
+		}
 	}
 	up_read(&hmm->mirrors_sem);
 
 out:
+	if (ret) {
+		spin_lock_irqsave(&hmm->ranges_lock, flags);
+		notifiers_decrement(hmm);
+		spin_unlock_irqrestore(&hmm->ranges_lock, flags);
+	}
 	hmm_put(hmm);
 	return ret;
 }
@@ -207,23 +231,14 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 			const struct mmu_notifier_range *nrange)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
+	unsigned long flags;
 
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
-	hmm->notifiers--;
-	if (!hmm->notifiers) {
-		struct hmm_range *range;
-
-		list_for_each_entry(range, &hmm->ranges, list) {
-			if (range->valid)
-				continue;
-			range->valid = true;
-		}
-		wake_up_all(&hmm->wq);
-	}
-	mutex_unlock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
+	notifiers_decrement(hmm);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	hmm_put(hmm);
 }
@@ -876,6 +891,7 @@ int hmm_range_register(struct hmm_range *range,
 {
 	unsigned long mask = ((1UL << page_shift) - 1UL);
 	struct hmm *hmm = mirror->hmm;
+	unsigned long flags;
 
 	range->valid = false;
 	range->hmm = NULL;
@@ -894,7 +910,7 @@ int hmm_range_register(struct hmm_range *range,
 		return -EFAULT;
 
 	/* Initialize range to track CPU page table updates. */
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 
 	range->hmm = hmm;
 	kref_get(&hmm->kref);
@@ -906,7 +922,7 @@ int hmm_range_register(struct hmm_range *range,
 	 */
 	if (!hmm->notifiers)
 		range->valid = true;
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	return 0;
 }
@@ -922,10 +938,11 @@ EXPORT_SYMBOL(hmm_range_register);
 void hmm_range_unregister(struct hmm_range *range)
 {
 	struct hmm *hmm = range->hmm;
+	unsigned long flags;
 
-	mutex_lock(&hmm->lock);
+	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	list_del(&range->list);
-	mutex_unlock(&hmm->lock);
+	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	/* Drop reference taken by hmm_range_register() */
 	mmput(hmm->mm);
-- 
2.21.0


* Re: [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers
       [not found]   ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 13:56     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 13:56 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
       [not found]   ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 13:59     ` Christoph Hellwig
  2019-06-18 13:05       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 13:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:40PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Ralph observes that hmm_range_register() can only be called by a driver
> while a mirror is registered. Make this clear in the API by passing in the
> mirror structure as a parameter.
> 
> This also simplifies understanding the lifetime model for struct hmm, as
> the hmm pointer must be valid as part of a registered mirror so all we
> need in hmm_register_range() is a simple kref_get.

Looks good, at least as an intermediate step:

Reviewed-by: Christoph Hellwig <hch@lst.de>

> index f6956d78e3cb25..22a97ada108b4e 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
>   * Track updates to the CPU page table see include/linux/hmm.h
>   */
>  int hmm_range_register(struct hmm_range *range,
> -		       struct mm_struct *mm,
> +		       struct hmm_mirror *mirror,
>  		       unsigned long start,
>  		       unsigned long end,
>  		       unsigned page_shift)
>  {
>  	unsigned long mask = ((1UL << page_shift) - 1UL);
> -	struct hmm *hmm;
> +	struct hmm *hmm = mirror->hmm;
>  
>  	range->valid = false;
>  	range->hmm = NULL;
> @@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range,
>  	range->start = start;
>  	range->end = end;

But while you're at it:  the calling conventions of hmm_range_register
are still rather odd, as the start, end and page_shift arguments are
only used to fill out fields in the range structure passed in.  Might
be worth cleaning up as well if we change the calling convention.

* Re: [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm
       [not found]   ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 13:59     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 13:59 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable
       [not found]   ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:12     ` Christoph Hellwig
  2019-06-18  0:36       ` Jason Gunthorpe
  2019-06-18 18:55       ` Jason Gunthorpe
  0 siblings, 2 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

> +	spin_lock(&mm->page_table_lock);
> +	if (mm->hmm) {
> +		if (kref_get_unless_zero(&mm->hmm->kref)) {
> +			spin_unlock(&mm->page_table_lock);
> +			return mm->hmm;
> +		}
> +	}
> +	spin_unlock(&mm->page_table_lock);

This could become:

	spin_lock(&mm->page_table_lock);
	hmm = mm->hmm;
	if (hmm && kref_get_unless_zero(&hmm->kref))
		goto out_unlock;
	spin_unlock(&mm->page_table_lock);

as the last two lines of the function already drop the page_table_lock
and then return hmm.  Or drop the "hmm = mm->hmm" assignment above and
return mm->hmm as that should be always identical to hmm at the end
to save another line.

> +	/*
> +	 * The mm->hmm pointer is kept valid while notifier ops can be running
> +	 * so they don't have to deal with a NULL mm->hmm value
> +	 */

The comment confuses me.  How does the page_table_lock relate to
possibly running notifiers, as I can't find that we take
page_table_lock?  Or is it just about the fact that we only clear
mm->hmm in the free callback, and not in hmm_free?

* Re: [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout
       [not found]   ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:12     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:12 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
       [not found]   ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:14     ` Christoph Hellwig
  2019-06-18 15:11       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

>  	mutex_lock(&hmm->lock);
> -	list_for_each_entry(range, &hmm->ranges, list)
> -		range->valid = false;
> -	wake_up_all(&hmm->wq);
> +	/*
> +	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
> +	 * prevented as long as a range exists.
> +	 */
> +	WARN_ON(!list_empty(&hmm->ranges));
>  	mutex_unlock(&hmm->lock);

This can just use list_empty_careful and avoid the lock entirely.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments
       [not found]   ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:14     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
       [not found]   ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:16     ` Christoph Hellwig
  2019-06-18 13:13       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:16 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:46PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> No other register/unregister kernel API attempts to provide this kind of
> protection as it is inherently racy, so just drop it.
> 
> Callers should provide their own protection, it appears nouveau already
> does, but just in case drop a debugging POISON.

I don't think we even need to bother with the POISON; normal list
debugging will already catch a double unregistration anyway.

Otherwise looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister
       [not found]   ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:17     ` Christoph Hellwig
  2019-06-18 18:04       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:17 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

> -	/* Sanity check this really should not happen. */
> -	if (hmm == NULL || range->end <= range->start)
> -		return;
> -
>  	mutex_lock(&hmm->lock);
>  	list_del_rcu(&range->list);
>  	mutex_unlock(&hmm->lock);
>  
>  	/* Drop reference taken by hmm_range_register() */
> -	range->valid = false;
>  	mmput(hmm->mm);
>  	hmm_put(hmm);
> -	range->hmm = NULL;
> +
> +	/*
> +	 * The range is now invalid and the ref on the hmm is dropped, so
> +         * poison the pointer.  Leave other fields in place, for the caller's
> +         * use.
> +         */
> +	range->valid = false;
> +	memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));

Formatting seems to be messed up.  But again I don't see the value
in the poisoning, just let normal linked list debugging do its work.
The other cleanups looks fine to me.

* Re: [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges
       [not found]   ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:18     ` Christoph Hellwig
  2019-06-18  0:38       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo, Ira Weiny,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Souptick Joarder,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:48PM -0300, Jason Gunthorpe wrote:
>  	range->hmm = hmm;
>  	kref_get(&hmm->kref);
> -	list_add_rcu(&range->list, &hmm->ranges);
> +	list_add(&range->list, &hmm->ranges);
>  
>  	/*
>  	 * If there are any concurrent notifiers we have to wait for them for
> @@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range)
>  	struct hmm *hmm = range->hmm;
>  
>  	mutex_lock(&hmm->lock);
> -	list_del_rcu(&range->list);
> +	list_del(&range->list);
>  	mutex_unlock(&hmm->lock);

Looks fine:

Signed-off-by: Christoph Hellwig <hch@lst.de>

Btw, is there any reason new ranges are added to the front and not the
tail of the list?

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
       [not found]   ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:21     ` Christoph Hellwig
  2019-06-18  0:45       ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:21 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:49PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> hmm_release() is called exactly once per hmm. ops->release() cannot
> accidentally trigger any action that would recurse back onto
> hmm->mirrors_sem.

In linux-next amdgpu actually calls hmm_mirror_unregister from its
release function.  That whole release function looks rather sketchy,
but we probably need to sort that out first.

* Re: [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start
       [not found]   ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org>
@ 2019-06-15 14:25     ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-15 14:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse, Jason Gunthorpe,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Thu, Jun 13, 2019 at 09:44:50PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> If the trylock on the hmm->mirrors_sem fails the function will return
> without decrementing the notifiers that were previously incremented. Since
> the caller will not call invalidate_range_end() on EAGAIN this will result
> in notifiers becoming permanently incremented and deadlock.
> 
> If the sync_cpu_device_pagetables() required blocking the function will
> not return EAGAIN even though the device continues to touch the
> pages. This is a violation of the mmu notifier contract.
> 
> Switch, and rename, the ranges_lock to a spin lock so we can reliably
> obtain it without blocking during error unwind.
> 
> The error unwind is necessary since the notifiers count must be held
> incremented across the call to sync_cpu_device_pagetables() as we cannot
> allow the range to become marked valid by a parallel
> invalidate_start/end() pair while doing sync_cpu_device_pagetables().
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
> Tested-by: Philip Yang <Philip.Yang@amd.com>
> ---
>  include/linux/hmm.h |  2 +-
>  mm/hmm.c            | 77 +++++++++++++++++++++++++++------------------
>  2 files changed, 48 insertions(+), 31 deletions(-)
> 
> diff --git a/include/linux/hmm.h b/include/linux/hmm.h
> index bf013e96525771..0fa8ea34ccef6d 100644
> --- a/include/linux/hmm.h
> +++ b/include/linux/hmm.h
> @@ -86,7 +86,7 @@
>  struct hmm {
>  	struct mm_struct	*mm;
>  	struct kref		kref;
> -	struct mutex		lock;
> +	spinlock_t		ranges_lock;
>  	struct list_head	ranges;
>  	struct list_head	mirrors;
>  	struct mmu_notifier	mmu_notifier;
> diff --git a/mm/hmm.c b/mm/hmm.c
> index c0d43302fd6b2f..1172a4f0206963 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -67,7 +67,7 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
>  	init_rwsem(&hmm->mirrors_sem);
>  	hmm->mmu_notifier.ops = NULL;
>  	INIT_LIST_HEAD(&hmm->ranges);
> -	mutex_init(&hmm->lock);
> +	spin_lock_init(&hmm->ranges_lock);
>  	kref_init(&hmm->kref);
>  	hmm->notifiers = 0;
>  	hmm->mm = mm;
> @@ -124,18 +124,19 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>  {
>  	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
>  	struct hmm_mirror *mirror;
> +	unsigned long flags;
>  
>  	/* Bail out if hmm is in the process of being freed */
>  	if (!kref_get_unless_zero(&hmm->kref))
>  		return;
>  
> -	mutex_lock(&hmm->lock);
> +	spin_lock_irqsave(&hmm->ranges_lock, flags);
>  	/*
>  	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
>  	 * prevented as long as a range exists.
>  	 */
>  	WARN_ON(!list_empty(&hmm->ranges));
> -	mutex_unlock(&hmm->lock);
> +	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
>  
>  	down_read(&hmm->mirrors_sem);
>  	list_for_each_entry(mirror, &hmm->mirrors, list) {
> @@ -151,6 +152,23 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
>  	hmm_put(hmm);
>  }
>  
> +static void notifiers_decrement(struct hmm *hmm)
> +{
> +	lockdep_assert_held(&hmm->ranges_lock);
> +
> +	hmm->notifiers--;
> +	if (!hmm->notifiers) {

Nitpick, when doing dec and test or inc and test ops I find it much
easier to read if they are merged into one line, i.e.

	if (!--hmm->notifiers) {

Otherwise this looks fine:

Reviewed-by: Christoph Hellwig <hch@lst.de>

* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable
  2019-06-15 14:12     ` Christoph Hellwig
@ 2019-06-18  0:36       ` Jason Gunthorpe
  2019-06-18 18:55       ` Jason Gunthorpe
  1 sibling, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18  0:36 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ira Weiny, Ben Skeggs

On Sat, Jun 15, 2019 at 07:12:11AM -0700, Christoph Hellwig wrote:
> > +	spin_lock(&mm->page_table_lock);
> > +	if (mm->hmm) {
> > +		if (kref_get_unless_zero(&mm->hmm->kref)) {
> > +			spin_unlock(&mm->page_table_lock);
> > +			return mm->hmm;
> > +		}
> > +	}
> > +	spin_unlock(&mm->page_table_lock);
> 
> This could become:
> 
> 	spin_lock(&mm->page_table_lock);
> 	hmm = mm->hmm
> 	if (hmm && kref_get_unless_zero(&hmm->kref))
> 		goto out_unlock;
> 	spin_unlock(&mm->page_table_lock);
> 
> as the last two lines of the function already drop the page_table_lock
> and then return hmm.  Or drop the "hmm = mm->hmm" assignment above and
> return mm->hmm as that should always be identical to hmm at the end
> to save another line.

Yeah, I can fuss it some more.

> > +	/*
> > +	 * The mm->hmm pointer is kept valid while notifier ops can be running
> > +	 * so they don't have to deal with a NULL mm->hmm value
> > +	 */
> 
> The comment confuses me.  How does the page_table_lock relate to
> possibly running notifiers, as I can't find that we take
> page_table_lock?  Or is it just about the fact that we only clear
> mm->hmm in the free callback, and not in hmm_free?

It was late when I wrote this fixup, the comment is faulty, and there
is no reason to delay this until the SRCU cleanup at this point in the
series.

The ops all get their struct hmm from container_of, the only thing
that refers to mm->hmm is hmm_get_or_create().

I'll revise it tomorrow; the comment will go away and the =NULL will
move to the release callback.

Jason
_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges
  2019-06-15 14:18     ` Christoph Hellwig
@ 2019-06-18  0:38       ` Jason Gunthorpe
  0 siblings, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18  0:38 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, Ira Weiny, dri-devel,
	Souptick Joarder, linux-mm, Jerome Glisse, amd-gfx, Ben Skeggs

On Sat, Jun 15, 2019 at 07:18:26AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 13, 2019 at 09:44:48PM -0300, Jason Gunthorpe wrote:
> >  	range->hmm = hmm;
> >  	kref_get(&hmm->kref);
> > -	list_add_rcu(&range->list, &hmm->ranges);
> > +	list_add(&range->list, &hmm->ranges);
> >  
> >  	/*
> >  	 * If there are any concurrent notifiers we have to wait for them for
> > @@ -934,7 +934,7 @@ void hmm_range_unregister(struct hmm_range *range)
> >  	struct hmm *hmm = range->hmm;
> >  
> >  	mutex_lock(&hmm->lock);
> > -	list_del_rcu(&range->list);
> > +	list_del(&range->list);
> >  	mutex_unlock(&hmm->lock);
> 
> Looks fine:
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> 
> Btw, is there any reason new ranges are added to the front and not the
> tail of the list?

Couldn't find one. I think order on this list doesn't matter.

Jason

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
  2019-06-15 14:21     ` Christoph Hellwig
@ 2019-06-18  0:45       ` Jason Gunthorpe
       [not found]         ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18  0:45 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ben Skeggs

On Sat, Jun 15, 2019 at 07:21:06AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 13, 2019 at 09:44:49PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > hmm_release() is called exactly once per hmm. ops->release() cannot
> > accidentally trigger any action that would recurse back onto
> > hmm->mirrors_sem.
> 
> In linux-next amdgpu actually calls hmm_mirror_unregister from its
> release function.  That whole release function looks rather sketchy,
> but we probably need to sort that out first.

Does it? I see this:

static void amdgpu_hmm_mirror_release(struct hmm_mirror *mirror)
{
        struct amdgpu_mn *amn = container_of(mirror, struct amdgpu_mn, mirror);

        INIT_WORK(&amn->work, amdgpu_mn_destroy);
        schedule_work(&amn->work);
}

static struct hmm_mirror_ops amdgpu_hmm_mirror_ops[] = {
        [AMDGPU_MN_TYPE_GFX] = {
                .sync_cpu_device_pagetables = amdgpu_mn_sync_pagetables_gfx,
                .release = amdgpu_hmm_mirror_release
        },
        [AMDGPU_MN_TYPE_HSA] = {
                .sync_cpu_device_pagetables = amdgpu_mn_sync_pagetables_hsa,
                .release = amdgpu_hmm_mirror_release
        },
};


Am I looking at the wrong thing? Looks like it calls it through a work
queue, so it should be OK..

Though it is very strange that amdgpu only destroys the mirror via
release; that cannot be right.

Jason

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
       [not found]         ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org>
@ 2019-06-18  5:37           ` Christoph Hellwig
       [not found]             ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-18  5:37 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Mon, Jun 17, 2019 at 09:45:09PM -0300, Jason Gunthorpe wrote:
> Am I looking at the wrong thing? Looks like it calls it through a work
> queue, so it should be OK..

Yes, it calls it through a work queue.  I guess that is fine because
it needs to take the lock again.

> Though it is very strange that amdgpu only destroys the mirror via
> release; that cannot be right.

As said the whole things looks rather odd to me.

* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
  2019-06-15 13:59     ` Christoph Hellwig
@ 2019-06-18 13:05       ` Jason Gunthorpe
       [not found]         ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 13:05 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ira Weiny, Ben Skeggs

On Sat, Jun 15, 2019 at 06:59:06AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 13, 2019 at 09:44:40PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > Ralph observes that hmm_range_register() can only be called by a driver
> > while a mirror is registered. Make this clear in the API by passing in the
> > mirror structure as a parameter.
> > 
> > This also simplifies understanding the lifetime model for struct hmm, as
> > the hmm pointer must be valid as part of a registered mirror so all we
> > need in hmm_register_range() is a simple kref_get.
> 
> Looks good, at least an an intermediate step:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> > index f6956d78e3cb25..22a97ada108b4e 100644
> > +++ b/mm/hmm.c
> > @@ -914,13 +914,13 @@ static void hmm_pfns_clear(struct hmm_range *range,
> >   * Track updates to the CPU page table see include/linux/hmm.h
> >   */
> >  int hmm_range_register(struct hmm_range *range,
> > -		       struct mm_struct *mm,
> > +		       struct hmm_mirror *mirror,
> >  		       unsigned long start,
> >  		       unsigned long end,
> >  		       unsigned page_shift)
> >  {
> >  	unsigned long mask = ((1UL << page_shift) - 1UL);
> > -	struct hmm *hmm;
> > +	struct hmm *hmm = mirror->hmm;
> >  
> >  	range->valid = false;
> >  	range->hmm = NULL;
> > @@ -934,20 +934,15 @@ int hmm_range_register(struct hmm_range *range,
> >  	range->start = start;
> >  	range->end = end;
> 
> But while you're at it:  the calling conventions of hmm_range_register
> are still rather odd, as the staet, end and page_shift arguments are
> only used to fill out fields in the range structure passed in.  Might
> be worth cleaning up as well if we change the calling convention.

I'm thinking of tackling that as part of the mmu notifier invlock
idea.. Once the range loses the lock then we don't really need to
register it at all.

Thanks,
Jason

* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
  2019-06-15 14:16     ` Christoph Hellwig
@ 2019-06-18 13:13       ` Jason Gunthorpe
       [not found]         ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 13:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ben Skeggs

On Sat, Jun 15, 2019 at 07:16:12AM -0700, Christoph Hellwig wrote:
> On Thu, Jun 13, 2019 at 09:44:46PM -0300, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > No other register/unregister kernel API attempts to provide this kind of
> > protection as it is inherently racy, so just drop it.
> > 
> > Callers should provide their own protection, it appears nouveau already
> > does, but just in case drop a debugging POISON.
> 
> I don't think we even need to bother with the POISON; normal list
> debugging will already catch a double unregistration anyway.

mirror->hmm isn't a list so list debugging won't help.

My concern when I wrote this was that one of the in flight patches I
can't see might be depending on this double-unregister-is-safe
behavior, so I wanted them to crash reliably.

It is really an overly conservative thing to do..

Thanks,
Jason

* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
       [not found]         ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org>
@ 2019-06-18 13:27           ` Christoph Hellwig
  2019-06-18 18:57             ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-18 13:27 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Tue, Jun 18, 2019 at 10:13:24AM -0300, Jason Gunthorpe wrote:
> > I don't think we even need to bother with the POISON; normal list
> > debugging will already catch a double unregistration anyway.
> 
> mirror->hmm isn't a list so list debugging won't help.
> 
> My concern when I wrote this was that one of the in flight patches I
> can't see might be depending on this double-unregister-is-safe
> behavior, so I wanted them to crash reliably.
> 
> It is a really overly conservative thing to do..

mirror->list is a list, and if we do a list_del on it during the
second unregistration it will trip up on the list poisoning.

* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
  2019-06-15 14:14     ` Christoph Hellwig
@ 2019-06-18 15:11       ` Jason Gunthorpe
       [not found]         ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 15:11 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ben Skeggs

On Sat, Jun 15, 2019 at 07:14:35AM -0700, Christoph Hellwig wrote:
> >  	mutex_lock(&hmm->lock);
> > -	list_for_each_entry(range, &hmm->ranges, list)
> > -		range->valid = false;
> > -	wake_up_all(&hmm->wq);
> > +	/*
> > +	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
> > +	 * prevented as long as a range exists.
> > +	 */
> > +	WARN_ON(!list_empty(&hmm->ranges));
> >  	mutex_unlock(&hmm->lock);
> 
> This can just use list_empty_careful and avoid the lock entirely.

Sure, it is just a debugging helper and the mmput should serialize
things enough to be reliable. I had to move the RCU patch ahead of
this. Thanks

diff --git a/mm/hmm.c b/mm/hmm.c
index a9ace28984ea42..1eddda45cefae7 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -124,13 +124,11 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 	if (!kref_get_unless_zero(&hmm->kref))
 		return;
 
-	mutex_lock(&hmm->lock);
 	/*
 	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
 	 * prevented as long as a range exists.
 	 */
-	WARN_ON(!list_empty(&hmm->ranges));
-	mutex_unlock(&hmm->lock);
+	WARN_ON(!list_empty_careful(&hmm->ranges));
 
 	down_write(&hmm->mirrors_sem);
 	mirror = list_first_entry_or_null(&hmm->mirrors, struct hmm_mirror,
@@ -938,7 +936,7 @@ void hmm_range_unregister(struct hmm_range *range)
 		return;
 
 	mutex_lock(&hmm->lock);
-	list_del(&range->list);
+	list_del_init(&range->list);
 	mutex_unlock(&hmm->lock);
 
 	/* Drop reference taken by hmm_range_register() */

* Re: [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister
  2019-06-15 14:17     ` Christoph Hellwig
@ 2019-06-18 18:04       ` Jason Gunthorpe
  0 siblings, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 18:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, Souptick Joarder,
	linux-mm, Jerome Glisse, amd-gfx, Ira Weiny, Ben Skeggs

On Sat, Jun 15, 2019 at 07:17:26AM -0700, Christoph Hellwig wrote:
> > -	/* Sanity check this really should not happen. */
> > -	if (hmm == NULL || range->end <= range->start)
> > -		return;
> > -
> >  	mutex_lock(&hmm->lock);
> >  	list_del_rcu(&range->list);
> >  	mutex_unlock(&hmm->lock);
> >  
> >  	/* Drop reference taken by hmm_range_register() */
> > -	range->valid = false;
> >  	mmput(hmm->mm);
> >  	hmm_put(hmm);
> > -	range->hmm = NULL;
> > +
> > +	/*
> > +	 * The range is now invalid and the ref on the hmm is dropped, so
> > +         * poison the pointer.  Leave other fields in place, for the caller's
> > +         * use.
> > +         */
> > +	range->valid = false;
> > +	memset(&range->hmm, POISON_INUSE, sizeof(range->hmm));
> 
> Formatting seems to be messed up.  But again I don't see the value
> in the poisoning, just let normal linked list debugging do its work.
> The other cleanups looks fine to me.

Tabs vs spaces; I fixed it. This one is murkier than the other - it
is there to prevent the caller from using any of the range APIs after
the range is unregistered, but we could also safely use NULL here, I
think.

Jason

* Re: [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable
  2019-06-15 14:12     ` Christoph Hellwig
  2019-06-18  0:36       ` Jason Gunthorpe
@ 2019-06-18 18:55       ` Jason Gunthorpe
  1 sibling, 0 replies; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 18:55 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ira Weiny, Ben Skeggs

On Sat, Jun 15, 2019 at 07:12:11AM -0700, Christoph Hellwig wrote:
> > +	spin_lock(&mm->page_table_lock);
> > +	if (mm->hmm) {
> > +		if (kref_get_unless_zero(&mm->hmm->kref)) {
> > +			spin_unlock(&mm->page_table_lock);
> > +			return mm->hmm;
> > +		}
> > +	}
> > +	spin_unlock(&mm->page_table_lock);
> 
> This could become:
> 
> 	spin_lock(&mm->page_table_lock);
> 	hmm = mm->hmm
> 	if (hmm && kref_get_unless_zero(&hmm->kref))
> 		goto out_unlock;
> 	spin_unlock(&mm->page_table_lock);
> 
> as the last two lines of the function already drop the page_table_lock
> and then return hmm.  Or drop the "hmm = mm->hmm" asignment above and
> return mm->hmm as that should be always identical to hmm at the end
> to save another line.
> 
> > +	/*
> > +	 * The mm->hmm pointer is kept valid while notifier ops can be running
> > +	 * so they don't have to deal with a NULL mm->hmm value
> > +	 */
> 
> The comment confuses me.  How does the page_table_lock relate to
> possibly running notifiers, as I can't find that we take
> page_table_lock?  Or is it just about the fact that we only clear
> mm->hmm in the free callback, and not in hmm_free?

Revised with:

From bdc02a1d502db08457823e6b2b983861a3574a76 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg@mellanox.com>
Date: Thu, 23 May 2019 10:24:13 -0300
Subject: [PATCH] mm/hmm: Simplify hmm_get_or_create and make it reliable

As coded this function can false-fail in various racy situations. Make it
reliable and simpler by running under the write side of the mmap_sem and
avoiding the false-failing compare/exchange pattern. Due to the mmap_sem
this no longer has to avoid racing with a 2nd parallel
hmm_get_or_create().

Unfortunately this still has to use the page_table_lock as the
non-sleeping lock protecting mm->hmm, since the contexts where we free the
hmm are incompatible with mmap_sem.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Tested-by: Philip Yang <Philip.Yang@amd.com>
---
v2:
- Fix error unwind of mmgrab (Jerome)
- Use hmm local instead of 2nd container_of (Jerome)
v3:
- Can't use mmap_sem in the SRCU callback, keep using the
  page_table_lock (Philip)
v4:
- Put the mm->hmm = NULL in the kref release, reduce LOC
  in hmm_get_or_create() (Christoph)
---
 mm/hmm.c | 77 ++++++++++++++++++++++----------------------------------
 1 file changed, 30 insertions(+), 47 deletions(-)

diff --git a/mm/hmm.c b/mm/hmm.c
index 080b17a2e87e2d..0423f4ca3a7e09 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -31,16 +31,6 @@
 #if IS_ENABLED(CONFIG_HMM_MIRROR)
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
 
-static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
-{
-	struct hmm *hmm = READ_ONCE(mm->hmm);
-
-	if (hmm && kref_get_unless_zero(&hmm->kref))
-		return hmm;
-
-	return NULL;
-}
-
 /**
  * hmm_get_or_create - register HMM against an mm (HMM internal)
  *
@@ -55,11 +45,16 @@ static inline struct hmm *mm_get_hmm(struct mm_struct *mm)
  */
 static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 {
-	struct hmm *hmm = mm_get_hmm(mm);
-	bool cleanup = false;
+	struct hmm *hmm;
+
+	lockdep_assert_held_exclusive(&mm->mmap_sem);
 
-	if (hmm)
-		return hmm;
+	/* Abuse the page_table_lock to also protect mm->hmm. */
+	spin_lock(&mm->page_table_lock);
+	hmm = mm->hmm;
+	if (mm->hmm && kref_get_unless_zero(&mm->hmm->kref))
+		goto out_unlock;
+	spin_unlock(&mm->page_table_lock);
 
 	hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
 	if (!hmm)
@@ -74,57 +69,45 @@ static struct hmm *hmm_get_or_create(struct mm_struct *mm)
 	hmm->notifiers = 0;
 	hmm->dead = false;
 	hmm->mm = mm;
-	mmgrab(hmm->mm);
 
-	spin_lock(&mm->page_table_lock);
-	if (!mm->hmm)
-		mm->hmm = hmm;
-	else
-		cleanup = true;
-	spin_unlock(&mm->page_table_lock);
+	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
+	if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
+		kfree(hmm);
+		return NULL;
+	}
 
-	if (cleanup)
-		goto error;
+	mmgrab(hmm->mm);
 
 	/*
-	 * We should only get here if hold the mmap_sem in write mode ie on
-	 * registration of first mirror through hmm_mirror_register()
+	 * We hold the exclusive mmap_sem here so we know that mm->hmm is
+	 * still NULL or 0 kref, and is safe to update.
 	 */
-	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
-	if (__mmu_notifier_register(&hmm->mmu_notifier, mm))
-		goto error_mm;
-
-	return hmm;
-
-error_mm:
 	spin_lock(&mm->page_table_lock);
-	if (mm->hmm == hmm)
-		mm->hmm = NULL;
+	mm->hmm = hmm;
+
+out_unlock:
 	spin_unlock(&mm->page_table_lock);
-error:
-	mmdrop(hmm->mm);
-	kfree(hmm);
-	return NULL;
+	return hmm;
 }
 
 static void hmm_free_rcu(struct rcu_head *rcu)
 {
-	kfree(container_of(rcu, struct hmm, rcu));
+	struct hmm *hmm = container_of(rcu, struct hmm, rcu);
+
+	mmdrop(hmm->mm);
+	kfree(hmm);
 }
 
 static void hmm_free(struct kref *kref)
 {
 	struct hmm *hmm = container_of(kref, struct hmm, kref);
-	struct mm_struct *mm = hmm->mm;
 
-	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, mm);
+	spin_lock(&hmm->mm->page_table_lock);
+	if (hmm->mm->hmm == hmm)
+		hmm->mm->hmm = NULL;
+	spin_unlock(&hmm->mm->page_table_lock);
 
-	spin_lock(&mm->page_table_lock);
-	if (mm->hmm == hmm)
-		mm->hmm = NULL;
-	spin_unlock(&mm->page_table_lock);
-
-	mmdrop(hmm->mm);
+	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
 	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
 }
 
-- 
2.21.0

_______________________________________________
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
  2019-06-18 13:27           ` Christoph Hellwig
@ 2019-06-18 18:57             ` Jason Gunthorpe
       [not found]               ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-18 18:57 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ben Skeggs

On Tue, Jun 18, 2019 at 06:27:22AM -0700, Christoph Hellwig wrote:
> On Tue, Jun 18, 2019 at 10:13:24AM -0300, Jason Gunthorpe wrote:
> > > I don't even think we even need to bother with the POISON, normal list
> > > debugging will already catch a double unregistration anyway.
> > 
> > mirror->hmm isn't a list so list debugging won't help.
> > 
> > My concern when I wrote this was that one of the in flight patches I
> > can't see might be depending on this double-unregister-is-safe
> > behavior, so I wanted them to crash reliably.
> > 
> > It is a really overly conservative thing to do..
> 
> mirror->list is a list, and if we do a list_del on it during the
> second unregistration it will trip up on the list poisoning.

With the previous loose coupling of the mirror and the range some code
might race to try to create a range without a mirror, which will now
reliably crash with the poison.

It isn't so much the double unregister that worries me, but racing
unregister with range functions.

Jason

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
       [not found]             ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
@ 2019-06-19  0:53               ` Kuehling, Felix
       [not found]                 ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Kuehling, Felix @ 2019-06-19  0:53 UTC (permalink / raw)
  To: Christoph Hellwig, Jason Gunthorpe
  Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On 2019-06-18 1:37, Christoph Hellwig wrote:
> On Mon, Jun 17, 2019 at 09:45:09PM -0300, Jason Gunthorpe wrote:
>> Am I looking at the wrong thing? Looks like it calls it through a work
>> queue so it should be OK..
> Yes, it calls it through a work queue.  I guess that is fine because
> it needs to take the lock again.
>
>> Though very strange that amdgpu only destroys the mirror via release,
>> that cannot be right.
> As said, the whole thing looks rather odd to me.

This code is derived from our old MMU notifier code. Before HMM we used 
to register a single MMU notifier per mm_struct and look up virtual 
address ranges that had been registered for mirroring via driver API 
calls. The idea was to reuse a single MMU notifier for the life time of 
the process. It would remain registered until we got a notifier_release.

hmm_mirror took the place of that when we converted the code to HMM.

I suppose we could destroy the mirror earlier, when we have no more 
registered virtual address ranges, and create a new one if needed later.

Regards,
   Felix


_______________________________________________
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
       [not found]                 ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org>
@ 2019-06-19  8:07                   ` Christoph Hellwig
  2019-06-19 11:56                     ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19  8:07 UTC (permalink / raw)
  To: Kuehling, Felix
  Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	Jason Gunthorpe, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Wed, Jun 19, 2019 at 12:53:55AM +0000, Kuehling, Felix wrote:
> This code is derived from our old MMU notifier code. Before HMM we used 
> to register a single MMU notifier per mm_struct and look up virtual 
> address ranges that had been registered for mirroring via driver API 
> calls. The idea was to reuse a single MMU notifier for the life time of 
> the process. It would remain registered until we got a notifier_release.
> 
> hmm_mirror took the place of that when we converted the code to HMM.
> 
> I suppose we could destroy the mirror earlier, when we have no more 
> registered virtual address ranges, and create a new one if needed later.

I didn't write the code, but if you look at hmm_mirror it already is
a multiplexer over the mmu notifier, and the intent clearly seems that
you register one per range that you want to mirror, and not multiplex
it once again.  In other words - I think each amdgpu_mn_node should
probably have its own hmm_mirror.  And while the amdgpu_mn_node objects
are currently stored in an interval tree it seems like they are only
linearly iterated anyway, so a list actually seems pretty suitable.  If
not we need to improve the core data structures instead of working
around them.

* Re: [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register
       [not found]         ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org>
@ 2019-06-19  8:14           ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19  8:14 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ira Weiny, Ben Skeggs

On Tue, Jun 18, 2019 at 10:05:44AM -0300, Jason Gunthorpe wrote:
> I'm thinking to tackle that as part of the mmu notififer invlock
> idea.. Once the range looses the lock then we don't really need to
> register it at all.

Ok.

* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
       [not found]         ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org>
@ 2019-06-19  8:18           ` Christoph Hellwig
  2019-06-19 11:34             ` Jason Gunthorpe
  0 siblings, 1 reply; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19  8:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

>  	mutex_lock(&hmm->lock);
> -	list_del(&range->list);
> +	list_del_init(&range->list);
>  	mutex_unlock(&hmm->lock);

I don't see the point why this is a list_del_init - that just
reinitializes range->list, but doesn't change anything for the list
head it was removed from.  (and if the list_del_init was intended
a later patch in your branch reverts it to plain list_del..)

* Re: [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration
       [not found]               ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org>
@ 2019-06-19  8:19                 ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19  8:19 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Tue, Jun 18, 2019 at 03:57:57PM -0300, Jason Gunthorpe wrote:
> With the previous loose coupling of the mirror and the range some code
> might race to try to create a range without a mirror, which will now
> reliably crash with the poison.
> 
> It isn't so much the double unregister that worries me, but racing
> unregister with range functions.

Oh well.  It was just a nitpick for the highly unusual code patterns
in the two unregister routines, probably not worth fighting over even
if I still don't see the point.

* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
  2019-06-19  8:18           ` Christoph Hellwig
@ 2019-06-19 11:34             ` Jason Gunthorpe
       [not found]               ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-19 11:34 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell, linux-rdma,
	John Hubbard, Felix.Kuehling, dri-devel, linux-mm, Jerome Glisse,
	amd-gfx, Ben Skeggs

On Wed, Jun 19, 2019 at 01:18:58AM -0700, Christoph Hellwig wrote:
> >  	mutex_lock(&hmm->lock);
> > -	list_del(&range->list);
> > +	list_del_init(&range->list);
> >  	mutex_unlock(&hmm->lock);
> 
> I don't see the point why this is a list_del_init - that just
> reinitializes range->list, but doesn't change anything for the list
> head it was removed from.  (and if the list_del_init was intended
> a later patch in your branch reverts it to plain list_del..)

Just following the instructions:

/**
 * list_empty_careful - tests whether a list is empty and not being modified
 * @head: the list to test
 *
 * Description:
 * tests whether a list is empty _and_ checks that no other CPU might be
 * in the process of modifying either member (next or prev)
 *
 * NOTE: using list_empty_careful() without synchronization
 * can only be safe if the only activity that can happen
 * to the list entry is list_del_init(). Eg. it cannot be used
 * if another CPU could re-list_add() it.
 */

Agree it doesn't seem obvious why this is relevant when checking the
list head..

Maybe the comment is a bit misleading?

Jason

* Re: [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range
       [not found]               ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org>
@ 2019-06-19 11:54                 ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19 11:54 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Philip Yang, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard,
	Felix.Kuehling-5C7GfCeVMHo,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Wed, Jun 19, 2019 at 08:34:52AM -0300, Jason Gunthorpe wrote:
> /**
>  * list_empty_careful - tests whether a list is empty and not being modified
>  * @head: the list to test
>  *
>  * Description:
>  * tests whether a list is empty _and_ checks that no other CPU might be
>  * in the process of modifying either member (next or prev)
>  *
>  * NOTE: using list_empty_careful() without synchronization
>  * can only be safe if the only activity that can happen
>  * to the list entry is list_del_init(). Eg. it cannot be used
>  * if another CPU could re-list_add() it.
>  */
> 
> Agree it doesn't seem obvious why this is relevant when checking the
> list head..
> 
> Maybe the comment is a bit misleading?

From looking at the commit log in the history tree list_empty_careful
was initially added by Linus, and then mingo added that comment later.
I don't see how list_del_init would change anything here, so I suspect
list_del_init was just used as an interchangeable shorthand for
list_del.

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
  2019-06-19  8:07                   ` Christoph Hellwig
@ 2019-06-19 11:56                     ` Jason Gunthorpe
       [not found]                       ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org>
  0 siblings, 1 reply; 45+ messages in thread
From: Jason Gunthorpe @ 2019-06-19 11:56 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell, linux-rdma,
	John Hubbard, Kuehling, Felix, dri-devel, linux-mm,
	Jerome Glisse, amd-gfx, Ben Skeggs

On Wed, Jun 19, 2019 at 01:07:05AM -0700, Christoph Hellwig wrote:
> On Wed, Jun 19, 2019 at 12:53:55AM +0000, Kuehling, Felix wrote:
> > This code is derived from our old MMU notifier code. Before HMM we used 
> > to register a single MMU notifier per mm_struct and look up virtual 
> > address ranges that had been registered for mirroring via driver API 
> > calls. The idea was to reuse a single MMU notifier for the life time of 
> > the process. It would remain registered until we got a notifier_release.
> > 
> > hmm_mirror took the place of that when we converted the code to HMM.
> > 
> > I suppose we could destroy the mirror earlier, when we have no more 
> > registered virtual address ranges, and create a new one if needed later.
> 
> I didn't write the code, but if you look at hmm_mirror it already is
> a multiplexer over the mmu notifier, and the intent clearly seems that
> you register one per range that you want to mirror, and not multiplex
> it once again.  In other words - I think each amdgpu_mn_node should
> probably have its own hmm_mirror.  And while the amdgpu_mn_node objects
> are currently stored in an interval tree it seems like they are only
> linearly iterated anyway, so a list actually seems pretty suitable.  If
> not we need to improve the core data structures instead of working
> around them.

This looks a lot like the ODP code (amdgpu_mn_node == ib_umem_odp)

The interval tree is to quickly find the driver object(s) that have
the virtual pages during invalidation:

static int amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
                        const struct hmm_update *update)
{
        it = interval_tree_iter_first(&amn->objects, start, end);
        while (it) {
                [..]
                amdgpu_mn_invalidate_node(node, start, end);

And following the ODP model there should be a single hmm_mirror per-mm
(user can fork and stuff, this is something I want to have core code
help with). 

The hmm_mirror can either exist so long as objects exist, or it can
exist until the chardev is closed - but never longer than the
chardev's lifetime.

Maybe we should be considering providing a mmu notifier & interval
tree & lock abstraction since ODP & AMD are very similar here..

Jason

* Re: [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release
       [not found]                       ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org>
@ 2019-06-19 12:03                         ` Christoph Hellwig
  0 siblings, 0 replies; 45+ messages in thread
From: Christoph Hellwig @ 2019-06-19 12:03 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, Yang, Philip, Ralph Campbell,
	linux-rdma-u79uwXL29TY76Z2rM5mHXA, John Hubbard, Kuehling, Felix,
	dri-devel-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Christoph Hellwig,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Jerome Glisse,
	amd-gfx-PD4FTy7X32lNgt0PjOBp9y5qC8QIuHrW, Ben Skeggs

On Wed, Jun 19, 2019 at 08:56:32AM -0300, Jason Gunthorpe wrote:
> This looks a lot like the ODP code (amdgpu_mn_node == ib_umem_odp)
> 
> The interval tree is to quickly find the driver object(s) that have
> the virtual pages during invalidation:
> 
> static int amdgpu_mn_sync_pagetables_gfx(struct hmm_mirror *mirror,
>                         const struct hmm_update *update)
> {
>         it = interval_tree_iter_first(&amn->objects, start, end);
>         while (it) {
>                 [..]
>                 amdgpu_mn_invalidate_node(node, start, end);
> 
> And following the ODP model there should be a single hmm_mirror per-mm
> (user can fork and stuff, this is something I want to have core code
> help with). 

That makes the hmm_mirror object pretty silly, though, as the scope
is then exactly the same as the mmu_notifier itself.

> The hmm_mirror can either exist so long as objects exist, or it can
> exist until the chardev is closed - but never longer than the
> chardev's lifetime.
> 
> Maybe we should be considering providing a mmu notifier & interval
> tree & lock abstraction since ODP & AMD are very similar here..

It definitely sounds like a good idea to move this kind of object
management into common code.  Nouveau actually seems like the odd one
out here by not having a list of objects below the mirror, but then
again an interval tree with a single entry wouldn't really hurt it
either.

end of thread, other threads:[~2019-06-19 12:03 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-06-14  0:44 [PATCH v3 hmm 00/12] mm/hmm: Various revisions from a locking/code review Jason Gunthorpe
2019-06-14  0:44 ` [PATCH v3 hmm 01/12] mm/hmm: fix use after free with struct hmm in the mmu notifiers Jason Gunthorpe
     [not found]   ` <20190614004450.20252-2-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:56     ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 02/12] mm/hmm: Use hmm_mirror not mm as an argument for hmm_range_register Jason Gunthorpe
     [not found]   ` <20190614004450.20252-3-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:59     ` Christoph Hellwig
2019-06-18 13:05       ` Jason Gunthorpe
     [not found]         ` <20190618130544.GC6961-uk2M96/98Pc@public.gmane.org>
2019-06-19  8:14           ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 03/12] mm/hmm: Hold a mmgrab from hmm to mm Jason Gunthorpe
     [not found]   ` <20190614004450.20252-4-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 13:59     ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 04/12] mm/hmm: Simplify hmm_get_or_create and make it reliable Jason Gunthorpe
     [not found]   ` <20190614004450.20252-5-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:12     ` Christoph Hellwig
2019-06-18  0:36       ` Jason Gunthorpe
2019-06-18 18:55       ` Jason Gunthorpe
2019-06-14  0:44 ` [PATCH v3 hmm 05/12] mm/hmm: Remove duplicate condition test before wait_event_timeout Jason Gunthorpe
     [not found]   ` <20190614004450.20252-6-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:12     ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 06/12] mm/hmm: Hold on to the mmget for the lifetime of the range Jason Gunthorpe
     [not found]   ` <20190614004450.20252-7-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:14     ` Christoph Hellwig
2019-06-18 15:11       ` Jason Gunthorpe
     [not found]         ` <20190618151100.GI6961-uk2M96/98Pc@public.gmane.org>
2019-06-19  8:18           ` Christoph Hellwig
2019-06-19 11:34             ` Jason Gunthorpe
     [not found]               ` <20190619113452.GB9360-uk2M96/98Pc@public.gmane.org>
2019-06-19 11:54                 ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 07/12] mm/hmm: Use lockdep instead of comments Jason Gunthorpe
     [not found]   ` <20190614004450.20252-8-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:14     ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 08/12] mm/hmm: Remove racy protection against double-unregistration Jason Gunthorpe
     [not found]   ` <20190614004450.20252-9-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:16     ` Christoph Hellwig
2019-06-18 13:13       ` Jason Gunthorpe
     [not found]         ` <20190618131324.GF6961-uk2M96/98Pc@public.gmane.org>
2019-06-18 13:27           ` Christoph Hellwig
2019-06-18 18:57             ` Jason Gunthorpe
     [not found]               ` <20190618185757.GP6961-uk2M96/98Pc@public.gmane.org>
2019-06-19  8:19                 ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 09/12] mm/hmm: Poison hmm_range during unregister Jason Gunthorpe
     [not found]   ` <20190614004450.20252-10-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:17     ` Christoph Hellwig
2019-06-18 18:04       ` Jason Gunthorpe
2019-06-14  0:44 ` [PATCH v3 hmm 10/12] mm/hmm: Do not use list*_rcu() for hmm->ranges Jason Gunthorpe
     [not found]   ` <20190614004450.20252-11-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:18     ` Christoph Hellwig
2019-06-18  0:38       ` Jason Gunthorpe
2019-06-14  0:44 ` [PATCH v3 hmm 11/12] mm/hmm: Remove confusing comment and logic from hmm_release Jason Gunthorpe
     [not found]   ` <20190614004450.20252-12-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:21     ` Christoph Hellwig
2019-06-18  0:45       ` Jason Gunthorpe
     [not found]         ` <20190618004509.GE30762-uk2M96/98Pc@public.gmane.org>
2019-06-18  5:37           ` Christoph Hellwig
     [not found]             ` <20190618053733.GA25048-wEGCiKHe2LqWVfeAwA7xHQ@public.gmane.org>
2019-06-19  0:53               ` Kuehling, Felix
     [not found]                 ` <be4f8573-6284-04a6-7862-23bb357bfe3c-5C7GfCeVMHo@public.gmane.org>
2019-06-19  8:07                   ` Christoph Hellwig
2019-06-19 11:56                     ` Jason Gunthorpe
     [not found]                       ` <20190619115632.GC9360-uk2M96/98Pc@public.gmane.org>
2019-06-19 12:03                         ` Christoph Hellwig
2019-06-14  0:44 ` [PATCH v3 hmm 12/12] mm/hmm: Fix error flows in hmm_invalidate_range_start Jason Gunthorpe
     [not found]   ` <20190614004450.20252-13-jgg-uk2M96/98Pc@public.gmane.org>
2019-06-15 14:25     ` Christoph Hellwig
