IOMMU Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations
@ 2019-08-06 23:15 Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
                   ` (12 more replies)
  0 siblings, 13 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This series introduces a new registration flow for mmu_notifiers based on
the idea that the user would like to get a single refcounted piece of
memory for a mm, keyed to its use.

For instance many users of mmu_notifiers use an interval tree or similar
to dispatch notifications to some object. There are many objects but only
one notifier subscription per mm holding the tree.

Of the 12 places that call mmu_notifier_register:
 - 7 are maintaining some kind of obvious mapping of mm_struct to
   mmu_notifier registration, ie in some linked list or hash table. Of
   the 7 this series converts 4 (gru, hmm, RDMA, radeon)

 - 3 (hfi1, gntdev, vhost) are registering multiple notifiers, but each
   one immediately does some VA range filtering, ie with an interval tree.
   These would be better with a global subsystem-wide range filter and
   could convert to this API.

 - 2 (kvm, amd_iommu) are deliberately using a single mm at a time, and
   really can't use this API. One of the intel-svm's modes is also in this
   list

The 3/7 unconverted drivers are:
 - intel-svm
   This driver tracks mm's in a global linked list 'global_svm_list'
   and would benefit from this API.

   Its flow is a bit complex, since it also wants a set of non-shared
   notifiers.

 - i915_gem_usrptr
   This driver tracks mm's in a per-device hash
   table (dev_priv->mm_structs), but only has an optional use of
   mmu_notifiers.  Since it still seems to need the hash table it is
   difficult to convert.

 - amdkfd/kfd_process
   This driver is using a global SRCU hash table to track mm's

   The control flow here is very complicated and the driver is relying on
   this hash table to be fast on the ioctl syscall path.

   It would definitely benefit, but only if the ioctl path didn't need to
   do the search so often.

This series is already entangled with patches in the hmm & RDMA tree and
will require some git topic branches for the RDMA ODP stuff. I intend for
it to go through the hmm tree.

There is a git version here:

https://github.com/jgunthorpe/linux/commits/mmu_notifier

Which has the required pre-patches for the RDMA ODP conversion that are
still being reviewed.

Jason Gunthorpe (11):
  mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the
    caller
  mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
  mm/mmu_notifiers: add a get/put scheme for the registration
  misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
  hmm: use mmu_notifier_get/put for 'struct hmm'
  RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
  RDMA/odp: remove ib_ucontext from ib_umem
  drm/radeon: use mmu_notifier_get/put for struct radeon_mn
  drm/amdkfd: fix a use after free race with mmu_notifer unregister
  drm/amdkfd: use mmu_notifier_put
  mm/mmu_notifiers: remove unregister_no_release

 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |   1 +
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |   3 -
 drivers/gpu/drm/amd/amdkfd/kfd_process.c |  88 ++++-----
 drivers/gpu/drm/nouveau/nouveau_drm.c    |   3 +
 drivers/gpu/drm/radeon/radeon.h          |   3 -
 drivers/gpu/drm/radeon/radeon_device.c   |   2 -
 drivers/gpu/drm/radeon/radeon_drv.c      |   2 +
 drivers/gpu/drm/radeon/radeon_mn.c       | 157 ++++------------
 drivers/infiniband/core/umem.c           |   4 +-
 drivers/infiniband/core/umem_odp.c       | 183 ++++++------------
 drivers/infiniband/core/uverbs_cmd.c     |   3 -
 drivers/infiniband/core/uverbs_main.c    |   1 +
 drivers/infiniband/hw/mlx5/main.c        |   5 -
 drivers/misc/sgi-gru/grufile.c           |   1 +
 drivers/misc/sgi-gru/grutables.h         |   2 -
 drivers/misc/sgi-gru/grutlbpurge.c       |  84 +++------
 include/linux/hmm.h                      |  12 +-
 include/linux/mm_types.h                 |   6 -
 include/linux/mmu_notifier.h             |  40 +++-
 include/rdma/ib_umem.h                   |   2 +-
 include/rdma/ib_umem_odp.h               |  10 +-
 include/rdma/ib_verbs.h                  |   3 -
 kernel/fork.c                            |   1 -
 mm/hmm.c                                 | 121 +++---------
 mm/mmu_notifier.c                        | 230 +++++++++++++++++------
 25 files changed, 408 insertions(+), 559 deletions(-)

-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-08 10:24   ` Christoph Hellwig
  2019-08-14 20:14   ` Ralph Campbell
  2019-08-06 23:15 ` [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm Jason Gunthorpe
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This simplifies the code to not have so many one line functions and extra
logic. __mmu_notifier_register() simply becomes the entry point to
register the notifier, and the other one calls it under lock.

Also add a lockdep_assert to check that the callers are holding the lock
as expected.

Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/mmu_notifier.c | 35 ++++++++++++++---------------------
 1 file changed, 14 insertions(+), 21 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index b5670620aea0fc..218a6f108bc2d0 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -236,22 +236,22 @@ void __mmu_notifier_invalidate_range(struct mm_struct *mm,
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range);
 
-static int do_mmu_notifier_register(struct mmu_notifier *mn,
-				    struct mm_struct *mm,
-				    int take_mmap_sem)
+/*
+ * Same as mmu_notifier_register but here the caller must hold the
+ * mmap_sem in write mode.
+ */
+int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct mmu_notifier_mm *mmu_notifier_mm;
 	int ret;
 
+	lockdep_assert_held_write(&mm->mmap_sem);
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 
-	ret = -ENOMEM;
 	mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
 	if (unlikely(!mmu_notifier_mm))
-		goto out;
+		return -ENOMEM;
 
-	if (take_mmap_sem)
-		down_write(&mm->mmap_sem);
 	ret = mm_take_all_locks(mm);
 	if (unlikely(ret))
 		goto out_clean;
@@ -279,13 +279,11 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
 
 	mm_drop_all_locks(mm);
 out_clean:
-	if (take_mmap_sem)
-		up_write(&mm->mmap_sem);
 	kfree(mmu_notifier_mm);
-out:
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(__mmu_notifier_register);
 
 /*
  * Must not hold mmap_sem nor any other VM related lock when calling
@@ -302,19 +300,14 @@ static int do_mmu_notifier_register(struct mmu_notifier *mn,
  */
 int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 {
-	return do_mmu_notifier_register(mn, mm, 1);
-}
-EXPORT_SYMBOL_GPL(mmu_notifier_register);
+	int ret;
 
-/*
- * Same as mmu_notifier_register but here the caller must hold the
- * mmap_sem in write mode.
- */
-int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
-{
-	return do_mmu_notifier_register(mn, mm, 0);
+	down_write(&mm->mmap_sem);
+	ret = __mmu_notifier_register(mn, mm);
+	up_write(&mm->mmap_sem);
+	return ret;
 }
-EXPORT_SYMBOL_GPL(__mmu_notifier_register);
+EXPORT_SYMBOL_GPL(mmu_notifier_register);
 
 /* this is called after the last mmu_notifier_unregister() returned */
 void __mmu_notifier_mm_destroy(struct mm_struct *mm)
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-08 10:26   ` Christoph Hellwig
  2019-08-14 20:32   ` Ralph Campbell
  2019-08-06 23:15 ` [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration Jason Gunthorpe
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

A prior commit e0f3c3f78da2 ("mm/mmu_notifier: init notifier if necessary")
made an attempt at doing this, but had to be reverted as calling
the GFP_KERNEL allocator under the i_mmap_mutex causes deadlock, see
commit 35cfa2b0b491 ("mm/mmu_notifier: allocate mmu_notifier in advance").

However, we can avoid that problem by doing the allocation only under
the mmap_sem, which is already happening.

Since all writers to mm->mmu_notifier_mm hold the write side of the
mmap_sem reading it under that sem is deterministic and we can use that to
decide if the allocation path is required, without speculation.

The actual update to mmu_notifier_mm must still be done under the
mm_take_all_locks() to ensure read-side coherency.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 mm/mmu_notifier.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 218a6f108bc2d0..696810f632ade1 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -242,27 +242,32 @@ EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range);
  */
 int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 {
-	struct mmu_notifier_mm *mmu_notifier_mm;
+	struct mmu_notifier_mm *mmu_notifier_mm = NULL;
 	int ret;
 
 	lockdep_assert_held_write(&mm->mmap_sem);
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 
-	mmu_notifier_mm = kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
-	if (unlikely(!mmu_notifier_mm))
-		return -ENOMEM;
+	if (!mm->mmu_notifier_mm) {
+		/*
+		 * kmalloc cannot be called under mm_take_all_locks(), but we
+		 * know that mm->mmu_notifier_mm can't change while we hold
+		 * the write side of the mmap_sem.
+		 */
+		mmu_notifier_mm =
+			kmalloc(sizeof(struct mmu_notifier_mm), GFP_KERNEL);
+		if (!mmu_notifier_mm)
+			return -ENOMEM;
+
+		INIT_HLIST_HEAD(&mmu_notifier_mm->list);
+		spin_lock_init(&mmu_notifier_mm->lock);
+	}
 
 	ret = mm_take_all_locks(mm);
 	if (unlikely(ret))
 		goto out_clean;
 
-	if (!mm_has_notifiers(mm)) {
-		INIT_HLIST_HEAD(&mmu_notifier_mm->list);
-		spin_lock_init(&mmu_notifier_mm->lock);
-
-		mm->mmu_notifier_mm = mmu_notifier_mm;
-		mmu_notifier_mm = NULL;
-	}
+	/* Pairs with the mmdrop in mmu_notifier_unregister_* */
 	mmgrab(mm);
 
 	/*
@@ -273,14 +278,19 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 	 * We can't race against any other mmu notifier method either
 	 * thanks to mm_take_all_locks().
 	 */
+	if (mmu_notifier_mm)
+		mm->mmu_notifier_mm = mmu_notifier_mm;
+
 	spin_lock(&mm->mmu_notifier_mm->lock);
 	hlist_add_head_rcu(&mn->hlist, &mm->mmu_notifier_mm->list);
 	spin_unlock(&mm->mmu_notifier_mm->lock);
 
 	mm_drop_all_locks(mm);
+	BUG_ON(atomic_read(&mm->mm_users) <= 0);
+	return 0;
+
 out_clean:
 	kfree(mmu_notifier_mm);
-	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_register);
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-14 21:20   ` Ralph Campbell
  2019-08-06 23:15 ` [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct Jason Gunthorpe
                   ` (9 subsequent siblings)
  12 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

Many places in the kernel have a flow where userspace will create some
object and that object will need to connect to the subsystem's
mmu_notifier subscription for the duration of its lifetime.

In this case the subsystem is usually tracking multiple mm_structs and it
is difficult to keep track of what struct mmu_notifier's have been
allocated for what mm's.

Since this has been open coded in a variety of exciting ways, provide core
functionality to do this safely.

This approach uses the strct mmu_notifier_ops * as a key to determine if
the subsystem has a notifier registered on the mm or not. If there is a
registration then the existing notifier struct is returned, otherwise the
ops->alloc_notifiers() is used to create a new per-subsystem notifier for
the mm.

The destroy side incorporates an async call_srcu based destruction which
will avoid bugs in the callers such as commit 6d7c3cde93c1 ("mm/hmm: fix
use after free with struct hmm in the mmu notifiers").

Since we are inside the mmu notifier core locking is fairly simple, the
allocation uses the same approach as for mmu_notifier_mm, the write side
of the mmap_sem makes everything deterministic and we only need to do
hlist_add_head_rcu() under the mm_take_all_locks(). The new users count
and the discoverability in the hlist is fully serialized by the
mmu_notifier_mm->lock.

Co-developed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 include/linux/mmu_notifier.h |  35 ++++++++
 mm/mmu_notifier.c            | 156 +++++++++++++++++++++++++++++++++--
 2 files changed, 185 insertions(+), 6 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index b6c004bd9f6ad9..31aa971315a142 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -211,6 +211,19 @@ struct mmu_notifier_ops {
 	 */
 	void (*invalidate_range)(struct mmu_notifier *mn, struct mm_struct *mm,
 				 unsigned long start, unsigned long end);
+
+	/*
+	 * These callbacks are used with the get/put interface to manage the
+	 * lifetime of the mmu_notifier memory. alloc_notifier() returns a new
+	 * notifier for use with the mm.
+	 *
+	 * free_notifier() is only called after the mmu_notifier has been
+	 * fully put, calls to any ops callback are prevented and no ops
+	 * callbacks are currently running. It is called from a SRCU callback
+	 * and cannot sleep.
+	 */
+	struct mmu_notifier *(*alloc_notifier)(struct mm_struct *mm);
+	void (*free_notifier)(struct mmu_notifier *mn);
 };
 
 /*
@@ -227,6 +240,9 @@ struct mmu_notifier_ops {
 struct mmu_notifier {
 	struct hlist_node hlist;
 	const struct mmu_notifier_ops *ops;
+	struct mm_struct *mm;
+	struct rcu_head rcu;
+	unsigned int users;
 };
 
 static inline int mm_has_notifiers(struct mm_struct *mm)
@@ -234,6 +250,21 @@ static inline int mm_has_notifiers(struct mm_struct *mm)
 	return unlikely(mm->mmu_notifier_mm);
 }
 
+struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
+					     struct mm_struct *mm);
+static inline struct mmu_notifier *
+mmu_notifier_get(const struct mmu_notifier_ops *ops, struct mm_struct *mm)
+{
+	struct mmu_notifier *ret;
+
+	down_write(&mm->mmap_sem);
+	ret = mmu_notifier_get_locked(ops, mm);
+	up_write(&mm->mmap_sem);
+	return ret;
+}
+void mmu_notifier_put(struct mmu_notifier *mn);
+void mmu_notifier_synchronize(void);
+
 extern int mmu_notifier_register(struct mmu_notifier *mn,
 				 struct mm_struct *mm);
 extern int __mmu_notifier_register(struct mmu_notifier *mn,
@@ -581,6 +612,10 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 #define pudp_huge_clear_flush_notify pudp_huge_clear_flush
 #define set_pte_at_notify set_pte_at
 
+static inline void mmu_notifier_synchronize(void)
+{
+}
+
 #endif /* CONFIG_MMU_NOTIFIER */
 
 #endif /* _LINUX_MMU_NOTIFIER_H */
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 696810f632ade1..4a770b5211b71d 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -248,6 +248,9 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 	lockdep_assert_held_write(&mm->mmap_sem);
 	BUG_ON(atomic_read(&mm->mm_users) <= 0);
 
+	mn->mm = mm;
+	mn->users = 1;
+
 	if (!mm->mmu_notifier_mm) {
 		/*
 		 * kmalloc cannot be called under mm_take_all_locks(), but we
@@ -295,18 +298,24 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_register);
 
-/*
+/**
+ * mmu_notifier_register - Register a notifier on a mm
+ * @mn: The notifier to attach
+ * @mm : The mm to attach the notifier to
+ *
  * Must not hold mmap_sem nor any other VM related lock when calling
  * this registration function. Must also ensure mm_users can't go down
  * to zero while this runs to avoid races with mmu_notifier_release,
  * so mm has to be current->mm or the mm should be pinned safely such
  * as with get_task_mm(). If the mm is not current->mm, the mm_users
  * pin should be released by calling mmput after mmu_notifier_register
- * returns. mmu_notifier_unregister must be always called to
- * unregister the notifier. mm_count is automatically pinned to allow
- * mmu_notifier_unregister to safely run at any time later, before or
- * after exit_mmap. ->release will always be called before exit_mmap
- * frees the pages.
+ * returns.
+ *
+ * mmu_notifier_unregister() or mmu_notifier_put() must be always called to
+ * unregister the notifier.
+ *
+ * While the caller has a mmu_notifer get the mm pointer will remain valid,
+ * and can be converted to an active mm pointer via mmget_not_zero().
  */
 int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 {
@@ -319,6 +328,72 @@ int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_register);
 
+static struct mmu_notifier *
+find_get_mmu_notifier(struct mm_struct *mm, const struct mmu_notifier_ops *ops)
+{
+	struct mmu_notifier *mn;
+
+	spin_lock(&mm->mmu_notifier_mm->lock);
+	hlist_for_each_entry_rcu (mn, &mm->mmu_notifier_mm->list, hlist) {
+		if (mn->ops != ops)
+			continue;
+
+		if (likely(mn->users != UINT_MAX))
+			mn->users++;
+		else
+			mn = ERR_PTR(-EOVERFLOW);
+		spin_unlock(&mm->mmu_notifier_mm->lock);
+		return mn;
+	}
+	spin_unlock(&mm->mmu_notifier_mm->lock);
+	return NULL;
+}
+
+/**
+ * mmu_notifier_get_locked - Return the single struct mmu_notifier for
+ *                           the mm & ops
+ * @ops: The operations struct being subscribe with
+ * @mm : The mm to attach notifiers too
+ *
+ * This function either allocates a new mmu_notifier via
+ * ops->alloc_notifier(), or returns an already existing notifier on the
+ * list. The value of the ops pointer is used to determine when two notifiers
+ * are the same.
+ *
+ * Each call to mmu_notifier_get() must be paired with a caller to
+ * mmu_notifier_put(). The caller must hold the write side of mm->mmap_sem.
+ *
+ * While the caller has a mmu_notifer get the mm pointer will remain valid,
+ * and can be converted to an active mm pointer via mmget_not_zero().
+ */
+struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
+					     struct mm_struct *mm)
+{
+	struct mmu_notifier *mn;
+	int ret;
+
+	lockdep_assert_held_write(&mm->mmap_sem);
+
+	if (mm->mmu_notifier_mm) {
+		mn = find_get_mmu_notifier(mm, ops);
+		if (mn)
+			return mn;
+	}
+
+	mn = ops->alloc_notifier(mm);
+	if (IS_ERR(mn))
+		return mn;
+	mn->ops = ops;
+	ret = __mmu_notifier_register(mn, mm);
+	if (ret)
+		goto out_free;
+	return mn;
+out_free:
+	mn->ops->free_notifier(mn);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_get_locked);
+
 /* this is called after the last mmu_notifier_unregister() returned */
 void __mmu_notifier_mm_destroy(struct mm_struct *mm)
 {
@@ -397,6 +472,75 @@ void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
 
+static void mmu_notifier_free_rcu(struct rcu_head *rcu)
+{
+	struct mmu_notifier *mn = container_of(rcu, struct mmu_notifier, rcu);
+	struct mm_struct *mm = mn->mm;
+
+	mn->ops->free_notifier(mn);
+	/* Pairs with the get in __mmu_notifier_register() */
+	mmdrop(mm);
+}
+
+/**
+ * mmu_notifier_put - Release the reference on the notifier
+ * @mn: The notifier to act on
+ *
+ * This function must be paired with each mmu_notifier_get(), it releases the
+ * reference obtained by the get. If this is the last reference then process
+ * to free the notifier will be run asynchronously.
+ *
+ * Unlike mmu_notifier_unregister() the get/put flow only calls ops->release
+ * when the mm_struct is destroyed. Instead free_notifier is always called to
+ * release any resources held by the user.
+ *
+ * As ops->release is not guaranteed to be called, the user must ensure that
+ * all sptes are dropped, and no new sptes can be established before
+ * mmu_notifier_put() is called.
+ *
+ * This function can be called from the ops->release callback, however the
+ * caller must still ensure it is called pairwise with mmu_notifier_get().
+ *
+ * Modules calling this function must call mmu_notifier_module_unload() in
+ * their __exit functions to ensure the async work is completed.
+ */
+void mmu_notifier_put(struct mmu_notifier *mn)
+{
+	struct mm_struct *mm = mn->mm;
+
+	spin_lock(&mm->mmu_notifier_mm->lock);
+	if (WARN_ON(!mn->users) || --mn->users)
+		goto out_unlock;
+	hlist_del_init_rcu(&mn->hlist);
+	spin_unlock(&mm->mmu_notifier_mm->lock);
+
+	call_srcu(&srcu, &mn->rcu, mmu_notifier_free_rcu);
+	return;
+
+out_unlock:
+	spin_unlock(&mm->mmu_notifier_mm->lock);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_put);
+
+/**
+ * mmu_notifier_synchronize - Ensure all mmu_notifiers are freed
+ *
+ * This function ensures that all outsanding async SRU work from
+ * mmu_notifier_put() is completed. After it returns any mmu_notifier_ops
+ * associated with an unused mmu_notifier will no longer be called.
+ *
+ * Before using the caller must ensure that all of its mmu_notifiers have been
+ * fully released via mmu_notifier_put().
+ *
+ * Modules using the mmu_notifier_put() API should call this in their __exit
+ * function to avoid module unloading races.
+ */
+void mmu_notifier_synchronize(void)
+{
+	synchronize_srcu(&srcu);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_synchronize);
+
 bool
 mmu_notifier_range_update_to_read_only(const struct mmu_notifier_range *range)
 {
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (2 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-08 10:25   ` Christoph Hellwig
  2019-08-06 23:15 ` [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm' Jason Gunthorpe
                   ` (8 subsequent siblings)
  12 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

GRU is already using almost the same algorithm as get/put, it even
helpfully has a 10 year old comment to make this algorithm common, which
is finally happening.

There are a few differences and fixes from this conversion:
- GRU used rcu not srcu to read the hlist
- Unclear how the locking worked to prevent gru_register_mmu_notifier()
  from running concurrently with gru_drop_mmu_notifier() - this version is
  safe
- GRU had a release function which only set a variable without any locking
  that skiped the synchronize_srcu during unregister which looks racey,
  but this makes it reliable via the integrated call_srcu().
- It is unclear if the mmap_sem is actually held when
  __mmu_notifier_register() was called, lockdep will now warn if this is
  wrong

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/misc/sgi-gru/grufile.c     |  1 +
 drivers/misc/sgi-gru/grutables.h   |  2 -
 drivers/misc/sgi-gru/grutlbpurge.c | 84 +++++++++---------------------
 3 files changed, 25 insertions(+), 62 deletions(-)

diff --git a/drivers/misc/sgi-gru/grufile.c b/drivers/misc/sgi-gru/grufile.c
index a2a142ae087bfa..9d042310214ff9 100644
--- a/drivers/misc/sgi-gru/grufile.c
+++ b/drivers/misc/sgi-gru/grufile.c
@@ -573,6 +573,7 @@ static void __exit gru_exit(void)
 	gru_free_tables();
 	misc_deregister(&gru_miscdev);
 	gru_proc_exit();
+	mmu_notifier_synchronize();
 }
 
 static const struct file_operations gru_fops = {
diff --git a/drivers/misc/sgi-gru/grutables.h b/drivers/misc/sgi-gru/grutables.h
index 438191c220570c..a7e44b2eb413f6 100644
--- a/drivers/misc/sgi-gru/grutables.h
+++ b/drivers/misc/sgi-gru/grutables.h
@@ -307,10 +307,8 @@ struct gru_mm_tracker {				/* pack to reduce size */
 
 struct gru_mm_struct {
 	struct mmu_notifier	ms_notifier;
-	atomic_t		ms_refcnt;
 	spinlock_t		ms_asid_lock;	/* protects ASID assignment */
 	atomic_t		ms_range_active;/* num range_invals active */
-	char			ms_released;
 	wait_queue_head_t	ms_wait_queue;
 	DECLARE_BITMAP(ms_asidmap, GRU_MAX_GRUS);
 	struct gru_mm_tracker	ms_asids[GRU_MAX_GRUS];
diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c
index 59ba0adf23cee4..10921cd2608dfa 100644
--- a/drivers/misc/sgi-gru/grutlbpurge.c
+++ b/drivers/misc/sgi-gru/grutlbpurge.c
@@ -235,83 +235,47 @@ static void gru_invalidate_range_end(struct mmu_notifier *mn,
 		gms, range->start, range->end);
 }
 
-static void gru_release(struct mmu_notifier *mn, struct mm_struct *mm)
+static struct mmu_notifier *gru_alloc_notifier(struct mm_struct *mm)
 {
-	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
-						 ms_notifier);
+	struct gru_mm_struct *gms;
+
+	gms = kzalloc(sizeof(*gms), GFP_KERNEL);
+	if (!gms)
+		return ERR_PTR(-ENOMEM);
+	STAT(gms_alloc);
+	spin_lock_init(&gms->ms_asid_lock);
+	init_waitqueue_head(&gms->ms_wait_queue);
 
-	gms->ms_released = 1;
-	gru_dbg(grudev, "gms %p\n", gms);
+	return &gms->ms_notifier;
 }
 
+static void gru_free_notifier(struct mmu_notifier *mn)
+{
+	kfree(container_of(mn, struct gru_mm_struct, ms_notifier));
+	STAT(gms_free);
+}
 
 static const struct mmu_notifier_ops gru_mmuops = {
 	.invalidate_range_start	= gru_invalidate_range_start,
 	.invalidate_range_end	= gru_invalidate_range_end,
-	.release		= gru_release,
+	.alloc_notifier		= gru_alloc_notifier,
+	.free_notifier		= gru_free_notifier,
 };
 
-/* Move this to the basic mmu_notifier file. But for now... */
-static struct mmu_notifier *mmu_find_ops(struct mm_struct *mm,
-			const struct mmu_notifier_ops *ops)
-{
-	struct mmu_notifier *mn, *gru_mn = NULL;
-
-	if (mm->mmu_notifier_mm) {
-		rcu_read_lock();
-		hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list,
-					 hlist)
-		    if (mn->ops == ops) {
-			gru_mn = mn;
-			break;
-		}
-		rcu_read_unlock();
-	}
-	return gru_mn;
-}
-
 struct gru_mm_struct *gru_register_mmu_notifier(void)
 {
-	struct gru_mm_struct *gms;
 	struct mmu_notifier *mn;
-	int err;
-
-	mn = mmu_find_ops(current->mm, &gru_mmuops);
-	if (mn) {
-		gms = container_of(mn, struct gru_mm_struct, ms_notifier);
-		atomic_inc(&gms->ms_refcnt);
-	} else {
-		gms = kzalloc(sizeof(*gms), GFP_KERNEL);
-		if (!gms)
-			return ERR_PTR(-ENOMEM);
-		STAT(gms_alloc);
-		spin_lock_init(&gms->ms_asid_lock);
-		gms->ms_notifier.ops = &gru_mmuops;
-		atomic_set(&gms->ms_refcnt, 1);
-		init_waitqueue_head(&gms->ms_wait_queue);
-		err = __mmu_notifier_register(&gms->ms_notifier, current->mm);
-		if (err)
-			goto error;
-	}
-	if (gms)
-		gru_dbg(grudev, "gms %p, refcnt %d\n", gms,
-			atomic_read(&gms->ms_refcnt));
-	return gms;
-error:
-	kfree(gms);
-	return ERR_PTR(err);
+
+	mn = mmu_notifier_get_locked(&gru_mmuops, current->mm);
+	if (IS_ERR(mn))
+		return ERR_CAST(mn);
+
+	return container_of(mn, struct gru_mm_struct, ms_notifier);
 }
 
 void gru_drop_mmu_notifier(struct gru_mm_struct *gms)
 {
-	gru_dbg(grudev, "gms %p, refcnt %d, released %d\n", gms,
-		atomic_read(&gms->ms_refcnt), gms->ms_released);
-	if (atomic_dec_return(&gms->ms_refcnt) == 0) {
-		if (!gms->ms_released)
-			mmu_notifier_unregister(&gms->ms_notifier, current->mm);
-		kfree(gms);
-		STAT(gms_free);
-	}
+	mmu_notifier_put(&gms->ms_notifier);
 }
 
 /*
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm'
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (3 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-08 10:28   ` Christoph Hellwig
  2019-08-14 21:51   ` Ralph Campbell
  2019-08-06 23:15 ` [PATCH v3 hmm 06/11] RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' Jason Gunthorpe
                   ` (7 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This is a significant simplification, it eliminates all the remaining
'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
paths, and takes away all the ugly locking and abuse of page_table_lock.

mmu_notifier_get() provides the single struct hmm per struct mm which
eliminates mm->hmm.

It also directly guarantees that no mmu_notifier op callback is callable
while concurrent free is possible, this eliminates all the krefs inside
the mmu_notifier callbacks.

The remaining krefs in the range code were overly cautious, drivers are
already not permitted to free the mirror while a range exists.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c |   1 +
 drivers/gpu/drm/nouveau/nouveau_drm.c   |   3 +
 include/linux/hmm.h                     |  12 +--
 include/linux/mm_types.h                |   6 --
 kernel/fork.c                           |   1 -
 mm/hmm.c                                | 121 ++++++------------------
 6 files changed, 33 insertions(+), 111 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
index f2e8b4238efd49..d50774a5f98ef7 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c
@@ -1464,6 +1464,7 @@ static void __exit amdgpu_exit(void)
 	amdgpu_unregister_atpx_handler();
 	amdgpu_sync_fini();
 	amdgpu_fence_slab_fini();
+	mmu_notifier_synchronize();
 }
 
 module_init(amdgpu_init);
diff --git a/drivers/gpu/drm/nouveau/nouveau_drm.c b/drivers/gpu/drm/nouveau/nouveau_drm.c
index 7c2fcaba42d6c3..a0e48a482452d7 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drm.c
@@ -28,6 +28,7 @@
 #include <linux/pci.h>
 #include <linux/pm_runtime.h>
 #include <linux/vga_switcheroo.h>
+#include <linux/mmu_notifier.h>
 
 #include <drm/drmP.h>
 #include <drm/drm_crtc_helper.h>
@@ -1292,6 +1293,8 @@ nouveau_drm_exit(void)
 #ifdef CONFIG_NOUVEAU_PLATFORM_DRIVER
 	platform_driver_unregister(&nouveau_platform_driver);
 #endif
+	if (IS_ENABLED(CONFIG_DRM_NOUVEAU_SVM))
+		mmu_notifier_synchronize();
 }
 
 module_init(nouveau_drm_init);
diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index 82265118d94abd..c3902449db6412 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -84,15 +84,12 @@
  * @notifiers: count of active mmu notifiers
  */
 struct hmm {
-	struct mm_struct	*mm;
-	struct kref		kref;
+	struct mmu_notifier	mmu_notifier;
 	spinlock_t		ranges_lock;
 	struct list_head	ranges;
 	struct list_head	mirrors;
-	struct mmu_notifier	mmu_notifier;
 	struct rw_semaphore	mirrors_sem;
 	wait_queue_head_t	wq;
-	struct rcu_head		rcu;
 	long			notifiers;
 };
 
@@ -436,13 +433,6 @@ long hmm_range_dma_unmap(struct hmm_range *range,
  */
 #define HMM_RANGE_DEFAULT_TIMEOUT 1000
 
-/* Below are for HMM internal use only! Not to be used by device driver! */
-static inline void hmm_mm_init(struct mm_struct *mm)
-{
-	mm->hmm = NULL;
-}
-#else /* IS_ENABLED(CONFIG_HMM_MIRROR) */
-static inline void hmm_mm_init(struct mm_struct *mm) {}
 #endif /* IS_ENABLED(CONFIG_HMM_MIRROR) */
 
 #endif /* LINUX_HMM_H */
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 3a37a89eb7a7c3..525d25d93330f2 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -25,7 +25,6 @@
 
 struct address_space;
 struct mem_cgroup;
-struct hmm;
 
 /*
  * Each physical page in the system has a struct page associated with
@@ -502,11 +501,6 @@ struct mm_struct {
 		atomic_long_t hugetlb_usage;
 #endif
 		struct work_struct async_put_work;
-
-#ifdef CONFIG_HMM_MIRROR
-		/* HMM needs to track a few things per mm */
-		struct hmm *hmm;
-#endif
 	} __randomize_layout;
 
 	/*
diff --git a/kernel/fork.c b/kernel/fork.c
index d8ae0f1b414802..bd4a0762f12f3e 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1007,7 +1007,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	mm_init_owner(mm, p);
 	RCU_INIT_POINTER(mm->exe_file, NULL);
 	mmu_notifier_mm_init(mm);
-	hmm_mm_init(mm);
 	init_tlb_flush_pending(mm);
 #if defined(CONFIG_TRANSPARENT_HUGEPAGE) && !USE_SPLIT_PMD_PTLOCKS
 	mm->pmd_huge_pte = NULL;
diff --git a/mm/hmm.c b/mm/hmm.c
index 9a908902e4cc38..00f94f94906afc 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -26,101 +26,37 @@
 #include <linux/mmu_notifier.h>
 #include <linux/memory_hotplug.h>
 
-static const struct mmu_notifier_ops hmm_mmu_notifier_ops;
-
-/**
- * hmm_get_or_create - register HMM against an mm (HMM internal)
- *
- * @mm: mm struct to attach to
- * Return: an HMM object, either by referencing the existing
- *          (per-process) object, or by creating a new one.
- *
- * This is not intended to be used directly by device drivers. If mm already
- * has an HMM struct then it get a reference on it and returns it. Otherwise
- * it allocates an HMM struct, initializes it, associate it with the mm and
- * returns it.
- */
-static struct hmm *hmm_get_or_create(struct mm_struct *mm)
+static struct mmu_notifier *hmm_alloc_notifier(struct mm_struct *mm)
 {
 	struct hmm *hmm;
 
-	lockdep_assert_held_write(&mm->mmap_sem);
-
-	/* Abuse the page_table_lock to also protect mm->hmm. */
-	spin_lock(&mm->page_table_lock);
-	hmm = mm->hmm;
-	if (mm->hmm && kref_get_unless_zero(&mm->hmm->kref))
-		goto out_unlock;
-	spin_unlock(&mm->page_table_lock);
-
-	hmm = kmalloc(sizeof(*hmm), GFP_KERNEL);
+	hmm = kzalloc(sizeof(*hmm), GFP_KERNEL);
 	if (!hmm)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
+
 	init_waitqueue_head(&hmm->wq);
 	INIT_LIST_HEAD(&hmm->mirrors);
 	init_rwsem(&hmm->mirrors_sem);
-	hmm->mmu_notifier.ops = NULL;
 	INIT_LIST_HEAD(&hmm->ranges);
 	spin_lock_init(&hmm->ranges_lock);
-	kref_init(&hmm->kref);
 	hmm->notifiers = 0;
-	hmm->mm = mm;
-
-	hmm->mmu_notifier.ops = &hmm_mmu_notifier_ops;
-	if (__mmu_notifier_register(&hmm->mmu_notifier, mm)) {
-		kfree(hmm);
-		return NULL;
-	}
-
-	mmgrab(hmm->mm);
-
-	/*
-	 * We hold the exclusive mmap_sem here so we know that mm->hmm is
-	 * still NULL or 0 kref, and is safe to update.
-	 */
-	spin_lock(&mm->page_table_lock);
-	mm->hmm = hmm;
-
-out_unlock:
-	spin_unlock(&mm->page_table_lock);
-	return hmm;
+	return &hmm->mmu_notifier;
 }
 
-static void hmm_free_rcu(struct rcu_head *rcu)
+static void hmm_free_notifier(struct mmu_notifier *mn)
 {
-	struct hmm *hmm = container_of(rcu, struct hmm, rcu);
+	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 
-	mmdrop(hmm->mm);
+	WARN_ON(!list_empty(&hmm->ranges));
+	WARN_ON(!list_empty(&hmm->mirrors));
 	kfree(hmm);
 }
 
-static void hmm_free(struct kref *kref)
-{
-	struct hmm *hmm = container_of(kref, struct hmm, kref);
-
-	spin_lock(&hmm->mm->page_table_lock);
-	if (hmm->mm->hmm == hmm)
-		hmm->mm->hmm = NULL;
-	spin_unlock(&hmm->mm->page_table_lock);
-
-	mmu_notifier_unregister_no_release(&hmm->mmu_notifier, hmm->mm);
-	mmu_notifier_call_srcu(&hmm->rcu, hmm_free_rcu);
-}
-
-static inline void hmm_put(struct hmm *hmm)
-{
-	kref_put(&hmm->kref, hmm_free);
-}
-
 static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 	struct hmm_mirror *mirror;
 
-	/* Bail out if hmm is in the process of being freed */
-	if (!kref_get_unless_zero(&hmm->kref))
-		return;
-
 	/*
 	 * Since hmm_range_register() holds the mmget() lock hmm_release() is
 	 * prevented as long as a range exists.
@@ -137,8 +73,6 @@ static void hmm_release(struct mmu_notifier *mn, struct mm_struct *mm)
 			mirror->ops->release(mirror);
 	}
 	up_read(&hmm->mirrors_sem);
-
-	hmm_put(hmm);
 }
 
 static void notifiers_decrement(struct hmm *hmm)
@@ -169,9 +103,6 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 	unsigned long flags;
 	int ret = 0;
 
-	if (!kref_get_unless_zero(&hmm->kref))
-		return 0;
-
 	spin_lock_irqsave(&hmm->ranges_lock, flags);
 	hmm->notifiers++;
 	list_for_each_entry(range, &hmm->ranges, list) {
@@ -206,7 +137,6 @@ static int hmm_invalidate_range_start(struct mmu_notifier *mn,
 out:
 	if (ret)
 		notifiers_decrement(hmm);
-	hmm_put(hmm);
 	return ret;
 }
 
@@ -215,17 +145,15 @@ static void hmm_invalidate_range_end(struct mmu_notifier *mn,
 {
 	struct hmm *hmm = container_of(mn, struct hmm, mmu_notifier);
 
-	if (!kref_get_unless_zero(&hmm->kref))
-		return;
-
 	notifiers_decrement(hmm);
-	hmm_put(hmm);
 }
 
 static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
 	.release		= hmm_release,
 	.invalidate_range_start	= hmm_invalidate_range_start,
 	.invalidate_range_end	= hmm_invalidate_range_end,
+	.alloc_notifier		= hmm_alloc_notifier,
+	.free_notifier		= hmm_free_notifier,
 };
 
 /*
@@ -237,18 +165,27 @@ static const struct mmu_notifier_ops hmm_mmu_notifier_ops = {
  *
  * To start mirroring a process address space, the device driver must register
  * an HMM mirror struct.
+ *
+ * The caller cannot unregister the hmm_mirror while any ranges are
+ * registered.
+ *
+ * Callers using this function must put a call to mmu_notifier_synchronize()
+ * in their module exit functions.
  */
 int hmm_mirror_register(struct hmm_mirror *mirror, struct mm_struct *mm)
 {
+	struct mmu_notifier *mn;
+
 	lockdep_assert_held_write(&mm->mmap_sem);
 
 	/* Sanity check */
 	if (!mm || !mirror || !mirror->ops)
 		return -EINVAL;
 
-	mirror->hmm = hmm_get_or_create(mm);
-	if (!mirror->hmm)
-		return -ENOMEM;
+	mn = mmu_notifier_get_locked(&hmm_mmu_notifier_ops, mm);
+	if (IS_ERR(mn))
+		return PTR_ERR(mn);
+	mirror->hmm = container_of(mn, struct hmm, mmu_notifier);
 
 	down_write(&mirror->hmm->mirrors_sem);
 	list_add(&mirror->list, &mirror->hmm->mirrors);
@@ -272,7 +209,7 @@ void hmm_mirror_unregister(struct hmm_mirror *mirror)
 	down_write(&hmm->mirrors_sem);
 	list_del(&mirror->list);
 	up_write(&hmm->mirrors_sem);
-	hmm_put(hmm);
+	mmu_notifier_put(&hmm->mmu_notifier);
 }
 EXPORT_SYMBOL(hmm_mirror_unregister);
 
@@ -880,14 +817,13 @@ int hmm_range_register(struct hmm_range *range,
 	range->end = end;
 
 	/* Prevent hmm_release() from running while the range is valid */
-	if (!mmget_not_zero(hmm->mm))
+	if (!mmget_not_zero(hmm->mmu_notifier.mm))
 		return -EFAULT;
 
 	/* Initialize range to track CPU page table updates. */
 	spin_lock_irqsave(&hmm->ranges_lock, flags);
 
 	range->hmm = hmm;
-	kref_get(&hmm->kref);
 	list_add(&range->list, &hmm->ranges);
 
 	/*
@@ -919,8 +855,7 @@ void hmm_range_unregister(struct hmm_range *range)
 	spin_unlock_irqrestore(&hmm->ranges_lock, flags);
 
 	/* Drop reference taken by hmm_range_register() */
-	mmput(hmm->mm);
-	hmm_put(hmm);
+	mmput(hmm->mmu_notifier.mm);
 
 	/*
 	 * The range is now invalid and the ref on the hmm is dropped, so
@@ -970,14 +905,14 @@ long hmm_range_fault(struct hmm_range *range, unsigned int flags)
 	struct mm_walk mm_walk;
 	int ret;
 
-	lockdep_assert_held(&hmm->mm->mmap_sem);
+	lockdep_assert_held(&hmm->mmu_notifier.mm->mmap_sem);
 
 	do {
 		/* If range is no longer valid force retry. */
 		if (!range->valid)
 			return -EBUSY;
 
-		vma = find_vma(hmm->mm, start);
+		vma = find_vma(hmm->mmu_notifier.mm, start);
 		if (vma == NULL || (vma->vm_flags & device_vma))
 			return -EFAULT;
 
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 06/11] RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (4 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm' Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 07/11] RDMA/odp: remove ib_ucontext from ib_umem Jason Gunthorpe
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

This is a significant simplification, no extra list is kept per FD, and
the interval tree is now shared between all the ucontexts, reducing
overhead if there are multiple ucontexts active.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/core/umem_odp.c    | 170 ++++++++------------------
 drivers/infiniband/core/uverbs_cmd.c  |   3 -
 drivers/infiniband/core/uverbs_main.c |   1 +
 drivers/infiniband/hw/mlx5/main.c     |   5 -
 include/rdma/ib_umem_odp.h            |  10 +-
 include/rdma/ib_verbs.h               |   3 -
 6 files changed, 54 insertions(+), 138 deletions(-)

diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index c6a992392ee2b8..a02e6e3d7b72fb 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -82,7 +82,7 @@ static void ib_umem_notifier_release(struct mmu_notifier *mn,
 	struct rb_node *node;
 
 	down_read(&per_mm->umem_rwsem);
-	if (!per_mm->active)
+	if (!per_mm->mn.users)
 		goto out;
 
 	for (node = rb_first_cached(&per_mm->umem_tree); node;
@@ -132,7 +132,7 @@ static int ib_umem_notifier_invalidate_range_start(struct mmu_notifier *mn,
 	else if (!down_read_trylock(&per_mm->umem_rwsem))
 		return -EAGAIN;
 
-	if (!per_mm->active) {
+	if (!per_mm->mn.users) {
 		up_read(&per_mm->umem_rwsem);
 		/*
 		 * At this point active is permanently set and visible to this
@@ -165,7 +165,7 @@ static void ib_umem_notifier_invalidate_range_end(struct mmu_notifier *mn,
 	struct ib_ucontext_per_mm *per_mm =
 		container_of(mn, struct ib_ucontext_per_mm, mn);
 
-	if (unlikely(!per_mm->active))
+	if (unlikely(!per_mm->mn.users))
 		return;
 
 	rbt_ib_umem_for_each_in_range(&per_mm->umem_tree, range->start,
@@ -174,125 +174,47 @@ static void ib_umem_notifier_invalidate_range_end(struct mmu_notifier *mn,
 	up_read(&per_mm->umem_rwsem);
 }
 
-static const struct mmu_notifier_ops ib_umem_notifiers = {
-	.release                    = ib_umem_notifier_release,
-	.invalidate_range_start     = ib_umem_notifier_invalidate_range_start,
-	.invalidate_range_end       = ib_umem_notifier_invalidate_range_end,
-};
-
-static void remove_umem_from_per_mm(struct ib_umem_odp *umem_odp)
-{
-	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
-
-	if (umem_odp->is_implicit_odp)
-		return;
-
-	down_write(&per_mm->umem_rwsem);
-	interval_tree_remove(&umem_odp->interval_tree, &per_mm->umem_tree);
-	complete_all(&umem_odp->notifier_completion);
-	up_write(&per_mm->umem_rwsem);
-}
-
-static struct ib_ucontext_per_mm *alloc_per_mm(struct ib_ucontext *ctx,
-					       struct mm_struct *mm)
+static struct mmu_notifier *ib_umem_alloc_notifier(struct mm_struct *mm)
 {
 	struct ib_ucontext_per_mm *per_mm;
-	int ret;
 
 	per_mm = kzalloc(sizeof(*per_mm), GFP_KERNEL);
 	if (!per_mm)
 		return ERR_PTR(-ENOMEM);
 
-	per_mm->context = ctx;
-	per_mm->mm = mm;
 	per_mm->umem_tree = RB_ROOT_CACHED;
 	init_rwsem(&per_mm->umem_rwsem);
-	per_mm->active = true;
 
+	WARN_ON(mm != current->mm);
 	rcu_read_lock();
 	per_mm->tgid = get_task_pid(current->group_leader, PIDTYPE_PID);
 	rcu_read_unlock();
-
-	WARN_ON(mm != current->mm);
-
-	per_mm->mn.ops = &ib_umem_notifiers;
-	ret = mmu_notifier_register(&per_mm->mn, per_mm->mm);
-	if (ret) {
-		dev_err(&ctx->device->dev,
-			"Failed to register mmu_notifier %d\n", ret);
-		goto out_pid;
-	}
-
-	list_add(&per_mm->ucontext_list, &ctx->per_mm_list);
-	return per_mm;
-
-out_pid:
-	put_pid(per_mm->tgid);
-	kfree(per_mm);
-	return ERR_PTR(ret);
+	return &per_mm->mn;
 }
 
-static struct ib_ucontext_per_mm *get_per_mm(struct ib_umem_odp *umem_odp)
+static void ib_umem_free_notifier(struct mmu_notifier *mn)
 {
-	struct ib_ucontext *ctx = umem_odp->umem.context;
-	struct ib_ucontext_per_mm *per_mm;
-
-	lockdep_assert_held(&ctx->per_mm_list_lock);
-
-	/*
-	 * Generally speaking we expect only one or two per_mm in this list,
-	 * so no reason to optimize this search today.
-	 */
-	list_for_each_entry(per_mm, &ctx->per_mm_list, ucontext_list) {
-		if (per_mm->mm == umem_odp->umem.owning_mm)
-			return per_mm;
-	}
-
-	return alloc_per_mm(ctx, umem_odp->umem.owning_mm);
-}
-
-static void free_per_mm(struct rcu_head *rcu)
-{
-	kfree(container_of(rcu, struct ib_ucontext_per_mm, rcu));
-}
-
-static void put_per_mm(struct ib_umem_odp *umem_odp)
-{
-	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
-	struct ib_ucontext *ctx = umem_odp->umem.context;
-	bool need_free;
-
-	mutex_lock(&ctx->per_mm_list_lock);
-	umem_odp->per_mm = NULL;
-	per_mm->odp_mrs_count--;
-	need_free = per_mm->odp_mrs_count == 0;
-	if (need_free)
-		list_del(&per_mm->ucontext_list);
-	mutex_unlock(&ctx->per_mm_list_lock);
-
-	if (!need_free)
-		return;
-
-	/*
-	 * NOTE! mmu_notifier_unregister() can happen between a start/end
-	 * callback, resulting in an start/end, and thus an unbalanced
-	 * lock. This doesn't really matter to us since we are about to kfree
-	 * the memory that holds the lock, however LOCKDEP doesn't like this.
-	 */
-	down_write(&per_mm->umem_rwsem);
-	per_mm->active = false;
-	up_write(&per_mm->umem_rwsem);
+	struct ib_ucontext_per_mm *per_mm =
+		container_of(mn, struct ib_ucontext_per_mm, mn);
 
 	WARN_ON(!RB_EMPTY_ROOT(&per_mm->umem_tree.rb_root));
-	mmu_notifier_unregister_no_release(&per_mm->mn, per_mm->mm);
+
 	put_pid(per_mm->tgid);
-	mmu_notifier_call_srcu(&per_mm->rcu, free_per_mm);
+	kfree(per_mm);
 }
 
-static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
-				   struct ib_ucontext_per_mm *per_mm)
+static const struct mmu_notifier_ops ib_umem_notifiers = {
+	.release                    = ib_umem_notifier_release,
+	.invalidate_range_start     = ib_umem_notifier_invalidate_range_start,
+	.invalidate_range_end       = ib_umem_notifier_invalidate_range_end,
+	.alloc_notifier		    = ib_umem_alloc_notifier,
+	.free_notifier		    = ib_umem_free_notifier,
+};
+
+static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp)
 {
-	struct ib_ucontext *ctx = umem_odp->umem.context;
+	struct ib_ucontext_per_mm *per_mm;
+	struct mmu_notifier *mn;
 	int ret;
 
 	if (!umem_odp->is_implicit_odp) {
@@ -338,18 +260,13 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 		}
 	}
 
-	mutex_lock(&ctx->per_mm_list_lock);
-	if (per_mm) {
-		umem_odp->per_mm = per_mm;
-	} else {
-		umem_odp->per_mm = get_per_mm(umem_odp);
-		if (IS_ERR(umem_odp->per_mm)) {
-			ret = PTR_ERR(umem_odp->per_mm);
-			goto out_unlock;
-		}
+	mn = mmu_notifier_get(&ib_umem_notifiers, umem_odp->umem.owning_mm);
+	if (IS_ERR(mn)) {
+		ret = PTR_ERR(mn);
+		goto out_dma_list;
 	}
-	per_mm->odp_mrs_count++;
-	mutex_unlock(&ctx->per_mm_list_lock);
+	umem_odp->per_mm = per_mm =
+		container_of(mn, struct ib_ucontext_per_mm, mn);
 
 	mutex_init(&umem_odp->umem_mutex);
 	init_completion(&umem_odp->notifier_completion);
@@ -363,8 +280,7 @@ static inline int ib_init_umem_odp(struct ib_umem_odp *umem_odp,
 
 	return 0;
 
-out_unlock:
-	mutex_unlock(&ctx->per_mm_list_lock);
+out_dma_list:
 	kvfree(umem_odp->dma_list);
 out_page_list:
 	kvfree(umem_odp->page_list);
@@ -409,7 +325,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
 	umem_odp->is_implicit_odp = 1;
 	umem_odp->page_shift = PAGE_SHIFT;
 
-	ret = ib_init_umem_odp(umem_odp, NULL);
+	ret = ib_init_umem_odp(umem_odp);
 	if (ret) {
 		kfree(umem_odp);
 		return ERR_PTR(ret);
@@ -455,7 +371,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root,
 	umem->owning_mm  = root->umem.owning_mm;
 	odp_data->page_shift = PAGE_SHIFT;
 
-	ret = ib_init_umem_odp(odp_data, root->per_mm);
+	ret = ib_init_umem_odp(odp_data);
 	if (ret) {
 		kfree(odp_data);
 		return ERR_PTR(ret);
@@ -498,11 +414,13 @@ int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access)
 		up_read(&mm->mmap_sem);
 	}
 
-	return ib_init_umem_odp(umem_odp, NULL);
+	return ib_init_umem_odp(umem_odp);
 }
 
 void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 {
+	struct ib_ucontext_per_mm *per_mm = umem_odp->per_mm;
+
 	/*
 	 * Ensure that no more pages are mapped in the umem.
 	 *
@@ -512,8 +430,24 @@ void ib_umem_odp_release(struct ib_umem_odp *umem_odp)
 	ib_umem_odp_unmap_dma_pages(umem_odp, ib_umem_start(umem_odp),
 				    ib_umem_end(umem_odp));
 
-	remove_umem_from_per_mm(umem_odp);
-	put_per_mm(umem_odp);
+	down_write(&per_mm->umem_rwsem);
+	if (!umem_odp->is_implicit_odp) {
+		interval_tree_remove(&umem_odp->interval_tree,
+				     &per_mm->umem_tree);
+		complete_all(&umem_odp->notifier_completion);
+	}
+	/*
+	 * NOTE! mmu_notifier_unregister() can happen between a start/end
+	 * callback, resulting in an start/end, and thus an unbalanced
+	 * lock. This doesn't really matter to us since we are about to kfree
+	 * the memory that holds the lock, however LOCKDEP doesn't like this.
+	 * Thus we call the mmu_notifier_put under the rwsem and test the
+	 * internal users count to reliably see if we are past this point.
+	 */
+	mmu_notifier_put(&umem_odp->per_mm->mn);
+	up_write(&per_mm->umem_rwsem);
+
+	umem_odp->per_mm = NULL;
 	kvfree(umem_odp->dma_list);
 	kvfree(umem_odp->page_list);
 }
diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
index 8f4fd4fac1593a..7c10dfe417a446 100644
--- a/drivers/infiniband/core/uverbs_cmd.c
+++ b/drivers/infiniband/core/uverbs_cmd.c
@@ -252,9 +252,6 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
 	ucontext->closing = false;
 	ucontext->cleanup_retryable = false;
 
-	mutex_init(&ucontext->per_mm_list_lock);
-	INIT_LIST_HEAD(&ucontext->per_mm_list);
-
 	ret = get_unused_fd_flags(O_CLOEXEC);
 	if (ret < 0)
 		goto err_free;
diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
index 11c13c1381cf5c..e369ac0d6f5159 100644
--- a/drivers/infiniband/core/uverbs_main.c
+++ b/drivers/infiniband/core/uverbs_main.c
@@ -1487,6 +1487,7 @@ static void __exit ib_uverbs_cleanup(void)
 				 IB_UVERBS_NUM_FIXED_MINOR);
 	unregister_chrdev_region(dynamic_uverbs_dev,
 				 IB_UVERBS_NUM_DYNAMIC_MINOR);
+	mmu_notifier_synchronize();
 }
 
 module_init(ib_uverbs_init);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index cdb6bbbaa14fd8..4400ff7457c7c8 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1995,11 +1995,6 @@ static void mlx5_ib_dealloc_ucontext(struct ib_ucontext *ibcontext)
 	struct mlx5_ib_dev *dev = to_mdev(ibcontext->device);
 	struct mlx5_bfreg_info *bfregi;
 
-	/* All umem's must be destroyed before destroying the ucontext. */
-	mutex_lock(&ibcontext->per_mm_list_lock);
-	WARN_ON(!list_empty(&ibcontext->per_mm_list));
-	mutex_unlock(&ibcontext->per_mm_list_lock);
-
 	bfregi = &context->bfregi;
 	mlx5_ib_dealloc_transport_domain(dev, context->tdn, context->devx_uid);
 
diff --git a/include/rdma/ib_umem_odp.h b/include/rdma/ib_umem_odp.h
index 468c9afabbb2fd..5a6c7cd3f33388 100644
--- a/include/rdma/ib_umem_odp.h
+++ b/include/rdma/ib_umem_odp.h
@@ -116,20 +116,12 @@ static inline unsigned long ib_umem_end(struct ib_umem_odp *umem_odp)
 #ifdef CONFIG_INFINIBAND_ON_DEMAND_PAGING
 
 struct ib_ucontext_per_mm {
-	struct ib_ucontext *context;
-	struct mm_struct *mm;
+	struct mmu_notifier mn;
 	struct pid *tgid;
-	bool active;
 
 	struct rb_root_cached umem_tree;
 	/* Protects umem_tree */
 	struct rw_semaphore umem_rwsem;
-
-	struct mmu_notifier mn;
-	unsigned int odp_mrs_count;
-
-	struct list_head ucontext_list;
-	struct rcu_head rcu;
 };
 
 int ib_umem_odp_get(struct ib_umem_odp *umem_odp, int access);
diff --git a/include/rdma/ib_verbs.h b/include/rdma/ib_verbs.h
index 16196196659a4c..9bd3cde7e8dbe9 100644
--- a/include/rdma/ib_verbs.h
+++ b/include/rdma/ib_verbs.h
@@ -1417,9 +1417,6 @@ struct ib_ucontext {
 
 	bool cleanup_retryable;
 
-	struct mutex per_mm_list_lock;
-	struct list_head per_mm_list;
-
 	struct ib_rdmacg_object	cg_obj;
 	/*
 	 * Implementation details of the RDMA core, don't use in drivers:
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 07/11] RDMA/odp: remove ib_ucontext from ib_umem
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (5 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 06/11] RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn Jason Gunthorpe
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

At this point the ucontext is only being stored to access the ib_device,
so just store the ib_device directly instead. This is more natural and
logical as the umem has nothing to do with the ucontext.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/infiniband/core/umem.c     |  4 ++--
 drivers/infiniband/core/umem_odp.c | 13 ++++++-------
 include/rdma/ib_umem.h             |  2 +-
 3 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/drivers/infiniband/core/umem.c b/drivers/infiniband/core/umem.c
index c59aa57d36510f..5ab9165ffbef0a 100644
--- a/drivers/infiniband/core/umem.c
+++ b/drivers/infiniband/core/umem.c
@@ -242,7 +242,7 @@ struct ib_umem *ib_umem_get(struct ib_udata *udata, unsigned long addr,
 			return ERR_PTR(-ENOMEM);
 	}
 
-	umem->context    = context;
+	umem->ibdev = context->device;
 	umem->length     = size;
 	umem->address    = addr;
 	umem->writable   = ib_access_writable(access);
@@ -370,7 +370,7 @@ void ib_umem_release(struct ib_umem *umem)
 		return;
 	}
 
-	__ib_umem_release(umem->context->device, umem, 1);
+	__ib_umem_release(umem->ibdev, umem, 1);
 
 	atomic64_sub(ib_umem_num_pages(umem), &umem->owning_mm->pinned_vm);
 	__ib_umem_release_tail(umem);
diff --git a/drivers/infiniband/core/umem_odp.c b/drivers/infiniband/core/umem_odp.c
index a02e6e3d7b72fb..da72318e17592f 100644
--- a/drivers/infiniband/core/umem_odp.c
+++ b/drivers/infiniband/core/umem_odp.c
@@ -103,7 +103,7 @@ static void ib_umem_notifier_release(struct mmu_notifier *mn,
 		 */
 		smp_wmb();
 		complete_all(&umem_odp->notifier_completion);
-		umem_odp->umem.context->device->ops.invalidate_range(
+		umem_odp->umem.ibdev->ops.invalidate_range(
 			umem_odp, ib_umem_start(umem_odp),
 			ib_umem_end(umem_odp));
 	}
@@ -116,7 +116,7 @@ static int invalidate_range_start_trampoline(struct ib_umem_odp *item,
 					     u64 start, u64 end, void *cookie)
 {
 	ib_umem_notifier_start_account(item);
-	item->umem.context->device->ops.invalidate_range(item, start, end);
+	item->umem.ibdev->ops.invalidate_range(item, start, end);
 	return 0;
 }
 
@@ -319,7 +319,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_implicit(struct ib_udata *udata,
 	if (!umem_odp)
 		return ERR_PTR(-ENOMEM);
 	umem = &umem_odp->umem;
-	umem->context = context;
+	umem->ibdev = context->device;
 	umem->writable = ib_access_writable(access);
 	umem->owning_mm = current->mm;
 	umem_odp->is_implicit_odp = 1;
@@ -364,7 +364,7 @@ struct ib_umem_odp *ib_umem_odp_alloc_child(struct ib_umem_odp *root,
 	if (!odp_data)
 		return ERR_PTR(-ENOMEM);
 	umem = &odp_data->umem;
-	umem->context    = root->umem.context;
+	umem->ibdev = root->umem.ibdev;
 	umem->length     = size;
 	umem->address    = addr;
 	umem->writable   = root->umem.writable;
@@ -477,8 +477,7 @@ static int ib_umem_odp_map_dma_single_page(
 		u64 access_mask,
 		unsigned long current_seq)
 {
-	struct ib_ucontext *context = umem_odp->umem.context;
-	struct ib_device *dev = context->device;
+	struct ib_device *dev = umem_odp->umem.ibdev;
 	dma_addr_t dma_addr;
 	int remove_existing_mapping = 0;
 	int ret = 0;
@@ -691,7 +690,7 @@ void ib_umem_odp_unmap_dma_pages(struct ib_umem_odp *umem_odp, u64 virt,
 {
 	int idx;
 	u64 addr;
-	struct ib_device *dev = umem_odp->umem.context->device;
+	struct ib_device *dev = umem_odp->umem.ibdev;
 
 	virt = max_t(u64, virt, ib_umem_start(umem_odp));
 	bound = min_t(u64, bound, ib_umem_end(umem_odp));
diff --git a/include/rdma/ib_umem.h b/include/rdma/ib_umem.h
index 1052d0d62be7d2..a91b2af64ec47b 100644
--- a/include/rdma/ib_umem.h
+++ b/include/rdma/ib_umem.h
@@ -42,7 +42,7 @@ struct ib_ucontext;
 struct ib_umem_odp;
 
 struct ib_umem {
-	struct ib_ucontext     *context;
+	struct ib_device       *ibdev;
 	struct mm_struct       *owning_mm;
 	size_t			length;
 	unsigned long		address;
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (6 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 07/11] RDMA/odp: remove ib_ucontext from ib_umem Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-14 16:07   ` Jason Gunthorpe
  2019-08-15  8:28   ` Christian König
  2019-08-06 23:15 ` [PATCH v3 hmm 09/11] drm/amdkfd: fix a use after free race with mmu_notifer unregister Jason Gunthorpe
                   ` (4 subsequent siblings)
  12 siblings, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

radeon is using a device global hash table to track what mmu_notifiers
have been registered on struct mm. This is better served with the new
get/put scheme instead.

radeon has a bug where it was not blocking notifier release() until all
the BO's had been invalidated. This could result in a use after free of
pages the BOs. This is tied into a second bug where radeon left the
notifiers running endlessly even once the interval tree became
empty. This could result in a use after free with module unload.

Both are fixed by changing the lifetime model, the BOs exist in the
interval tree with their natural lifetimes independent of the mm_struct
lifetime using the get/put scheme. The release runs synchronously and just
does invalidate_start across the entire interval tree to create the
required DMA fence.

Additions to the interval tree after release are already impossible as
only current->mm is used during the add.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/gpu/drm/radeon/radeon.h        |   3 -
 drivers/gpu/drm/radeon/radeon_device.c |   2 -
 drivers/gpu/drm/radeon/radeon_drv.c    |   2 +
 drivers/gpu/drm/radeon/radeon_mn.c     | 157 ++++++-------------------
 4 files changed, 38 insertions(+), 126 deletions(-)

AMD team: I wonder if kfd has similar lifetime issues?

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index 32808e50be12f8..918164f90b114a 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -2451,9 +2451,6 @@ struct radeon_device {
 	/* tracking pinned memory */
 	u64 vram_pin_size;
 	u64 gart_pin_size;
-
-	struct mutex	mn_lock;
-	DECLARE_HASHTABLE(mn_hash, 7);
 };
 
 bool radeon_is_px(struct drm_device *dev);
diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
index dceb554e567446..788b1d8a80e660 100644
--- a/drivers/gpu/drm/radeon/radeon_device.c
+++ b/drivers/gpu/drm/radeon/radeon_device.c
@@ -1325,8 +1325,6 @@ int radeon_device_init(struct radeon_device *rdev,
 	init_rwsem(&rdev->pm.mclk_lock);
 	init_rwsem(&rdev->exclusive_lock);
 	init_waitqueue_head(&rdev->irq.vblank_queue);
-	mutex_init(&rdev->mn_lock);
-	hash_init(rdev->mn_hash);
 	r = radeon_gem_init(rdev);
 	if (r)
 		return r;
diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
index a6cbe11f79c611..b6535ac91fdb74 100644
--- a/drivers/gpu/drm/radeon/radeon_drv.c
+++ b/drivers/gpu/drm/radeon/radeon_drv.c
@@ -35,6 +35,7 @@
 #include <linux/module.h>
 #include <linux/pm_runtime.h>
 #include <linux/vga_switcheroo.h>
+#include <linux/mmu_notifier.h>
 
 #include <drm/drm_crtc_helper.h>
 #include <drm/drm_drv.h>
@@ -624,6 +625,7 @@ static void __exit radeon_exit(void)
 {
 	pci_unregister_driver(pdriver);
 	radeon_unregister_atpx_handler();
+	mmu_notifier_synchronize();
 }
 
 module_init(radeon_init);
diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
index 8c3871ed23a9f0..fc8254273a800b 100644
--- a/drivers/gpu/drm/radeon/radeon_mn.c
+++ b/drivers/gpu/drm/radeon/radeon_mn.c
@@ -37,17 +37,8 @@
 #include "radeon.h"
 
 struct radeon_mn {
-	/* constant after initialisation */
-	struct radeon_device	*rdev;
-	struct mm_struct	*mm;
 	struct mmu_notifier	mn;
 
-	/* only used on destruction */
-	struct work_struct	work;
-
-	/* protected by rdev->mn_lock */
-	struct hlist_node	node;
-
 	/* objects protected by lock */
 	struct mutex		lock;
 	struct rb_root_cached	objects;
@@ -58,55 +49,6 @@ struct radeon_mn_node {
 	struct list_head		bos;
 };
 
-/**
- * radeon_mn_destroy - destroy the rmn
- *
- * @work: previously sheduled work item
- *
- * Lazy destroys the notifier from a work item
- */
-static void radeon_mn_destroy(struct work_struct *work)
-{
-	struct radeon_mn *rmn = container_of(work, struct radeon_mn, work);
-	struct radeon_device *rdev = rmn->rdev;
-	struct radeon_mn_node *node, *next_node;
-	struct radeon_bo *bo, *next_bo;
-
-	mutex_lock(&rdev->mn_lock);
-	mutex_lock(&rmn->lock);
-	hash_del(&rmn->node);
-	rbtree_postorder_for_each_entry_safe(node, next_node,
-					     &rmn->objects.rb_root, it.rb) {
-
-		interval_tree_remove(&node->it, &rmn->objects);
-		list_for_each_entry_safe(bo, next_bo, &node->bos, mn_list) {
-			bo->mn = NULL;
-			list_del_init(&bo->mn_list);
-		}
-		kfree(node);
-	}
-	mutex_unlock(&rmn->lock);
-	mutex_unlock(&rdev->mn_lock);
-	mmu_notifier_unregister(&rmn->mn, rmn->mm);
-	kfree(rmn);
-}
-
-/**
- * radeon_mn_release - callback to notify about mm destruction
- *
- * @mn: our notifier
- * @mn: the mm this callback is about
- *
- * Shedule a work item to lazy destroy our notifier.
- */
-static void radeon_mn_release(struct mmu_notifier *mn,
-			      struct mm_struct *mm)
-{
-	struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
-	INIT_WORK(&rmn->work, radeon_mn_destroy);
-	schedule_work(&rmn->work);
-}
-
 /**
  * radeon_mn_invalidate_range_start - callback to notify about mm change
  *
@@ -183,65 +125,44 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
 	return ret;
 }
 
-static const struct mmu_notifier_ops radeon_mn_ops = {
-	.release = radeon_mn_release,
-	.invalidate_range_start = radeon_mn_invalidate_range_start,
-};
+static void radeon_mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
+{
+	struct mmu_notifier_range range = {
+		.mm = mm,
+		.start = 0,
+		.end = ULONG_MAX,
+		.flags = 0,
+		.event = MMU_NOTIFY_UNMAP,
+	};
+
+	radeon_mn_invalidate_range_start(mn, &range);
+}
 
-/**
- * radeon_mn_get - create notifier context
- *
- * @rdev: radeon device pointer
- *
- * Creates a notifier context for current->mm.
- */
-static struct radeon_mn *radeon_mn_get(struct radeon_device *rdev)
+static struct mmu_notifier *radeon_mn_alloc_notifier(struct mm_struct *mm)
 {
-	struct mm_struct *mm = current->mm;
 	struct radeon_mn *rmn;
-	int r;
-
-	if (down_write_killable(&mm->mmap_sem))
-		return ERR_PTR(-EINTR);
-
-	mutex_lock(&rdev->mn_lock);
-
-	hash_for_each_possible(rdev->mn_hash, rmn, node, (unsigned long)mm)
-		if (rmn->mm == mm)
-			goto release_locks;
 
 	rmn = kzalloc(sizeof(*rmn), GFP_KERNEL);
-	if (!rmn) {
-		rmn = ERR_PTR(-ENOMEM);
-		goto release_locks;
-	}
+	if (!rmn)
+		return ERR_PTR(-ENOMEM);
 
-	rmn->rdev = rdev;
-	rmn->mm = mm;
-	rmn->mn.ops = &radeon_mn_ops;
 	mutex_init(&rmn->lock);
 	rmn->objects = RB_ROOT_CACHED;
-	
-	r = __mmu_notifier_register(&rmn->mn, mm);
-	if (r)
-		goto free_rmn;
-
-	hash_add(rdev->mn_hash, &rmn->node, (unsigned long)mm);
-
-release_locks:
-	mutex_unlock(&rdev->mn_lock);
-	up_write(&mm->mmap_sem);
-
-	return rmn;
-
-free_rmn:
-	mutex_unlock(&rdev->mn_lock);
-	up_write(&mm->mmap_sem);
-	kfree(rmn);
+	return &rmn->mn;
+}
 
-	return ERR_PTR(r);
+static void radeon_mn_free_notifier(struct mmu_notifier *mn)
+{
+	kfree(container_of(mn, struct radeon_mn, mn));
 }
 
+static const struct mmu_notifier_ops radeon_mn_ops = {
+	.release = radeon_mn_release,
+	.invalidate_range_start = radeon_mn_invalidate_range_start,
+	.alloc_notifier = radeon_mn_alloc_notifier,
+	.free_notifier = radeon_mn_free_notifier,
+};
+
 /**
  * radeon_mn_register - register a BO for notifier updates
  *
@@ -254,15 +175,16 @@ static struct radeon_mn *radeon_mn_get(struct radeon_device *rdev)
 int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
 {
 	unsigned long end = addr + radeon_bo_size(bo) - 1;
-	struct radeon_device *rdev = bo->rdev;
+	struct mmu_notifier *mn;
 	struct radeon_mn *rmn;
 	struct radeon_mn_node *node = NULL;
 	struct list_head bos;
 	struct interval_tree_node *it;
 
-	rmn = radeon_mn_get(rdev);
-	if (IS_ERR(rmn))
-		return PTR_ERR(rmn);
+	mn = mmu_notifier_get(&radeon_mn_ops, current->mm);
+	if (IS_ERR(mn))
+		return PTR_ERR(mn);
+	rmn = container_of(mn, struct radeon_mn, mn);
 
 	INIT_LIST_HEAD(&bos);
 
@@ -309,22 +231,13 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
  */
 void radeon_mn_unregister(struct radeon_bo *bo)
 {
-	struct radeon_device *rdev = bo->rdev;
-	struct radeon_mn *rmn;
+	struct radeon_mn *rmn = bo->mn;
 	struct list_head *head;
 
-	mutex_lock(&rdev->mn_lock);
-	rmn = bo->mn;
-	if (rmn == NULL) {
-		mutex_unlock(&rdev->mn_lock);
-		return;
-	}
-
 	mutex_lock(&rmn->lock);
 	/* save the next list entry for later */
 	head = bo->mn_list.next;
 
-	bo->mn = NULL;
 	list_del(&bo->mn_list);
 
 	if (list_empty(head)) {
@@ -335,5 +248,7 @@ void radeon_mn_unregister(struct radeon_bo *bo)
 	}
 
 	mutex_unlock(&rmn->lock);
-	mutex_unlock(&rdev->mn_lock);
+
+	mmu_notifier_put(&rmn->mn);
+	bo->mn = NULL;
 }
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 09/11] drm/amdkfd: fix a use after free race with mmu_notifer unregister
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (7 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-06 23:15 ` [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put Jason Gunthorpe
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

When using mmu_notifer_unregister_no_release() the caller must ensure
there is a SRCU synchronize before the mn memory is freed, otherwise use
after free races are possible, for instance:

     CPU0                                      CPU1
                                      invalidate_range_start
                                         hlist_for_each_entry_rcu(..)
 mmu_notifier_unregister_no_release(&p->mn)
 kfree(mn)
                                      if (mn->ops->invalidate_range_end)

The error unwind in amdkfd misses the SRCU synchronization.

amdkfd keeps the kfd_process around until the mm is released, so split the
flow to fully initialize the kfd_process and register it for find_process,
and with the notifier. Past this point the kfd_process does not need to be
cleaned up as it is fully ready.

The final failable step does a vm_mmap() and does not seem to impact the
kfd_process global state. Since it also cannot be undone (and already has
problems with undo if it internally fails), it has to be last.

This way we don't have to try to unwind the mmu_notifier_register() and
avoid the problem with the SRCU.

Along the way this also fixes various other error unwind bugs in the flow.

Fixes: 45102048f77e ("amdkfd: Add process queue manager module")
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 78 +++++++++++-------------
 1 file changed, 37 insertions(+), 41 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index 8f1076c0c88a25..c06e6190f21ffa 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -62,8 +62,8 @@ static struct workqueue_struct *kfd_restore_wq;
 
 static struct kfd_process *find_process(const struct task_struct *thread);
 static void kfd_process_ref_release(struct kref *ref);
-static struct kfd_process *create_process(const struct task_struct *thread,
-					struct file *filep);
+static struct kfd_process *create_process(const struct task_struct *thread);
+static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep);
 
 static void evict_process_worker(struct work_struct *work);
 static void restore_process_worker(struct work_struct *work);
@@ -289,7 +289,15 @@ struct kfd_process *kfd_create_process(struct file *filep)
 	if (process) {
 		pr_debug("Process already found\n");
 	} else {
-		process = create_process(thread, filep);
+		process = create_process(thread);
+		if (IS_ERR(process))
+			goto out;
+
+		ret = kfd_process_init_cwsr_apu(process, filep);
+		if (ret) {
+			process = ERR_PTR(ret);
+			goto out;
+		}
 
 		if (!procfs.kobj)
 			goto out;
@@ -609,81 +617,69 @@ static int kfd_process_device_init_cwsr_dgpu(struct kfd_process_device *pdd)
 	return 0;
 }
 
-static struct kfd_process *create_process(const struct task_struct *thread,
-					struct file *filep)
+/*
+ * On return the kfd_process is fully operational and will be freed when the
+ * mm is released
+ */
+static struct kfd_process *create_process(const struct task_struct *thread)
 {
 	struct kfd_process *process;
 	int err = -ENOMEM;
 
 	process = kzalloc(sizeof(*process), GFP_KERNEL);
-
 	if (!process)
 		goto err_alloc_process;
 
-	process->pasid = kfd_pasid_alloc();
-	if (process->pasid == 0)
-		goto err_alloc_pasid;
-
-	if (kfd_alloc_process_doorbells(process) < 0)
-		goto err_alloc_doorbells;
-
 	kref_init(&process->ref);
-
 	mutex_init(&process->mutex);
-
 	process->mm = thread->mm;
-
-	/* register notifier */
-	process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
-	err = mmu_notifier_register(&process->mmu_notifier, process->mm);
-	if (err)
-		goto err_mmu_notifier;
-
-	hash_add_rcu(kfd_processes_table, &process->kfd_processes,
-			(uintptr_t)process->mm);
-
 	process->lead_thread = thread->group_leader;
-	get_task_struct(process->lead_thread);
-
 	INIT_LIST_HEAD(&process->per_device_data);
-
+	INIT_DELAYED_WORK(&process->eviction_work, evict_process_worker);
+	INIT_DELAYED_WORK(&process->restore_work, restore_process_worker);
+	process->last_restore_timestamp = get_jiffies_64();
 	kfd_event_init_process(process);
+	process->is_32bit_user_mode = in_compat_syscall();
+
+	process->pasid = kfd_pasid_alloc();
+	if (process->pasid == 0)
+		goto err_alloc_pasid;
+
+	if (kfd_alloc_process_doorbells(process) < 0)
+		goto err_alloc_doorbells;
 
 	err = pqm_init(&process->pqm, process);
 	if (err != 0)
 		goto err_process_pqm_init;
 
 	/* init process apertures*/
-	process->is_32bit_user_mode = in_compat_syscall();
 	err = kfd_init_apertures(process);
 	if (err != 0)
 		goto err_init_apertures;
 
-	INIT_DELAYED_WORK(&process->eviction_work, evict_process_worker);
-	INIT_DELAYED_WORK(&process->restore_work, restore_process_worker);
-	process->last_restore_timestamp = get_jiffies_64();
-
-	err = kfd_process_init_cwsr_apu(process, filep);
+	/* Must be last, have to use release destruction after this */
+	process->mmu_notifier.ops = &kfd_process_mmu_notifier_ops;
+	err = mmu_notifier_register(&process->mmu_notifier, process->mm);
 	if (err)
-		goto err_init_cwsr;
+		goto err_register_notifier;
+
+	get_task_struct(process->lead_thread);
+	hash_add_rcu(kfd_processes_table, &process->kfd_processes,
+			(uintptr_t)process->mm);
 
 	return process;
 
-err_init_cwsr:
+err_register_notifier:
 	kfd_process_free_outstanding_kfd_bos(process);
 	kfd_process_destroy_pdds(process);
 err_init_apertures:
 	pqm_uninit(&process->pqm);
 err_process_pqm_init:
-	hash_del_rcu(&process->kfd_processes);
-	synchronize_rcu();
-	mmu_notifier_unregister_no_release(&process->mmu_notifier, process->mm);
-err_mmu_notifier:
-	mutex_destroy(&process->mutex);
 	kfd_free_process_doorbells(process);
 err_alloc_doorbells:
 	kfd_pasid_free(process->pasid);
 err_alloc_pasid:
+	mutex_destroy(&process->mutex);
 	kfree(process);
 err_alloc_process:
 	return ERR_PTR(err);
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (8 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 09/11] drm/amdkfd: fix a use after free race with mmu_notifer unregister Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-06 23:47   ` Kuehling, Felix
  2019-08-06 23:15 ` [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release Jason Gunthorpe
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

The sequence of mmu_notifier_unregister_no_release(),
mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
free_notifier callback.

As this is the last user of those APIs, converting it means we can drop
them.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  3 ---
 drivers/gpu/drm/amd/amdkfd/kfd_process.c | 10 ++++------
 2 files changed, 4 insertions(+), 9 deletions(-)

I'm really not sure what this is doing, but it is very strange to have a
release with no other callback. It would be good if this would change to use
get as well.

diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
index 3933fb6a371efb..9450e20d17093b 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
@@ -686,9 +686,6 @@ struct kfd_process {
 	/* We want to receive a notification when the mm_struct is destroyed */
 	struct mmu_notifier mmu_notifier;
 
-	/* Use for delayed freeing of kfd_process structure */
-	struct rcu_head	rcu;
-
 	unsigned int pasid;
 	unsigned int doorbell_index;
 
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
index c06e6190f21ffa..e5e326f2f2675e 100644
--- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
+++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
@@ -486,11 +486,9 @@ static void kfd_process_ref_release(struct kref *ref)
 	queue_work(kfd_process_wq, &p->release_work);
 }
 
-static void kfd_process_destroy_delayed(struct rcu_head *rcu)
+static void kfd_process_free_notifier(struct mmu_notifier *mn)
 {
-	struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
-
-	kfd_unref_process(p);
+	kfd_unref_process(container_of(mn, struct kfd_process, mmu_notifier));
 }
 
 static void kfd_process_notifier_release(struct mmu_notifier *mn,
@@ -542,12 +540,12 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
 
 	mutex_unlock(&p->mutex);
 
-	mmu_notifier_unregister_no_release(&p->mmu_notifier, mm);
-	mmu_notifier_call_srcu(&p->rcu, &kfd_process_destroy_delayed);
+	mmu_notifier_put(&p->mmu_notifier);
 }
 
 static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
 	.release = kfd_process_notifier_release,
+	.free_notifier = kfd_process_free_notifier,
 };
 
 static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (9 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put Jason Gunthorpe
@ 2019-08-06 23:15 ` Jason Gunthorpe
  2019-08-08 10:29   ` Christoph Hellwig
  2019-08-14 21:53   ` Ralph Campbell
  2019-08-14 23:56 ` [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Ralph Campbell
  2019-08-16 15:14 ` Jason Gunthorpe
  12 siblings, 2 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-06 23:15 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christoph Hellwig

From: Jason Gunthorpe <jgg@mellanox.com>

mmu_notifier_unregister_no_release() and mmu_notifier_call_srcu() no
longer have any users, they have all been converted to use
mmu_notifier_put().

So delete this difficult to use interface.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
---
 include/linux/mmu_notifier.h |  5 -----
 mm/mmu_notifier.c            | 31 -------------------------------
 2 files changed, 36 deletions(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 31aa971315a142..52929e5ef70826 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -271,8 +271,6 @@ extern int __mmu_notifier_register(struct mmu_notifier *mn,
 				   struct mm_struct *mm);
 extern void mmu_notifier_unregister(struct mmu_notifier *mn,
 				    struct mm_struct *mm);
-extern void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
-					       struct mm_struct *mm);
 extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
 extern void __mmu_notifier_release(struct mm_struct *mm);
 extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
@@ -513,9 +511,6 @@ static inline void mmu_notifier_range_init(struct mmu_notifier_range *range,
 	set_pte_at(___mm, ___address, __ptep, ___pte);			\
 })
 
-extern void mmu_notifier_call_srcu(struct rcu_head *rcu,
-				   void (*func)(struct rcu_head *rcu));
-
 #else /* CONFIG_MMU_NOTIFIER */
 
 struct mmu_notifier_range {
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 4a770b5211b71d..2ec48f8ba9e288 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -21,18 +21,6 @@
 /* global SRCU for all MMs */
 DEFINE_STATIC_SRCU(srcu);
 
-/*
- * This function allows mmu_notifier::release callback to delay a call to
- * a function that will free appropriate resources. The function must be
- * quick and must not block.
- */
-void mmu_notifier_call_srcu(struct rcu_head *rcu,
-			    void (*func)(struct rcu_head *rcu))
-{
-	call_srcu(&srcu, rcu, func);
-}
-EXPORT_SYMBOL_GPL(mmu_notifier_call_srcu);
-
 /*
  * This function can't run concurrently against mmu_notifier_register
  * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
@@ -453,25 +441,6 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
 
-/*
- * Same as mmu_notifier_unregister but no callback and no srcu synchronization.
- */
-void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
-					struct mm_struct *mm)
-{
-	spin_lock(&mm->mmu_notifier_mm->lock);
-	/*
-	 * Can not use list_del_rcu() since __mmu_notifier_release
-	 * can delete it before we hold the lock.
-	 */
-	hlist_del_init_rcu(&mn->hlist);
-	spin_unlock(&mm->mmu_notifier_mm->lock);
-
-	BUG_ON(atomic_read(&mm->mm_count) <= 0);
-	mmdrop(mm);
-}
-EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
-
 static void mmu_notifier_free_rcu(struct rcu_head *rcu)
 {
 	struct mmu_notifier *mn = container_of(rcu, struct mmu_notifier, rcu);
-- 
2.22.0

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put
  2019-08-06 23:15 ` [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put Jason Gunthorpe
@ 2019-08-06 23:47   ` Kuehling, Felix
  2019-08-07 11:42     ` Jason Gunthorpe
  0 siblings, 1 reply; 32+ messages in thread
From: Kuehling, Felix @ 2019-08-06 23:47 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, Zhou, David(ChunMing),
	Ralph Campbell, Dimitri Sivanich, Gavin Shan, Andrea Righi,
	linux-rdma, John Hubbard, intel-gfx, linux-kernel, dri-devel,
	Koenig, Christian, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Deucher,  Alexander, Christoph Hellwig

On 2019-08-06 19:15, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> The sequence of mmu_notifier_unregister_no_release(),
> mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
> free_notifier callback.
>
> As this is the last user of those APIs, converting it means we can drop
> them.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>

> ---
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  3 ---
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 10 ++++------
>   2 files changed, 4 insertions(+), 9 deletions(-)
>
> I'm really not sure what this is doing, but it is very strange to have a
> release with no other callback. It would be good if this would change to use
> get as well.
KFD uses the MMU notifier to detect process termination and free all the 
resources associated with the process. This was first added for APUs 
where the IOMMUv2 is set up to perform address translations using the 
CPU page table for device memory access. That's where the association of 
KFD process resources with the lifetime of the mm_struct comes from.

Regards,
   Felix


>
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> index 3933fb6a371efb..9450e20d17093b 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h
> @@ -686,9 +686,6 @@ struct kfd_process {
>   	/* We want to receive a notification when the mm_struct is destroyed */
>   	struct mmu_notifier mmu_notifier;
>   
> -	/* Use for delayed freeing of kfd_process structure */
> -	struct rcu_head	rcu;
> -
>   	unsigned int pasid;
>   	unsigned int doorbell_index;
>   
> diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> index c06e6190f21ffa..e5e326f2f2675e 100644
> --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c
> @@ -486,11 +486,9 @@ static void kfd_process_ref_release(struct kref *ref)
>   	queue_work(kfd_process_wq, &p->release_work);
>   }
>   
> -static void kfd_process_destroy_delayed(struct rcu_head *rcu)
> +static void kfd_process_free_notifier(struct mmu_notifier *mn)
>   {
> -	struct kfd_process *p = container_of(rcu, struct kfd_process, rcu);
> -
> -	kfd_unref_process(p);
> +	kfd_unref_process(container_of(mn, struct kfd_process, mmu_notifier));
>   }
>   
>   static void kfd_process_notifier_release(struct mmu_notifier *mn,
> @@ -542,12 +540,12 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn,
>   
>   	mutex_unlock(&p->mutex);
>   
> -	mmu_notifier_unregister_no_release(&p->mmu_notifier, mm);
> -	mmu_notifier_call_srcu(&p->rcu, &kfd_process_destroy_delayed);
> +	mmu_notifier_put(&p->mmu_notifier);
>   }
>   
>   static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = {
>   	.release = kfd_process_notifier_release,
> +	.free_notifier = kfd_process_free_notifier,
>   };
>   
>   static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep)
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put
  2019-08-06 23:47   ` Kuehling, Felix
@ 2019-08-07 11:42     ` Jason Gunthorpe
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-07 11:42 UTC (permalink / raw)
  To: Kuehling, Felix
  Cc: Andrea Arcangeli, Zhou, David(ChunMing),
	Ralph Campbell, Dimitri Sivanich, Gavin Shan, Andrea Righi,
	linux-rdma, John Hubbard, intel-gfx, linux-kernel, dri-devel,
	Koenig, Christian, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Deucher,  Alexander, Christoph Hellwig

On Tue, Aug 06, 2019 at 11:47:44PM +0000, Kuehling, Felix wrote:
> On 2019-08-06 19:15, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> >
> > The sequence of mmu_notifier_unregister_no_release(),
> > mmu_notifier_call_srcu() is identical to mmu_notifier_put() with the
> > free_notifier callback.
> >
> > As this is the last user of those APIs, converting it means we can drop
> > them.
> >
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> 
> Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
> 
> >   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |  3 ---
> >   drivers/gpu/drm/amd/amdkfd/kfd_process.c | 10 ++++------
> >   2 files changed, 4 insertions(+), 9 deletions(-)
> >
> > I'm really not sure what this is doing, but it is very strange to have a
> > release with no other callback. It would be good if this would change to use
> > get as well.
> KFD uses the MMU notifier to detect process termination and free all the 
> resources associated with the process. This was first added for APUs 
> where the IOMMUv2 is set up to perform address translations using the 
> CPU page table for device memory access. That's where the association of 
> KFD process resources with the lifetime of the mm_struct comes from.

When all the HW objects that could do DMA to this process are
destroyed then the mmu notififer should be torn down. The module
should remain locked until the DMA objects are destroyed.

I'm still unclear why this is needed, the IOMMU for PASID already has
notififers, and already blocks access when the mm_struct goes away,
why add a second layer of tracking?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller
  2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
@ 2019-08-08 10:24   ` Christoph Hellwig
  2019-08-14 20:14   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2019-08-08 10:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Jason Gunthorpe, Alex Deucher, intel-gfx,
	Christoph Hellwig

On Tue, Aug 06, 2019 at 08:15:38PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This simplifies the code to not have so many one line functions and extra
> logic. __mmu_notifier_register() simply becomes the entry point to
> register the notifier, and the other one calls it under lock.
> 
> Also add a lockdep_assert to check that the callers are holding the lock
> as expected.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
  2019-08-06 23:15 ` [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct Jason Gunthorpe
@ 2019-08-08 10:25   ` Christoph Hellwig
  2019-08-14 15:58     ` Jason Gunthorpe
  0 siblings, 1 reply; 32+ messages in thread
From: Christoph Hellwig @ 2019-08-08 10:25 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Jason Gunthorpe, Alex Deucher, intel-gfx,
	Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
  2019-08-06 23:15 ` [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm Jason Gunthorpe
@ 2019-08-08 10:26   ` Christoph Hellwig
  2019-08-14 20:32   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2019-08-08 10:26 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Jason Gunthorpe, Alex Deucher, intel-gfx,
	Christoph Hellwig

On Tue, Aug 06, 2019 at 08:15:39PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> A prior commit e0f3c3f78da2 ("mm/mmu_notifier: init notifier if necessary")
> made an attempt at doing this, but had to be reverted as calling
> the GFP_KERNEL allocator under the i_mmap_mutex causes deadlock, see
> commit 35cfa2b0b491 ("mm/mmu_notifier: allocate mmu_notifier in advance").
> 
> However, we can avoid that problem by doing the allocation only under
> the mmap_sem, which is already happening.
> 
> Since all writers to mm->mmu_notifier_mm hold the write side of the
> mmap_sem reading it under that sem is deterministic and we can use that to
> decide if the allocation path is required, without speculation.
> 
> The actual update to mmu_notifier_mm must still be done under the
> mm_take_all_locks() to ensure read-side coherency.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm'
  2019-08-06 23:15 ` [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm' Jason Gunthorpe
@ 2019-08-08 10:28   ` Christoph Hellwig
  2019-08-14 21:51   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2019-08-08 10:28 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Jason Gunthorpe, Alex Deucher, intel-gfx,
	Christoph Hellwig

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release
  2019-08-06 23:15 ` [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release Jason Gunthorpe
@ 2019-08-08 10:29   ` Christoph Hellwig
  2019-08-14 21:53   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Christoph Hellwig @ 2019-08-08 10:29 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Jason Gunthorpe, Alex Deucher, intel-gfx,
	Christoph Hellwig

On Tue, Aug 06, 2019 at 08:15:48PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> mmu_notifier_unregister_no_release() and mmu_notifier_call_srcu() no
> longer have any users, they have all been converted to use
> mmu_notifier_put().
> 
> So delete this difficult to use interface.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
  2019-08-08 10:25   ` Christoph Hellwig
@ 2019-08-14 15:58     ` Jason Gunthorpe
  2019-08-14 17:18       ` Dimitri Sivanich
  0 siblings, 1 reply; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-14 15:58 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel, linux-mm,
	Jérôme Glisse, iommu, amd-gfx, Alex Deucher, intel-gfx,
	Christian König

On Thu, Aug 08, 2019 at 12:25:56PM +0200, Christoph Hellwig wrote:
> Looks good,
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Dimitri, are you OK with this patch?

Thanks,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn
  2019-08-06 23:15 ` [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn Jason Gunthorpe
@ 2019-08-14 16:07   ` Jason Gunthorpe
  2019-08-15  8:28   ` Christian König
  1 sibling, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-14 16:07 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, linux-rdma, John Hubbard, Kuehling, Felix,
	linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Alex Deucher, intel-gfx,
	Christoph Hellwig

On Tue, Aug 06, 2019 at 08:15:45PM -0300, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> radeon is using a device global hash table to track what mmu_notifiers
> have been registered on struct mm. This is better served with the new
> get/put scheme instead.
> 
> radeon has a bug where it was not blocking notifier release() until all
> the BO's had been invalidated. This could result in a use after free of
> pages the BOs. This is tied into a second bug where radeon left the
> notifiers running endlessly even once the interval tree became
> empty. This could result in a use after free with module unload.
> 
> Both are fixed by changing the lifetime model, the BOs exist in the
> interval tree with their natural lifetimes independent of the mm_struct
> lifetime using the get/put scheme. The release runs synchronously and just
> does invalidate_start across the entire interval tree to create the
> required DMA fence.
> 
> Additions to the interval tree after release are already impossible as
> only current->mm is used during the add.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
>  drivers/gpu/drm/radeon/radeon.h        |   3 -
>  drivers/gpu/drm/radeon/radeon_device.c |   2 -
>  drivers/gpu/drm/radeon/radeon_drv.c    |   2 +
>  drivers/gpu/drm/radeon/radeon_mn.c     | 157 ++++++-------------------
>  4 files changed, 38 insertions(+), 126 deletions(-)

AMD team: Are you OK with this patch?

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
  2019-08-14 15:58     ` Jason Gunthorpe
@ 2019-08-14 17:18       ` Dimitri Sivanich
  0 siblings, 0 replies; 32+ messages in thread
From: Dimitri Sivanich @ 2019-08-14 17:18 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Gavin Shan, Andrea Righi, Dimitri Sivanich, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Alex Deucher, intel-gfx, Christoph Hellwig

On Wed, Aug 14, 2019 at 03:58:34PM +0000, Jason Gunthorpe wrote:
> On Thu, Aug 08, 2019 at 12:25:56PM +0200, Christoph Hellwig wrote:
> > Looks good,
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Dimitri, are you OK with this patch?
>

I think this looks OK.

Reviewed-by: Dimitri Sivanich <sivanich@hpe.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller
  2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
  2019-08-08 10:24   ` Christoph Hellwig
@ 2019-08-14 20:14   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 20:14 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This simplifies the code to not have so many one line functions and extra
> logic. __mmu_notifier_register() simply becomes the entry point to
> register the notifier, and the other one calls it under lock.
> 
> Also add a lockdep_assert to check that the callers are holding the lock
> as expected.
> 
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Nice clean up.
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
  2019-08-06 23:15 ` [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm Jason Gunthorpe
  2019-08-08 10:26   ` Christoph Hellwig
@ 2019-08-14 20:32   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 20:32 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> A prior commit e0f3c3f78da2 ("mm/mmu_notifier: init notifier if necessary")
> made an attempt at doing this, but had to be reverted as calling
> the GFP_KERNEL allocator under the i_mmap_mutex causes deadlock, see
> commit 35cfa2b0b491 ("mm/mmu_notifier: allocate mmu_notifier in advance").
> 
> However, we can avoid that problem by doing the allocation only under
> the mmap_sem, which is already happening.
> 
> Since all writers to mm->mmu_notifier_mm hold the write side of the
> mmap_sem reading it under that sem is deterministic and we can use that to
> decide if the allocation path is required, without speculation.
> 
> The actual update to mmu_notifier_mm must still be done under the
> mm_take_all_locks() to ensure read-side coherency.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good to me.
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration
  2019-08-06 23:15 ` [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration Jason Gunthorpe
@ 2019-08-14 21:20   ` Ralph Campbell
  2019-08-15  0:13     ` Jason Gunthorpe
  0 siblings, 1 reply; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 21:20 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> Many places in the kernel have a flow where userspace will create some
> object and that object will need to connect to the subsystem's
> mmu_notifier subscription for the duration of its lifetime.
> 
> In this case the subsystem is usually tracking multiple mm_structs and it
> is difficult to keep track of what struct mmu_notifier's have been
> allocated for what mm's.
> 
> Since this has been open coded in a variety of exciting ways, provide core
> functionality to do this safely.
> 
> This approach uses the strct mmu_notifier_ops * as a key to determine if

s/strct/struct

> the subsystem has a notifier registered on the mm or not. If there is a
> registration then the existing notifier struct is returned, otherwise the
> ops->alloc_notifiers() is used to create a new per-subsystem notifier for
> the mm.
> 
> The destroy side incorporates an async call_srcu based destruction which
> will avoid bugs in the callers such as commit 6d7c3cde93c1 ("mm/hmm: fix
> use after free with struct hmm in the mmu notifiers").
> 
> Since we are inside the mmu notifier core locking is fairly simple, the
> allocation uses the same approach as for mmu_notifier_mm, the write side
> of the mmap_sem makes everything deterministic and we only need to do
> hlist_add_head_rcu() under the mm_take_all_locks(). The new users count
> and the discoverability in the hlist is fully serialized by the
> mmu_notifier_mm->lock.
> 
> Co-developed-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Christoph Hellwig <hch@infradead.org>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>

> ---
>   include/linux/mmu_notifier.h |  35 ++++++++
>   mm/mmu_notifier.c            | 156 +++++++++++++++++++++++++++++++++--
>   2 files changed, 185 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> index b6c004bd9f6ad9..31aa971315a142 100644
> --- a/include/linux/mmu_notifier.h
> +++ b/include/linux/mmu_notifier.h
> @@ -211,6 +211,19 @@ struct mmu_notifier_ops {
>   	 */
>   	void (*invalidate_range)(struct mmu_notifier *mn, struct mm_struct *mm,
>   				 unsigned long start, unsigned long end);
> +
> +	/*
> +	 * These callbacks are used with the get/put interface to manage the
> +	 * lifetime of the mmu_notifier memory. alloc_notifier() returns a new
> +	 * notifier for use with the mm.
> +	 *
> +	 * free_notifier() is only called after the mmu_notifier has been
> +	 * fully put, calls to any ops callback are prevented and no ops
> +	 * callbacks are currently running. It is called from a SRCU callback
> +	 * and cannot sleep.
> +	 */
> +	struct mmu_notifier *(*alloc_notifier)(struct mm_struct *mm);
> +	void (*free_notifier)(struct mmu_notifier *mn);
>   };
>   
>   /*
> @@ -227,6 +240,9 @@ struct mmu_notifier_ops {
>   struct mmu_notifier {
>   	struct hlist_node hlist;
>   	const struct mmu_notifier_ops *ops;
> +	struct mm_struct *mm;
> +	struct rcu_head rcu;
> +	unsigned int users;
>   };
>   
>   static inline int mm_has_notifiers(struct mm_struct *mm)
> @@ -234,6 +250,21 @@ static inline int mm_has_notifiers(struct mm_struct *mm)
>   	return unlikely(mm->mmu_notifier_mm);
>   }
>   
> +struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
> +					     struct mm_struct *mm);
> +static inline struct mmu_notifier *
> +mmu_notifier_get(const struct mmu_notifier_ops *ops, struct mm_struct *mm)
> +{
> +	struct mmu_notifier *ret;
> +
> +	down_write(&mm->mmap_sem);
> +	ret = mmu_notifier_get_locked(ops, mm);
> +	up_write(&mm->mmap_sem);
> +	return ret;
> +}
> +void mmu_notifier_put(struct mmu_notifier *mn);
> +void mmu_notifier_synchronize(void);
> +
>   extern int mmu_notifier_register(struct mmu_notifier *mn,
>   				 struct mm_struct *mm);
>   extern int __mmu_notifier_register(struct mmu_notifier *mn,
> @@ -581,6 +612,10 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
>   #define pudp_huge_clear_flush_notify pudp_huge_clear_flush
>   #define set_pte_at_notify set_pte_at
>   
> +static inline void mmu_notifier_synchronize(void)
> +{
> +}
> +
>   #endif /* CONFIG_MMU_NOTIFIER */
>   
>   #endif /* _LINUX_MMU_NOTIFIER_H */
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 696810f632ade1..4a770b5211b71d 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -248,6 +248,9 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
>   	lockdep_assert_held_write(&mm->mmap_sem);
>   	BUG_ON(atomic_read(&mm->mm_users) <= 0);
>   
> +	mn->mm = mm;
> +	mn->users = 1;
> +
>   	if (!mm->mmu_notifier_mm) {
>   		/*
>   		 * kmalloc cannot be called under mm_take_all_locks(), but we
> @@ -295,18 +298,24 @@ int __mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
>   }
>   EXPORT_SYMBOL_GPL(__mmu_notifier_register);
>   
> -/*
> +/**
> + * mmu_notifier_register - Register a notifier on a mm
> + * @mn: The notifier to attach
> + * @mm : The mm to attach the notifier to

Why the space before the colon?
I know, super tiny nit.

> + *
>    * Must not hold mmap_sem nor any other VM related lock when calling
>    * this registration function. Must also ensure mm_users can't go down
>    * to zero while this runs to avoid races with mmu_notifier_release,
>    * so mm has to be current->mm or the mm should be pinned safely such
>    * as with get_task_mm(). If the mm is not current->mm, the mm_users
>    * pin should be released by calling mmput after mmu_notifier_register
> - * returns. mmu_notifier_unregister must be always called to
> - * unregister the notifier. mm_count is automatically pinned to allow
> - * mmu_notifier_unregister to safely run at any time later, before or
> - * after exit_mmap. ->release will always be called before exit_mmap
> - * frees the pages.
> + * returns.
> + *
> + * mmu_notifier_unregister() or mmu_notifier_put() must be always called to
> + * unregister the notifier.
> + *
> + * While the caller has a mmu_notifer get the mm pointer will remain valid,

s/mmu_notifer/mmu_notifier
Maybe "While the caller has a mmu_notifier, the mn->mm pointer will 
remain valid,"

> + * and can be converted to an active mm pointer via mmget_not_zero().
>    */
>   int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
>   {
> @@ -319,6 +328,72 @@ int mmu_notifier_register(struct mmu_notifier *mn, struct mm_struct *mm)
>   }
>   EXPORT_SYMBOL_GPL(mmu_notifier_register);
>   
> +static struct mmu_notifier *
> +find_get_mmu_notifier(struct mm_struct *mm, const struct mmu_notifier_ops *ops)
> +{
> +	struct mmu_notifier *mn;
> +
> +	spin_lock(&mm->mmu_notifier_mm->lock);
> +	hlist_for_each_entry_rcu (mn, &mm->mmu_notifier_mm->list, hlist) {
> +		if (mn->ops != ops)
> +			continue;
> +
> +		if (likely(mn->users != UINT_MAX))
> +			mn->users++;
> +		else
> +			mn = ERR_PTR(-EOVERFLOW);
> +		spin_unlock(&mm->mmu_notifier_mm->lock);
> +		return mn;
> +	}
> +	spin_unlock(&mm->mmu_notifier_mm->lock);
> +	return NULL;
> +}
> +
> +/**
> + * mmu_notifier_get_locked - Return the single struct mmu_notifier for
> + *                           the mm & ops
> + * @ops: The operations struct being subscribe with
> + * @mm : The mm to attach notifiers too
> + *
> + * This function either allocates a new mmu_notifier via
> + * ops->alloc_notifier(), or returns an already existing notifier on the
> + * list. The value of the ops pointer is used to determine when two notifiers
> + * are the same.
> + *
> + * Each call to mmu_notifier_get() must be paired with a caller to

s/caller/call

> + * mmu_notifier_put(). The caller must hold the write side of mm->mmap_sem.
> + *
> + * While the caller has a mmu_notifer get the mm pointer will remain valid,

same as "mmu_notifer" above.

> + * and can be converted to an active mm pointer via mmget_not_zero().
> + */
> +struct mmu_notifier *mmu_notifier_get_locked(const struct mmu_notifier_ops *ops,
> +					     struct mm_struct *mm)
> +{
> +	struct mmu_notifier *mn;
> +	int ret;
> +
> +	lockdep_assert_held_write(&mm->mmap_sem);
> +
> +	if (mm->mmu_notifier_mm) {
> +		mn = find_get_mmu_notifier(mm, ops);
> +		if (mn)
> +			return mn;
> +	}
> +
> +	mn = ops->alloc_notifier(mm);
> +	if (IS_ERR(mn))
> +		return mn;
> +	mn->ops = ops;
> +	ret = __mmu_notifier_register(mn, mm);
> +	if (ret)
> +		goto out_free;
> +	return mn;
> +out_free:
> +	mn->ops->free_notifier(mn);
> +	return ERR_PTR(ret);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_get_locked);
> +
>   /* this is called after the last mmu_notifier_unregister() returned */
>   void __mmu_notifier_mm_destroy(struct mm_struct *mm)
>   {
> @@ -397,6 +472,75 @@ void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
>   }
>   EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
>   
> +static void mmu_notifier_free_rcu(struct rcu_head *rcu)
> +{
> +	struct mmu_notifier *mn = container_of(rcu, struct mmu_notifier, rcu);
> +	struct mm_struct *mm = mn->mm;
> +
> +	mn->ops->free_notifier(mn);
> +	/* Pairs with the get in __mmu_notifier_register() */
> +	mmdrop(mm);
> +}
> +
> +/**
> + * mmu_notifier_put - Release the reference on the notifier
> + * @mn: The notifier to act on
> + *
> + * This function must be paired with each mmu_notifier_get(), it releases the
> + * reference obtained by the get. If this is the last reference then process
> + * to free the notifier will be run asynchronously.
> + *
> + * Unlike mmu_notifier_unregister() the get/put flow only calls ops->release
> + * when the mm_struct is destroyed. Instead free_notifier is always called to
> + * release any resources held by the user.
> + *
> + * As ops->release is not guaranteed to be called, the user must ensure that
> + * all sptes are dropped, and no new sptes can be established before
> + * mmu_notifier_put() is called.
> + *
> + * This function can be called from the ops->release callback, however the
> + * caller must still ensure it is called pairwise with mmu_notifier_get().
> + *
> + * Modules calling this function must call mmu_notifier_module_unload() in

I think you mean mmu_notifier_synchronize().

> + * their __exit functions to ensure the async work is completed.
> + */
> +void mmu_notifier_put(struct mmu_notifier *mn)
> +{
> +	struct mm_struct *mm = mn->mm;
> +
> +	spin_lock(&mm->mmu_notifier_mm->lock);
> +	if (WARN_ON(!mn->users) || --mn->users)
> +		goto out_unlock;
> +	hlist_del_init_rcu(&mn->hlist);
> +	spin_unlock(&mm->mmu_notifier_mm->lock);
> +
> +	call_srcu(&srcu, &mn->rcu, mmu_notifier_free_rcu);
> +	return;
> +
> +out_unlock:
> +	spin_unlock(&mm->mmu_notifier_mm->lock);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_put);
> +
> +/**
> + * mmu_notifier_synchronize - Ensure all mmu_notifiers are freed
> + *
> + * This function ensures that all outsanding async SRU work from

s/outsanding/outstanding

> + * mmu_notifier_put() is completed. After it returns any mmu_notifier_ops
> + * associated with an unused mmu_notifier will no longer be called.
> + *
> + * Before using the caller must ensure that all of its mmu_notifiers have been
> + * fully released via mmu_notifier_put().
> + *
> + * Modules using the mmu_notifier_put() API should call this in their __exit
> + * function to avoid module unloading races.
> + */
> +void mmu_notifier_synchronize(void)
> +{
> +	synchronize_srcu(&srcu);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_synchronize);
> +
>   bool
>   mmu_notifier_range_update_to_read_only(const struct mmu_notifier_range *range)
>   {
> 
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm'
  2019-08-06 23:15 ` [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm' Jason Gunthorpe
  2019-08-08 10:28   ` Christoph Hellwig
@ 2019-08-14 21:51   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 21:51 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This is a significant simplification, it eliminates all the remaining
> 'hmm' stuff in mm_struct, eliminates krefing along the critical notifier
> paths, and takes away all the ugly locking and abuse of page_table_lock.
> 
> mmu_notifier_get() provides the single struct hmm per struct mm which
> eliminates mm->hmm.
> 
> It also directly guarantees that no mmu_notifier op callback is callable
> while concurrent free is possible, this eliminates all the krefs inside
> the mmu_notifier callbacks.
> 
> The remaining krefs in the range code were overly cautious, drivers are
> already not permitted to free the mirror while a range exists.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Looks good.
Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release
  2019-08-06 23:15 ` [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release Jason Gunthorpe
  2019-08-08 10:29   ` Christoph Hellwig
@ 2019-08-14 21:53   ` Ralph Campbell
  1 sibling, 0 replies; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 21:53 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> mmu_notifier_unregister_no_release() and mmu_notifier_call_srcu() no
> longer have any users, they have all been converted to use
> mmu_notifier_put().
> 
> So delete this difficult to use interface.
> 
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Reviewed-by: Ralph Campbell <rcampbell@nvidia.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (10 preceding siblings ...)
  2019-08-06 23:15 ` [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release Jason Gunthorpe
@ 2019-08-14 23:56 ` Ralph Campbell
  2019-08-16 15:14 ` Jason Gunthorpe
  12 siblings, 0 replies; 32+ messages in thread
From: Ralph Campbell @ 2019-08-14 23:56 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König,
	Jérôme Glisse, iommu, amd-gfx, Jason Gunthorpe,
	Alex Deucher, intel-gfx, Christoph Hellwig


On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> From: Jason Gunthorpe <jgg@mellanox.com>
> 
> This series introduces a new registration flow for mmu_notifiers based on
> the idea that the user would like to get a single refcounted piece of
> memory for a mm, keyed to its use.
> 
> For instance many users of mmu_notifiers use an interval tree or similar
> to dispatch notifications to some object. There are many objects but only
> one notifier subscription per mm holding the tree.
> 
> Of the 12 places that call mmu_notifier_register:
>   - 7 are maintaining some kind of obvious mapping of mm_struct to
>     mmu_notifier registration, ie in some linked list or hash table. Of
>     the 7 this series converts 4 (gru, hmm, RDMA, radeon)
> 
>   - 3 (hfi1, gntdev, vhost) are registering multiple notifiers, but each
>     one immediately does some VA range filtering, ie with an interval tree.
>     These would be better with a global subsystem-wide range filter and
>     could convert to this API.
> 
>   - 2 (kvm, amd_iommu) are deliberately using a single mm at a time, and
>     really can't use this API. One of the intel-svm's modes is also in this
>     list
> 
> The 3/7 unconverted drivers are:
>   - intel-svm
>     This driver tracks mm's in a global linked list 'global_svm_list'
>     and would benefit from this API.
> 
>     Its flow is a bit complex, since it also wants a set of non-shared
>     notifiers.
> 
>   - i915_gem_usrptr
>     This driver tracks mm's in a per-device hash
>     table (dev_priv->mm_structs), but only has an optional use of
>     mmu_notifiers.  Since it still seems to need the hash table it is
>     difficult to convert.
> 
>   - amdkfd/kfd_process
>     This driver is using a global SRCU hash table to track mm's
> 
>     The control flow here is very complicated and the driver is relying on
>     this hash table to be fast on the ioctl syscall path.
> 
>     It would definitely benefit, but only if the ioctl path didn't need to
>     do the search so often.
> 
> This series is already entangled with patches in the hmm & RDMA tree and
> will require some git topic branches for the RDMA ODP stuff. I intend for
> it to go through the hmm tree.
> 
> There is a git version here:
> 
> https://github.com/jgunthorpe/linux/commits/mmu_notifier
> 
> Which has the required pre-patches for the RDMA ODP conversion that are
> still being reviewed.
> 
> Jason Gunthorpe (11):
>    mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the
>      caller
>    mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
>    mm/mmu_notifiers: add a get/put scheme for the registration
>    misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
>    hmm: use mmu_notifier_get/put for 'struct hmm'
>    RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
>    RDMA/odp: remove ib_ucontext from ib_umem
>    drm/radeon: use mmu_notifier_get/put for struct radeon_mn
>    drm/amdkfd: fix a use after free race with mmu_notifer unregister
>    drm/amdkfd: use mmu_notifier_put
>    mm/mmu_notifiers: remove unregister_no_release
> 
>   drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c  |   1 +
>   drivers/gpu/drm/amd/amdkfd/kfd_priv.h    |   3 -
>   drivers/gpu/drm/amd/amdkfd/kfd_process.c |  88 ++++-----
>   drivers/gpu/drm/nouveau/nouveau_drm.c    |   3 +
>   drivers/gpu/drm/radeon/radeon.h          |   3 -
>   drivers/gpu/drm/radeon/radeon_device.c   |   2 -
>   drivers/gpu/drm/radeon/radeon_drv.c      |   2 +
>   drivers/gpu/drm/radeon/radeon_mn.c       | 157 ++++------------
>   drivers/infiniband/core/umem.c           |   4 +-
>   drivers/infiniband/core/umem_odp.c       | 183 ++++++------------
>   drivers/infiniband/core/uverbs_cmd.c     |   3 -
>   drivers/infiniband/core/uverbs_main.c    |   1 +
>   drivers/infiniband/hw/mlx5/main.c        |   5 -
>   drivers/misc/sgi-gru/grufile.c           |   1 +
>   drivers/misc/sgi-gru/grutables.h         |   2 -
>   drivers/misc/sgi-gru/grutlbpurge.c       |  84 +++------
>   include/linux/hmm.h                      |  12 +-
>   include/linux/mm_types.h                 |   6 -
>   include/linux/mmu_notifier.h             |  40 +++-
>   include/rdma/ib_umem.h                   |   2 +-
>   include/rdma/ib_umem_odp.h               |  10 +-
>   include/rdma/ib_verbs.h                  |   3 -
>   kernel/fork.c                            |   1 -
>   mm/hmm.c                                 | 121 +++---------
>   mm/mmu_notifier.c                        | 230 +++++++++++++++++------
>   25 files changed, 408 insertions(+), 559 deletions(-)

For the core MM, HMM, and nouveau changes you can add:
Tested-by: Ralph Campbell <rcampbell@nvidia.com>
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration
  2019-08-14 21:20   ` Ralph Campbell
@ 2019-08-15  0:13     ` Jason Gunthorpe
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-15  0:13 UTC (permalink / raw)
  To: Ralph Campbell
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Dimitri Sivanich,
	Gavin Shan, Andrea Righi, linux-rdma, John Hubbard, Kuehling,
	Felix, linux-kernel, dri-devel, Christian König, linux-mm,
	Jérôme Glisse, iommu, amd-gfx, Alex Deucher, intel-gfx,
	Christoph Hellwig

On Wed, Aug 14, 2019 at 02:20:31PM -0700, Ralph Campbell wrote:
> 
> On 8/6/19 4:15 PM, Jason Gunthorpe wrote:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > Many places in the kernel have a flow where userspace will create some
> > object and that object will need to connect to the subsystem's
> > mmu_notifier subscription for the duration of its lifetime.
> > 
> > In this case the subsystem is usually tracking multiple mm_structs and it
> > is difficult to keep track of what struct mmu_notifier's have been
> > allocated for what mm's.
> > 
> > Since this has been open coded in a variety of exciting ways, provide core
> > functionality to do this safely.
> > 
> > This approach uses the strct mmu_notifier_ops * as a key to determine if
> 
> s/strct/struct

Yes, thanks for all of this, I like having comments, but I'm a
terrible proofreader :(

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn
  2019-08-06 23:15 ` [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn Jason Gunthorpe
  2019-08-14 16:07   ` Jason Gunthorpe
@ 2019-08-15  8:28   ` Christian König
  2019-08-15 19:46     ` Jason Gunthorpe
  1 sibling, 1 reply; 32+ messages in thread
From: Christian König @ 2019-08-15  8:28 UTC (permalink / raw)
  To: Jason Gunthorpe, linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christoph Hellwig, Jérôme Glisse, iommu, amd-gfx,
	Jason Gunthorpe, Alex Deucher, intel-gfx, Christian König

Am 07.08.19 um 01:15 schrieb Jason Gunthorpe:
> From: Jason Gunthorpe <jgg@mellanox.com>
>
> radeon is using a device global hash table to track what mmu_notifiers
> have been registered on struct mm. This is better served with the new
> get/put scheme instead.
>
> radeon has a bug where it was not blocking notifier release() until all
> the BO's had been invalidated. This could result in a use after free of
> pages the BOs. This is tied into a second bug where radeon left the
> notifiers running endlessly even once the interval tree became
> empty. This could result in a use after free with module unload.
>
> Both are fixed by changing the lifetime model, the BOs exist in the
> interval tree with their natural lifetimes independent of the mm_struct
> lifetime using the get/put scheme. The release runs synchronously and just
> does invalidate_start across the entire interval tree to create the
> required DMA fence.
>
> Additions to the interval tree after release are already impossible as
> only current->mm is used during the add.
>
> Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Acked-by: Christian König <christian.koenig@amd.com>

But I'm wondering if we shouldn't completely drop radeon userptr support.

It's just to buggy,
Christian.

> ---
>   drivers/gpu/drm/radeon/radeon.h        |   3 -
>   drivers/gpu/drm/radeon/radeon_device.c |   2 -
>   drivers/gpu/drm/radeon/radeon_drv.c    |   2 +
>   drivers/gpu/drm/radeon/radeon_mn.c     | 157 ++++++-------------------
>   4 files changed, 38 insertions(+), 126 deletions(-)
>
> AMD team: I wonder if kfd has similar lifetime issues?
>
> diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
> index 32808e50be12f8..918164f90b114a 100644
> --- a/drivers/gpu/drm/radeon/radeon.h
> +++ b/drivers/gpu/drm/radeon/radeon.h
> @@ -2451,9 +2451,6 @@ struct radeon_device {
>   	/* tracking pinned memory */
>   	u64 vram_pin_size;
>   	u64 gart_pin_size;
> -
> -	struct mutex	mn_lock;
> -	DECLARE_HASHTABLE(mn_hash, 7);
>   };
>   
>   bool radeon_is_px(struct drm_device *dev);
> diff --git a/drivers/gpu/drm/radeon/radeon_device.c b/drivers/gpu/drm/radeon/radeon_device.c
> index dceb554e567446..788b1d8a80e660 100644
> --- a/drivers/gpu/drm/radeon/radeon_device.c
> +++ b/drivers/gpu/drm/radeon/radeon_device.c
> @@ -1325,8 +1325,6 @@ int radeon_device_init(struct radeon_device *rdev,
>   	init_rwsem(&rdev->pm.mclk_lock);
>   	init_rwsem(&rdev->exclusive_lock);
>   	init_waitqueue_head(&rdev->irq.vblank_queue);
> -	mutex_init(&rdev->mn_lock);
> -	hash_init(rdev->mn_hash);
>   	r = radeon_gem_init(rdev);
>   	if (r)
>   		return r;
> diff --git a/drivers/gpu/drm/radeon/radeon_drv.c b/drivers/gpu/drm/radeon/radeon_drv.c
> index a6cbe11f79c611..b6535ac91fdb74 100644
> --- a/drivers/gpu/drm/radeon/radeon_drv.c
> +++ b/drivers/gpu/drm/radeon/radeon_drv.c
> @@ -35,6 +35,7 @@
>   #include <linux/module.h>
>   #include <linux/pm_runtime.h>
>   #include <linux/vga_switcheroo.h>
> +#include <linux/mmu_notifier.h>
>   
>   #include <drm/drm_crtc_helper.h>
>   #include <drm/drm_drv.h>
> @@ -624,6 +625,7 @@ static void __exit radeon_exit(void)
>   {
>   	pci_unregister_driver(pdriver);
>   	radeon_unregister_atpx_handler();
> +	mmu_notifier_synchronize();
>   }
>   
>   module_init(radeon_init);
> diff --git a/drivers/gpu/drm/radeon/radeon_mn.c b/drivers/gpu/drm/radeon/radeon_mn.c
> index 8c3871ed23a9f0..fc8254273a800b 100644
> --- a/drivers/gpu/drm/radeon/radeon_mn.c
> +++ b/drivers/gpu/drm/radeon/radeon_mn.c
> @@ -37,17 +37,8 @@
>   #include "radeon.h"
>   
>   struct radeon_mn {
> -	/* constant after initialisation */
> -	struct radeon_device	*rdev;
> -	struct mm_struct	*mm;
>   	struct mmu_notifier	mn;
>   
> -	/* only used on destruction */
> -	struct work_struct	work;
> -
> -	/* protected by rdev->mn_lock */
> -	struct hlist_node	node;
> -
>   	/* objects protected by lock */
>   	struct mutex		lock;
>   	struct rb_root_cached	objects;
> @@ -58,55 +49,6 @@ struct radeon_mn_node {
>   	struct list_head		bos;
>   };
>   
> -/**
> - * radeon_mn_destroy - destroy the rmn
> - *
> - * @work: previously sheduled work item
> - *
> - * Lazy destroys the notifier from a work item
> - */
> -static void radeon_mn_destroy(struct work_struct *work)
> -{
> -	struct radeon_mn *rmn = container_of(work, struct radeon_mn, work);
> -	struct radeon_device *rdev = rmn->rdev;
> -	struct radeon_mn_node *node, *next_node;
> -	struct radeon_bo *bo, *next_bo;
> -
> -	mutex_lock(&rdev->mn_lock);
> -	mutex_lock(&rmn->lock);
> -	hash_del(&rmn->node);
> -	rbtree_postorder_for_each_entry_safe(node, next_node,
> -					     &rmn->objects.rb_root, it.rb) {
> -
> -		interval_tree_remove(&node->it, &rmn->objects);
> -		list_for_each_entry_safe(bo, next_bo, &node->bos, mn_list) {
> -			bo->mn = NULL;
> -			list_del_init(&bo->mn_list);
> -		}
> -		kfree(node);
> -	}
> -	mutex_unlock(&rmn->lock);
> -	mutex_unlock(&rdev->mn_lock);
> -	mmu_notifier_unregister(&rmn->mn, rmn->mm);
> -	kfree(rmn);
> -}
> -
> -/**
> - * radeon_mn_release - callback to notify about mm destruction
> - *
> - * @mn: our notifier
> - * @mn: the mm this callback is about
> - *
> - * Shedule a work item to lazy destroy our notifier.
> - */
> -static void radeon_mn_release(struct mmu_notifier *mn,
> -			      struct mm_struct *mm)
> -{
> -	struct radeon_mn *rmn = container_of(mn, struct radeon_mn, mn);
> -	INIT_WORK(&rmn->work, radeon_mn_destroy);
> -	schedule_work(&rmn->work);
> -}
> -
>   /**
>    * radeon_mn_invalidate_range_start - callback to notify about mm change
>    *
> @@ -183,65 +125,44 @@ static int radeon_mn_invalidate_range_start(struct mmu_notifier *mn,
>   	return ret;
>   }
>   
> -static const struct mmu_notifier_ops radeon_mn_ops = {
> -	.release = radeon_mn_release,
> -	.invalidate_range_start = radeon_mn_invalidate_range_start,
> -};
> +static void radeon_mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
> +{
> +	struct mmu_notifier_range range = {
> +		.mm = mm,
> +		.start = 0,
> +		.end = ULONG_MAX,
> +		.flags = 0,
> +		.event = MMU_NOTIFY_UNMAP,
> +	};
> +
> +	radeon_mn_invalidate_range_start(mn, &range);
> +}
>   
> -/**
> - * radeon_mn_get - create notifier context
> - *
> - * @rdev: radeon device pointer
> - *
> - * Creates a notifier context for current->mm.
> - */
> -static struct radeon_mn *radeon_mn_get(struct radeon_device *rdev)
> +static struct mmu_notifier *radeon_mn_alloc_notifier(struct mm_struct *mm)
>   {
> -	struct mm_struct *mm = current->mm;
>   	struct radeon_mn *rmn;
> -	int r;
> -
> -	if (down_write_killable(&mm->mmap_sem))
> -		return ERR_PTR(-EINTR);
> -
> -	mutex_lock(&rdev->mn_lock);
> -
> -	hash_for_each_possible(rdev->mn_hash, rmn, node, (unsigned long)mm)
> -		if (rmn->mm == mm)
> -			goto release_locks;
>   
>   	rmn = kzalloc(sizeof(*rmn), GFP_KERNEL);
> -	if (!rmn) {
> -		rmn = ERR_PTR(-ENOMEM);
> -		goto release_locks;
> -	}
> +	if (!rmn)
> +		return ERR_PTR(-ENOMEM);
>   
> -	rmn->rdev = rdev;
> -	rmn->mm = mm;
> -	rmn->mn.ops = &radeon_mn_ops;
>   	mutex_init(&rmn->lock);
>   	rmn->objects = RB_ROOT_CACHED;
> -	
> -	r = __mmu_notifier_register(&rmn->mn, mm);
> -	if (r)
> -		goto free_rmn;
> -
> -	hash_add(rdev->mn_hash, &rmn->node, (unsigned long)mm);
> -
> -release_locks:
> -	mutex_unlock(&rdev->mn_lock);
> -	up_write(&mm->mmap_sem);
> -
> -	return rmn;
> -
> -free_rmn:
> -	mutex_unlock(&rdev->mn_lock);
> -	up_write(&mm->mmap_sem);
> -	kfree(rmn);
> +	return &rmn->mn;
> +}
>   
> -	return ERR_PTR(r);
> +static void radeon_mn_free_notifier(struct mmu_notifier *mn)
> +{
> +	kfree(container_of(mn, struct radeon_mn, mn));
>   }
>   
> +static const struct mmu_notifier_ops radeon_mn_ops = {
> +	.release = radeon_mn_release,
> +	.invalidate_range_start = radeon_mn_invalidate_range_start,
> +	.alloc_notifier = radeon_mn_alloc_notifier,
> +	.free_notifier = radeon_mn_free_notifier,
> +};
> +
>   /**
>    * radeon_mn_register - register a BO for notifier updates
>    *
> @@ -254,15 +175,16 @@ static struct radeon_mn *radeon_mn_get(struct radeon_device *rdev)
>   int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>   {
>   	unsigned long end = addr + radeon_bo_size(bo) - 1;
> -	struct radeon_device *rdev = bo->rdev;
> +	struct mmu_notifier *mn;
>   	struct radeon_mn *rmn;
>   	struct radeon_mn_node *node = NULL;
>   	struct list_head bos;
>   	struct interval_tree_node *it;
>   
> -	rmn = radeon_mn_get(rdev);
> -	if (IS_ERR(rmn))
> -		return PTR_ERR(rmn);
> +	mn = mmu_notifier_get(&radeon_mn_ops, current->mm);
> +	if (IS_ERR(mn))
> +		return PTR_ERR(mn);
> +	rmn = container_of(mn, struct radeon_mn, mn);
>   
>   	INIT_LIST_HEAD(&bos);
>   
> @@ -309,22 +231,13 @@ int radeon_mn_register(struct radeon_bo *bo, unsigned long addr)
>    */
>   void radeon_mn_unregister(struct radeon_bo *bo)
>   {
> -	struct radeon_device *rdev = bo->rdev;
> -	struct radeon_mn *rmn;
> +	struct radeon_mn *rmn = bo->mn;
>   	struct list_head *head;
>   
> -	mutex_lock(&rdev->mn_lock);
> -	rmn = bo->mn;
> -	if (rmn == NULL) {
> -		mutex_unlock(&rdev->mn_lock);
> -		return;
> -	}
> -
>   	mutex_lock(&rmn->lock);
>   	/* save the next list entry for later */
>   	head = bo->mn_list.next;
>   
> -	bo->mn = NULL;
>   	list_del(&bo->mn_list);
>   
>   	if (list_empty(head)) {
> @@ -335,5 +248,7 @@ void radeon_mn_unregister(struct radeon_bo *bo)
>   	}
>   
>   	mutex_unlock(&rmn->lock);
> -	mutex_unlock(&rdev->mn_lock);
> +
> +	mmu_notifier_put(&rmn->mn);
> +	bo->mn = NULL;
>   }

_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn
  2019-08-15  8:28   ` Christian König
@ 2019-08-15 19:46     ` Jason Gunthorpe
  0 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-15 19:46 UTC (permalink / raw)
  To: christian.koenig
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christoph Hellwig, linux-mm, Jérôme Glisse, iommu,
	amd-gfx, Alex Deucher, intel-gfx

On Thu, Aug 15, 2019 at 10:28:21AM +0200, Christian König wrote:
> Am 07.08.19 um 01:15 schrieb Jason Gunthorpe:
> > From: Jason Gunthorpe <jgg@mellanox.com>
> > 
> > radeon is using a device global hash table to track what mmu_notifiers
> > have been registered on struct mm. This is better served with the new
> > get/put scheme instead.
> > 
> > radeon has a bug where it was not blocking notifier release() until all
> > the BO's had been invalidated. This could result in a use after free of
> > pages the BOs. This is tied into a second bug where radeon left the
> > notifiers running endlessly even once the interval tree became
> > empty. This could result in a use after free with module unload.
> > 
> > Both are fixed by changing the lifetime model, the BOs exist in the
> > interval tree with their natural lifetimes independent of the mm_struct
> > lifetime using the get/put scheme. The release runs synchronously and just
> > does invalidate_start across the entire interval tree to create the
> > required DMA fence.
> > 
> > Additions to the interval tree after release are already impossible as
> > only current->mm is used during the add.
> > 
> > Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
> 
> Acked-by: Christian König <christian.koenig@amd.com>

Thanks!

> But I'm wondering if we shouldn't completely drop radeon userptr support.
> It's just to buggy,

I would not object :)

Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations
  2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
                   ` (11 preceding siblings ...)
  2019-08-14 23:56 ` [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Ralph Campbell
@ 2019-08-16 15:14 ` Jason Gunthorpe
  12 siblings, 0 replies; 32+ messages in thread
From: Jason Gunthorpe @ 2019-08-16 15:14 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, David (ChunMing) Zhou, Ralph Campbell,
	Dimitri Sivanich, Gavin Shan, Andrea Righi, linux-rdma,
	John Hubbard, Kuehling, Felix, linux-kernel, dri-devel,
	Christian König, Jérôme Glisse, iommu, amd-gfx,
	Alex Deucher, intel-gfx, Christoph Hellwig

On Tue, Aug 06, 2019 at 08:15:37PM -0300, Jason Gunthorpe wrote:
> This series is already entangled with patches in the hmm & RDMA tree and
> will require some git topic branches for the RDMA ODP stuff. I intend for
> it to go through the hmm tree.

> Jason Gunthorpe (11):
>   mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the
>     caller
>   mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm
>   mm/mmu_notifiers: add a get/put scheme for the registration
>   misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct
>   hmm: use mmu_notifier_get/put for 'struct hmm'
>   drm/radeon: use mmu_notifier_get/put for struct radeon_mn
>   drm/amdkfd: fix a use after free race with mmu_notifer unregister
>   drm/amdkfd: use mmu_notifier_put

Other than these patches:

>   RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm'
>   RDMA/odp: remove ib_ucontext from ib_umem
>   mm/mmu_notifiers: remove unregister_no_release

This series has been applied.

I will apply the ODP patches when the series they depend on is merged
to the RDMA tree

Any further acks/remarks I will annotate, thanks in advance

Thanks to all reviewers,
Jason
_______________________________________________
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, back to index

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-06 23:15 [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 01/11] mm/mmu_notifiers: hoist do_mmu_notifier_register down_write to the caller Jason Gunthorpe
2019-08-08 10:24   ` Christoph Hellwig
2019-08-14 20:14   ` Ralph Campbell
2019-08-06 23:15 ` [PATCH v3 hmm 02/11] mm/mmu_notifiers: do not speculatively allocate a mmu_notifier_mm Jason Gunthorpe
2019-08-08 10:26   ` Christoph Hellwig
2019-08-14 20:32   ` Ralph Campbell
2019-08-06 23:15 ` [PATCH v3 hmm 03/11] mm/mmu_notifiers: add a get/put scheme for the registration Jason Gunthorpe
2019-08-14 21:20   ` Ralph Campbell
2019-08-15  0:13     ` Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 04/11] misc/sgi-gru: use mmu_notifier_get/put for struct gru_mm_struct Jason Gunthorpe
2019-08-08 10:25   ` Christoph Hellwig
2019-08-14 15:58     ` Jason Gunthorpe
2019-08-14 17:18       ` Dimitri Sivanich
2019-08-06 23:15 ` [PATCH v3 hmm 05/11] hmm: use mmu_notifier_get/put for 'struct hmm' Jason Gunthorpe
2019-08-08 10:28   ` Christoph Hellwig
2019-08-14 21:51   ` Ralph Campbell
2019-08-06 23:15 ` [PATCH v3 hmm 06/11] RDMA/odp: use mmu_notifier_get/put for 'struct ib_ucontext_per_mm' Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 07/11] RDMA/odp: remove ib_ucontext from ib_umem Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 08/11] drm/radeon: use mmu_notifier_get/put for struct radeon_mn Jason Gunthorpe
2019-08-14 16:07   ` Jason Gunthorpe
2019-08-15  8:28   ` Christian König
2019-08-15 19:46     ` Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 09/11] drm/amdkfd: fix a use after free race with mmu_notifer unregister Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 10/11] drm/amdkfd: use mmu_notifier_put Jason Gunthorpe
2019-08-06 23:47   ` Kuehling, Felix
2019-08-07 11:42     ` Jason Gunthorpe
2019-08-06 23:15 ` [PATCH v3 hmm 11/11] mm/mmu_notifiers: remove unregister_no_release Jason Gunthorpe
2019-08-08 10:29   ` Christoph Hellwig
2019-08-14 21:53   ` Ralph Campbell
2019-08-14 23:56 ` [PATCH v3 hmm 00/11] Add mmu_notifier_get/put for managing mmu notifier registrations Ralph Campbell
2019-08-16 15:14 ` Jason Gunthorpe

IOMMU Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-iommu/0 linux-iommu/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-iommu linux-iommu/ https://lore.kernel.org/linux-iommu \
		iommu@lists.linux-foundation.org iommu@archiver.kernel.org
	public-inbox-index linux-iommu


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.linux-foundation.lists.iommu


AGPL code for this site: git clone https://public-inbox.org/ public-inbox