* mmu_notifier: preparatory patches for hmm and or iommuv2 v6
@ 2014-07-17 18:46 j.glisse
  2014-07-17 18:46 ` [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync j.glisse
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: j.glisse @ 2014-07-17 18:46 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Oded Gabbay

In a nutshell, a few patches to improve mmu_notifier:
 - patch 1/7 allows freeing resources when the mm_struct is destroyed.
 - patch 2/7 provides context information to mmu_notifier listeners.
 - patch 3/7 passes the vma to range_start/range_end to avoid a duplicate
   vma lookup inside the listener.

I restricted myself to a set of less controversial patches, and I believe
I have addressed all comments that were previously made. Thanks again
for all the feedback; I hope this version is the good one.

This is somewhat of a v6, but I do not include the core HMM with these
patches. Previous discussion threads:
v1 http://www.spinics.net/lists/linux-mm/msg72501.html
v2 http://www.spinics.net/lists/linux-mm/msg74532.html
v3 http://www.spinics.net/lists/linux-mm/msg74656.html
v4 http://www.spinics.net/lists/linux-mm/msg75401.html
v5 http://www.spinics.net/lists/linux-mm/msg75875.html

Cheers,
Jérôme Glisse




* [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-17 18:46 mmu_notifier: preparatory patches for hmm and or iommuv2 v6 j.glisse
@ 2014-07-17 18:46 ` j.glisse
  2014-07-21 15:01   ` Oded Gabbay
  2014-07-17 18:46 ` [PATCH 2/7] mmu_notifier: add event information to address invalidation v3 j.glisse
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: j.glisse @ 2014-07-17 18:46 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Oded Gabbay,
	Jérôme Glisse

From: Peter Zijlstra <peterz@infradead.org>

New mmu_notifier listeners are eager to clean up their structures after
the mmu_notifier::release callback. To allow this, the patch provides a
function that adds a delayed call through the mmu_notifier SRCU. It also
adds a function that calls srcu_barrier() so those listeners can
synchronize with the mmu_notifier core.
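
For illustration, here is a minimal sketch of how a listener could use
these helpers to defer freeing its structure. struct my_listener and its
callbacks are hypothetical, not part of this patch:

	#include <linux/mmu_notifier.h>
	#include <linux/slab.h>

	struct my_listener {
		struct mmu_notifier mn;
		struct rcu_head rcu;
	};

	static void my_listener_free(struct rcu_head *rcu)
	{
		kfree(container_of(rcu, struct my_listener, rcu));
	}

	/* ->release must be quick and must not block: defer the free. */
	static void my_listener_release(struct mmu_notifier *mn,
					struct mm_struct *mm)
	{
		struct my_listener *l = container_of(mn, struct my_listener, mn);

		mmu_notifier_call_srcu(&l->rcu, &my_listener_free);
	}

	/* Tear down from outside ->release: remove without extra callbacks. */
	static void my_listener_teardown(struct my_listener *l,
					 struct mm_struct *mm)
	{
		mmu_notifier_unregister_no_release(&l->mn, mm);
		mmu_notifier_call_srcu(&l->rcu, &my_listener_free);
	}

	/* Before module unload, wait for pending SRCU callbacks to run. */
	static void my_listener_module_exit(void)
	{
		mmu_notifier_synchronize();
	}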

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
---
 include/linux/mmu_notifier.h |  6 ++++++
 mm/mmu_notifier.c            | 40 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 45 insertions(+), 1 deletion(-)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index deca874..2728869 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -170,6 +170,8 @@ extern int __mmu_notifier_register(struct mmu_notifier *mn,
 				   struct mm_struct *mm);
 extern void mmu_notifier_unregister(struct mmu_notifier *mn,
 				    struct mm_struct *mm);
+extern void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
+					       struct mm_struct *mm);
 extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
 extern void __mmu_notifier_release(struct mm_struct *mm);
 extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
@@ -288,6 +290,10 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
 	set_pte_at(___mm, ___address, __ptep, ___pte);			\
 })
 
+extern void mmu_notifier_call_srcu(struct rcu_head *rcu,
+				   void (*func)(struct rcu_head *rcu));
+extern void mmu_notifier_synchronize(void);
+
 #else /* CONFIG_MMU_NOTIFIER */
 
 static inline void mmu_notifier_release(struct mm_struct *mm)
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 41cefdf..950813b 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -23,6 +23,25 @@
 static struct srcu_struct srcu;
 
 /*
+ * This function allows the mmu_notifier::release callback to delay a call to
+ * a function that will free appropriate resources. The function must be
+ * quick and must not block.
+ */
+void mmu_notifier_call_srcu(struct rcu_head *rcu,
+			    void (*func)(struct rcu_head *rcu))
+{
+	call_srcu(&srcu, rcu, func);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_call_srcu);
+
+void mmu_notifier_synchronize(void)
+{
+	/* Wait for any running method to finish. */
+	srcu_barrier(&srcu);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_synchronize);
+
+/*
  * This function can't run concurrently against mmu_notifier_register
  * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
  * runs with mm_users == 0. Other tasks may still invoke mmu notifiers
@@ -53,7 +72,6 @@ void __mmu_notifier_release(struct mm_struct *mm)
 		 */
 		if (mn->ops->release)
 			mn->ops->release(mn, mm);
-	srcu_read_unlock(&srcu, id);
 
 	spin_lock(&mm->mmu_notifier_mm->lock);
 	while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
@@ -69,6 +87,7 @@ void __mmu_notifier_release(struct mm_struct *mm)
 		hlist_del_init_rcu(&mn->hlist);
 	}
 	spin_unlock(&mm->mmu_notifier_mm->lock);
+	srcu_read_unlock(&srcu, id);
 
 	/*
 	 * synchronize_srcu here prevents mmu_notifier_release from returning to
@@ -325,6 +344,25 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
 
+/*
+ * Same as mmu_notifier_unregister but no callback and no srcu synchronization.
+ */
+void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
+					struct mm_struct *mm)
+{
+	spin_lock(&mm->mmu_notifier_mm->lock);
+	/*
+	 * Can not use list_del_rcu() since __mmu_notifier_release
+	 * can delete it before we hold the lock.
+	 */
+	hlist_del_init_rcu(&mn->hlist);
+	spin_unlock(&mm->mmu_notifier_mm->lock);
+
+	BUG_ON(atomic_read(&mm->mm_count) <= 0);
+	mmdrop(mm);
+}
+EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
+
 static int __init mmu_notifier_init(void)
 {
 	return init_srcu_struct(&srcu);
-- 
1.9.0



* [PATCH 2/7] mmu_notifier: add event information to address invalidation v3
  2014-07-17 18:46 mmu_notifier: preparatory patches for hmm and or iommuv2 v6 j.glisse
  2014-07-17 18:46 ` [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync j.glisse
@ 2014-07-17 18:46 ` j.glisse
  2014-07-17 18:46 ` [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3 j.glisse
  2014-07-24 15:46 ` mmu_notifier: preparatory patches for hmm and or iommuv2 v6 Jerome Glisse
  3 siblings, 0 replies; 13+ messages in thread
From: j.glisse @ 2014-07-17 18:46 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Oded Gabbay,
	Jérôme Glisse

From: Jérôme Glisse <jglisse@redhat.com>

The event information will be useful for new users of the mmu_notifier
API. The event argument differentiates between a vma disappearing, a page
being write protected, or simply a page being unmapped. This allows new
users to take different paths for different events; for instance, on
unmap the resources used to track a vma are still valid and should stay
around, while if the event says that a vma is being destroyed, any
resources used to track this vma can be freed.
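
To illustrate the intended use (a sketch only; my_invalidate_range_start
and the my_*_range helpers are hypothetical, not part of this patch), a
listener's invalidate_range_start callback could branch on the event:

	static void my_invalidate_range_start(struct mmu_notifier *mn,
					      struct mm_struct *mm,
					      unsigned long start,
					      unsigned long end,
					      enum mmu_event event)
	{
		switch (event) {
		case MMU_MUNMAP:
			/* Range is going away: tracking structures can be freed. */
			my_drop_range(mn, start, end);
			break;
		case MMU_WRITE_BACK:
		case MMU_WRITE_PROTECT:
			/* Only writes must stop: downgrade to read-only mappings. */
			my_write_protect_range(mn, start, end);
			break;
		default:
			/* MMU_MIGRATE and anything unknown: drop all access. */
			my_invalidate_range(mn, start, end);
			break;
		}
	}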

Changed since v1:
  - renamed action into event (updated commit message too).
  - simplified the event names and clarified their intended usage,
    also documenting what expectations the listener can have with
    respect to each event.

Changed since v2:
  - Avoid crazy names.
  - Do not move code that does not need to move.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c |   3 +-
 drivers/iommu/amd_iommu_v2.c            |  14 ++--
 drivers/misc/sgi-gru/grutlbpurge.c      |   9 ++-
 drivers/xen/gntdev.c                    |   9 ++-
 fs/proc/task_mmu.c                      |   6 +-
 include/linux/mmu_notifier.h            | 123 ++++++++++++++++++++++++++------
 kernel/events/uprobes.c                 |  10 ++-
 mm/filemap_xip.c                        |   2 +-
 mm/huge_memory.c                        |  39 ++++++----
 mm/hugetlb.c                            |  23 +++---
 mm/ksm.c                                |  18 +++--
 mm/memory.c                             |  27 ++++---
 mm/migrate.c                            |   9 ++-
 mm/mmu_notifier.c                       |  28 +++++---
 mm/mprotect.c                           |   5 +-
 mm/mremap.c                             |   6 +-
 mm/rmap.c                               |  24 +++++--
 virt/kvm/kvm_main.c                     |  12 ++--
 18 files changed, 263 insertions(+), 104 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index b41614d..8c18bb1 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -56,7 +56,8 @@ struct i915_mmu_object {
 static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 						       struct mm_struct *mm,
 						       unsigned long start,
-						       unsigned long end)
+						       unsigned long end,
+						       enum mmu_event event)
 {
 	struct i915_mmu_notifier *mn = container_of(_mn, struct i915_mmu_notifier, mn);
 	struct interval_tree_node *it = NULL;
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 92fb77c..33482ef 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -414,21 +414,25 @@ static int mn_clear_flush_young(struct mmu_notifier *mn,
 static void mn_change_pte(struct mmu_notifier *mn,
 			  struct mm_struct *mm,
 			  unsigned long address,
-			  pte_t pte)
+			  pte_t pte,
+			  enum mmu_event event)
 {
 	__mn_flush_page(mn, address);
 }
 
 static void mn_invalidate_page(struct mmu_notifier *mn,
 			       struct mm_struct *mm,
-			       unsigned long address)
+			       unsigned long address,
+			       enum mmu_event event)
 {
 	__mn_flush_page(mn, address);
 }
 
 static void mn_invalidate_range_start(struct mmu_notifier *mn,
 				      struct mm_struct *mm,
-				      unsigned long start, unsigned long end)
+				      unsigned long start,
+				      unsigned long end,
+				      enum mmu_event event)
 {
 	struct pasid_state *pasid_state;
 	struct device_state *dev_state;
@@ -449,7 +453,9 @@ static void mn_invalidate_range_start(struct mmu_notifier *mn,
 
 static void mn_invalidate_range_end(struct mmu_notifier *mn,
 				    struct mm_struct *mm,
-				    unsigned long start, unsigned long end)
+				    unsigned long start,
+				    unsigned long end,
+				    enum mmu_event event)
 {
 	struct pasid_state *pasid_state;
 	struct device_state *dev_state;
diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c
index 2129274..e67fed1 100644
--- a/drivers/misc/sgi-gru/grutlbpurge.c
+++ b/drivers/misc/sgi-gru/grutlbpurge.c
@@ -221,7 +221,8 @@ void gru_flush_all_tlb(struct gru_state *gru)
  */
 static void gru_invalidate_range_start(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
-				       unsigned long start, unsigned long end)
+				       unsigned long start, unsigned long end,
+				       enum mmu_event event)
 {
 	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
 						 ms_notifier);
@@ -235,7 +236,8 @@ static void gru_invalidate_range_start(struct mmu_notifier *mn,
 
 static void gru_invalidate_range_end(struct mmu_notifier *mn,
 				     struct mm_struct *mm, unsigned long start,
-				     unsigned long end)
+				     unsigned long end,
+				     enum mmu_event event)
 {
 	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
 						 ms_notifier);
@@ -248,7 +250,8 @@ static void gru_invalidate_range_end(struct mmu_notifier *mn,
 }
 
 static void gru_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
-				unsigned long address)
+				unsigned long address,
+				enum mmu_event event)
 {
 	struct gru_mm_struct *gms = container_of(mn, struct gru_mm_struct,
 						 ms_notifier);
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 073b4a1..fe9da94 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -428,7 +428,9 @@ static void unmap_if_in_range(struct grant_map *map,
 
 static void mn_invl_range_start(struct mmu_notifier *mn,
 				struct mm_struct *mm,
-				unsigned long start, unsigned long end)
+				unsigned long start,
+				unsigned long end,
+				enum mmu_event event)
 {
 	struct gntdev_priv *priv = container_of(mn, struct gntdev_priv, mn);
 	struct grant_map *map;
@@ -445,9 +447,10 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 
 static void mn_invl_page(struct mmu_notifier *mn,
 			 struct mm_struct *mm,
-			 unsigned long address)
+			 unsigned long address,
+			 enum mmu_event event)
 {
-	mn_invl_range_start(mn, mm, address, address + PAGE_SIZE);
+	mn_invl_range_start(mn, mm, address, address + PAGE_SIZE, event);
 }
 
 static void mn_release(struct mmu_notifier *mn,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index cfa63ee..e9e79f7 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -830,7 +830,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 		};
 		down_read(&mm->mmap_sem);
 		if (type == CLEAR_REFS_SOFT_DIRTY)
-			mmu_notifier_invalidate_range_start(mm, 0, -1);
+			mmu_notifier_invalidate_range_start(mm, 0,
+							    -1, MMU_STATUS);
 		for (vma = mm->mmap; vma; vma = vma->vm_next) {
 			cp.vma = vma;
 			if (is_vm_hugetlb_page(vma))
@@ -858,7 +859,8 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 					&clear_refs_walk);
 		}
 		if (type == CLEAR_REFS_SOFT_DIRTY)
-			mmu_notifier_invalidate_range_end(mm, 0, -1);
+			mmu_notifier_invalidate_range_end(mm, 0,
+							  -1, MMU_STATUS);
 		flush_tlb_mm(mm);
 		up_read(&mm->mmap_sem);
 		mmput(mm);
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 2728869..496ff08 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -9,6 +9,58 @@
 struct mmu_notifier;
 struct mmu_notifier_ops;
 
+/* MMU Events report fine-grained information to the callback routine, allowing
+ * the event listener to make a more informed decision as to what action to
+ * take. The event types are:
+ *
+ *   - MMU_MIGRATE: memory is migrating from one page to another, thus all write
+ *     access must stop after invalidate_range_start callback returns.
+ *     Furthermore, no read access should be allowed either, as a new page can
+ *     be remapped with write access before the invalidate_range_end callback
+ *     happens and thus any read access to old page might read stale data. There
+ *     are several sources for this event, including:
+ *
+ *         - A page moving to swap (various reasons, including page reclaim),
+ *         - An mremap syscall,
+ *         - migration for NUMA reasons,
+ *         - balancing the memory pool,
+ *         - write fault on COW page,
+ *         - and more that are not listed here.
+ *
+ *   - MMU_MPROT: memory access protection is changing. Refer to the vma to get
+ *     the new access protection. All memory accesses are still valid until the
+ *     invalidate_range_end callback.
+ *
+ *   - MMU_MUNMAP: the range is being unmapped (outcome of a munmap syscall or
+ *     process destruction). However, access is still allowed, up until the
+ *     invalidate_range_free_pages callback. This also implies that the secondary
+ *     page table can be trimmed, because the address range is no longer valid.
+ *
+ *   - MMU_WRITE_BACK: memory is being written back to disk, all write accesses
+ *     must stop after the invalidate_range_start callback returns. Read accesses
+ *     are still allowed.
+ *
+ *   - MMU_WRITE_PROTECT: memory is being write protected (ie should be mapped
+ *     read only no matter what the vma memory protection allows). All write
+ *     accesses must stop after the invalidate_range_start callback returns. Read
+ *     accesses are still allowed.
+ *
+ *   - MMU_STATUS: memory status change, like soft dirty, or huge page
+ *     splitting flag being set on pmd.
+ *
+ * If in doubt when adding a new notifier caller, please use MMU_MIGRATE,
+ * because it will always lead to reasonable behavior, but will not allow the
+ * listener a chance to optimize its events.
+ */
+enum mmu_event {
+	MMU_MIGRATE = 0,
+	MMU_MPROT,
+	MMU_MUNMAP,
+	MMU_STATUS,
+	MMU_WRITE_BACK,
+	MMU_WRITE_PROTECT,
+};
+
 #ifdef CONFIG_MMU_NOTIFIER
 
 /*
@@ -79,7 +131,8 @@ struct mmu_notifier_ops {
 	void (*change_pte)(struct mmu_notifier *mn,
 			   struct mm_struct *mm,
 			   unsigned long address,
-			   pte_t pte);
+			   pte_t pte,
+			   enum mmu_event event);
 
 	/*
 	 * Before this is invoked any secondary MMU is still ok to
@@ -90,7 +143,8 @@ struct mmu_notifier_ops {
 	 */
 	void (*invalidate_page)(struct mmu_notifier *mn,
 				struct mm_struct *mm,
-				unsigned long address);
+				unsigned long address,
+				enum mmu_event event);
 
 	/*
 	 * invalidate_range_start() and invalidate_range_end() must be
@@ -137,10 +191,14 @@ struct mmu_notifier_ops {
 	 */
 	void (*invalidate_range_start)(struct mmu_notifier *mn,
 				       struct mm_struct *mm,
-				       unsigned long start, unsigned long end);
+				       unsigned long start,
+				       unsigned long end,
+				       enum mmu_event event);
 	void (*invalidate_range_end)(struct mmu_notifier *mn,
 				     struct mm_struct *mm,
-				     unsigned long start, unsigned long end);
+				     unsigned long start,
+				     unsigned long end,
+				     enum mmu_event event);
 };
 
 /*
@@ -179,13 +237,20 @@ extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
 extern int __mmu_notifier_test_young(struct mm_struct *mm,
 				     unsigned long address);
 extern void __mmu_notifier_change_pte(struct mm_struct *mm,
-				      unsigned long address, pte_t pte);
+				      unsigned long address,
+				      pte_t pte,
+				      enum mmu_event event);
 extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address);
+					  unsigned long address,
+					  enum mmu_event event);
 extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
-				  unsigned long start, unsigned long end);
+						  unsigned long start,
+						  unsigned long end,
+						  enum mmu_event event);
 extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
-				  unsigned long start, unsigned long end);
+						unsigned long start,
+						unsigned long end,
+						enum mmu_event event);
 
 static inline void mmu_notifier_release(struct mm_struct *mm)
 {
@@ -210,31 +275,38 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm,
 }
 
 static inline void mmu_notifier_change_pte(struct mm_struct *mm,
-					   unsigned long address, pte_t pte)
+					   unsigned long address,
+					   pte_t pte,
+					   enum mmu_event event)
 {
 	if (mm_has_notifiers(mm))
-		__mmu_notifier_change_pte(mm, address, pte);
+		__mmu_notifier_change_pte(mm, address, pte, event);
 }
 
 static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
+						unsigned long address,
+						enum mmu_event event)
 {
 	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_page(mm, address);
+		__mmu_notifier_invalidate_page(mm, address, event);
 }
 
 static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+						       unsigned long start,
+						       unsigned long end,
+						       enum mmu_event event)
 {
 	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_range_start(mm, start, end);
+		__mmu_notifier_invalidate_range_start(mm, start, end, event);
 }
 
 static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+						     unsigned long start,
+						     unsigned long end,
+						     enum mmu_event event)
 {
 	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_range_end(mm, start, end);
+		__mmu_notifier_invalidate_range_end(mm, start, end, event);
 }
 
 static inline void mmu_notifier_mm_init(struct mm_struct *mm)
@@ -280,13 +352,13 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
  * old page would remain mapped readonly in the secondary MMUs after the new
  * page is already writable by some CPU through the primary MMU.
  */
-#define set_pte_at_notify(__mm, __address, __ptep, __pte)		\
+#define set_pte_at_notify(__mm, __address, __ptep, __pte, __event)	\
 ({									\
 	struct mm_struct *___mm = __mm;					\
 	unsigned long ___address = __address;				\
 	pte_t ___pte = __pte;						\
 									\
-	mmu_notifier_change_pte(___mm, ___address, ___pte);		\
+	mmu_notifier_change_pte(___mm, ___address, ___pte, __event);	\
 	set_pte_at(___mm, ___address, __ptep, ___pte);			\
 })
 
@@ -313,22 +385,29 @@ static inline int mmu_notifier_test_young(struct mm_struct *mm,
 }
 
 static inline void mmu_notifier_change_pte(struct mm_struct *mm,
-					   unsigned long address, pte_t pte)
+					   unsigned long address,
+					   pte_t pte,
+					   enum mmu_event event)
 {
 }
 
 static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
+						unsigned long address,
+						enum mmu_event event)
 {
 }
 
 static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+						       unsigned long start,
+						       unsigned long end,
+						       enum mmu_event event)
 {
 }
 
 static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+						     unsigned long start,
+						     unsigned long end,
+						     enum mmu_event event)
 {
 }
 
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 1d0af8a..62d07e9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -176,7 +176,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	/* For try_to_free_swap() and munlock_vma_page() below */
 	lock_page(page);
 
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 	err = -EAGAIN;
 	ptep = page_check_address(page, mm, addr, &ptl, 0);
 	if (!ptep)
@@ -194,7 +195,9 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 
 	flush_cache_page(vma, addr, pte_pfn(*ptep));
 	ptep_clear_flush(vma, addr, ptep);
-	set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
+	set_pte_at_notify(mm, addr, ptep,
+			  mk_pte(kpage, vma->vm_page_prot),
+			  MMU_MIGRATE);
 
 	page_remove_rmap(page);
 	if (!page_mapped(page))
@@ -208,7 +211,8 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	err = 0;
  unlock:
 	mem_cgroup_cancel_charge(kpage, memcg);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 	unlock_page(page);
 	return err;
 }
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index d8d9fe3..a2b3f09 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -198,7 +198,7 @@ retry:
 			BUG_ON(pte_dirty(pteval));
 			pte_unmap_unlock(pte, ptl);
 			/* must invalidate_page _before_ freeing the page */
-			mmu_notifier_invalidate_page(mm, address);
+			mmu_notifier_invalidate_page(mm, address, MMU_MIGRATE);
 			page_cache_release(page);
 		}
 	}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 02559ef..a2f1a20 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1029,7 +1029,8 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+					    MMU_MIGRATE);
 
 	ptl = pmd_lock(mm, pmd);
 	if (unlikely(!pmd_same(*pmd, orig_pmd)))
@@ -1063,7 +1064,8 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 	page_remove_rmap(page);
 	spin_unlock(ptl);
 
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 
 	ret |= VM_FAULT_WRITE;
 	put_page(page);
@@ -1073,7 +1075,8 @@ out:
 
 out_free_pages:
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 	for (i = 0; i < HPAGE_PMD_NR; i++) {
 		memcg = (void *)page_private(pages[i]);
 		set_page_private(pages[i], 0);
@@ -1165,7 +1168,8 @@ alloc:
 
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+					    MMU_MIGRATE);
 
 	spin_lock(ptl);
 	if (page)
@@ -1197,7 +1201,8 @@ alloc:
 	}
 	spin_unlock(ptl);
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 out:
 	return ret;
 out_unlock:
@@ -1632,7 +1637,8 @@ static int __split_huge_page_splitting(struct page *page,
 	const unsigned long mmun_start = address;
 	const unsigned long mmun_end   = address + HPAGE_PMD_SIZE;
 
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_STATUS);
 	pmd = page_check_address_pmd(page, mm, address,
 			PAGE_CHECK_ADDRESS_PMD_NOTSPLITTING_FLAG, &ptl);
 	if (pmd) {
@@ -1647,7 +1653,8 @@ static int __split_huge_page_splitting(struct page *page,
 		ret = 1;
 		spin_unlock(ptl);
 	}
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_STATUS);
 
 	return ret;
 }
@@ -2446,7 +2453,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 
 	mmun_start = address;
 	mmun_end   = address + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
 	/*
 	 * After this gup_fast can't run anymore. This also removes
@@ -2456,7 +2464,8 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 */
 	_pmd = pmdp_clear_flush(vma, address, pmd);
 	spin_unlock(pmd_ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 
 	spin_lock(pte_ptl);
 	isolated = __collapse_huge_page_isolate(vma, address, pte);
@@ -2845,24 +2854,28 @@ void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
 again:
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 	ptl = pmd_lock(mm, pmd);
 	if (unlikely(!pmd_trans_huge(*pmd))) {
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(mm, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 		return;
 	}
 	if (is_huge_zero_pmd(*pmd)) {
 		__split_huge_zero_page_pmd(vma, haddr, pmd);
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(mm, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 		return;
 	}
 	page = pmd_page(*pmd);
 	VM_BUG_ON_PAGE(!page_count(page), page);
 	get_page(page);
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 
 	split_huge_page(page);
 
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index a8d4155..fc38555 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2557,7 +2557,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	mmun_start = vma->vm_start;
 	mmun_end = vma->vm_end;
 	if (cow)
-		mmu_notifier_invalidate_range_start(src, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_start(src, mmun_start,
+						    mmun_end, MMU_MIGRATE);
 
 	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
 		spinlock_t *src_ptl, *dst_ptl;
@@ -2607,7 +2608,8 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	}
 
 	if (cow)
-		mmu_notifier_invalidate_range_end(src, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(src, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 
 	return ret;
 }
@@ -2633,7 +2635,8 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	BUG_ON(end & ~huge_page_mask(h));
 
 	tlb_start_vma(tlb, vma);
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 again:
 	for (address = start; address < end; address += sz) {
 		ptep = huge_pte_offset(mm, address);
@@ -2704,7 +2707,8 @@ unlock:
 		if (address < end && !ref_page)
 			goto again;
 	}
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 	tlb_end_vma(tlb, vma);
 }
 
@@ -2882,8 +2886,8 @@ retry_avoidcopy:
 
 	mmun_start = address & huge_page_mask(h);
 	mmun_end = mmun_start + huge_page_size(h);
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
-
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+					    MMU_MIGRATE);
 	/*
 	 * Retake the page table lock to check for racing updates
 	 * before the page tables are altered
@@ -2903,7 +2907,8 @@ retry_avoidcopy:
 		new_page = old_page;
 	}
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+					  MMU_MIGRATE);
 out_release_all:
 	page_cache_release(new_page);
 out_release_old:
@@ -3341,7 +3346,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	BUG_ON(address >= end);
 	flush_cache_range(vma, address, end);
 
-	mmu_notifier_invalidate_range_start(mm, start, end);
+	mmu_notifier_invalidate_range_start(mm, start, end, MMU_MPROT);
 	mutex_lock(&vma->vm_file->f_mapping->i_mmap_mutex);
 	for (; address < end; address += huge_page_size(h)) {
 		spinlock_t *ptl;
@@ -3371,7 +3376,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 */
 	flush_tlb_range(vma, start, end);
 	mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
-	mmu_notifier_invalidate_range_end(mm, start, end);
+	mmu_notifier_invalidate_range_end(mm, start, end, MMU_MPROT);
 
 	return pages << h->order;
 }
diff --git a/mm/ksm.c b/mm/ksm.c
index fb75902..21d210b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -872,7 +872,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 
 	mmun_start = addr;
 	mmun_end   = addr + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+					    MMU_WRITE_PROTECT);
 
 	ptep = page_check_address(page, mm, addr, &ptl, 0);
 	if (!ptep)
@@ -904,7 +905,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 		if (pte_dirty(entry))
 			set_page_dirty(page);
 		entry = pte_mkclean(pte_wrprotect(entry));
-		set_pte_at_notify(mm, addr, ptep, entry);
+		set_pte_at_notify(mm, addr, ptep, entry, MMU_WRITE_PROTECT);
 	}
 	*orig_pte = *ptep;
 	err = 0;
@@ -912,7 +913,8 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 out_unlock:
 	pte_unmap_unlock(ptep, ptl);
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+					  MMU_WRITE_PROTECT);
 out:
 	return err;
 }
@@ -948,7 +950,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 
 	mmun_start = addr;
 	mmun_end   = addr + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+					    MMU_MIGRATE);
 
 	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
 	if (!pte_same(*ptep, orig_pte)) {
@@ -961,7 +964,9 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 
 	flush_cache_page(vma, addr, pte_pfn(*ptep));
 	ptep_clear_flush(vma, addr, ptep);
-	set_pte_at_notify(mm, addr, ptep, mk_pte(kpage, vma->vm_page_prot));
+	set_pte_at_notify(mm, addr, ptep,
+			  mk_pte(kpage, vma->vm_page_prot),
+			  MMU_MIGRATE);
 
 	page_remove_rmap(page);
 	if (!page_mapped(page))
@@ -971,7 +976,8 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	pte_unmap_unlock(ptep, ptl);
 	err = 0;
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+					  MMU_MIGRATE);
 out:
 	return err;
 }
diff --git a/mm/memory.c b/mm/memory.c
index eb37dfb..4dae9f5 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1050,7 +1050,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	mmun_end   = end;
 	if (is_cow)
 		mmu_notifier_invalidate_range_start(src_mm, mmun_start,
-						    mmun_end);
+						    mmun_end, MMU_MIGRATE);
 
 	ret = 0;
 	dst_pgd = pgd_offset(dst_mm, addr);
@@ -1067,7 +1067,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
 
 	if (is_cow)
-		mmu_notifier_invalidate_range_end(src_mm, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(src_mm, mmun_start, mmun_end,
+						  MMU_MIGRATE);
 	return ret;
 }
 
@@ -1371,10 +1372,12 @@ void unmap_vmas(struct mmu_gather *tlb,
 {
 	struct mm_struct *mm = vma->vm_mm;
 
-	mmu_notifier_invalidate_range_start(mm, start_addr, end_addr);
+	mmu_notifier_invalidate_range_start(mm, start_addr,
+					    end_addr, MMU_MUNMAP);
 	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
 		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
-	mmu_notifier_invalidate_range_end(mm, start_addr, end_addr);
+	mmu_notifier_invalidate_range_end(mm, start_addr,
+					  end_addr, MMU_MUNMAP);
 }
 
 /**
@@ -1396,10 +1399,10 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, start, end);
 	update_hiwater_rss(mm);
-	mmu_notifier_invalidate_range_start(mm, start, end);
+	mmu_notifier_invalidate_range_start(mm, start, end, MMU_MUNMAP);
 	for ( ; vma && vma->vm_start < end; vma = vma->vm_next)
 		unmap_single_vma(&tlb, vma, start, end, details);
-	mmu_notifier_invalidate_range_end(mm, start, end);
+	mmu_notifier_invalidate_range_end(mm, start, end, MMU_MUNMAP);
 	tlb_finish_mmu(&tlb, start, end);
 }
 
@@ -1422,9 +1425,9 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, address, end);
 	update_hiwater_rss(mm);
-	mmu_notifier_invalidate_range_start(mm, address, end);
+	mmu_notifier_invalidate_range_start(mm, address, end, MMU_MUNMAP);
 	unmap_single_vma(&tlb, vma, address, end, details);
-	mmu_notifier_invalidate_range_end(mm, address, end);
+	mmu_notifier_invalidate_range_end(mm, address, end, MMU_MUNMAP);
 	tlb_finish_mmu(&tlb, address, end);
 }
 
@@ -2208,7 +2211,8 @@ gotten:
 
 	mmun_start  = address & PAGE_MASK;
 	mmun_end    = mmun_start + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 
 	/*
 	 * Re-check the pte - we dropped the lock
@@ -2240,7 +2244,7 @@ gotten:
 		 * mmu page tables (such as kvm shadow page tables), we want the
 		 * new page to be mapped directly into the secondary page table.
 		 */
-		set_pte_at_notify(mm, address, page_table, entry);
+		set_pte_at_notify(mm, address, page_table, entry, MMU_MIGRATE);
 		update_mmu_cache(vma, address, page_table);
 		if (old_page) {
 			/*
@@ -2279,7 +2283,8 @@ gotten:
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	if (mmun_end > mmun_start)
-		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(mm, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 	if (old_page) {
 		/*
 		 * Don't let another task, with possibly unlocked vma,
diff --git a/mm/migrate.c b/mm/migrate.c
index 7f5a424..791e265 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1820,12 +1820,14 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	WARN_ON(PageLRU(new_page));
 
 	/* Recheck the target PMD */
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 	ptl = pmd_lock(mm, pmd);
 	if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
 fail_putback:
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+		mmu_notifier_invalidate_range_end(mm, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 
 		/* Reverse changes made by migrate_page_copy() */
 		if (TestClearPageActive(new_page))
@@ -1878,7 +1880,8 @@ fail_putback:
 	page_remove_rmap(page);
 
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 
 	/* Take an "isolate" reference and put new page on the LRU. */
 	get_page(new_page);
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index 950813b..de039e4 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -141,8 +141,10 @@ int __mmu_notifier_test_young(struct mm_struct *mm,
 	return young;
 }
 
-void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
-			       pte_t pte)
+void __mmu_notifier_change_pte(struct mm_struct *mm,
+			       unsigned long address,
+			       pte_t pte,
+			       enum mmu_event event)
 {
 	struct mmu_notifier *mn;
 	int id;
@@ -150,13 +152,14 @@ void __mmu_notifier_change_pte(struct mm_struct *mm, unsigned long address,
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->change_pte)
-			mn->ops->change_pte(mn, mm, address, pte);
+			mn->ops->change_pte(mn, mm, address, pte, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 
 void __mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address)
+				    unsigned long address,
+				    enum mmu_event event)
 {
 	struct mmu_notifier *mn;
 	int id;
@@ -164,13 +167,16 @@ void __mmu_notifier_invalidate_page(struct mm_struct *mm,
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_page)
-			mn->ops->invalidate_page(mn, mm, address);
+			mn->ops->invalidate_page(mn, mm, address, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 
 void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+					   unsigned long start,
+					   unsigned long end,
+					   enum mmu_event event)
+
 {
 	struct mmu_notifier *mn;
 	int id;
@@ -178,14 +184,17 @@ void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_start)
-			mn->ops->invalidate_range_start(mn, mm, start, end);
+			mn->ops->invalidate_range_start(mn, mm, start,
+							end, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start);
 
 void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
-				  unsigned long start, unsigned long end)
+					 unsigned long start,
+					 unsigned long end,
+					 enum mmu_event event)
 {
 	struct mmu_notifier *mn;
 	int id;
@@ -193,7 +202,8 @@ void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_end)
-			mn->ops->invalidate_range_end(mn, mm, start, end);
+			mn->ops->invalidate_range_end(mn, mm, start,
+						      end, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
diff --git a/mm/mprotect.c b/mm/mprotect.c
index c43d557..886405b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -157,7 +157,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!mni_start) {
 			mni_start = addr;
-			mmu_notifier_invalidate_range_start(mm, mni_start, end);
+			mmu_notifier_invalidate_range_start(mm, mni_start,
+							    end, MMU_MPROT);
 		}
 
 		if (pmd_trans_huge(*pmd)) {
@@ -185,7 +186,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	} while (pmd++, addr = next, addr != end);
 
 	if (mni_start)
-		mmu_notifier_invalidate_range_end(mm, mni_start, end);
+		mmu_notifier_invalidate_range_end(mm, mni_start, end, MMU_MPROT);
 
 	if (nr_huge_updates)
 		count_vm_numa_events(NUMA_HUGE_PTE_UPDATES, nr_huge_updates);
diff --git a/mm/mremap.c b/mm/mremap.c
index 05f1180..6827d2f 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -177,7 +177,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 
 	mmun_start = old_addr;
 	mmun_end   = old_end;
-	mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start,
+					    mmun_end, MMU_MIGRATE);
 
 	for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
 		cond_resched();
@@ -228,7 +229,8 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 	if (likely(need_flush))
 		flush_tlb_range(vma, old_end-len, old_addr);
 
-	mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start,
+					  mmun_end, MMU_MIGRATE);
 
 	return len + old_addr - old_end;	/* how much done */
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index 3e8491c..6b053cd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -840,7 +840,7 @@ static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	pte_unmap_unlock(pte, ptl);
 
 	if (ret) {
-		mmu_notifier_invalidate_page(mm, address);
+		mmu_notifier_invalidate_page(mm, address, MMU_WRITE_BACK);
 		(*cleaned)++;
 	}
 out:
@@ -1128,6 +1128,10 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	spinlock_t *ptl;
 	int ret = SWAP_AGAIN;
 	enum ttu_flags flags = (enum ttu_flags)arg;
+	enum mmu_event event = MMU_MIGRATE;
+
+	if (flags & TTU_MUNLOCK)
+		event = MMU_STATUS;
 
 	pte = page_check_address(page, mm, address, &ptl, 0);
 	if (!pte)
@@ -1233,7 +1237,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret != SWAP_FAIL && !(flags & TTU_MUNLOCK))
-		mmu_notifier_invalidate_page(mm, address);
+		mmu_notifier_invalidate_page(mm, address, event);
 out:
 	return ret;
 
@@ -1287,7 +1291,9 @@ out_mlock:
 #define CLUSTER_MASK	(~(CLUSTER_SIZE - 1))
 
 static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
-		struct vm_area_struct *vma, struct page *check_page)
+				struct vm_area_struct *vma,
+				struct page *check_page,
+				enum ttu_flags flags)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t *pmd;
@@ -1301,6 +1307,10 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 	unsigned long end;
 	int ret = SWAP_AGAIN;
 	int locked_vma = 0;
+	enum mmu_event event = MMU_MIGRATE;
+
+	if (flags & TTU_MUNLOCK)
+		event = MMU_STATUS;
 
 	address = (vma->vm_start + cursor) & CLUSTER_MASK;
 	end = address + CLUSTER_SIZE;
@@ -1315,7 +1325,7 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 
 	mmun_start = address;
 	mmun_end   = end;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end, event);
 
 	/*
 	 * If we can acquire the mmap_sem for read, and vma is VM_LOCKED,
@@ -1380,7 +1390,7 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 		(*mapcount)--;
 	}
 	pte_unmap_unlock(pte - 1, ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
+	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end, event);
 	if (locked_vma)
 		up_read(&vma->vm_mm->mmap_sem);
 	return ret;
@@ -1436,7 +1446,9 @@ static int try_to_unmap_nonlinear(struct page *page,
 			while (cursor < max_nl_cursor &&
 				cursor < vma->vm_end - vma->vm_start) {
 				if (try_to_unmap_cluster(cursor, &mapcount,
-						vma, page) == SWAP_MLOCK)
+							 vma, page,
+							 (enum ttu_flags)arg)
+							 == SWAP_MLOCK)
 					ret = SWAP_MLOCK;
 				cursor += CLUSTER_SIZE;
 				vma->vm_private_data = (void *) cursor;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 4b6c01b..6e1992f 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -262,7 +262,8 @@ static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
 
 static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 					     struct mm_struct *mm,
-					     unsigned long address)
+					     unsigned long address,
+					     enum mmu_event event)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	int need_tlb_flush, idx;
@@ -301,7 +302,8 @@ static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
 static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 					struct mm_struct *mm,
 					unsigned long address,
-					pte_t pte)
+					pte_t pte,
+					enum mmu_event event)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	int idx;
@@ -317,7 +319,8 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 						    struct mm_struct *mm,
 						    unsigned long start,
-						    unsigned long end)
+						    unsigned long end,
+						    enum mmu_event event)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 	int need_tlb_flush = 0, idx;
@@ -343,7 +346,8 @@ static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
 						  struct mm_struct *mm,
 						  unsigned long start,
-						  unsigned long end)
+						  unsigned long end,
+						  enum mmu_event event)
 {
 	struct kvm *kvm = mmu_notifier_to_kvm(mn);
 
-- 
1.9.0



* [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3
  2014-07-17 18:46 mmu_notifier: preparatory patches for hmm and or iommuv2 v6 j.glisse
  2014-07-17 18:46 ` [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync j.glisse
  2014-07-17 18:46 ` [PATCH 2/7] mmu_notifier: add event information to address invalidation v3 j.glisse
@ 2014-07-17 18:46 ` j.glisse
  2014-07-24 22:51   ` Andrew Morton
  2014-07-24 15:46 ` mmu_notifier: preparatory patches for hmm and or iommuv2 v6 Jerome Glisse
  3 siblings, 1 reply; 13+ messages in thread
From: j.glisse @ 2014-07-17 18:46 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Oded Gabbay,
	Jérôme Glisse

From: Jérôme Glisse <jglisse@redhat.com>

New users of the mmu_notifier interface need to look up the vma in order
to perform their invalidation operation. Instead of redoing a vma lookup
inside the callback, just pass the vma through from the call site, where
it is already available.

This needs a small refactoring in memory.c to call invalidate_range on
vma boundaries, while previously it was called once for a larger range.
The affected functions are (a sketch of the resulting call pattern
follows the list):
  - unmap_vmas(): called either from exit_mmap or from the munmap
  syscall. For exit_mmap the overhead might be significant, but this
  only happens when a process exits. For the munmap case, the overhead
  should be low to none if we assume that munmap is usually used on a
  per-vma basis.
  - zap_page_range(): the only calls to zap_page_range that are not
  already clamped to a single vma come from the madvise syscall. I think
  it is a fair assumption that even in this case the overhead will not
  hurt performance.
  - zap_page_range_single(): no impact, as this function was already
  per-vma.
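
A sketch of the resulting per-vma pattern in unmap_vmas(), assuming the
signatures introduced by this patch (illustrative, not the verbatim diff):

	/* One start/end pair per vma instead of one for the whole range. */
	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next) {
		mmu_notifier_invalidate_range_start(vma, vma->vm_start,
						    vma->vm_end, MMU_MUNMAP);
		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
		mmu_notifier_invalidate_range_end(vma, vma->vm_start,
						  vma->vm_end, MMU_MUNMAP);
	}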

Changed since v1:
  - Only pass through the vma.
  - Commit comment.

Changed since v2:
  - Updated the commit message to evaluate the impact of per-vma
    mmu_notifier calls, where previously a single call could cover a
    range spanning several vmas.

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
---
 drivers/gpu/drm/i915/i915_gem_userptr.c |  1 +
 drivers/iommu/amd_iommu_v2.c            |  6 ++---
 drivers/misc/sgi-gru/grutlbpurge.c      |  8 ++++---
 drivers/xen/gntdev.c                    |  6 ++---
 fs/proc/task_mmu.c                      | 16 ++++++++-----
 include/linux/mmu_notifier.h            | 41 +++++++++++++++++----------------
 kernel/events/uprobes.c                 |  4 ++--
 mm/filemap_xip.c                        |  3 ++-
 mm/huge_memory.c                        | 26 ++++++++++-----------
 mm/hugetlb.c                            | 16 ++++++-------
 mm/ksm.c                                |  8 +++----
 mm/memory.c                             | 30 ++++++++++++------------
 mm/migrate.c                            |  6 ++---
 mm/mmu_notifier.c                       | 15 +++++++-----
 mm/mprotect.c                           |  6 ++---
 mm/mremap.c                             |  4 ++--
 mm/rmap.c                               |  9 ++++----
 virt/kvm/kvm_main.c                     |  6 ++---
 18 files changed, 112 insertions(+), 99 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem_userptr.c b/drivers/gpu/drm/i915/i915_gem_userptr.c
index 8c18bb1..84a408f 100644
--- a/drivers/gpu/drm/i915/i915_gem_userptr.c
+++ b/drivers/gpu/drm/i915/i915_gem_userptr.c
@@ -55,6 +55,7 @@ struct i915_mmu_object {
 
 static void i915_gem_userptr_mn_invalidate_range_start(struct mmu_notifier *_mn,
 						       struct mm_struct *mm,
+						       struct vm_area_struct *vma,
 						       unsigned long start,
 						       unsigned long end,
 						       enum mmu_event event)
diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 33482ef..9a665a0 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -421,7 +421,7 @@ static void mn_change_pte(struct mmu_notifier *mn,
 }
 
 static void mn_invalidate_page(struct mmu_notifier *mn,
-			       struct mm_struct *mm,
+			       struct vm_area_struct *vma,
 			       unsigned long address,
 			       enum mmu_event event)
 {
@@ -429,7 +429,7 @@ static void mn_invalidate_page(struct mmu_notifier *mn,
 }
 
 static void mn_invalidate_range_start(struct mmu_notifier *mn,
-				      struct mm_struct *mm,
+				      struct vm_area_struct *vma,
 				      unsigned long start,
 				      unsigned long end,
 				      enum mmu_event event)
@@ -452,7 +452,7 @@ static void mn_invalidate_range_start(struct mmu_notifier *mn,
 }
 
 static void mn_invalidate_range_end(struct mmu_notifier *mn,
-				    struct mm_struct *mm,
+				    struct vm_area_struct *vma,
 				    unsigned long start,
 				    unsigned long end,
 				    enum mmu_event event)
diff --git a/drivers/misc/sgi-gru/grutlbpurge.c b/drivers/misc/sgi-gru/grutlbpurge.c
index e67fed1..ef29b45 100644
--- a/drivers/misc/sgi-gru/grutlbpurge.c
+++ b/drivers/misc/sgi-gru/grutlbpurge.c
@@ -220,7 +220,7 @@ void gru_flush_all_tlb(struct gru_state *gru)
  * MMUOPS notifier callout functions
  */
 static void gru_invalidate_range_start(struct mmu_notifier *mn,
-				       struct mm_struct *mm,
+				       struct vm_area_struct *vma,
 				       unsigned long start, unsigned long end,
 				       enum mmu_event event)
 {
@@ -235,7 +235,8 @@ static void gru_invalidate_range_start(struct mmu_notifier *mn,
 }
 
 static void gru_invalidate_range_end(struct mmu_notifier *mn,
-				     struct mm_struct *mm, unsigned long start,
+				     struct vm_area_struct *vma,
+				     unsigned long start,
 				     unsigned long end,
 				     enum mmu_event event)
 {
@@ -249,7 +250,8 @@ static void gru_invalidate_range_end(struct mmu_notifier *mn,
 	gru_dbg(grudev, "gms %p, start 0x%lx, end 0x%lx\n", gms, start, end);
 }
 
-static void gru_invalidate_page(struct mmu_notifier *mn, struct mm_struct *mm,
+static void gru_invalidate_page(struct mmu_notifier *mn,
+				struct vm_area_struct *vma,
 				unsigned long address,
 				enum mmu_event event)
 {
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index fe9da94..768f425 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -427,7 +427,7 @@ static void unmap_if_in_range(struct grant_map *map,
 }
 
 static void mn_invl_range_start(struct mmu_notifier *mn,
-				struct mm_struct *mm,
+				struct vm_area_struct *vma,
 				unsigned long start,
 				unsigned long end,
 				enum mmu_event event)
@@ -446,11 +446,11 @@ static void mn_invl_range_start(struct mmu_notifier *mn,
 }
 
 static void mn_invl_page(struct mmu_notifier *mn,
-			 struct mm_struct *mm,
+			 struct vm_area_struct *vma,
 			 unsigned long address,
 			 enum mmu_event event)
 {
-	mn_invl_range_start(mn, mm, address, address + PAGE_SIZE, event);
+	mn_invl_range_start(mn, vma, address, address + PAGE_SIZE, event);
 }
 
 static void mn_release(struct mmu_notifier *mn,
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index e9e79f7..d1ed285 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -829,13 +829,15 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 			.private = &cp,
 		};
 		down_read(&mm->mmap_sem);
-		if (type == CLEAR_REFS_SOFT_DIRTY)
-			mmu_notifier_invalidate_range_start(mm, 0,
-							    -1, MMU_STATUS);
 		for (vma = mm->mmap; vma; vma = vma->vm_next) {
 			cp.vma = vma;
 			if (is_vm_hugetlb_page(vma))
 				continue;
+			if (type == CLEAR_REFS_SOFT_DIRTY)
+				mmu_notifier_invalidate_range_start(vma,
+								    vma->vm_start,
+								    vma->vm_end,
+								    MMU_STATUS);
 			/*
 			 * Writing 1 to /proc/pid/clear_refs affects all pages.
 			 *
@@ -857,10 +859,12 @@ static ssize_t clear_refs_write(struct file *file, const char __user *buf,
 			}
 			walk_page_range(vma->vm_start, vma->vm_end,
 					&clear_refs_walk);
+			if (type == CLEAR_REFS_SOFT_DIRTY)
+				mmu_notifier_invalidate_range_end(vma,
+								  vma->vm_start,
+								  vma->vm_end,
+								  MMU_STATUS);
 		}
-		if (type == CLEAR_REFS_SOFT_DIRTY)
-			mmu_notifier_invalidate_range_end(mm, 0,
-							  -1, MMU_STATUS);
 		flush_tlb_mm(mm);
 		up_read(&mm->mmap_sem);
 		mmput(mm);
diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index 496ff08..798ecfa 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -142,7 +142,7 @@ struct mmu_notifier_ops {
 	 * be called internally to this method.
 	 */
 	void (*invalidate_page)(struct mmu_notifier *mn,
-				struct mm_struct *mm,
+				struct vm_area_struct *vma,
 				unsigned long address,
 				enum mmu_event event);
 
@@ -190,12 +190,12 @@ struct mmu_notifier_ops {
 	 * the last refcount is dropped.
 	 */
 	void (*invalidate_range_start)(struct mmu_notifier *mn,
-				       struct mm_struct *mm,
+				       struct vm_area_struct *vma,
 				       unsigned long start,
 				       unsigned long end,
 				       enum mmu_event event);
 	void (*invalidate_range_end)(struct mmu_notifier *mn,
-				     struct mm_struct *mm,
+				     struct vm_area_struct *vma,
 				     unsigned long start,
 				     unsigned long end,
 				     enum mmu_event event);
@@ -240,14 +240,14 @@ extern void __mmu_notifier_change_pte(struct mm_struct *mm,
 				      unsigned long address,
 				      pte_t pte,
 				      enum mmu_event event);
-extern void __mmu_notifier_invalidate_page(struct mm_struct *mm,
-					  unsigned long address,
-					  enum mmu_event event);
-extern void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
+extern void __mmu_notifier_invalidate_page(struct vm_area_struct *vma,
+					   unsigned long address,
+					   enum mmu_event event);
+extern void __mmu_notifier_invalidate_range_start(struct vm_area_struct *vma,
 						  unsigned long start,
 						  unsigned long end,
 						  enum mmu_event event);
-extern void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
+extern void __mmu_notifier_invalidate_range_end(struct vm_area_struct *vma,
 						unsigned long start,
 						unsigned long end,
 						enum mmu_event event);
@@ -283,30 +283,31 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 		__mmu_notifier_change_pte(mm, address, pte, event);
 }
 
-static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_page(struct vm_area_struct *vma,
 						unsigned long address,
 						enum mmu_event event)
 {
-	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_page(mm, address, event);
+	if (mm_has_notifiers(vma->vm_mm))
+		__mmu_notifier_invalidate_page(vma, address, event);
 }
 
-static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_range_start(struct vm_area_struct *vma,
 						       unsigned long start,
 						       unsigned long end,
 						       enum mmu_event event)
 {
-	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_range_start(mm, start, end, event);
+	if (mm_has_notifiers(vma->vm_mm))
+		__mmu_notifier_invalidate_range_start(vma, start,
+						      end, event);
 }
 
-static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_range_end(struct vm_area_struct *vma,
 						     unsigned long start,
 						     unsigned long end,
 						     enum mmu_event event)
 {
-	if (mm_has_notifiers(mm))
-		__mmu_notifier_invalidate_range_end(mm, start, end, event);
+	if (mm_has_notifiers(vma->vm_mm))
+		__mmu_notifier_invalidate_range_end(vma, start, end, event);
 }
 
 static inline void mmu_notifier_mm_init(struct mm_struct *mm)
@@ -391,20 +392,20 @@ static inline void mmu_notifier_change_pte(struct mm_struct *mm,
 {
 }
 
-static inline void mmu_notifier_invalidate_page(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_page(struct vm_area_struct *vma,
 						unsigned long address,
 						enum mmu_event event)
 {
 }
 
-static inline void mmu_notifier_invalidate_range_start(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_range_start(struct vm_area_struct *vma,
 						       unsigned long start,
 						       unsigned long end,
 						       enum mmu_event event)
 {
 }
 
-static inline void mmu_notifier_invalidate_range_end(struct mm_struct *mm,
+static inline void mmu_notifier_invalidate_range_end(struct vm_area_struct *vma,
 						     unsigned long start,
 						     unsigned long end,
 						     enum mmu_event event)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 62d07e9..8c65320 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -176,7 +176,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	/* For try_to_free_swap() and munlock_vma_page() below */
 	lock_page(page);
 
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 	err = -EAGAIN;
 	ptep = page_check_address(page, mm, addr, &ptl, 0);
@@ -211,7 +211,7 @@ static int __replace_page(struct vm_area_struct *vma, unsigned long addr,
 	err = 0;
  unlock:
 	mem_cgroup_cancel_charge(kpage, memcg);
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 	unlock_page(page);
 	return err;
diff --git a/mm/filemap_xip.c b/mm/filemap_xip.c
index a2b3f09..f0113df 100644
--- a/mm/filemap_xip.c
+++ b/mm/filemap_xip.c
@@ -198,7 +198,8 @@ retry:
 			BUG_ON(pte_dirty(pteval));
 			pte_unmap_unlock(pte, ptl);
 			/* must invalidate_page _before_ freeing the page */
-			mmu_notifier_invalidate_page(mm, address, MMU_MIGRATE);
+			mmu_notifier_invalidate_page(vma, address,
+						     MMU_MIGRATE);
 			page_cache_release(page);
 		}
 	}
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a2f1a20..d60043b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1029,7 +1029,7 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_start(vma, mmun_start, mmun_end,
 					    MMU_MIGRATE);
 
 	ptl = pmd_lock(mm, pmd);
@@ -1064,7 +1064,7 @@ static int do_huge_pmd_wp_page_fallback(struct mm_struct *mm,
 	page_remove_rmap(page);
 	spin_unlock(ptl);
 
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 
 	ret |= VM_FAULT_WRITE;
@@ -1075,7 +1075,7 @@ out:
 
 out_free_pages:
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 	for (i = 0; i < HPAGE_PMD_NR; i++) {
 		memcg = (void *)page_private(pages[i]);
@@ -1168,7 +1168,7 @@ alloc:
 
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_start(vma, mmun_start, mmun_end,
 					    MMU_MIGRATE);
 
 	spin_lock(ptl);
@@ -1201,7 +1201,7 @@ alloc:
 	}
 	spin_unlock(ptl);
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 out:
 	return ret;
@@ -1637,7 +1637,7 @@ static int __split_huge_page_splitting(struct page *page,
 	const unsigned long mmun_start = address;
 	const unsigned long mmun_end   = address + HPAGE_PMD_SIZE;
 
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_STATUS);
 	pmd = page_check_address_pmd(page, mm, address,
 			PAGE_CHECK_ADDRESS_PMD_NOTSPLITTING_FLAG, &ptl);
@@ -1653,7 +1653,7 @@ static int __split_huge_page_splitting(struct page *page,
 		ret = 1;
 		spin_unlock(ptl);
 	}
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_STATUS);
 
 	return ret;
@@ -2453,7 +2453,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 
 	mmun_start = address;
 	mmun_end   = address + HPAGE_PMD_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 	pmd_ptl = pmd_lock(mm, pmd); /* probably unnecessary */
 	/*
@@ -2464,7 +2464,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 */
 	_pmd = pmdp_clear_flush(vma, address, pmd);
 	spin_unlock(pmd_ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 
 	spin_lock(pte_ptl);
@@ -2854,19 +2854,19 @@ void __split_huge_page_pmd(struct vm_area_struct *vma, unsigned long address,
 	mmun_start = haddr;
 	mmun_end   = haddr + HPAGE_PMD_SIZE;
 again:
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 	ptl = pmd_lock(mm, pmd);
 	if (unlikely(!pmd_trans_huge(*pmd))) {
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start,
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
 						  mmun_end, MMU_MIGRATE);
 		return;
 	}
 	if (is_huge_zero_pmd(*pmd)) {
 		__split_huge_zero_page_pmd(vma, haddr, pmd);
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start,
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
 						  mmun_end, MMU_MIGRATE);
 		return;
 	}
@@ -2874,7 +2874,7 @@ again:
 	VM_BUG_ON_PAGE(!page_count(page), page);
 	get_page(page);
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 
 	split_huge_page(page);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fc38555..e2736f4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -2557,7 +2557,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	mmun_start = vma->vm_start;
 	mmun_end = vma->vm_end;
 	if (cow)
-		mmu_notifier_invalidate_range_start(src, mmun_start,
+		mmu_notifier_invalidate_range_start(vma, mmun_start,
 						    mmun_end, MMU_MIGRATE);
 
 	for (addr = vma->vm_start; addr < vma->vm_end; addr += sz) {
@@ -2608,7 +2608,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 	}
 
 	if (cow)
-		mmu_notifier_invalidate_range_end(src, mmun_start,
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
 						  mmun_end, MMU_MIGRATE);
 
 	return ret;
@@ -2635,7 +2635,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma,
 	BUG_ON(end & ~huge_page_mask(h));
 
 	tlb_start_vma(tlb, vma);
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 again:
 	for (address = start; address < end; address += sz) {
@@ -2707,7 +2707,7 @@ unlock:
 		if (address < end && !ref_page)
 			goto again;
 	}
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 	tlb_end_vma(tlb, vma);
 }
@@ -2886,7 +2886,7 @@ retry_avoidcopy:
 
 	mmun_start = address & huge_page_mask(h);
 	mmun_end = mmun_start + huge_page_size(h);
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_start(vma, mmun_start, mmun_end,
 					    MMU_MIGRATE);
 	/*
 	 * Retake the page table lock to check for racing updates
@@ -2907,7 +2907,7 @@ retry_avoidcopy:
 		new_page = old_page;
 	}
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_end(vma, mmun_start, mmun_end,
 					  MMU_MIGRATE);
 out_release_all:
 	page_cache_release(new_page);
@@ -3346,7 +3346,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	BUG_ON(address >= end);
 	flush_cache_range(vma, address, end);
 
-	mmu_notifier_invalidate_range_start(mm, start, end, MMU_MPROT);
+	mmu_notifier_invalidate_range_start(vma, start, end, MMU_MPROT);
 	mutex_lock(&vma->vm_file->f_mapping->i_mmap_mutex);
 	for (; address < end; address += huge_page_size(h)) {
 		spinlock_t *ptl;
@@ -3376,7 +3376,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma,
 	 */
 	flush_tlb_range(vma, start, end);
 	mutex_unlock(&vma->vm_file->f_mapping->i_mmap_mutex);
-	mmu_notifier_invalidate_range_end(mm, start, end, MMU_MPROT);
+	mmu_notifier_invalidate_range_end(vma, start, end, MMU_MPROT);
 
 	return pages << h->order;
 }
diff --git a/mm/ksm.c b/mm/ksm.c
index 21d210b..d64c0b4 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -872,7 +872,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 
 	mmun_start = addr;
 	mmun_end   = addr + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_start(vma, mmun_start, mmun_end,
 					    MMU_WRITE_PROTECT);
 
 	ptep = page_check_address(page, mm, addr, &ptl, 0);
@@ -913,7 +913,7 @@ static int write_protect_page(struct vm_area_struct *vma, struct page *page,
 out_unlock:
 	pte_unmap_unlock(ptep, ptl);
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_end(vma, mmun_start, mmun_end,
 					  MMU_WRITE_PROTECT);
 out:
 	return err;
@@ -950,7 +950,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 
 	mmun_start = addr;
 	mmun_end   = addr + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_start(vma, mmun_start, mmun_end,
 					    MMU_MIGRATE);
 
 	ptep = pte_offset_map_lock(mm, pmd, addr, &ptl);
@@ -976,7 +976,7 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 	pte_unmap_unlock(ptep, ptl);
 	err = 0;
 out_mn:
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end,
+	mmu_notifier_invalidate_range_end(vma, mmun_start, mmun_end,
 					  MMU_MIGRATE);
 out:
 	return err;
diff --git a/mm/memory.c b/mm/memory.c
index 4dae9f5..409cace 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1049,7 +1049,7 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	mmun_start = addr;
 	mmun_end   = end;
 	if (is_cow)
-		mmu_notifier_invalidate_range_start(src_mm, mmun_start,
+		mmu_notifier_invalidate_range_start(vma, mmun_start,
 						    mmun_end, MMU_MIGRATE);
 
 	ret = 0;
@@ -1067,8 +1067,8 @@ int copy_page_range(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
 
 	if (is_cow)
-		mmu_notifier_invalidate_range_end(src_mm, mmun_start, mmun_end,
-						  MMU_MIGRATE);
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
+						  mmun_end, MMU_MIGRATE);
 	return ret;
 }
 
@@ -1319,6 +1319,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 	if (end <= vma->vm_start)
 		return;
 
+	mmu_notifier_invalidate_range_start(vma,
+					    max(start_addr, vma->vm_start),
+					    min(end_addr, vma->vm_end),
+					    MMU_MUNMAP);
+
 	if (vma->vm_file)
 		uprobe_munmap(vma, start, end);
 
@@ -1346,6 +1351,11 @@ static void unmap_single_vma(struct mmu_gather *tlb,
 		} else
 			unmap_page_range(tlb, vma, start, end, details);
 	}
+
+	mmu_notifier_invalidate_range_end(vma,
+					  max(start_addr, vma->vm_start),
+					  min(end_addr, vma->vm_end),
+					  MMU_MUNMAP);
 }
 
 /**
@@ -1370,14 +1380,8 @@ void unmap_vmas(struct mmu_gather *tlb,
 		struct vm_area_struct *vma, unsigned long start_addr,
 		unsigned long end_addr)
 {
-	struct mm_struct *mm = vma->vm_mm;
-
-	mmu_notifier_invalidate_range_start(mm, start_addr,
-					    end_addr, MMU_MUNMAP);
 	for ( ; vma && vma->vm_start < end_addr; vma = vma->vm_next)
 		unmap_single_vma(tlb, vma, start_addr, end_addr, NULL);
-	mmu_notifier_invalidate_range_end(mm, start_addr,
-					  end_addr, MMU_MUNMAP);
 }
 
 /**
@@ -1399,10 +1403,8 @@ void zap_page_range(struct vm_area_struct *vma, unsigned long start,
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, start, end);
 	update_hiwater_rss(mm);
-	mmu_notifier_invalidate_range_start(mm, start, end, MMU_MUNMAP);
 	for ( ; vma && vma->vm_start < end; vma = vma->vm_next)
 		unmap_single_vma(&tlb, vma, start, end, details);
-	mmu_notifier_invalidate_range_end(mm, start, end, MMU_MUNMAP);
 	tlb_finish_mmu(&tlb, start, end);
 }
 
@@ -1425,9 +1427,7 @@ static void zap_page_range_single(struct vm_area_struct *vma, unsigned long addr
 	lru_add_drain();
 	tlb_gather_mmu(&tlb, mm, address, end);
 	update_hiwater_rss(mm);
-	mmu_notifier_invalidate_range_start(mm, address, end, MMU_MUNMAP);
 	unmap_single_vma(&tlb, vma, address, end, details);
-	mmu_notifier_invalidate_range_end(mm, address, end, MMU_MUNMAP);
 	tlb_finish_mmu(&tlb, address, end);
 }
 
@@ -2211,7 +2211,7 @@ gotten:
 
 	mmun_start  = address & PAGE_MASK;
 	mmun_end    = mmun_start + PAGE_SIZE;
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 
 	/*
@@ -2283,7 +2283,7 @@ gotten:
 unlock:
 	pte_unmap_unlock(page_table, ptl);
 	if (mmun_end > mmun_start)
-		mmu_notifier_invalidate_range_end(mm, mmun_start,
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
 						  mmun_end, MMU_MIGRATE);
 	if (old_page) {
 		/*
diff --git a/mm/migrate.c b/mm/migrate.c
index 791e265..73cdf4d 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1820,13 +1820,13 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	WARN_ON(PageLRU(new_page));
 
 	/* Recheck the target PMD */
-	mmu_notifier_invalidate_range_start(mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 	ptl = pmd_lock(mm, pmd);
 	if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
 fail_putback:
 		spin_unlock(ptl);
-		mmu_notifier_invalidate_range_end(mm, mmun_start,
+		mmu_notifier_invalidate_range_end(vma, mmun_start,
 						  mmun_end, MMU_MIGRATE);
 
 		/* Reverse changes made by migrate_page_copy() */
@@ -1880,7 +1880,7 @@ fail_putback:
 	page_remove_rmap(page);
 
 	spin_unlock(ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 
 	/* Take an "isolate" reference and put new page on the LRU. */
diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
index de039e4..5d5fe2d 100644
--- a/mm/mmu_notifier.c
+++ b/mm/mmu_notifier.c
@@ -157,52 +157,55 @@ void __mmu_notifier_change_pte(struct mm_struct *mm,
 	srcu_read_unlock(&srcu, id);
 }
 
-void __mmu_notifier_invalidate_page(struct mm_struct *mm,
+void __mmu_notifier_invalidate_page(struct vm_area_struct *vma,
 				    unsigned long address,
 				    enum mmu_event event)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	struct mmu_notifier *mn;
 	int id;
 
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_page)
-			mn->ops->invalidate_page(mn, mm, address, event);
+			mn->ops->invalidate_page(mn, vma, address, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 
-void __mmu_notifier_invalidate_range_start(struct mm_struct *mm,
+void __mmu_notifier_invalidate_range_start(struct vm_area_struct *vma,
 					   unsigned long start,
 					   unsigned long end,
 					   enum mmu_event event)
 
 {
+	struct mm_struct *mm = vma->vm_mm;
 	struct mmu_notifier *mn;
 	int id;
 
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_start)
-			mn->ops->invalidate_range_start(mn, mm, start,
+			mn->ops->invalidate_range_start(mn, vma, start,
 							end, event);
 	}
 	srcu_read_unlock(&srcu, id);
 }
 EXPORT_SYMBOL_GPL(__mmu_notifier_invalidate_range_start);
 
-void __mmu_notifier_invalidate_range_end(struct mm_struct *mm,
+void __mmu_notifier_invalidate_range_end(struct vm_area_struct *vma,
 					 unsigned long start,
 					 unsigned long end,
 					 enum mmu_event event)
 {
+	struct mm_struct *mm = vma->vm_mm;
 	struct mmu_notifier *mn;
 	int id;
 
 	id = srcu_read_lock(&srcu);
 	hlist_for_each_entry_rcu(mn, &mm->mmu_notifier_mm->list, hlist) {
 		if (mn->ops->invalidate_range_end)
-			mn->ops->invalidate_range_end(mn, mm, start,
+			mn->ops->invalidate_range_end(mn, vma, start,
 						      end, event);
 	}
 	srcu_read_unlock(&srcu, id);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 886405b..fdcb254 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -140,7 +140,6 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		pgprot_t newprot, int dirty_accountable, int prot_numa)
 {
 	pmd_t *pmd;
-	struct mm_struct *mm = vma->vm_mm;
 	unsigned long next;
 	unsigned long pages = 0;
 	unsigned long nr_huge_updates = 0;
@@ -157,7 +156,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!mni_start) {
 			mni_start = addr;
-			mmu_notifier_invalidate_range_start(mm, mni_start,
+			mmu_notifier_invalidate_range_start(vma, mni_start,
 							    end, MMU_MPROT);
 		}
 
@@ -186,7 +185,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 	} while (pmd++, addr = next, addr != end);
 
 	if (mni_start)
-		mmu_notifier_invalidate_range_end(mm, mni_start, end, MMU_MPROT);
+		mmu_notifier_invalidate_range_end(vma, mni_start,
+						  end, MMU_MPROT);
 
 	if (nr_huge_updates)
 		count_vm_numa_events(NUMA_HUGE_PTE_UPDATES, nr_huge_updates);
diff --git a/mm/mremap.c b/mm/mremap.c
index 6827d2f..a223c20 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -177,7 +177,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 
 	mmun_start = old_addr;
 	mmun_end   = old_end;
-	mmu_notifier_invalidate_range_start(vma->vm_mm, mmun_start,
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
 					    mmun_end, MMU_MIGRATE);
 
 	for (; old_addr < old_end; old_addr += extent, new_addr += extent) {
@@ -229,7 +229,7 @@ unsigned long move_page_tables(struct vm_area_struct *vma,
 	if (likely(need_flush))
 		flush_tlb_range(vma, old_end-len, old_addr);
 
-	mmu_notifier_invalidate_range_end(vma->vm_mm, mmun_start,
+	mmu_notifier_invalidate_range_end(vma, mmun_start,
 					  mmun_end, MMU_MIGRATE);
 
 	return len + old_addr - old_end;	/* how much done */
diff --git a/mm/rmap.c b/mm/rmap.c
index 6b053cd..c2382c7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -840,7 +840,7 @@ static int page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 	pte_unmap_unlock(pte, ptl);
 
 	if (ret) {
-		mmu_notifier_invalidate_page(mm, address, MMU_WRITE_BACK);
+		mmu_notifier_invalidate_page(vma, address, MMU_WRITE_BACK);
 		(*cleaned)++;
 	}
 out:
@@ -1237,7 +1237,7 @@ static int try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret != SWAP_FAIL && !(flags & TTU_MUNLOCK))
-		mmu_notifier_invalidate_page(mm, address, event);
+		mmu_notifier_invalidate_page(vma, address, event);
 out:
 	return ret;
 
@@ -1325,7 +1325,8 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 
 	mmun_start = address;
 	mmun_end   = end;
-	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end, event);
+	mmu_notifier_invalidate_range_start(vma, mmun_start,
+					    mmun_end, event);
 
 	/*
 	 * If we can acquire the mmap_sem for read, and vma is VM_LOCKED,
@@ -1390,7 +1391,7 @@ static int try_to_unmap_cluster(unsigned long cursor, unsigned int *mapcount,
 		(*mapcount)--;
 	}
 	pte_unmap_unlock(pte - 1, ptl);
-	mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end, event);
+	mmu_notifier_invalidate_range_end(vma, mmun_start, mmun_end, event);
 	if (locked_vma)
 		up_read(&vma->vm_mm->mmap_sem);
 	return ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 6e1992f..35ed19c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -261,7 +261,7 @@ static inline struct kvm *mmu_notifier_to_kvm(struct mmu_notifier *mn)
 }
 
 static void kvm_mmu_notifier_invalidate_page(struct mmu_notifier *mn,
-					     struct mm_struct *mm,
+					     struct vm_area_struct *vma,
 					     unsigned long address,
 					     enum mmu_event event)
 {
@@ -317,7 +317,7 @@ static void kvm_mmu_notifier_change_pte(struct mmu_notifier *mn,
 }
 
 static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
-						    struct mm_struct *mm,
+						    struct vm_area_struct *vma,
 						    unsigned long start,
 						    unsigned long end,
 						    enum mmu_event event)
@@ -344,7 +344,7 @@ static void kvm_mmu_notifier_invalidate_range_start(struct mmu_notifier *mn,
 }
 
 static void kvm_mmu_notifier_invalidate_range_end(struct mmu_notifier *mn,
-						  struct mm_struct *mm,
+						  struct vm_area_struct *vma,
 						  unsigned long start,
 						  unsigned long end,
 						  enum mmu_event event)
-- 
1.9.0

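As a reference for listener implementers, here is a minimal sketch of a
callback written against the new prototypes in this patch. Only the
signatures and the vma->vm_mm relationship come from the patch itself;
the gizmo_* name and the VM_READ check are hypothetical stand-ins for
driver-specific logic:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/printk.h>

/* Hypothetical listener callback using the new vma-based prototype. */
static void gizmo_invalidate_range_start(struct mmu_notifier *mn,
					 struct vm_area_struct *vma,
					 unsigned long start,
					 unsigned long end,
					 enum mmu_event event)
{
	/* The mm_struct is still reachable when needed ... */
	struct mm_struct *mm = vma->vm_mm;

	/* ... and vma attributes no longer require a lookup. */
	if (!(vma->vm_flags & VM_READ))
		return;

	pr_debug("gizmo: invalidate [%#lx, %#lx) in mm %p\n",
		 start, end, mm);
}

(The mn and event arguments go unused here only to keep the sketch
short; a real listener would typically key its behaviour off event.)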

* Re: [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-17 18:46 ` [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync j.glisse
@ 2014-07-21 15:01   ` Oded Gabbay
  2014-07-29 15:04     ` Oded Gabbay
  0 siblings, 1 reply; 13+ messages in thread
From: Oded Gabbay @ 2014-07-21 15:01 UTC (permalink / raw)
  To: j.glisse, linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Jérôme Glisse

On 17/07/14 21:46, j.glisse@gmail.com wrote:
> From: Peter Zijlstra <peterz@infradead.org>
>
> New mmu_notifier listener are eager to cleanup there structure after the
> mmu_notifier::release callback. In order to allow this the patch provide
> a function that allows to add a delayed call to the mmu_notifier srcu. It
> also add a function that will call barrier_srcu so those listener can sync
> with mmu_notifier.

Tested with amdkfd and iommuv2 driver
So,
Tested-by: Oded Gabbay <oded.gabbay@amd.com>
>
> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
> ---
>   include/linux/mmu_notifier.h |  6 ++++++
>   mm/mmu_notifier.c            | 40 +++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 45 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
> index deca874..2728869 100644
> --- a/include/linux/mmu_notifier.h
> +++ b/include/linux/mmu_notifier.h
> @@ -170,6 +170,8 @@ extern int __mmu_notifier_register(struct mmu_notifier *mn,
>   				   struct mm_struct *mm);
>   extern void mmu_notifier_unregister(struct mmu_notifier *mn,
>   				    struct mm_struct *mm);
> +extern void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
> +					       struct mm_struct *mm);
>   extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
>   extern void __mmu_notifier_release(struct mm_struct *mm);
>   extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
> @@ -288,6 +290,10 @@ static inline void mmu_notifier_mm_destroy(struct mm_struct *mm)
>   	set_pte_at(___mm, ___address, __ptep, ___pte);			\
>   })
>
> +extern void mmu_notifier_call_srcu(struct rcu_head *rcu,
> +				   void (*func)(struct rcu_head *rcu));
> +extern void mmu_notifier_synchronize(void);
> +
>   #else /* CONFIG_MMU_NOTIFIER */
>
>   static inline void mmu_notifier_release(struct mm_struct *mm)
> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
> index 41cefdf..950813b 100644
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -23,6 +23,25 @@
>   static struct srcu_struct srcu;
>
>   /*
> + * This function allows mmu_notifier::release callback to delay a call to
> + * a function that will free appropriate resources. The function must be
> + * quick and must not block.
> + */
> +void mmu_notifier_call_srcu(struct rcu_head *rcu,
> +			    void (*func)(struct rcu_head *rcu))
> +{
> +	call_srcu(&srcu, rcu, func);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_call_srcu);
> +
> +void mmu_notifier_synchronize(void)
> +{
> +	/* Wait for any running method to finish. */
> +	srcu_barrier(&srcu);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_synchronize);
> +
> +/*
>    * This function can't run concurrently against mmu_notifier_register
>    * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
>    * runs with mm_users == 0. Other tasks may still invoke mmu notifiers
> @@ -53,7 +72,6 @@ void __mmu_notifier_release(struct mm_struct *mm)
>   		 */
>   		if (mn->ops->release)
>   			mn->ops->release(mn, mm);
> -	srcu_read_unlock(&srcu, id);
>
>   	spin_lock(&mm->mmu_notifier_mm->lock);
>   	while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
> @@ -69,6 +87,7 @@ void __mmu_notifier_release(struct mm_struct *mm)
>   		hlist_del_init_rcu(&mn->hlist);
>   	}
>   	spin_unlock(&mm->mmu_notifier_mm->lock);
> +	srcu_read_unlock(&srcu, id);
>
>   	/*
>   	 * synchronize_srcu here prevents mmu_notifier_release from returning to
> @@ -325,6 +344,25 @@ void mmu_notifier_unregister(struct mmu_notifier *mn, struct mm_struct *mm)
>   }
>   EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>
> +/*
> + * Same as mmu_notifier_unregister but no callback and no srcu synchronization.
> + */
> +void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
> +					struct mm_struct *mm)
> +{
> +	spin_lock(&mm->mmu_notifier_mm->lock);
> +	/*
> +	 * Can not use list_del_rcu() since __mmu_notifier_release
> +	 * can delete it before we hold the lock.
> +	 */
> +	hlist_del_init_rcu(&mn->hlist);
> +	spin_unlock(&mm->mmu_notifier_mm->lock);
> +
> +	BUG_ON(atomic_read(&mm->mm_count) <= 0);
> +	mmdrop(mm);
> +}
> +EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
> +
>   static int __init mmu_notifier_init(void)
>   {
>   	return init_srcu_struct(&srcu);
>

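For illustration, a minimal sketch of the intended usage of these
helpers from a listener's ->release callback; the my_listener names are
hypothetical, while mmu_notifier_call_srcu() and
mmu_notifier_unregister_no_release() are exactly the functions added by
the patch quoted above:

#include <linux/kernel.h>
#include <linux/mmu_notifier.h>
#include <linux/slab.h>

struct my_listener {
	struct mmu_notifier mn;
	struct rcu_head rcu;
};

static void my_listener_free(struct rcu_head *rcu)
{
	/* Quick and non-blocking, as the helper's comment requires. */
	kfree(container_of(rcu, struct my_listener, rcu));
}

static void my_listener_release(struct mmu_notifier *mn,
				struct mm_struct *mm)
{
	struct my_listener *l = container_of(mn, struct my_listener, mn);

	/* Drop off the notifier list without a second ->release call. */
	mmu_notifier_unregister_no_release(mn, mm);
	/* Defer the free until all srcu read sections have finished. */
	mmu_notifier_call_srcu(&l->rcu, my_listener_free);
}

A driver would point its mmu_notifier_ops::release at
my_listener_release; the listener structure then stays valid until the
last srcu reader drops out, which is the cleanup ordering the changelog
is describing.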

* Re: mmu_notifier: preparatory patches for hmm and or iommuv2 v6
  2014-07-17 18:46 mmu_notifier: preparatory patches for hmm and or iommuv2 v6 j.glisse
                   ` (2 preceding siblings ...)
  2014-07-17 18:46 ` [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3 j.glisse
@ 2014-07-24 15:46 ` Jerome Glisse
  2014-07-24 18:41   ` Oded Gabbay
  3 siblings, 1 reply; 13+ messages in thread
From: Jerome Glisse @ 2014-07-24 15:46 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher, Oded Gabbay

On Thu, Jul 17, 2014 at 02:46:46PM -0400, j.glisse@gmail.com wrote:
> Nutshell few patches to improve mmu_notifier :
>  - patch 1/3 allow to free resources when mm_struct is destroy.
>  - patch 2/3 provide context informations to mmu_notifier listener.
>  - patch 3/3 pass vma to range_start/range_end to avoid duplicate
>    vma lookup inside the listener.
> 
> I restricted myself to set of less controversial patches and i believe
> i have addressed all comments that were previously made. Thanks again
> for all feedback, i hope this version is the good one.
> 
> This is somewhat of a v5 but i do not include core hmm with those
> patches. So previous discussion thread :
> v1 http://www.spinics.net/lists/linux-mm/msg72501.html
> v2 http://www.spinics.net/lists/linux-mm/msg74532.html
> v3 http://www.spinics.net/lists/linux-mm/msg74656.html
> v4 http://www.spinics.net/lists/linux-mm/msg75401.html
> v5 http://www.spinics.net/lists/linux-mm/msg75875.html
> 

Anyone willing to review this? Or is there no objection? I would
really appreciate knowing where I stand on those 3 patches.

Cheers,
Jerome


* Re: mmu_notifier: preparatory patches for hmm and or iommuv2 v6
  2014-07-24 15:46 ` mmu_notifier: preparatory patches for hmm and or iommuv2 v6 Jerome Glisse
@ 2014-07-24 18:41   ` Oded Gabbay
  0 siblings, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2014-07-24 18:41 UTC (permalink / raw)
  To: Jerome Glisse, linux-kernel, linux-mm, akpm
  Cc: Linus Torvalds, joro, Mel Gorman, H. Peter Anvin, Peter Zijlstra,
	Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
	Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
	Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
	Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
	Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
	Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
	Laurent Morichetti, Alexander Deucher

On 24/07/14 18:46, Jerome Glisse wrote:
> On Thu, Jul 17, 2014 at 02:46:46PM -0400, j.glisse@gmail.com wrote:
>> Nutshell few patches to improve mmu_notifier :
>>  - patch 1/3 allow to free resources when mm_struct is destroy.
>>  - patch 2/3 provide context informations to mmu_notifier listener.
>>  - patch 3/3 pass vma to range_start/range_end to avoid duplicate
>>    vma lookup inside the listener.
>>
>> I restricted myself to set of less controversial patches and i believe
>> i have addressed all comments that were previously made. Thanks again
>> for all feedback, i hope this version is the good one.
>>
>> This is somewhat of a v5 but i do not include core hmm with those
>> patches. So previous discussion thread :
>> v1 http://www.spinics.net/lists/linux-mm/msg72501.html
>> v2 http://www.spinics.net/lists/linux-mm/msg74532.html
>> v3 http://www.spinics.net/lists/linux-mm/msg74656.html
>> v4 http://www.spinics.net/lists/linux-mm/msg75401.html
>> v5 http://www.spinics.net/lists/linux-mm/msg75875.html
>>
> 
> Anyone willing to review this? Or is there no objection? I would
> really appreciate knowing where I stand on those 3 patches.
> 
> Cheers,
> Jérôme
> 
I think I already wrote it, but anyway: I reviewed and tested patch 1/3.

	Oded


* Re: [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3
  2014-07-17 18:46 ` [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3 j.glisse
@ 2014-07-24 22:51   ` Andrew Morton
  2014-07-25 15:14     ` Jerome Glisse
  0 siblings, 1 reply; 13+ messages in thread
From: Andrew Morton @ 2014-07-24 22:51 UTC (permalink / raw)
  To: j.glisse
  Cc: linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier, Ben Sander, Greg Stoner,
	John Bridgman, Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Oded Gabbay, Jérôme Glisse

On Thu, 17 Jul 2014 14:46:49 -0400 j.glisse@gmail.com wrote:

> New user of the mmu_notifier interface need to lookup vma in order to
> perform the invalidation operation. Instead of redoing a vma lookup
> inside the callback just pass through the vma from the call site where
> it is already available.

Well..  what's wrong with performing the vma lookup within the
callback?  Is it a correctness thing?  Deadlock avoidance, for example?

Or is it a performance thing?  If so, that's a bit surprising - please
provide details?


* Re: [PATCH 3/7] mmu_notifier: pass through vma to invalidate_range and invalidate_page v3
  2014-07-24 22:51   ` Andrew Morton
@ 2014-07-25 15:14     ` Jerome Glisse
  0 siblings, 0 replies; 13+ messages in thread
From: Jerome Glisse @ 2014-07-25 15:14 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, Linus Torvalds, joro, Mel Gorman,
	H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier, Ben Sander, Greg Stoner,
	John Bridgman, Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Oded Gabbay, Jérôme Glisse

On Thu, Jul 24, 2014 at 03:51:57PM -0700, Andrew Morton wrote:
> On Thu, 17 Jul 2014 14:46:49 -0400 j.glisse@gmail.com wrote:
> 
> > New user of the mmu_notifier interface need to lookup vma in order to
> > perform the invalidation operation. Instead of redoing a vma lookup
> > inside the callback just pass through the vma from the call site where
> > it is already available.
> 
> Well..  what's wrong with performing the vma lookup within the
> callback?  Is it a correctness thing?  Deadlock avoidance, for example?
> 
> Or is it a performance thing?  If so, that's a bit surprising - please
> provide details?

No, I have no benchmark showing any issue there; it is just that 99% of
the call sites into mmu_notifier already have the vma handy, and I
thought passing it through would be helpful to other listeners, though
no current user actually uses the vma. So, as there is no strong
argument in favor of this change: if you feel it is unnecessary and fear
any kind of regression, just ignore it and I will live with the vma
lookup inside my code.

Cheers,
Jerome

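To make the trade-off concrete, here is a sketch of the lookup a
listener has to redo without the pass-through; the hmm_* names are
hypothetical, while find_vma() is the existing kernel helper (which
relies on the caller holding mmap_sem):

#include <linux/mm.h>
#include <linux/mmu_notifier.h>

/* Hypothetical listener that needs vma attributes in its callback. */
static void hmm_invalidate_range_start(struct mmu_notifier *mn,
				       struct mm_struct *mm,
				       unsigned long start,
				       unsigned long end,
				       enum mmu_event event)
{
	/* The call site already had this vma in hand. */
	struct vm_area_struct *vma = find_vma(mm, start);

	if (!vma || vma->vm_start >= end)
		return;
	/* ... use vma->vm_flags, vma->vm_file, etc. ... */
}

With patch 3/7 applied, the callback receives the vma directly and this
duplicate lookup disappears.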

* Re: [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-21 15:01   ` Oded Gabbay
@ 2014-07-29 15:04     ` Oded Gabbay
  2014-07-29 20:48       ` Andrew Morton
  0 siblings, 1 reply; 13+ messages in thread
From: Oded Gabbay @ 2014-07-29 15:04 UTC (permalink / raw)
  To: linux-kernel, linux-mm, akpm
  Cc: j.glisse, Linus Torvalds, joro, Mel Gorman, H. Peter Anvin,
	Peter Zijlstra, Andrea Arcangeli, Johannes Weiner, Larry Woodman,
	Rik van Riel, Dave Airlie, Brendan Conoboy, Joe Donohue,
	Duncan Poole, Sherry Cheung, Subhash Gutti, John Hubbard,
	Mark Hairgrove, Lucien Dunning, Cameron Buschardt,
	Arvind Gopalakrishnan, Shachar Raindel, Liran Liss,
	Roland Dreier, Ben Sander, Greg Stoner, John Bridgman,
	Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Jérôme Glisse

On 21/07/14 18:01, Oded Gabbay wrote:
> On 17/07/14 21:46, j.glisse@gmail.com wrote:
>> From: Peter Zijlstra <peterz@infradead.org>
>>
>> New mmu_notifier listener are eager to cleanup there structure after the
>> mmu_notifier::release callback. In order to allow this the patch provide
>> a function that allows to add a delayed call to the mmu_notifier srcu. It
>> also add a function that will call barrier_srcu so those listener can sync
>> with mmu_notifier.
>
> Tested with amdkfd and iommuv2 driver
> So,
> Tested-by: Oded Gabbay <oded.gabbay@amd.com>

akpm, any chance that only this specific patch from Peter Z. will get into 3.17?
I must have it for amdkfd (the HSA driver). Without it, I can't be in 3.17 either.

	Oded

>>
>> Signed-off-by: Peter Zijlstra <peterz@infradead.org>
>> Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
>> ---
>>   include/linux/mmu_notifier.h |  6 ++++++
>>   mm/mmu_notifier.c            | 40 +++++++++++++++++++++++++++++++++++++++-
>>   2 files changed, 45 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
>> index deca874..2728869 100644
>> --- a/include/linux/mmu_notifier.h
>> +++ b/include/linux/mmu_notifier.h
>> @@ -170,6 +170,8 @@ extern int __mmu_notifier_register(struct mmu_notifier *mn,
>>                      struct mm_struct *mm);
>>   extern void mmu_notifier_unregister(struct mmu_notifier *mn,
>>                       struct mm_struct *mm);
>> +extern void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
>> +                           struct mm_struct *mm);
>>   extern void __mmu_notifier_mm_destroy(struct mm_struct *mm);
>>   extern void __mmu_notifier_release(struct mm_struct *mm);
>>   extern int __mmu_notifier_clear_flush_young(struct mm_struct *mm,
>> @@ -288,6 +290,10 @@ static inline void mmu_notifier_mm_destroy(struct
>> mm_struct *mm)
>>       set_pte_at(___mm, ___address, __ptep, ___pte);            \
>>   })
>>
>> +extern void mmu_notifier_call_srcu(struct rcu_head *rcu,
>> +                   void (*func)(struct rcu_head *rcu));
>> +extern void mmu_notifier_synchronize(void);
>> +
>>   #else /* CONFIG_MMU_NOTIFIER */
>>
>>   static inline void mmu_notifier_release(struct mm_struct *mm)
>> diff --git a/mm/mmu_notifier.c b/mm/mmu_notifier.c
>> index 41cefdf..950813b 100644
>> --- a/mm/mmu_notifier.c
>> +++ b/mm/mmu_notifier.c
>> @@ -23,6 +23,25 @@
>>   static struct srcu_struct srcu;
>>
>>   /*
>> + * This function allows mmu_notifier::release callback to delay a call to
>> + * a function that will free appropriate resources. The function must be
>> + * quick and must not block.
>> + */
>> +void mmu_notifier_call_srcu(struct rcu_head *rcu,
>> +                void (*func)(struct rcu_head *rcu))
>> +{
>> +    call_srcu(&srcu, rcu, func);
>> +}
>> +EXPORT_SYMBOL_GPL(mmu_notifier_call_srcu);
>> +
>> +void mmu_notifier_synchronize(void)
>> +{
>> +    /* Wait for any running method to finish. */
>> +    srcu_barrier(&srcu);
>> +}
>> +EXPORT_SYMBOL_GPL(mmu_notifier_synchronize);
>> +
>> +/*
>>    * This function can't run concurrently against mmu_notifier_register
>>    * because mm->mm_users > 0 during mmu_notifier_register and exit_mmap
>>    * runs with mm_users == 0. Other tasks may still invoke mmu notifiers
>> @@ -53,7 +72,6 @@ void __mmu_notifier_release(struct mm_struct *mm)
>>            */
>>           if (mn->ops->release)
>>               mn->ops->release(mn, mm);
>> -    srcu_read_unlock(&srcu, id);
>>
>>       spin_lock(&mm->mmu_notifier_mm->lock);
>>       while (unlikely(!hlist_empty(&mm->mmu_notifier_mm->list))) {
>> @@ -69,6 +87,7 @@ void __mmu_notifier_release(struct mm_struct *mm)
>>           hlist_del_init_rcu(&mn->hlist);
>>       }
>>       spin_unlock(&mm->mmu_notifier_mm->lock);
>> +    srcu_read_unlock(&srcu, id);
>>
>>       /*
>>        * synchronize_srcu here prevents mmu_notifier_release from returning to
>> @@ -325,6 +344,25 @@ void mmu_notifier_unregister(struct mmu_notifier *mn,
>> struct mm_struct *mm)
>>   }
>>   EXPORT_SYMBOL_GPL(mmu_notifier_unregister);
>>
>> +/*
>> + * Same as mmu_notifier_unregister but no callback and no srcu synchronization.
>> + */
>> +void mmu_notifier_unregister_no_release(struct mmu_notifier *mn,
>> +                    struct mm_struct *mm)
>> +{
>> +    spin_lock(&mm->mmu_notifier_mm->lock);
>> +    /*
>> +     * Can not use list_del_rcu() since __mmu_notifier_release
>> +     * can delete it before we hold the lock.
>> +     */
>> +    hlist_del_init_rcu(&mn->hlist);
>> +    spin_unlock(&mm->mmu_notifier_mm->lock);
>> +
>> +    BUG_ON(atomic_read(&mm->mm_count) <= 0);
>> +    mmdrop(mm);
>> +}
>> +EXPORT_SYMBOL_GPL(mmu_notifier_unregister_no_release);
>> +
>>   static int __init mmu_notifier_init(void)
>>   {
>>       return init_srcu_struct(&srcu);
>>
>

* Re: [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-29 15:04     ` Oded Gabbay
@ 2014-07-29 20:48       ` Andrew Morton
  2014-07-29 21:04         ` Oded Gabbay
  2014-08-02 18:54         ` Oded Gabbay
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2014-07-29 20:48 UTC (permalink / raw)
  To: Oded Gabbay
  Cc: linux-kernel, linux-mm, j.glisse, Linus Torvalds, joro,
	Mel Gorman, H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier, Ben Sander, Greg Stoner,
	John Bridgman, Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Jérôme Glisse

On Tue, 29 Jul 2014 18:04:32 +0300 Oded Gabbay <oded.gabbay@amd.com> wrote:

> On 21/07/14 18:01, Oded Gabbay wrote:
> > On 17/07/14 21:46, j.glisse@gmail.com wrote:
> >> From: Peter Zijlstra <peterz@infradead.org>
> >>
> >> New mmu_notifier listener are eager to cleanup there structure after the
> >> mmu_notifier::release callback. In order to allow this the patch provide
> >> a function that allows to add a delayed call to the mmu_notifier srcu. It
> >> also add a function that will call barrier_srcu so those listener can sync
> >> with mmu_notifier.
> >
> > Tested with amdkfd and iommuv2 driver
> > So,
> > Tested-by: Oded Gabbay <oded.gabbay@amd.com>
> 
> akpm, any chance that only this specific patch from Peter Z. will get into 3.17?
> I must have it for amdkfd (the HSA driver). Without it, I can't be in 3.17 either.

I can send it in for 3.17-rc1.

Can we get a better changelog please?  "eager to cleanup there
structure after the mmu_notifier::release callback" is terribly vague
and brief.  Fully describe the problem then describe the proposed
solution.  Because others may be able to suggest alternative ways of
solving that problem, but this text simply doesn't provide enough details for
them to do so.


* Re: [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-29 20:48       ` Andrew Morton
@ 2014-07-29 21:04         ` Oded Gabbay
  2014-08-02 18:54         ` Oded Gabbay
  1 sibling, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2014-07-29 21:04 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, j.glisse, Linus Torvalds, joro,
	Mel Gorman, H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier, Ben Sander, Greg Stoner,
	John Bridgman, Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Jérôme Glisse

On 29/07/14 23:48, Andrew Morton wrote:
> On Tue, 29 Jul 2014 18:04:32 +0300 Oded Gabbay <oded.gabbay@amd.com> wrote:
> 
>> On 21/07/14 18:01, Oded Gabbay wrote:
>>> On 17/07/14 21:46, j.glisse@gmail.com wrote:
>>>> From: Peter Zijlstra <peterz@infradead.org>
>>>>
>>>> New mmu_notifier listener are eager to cleanup there structure after the
>>>> mmu_notifier::release callback. In order to allow this the patch provide
>>>> a function that allows to add a delayed call to the mmu_notifier srcu. It
>>>> also add a function that will call barrier_srcu so those listener can sync
>>>> with mmu_notifier.
>>>
>>> Tested with amdkfd and iommuv2 driver
>>> So,
>>> Tested-by: Oded Gabbay <oded.gabbay@amd.com>
>>
>> akpm, any chance that only this specific patch from Peter Z. will get into 3.17?
>> I must have it for amdkfd (the HSA driver). Without it, I can't be in 3.17 either.
> 
> I can send it in for 3.17-rc1.
> 
> Can we get a better changelog please?  "eager to cleanup there
> structure after the mmu_notifier::release callback" is terribly vague
> and brief.  Fully describe the problem then describe the proposed
> solution.  Because others may be able to suggest alternative ways of
> solving that problem, but this text simply doesn't provide enough details for
> them to do so.
> 

Thank you :)

I will fix the commit message and submit this patch as part of v3 of my
amdkfd patch set (it will be the first patch of the set). I'm going to
publish it in a few days (I guess on Friday). You are also on the amdkfd
patch set CC list, so you will get the v3 as well.

	Oded


* Re: [PATCH 1/7] mmu_notifier: add call_srcu and sync function for listener to delay call and sync.
  2014-07-29 20:48       ` Andrew Morton
  2014-07-29 21:04         ` Oded Gabbay
@ 2014-08-02 18:54         ` Oded Gabbay
  1 sibling, 0 replies; 13+ messages in thread
From: Oded Gabbay @ 2014-08-02 18:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-kernel, linux-mm, j.glisse, Linus Torvalds, joro,
	Mel Gorman, H. Peter Anvin, Peter Zijlstra, Andrea Arcangeli,
	Johannes Weiner, Larry Woodman, Rik van Riel, Dave Airlie,
	Brendan Conoboy, Joe Donohue, Duncan Poole, Sherry Cheung,
	Subhash Gutti, John Hubbard, Mark Hairgrove, Lucien Dunning,
	Cameron Buschardt, Arvind Gopalakrishnan, Shachar Raindel,
	Liran Liss, Roland Dreier, Ben Sander, Greg Stoner,
	John Bridgman, Michael Mantor, Paul Blinzer, Laurent Morichetti,
	Alexander Deucher, Jérôme Glisse

Hi Andrew,
I re-submitted the patch today with a more detailed explanation about
the problem and the proposed solution.
I would like to add that I tested this solution with the iommuv2 driver
and with AMD's amdkfd driver (which we are currently trying to upstream).
If you could merge it for 3.17-rc1 as you said, that would be great.

	Oded

On 29/07/14 23:48, Andrew Morton wrote:
> On Tue, 29 Jul 2014 18:04:32 +0300 Oded Gabbay <oded.gabbay@amd.com> wrote:
> 
>> On 21/07/14 18:01, Oded Gabbay wrote:
>>> On 17/07/14 21:46, j.glisse@gmail.com wrote:
>>>> From: Peter Zijlstra <peterz@infradead.org>
>>>>
>>>> New mmu_notifier listener are eager to cleanup there structure after the
>>>> mmu_notifier::release callback. In order to allow this the patch provide
>>>> a function that allows to add a delayed call to the mmu_notifier srcu. It
>>>> also add a function that will call barrier_srcu so those listener can sync
>>>> with mmu_notifier.
>>>
>>> Tested with amdkfd and iommuv2 driver
>>> So,
>>> Tested-by: Oded Gabbay <oded.gabbay@amd.com>
>>
>> akpm, any chance that only this specific patch from Peter Z. will get into 3.17?
>> I must have it for amdkfd (the HSA driver). Without it, I can't be in 3.17 either.
> 
> I can send it in for 3.17-rc1.
> 
> Can we get a better changelog please?  "eager to cleanup there
> structure after the mmu_notifier::release callback" is terribly vague
> and brief.  Fully describe the problem then describe the proposed
> solution.  Because others may be able to suggest alternative ways of
> solving that problem, but this text simply doesn't provide enough details for
> them to do so.
> 

