* [PATCH 0/4] mmu_notifier semantic update
@ 2017-08-29 20:11 Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 1/4] mm/mmu_notifier: document new behavior for mmu_notifier_invalidate_page() Jérôme Glisse
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Jérôme Glisse @ 2017-08-29 20:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Jérôme Glisse, Linus Torvalds, Bernhard Held,
	Adam Borowski, Andrea Arcangeli, Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton,
	Dan Williams, Ross Zwisler

So we do not want to allow sleeping during a call to mmu_notifier_invalidate_page(),
but some code paths have no surrounding mmu_notifier_invalidate_range_start()/
mmu_notifier_invalidate_range_end() or mmu_notifier_invalidate_range() call.

This patch series just makes sure that there is at least one call (outside any
spinlock section) to mmu_notifier_invalidate_range() after each call to
mmu_notifier_invalidate_page().
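
As an illustration (not part of any patch below, and simplified from the
mm/rmap and fs/dax changes in this series), the calling pattern the series
enforces looks roughly like this; vma, mm, pmd and address are assumed to
come from the surrounding reverse-map walk:

	spinlock_t *ptl;
	pte_t *ptep, pte;
	unsigned long start = address, end = address;

	ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
	pte = ptep_clear_flush(vma, address, ptep);
	pte = pte_wrprotect(pte);
	pte = pte_mkclean(pte);
	set_pte_at(mm, address, ptep, pte);
	/* must not sleep: the page table spinlock is still held */
	mmu_notifier_invalidate_page(mm, address);
	end = address + PAGE_SIZE;	/* range end is exclusive */
	pte_unmap_unlock(ptep, ptl);

	/* may sleep: we are now outside any spinlock section */
	mmu_notifier_invalidate_range(mm, start, end);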

This fixes an issue with AMD IOMMU v2 while avoiding the introduction of issues
for other users of the mmu_notifier API. For the relevant threads see:

https://lkml.kernel.org/r/20170809204333.27485-1-jglisse@redhat.com
https://lkml.kernel.org/r/20170804134928.l4klfcnqatni7vsc@black.fi.intel.com
https://marc.info/?l=kvm&m=150327081325160&w=2

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>

Jérôme Glisse (4):
  mm/mmu_notifier: document new behavior for
    mmu_notifier_invalidate_page()
  dax/mmu_notifier: update to new mmu_notifier semantic
  mm/rmap: update to new mmu_notifier_invalidate_page() semantic
  iommu/amd: update to new mmu_notifier_invalidate_page() semantic

 drivers/iommu/amd_iommu_v2.c |  8 --------
 fs/dax.c                     | 10 ++++++--
 include/linux/mmu_notifier.h |  6 ++++++
 mm/rmap.c                    | 18 +++++++++++++++++-
 4 files changed, 31 insertions(+), 11 deletions(-)

-- 
2.13.5


* [PATCH 1/4] mm/mmu_notifier: document new behavior for mmu_notifier_invalidate_page()
  2017-08-29 20:11 [PATCH 0/4] mmu_notifier semantic update Jérôme Glisse
@ 2017-08-29 20:11 ` Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 2/4] dax/mmu_notifier: update to new mmu_notifier semantic Jérôme Glisse
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jérôme Glisse @ 2017-08-29 20:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Jérôme Glisse, Linus Torvalds, Bernhard Held,
	Adam Borowski, Andrea Arcangeli, Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton

The invalidate page callback used to happen outside the page table spinlock,
and thus the callback used to be allowed to sleep. This is no longer the
case. However, every call to mmu_notifier_invalidate_page() is now either
bracketed by calls to mmu_notifier_invalidate_range_start()/
mmu_notifier_invalidate_range_end() or followed by a call to
mmu_notifier_invalidate_range().
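
As a sketch of what this means for an implementer (a hypothetical notifier,
not taken from any real driver), sleeping work has to move out of
->invalidate_page() and into ->invalidate_range():

	static void example_invalidate_page(struct mmu_notifier *mn,
					    struct mm_struct *mm,
					    unsigned long address)
	{
		/* may run under the page table spinlock: do not sleep here */
	}

	static void example_invalidate_range(struct mmu_notifier *mn,
					     struct mm_struct *mm,
					     unsigned long start,
					     unsigned long end)
	{
		/* guaranteed to follow, outside the lock: sleeping is fine */
	}

	static const struct mmu_notifier_ops example_mn_ops = {
		.invalidate_page  = example_invalidate_page,
		.invalidate_range = example_invalidate_range,
	};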

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/mmu_notifier.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/mmu_notifier.h b/include/linux/mmu_notifier.h
index c91b3bcd158f..acc72167b9cb 100644
--- a/include/linux/mmu_notifier.h
+++ b/include/linux/mmu_notifier.h
@@ -100,6 +100,12 @@ struct mmu_notifier_ops {
 	 * pte because the page hasn't been freed yet and it won't be
 	 * freed until this returns. If required set_page_dirty has to
 	 * be called internally to this method.
+	 *
+	 * Note that previously this callback was not called from under
+	 * a spinlock, and thus you were allowed to sleep inside it. This
+	 * is no longer the case. However, every call to this callback
+	 * is now either bracketed by calls to range_start()/range_end()
+	 * or followed by a call to invalidate_range().
 	 */
 	void (*invalidate_page)(struct mmu_notifier *mn,
 				struct mm_struct *mm,
-- 
2.13.5


* [PATCH 2/4] dax/mmu_notifier: update to new mmu_notifier semantic
  2017-08-29 20:11 [PATCH 0/4] mmu_notifier semantic update Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 1/4] mm/mmu_notifier: document new behavior for mmu_notifier_invalidate_page() Jérôme Glisse
@ 2017-08-29 20:11 ` Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 3/4] mm/rmap: update to new mmu_notifier_invalidate_page() semantic Jérôme Glisse
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Jérôme Glisse @ 2017-08-29 20:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Jérôme Glisse, Dan Williams, Ross Zwisler,
	Linus Torvalds, Bernhard Held, Adam Borowski, Andrea Arcangeli,
	Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton

mmu_notifier_invalidate_page() can now be called from under the spinlock.
Move it accordingly and add a call to mmu_notifier_invalidate_range() for
users that need to be able to sleep.
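
The range passed to mmu_notifier_invalidate_range() mirrors the size of the
mapping that was cleaned. Roughly (the helper name below is made up for
illustration; the real code computes this inline, as in the hunks that
follow):

	static void mkclean_range(unsigned long address, bool pmd_mapped,
				  unsigned long *start, unsigned long *end)
	{
		if (pmd_mapped) {
			/* huge page: invalidate the whole pmd-sized mapping */
			*start = address & PMD_MASK;
			*end = *start + PMD_SIZE;
		} else {
			/* regular pte: a single base page, end is exclusive */
			*start = address;
			*end = *start + PAGE_SIZE;
		}
	}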

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 fs/dax.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 865d42c63e23..23cfb055e92e 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -650,7 +650,7 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 
 	i_mmap_lock_read(mapping);
 	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		unsigned long address;
+		unsigned long start, address, end;
 
 		cond_resched();
 
@@ -676,6 +676,9 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 			pmd = pmd_wrprotect(pmd);
 			pmd = pmd_mkclean(pmd);
 			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
+			start = address & PMD_MASK;
+			end = start + PMD_SIZE;
+			mmu_notifier_invalidate_page(vma->vm_mm, address);
 			changed = true;
 unlock_pmd:
 			spin_unlock(ptl);
@@ -691,13 +694,16 @@ static void dax_mapping_entry_mkclean(struct address_space *mapping,
 			pte = pte_wrprotect(pte);
 			pte = pte_mkclean(pte);
 			set_pte_at(vma->vm_mm, address, ptep, pte);
+			mmu_notifier_invalidate_page(vma->vm_mm, address);
 			changed = true;
+			start = address;
+			end = start + PAGE_SIZE;
 unlock_pte:
 			pte_unmap_unlock(ptep, ptl);
 		}
 
 		if (changed)
-			mmu_notifier_invalidate_page(vma->vm_mm, address);
+			mmu_notifier_invalidate_range(vma->vm_mm, start, end);
 	}
 	i_mmap_unlock_read(mapping);
 }
-- 
2.13.5


* [PATCH 3/4] mm/rmap: update to new mmu_notifier_invalidate_page() semantic
  2017-08-29 20:11 [PATCH 0/4] mmu_notifier semantic update Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 1/4] mm/mmu_notifier: document new behavior for mmu_notifier_invalidate_page() Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 2/4] dax/mmu_notifier: update to new mmu_notifier semantic Jérôme Glisse
@ 2017-08-29 20:11 ` Jérôme Glisse
  2017-08-29 20:11 ` [PATCH 4/4] iommu/amd: " Jérôme Glisse
  2017-08-29 20:15 ` [PATCH 0/4] mmu_notifier semantic update Jerome Glisse
  4 siblings, 0 replies; 6+ messages in thread
From: Jérôme Glisse @ 2017-08-29 20:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Jérôme Glisse, Linus Torvalds, Bernhard Held,
	Adam Borowski, Andrea Arcangeli, Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton

mmu_notifier_invalidate_page() is now called from under the spinlock.
Add a call to mmu_notifier_invalidate_range() for users that need to be
able to sleep.
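
The pattern added below can be summarized as follows (illustration only,
simplified from the two hunks that follow; pvmw, vma and address are the
local context of page_mkclean_one()):

	bool invalidate = false;
	unsigned long start = address, end = address;

	while (page_vma_mapped_walk(&pvmw)) {
		/* pte/pmd is cleaned or cleared under the page table lock */
		mmu_notifier_invalidate_page(vma->vm_mm, address);
		end = address + PAGE_SIZE;	/* range end is exclusive */
		invalidate = true;
	}

	/* the walk has dropped the page table lock by this point */
	if (invalidate)
		mmu_notifier_invalidate_range(vma->vm_mm, start, end);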

Relevant threads:
https://lkml.kernel.org/r/20170809204333.27485-1-jglisse@redhat.com
https://lkml.kernel.org/r/20170804134928.l4klfcnqatni7vsc@black.fi.intel.com
https://marc.info/?l=kvm&m=150327081325160&w=2

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 mm/rmap.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index c8993c63eb25..06792e28093c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -887,6 +887,8 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 		.address = address,
 		.flags = PVMW_SYNC,
 	};
+	unsigned long start = address, end = address;
+	bool invalidate = false;
 	int *cleaned = arg;
 
 	while (page_vma_mapped_walk(&pvmw)) {
@@ -927,10 +929,16 @@ static bool page_mkclean_one(struct page *page, struct vm_area_struct *vma,
 
 		if (ret) {
 			mmu_notifier_invalidate_page(vma->vm_mm, address);
+			/* range is exclusive */
+			end = address + PAGE_SIZE;
+			invalidate = true;
 			(*cleaned)++;
 		}
 	}
 
+	if (invalidate)
+		mmu_notifier_invalidate_range(vma->vm_mm, start, end);
+
 	return true;
 }
 
@@ -1323,8 +1331,9 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 	};
 	pte_t pteval;
 	struct page *subpage;
-	bool ret = true;
+	bool ret = true, invalidate = false;
 	enum ttu_flags flags = (enum ttu_flags)arg;
+	unsigned long start = address, end = address;
 
 	/* munlock has nothing to gain from examining un-locked vmas */
 	if ((flags & TTU_MUNLOCK) && !(vma->vm_flags & VM_LOCKED))
@@ -1491,7 +1500,14 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
 		page_remove_rmap(subpage, PageHuge(page));
 		put_page(page);
 		mmu_notifier_invalidate_page(mm, address);
+		/* range is exclusive */
+		end = address + PAGE_SIZE;
+		invalidate = true;
 	}
+
+	if (invalidate)
+		mmu_notifier_invalidate_range(vma->vm_mm, start, end);
+
 	return ret;
 }
 
-- 
2.13.5


* [PATCH 4/4] iommu/amd: update to new mmu_notifier_invalidate_page() semantic
  2017-08-29 20:11 [PATCH 0/4] mmu_notifier semantic update Jérôme Glisse
                   ` (2 preceding siblings ...)
  2017-08-29 20:11 ` [PATCH 3/4] mm/rmap: update to new mmu_notifier_invalidate_page() semantic Jérôme Glisse
@ 2017-08-29 20:11 ` Jérôme Glisse
  2017-08-29 20:15 ` [PATCH 0/4] mmu_notifier semantic update Jerome Glisse
  4 siblings, 0 replies; 6+ messages in thread
From: Jérôme Glisse @ 2017-08-29 20:11 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Jérôme Glisse, Joerg Roedel, Suravee Suthikulpanit,
	Linus Torvalds, Bernhard Held, Adam Borowski, Andrea Arcangeli,
	Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton

mmu_notifier_invalidate_page() is now called from under the spinlock,
but we can now rely on invalidate_range() being called afterward.
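
From the driver's side the new contract reads: every ->invalidate_page() is
followed by an ->invalidate_range() covering the same address, so an
implementation whose ->invalidate_range() already flushes the IOTLB does not
need an ->invalidate_page() callback at all. Sketch of the resulting ops
(mirroring the hunk below and reusing the driver's existing callbacks):

	static const struct mmu_notifier_ops iommu_mn = {
		.release		= mn_release,
		.clear_flush_young	= mn_clear_flush_young,
		/* no .invalidate_page: ->invalidate_range() covers it */
		.invalidate_range	= mn_invalidate_range,
	};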

Signed-off-by: Jérôme Glisse <jglisse@redhat.com>
Cc: Joerg Roedel <jroedel@suse.de>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Bernhard Held <berny156@gmx.de>
Cc: Adam Borowski <kilobyte@angband.pl>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Wanpeng Li <kernellwp@gmail.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: axie <axie@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
---
 drivers/iommu/amd_iommu_v2.c | 8 --------
 1 file changed, 8 deletions(-)

diff --git a/drivers/iommu/amd_iommu_v2.c b/drivers/iommu/amd_iommu_v2.c
index 6629c472eafd..dccf5b76eff2 100644
--- a/drivers/iommu/amd_iommu_v2.c
+++ b/drivers/iommu/amd_iommu_v2.c
@@ -391,13 +391,6 @@ static int mn_clear_flush_young(struct mmu_notifier *mn,
 	return 0;
 }
 
-static void mn_invalidate_page(struct mmu_notifier *mn,
-			       struct mm_struct *mm,
-			       unsigned long address)
-{
-	__mn_flush_page(mn, address);
-}
-
 static void mn_invalidate_range(struct mmu_notifier *mn,
 				struct mm_struct *mm,
 				unsigned long start, unsigned long end)
@@ -436,7 +429,6 @@ static void mn_release(struct mmu_notifier *mn, struct mm_struct *mm)
 static const struct mmu_notifier_ops iommu_mn = {
 	.release		= mn_release,
 	.clear_flush_young      = mn_clear_flush_young,
-	.invalidate_page        = mn_invalidate_page,
 	.invalidate_range       = mn_invalidate_range,
 };
 
-- 
2.13.5


* Re: [PATCH 0/4] mmu_notifier semantic update
  2017-08-29 20:11 [PATCH 0/4] mmu_notifier semantic update Jérôme Glisse
                   ` (3 preceding siblings ...)
  2017-08-29 20:11 ` [PATCH 4/4] iommu/amd: " Jérôme Glisse
@ 2017-08-29 20:15 ` Jerome Glisse
  4 siblings, 0 replies; 6+ messages in thread
From: Jerome Glisse @ 2017-08-29 20:15 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Linus Torvalds, Bernhard Held, Adam Borowski, Andrea Arcangeli,
	Radim Krčmář,
	Wanpeng Li, Paolo Bonzini, Takashi Iwai, Nadav Amit,
	Mike Galbraith, Kirill A . Shutemov, axie, Andrew Morton,
	Dan Williams, Ross Zwisler

On Tue, Aug 29, 2017 at 04:11:28PM -0400, Jérôme Glisse wrote:
> So we do not want to allow sleeping during a call to mmu_notifier_invalidate_page(),
> but some code paths have no surrounding mmu_notifier_invalidate_range_start()/
> mmu_notifier_invalidate_range_end() or mmu_notifier_invalidate_range() call.
> 
> This patch series just makes sure that there is at least one call (outside any
> spinlock section) to mmu_notifier_invalidate_range() after each call to
> mmu_notifier_invalidate_page().
> 
> This fixes an issue with AMD IOMMU v2 while avoiding the introduction of issues
> for other users of the mmu_notifier API. For the relevant threads see:
> 
> https://lkml.kernel.org/r/20170809204333.27485-1-jglisse@redhat.com
> https://lkml.kernel.org/r/20170804134928.l4klfcnqatni7vsc@black.fi.intel.com
> https://marc.info/?l=kvm&m=150327081325160&w=2

Please ignore this. Instead the plan is to kill invalidate_page(), switch
it to invalidate_range(), and make sure there are always range_start()/
range_end() calls happening around it.
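
For reference, that end state looks roughly like this on the caller side
(a sketch only, using the existing range_start()/range_end() API; mm, pmd
and address are assumed context):

	unsigned long start = address & PAGE_MASK;
	unsigned long end = start + PAGE_SIZE;	/* end is exclusive */
	spinlock_t *ptl;
	pte_t *ptep;

	mmu_notifier_invalidate_range_start(mm, start, end);

	ptep = pte_offset_map_lock(mm, pmd, address, &ptl);
	/* ... modify or clear the pte under the lock ... */
	pte_unmap_unlock(ptep, ptl);

	mmu_notifier_invalidate_range_end(mm, start, end);
	/* no mmu_notifier_invalidate_page() call anywhere */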

Jérôme


