linux-mm.kvack.org archive mirror
* [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing
@ 2017-06-13 10:28 Will Deacon
  2017-06-13 10:28 ` [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages Will Deacon
                   ` (3 more replies)
  0 siblings, 4 replies; 7+ messages in thread
From: Will Deacon @ 2017-06-13 10:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: mark.rutland, akpm, kirill.shutemov, Punit.Agrawal, mgorman,
	steve.capper, vbabka, Will Deacon

Hi all,

This is v2 of the patches previously posted here:

   http://www.spinics.net/lists/linux-mm/msg128577.html

Changes since v1 include:

  * Use smp_mb() instead of smp_mb__before_atomic() before atomic_set()
  * Added acks and fixes tag

Feedback welcome,

Will

--->8

Mark Rutland (1):
  mm: numa: avoid waiting on freed migrated pages

Will Deacon (2):
  mm/page_ref: Ensure page_ref_unfreeze is ordered against prior
    accesses
  mm: migrate: Stabilise page count when migrating transparent hugepages

 include/linux/page_ref.h |  1 +
 mm/huge_memory.c         |  8 +++++++-
 mm/migrate.c             | 15 ++-------------
 3 files changed, 10 insertions(+), 14 deletions(-)

-- 
2.1.4

* [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages
  2017-06-13 10:28 [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Will Deacon
@ 2017-06-13 10:28 ` Will Deacon
  2017-06-13 10:28 ` [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses Will Deacon
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2017-06-13 10:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: mark.rutland, akpm, kirill.shutemov, Punit.Agrawal, mgorman,
	steve.capper, vbabka, Will Deacon

From: Mark Rutland <mark.rutland@arm.com>

In do_huge_pmd_numa_page(), we attempt to handle a migrating thp pmd by
waiting until the pmd is unlocked before we return and retry. However,
we can race with migrate_misplaced_transhuge_page():

// do_huge_pmd_numa_page                // migrate_misplaced_transhuge_page()
// Holds 0 refs on page                 // Holds 2 refs on page

vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd);
/* ... */
if (pmd_trans_migrating(*vmf->pmd)) {
        page = pmd_page(*vmf->pmd);
        spin_unlock(vmf->ptl);
                                        ptl = pmd_lock(mm, pmd);
                                        if (page_count(page) != 2) {
                                                /* roll back */
                                        }
                                        /* ... */
                                        mlock_migrate_page(new_page, page);
                                        /* ... */
                                        spin_unlock(ptl);
                                        put_page(page);
                                        put_page(page); // page freed here
        wait_on_page_locked(page);
        goto out;
}

This can result in the freed page having its waiters flag set
unexpectedly, which trips the PAGE_FLAGS_CHECK_AT_PREP checks in the
page alloc/free functions. This has been observed on arm64 KVM guests.

We can avoid this by having do_huge_pmd_numa_page() take a reference on
the page before dropping the pmd lock, mirroring what we do in
__migration_entry_wait().

When we hit the race, migrate_misplaced_transhuge_page() will see the
reference and abort the migration, as it may do today in other cases.
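
For illustration, the shape of the pattern we mirror is sketched below
(a hypothetical helper with generic names, not the exact
__migration_entry_wait() body; the diff that follows is the real
change):

static void wait_for_migrating_page(struct page *page, spinlock_t *ptl)
{
	/*
	 * Pin the page before dropping the page table lock, so that a
	 * concurrent migration cannot free it while we sleep on it.
	 */
	if (!get_page_unless_zero(page)) {
		/* Already freed: drop the lock and let the fault retry. */
		spin_unlock(ptl);
		return;
	}
	spin_unlock(ptl);
	wait_on_page_locked(page);	/* safe: our reference pins the page */
	put_page(page);
}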

Acked-by: Steve Capper <steve.capper@arm.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Fixes: b8916634b77bffb2 ("mm: Prevent parallel splits during THP migration")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 mm/huge_memory.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a84909cf20d3..88c6167f194d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1426,8 +1426,11 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 	 */
 	if (unlikely(pmd_trans_migrating(*vmf->pmd))) {
 		page = pmd_page(*vmf->pmd);
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
+		put_page(page);
 		goto out;
 	}
 
@@ -1459,9 +1462,12 @@ int do_huge_pmd_numa_page(struct vm_fault *vmf, pmd_t pmd)
 
 	/* Migration could have started since the pmd_trans_migrating check */
 	if (!page_locked) {
+		page_nid = -1;
+		if (!get_page_unless_zero(page))
+			goto out_unlock;
 		spin_unlock(vmf->ptl);
 		wait_on_page_locked(page);
-		page_nid = -1;
+		put_page(page);
 		goto out;
 	}
 
-- 
2.1.4

* [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses
  2017-06-13 10:28 [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Will Deacon
  2017-06-13 10:28 ` [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages Will Deacon
@ 2017-06-13 10:28 ` Will Deacon
  2017-06-13 14:06   ` Kirill A. Shutemov
  2017-06-13 10:28 ` [PATCH v2 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages Will Deacon
  2017-06-15 20:32 ` [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Andrew Morton
  3 siblings, 1 reply; 7+ messages in thread
From: Will Deacon @ 2017-06-13 10:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: mark.rutland, akpm, kirill.shutemov, Punit.Agrawal, mgorman,
	steve.capper, vbabka, Will Deacon

page_ref_freeze and page_ref_unfreeze are designed to be used as a pair,
wrapping a critical section where struct pages can be modified without
having to worry about consistency for a concurrent fast-GUP.

Whilst page_ref_freeze has full barrier semantics due to its use of
atomic_cmpxchg, page_ref_unfreeze is implemented using atomic_set, which
doesn't provide any barrier semantics and allows the operation to be
reordered with respect to page modifications in the critical section.

This patch ensures that page_ref_unfreeze is ordered after any critical
section updates, by invoking smp_mb() prior to the atomic_set.
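
As an illustration of why the ordering matters, a freeze/unfreeze
critical section typically has the shape sketched below (a hypothetical
example, not code from this patch or from any particular caller):

static int example_frozen_update(struct page *page, int expected_count)
{
	/* page_ref_freeze() is a full barrier thanks to atomic_cmpxchg(). */
	if (!page_ref_freeze(page, expected_count))
		return -EAGAIN;		/* a concurrent user holds a reference */

	/*
	 * Critical section: struct page may be modified freely here,
	 * because fast-GUP backs off while _refcount is zero.
	 */

	/*
	 * page_ref_unfreeze() ends in atomic_set(), which has no ordering
	 * of its own; the smp_mb() added below ensures the updates above
	 * are visible before the refcount becomes non-zero again.
	 */
	page_ref_unfreeze(page, expected_count);
	return 0;
}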

Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Acked-by: Steve Capper <steve.capper@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 include/linux/page_ref.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
index 610e13271918..1fd71733aa68 100644
--- a/include/linux/page_ref.h
+++ b/include/linux/page_ref.h
@@ -174,6 +174,7 @@ static inline void page_ref_unfreeze(struct page *page, int count)
 	VM_BUG_ON_PAGE(page_count(page) != 0, page);
 	VM_BUG_ON(count == 0);
 
+	smp_mb();
 	atomic_set(&page->_refcount, count);
 	if (page_ref_tracepoint_active(__tracepoint_page_ref_unfreeze))
 		__page_ref_unfreeze(page, count);
-- 
2.1.4

* [PATCH v2 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages
  2017-06-13 10:28 [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Will Deacon
  2017-06-13 10:28 ` [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages Will Deacon
  2017-06-13 10:28 ` [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses Will Deacon
@ 2017-06-13 10:28 ` Will Deacon
  2017-06-15 20:32 ` [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Andrew Morton
  3 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2017-06-13 10:28 UTC (permalink / raw)
  To: linux-mm, linux-kernel
  Cc: mark.rutland, akpm, kirill.shutemov, Punit.Agrawal, mgorman,
	steve.capper, vbabka, Will Deacon

When migrating a transparent hugepage, migrate_misplaced_transhuge_page
guards itself against a concurrent fast-GUP of the page by checking that
the page count is equal to 2 before and after installing the new pmd.

If the page count changes, the pmd is reverted to the original entry;
however, there is a small window during which the new (possibly
writable) pmd is installed and the underlying page could be written by
userspace. Restoring the old pmd could therefore result in loss of data.

This patch fixes the problem by freezing the page count whilst updating
the page tables, which protects against a concurrent fast-GUP without
the need to restore the old pmd in the failure case (since the page
count can no longer change under our feet).
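
Schematically, the new flow is sketched below (simplified, with the
error path elided and the 'rollback' label purely illustrative; the
diff below is the authoritative change):

	ptl = pmd_lock(mm, pmd);
	if (unlikely(!pmd_same(*pmd, entry) || !page_ref_freeze(page, 2))) {
		spin_unlock(ptl);
		goto rollback;	/* drop new_page and give up, as before */
	}

	/*
	 * The old page's count is frozen at zero, so a concurrent fast-GUP
	 * cannot take a reference while the pmd is switched over, and there
	 * is no longer any need to restore the old pmd on failure.
	 */
	set_pmd_at(mm, mmun_start, pmd, entry);
	update_mmu_cache_pmd(vma, address, &entry);

	/* Re-expose the old page only once the new pmd is fully installed. */
	page_ref_unfreeze(page, 2);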

Cc: Mel Gorman <mgorman@suse.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
---
 mm/migrate.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index 89a0a1707f4c..8b21f1b1ec6e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1913,7 +1913,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	int page_lru = page_is_file_cache(page);
 	unsigned long mmun_start = address & HPAGE_PMD_MASK;
 	unsigned long mmun_end = mmun_start + HPAGE_PMD_SIZE;
-	pmd_t orig_entry;
 
 	/*
 	 * Rate-limit the amount of data that is being migrated to a node.
@@ -1956,8 +1955,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	/* Recheck the target PMD */
 	mmu_notifier_invalidate_range_start(mm, mmun_start, mmun_end);
 	ptl = pmd_lock(mm, pmd);
-	if (unlikely(!pmd_same(*pmd, entry) || page_count(page) != 2)) {
-fail_putback:
+	if (unlikely(!pmd_same(*pmd, entry) || !page_ref_freeze(page, 2))) {
 		spin_unlock(ptl);
 		mmu_notifier_invalidate_range_end(mm, mmun_start, mmun_end);
 
@@ -1979,7 +1977,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 		goto out_unlock;
 	}
 
-	orig_entry = *pmd;
 	entry = mk_huge_pmd(new_page, vma->vm_page_prot);
 	entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
 
@@ -1996,15 +1993,7 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
 	set_pmd_at(mm, mmun_start, pmd, entry);
 	update_mmu_cache_pmd(vma, address, &entry);
 
-	if (page_count(page) != 2) {
-		set_pmd_at(mm, mmun_start, pmd, orig_entry);
-		flush_pmd_tlb_range(vma, mmun_start, mmun_end);
-		mmu_notifier_invalidate_range(mm, mmun_start, mmun_end);
-		update_mmu_cache_pmd(vma, address, &entry);
-		page_remove_rmap(new_page, true);
-		goto fail_putback;
-	}
-
+	page_ref_unfreeze(page, 2);
 	mlock_migrate_page(new_page, page);
 	page_remove_rmap(page, true);
 	set_page_owner_migrate_reason(new_page, MR_NUMA_MISPLACED);
-- 
2.1.4

* Re: [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses
  2017-06-13 10:28 ` [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses Will Deacon
@ 2017-06-13 14:06   ` Kirill A. Shutemov
  0 siblings, 0 replies; 7+ messages in thread
From: Kirill A. Shutemov @ 2017-06-13 14:06 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-mm, linux-kernel, mark.rutland, akpm, Punit.Agrawal,
	mgorman, steve.capper, vbabka

On Tue, Jun 13, 2017 at 11:28:41AM +0100, Will Deacon wrote:
> page_ref_freeze and page_ref_unfreeze are designed to be used as a pair,
> wrapping a critical section where struct pages can be modified without
> having to worry about consistency for a concurrent fast-GUP.
> 
> Whilst page_ref_freeze has full barrier semantics due to its use of
> atomic_cmpxchg, page_ref_unfreeze is implemented using atomic_set, which
> doesn't provide any barrier semantics and allows the operation to be
> reordered with respect to page modifications in the critical section.
> 
> This patch ensures that page_ref_unfreeze is ordered after any critical
> section updates, by invoking smp_mb() prior to the atomic_set.
> 
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Acked-by: Steve Capper <steve.capper@arm.com>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> ---
>  include/linux/page_ref.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
> index 610e13271918..1fd71733aa68 100644
> --- a/include/linux/page_ref.h
> +++ b/include/linux/page_ref.h
> @@ -174,6 +174,7 @@ static inline void page_ref_unfreeze(struct page *page, int count)
>  	VM_BUG_ON_PAGE(page_count(page) != 0, page);
>  	VM_BUG_ON(count == 0);
>  
> +	smp_mb();

Don't we want some comment here?

Otherwise:

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov

* Re: [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing
  2017-06-13 10:28 [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Will Deacon
                   ` (2 preceding siblings ...)
  2017-06-13 10:28 ` [PATCH v2 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages Will Deacon
@ 2017-06-15 20:32 ` Andrew Morton
  2017-06-19  9:35   ` Will Deacon
  3 siblings, 1 reply; 7+ messages in thread
From: Andrew Morton @ 2017-06-15 20:32 UTC (permalink / raw)
  To: Will Deacon
  Cc: linux-mm, linux-kernel, mark.rutland, kirill.shutemov,
	Punit.Agrawal, mgorman, steve.capper, vbabka

On Tue, 13 Jun 2017 11:28:39 +0100 Will Deacon <will.deacon@arm.com> wrote:

> This is v2 of the patches previously posted here:
> 
>    http://www.spinics.net/lists/linux-mm/msg128577.html
> 
> Changes since v1 include:
> 
>   * Use smp_mb() instead of smp_mb__before_atomic() before atomic_set()
>   * Added acks and fixes tag
> 
> Feedback welcome,
> 
> Will
> 
> --->8
> 
> Mark Rutland (1):
>   mm: numa: avoid waiting on freed migrated pages
> 
> Will Deacon (2):
>   mm/page_ref: Ensure page_ref_unfreeze is ordered against prior
>     accesses
>   mm: migrate: Stabilise page count when migrating transparent hugepages

I marked [1/3] for -stable backporting and held the other two for
4.13-rc1.  Maybe that wasn't appropriate...

* Re: [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing
  2017-06-15 20:32 ` [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Andrew Morton
@ 2017-06-19  9:35   ` Will Deacon
  0 siblings, 0 replies; 7+ messages in thread
From: Will Deacon @ 2017-06-19  9:35 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, mark.rutland, kirill.shutemov,
	Punit.Agrawal, mgorman, steve.capper, vbabka

On Thu, Jun 15, 2017 at 01:32:52PM -0700, Andrew Morton wrote:
> On Tue, 13 Jun 2017 11:28:39 +0100 Will Deacon <will.deacon@arm.com> wrote:
> 
> > This is v2 of the patches previously posted here:
> > 
> >    http://www.spinics.net/lists/linux-mm/msg128577.html
> > 
> > Changes since v1 include:
> > 
> >   * Use smp_mb() instead of smp_mb__before_atomic() before atomic_set()
> >   * Added acks and fixes tag
> > 
> > Feedback welcome,
> > 
> > Will
> > 
> > --->8
> > 
> > Mark Rutland (1):
> >   mm: numa: avoid waiting on freed migrated pages
> > 
> > Will Deacon (2):
> >   mm/page_ref: Ensure page_ref_unfreeze is ordered against prior
> >     accesses
> >   mm: migrate: Stabilise page count when migrating transparent hugepages
> 
> I marked [1/3] for -stable backporting and held the other two for
> 4.13-rc1.  Maybe that wasn't appropriate...

I think that's about right. Patches 2 and 3 fix issues found by inspection,
rather than something we've knowingly run into.

Thanks,

Will

Thread overview: 7+ messages (newest: 2017-06-19  9:35 UTC)
2017-06-13 10:28 [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Will Deacon
2017-06-13 10:28 ` [PATCH v2 1/3] mm: numa: avoid waiting on freed migrated pages Will Deacon
2017-06-13 10:28 ` [PATCH v2 2/3] mm/page_ref: Ensure page_ref_unfreeze is ordered against prior accesses Will Deacon
2017-06-13 14:06   ` Kirill A. Shutemov
2017-06-13 10:28 ` [PATCH v2 3/3] mm: migrate: Stabilise page count when migrating transparent hugepages Will Deacon
2017-06-15 20:32 ` [PATCH v2 0/3] mm: huge pages: Misc fixes for issues found during fuzzing Andrew Morton
2017-06-19  9:35   ` Will Deacon
