stable.vger.kernel.org archive mirror
[PATCH 3/4] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
From: Chen Wandun @ 2021-08-30  2:30 UTC
  To: wangkefeng.wang, chenhuang5, cy.fan, liushixin2, jingxiangfeng,
	liuyongqiang13, tongtiangen, zouyue3, sunnanyong, liupeng256,
	zhiwentao, patchwork
  Cc: Hugh Dickins, Kirill A. Shutemov, Yang Shi, Wang Yugui,
	Matthew Wilcox (Oracle),
	Naoya Horiguchi, Alistair Popple, Ralph Campbell, Zi Yan,
	Miaohe Lin, Minchan Kim, Jue Wang, Peter Xu, Jan Kara,
	Shakeel Butt, Oscar Salvador, stable, Andrew Morton,
	Linus Torvalds, Chen Wandun

From: Hugh Dickins <hughd@google.com>

mainline inclusion
from mainline-v5.13-rc7
commit 99fa8a48203d62b3743d866fc48ef6abaee682be
category: bugfix
bugzilla: 110269
DTS: #87
CVE: NA

-------------------------------------------------

Patch series "mm/thp: fix THP splitting unmap BUGs and related", v10.

Here is a v2 batch of long-standing THP bug fixes that I had not got
around to sending before, but am prompted to send now by Wang Yugui's
report:
https://lore.kernel.org/linux-mm/20210412180659.B9E3.409509F4@e16-tech.com/

Wang Yugui has tested a rollup of these fixes applied to 5.10.39, and
they have done no harm, but have *not* fixed that issue: something more
is needed, and I have no idea what.

This patch (of 7):

Stressing huge tmpfs page migration racing with hole punch often
crashed on the VM_BUG_ON(!pmd_present) in pmdp_huge_clear_flush() with
a DEBUG_VM=y kernel, or shortly afterwards on a bad dereference in
__split_huge_pmd_locked() when DEBUG_VM=n: the non-anonymous path
forgot to allow for pmd migration entries.

Full disclosure: those particular experiments were on a kernel with more
relaxed mmap_lock and i_mmap_rwsem locking, and were not repeated on the
vanilla kernel: it is conceivable that stricter locking happens to avoid
those cases, or makes them less likely; but __split_huge_pmd_locked()
already allowed for pmd migration entries when handling anonymous THPs,
so this commit brings the shmem and file THP handling into line.

And while there: use old_pmd rather than _pmd, matching the following
blocks; and make it clearer to the eye that the !vma_is_anonymous()
block is self-contained, taking an early return after accounting for
the unmapping.
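
For reference, the resulting shape of the !vma_is_anonymous() block,
condensed from the first hunk below (context elided with "..."):

	if (!vma_is_anonymous(vma)) {
		old_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
		...
		if (vma_is_dax(vma))
			return;
		if (unlikely(is_pmd_migration_entry(old_pmd))) {
			swp_entry_t entry = pmd_to_swp_entry(old_pmd);

			page = migration_entry_to_page(entry);
		} else {
			page = pmd_page(old_pmd);
			if (!PageDirty(page) && pmd_dirty(old_pmd))
				set_page_dirty(page);
			if (!PageReferenced(page) && pmd_young(old_pmd))
				SetPageReferenced(page);
			page_remove_rmap(page, true);
			put_page(page);
		}
		add_mm_counter(mm, MM_FILEPAGES, -HPAGE_PMD_NR);
		return;
	}

The early return keeps the file/shmem path self-contained; only the
anonymous case falls through to the pte-by-pte remapping below.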

Link: https://lkml.kernel.org/r/af88612-1473-2eaa-903-8d1a448b26@google.com
Link: https://lkml.kernel.org/r/dd221a99-efb3-cd1d-6256-7e646af29314@google.com
Fixes: e71769ae5260 ("mm: enable thp migration for shmem thp")
Signed-off-by: Hugh Dickins <hughd@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Wang Yugui <wangyugui@e16-tech.com>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Ralph Campbell <rcampbell@nvidia.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Jue Wang <juew@google.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Conflicts:
	mm/huge_memory.c
Signed-off-by: Chen Wandun <chenwandun@huawei.com>
---
 mm/huge_memory.c     | 27 ++++++++++++++++++---------
 mm/pgtable-generic.c |  5 ++---
 2 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d68b2c3aebc7..89419fffefab 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2199,7 +2199,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	count_vm_event(THP_SPLIT_PMD);
 
 	if (!vma_is_anonymous(vma)) {
-		_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
+		old_pmd = pmdp_huge_clear_flush_notify(vma, haddr, pmd);
 		/*
 		 * We are going to unmap this huge page. So
 		 * just go ahead and zap it
@@ -2208,16 +2208,25 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 			zap_deposited_table(mm, pmd);
 		if (vma_is_dax(vma))
 			return;
-		page = pmd_page(_pmd);
-		if (!PageDirty(page) && pmd_dirty(_pmd))
-			set_page_dirty(page);
-		if (!PageReferenced(page) && pmd_young(_pmd))
-			SetPageReferenced(page);
-		page_remove_rmap(page, true);
-		put_page(page);
+		if (unlikely(is_pmd_migration_entry(old_pmd))) {
+			swp_entry_t entry;
+
+			entry = pmd_to_swp_entry(old_pmd);
+			page = migration_entry_to_page(entry);
+		} else {
+			page = pmd_page(old_pmd);
+			if (!PageDirty(page) && pmd_dirty(old_pmd))
+				set_page_dirty(page);
+			if (!PageReferenced(page) && pmd_young(old_pmd))
+				SetPageReferenced(page);
+			page_remove_rmap(page, true);
+			put_page(page);
+		}
 		add_mm_counter(mm, MM_FILEPAGES, -HPAGE_PMD_NR);
 		return;
-	} else if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
+	}
+
+	if (pmd_trans_huge(*pmd) && is_huge_zero_pmd(*pmd)) {
 		/*
 		 * FIXME: Do we want to invalidate secondary mmu by calling
 		 * mmu_notifier_invalidate_range() see comments below inside
diff --git a/mm/pgtable-generic.c b/mm/pgtable-generic.c
index 5e9957e19870..36770fcdc358 100644
--- a/mm/pgtable-generic.c
+++ b/mm/pgtable-generic.c
@@ -125,9 +125,8 @@ pmd_t pmdp_huge_clear_flush(struct vm_area_struct *vma, unsigned long address,
 {
 	pmd_t pmd;
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
-	VM_BUG_ON(!pmd_present(*pmdp));
-	/* Below assumes pmd_present() is true */
-	VM_BUG_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
+	VM_BUG_ON(pmd_present(*pmdp) && !pmd_trans_huge(*pmdp) &&
+			   !pmd_devmap(*pmdp));
 	pmd = pmdp_huge_get_and_clear(vma->vm_mm, address, pmdp);
 	flush_pmd_tlb_range(vma, address, address + HPAGE_PMD_SIZE);
 	return pmd;
-- 
2.18.0.huawei.25



Re: [PATCH 3/4] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry
From: Chen Wandun @ 2021-08-30  2:33 UTC
  To: wangkefeng.wang, chenhuang5, cy.fan, liushixin2, jingxiangfeng,
	liuyongqiang13, tongtiangen, zouyue3, sunnanyong, liupeng256,
	zhiwentao, patchwork
  Cc: Hugh Dickins, Kirill A. Shutemov, Yang Shi, Wang Yugui,
	Matthew Wilcox (Oracle),
	Naoya Horiguchi, Alistair Popple, Ralph Campbell, Zi Yan,
	Miaohe Lin, Minchan Kim, Jue Wang, Peter Xu, Jan Kara,
	Shakeel Butt, Oscar Salvador, stable, Andrew Morton,
	Linus Torvalds

Sorry, please ignore this.
