* [PATCH] mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
From: Huang, Ying @ 2020-02-20  7:52 UTC
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, Huang Ying, Zi Yan, Kirill A . Shutemov,
	Andrea Arcangeli, Michal Hocko, Vlastimil Babka

From: Huang Ying <ying.huang@intel.com>

In set_pmd_migration_entry(), pmdp_invalidate() is used to change the
PMD atomically.  But the PMD is read before that with an ordinary
memory read.  If the THP (transparent huge page) is written between
the PMD read and pmdp_invalidate(), the PMD dirty bit may be lost,
causing data corruption.  The race window is quite small, but the race
is still possible in theory, so it needs to be fixed.

The race is fixed by using the return value of pmdp_invalidate() to
get the original content of the PMD.  pmdp_invalidate() is an atomic
read/modify/write operation, so no THP write can occur in between.

The race was introduced when THP migration support was added in
commit 616b8371539a ("mm: thp: enable thp migration in generic
path").  But this fix depends on commit d52605d7cb30 ("mm: do not
lose dirty and accessed bits in pmdp_invalidate()"), so the fix is
easy to backport to kernels after v4.16.  The race window is really
small, though, so it may be fine not to backport the fix at all.

Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
---
 mm/huge_memory.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 580098e115bd..b1e069e68189 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3060,8 +3060,7 @@ void set_pmd_migration_entry(struct page_vma_mapped_walk *pvmw,
 		return;
 
 	flush_cache_range(vma, address, address + HPAGE_PMD_SIZE);
-	pmdval = *pvmw->pmd;
-	pmdp_invalidate(vma, address, pvmw->pmd);
+	pmdval = pmdp_invalidate(vma, address, pvmw->pmd);
 	if (pmd_dirty(pmdval))
 		set_page_dirty(page);
 	entry = make_migration_entry(page, pmd_write(pmdval));
-- 
2.25.0
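
The interleaving described in the patch can be sketched in userspace C.
This is only a toy model, not kernel code: the toy_* names and the
two-bit layout are hypothetical, the "hardware" store is injected
deterministically instead of racing on a second CPU, and
atomic_fetch_and() merely stands in for the architecture-specific
pmdp_invalidate().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout for this sketch; not a real PMD format. */
#define TOY_PRESENT (UINT64_C(1) << 0)
#define TOY_DIRTY   (UINT64_C(1) << 1)

typedef _Atomic uint64_t toy_pmd_t;

/*
 * Models a CPU store to the huge page: the dirty bit is set only while
 * the entry is still present.  Returns false when the entry is not
 * present, i.e. when the store would fault instead.
 */
static bool toy_hw_write(toy_pmd_t *pmd)
{
	uint64_t old = atomic_load(pmd);

	while (old & TOY_PRESENT) {
		if (atomic_compare_exchange_weak(pmd, &old, old | TOY_DIRTY))
			return true;
	}
	return false;
}

/*
 * Pre-patch pattern: ordinary read, then invalidate.  racing_write
 * injects a store into the window between the two steps.
 */
static uint64_t toy_read_then_invalidate(toy_pmd_t *pmd, bool racing_write)
{
	uint64_t val = atomic_load(pmd);	/* pmdval = *pvmw->pmd; */

	if (racing_write)
		toy_hw_write(pmd);
	atomic_fetch_and(pmd, ~TOY_PRESENT);	/* old value is discarded */
	return val;
}

/* Patched pattern: a single atomic read/modify/write returns the old value. */
static uint64_t toy_invalidate(toy_pmd_t *pmd)
{
	return atomic_fetch_and(pmd, ~TOY_PRESENT);
}

/* Walks through the three interleavings; returns 0 if all checks hold. */
static int toy_demo(void)
{
	toy_pmd_t pmd;
	uint64_t val;

	/* Pre-patch, store lands in the window: snapshot misses DIRTY. */
	atomic_init(&pmd, TOY_PRESENT);
	val = toy_read_then_invalidate(&pmd, true);
	assert(!(val & TOY_DIRTY));
	assert(atomic_load(&pmd) & TOY_DIRTY);

	/* Patched, store lands before the invalidate: DIRTY is returned. */
	atomic_store(&pmd, TOY_PRESENT);
	toy_hw_write(&pmd);
	val = toy_invalidate(&pmd);
	assert(val & TOY_DIRTY);

	/* Patched, store lands after the invalidate: it faults, nothing lost. */
	atomic_store(&pmd, TOY_PRESENT);
	val = toy_invalidate(&pmd);
	assert(!toy_hw_write(&pmd));
	assert(!(atomic_load(&pmd) & TOY_DIRTY));
	return 0;
}
```

Calling toy_demo() from a main() exercises all three cases.  In the
first case the caller would skip set_page_dirty() although the page was
written, which is the lost dirty bit the patch is about.  In the
patched cases the store is either visible in the returned value or
happens after the entry is no longer present.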



* Re: [PATCH] mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
From: William Kucharski @ 2020-02-20 10:22 UTC
  To: Huang, Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Zi Yan,
	Kirill A . Shutemov, Andrea Arcangeli, Michal Hocko,
	Vlastimil Babka


> On Feb 20, 2020, at 12:52 AM, Huang, Ying <ying.huang@intel.com> wrote:
> 
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>

Looks good to me.

Reviewed-by: William Kucharski <william.kucharski@oracle.com>


* Re: [PATCH] mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
From: Kirill A. Shutemov @ 2020-02-20 13:13 UTC
  To: Huang, Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Zi Yan,
	Kirill A . Shutemov, Andrea Arcangeli, Michal Hocko,
	Vlastimil Babka

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

-- 
 Kirill A. Shutemov


* Re: [PATCH] mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
From: Zi Yan @ 2020-02-20 13:18 UTC
  To: Huang, Ying
  Cc: Andrew Morton, linux-mm, linux-kernel, Kirill A . Shutemov,
	Andrea Arcangeli, Michal Hocko, Vlastimil Babka

Looks good to me. Thanks.

Reviewed-by: Zi Yan <ziy@nvidia.com>


—
Best Regards,
Yan Zi


* Re: [PATCH] mm: Fix possible PMD dirty bit lost in set_pmd_migration_entry()
From: Andrew Morton @ 2020-02-21  0:55 UTC
  To: Huang, Ying
  Cc: linux-mm, linux-kernel, Zi Yan, Kirill A . Shutemov,
	Andrea Arcangeli, Michal Hocko, Vlastimil Babka

On Thu, 20 Feb 2020 15:52:20 +0800 "Huang, Ying" <ying.huang@intel.com> wrote:

> The race was introduced when THP migration support was added in
> commit 616b8371539a ("mm: thp: enable thp migration in generic
> path").  But this fix depends on commit d52605d7cb30 ("mm: do not
> lose dirty and accessed bits in pmdp_invalidate()"), so the fix is
> easy to backport to kernels after v4.16.  The race window is really
> small, though, so it may be fine not to backport the fix at all.

Thanks.  I'm inclined to add a cc:stable to this one.  Silent data corruption is
pretty serious.

