LKML Archive on lore.kernel.org
* [PATCH] mm: thp: fix soft dirty for migration when split
@ 2018-12-06  8:46 Peter Xu
  2018-12-07  3:34 ` Peter Xu
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Xu @ 2018-12-06  8:46 UTC (permalink / raw)
  To: linux-kernel
  Cc: peterx, Andrea Arcangeli, Andrew Morton, Kirill A. Shutemov,
	Matthew Wilcox, Michal Hocko, Dave Jiang, Aneesh Kumar K.V,
	Souptick Joarder, Konstantin Khlebnikov, linux-mm

When splitting a huge migrating PMD, we transfer the soft dirty bit
from the huge page to the small pages.  However, we may be reading the
wrong data, because the bit is fetched with pmd_soft_dirty() even when
the PMD is a migration entry.  Use pmd_swp_soft_dirty() in that case.

CC: Andrea Arcangeli <aarcange@redhat.com>
CC: Andrew Morton <akpm@linux-foundation.org>
CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
CC: Matthew Wilcox <willy@infradead.org>
CC: Michal Hocko <mhocko@suse.com>
CC: Dave Jiang <dave.jiang@intel.com>
CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
CC: Souptick Joarder <jrdr.linux@gmail.com>
CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
CC: linux-mm@kvack.org
CC: linux-kernel@vger.kernel.org
Signed-off-by: Peter Xu <peterx@redhat.com>
---

I noticed this during code reading.  Only compile tested.  I'm sending
a patch directly for review comments since it's relatively
straightforward and not easy to test.  Please have a look, thanks.
---
 mm/huge_memory.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f2d19e4fe854..fb0787c3dd3b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2161,7 +2161,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		SetPageDirty(page);
 	write = pmd_write(old_pmd);
 	young = pmd_young(old_pmd);
-	soft_dirty = pmd_soft_dirty(old_pmd);
+	if (unlikely(pmd_migration))
+		soft_dirty = pmd_swp_soft_dirty(old_pmd);
+	else
+		soft_dirty = pmd_soft_dirty(old_pmd);
 
 	/*
 	 * Withdraw the table only after we mark the pmd entry invalid.
-- 
2.17.1



* Re: [PATCH] mm: thp: fix soft dirty for migration when split
  2018-12-06  8:46 [PATCH] mm: thp: fix soft dirty for migration when split Peter Xu
@ 2018-12-07  3:34 ` Peter Xu
       [not found]   ` <CALYGNiMjWDL6XaOFgfrM1WR6_GnmxfLBXwJ=YYGVNfEKNX0MfQ@mail.gmail.com>
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Xu @ 2018-12-07  3:34 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrea Arcangeli, Andrew Morton, Kirill A. Shutemov,
	Matthew Wilcox, Michal Hocko, Dave Jiang, Aneesh Kumar K.V,
	Souptick Joarder, Konstantin Khlebnikov, linux-mm

On Thu, Dec 06, 2018 at 04:46:04PM +0800, Peter Xu wrote:
> When splitting a huge migrating PMD, we transfer the soft dirty bit
> from the huge page to the small pages.  However, we may be reading the
> wrong data, because the bit is fetched with pmd_soft_dirty() even when
> the PMD is a migration entry.  Use pmd_swp_soft_dirty() in that case.

Note that, if my understanding of the problem is correct, then without
the patch there is a chance of losing some of the dirty bits of the
migrating pmd pages (on x86_64 we're fetching bit 11, which is part of
the swap offset, instead of bit 2), and that could potentially corrupt
the memory of a userspace program which depends on the dirty bit.
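
The mismatch described above can be sketched in userspace with a toy
64-bit entry.  This is only an illustration under stated assumptions:
the bit positions (11 for the present-PMD soft dirty bit, 2 for the
swap-entry one) are simply the ones quoted in the note above and are
not checked against the kernel headers, the offset is assumed to start
at bit 9, and every toy_* name is a hypothetical stand-in rather than
a kernel helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in the note above (assumed, not verified). */
#define TOY_PRESENT_SOFT_DIRTY_BIT 11
#define TOY_SWP_SOFT_DIRTY_BIT      2
/* Assumed start of the swap offset field in this toy layout. */
#define TOY_SWP_OFFSET_SHIFT        9

/* Hypothetical stand-in for pmd_soft_dirty(): reads the present-PMD bit. */
static bool toy_pmd_soft_dirty(uint64_t entry)
{
	return (entry >> TOY_PRESENT_SOFT_DIRTY_BIT) & 1;
}

/* Hypothetical stand-in for pmd_swp_soft_dirty(): reads the swap-entry bit. */
static bool toy_pmd_swp_soft_dirty(uint64_t entry)
{
	return (entry >> TOY_SWP_SOFT_DIRTY_BIT) & 1;
}

/* Build a toy migration entry from an offset and a soft dirty flag. */
static uint64_t toy_make_migration_entry(uint64_t offset, bool soft_dirty)
{
	uint64_t entry = offset << TOY_SWP_OFFSET_SHIFT;

	if (soft_dirty)
		entry |= UINT64_C(1) << TOY_SWP_SOFT_DIRTY_BIT;
	return entry;
}
```

With this layout, an entry built with offset 0 and soft_dirty set reads
as clean through toy_pmd_soft_dirty() (the bit is lost), while an entry
with offset 0x4 and soft_dirty clear reads as dirty, because offset bit
2 lands on bit 11.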

> 
> CC: Andrea Arcangeli <aarcange@redhat.com>
> CC: Andrew Morton <akpm@linux-foundation.org>
> CC: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> CC: Matthew Wilcox <willy@infradead.org>
> CC: Michal Hocko <mhocko@suse.com>
> CC: Dave Jiang <dave.jiang@intel.com>
> CC: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> CC: Souptick Joarder <jrdr.linux@gmail.com>
> CC: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> CC: linux-mm@kvack.org
> CC: linux-kernel@vger.kernel.org
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> 
> I noticed this during code reading.  Only compile tested.  I'm sending
> a patch directly for review comments since it's relatively
> straightforward and not easy to test.  Please have a look, thanks.
> ---
>  mm/huge_memory.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index f2d19e4fe854..fb0787c3dd3b 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2161,7 +2161,10 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
>  		SetPageDirty(page);
>  	write = pmd_write(old_pmd);
>  	young = pmd_young(old_pmd);
> -	soft_dirty = pmd_soft_dirty(old_pmd);
> +	if (unlikely(pmd_migration))
> +		soft_dirty = pmd_swp_soft_dirty(old_pmd);
> +	else
> +		soft_dirty = pmd_soft_dirty(old_pmd);
>  
>  	/*
>  	 * Withdraw the table only after we mark the pmd entry invalid.
> -- 
> 2.17.1
> 

Regards,

-- 
Peter Xu


* Re: [PATCH] mm: thp: fix soft dirty for migration when split
       [not found]   ` <CALYGNiMjWDL6XaOFgfrM1WR6_GnmxfLBXwJ=YYGVNfEKNX0MfQ@mail.gmail.com>
@ 2018-12-11  4:48     ` Peter Xu
  2018-12-11 13:12       ` Konstantin Khlebnikov
  0 siblings, 1 reply; 4+ messages in thread
From: Peter Xu @ 2018-12-11  4:48 UTC (permalink / raw)
  To: Konstantin Khlebnikov
  Cc: Linux Kernel Mailing List, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, Matthew Wilcox, Michal Hocko, dave.jiang,
	Aneesh Kumar K.V, jrdr.linux,
	Константин
	Хлебников,
	linux-mm

On Mon, Dec 10, 2018 at 07:50:52PM +0300, Konstantin Khlebnikov wrote:
> On Fri, Dec 7, 2018 at 6:34 AM Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Dec 06, 2018 at 04:46:04PM +0800, Peter Xu wrote:
> > > When splitting a huge migrating PMD, we transfer the soft dirty bit
> > > from the huge page to the small pages.  However, we may be reading the
> > > wrong data, because the bit is fetched with pmd_soft_dirty() even when
> > > the PMD is a migration entry.  Use pmd_swp_soft_dirty() in that case.
> >
> > Note that, if my understanding of the problem is correct, then without
> > the patch there is a chance of losing some of the dirty bits of the
> > migrating pmd pages (on x86_64 we're fetching bit 11, which is part of
> > the swap offset, instead of bit 2), and that could potentially corrupt
> > the memory of a userspace program which depends on the dirty bit.
> 
> It seems this code is broken in the pmd_migration case:
> 
> 	old_pmd = pmdp_invalidate(vma, haddr, pmd);
> 
> #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> 	pmd_migration = is_pmd_migration_entry(old_pmd);
> 	if (pmd_migration) {
> 		swp_entry_t entry;
> 
> 		entry = pmd_to_swp_entry(old_pmd);
> 		page = pfn_to_page(swp_offset(entry));
> 	} else
> #endif
> 		page = pmd_page(old_pmd);
> 	VM_BUG_ON_PAGE(!page_count(page), page);
> 	page_ref_add(page, HPAGE_PMD_NR - 1);
> 	if (pmd_dirty(old_pmd))
> 		SetPageDirty(page);
> 	write = pmd_write(old_pmd);
> 	young = pmd_young(old_pmd);
> 	soft_dirty = pmd_soft_dirty(old_pmd);
> 
> Not just soft_dirty - all bits (dirty, write, young) have a different
> encoding or are not present at all for a migration entry.

Hi, Konstantin,

Actually I noticed that, but I thought it was harmless since neither
the write nor the young flag is used when setting up the small pages
if pmd_migration==true.  But indeed there's at least one unexpected
side effect that I missed: an extra call to SetPageDirty().
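
The point about encodings can be pictured with a small userspace model
that reads every flag from the representation matching the entry type.
This is a rough sketch, not the reposted patch: struct toy_pmd,
struct toy_split_flags and toy_decode() are hypothetical names, and
the choice to report no dirty/young state for a migration entry is an
assumption made for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a PMD; field names are hypothetical, not kernel structures. */
struct toy_pmd {
	bool is_migration;       /* entry is a migration (non-present) entry */
	bool present_dirty;      /* meaningful only for a present PMD */
	bool present_write;
	bool present_young;
	bool present_soft_dirty;
	bool swp_write;          /* meaningful only for a migration entry */
	bool swp_soft_dirty;
};

/* Flags that the split would carry over to the small pages. */
struct toy_split_flags {
	bool set_page_dirty;
	bool write;
	bool young;
	bool soft_dirty;
};

/* Read each flag from the encoding that matches the entry type. */
static struct toy_split_flags toy_decode(const struct toy_pmd *pmd)
{
	struct toy_split_flags f = { false, false, false, false };

	if (pmd->is_migration) {
		/* Assumed: no dirty/young information in a migration entry. */
		f.write = pmd->swp_write;
		f.soft_dirty = pmd->swp_soft_dirty;
	} else {
		f.set_page_dirty = pmd->present_dirty;
		f.write = pmd->present_write;
		f.young = pmd->present_young;
		f.soft_dirty = pmd->present_soft_dirty;
	}
	return f;
}
```

In this model a migration entry never triggers the dirty side effect,
whatever happens to sit in the present_* fields.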

I'll repost soon.  Thanks!

-- 
Peter Xu


* Re: [PATCH] mm: thp: fix soft dirty for migration when split
  2018-12-11  4:48     ` Peter Xu
@ 2018-12-11 13:12       ` Konstantin Khlebnikov
  0 siblings, 0 replies; 4+ messages in thread
From: Konstantin Khlebnikov @ 2018-12-11 13:12 UTC (permalink / raw)
  To: peterx
  Cc: Linux Kernel Mailing List, Andrea Arcangeli, Andrew Morton,
	Kirill A. Shutemov, Matthew Wilcox, Michal Hocko, dave.jiang,
	Aneesh Kumar K.V, Souptick Joarder,
	Константин
	Хлебников,
	linux-mm

On Tue, Dec 11, 2018 at 7:48 AM Peter Xu <peterx@redhat.com> wrote:
>
> On Mon, Dec 10, 2018 at 07:50:52PM +0300, Konstantin Khlebnikov wrote:
> > On Fri, Dec 7, 2018 at 6:34 AM Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Dec 06, 2018 at 04:46:04PM +0800, Peter Xu wrote:
> > > > > When splitting a huge migrating PMD, we transfer the soft dirty bit
> > > > > from the huge page to the small pages.  However, we may be reading the
> > > > > wrong data, because the bit is fetched with pmd_soft_dirty() even when
> > > > > the PMD is a migration entry.  Use pmd_swp_soft_dirty() in that case.
> > >
> > > > Note that, if my understanding of the problem is correct, then without
> > > > the patch there is a chance of losing some of the dirty bits of the
> > > > migrating pmd pages (on x86_64 we're fetching bit 11, which is part of
> > > > the swap offset, instead of bit 2), and that could potentially corrupt
> > > > the memory of a userspace program which depends on the dirty bit.
> >
> > > It seems this code is broken in the pmd_migration case:
> >
> > > 	old_pmd = pmdp_invalidate(vma, haddr, pmd);
> > >
> > > #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
> > > 	pmd_migration = is_pmd_migration_entry(old_pmd);
> > > 	if (pmd_migration) {
> > > 		swp_entry_t entry;
> > >
> > > 		entry = pmd_to_swp_entry(old_pmd);
> > > 		page = pfn_to_page(swp_offset(entry));
> > > 	} else
> > > #endif
> > > 		page = pmd_page(old_pmd);
> > > 	VM_BUG_ON_PAGE(!page_count(page), page);
> > > 	page_ref_add(page, HPAGE_PMD_NR - 1);
> > > 	if (pmd_dirty(old_pmd))
> > > 		SetPageDirty(page);
> > > 	write = pmd_write(old_pmd);
> > > 	young = pmd_young(old_pmd);
> > > 	soft_dirty = pmd_soft_dirty(old_pmd);
> >
> > > Not just soft_dirty - all bits (dirty, write, young) have a different
> > > encoding or are not present at all for a migration entry.
>
> Hi, Konstantin,
>
> > Actually I noticed that, but I thought it was harmless since neither
> > the write nor the young flag is used when setting up the small pages
> > if pmd_migration==true.  But indeed there's at least one unexpected
> > side effect that I missed: an extra call to SetPageDirty().

"write" is used when making the smaller migration entries:

swp_entry = make_migration_entry(page + i, write);
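
As a rough userspace picture of why this matters: whatever value
"write" holds is copied into every per-subpage entry, so a flag read
from the wrong encoding is multiplied across all small pages.  The
names below (toy_entry_type, toy_swp_entry, toy_make_entry, toy_split)
are hypothetical and only mimic the shape of the kernel call quoted
above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy entry kinds: a migration entry is either read-only or writable. */
enum toy_entry_type { TOY_MIGRATION_READ, TOY_MIGRATION_WRITE };

struct toy_swp_entry {
	enum toy_entry_type type;
	unsigned long pfn;
};

/* Hypothetical analogue of make_migration_entry(page + i, write). */
static struct toy_swp_entry toy_make_entry(unsigned long pfn, bool write)
{
	struct toy_swp_entry e = {
		write ? TOY_MIGRATION_WRITE : TOY_MIGRATION_READ, pfn
	};
	return e;
}

/* Fill one entry per small page; every entry inherits the same write flag. */
static void toy_split(unsigned long head_pfn, bool write,
		      struct toy_swp_entry *out, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		out[i] = toy_make_entry(head_pfn + i, write);
}
```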

>
>
> I'll repost soon.  Thanks!
>
> --
> Peter Xu

