linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
@ 2019-09-03  8:27 sunqiuyang
  2019-09-03 13:17 ` Michal Hocko
  2019-09-10 19:23 ` Minchan Kim
  0 siblings, 2 replies; 13+ messages in thread
From: sunqiuyang @ 2019-09-03  8:27 UTC (permalink / raw)
  To: linux-kernel, linux-mm; +Cc: sunqiuyang

From: Qiuyang Sun <sunqiuyang@huawei.com>

Currently, after a page is migrated, it
1) has its PG_isolated flag cleared in move_to_new_page(), and
2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
However, between steps 1) and 2), the page could be isolated by another
thread in isolate_movable_page(), and added to another LRU list, leading
to list_del corruption later.

This patch fixes the bug by moving list_del into the critical section
protected by lock_page(), so that a page will not be isolated again before
it has been deleted from its LRU list.

Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
---
 mm/migrate.c | 11 +++--------
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index a42858d..c58a606 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
+	if (rc != -EAGAIN)
+		list_del(&page->lru);
 	unlock_page(page);
 out:
 	/*
@@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 			put_new_page(newpage, private);
 		else
 			put_page(newpage);
+		list_del(&page->lru);
 		goto out;
 	}
 
@@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
 out:
 	if (rc != -EAGAIN) {
 		/*
-		 * A page that has been migrated has all references
-		 * removed and will be freed. A page that has not been
-		 * migrated will have kepts its references and be
-		 * restored.
-		 */
-		list_del(&page->lru);
-
-		/*
 		 * Compaction can migrate also non-LRU pages which are
 		 * not accounted to NR_ISOLATED_*. They can be recognized
 		 * as __PageMovable
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-03  8:27 [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages sunqiuyang
@ 2019-09-03 13:17 ` Michal Hocko
  2019-09-04  2:18   ` sunqiuyang
  2019-09-10 19:23 ` Minchan Kim
  1 sibling, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2019-09-03 13:17 UTC (permalink / raw)
  To: sunqiuyang; +Cc: linux-kernel, linux-mm

On Tue 03-09-19 16:27:46, sunqiuyang wrote:
> From: Qiuyang Sun <sunqiuyang@huawei.com>
> 
> Currently, after a page is migrated, it
> 1) has its PG_isolated flag cleared in move_to_new_page(), and
> 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> However, between steps 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.

Care to explain the race? Both paths use page_lock AFAICS
> 
> This patch fixes the bug by moving list_del into the critical section
> protected by lock_page(), so that a page will not be isolated again before
> it has been deleted from its LRU list.
> 
> Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
> ---
>  mm/migrate.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a42858d..c58a606 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>  	/* Drop an anon_vma reference if we took one */
>  	if (anon_vma)
>  		put_anon_vma(anon_vma);
> +	if (rc != -EAGAIN)
> +		list_del(&page->lru);
>  	unlock_page(page);
>  out:
>  	/*
> @@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  			put_new_page(newpage, private);
>  		else
>  			put_page(newpage);
> +		list_del(&page->lru);
>  		goto out;
>  	}
>  
> @@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  out:
>  	if (rc != -EAGAIN) {
>  		/*
> -		 * A page that has been migrated has all references
> -		 * removed and will be freed. A page that has not been
> -		 * migrated will have kepts its references and be
> -		 * restored.
> -		 */
> -		list_del(&page->lru);
> -
> -		/*
>  		 * Compaction can migrate also non-LRU pages which are
>  		 * not accounted to NR_ISOLATED_*. They can be recognized
>  		 * as __PageMovable
> -- 
> 1.8.3.1

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-03 13:17 ` Michal Hocko
@ 2019-09-04  2:18   ` sunqiuyang
  2019-09-04  6:38     ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: sunqiuyang @ 2019-09-04  2:18 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, linux-mm

The isolate path of non-lru movable pages:

isolate_migratepages_block
	isolate_movable_page
		trylock_page
		// if PageIsolated, goto out_no_isolated
		a_ops->isolate_page
		__SetPageIsolated
		unlock_page
	list_add(&page->lru, &cc->migratepages)

The migration path:

unmap_and_move
	__unmap_and_move
		lock_page
		move_to_new_page
			a_ops->migratepage
			__ClearPageIsolated
		unlock_page
	/* here, the page could be isolated again by another thread, and added into another cc->migratepages,
	since PG_Isolated has been cleared, and not protected by page_lock */
	list_del(&page->lru)

Suppose thread A isolates three pages in the order p1, p2, p3, A's cc->migratepages will be like
	head_A - p3 - p2 - p1
After p2 is migrated (but before list_del), it is isolated by another thread B. Then list_del will delete p2
from the cc->migratepages of B (instead of A). When A continues to migrate and delete p1, it will find:
	p1->prev == p2
	p2->next == LIST_POISON1. 

So we will end up with a bug like
"list_del corruption. prev->next should be ffffffbf0a1eb8e0, but was dead000000000100"
(see __list_del_entry_valid).


________________________________________
From: Michal Hocko [mhocko@kernel.org]
Sent: Tuesday, September 03, 2019 21:17
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

On Tue 03-09-19 16:27:46, sunqiuyang wrote:
> From: Qiuyang Sun <sunqiuyang@huawei.com>
>
> Currently, after a page is migrated, it
> 1) has its PG_isolated flag cleared in move_to_new_page(), and
> 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> However, between steps 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.

Care to explain the race? Both paths use page_lock AFAICS
>
> This patch fixes the bug by moving list_del into the critical section
> protected by lock_page(), so that a page will not be isolated again before
> it has been deleted from its LRU list.
>
> Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
> ---
>  mm/migrate.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a42858d..c58a606 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>       /* Drop an anon_vma reference if we took one */
>       if (anon_vma)
>               put_anon_vma(anon_vma);
> +     if (rc != -EAGAIN)
> +             list_del(&page->lru);
>       unlock_page(page);
>  out:
>       /*
> @@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>                       put_new_page(newpage, private);
>               else
>                       put_page(newpage);
> +             list_del(&page->lru);
>               goto out;
>       }
>
> @@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  out:
>       if (rc != -EAGAIN) {
>               /*
> -              * A page that has been migrated has all references
> -              * removed and will be freed. A page that has not been
> -              * migrated will have kepts its references and be
> -              * restored.
> -              */
> -             list_del(&page->lru);
> -
> -             /*
>                * Compaction can migrate also non-LRU pages which are
>                * not accounted to NR_ISOLATED_*. They can be recognized
>                * as __PageMovable
> --
> 1.8.3.1

--
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04  2:18   ` sunqiuyang
@ 2019-09-04  6:38     ` Michal Hocko
  2019-09-04  7:27       ` sunqiuyang
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2019-09-04  6:38 UTC (permalink / raw)
  To: sunqiuyang; +Cc: linux-kernel, linux-mm

On Wed 04-09-19 02:18:38, sunqiuyang wrote:
> The isolate path of non-lru movable pages:
> 
> isolate_migratepages_block
> 	isolate_movable_page
> 		trylock_page
> 		// if PageIsolated, goto out_no_isolated
> 		a_ops->isolate_page
> 		__SetPageIsolated
> 		unlock_page
> 	list_add(&page->lru, &cc->migratepages)
> 
> The migration path:
> 
> unmap_and_move
> 	__unmap_and_move
> 		lock_page
> 		move_to_new_page
> 			a_ops->migratepage
> 			__ClearPageIsolated
> 		unlock_page
> 	/* here, the page could be isolated again by another thread, and added into another cc->migratepages,
> 	since PG_Isolated has been cleared, and not protected by page_lock */
> 	list_del(&page->lru)

But the page has been migrated already and not freed yet because there
is still a pin on it. So nobody should be touching it at this stage.
Or do I still miss something?
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04  6:38     ` Michal Hocko
@ 2019-09-04  7:27       ` sunqiuyang
  2019-09-04  8:14         ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: sunqiuyang @ 2019-09-04  7:27 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, linux-mm

isolate_migratepages_block() from another thread may try to isolate the page again:

for (; low_pfn < end_pfn; low_pfn++) {
  /* ... */
  page = pfn_to_page(low_pfn);
 /* ... */
  if (!PageLRU(page)) {
    if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
        /* ... */
        if (!isolate_movable_page(page, isolate_mode))
          goto isolate_success;
      /*... */
isolate_success:
     list_add(&page->lru, &cc->migratepages);

And this page will be added to another list.
Or, do you see any reason that the page cannot go through this path?
________________________________________
From: Michal Hocko [mhocko@kernel.org]
Sent: Wednesday, September 04, 2019 14:38
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

On Wed 04-09-19 02:18:38, sunqiuyang wrote:
> The isolate path of non-lru movable pages:
>
> isolate_migratepages_block
>       isolate_movable_page
>               trylock_page
>               // if PageIsolated, goto out_no_isolated
>               a_ops->isolate_page
>               __SetPageIsolated
>               unlock_page
>       list_add(&page->lru, &cc->migratepages)
>
> The migration path:
>
> unmap_and_move
>       __unmap_and_move
>               lock_page
>               move_to_new_page
>                       a_ops->migratepage
>                       __ClearPageIsolated
>               unlock_page
>       /* here, the page could be isolated again by another thread, and added into another cc->migratepages,
>       since PG_Isolated has been cleared, and not protected by page_lock */
>       list_del(&page->lru)

But the page has been migrated already and not freed yet because there
is still a pin on it. So nobody should be touching it at this stage.
Or do I still miss something?
--
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04  7:27       ` sunqiuyang
@ 2019-09-04  8:14         ` Michal Hocko
  2019-09-04 12:19           ` sunqiuyang
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2019-09-04  8:14 UTC (permalink / raw)
  To: sunqiuyang; +Cc: linux-kernel, linux-mm

Do not top post please

On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> isolate_migratepages_block() from another thread may try to isolate the page again:
> 
> for (; low_pfn < end_pfn; low_pfn++) {
>   /* ... */
>   page = pfn_to_page(low_pfn);
>  /* ... */
>   if (!PageLRU(page)) {
>     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
>         /* ... */
>         if (!isolate_movable_page(page, isolate_mode))
>           goto isolate_success;
>       /*... */
> isolate_success:
>      list_add(&page->lru, &cc->migratepages);
> 
> And this page will be added to another list.
> Or, do you see any reason that the page cannot go through this path?

The page shouldn't be __PageMovable after the migration is done. All the
state should have been transfered to the new page IIUC.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04  8:14         ` Michal Hocko
@ 2019-09-04 12:19           ` sunqiuyang
  2019-09-04 12:52             ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: sunqiuyang @ 2019-09-04 12:19 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, linux-mm


________________________________________
From: Michal Hocko [mhocko@kernel.org]
Sent: Wednesday, September 04, 2019 16:14
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

Do not top post please

On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> isolate_migratepages_block() from another thread may try to isolate the page again:
>
> for (; low_pfn < end_pfn; low_pfn++) {
>   /* ... */
>   page = pfn_to_page(low_pfn);
>  /* ... */
>   if (!PageLRU(page)) {
>     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
>         /* ... */
>         if (!isolate_movable_page(page, isolate_mode))
>           goto isolate_success;
>       /*... */
> isolate_success:
>      list_add(&page->lru, &cc->migratepages);
>
> And this page will be added to another list.
> Or, do you see any reason that the page cannot go through this path?

The page shouldn't be __PageMovable after the migration is done. All the
state should have been transfered to the new page IIUC.

----
I don't see where page->mapping is modified after the migration is done. 

Actually, the last comment in move_to_new_page() says,
"Anonymous and movable page->mapping will be cleard by
free_pages_prepare so don't reset it here for keeping
the type to work PageAnon, for example. "

Or did I miss something? Thanks,

--
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04 12:19           ` sunqiuyang
@ 2019-09-04 12:52             ` Michal Hocko
  2019-09-05  1:44               ` sunqiuyang
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2019-09-04 12:52 UTC (permalink / raw)
  To: sunqiuyang; +Cc: linux-kernel, linux-mm

On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > Do not top post please
> > 
> > On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> > > isolate_migratepages_block() from another thread may try to isolate the page again:
> > >
> > > for (; low_pfn < end_pfn; low_pfn++) {
> > >   /* ... */
> > >   page = pfn_to_page(low_pfn);
> > >  /* ... */
> > >   if (!PageLRU(page)) {
> > >     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
> > >         /* ... */
> > >         if (!isolate_movable_page(page, isolate_mode))
> > >           goto isolate_success;
> > >       /*... */
> > > isolate_success:
> > >      list_add(&page->lru, &cc->migratepages);
> > >
> > > And this page will be added to another list.
> > > Or, do you see any reason that the page cannot go through this path?
> > 
> > The page shouldn't be __PageMovable after the migration is done. All the
> > state should have been transfered to the new page IIUC.
> > 
>
> I don't see where page->mapping is modified after the migration is done. 
> 
> Actually, the last comment in move_to_new_page() says,
> "Anonymous and movable page->mapping will be cleard by
> free_pages_prepare so don't reset it here for keeping
> the type to work PageAnon, for example. "
> 
> Or did I miss something? Thanks,

This talks about mapping rather than flags stored in the mapping.
I can see that in tree migration handlers (z3fold_page_migrate,
vmballoon_migratepage via balloon_page_delete, zs_page_migrate via
reset_page) all reset the movable flag. I am not sure whether that is a
documented requirement or just a coincidence. Maybe it should be
documented. I would like to hear from Minchan.

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-04 12:52             ` Michal Hocko
@ 2019-09-05  1:44               ` sunqiuyang
  2019-09-09  8:40                 ` Michal Hocko
  0 siblings, 1 reply; 13+ messages in thread
From: sunqiuyang @ 2019-09-05  1:44 UTC (permalink / raw)
  To: Michal Hocko; +Cc: linux-kernel, linux-mm


________________________________________
From: Michal Hocko [mhocko@kernel.org]
Sent: Wednesday, September 04, 2019 20:52
To: sunqiuyang
Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages

On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > Do not top post please
> >
> > On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> > > isolate_migratepages_block() from another thread may try to isolate the page again:
> > >
> > > for (; low_pfn < end_pfn; low_pfn++) {
> > >   /* ... */
> > >   page = pfn_to_page(low_pfn);
> > >  /* ... */
> > >   if (!PageLRU(page)) {
> > >     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
> > >         /* ... */
> > >         if (!isolate_movable_page(page, isolate_mode))
> > >           goto isolate_success;
> > >       /*... */
> > > isolate_success:
> > >      list_add(&page->lru, &cc->migratepages);
> > >
> > > And this page will be added to another list.
> > > Or, do you see any reason that the page cannot go through this path?
> >
> > The page shouldn't be __PageMovable after the migration is done. All the
> > state should have been transfered to the new page IIUC.
> >
>
> I don't see where page->mapping is modified after the migration is done.
>
> Actually, the last comment in move_to_new_page() says,
> "Anonymous and movable page->mapping will be cleard by
> free_pages_prepare so don't reset it here for keeping
> the type to work PageAnon, for example. "
>
> Or did I miss something? Thanks,

This talks about mapping rather than flags stored in the mapping.
I can see that in tree migration handlers (z3fold_page_migrate,
vmballoon_migratepage via balloon_page_delete, zs_page_migrate via
reset_page) all reset the movable flag. I am not sure whether that is a
documented requirement or just a coincidence. Maybe it should be
documented. I would like to hear from Minchan.

---
I checked the three migration handlers and only found __ClearPageMovable,
which clears registered address_space val with keeping PAGE_MAPPING_MOVABLE flag,
so the page should still be __PageMovable when caught by another migration thread. Right?

---

--
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-05  1:44               ` sunqiuyang
@ 2019-09-09  8:40                 ` Michal Hocko
  2019-09-12 17:21                   ` Minchan Kim
  0 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2019-09-09  8:40 UTC (permalink / raw)
  To: sunqiuyang, Minchan Kim; +Cc: linux-kernel, linux-mm

On Thu 05-09-19 01:44:12, sunqiuyang wrote:
> > 
> > ________________________________________
> > From: Michal Hocko [mhocko@kernel.org]
> > Sent: Wednesday, September 04, 2019 20:52
> > To: sunqiuyang
> > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
> > Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
> > 
> > On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > > > Do not top post please
> > > >
> > > > On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> > > > > isolate_migratepages_block() from another thread may try to isolate the page again:
> > > > >
> > > > > for (; low_pfn < end_pfn; low_pfn++) {
> > > > >   /* ... */
> > > > >   page = pfn_to_page(low_pfn);
> > > > >  /* ... */
> > > > >   if (!PageLRU(page)) {
> > > > >     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
> > > > >         /* ... */
> > > > >         if (!isolate_movable_page(page, isolate_mode))
> > > > >           goto isolate_success;
> > > > >       /*... */
> > > > > isolate_success:
> > > > >      list_add(&page->lru, &cc->migratepages);
> > > > >
> > > > > And this page will be added to another list.
> > > > > Or, do you see any reason that the page cannot go through this path?
> > > >
> > > > The page shouldn't be __PageMovable after the migration is done. All the
> > > > state should have been transfered to the new page IIUC.
> > > >
> > >
> > > I don't see where page->mapping is modified after the migration is done.
> > >
> > > Actually, the last comment in move_to_new_page() says,
> > > "Anonymous and movable page->mapping will be cleard by
> > > free_pages_prepare so don't reset it here for keeping
> > > the type to work PageAnon, for example. "
> > >
> > > Or did I miss something? Thanks,
> > 
> > This talks about mapping rather than flags stored in the mapping.
> > I can see that in tree migration handlers (z3fold_page_migrate,
> > vmballoon_migratepage via balloon_page_delete, zs_page_migrate via
> > reset_page) all reset the movable flag. I am not sure whether that is a
> > documented requirement or just a coincidence. Maybe it should be
> > documented. I would like to hear from Minchan.
> 
> I checked the three migration handlers and only found __ClearPageMovable,
> which clears registered address_space val with keeping PAGE_MAPPING_MOVABLE flag,
> so the page should still be __PageMovable when caught by another migration thread. Right?

Minchan, could you have a look at this please? I find __PageMovable
semantic really awkward and I do not want to make misleading statements
here.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-03  8:27 [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages sunqiuyang
  2019-09-03 13:17 ` Michal Hocko
@ 2019-09-10 19:23 ` Minchan Kim
  2019-09-10 19:31   ` Michal Hocko
  1 sibling, 1 reply; 13+ messages in thread
From: Minchan Kim @ 2019-09-10 19:23 UTC (permalink / raw)
  To: sunqiuyang; +Cc: linux-kernel, linux-mm

On Tue, Sep 03, 2019 at 04:27:46PM +0800, sunqiuyang wrote:
> From: Qiuyang Sun <sunqiuyang@huawei.com>
> 
> Currently, after a page is migrated, it
> 1) has its PG_isolated flag cleared in move_to_new_page(), and
> 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> However, between steps 1) and 2), the page could be isolated by another
> thread in isolate_movable_page(), and added to another LRU list, leading
> to list_del corruption later.

Once non-LRU page is migrated out successfully, driver should clear
the movable flag in the page. Look at reset_page in zs_page_migrate.
So, other thread couldn't isolate the page during the window.

If I miss something, let me know it.
Thanks.

> 
> This patch fixes the bug by moving list_del into the critical section
> protected by lock_page(), so that a page will not be isolated again before
> it has been deleted from its LRU list.
> 
> Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
> ---
>  mm/migrate.c | 11 +++--------
>  1 file changed, 3 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index a42858d..c58a606 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1124,6 +1124,8 @@ static int __unmap_and_move(struct page *page, struct page *newpage,
>  	/* Drop an anon_vma reference if we took one */
>  	if (anon_vma)
>  		put_anon_vma(anon_vma);
> +	if (rc != -EAGAIN)
> +		list_del(&page->lru);
>  	unlock_page(page);
>  out:
>  	/*
> @@ -1190,6 +1192,7 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  			put_new_page(newpage, private);
>  		else
>  			put_page(newpage);
> +		list_del(&page->lru);
>  		goto out;
>  	}
>  
> @@ -1200,14 +1203,6 @@ static ICE_noinline int unmap_and_move(new_page_t get_new_page,
>  out:
>  	if (rc != -EAGAIN) {
>  		/*
> -		 * A page that has been migrated has all references
> -		 * removed and will be freed. A page that has not been
> -		 * migrated will have kepts its references and be
> -		 * restored.
> -		 */
> -		list_del(&page->lru);
> -
> -		/*
>  		 * Compaction can migrate also non-LRU pages which are
>  		 * not accounted to NR_ISOLATED_*. They can be recognized
>  		 * as __PageMovable
> -- 
> 1.8.3.1
> 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-10 19:23 ` Minchan Kim
@ 2019-09-10 19:31   ` Michal Hocko
  0 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2019-09-10 19:31 UTC (permalink / raw)
  To: Minchan Kim; +Cc: sunqiuyang, linux-kernel, linux-mm

On Tue 10-09-19 12:23:04, Minchan Kim wrote:
> On Tue, Sep 03, 2019 at 04:27:46PM +0800, sunqiuyang wrote:
> > From: Qiuyang Sun <sunqiuyang@huawei.com>
> > 
> > Currently, after a page is migrated, it
> > 1) has its PG_isolated flag cleared in move_to_new_page(), and
> > 2) is deleted from its LRU list (cc->migratepages) in unmap_and_move().
> > However, between steps 1) and 2), the page could be isolated by another
> > thread in isolate_movable_page(), and added to another LRU list, leading
> > to list_del corruption later.
> 
> Once non-LRU page is migrated out successfully, driver should clear
> the movable flag in the page. Look at reset_page in zs_page_migrate.
> So, other thread couldn't isolate the page during the window.
> 
> If I miss something, let me know it.

Please have a look at http://lkml.kernel.org/r/157FC541501A9C4C862B2F16FFE316DC190C5990@dggeml512-mbx.china.huawei.com
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
  2019-09-09  8:40                 ` Michal Hocko
@ 2019-09-12 17:21                   ` Minchan Kim
  0 siblings, 0 replies; 13+ messages in thread
From: Minchan Kim @ 2019-09-12 17:21 UTC (permalink / raw)
  To: Michal Hocko; +Cc: sunqiuyang, linux-kernel, linux-mm

On Mon, Sep 09, 2019 at 10:40:29AM +0200, Michal Hocko wrote:
> On Thu 05-09-19 01:44:12, sunqiuyang wrote:
> > > 
> > > ________________________________________
> > > From: Michal Hocko [mhocko@kernel.org]
> > > Sent: Wednesday, September 04, 2019 20:52
> > > To: sunqiuyang
> > > Cc: linux-kernel@vger.kernel.org; linux-mm@kvack.org
> > > Subject: Re: [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages
> > > 
> > > On Wed 04-09-19 12:19:11, sunqiuyang wrote:
> > > > > Do not top post please
> > > > >
> > > > > On Wed 04-09-19 07:27:25, sunqiuyang wrote:
> > > > > > isolate_migratepages_block() from another thread may try to isolate the page again:
> > > > > >
> > > > > > for (; low_pfn < end_pfn; low_pfn++) {
> > > > > >   /* ... */
> > > > > >   page = pfn_to_page(low_pfn);
> > > > > >  /* ... */
> > > > > >   if (!PageLRU(page)) {
> > > > > >     if (unlikely(__PageMovable(page)) && !PageIsolated(page)) {
> > > > > >         /* ... */
> > > > > >         if (!isolate_movable_page(page, isolate_mode))
> > > > > >           goto isolate_success;
> > > > > >       /*... */
> > > > > > isolate_success:
> > > > > >      list_add(&page->lru, &cc->migratepages);
> > > > > >
> > > > > > And this page will be added to another list.
> > > > > > Or, do you see any reason that the page cannot go through this path?
> > > > >
> > > > > The page shouldn't be __PageMovable after the migration is done. All the
> > > > > state should have been transfered to the new page IIUC.
> > > > >
> > > >
> > > > I don't see where page->mapping is modified after the migration is done.

Look at __ClearPageMovable which modify page->mapping.
Once driver is migrated the page successfully, it should call __ClearPageMovable.
To not consume new a page flag at that time, this flag is stored at page->mapping
since we already have squeezed several flags in there.

> > > >
> > > > Actually, the last comment in move_to_new_page() says,
> > > > "Anonymous and movable page->mapping will be cleard by
> > > > free_pages_prepare so don't reset it here for keeping
> > > > the type to work PageAnon, for example. "
> > > >
> > > > Or did I miss something? Thanks,
> > > 
> > > This talks about mapping rather than flags stored in the mapping.
> > > I can see that in tree migration handlers (z3fold_page_migrate,
> > > vmballoon_migratepage via balloon_page_delete, zs_page_migrate via
> > > reset_page) all reset the movable flag. I am not sure whether that is a
> > > documented requirement or just a coincidence. Maybe it should be
> > > documented. I would like to hear from Minchan.

It is intended. See Documentation/vm/page_migration.rst

   After isolation, VM calls migratepage of driver with isolated page.
   The function of migratepage is to move content of the old page to new page
   and set up fields of struct page newpage. Keep in mind that you should
   indicate to the VM the oldpage is no longer movable via __ClearPageMovable()
   under page_lock if you migrated the oldpage successfully and returns
   MIGRATEPAGE_SUCCESS.

Thanks.


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2019-09-12 17:21 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-03  8:27 [PATCH 1/1] mm/migrate: fix list corruption in migration of non-LRU movable pages sunqiuyang
2019-09-03 13:17 ` Michal Hocko
2019-09-04  2:18   ` sunqiuyang
2019-09-04  6:38     ` Michal Hocko
2019-09-04  7:27       ` sunqiuyang
2019-09-04  8:14         ` Michal Hocko
2019-09-04 12:19           ` sunqiuyang
2019-09-04 12:52             ` Michal Hocko
2019-09-05  1:44               ` sunqiuyang
2019-09-09  8:40                 ` Michal Hocko
2019-09-12 17:21                   ` Minchan Kim
2019-09-10 19:23 ` Minchan Kim
2019-09-10 19:31   ` Michal Hocko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).