All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
@ 2018-12-14  2:39 Wei Yang
  2018-12-14  3:57 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-14  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

Below is a brief call flow for __offline_pages() and
alloc_contig_range():

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()

Since set_migratetype_isolate() is only used in
start_isolate_page_range(), which is just used in __offline_pages() and
alloc_contig_range(). And both of them call drain_all_pages() if every
check looks good. This means it is not necessary call drain_all_pages()
in each iteration of set_migratetype_isolate().

By doing so, the logic seems a little bit clearer.
set_migratetype_isolate() handles pages in Buddy, while
drain_all_pages() takes care of pages in pcp.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
---
 mm/page_isolation.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 43e085608846..f44c0e333bed 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -83,8 +83,6 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
 	}
 
 	spin_unlock_irqrestore(&zone->lock, flags);
-	if (!ret)
-		drain_all_pages(zone);
 	return ret;
 }
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14  2:39 [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate() Wei Yang
@ 2018-12-14  3:57 ` Andrew Morton
  2018-12-14  7:01   ` Wei Yang
  2018-12-14 15:17   ` Wei Yang
  2018-12-17 12:25 ` Michal Hocko
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
  2 siblings, 2 replies; 38+ messages in thread
From: Andrew Morton @ 2018-12-14  3:57 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, mhocko, osalvador, david, Minchan Kim, Mel Gorman

On Fri, 14 Dec 2018 10:39:12 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:

> Below is a brief call flow for __offline_pages()

Offtopic...

set_migratetype_isolate() has the comment

	/*
	 * immobile means "not-on-lru" pages. If immobile is larger than
	 * removable-by-driver pages reported by notifier, we'll fail.
	 */

what the heck does that mean?  It used to talk about unmovable pages,
but this was mysteriously changed to use the unique term "immobile" by
Minchan's ee6f509c32 ("mm: factor out memory isolate functions"). 
Could someone please take a look?


> and
> alloc_contig_range():
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()
> 
> Since set_migratetype_isolate() is only used in
> start_isolate_page_range(), which is just used in __offline_pages() and
> alloc_contig_range(). And both of them call drain_all_pages() if every
> check looks good. This means it is not necessary call drain_all_pages()
> in each iteration of set_migratetype_isolate().
>
> By doing so, the logic seems a little bit clearer.
> set_migratetype_isolate() handles pages in Buddy, while
> drain_all_pages() takes care of pages in pcp.

Well.  drain_all_pages() moves pages from pcp to buddy so I'm not sure
that argument holds water.

Can we step back a bit and ask ourselves what all these draining
operations are actually for?  What is the intent behind each callsite? 
Figuring that out (and perhaps even documenting it!) would help us
decide the most appropriate places from which to perform the drain.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14  3:57 ` Andrew Morton
@ 2018-12-14  7:01   ` Wei Yang
  2018-12-14 15:17   ` Wei Yang
  1 sibling, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-14  7:01 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Wei Yang, linux-mm, mhocko, osalvador, david, Minchan Kim, Mel Gorman

On Thu, Dec 13, 2018 at 07:57:12PM -0800, Andrew Morton wrote:
>On Fri, 14 Dec 2018 10:39:12 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>
>> Below is a brief call flow for __offline_pages()
>
>Offtopic...
>
>set_migratetype_isolate() has the comment
>
>	/*
>	 * immobile means "not-on-lru" pages. If immobile is larger than
>	 * removable-by-driver pages reported by notifier, we'll fail.
>	 */
>
>what the heck does that mean?  It used to talk about unmovable pages,
>but this was mysteriously changed to use the unique term "immobile" by
>Minchan's ee6f509c32 ("mm: factor out memory isolate functions"). 
>Could someone please take a look?
>

What immobile stands for? I searched the whole kernel tree and just this
place use this terminology.

>
>> and
>> alloc_contig_range():
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()
>> 
>> Since set_migratetype_isolate() is only used in
>> start_isolate_page_range(), which is just used in __offline_pages() and
>> alloc_contig_range(). And both of them call drain_all_pages() if every
>> check looks good. This means it is not necessary call drain_all_pages()
>> in each iteration of set_migratetype_isolate().
>>
>> By doing so, the logic seems a little bit clearer.
>> set_migratetype_isolate() handles pages in Buddy, while
>> drain_all_pages() takes care of pages in pcp.
>
>Well.  drain_all_pages() moves pages from pcp to buddy so I'm not sure
>that argument holds water.
>

You mean the wartermark?

>Can we step back a bit and ask ourselves what all these draining
>operations are actually for?  What is the intent behind each callsite? 
>Figuring that out (and perhaps even documenting it!) would help us
>decide the most appropriate places from which to perform the drain.

That is great. I found myself hard to understand current implementation.
Let me try to write down what I understand now.

Current mm subsystem manage memory with a hierarchic way.

  * Buddy system
  * pcp pageset
  * slub

With this background, we handle pages differently for different layer.

  * set_migratetype_isolate() handle pages still in Buddy system.
  * drain_all_pages() handle pages in pcp pageset.
  * I don't know who handle pages in slub.

While there are still pages out there, eg. page table, file pages, I
don't understand how they are handled during offline. Especially, how to
catch them all in a specific range.

Now go back to this patch. 

   __offline_pages()/alloc_contig_range()
       start_isolate_page_range()
           set_migratetype_isolate()
               drain_all_pages()
       drain_all_pages()

start_isolate_page_range() will iterate a range with pageblock step to
isolate them. Since both __offline_pages() and alloc_contig_range()
require this range to be in the same zone, drain_all_pages() will drain
the pcp pageset of the same zone several times. After that,
drain_all_pages() will be called again to drain pages.

One thing we can notice is after set_migratetype_isolate() for a
particular range, this range's page will not be available for
allocation. But the pages after this range still has a chance to be put
on pcp pageset. And during this process, pages of the same zone but out
of the whole range could be put on the pcp pageset. This means current
implementation would drain those pages several times and may increase
contention for this zone.

This behavior seems suboptimal. And we can do this just in once to drain
all of them.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14  3:57 ` Andrew Morton
  2018-12-14  7:01   ` Wei Yang
@ 2018-12-14 15:17   ` Wei Yang
  2018-12-17 12:21     ` Michal Hocko
  1 sibling, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-14 15:17 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Wei Yang, linux-mm, mhocko, osalvador, david, Minchan Kim, Mel Gorman

On Thu, Dec 13, 2018 at 07:57:12PM -0800, Andrew Morton wrote:
>On Fri, 14 Dec 2018 10:39:12 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>
>> Below is a brief call flow for __offline_pages()
>
>Offtopic...
>
>set_migratetype_isolate() has the comment
>
>	/*
>	 * immobile means "not-on-lru" pages. If immobile is larger than
>	 * removable-by-driver pages reported by notifier, we'll fail.
>	 */
>
>what the heck does that mean?  It used to talk about unmovable pages,
>but this was mysteriously changed to use the unique term "immobile" by
>Minchan's ee6f509c32 ("mm: factor out memory isolate functions"). 
>Could someone please take a look?
>
>
>> and
>> alloc_contig_range():
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()
>> 
>> Since set_migratetype_isolate() is only used in
>> start_isolate_page_range(), which is just used in __offline_pages() and
>> alloc_contig_range(). And both of them call drain_all_pages() if every
>> check looks good. This means it is not necessary call drain_all_pages()
>> in each iteration of set_migratetype_isolate().
>>
>> By doing so, the logic seems a little bit clearer.
>> set_migratetype_isolate() handles pages in Buddy, while
>> drain_all_pages() takes care of pages in pcp.
>
>Well.  drain_all_pages() moves pages from pcp to buddy so I'm not sure
>that argument holds water.
>
>Can we step back a bit and ask ourselves what all these draining
>operations are actually for?  What is the intent behind each callsite? 
>Figuring that out (and perhaps even documenting it!) would help us
>decide the most appropriate places from which to perform the drain.

With some rethinking we even could take drain_all_pages() out of the
repeat loop. Because after isolation, the page in this range will not be
put to pcp pageset. So we just need to drain pages once.

The change may look like this.

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6910e0eea074..120e9fdfd055 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1590,6 +1590,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
        if (ret)
                goto failed_removal;

+       drain_all_pages(zone);
        pfn = start_pfn;
 repeat:
        /* start memory hot removal */
@@ -1599,7 +1600,6 @@ static int __ref __offline_pages(unsigned long start_pfn,

        cond_resched();
        lru_add_drain_all();
-       drain_all_pages(zone);

        pfn = scan_movable_pages(start_pfn, end_pfn);
        if (pfn) { /* We have movable pages */

-- 
Wei Yang
Help you, Help me

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14 15:17   ` Wei Yang
@ 2018-12-17 12:21     ` Michal Hocko
  2018-12-18 20:48       ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-17 12:21 UTC (permalink / raw)
  To: Wei Yang
  Cc: Andrew Morton, linux-mm, osalvador, david, Minchan Kim, Mel Gorman

On Fri 14-12-18 15:17:56, Wei Yang wrote:
> On Thu, Dec 13, 2018 at 07:57:12PM -0800, Andrew Morton wrote:
> >On Fri, 14 Dec 2018 10:39:12 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
> >
> >> Below is a brief call flow for __offline_pages()
> >
> >Offtopic...
> >
> >set_migratetype_isolate() has the comment
> >
> >	/*
> >	 * immobile means "not-on-lru" pages. If immobile is larger than
> >	 * removable-by-driver pages reported by notifier, we'll fail.
> >	 */
> >
> >what the heck does that mean?  It used to talk about unmovable pages,
> >but this was mysteriously changed to use the unique term "immobile" by
> >Minchan's ee6f509c32 ("mm: factor out memory isolate functions"). 
> >Could someone please take a look?
> >
> >
> >> and
> >> alloc_contig_range():
> >> 
> >>   __offline_pages()/alloc_contig_range()
> >>       start_isolate_page_range()
> >>           set_migratetype_isolate()
> >>               drain_all_pages()
> >>       drain_all_pages()
> >> 
> >> Since set_migratetype_isolate() is only used in
> >> start_isolate_page_range(), which is just used in __offline_pages() and
> >> alloc_contig_range(). And both of them call drain_all_pages() if every
> >> check looks good. This means it is not necessary call drain_all_pages()
> >> in each iteration of set_migratetype_isolate().
> >>
> >> By doing so, the logic seems a little bit clearer.
> >> set_migratetype_isolate() handles pages in Buddy, while
> >> drain_all_pages() takes care of pages in pcp.
> >
> >Well.  drain_all_pages() moves pages from pcp to buddy so I'm not sure
> >that argument holds water.
> >
> >Can we step back a bit and ask ourselves what all these draining
> >operations are actually for?  What is the intent behind each callsite? 
> >Figuring that out (and perhaps even documenting it!) would help us
> >decide the most appropriate places from which to perform the drain.
> 
> With some rethinking we even could take drain_all_pages() out of the
> repeat loop. Because after isolation, the page in this range will not be
> put to pcp pageset. So we just need to drain pages once.
> 
> The change may look like this.

No, this is incorrect. Draining pcp lists before scan_movable_pages is
most likely sub-optimal, because scan_movable_pages will simply ignore
pages being on the pcp lists. But we definitely want to drain before we
terminate the offlining phase because we do not want to have isolated
pages on those lists before we allow the final hotremove.

The way how we retry the migration loop until there is no page in use
just guarantees that drain_all_pages is called. If you put it out of the
loop then you just break that assumption. Moving drain_all_pages down
after the migration is done should work well AFAICS but I didn't really
think through all potential side effects nor have time to do so now.

> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6910e0eea074..120e9fdfd055 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1590,6 +1590,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
>         if (ret)
>                 goto failed_removal;
> 
> +       drain_all_pages(zone);
>         pfn = start_pfn;
>  repeat:
>         /* start memory hot removal */
> @@ -1599,7 +1600,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
> 
>         cond_resched();
>         lru_add_drain_all();
> -       drain_all_pages(zone);
> 
>         pfn = scan_movable_pages(start_pfn, end_pfn);
>         if (pfn) { /* We have movable pages */
> 
> -- 
> Wei Yang
> Help you, Help me

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14  2:39 [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate() Wei Yang
  2018-12-14  3:57 ` Andrew Morton
@ 2018-12-17 12:25 ` Michal Hocko
  2018-12-17 15:08   ` Wei Yang
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
  2 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-17 12:25 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Fri 14-12-18 10:39:12, Wei Yang wrote:
> Below is a brief call flow for __offline_pages() and
> alloc_contig_range():
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()
> 
> Since set_migratetype_isolate() is only used in
> start_isolate_page_range(), which is just used in __offline_pages() and
> alloc_contig_range(). And both of them call drain_all_pages() if every
> check looks good. This means it is not necessary call drain_all_pages()
> in each iteration of set_migratetype_isolate().
> 
> By doing so, the logic seems a little bit clearer.
> set_migratetype_isolate() handles pages in Buddy, while
> drain_all_pages() takes care of pages in pcp.

I have to confess I am not sure about the purpose of the draining here.
I suspect it is to make sure that pages in the pcp lists really get
isolated and if that is the case then it makes sense.

In any case I strongly suggest not touching this code without a very
good explanation on why this is not needed. Callers do XYZ is not a
proper explanation because assumes that all callers will know that this
has to be done. So either we really need to drain and then it is better
to make it here or we don't but that requires some explanation.

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> ---
>  mm/page_isolation.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 43e085608846..f44c0e333bed 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -83,8 +83,6 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
>  	}
>  
>  	spin_unlock_irqrestore(&zone->lock, flags);
> -	if (!ret)
> -		drain_all_pages(zone);
>  	return ret;
>  }
>  
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-17 12:25 ` Michal Hocko
@ 2018-12-17 15:08   ` Wei Yang
  2018-12-17 15:48     ` Michal Hocko
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-17 15:08 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Mon, Dec 17, 2018 at 01:25:23PM +0100, Michal Hocko wrote:
>On Fri 14-12-18 10:39:12, Wei Yang wrote:
>> Below is a brief call flow for __offline_pages() and
>> alloc_contig_range():
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()
>> 
>> Since set_migratetype_isolate() is only used in
>> start_isolate_page_range(), which is just used in __offline_pages() and
>> alloc_contig_range(). And both of them call drain_all_pages() if every
>> check looks good. This means it is not necessary call drain_all_pages()
>> in each iteration of set_migratetype_isolate().
>> 
>> By doing so, the logic seems a little bit clearer.
>> set_migratetype_isolate() handles pages in Buddy, while
>> drain_all_pages() takes care of pages in pcp.
>
>I have to confess I am not sure about the purpose of the draining here.
>I suspect it is to make sure that pages in the pcp lists really get
>isolated and if that is the case then it makes sense.
>
>In any case I strongly suggest not touching this code without a very
>good explanation on why this is not needed. Callers do XYZ is not a
>proper explanation because assumes that all callers will know that this
>has to be done. So either we really need to drain and then it is better
>to make it here or we don't but that requires some explanation.
>

Yep, let me try to explain what is trying to do.

Based on my understanding, online_pages do two things

    * adjust zone/pgdat status
    * put pages into Buddy

Generally, offline_pages do the reverse

    * take pages out of Buddy
    * adjust zone/pgdat status

While it is not that easy to take pages out of Buddy, since pages are

    * pcp list
    * slub
    * other usage

This means before taking a page out of Buddy, we need to return it first
to Buddy.

Current implementation is interesting by introducing migrate type. By
setting migrate type to MIGRATE_ISOLATE, this range of pages will never
be allocated from Buddy. And every page returned in this range will
never be touched by Buddy.

Function start_isolate_page_range() just do this.

Then let's focus on the pcp list. This is a little bit different
than other allocated pages. These are actually "partially" allocated
pages. They are not counted in Buddy Free pages, either no real use. So
we have two choice to get back those pages:

    * wait until it is allocated to a real user and wait for return
    * or drain them directly

Current implementation take 2nd approach.

Then we can see there are also two way to drain them:

    * drain them range by range
    * drain them in a whole range

Both looks good, but not necessary to do them both. Because after we set
a pageblock migrate type to MIGRATE_ISOLATE, pages in this range will
never be allocated nor be put on pcp list. So after we drain one
particular range, it is not necessary to drain this range again.

The reason why I choose to drain them in a whole range is current
drain_all_pages() just carry zone information. For example, a zone may
have 1G while a pageblock is 128M. The pageblock is 1/8 of this zone.
This means in case there are 8 pages on pcp list, only 1 page drained by
drain_all_pages belongs to this pageblock. But we drain other 7 healthy
pages.

 CPU1 pcp list                            CPU2 pcp list

 +---------------+                        +---------------+ 
 |A1  B3  C8  F6 |                        |E1  G3  D8  B6 |
 +---------------+                        +---------------+


   A         B         C         D         E         F         G
  +---------+---------+---------+---------+---------+---------+---------+
  |012345678|         |         |         |         |         |         |
  +---------+---------+---------+---------+---------+---------+---------+
                                |<-pgblk->|
  |<-                              Zone                               ->|


This is a chart for illustration. In case we want to isolate pgblk D,
while zone pcp list has 8 pages and only one belongs to this pgblk D.
This means the drain on pgblk base has much side effect. And with one
drain on each pgblk, this may increase the contention on this zone.

Well, another approach is to enable drain_all_pages() with exact range
information. But neither approach needs to do them both.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-17 15:08   ` Wei Yang
@ 2018-12-17 15:48     ` Michal Hocko
  2018-12-18 14:44       ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-17 15:48 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Mon 17-12-18 15:08:19, Wei Yang wrote:
> On Mon, Dec 17, 2018 at 01:25:23PM +0100, Michal Hocko wrote:
> >On Fri 14-12-18 10:39:12, Wei Yang wrote:
> >> Below is a brief call flow for __offline_pages() and
> >> alloc_contig_range():
> >> 
> >>   __offline_pages()/alloc_contig_range()
> >>       start_isolate_page_range()
> >>           set_migratetype_isolate()
> >>               drain_all_pages()
> >>       drain_all_pages()
> >> 
> >> Since set_migratetype_isolate() is only used in
> >> start_isolate_page_range(), which is just used in __offline_pages() and
> >> alloc_contig_range(). And both of them call drain_all_pages() if every
> >> check looks good. This means it is not necessary call drain_all_pages()
> >> in each iteration of set_migratetype_isolate().
> >> 
> >> By doing so, the logic seems a little bit clearer.
> >> set_migratetype_isolate() handles pages in Buddy, while
> >> drain_all_pages() takes care of pages in pcp.
> >
> >I have to confess I am not sure about the purpose of the draining here.
> >I suspect it is to make sure that pages in the pcp lists really get
> >isolated and if that is the case then it makes sense.
> >
> >In any case I strongly suggest not touching this code without a very
> >good explanation on why this is not needed. Callers do XYZ is not a
> >proper explanation because assumes that all callers will know that this
> >has to be done. So either we really need to drain and then it is better
> >to make it here or we don't but that requires some explanation.
> >
> 
> Yep, let me try to explain what is trying to do.
> 
> Based on my understanding, online_pages do two things
> 
>     * adjust zone/pgdat status
>     * put pages into Buddy
> 
> Generally, offline_pages do the reverse
> 
>     * take pages out of Buddy
>     * adjust zone/pgdat status
> 
> While it is not that easy to take pages out of Buddy, since pages are
> 
>     * pcp list
>     * slub
>     * other usage
> 
> This means before taking a page out of Buddy, we need to return it first
> to Buddy.
> 
> Current implementation is interesting by introducing migrate type. By
> setting migrate type to MIGRATE_ISOLATE, this range of pages will never
> be allocated from Buddy. And every page returned in this range will
> never be touched by Buddy.
> 
> Function start_isolate_page_range() just do this.
> 
> Then let's focus on the pcp list. This is a little bit different
> than other allocated pages. These are actually "partially" allocated
> pages. They are not counted in Buddy Free pages, either no real use. So
> we have two choice to get back those pages:
> 
>     * wait until it is allocated to a real user and wait for return
>     * or drain them directly
> 
> Current implementation take 2nd approach.
> 
> Then we can see there are also two way to drain them:
> 
>     * drain them range by range
>     * drain them in a whole range
> 
> Both looks good, but not necessary to do them both. Because after we set
> a pageblock migrate type to MIGRATE_ISOLATE, pages in this range will
> never be allocated nor be put on pcp list. So after we drain one
> particular range, it is not necessary to drain this range again.

OK, this is an important point and actually the argument that i am
wrong. I have missed that free_unref_page_commit skips pcp lists for
MIGRATE_ISOLATE (resp. all migrate types above MIGRATE_PCPTYPES).
Then you are right that we are OK to drain the zone only once _after_ we
have isolated the full range.

So please send a new patch with this clarification in the changelog and
I will ack it.
 
> The reason why I choose to drain them in a whole range is current
> drain_all_pages() just carry zone information. For example, a zone may
> have 1G while a pageblock is 128M. The pageblock is 1/8 of this zone.
> This means in case there are 8 pages on pcp list, only 1 page drained by
> drain_all_pages belongs to this pageblock. But we drain other 7 healthy
> pages.
> 
>  CPU1 pcp list                            CPU2 pcp list
> 
>  +---------------+                        +---------------+ 
>  |A1  B3  C8  F6 |                        |E1  G3  D8  B6 |
>  +---------------+                        +---------------+
> 
> 
>    A         B         C         D         E         F         G
>   +---------+---------+---------+---------+---------+---------+---------+
>   |012345678|         |         |         |         |         |         |
>   +---------+---------+---------+---------+---------+---------+---------+
>                                 |<-pgblk->|
>   |<-                              Zone                               ->|
> 
> 
> This is a chart for illustration. In case we want to isolate pgblk D,
> while zone pcp list has 8 pages and only one belongs to this pgblk D.
> This means the drain on pgblk base has much side effect. And with one
> drain on each pgblk, this may increase the contention on this zone.
> 
> Well, another approach is to enable drain_all_pages() with exact range
> information. But neither approach needs to do them both.

Is this actually worth the additional complexity? Have you seen an
actual workload that would benefit from that?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-17 15:48     ` Michal Hocko
@ 2018-12-18 14:44       ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-18 14:44 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Mon, Dec 17, 2018 at 04:48:12PM +0100, Michal Hocko wrote:
>On Mon 17-12-18 15:08:19, Wei Yang wrote:
>> On Mon, Dec 17, 2018 at 01:25:23PM +0100, Michal Hocko wrote:
>> >On Fri 14-12-18 10:39:12, Wei Yang wrote:
>> >> Below is a brief call flow for __offline_pages() and
>> >> alloc_contig_range():
>> >> 
>> >>   __offline_pages()/alloc_contig_range()
>> >>       start_isolate_page_range()
>> >>           set_migratetype_isolate()
>> >>               drain_all_pages()
>> >>       drain_all_pages()
>> >> 
>> >> Since set_migratetype_isolate() is only used in
>> >> start_isolate_page_range(), which is just used in __offline_pages() and
>> >> alloc_contig_range(). And both of them call drain_all_pages() if every
>> >> check looks good. This means it is not necessary call drain_all_pages()
>> >> in each iteration of set_migratetype_isolate().
>> >> 
>> >> By doing so, the logic seems a little bit clearer.
>> >> set_migratetype_isolate() handles pages in Buddy, while
>> >> drain_all_pages() takes care of pages in pcp.
>> >
>> >I have to confess I am not sure about the purpose of the draining here.
>> >I suspect it is to make sure that pages in the pcp lists really get
>> >isolated and if that is the case then it makes sense.
>> >
>> >In any case I strongly suggest not touching this code without a very
>> >good explanation on why this is not needed. Callers do XYZ is not a
>> >proper explanation because assumes that all callers will know that this
>> >has to be done. So either we really need to drain and then it is better
>> >to make it here or we don't but that requires some explanation.
>> >
>> 
>> Yep, let me try to explain what is trying to do.
>> 
>> Based on my understanding, online_pages do two things
>> 
>>     * adjust zone/pgdat status
>>     * put pages into Buddy
>> 
>> Generally, offline_pages do the reverse
>> 
>>     * take pages out of Buddy
>>     * adjust zone/pgdat status
>> 
>> While it is not that easy to take pages out of Buddy, since pages are
>> 
>>     * pcp list
>>     * slub
>>     * other usage
>> 
>> This means before taking a page out of Buddy, we need to return it first
>> to Buddy.
>> 
>> Current implementation is interesting by introducing migrate type. By
>> setting migrate type to MIGRATE_ISOLATE, this range of pages will never
>> be allocated from Buddy. And every page returned in this range will
>> never be touched by Buddy.
>> 
>> Function start_isolate_page_range() just do this.
>> 
>> Then let's focus on the pcp list. This is a little bit different
>> than other allocated pages. These are actually "partially" allocated
>> pages. They are not counted in Buddy Free pages, either no real use. So
>> we have two choice to get back those pages:
>> 
>>     * wait until it is allocated to a real user and wait for return
>>     * or drain them directly
>> 
>> Current implementation take 2nd approach.
>> 
>> Then we can see there are also two way to drain them:
>> 
>>     * drain them range by range
>>     * drain them in a whole range
>> 
>> Both looks good, but not necessary to do them both. Because after we set
>> a pageblock migrate type to MIGRATE_ISOLATE, pages in this range will
>> never be allocated nor be put on pcp list. So after we drain one
>> particular range, it is not necessary to drain this range again.
>
>OK, this is an important point and actually the argument that i am
>wrong. I have missed that free_unref_page_commit skips pcp lists for
>MIGRATE_ISOLATE (resp. all migrate types above MIGRATE_PCPTYPES).
>Then you are right that we are OK to drain the zone only once _after_ we
>have isolated the full range.
>
>So please send a new patch with this clarification in the changelog and
>I will ack it.
> 
>> The reason why I choose to drain them in a whole range is current
>> drain_all_pages() just carry zone information. For example, a zone may
>> have 1G while a pageblock is 128M. The pageblock is 1/8 of this zone.
>> This means in case there are 8 pages on pcp list, only 1 page drained by
>> drain_all_pages belongs to this pageblock. But we drain other 7 healthy
>> pages.
>> 
>>  CPU1 pcp list                            CPU2 pcp list
>> 
>>  +---------------+                        +---------------+ 
>>  |A1  B3  C8  F6 |                        |E1  G3  D8  B6 |
>>  +---------------+                        +---------------+
>> 
>> 
>>    A         B         C         D         E         F         G
>>   +---------+---------+---------+---------+---------+---------+---------+
>>   |012345678|         |         |         |         |         |         |
>>   +---------+---------+---------+---------+---------+---------+---------+
>>                                 |<-pgblk->|
>>   |<-                              Zone                               ->|
>> 
>> 
>> This is a chart for illustration. In case we want to isolate pgblk D,
>> while zone pcp list has 8 pages and only one belongs to this pgblk D.
>> This means the drain on pgblk base has much side effect. And with one
>> drain on each pgblk, this may increase the contention on this zone.
>> 
>> Well, another approach is to enable drain_all_pages() with exact range
>> information. But neither approach needs to do them both.
>
>Is this actually worth the additional complexity? Have you seen an
>actual workload that would benefit from that?

No, I just mention the possible approach in my mind. While currently
drain pcp list once is enough.

I will prepare v2 with more detailed changelog with migratetype thing.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-14  2:39 [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate() Wei Yang
  2018-12-14  3:57 ` Andrew Morton
  2018-12-17 12:25 ` Michal Hocko
@ 2018-12-18 20:46 ` Wei Yang
  2018-12-18 21:14   ` David Hildenbrand
                     ` (4 more replies)
  2 siblings, 5 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-18 20:46 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

Below is a brief call flow for __offline_pages() and
alloc_contig_range():

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()

Current logic is: isolate and drain pcp list for each pageblock and
drain pcp list again. This is not necessary and we could just drain pcp
list once after isolate this whole range.

The reason is start_isolate_page_range() will set the migrate type of
a range to MIGRATE_ISOLATE. After doing so, this range will never be
allocated from Buddy, neither to a real user nor to pcp list.

Since drain_all_pages() is zone based, by reduce times of
drain_all_pages() also reduce some contention on this particular zone.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

---
v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
---
 mm/page_isolation.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 43e085608846..f44c0e333bed 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -83,8 +83,6 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
 	}
 
 	spin_unlock_irqrestore(&zone->lock, flags);
-	if (!ret)
-		drain_all_pages(zone);
 	return ret;
 }
 
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-17 12:21     ` Michal Hocko
@ 2018-12-18 20:48       ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-18 20:48 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Wei Yang, Andrew Morton, linux-mm, osalvador, david, Minchan Kim,
	Mel Gorman

On Mon, Dec 17, 2018 at 01:21:32PM +0100, Michal Hocko wrote:
>On Fri 14-12-18 15:17:56, Wei Yang wrote:
>> On Thu, Dec 13, 2018 at 07:57:12PM -0800, Andrew Morton wrote:
>> >On Fri, 14 Dec 2018 10:39:12 +0800 Wei Yang <richard.weiyang@gmail.com> wrote:
>> >
>> >> Below is a brief call flow for __offline_pages()
>> >
>> >Offtopic...
>> >
>> >set_migratetype_isolate() has the comment
>> >
>> >	/*
>> >	 * immobile means "not-on-lru" pages. If immobile is larger than
>> >	 * removable-by-driver pages reported by notifier, we'll fail.
>> >	 */
>> >
>> >what the heck does that mean?  It used to talk about unmovable pages,
>> >but this was mysteriously changed to use the unique term "immobile" by
>> >Minchan's ee6f509c32 ("mm: factor out memory isolate functions"). 
>> >Could someone please take a look?
>> >
>> >
>> >> and
>> >> alloc_contig_range():
>> >> 
>> >>   __offline_pages()/alloc_contig_range()
>> >>       start_isolate_page_range()
>> >>           set_migratetype_isolate()
>> >>               drain_all_pages()
>> >>       drain_all_pages()
>> >> 
>> >> Since set_migratetype_isolate() is only used in
>> >> start_isolate_page_range(), which is just used in __offline_pages() and
>> >> alloc_contig_range(). And both of them call drain_all_pages() if every
>> >> check looks good. This means it is not necessary call drain_all_pages()
>> >> in each iteration of set_migratetype_isolate().
>> >>
>> >> By doing so, the logic seems a little bit clearer.
>> >> set_migratetype_isolate() handles pages in Buddy, while
>> >> drain_all_pages() takes care of pages in pcp.
>> >
>> >Well.  drain_all_pages() moves pages from pcp to buddy so I'm not sure
>> >that argument holds water.
>> >
>> >Can we step back a bit and ask ourselves what all these draining
>> >operations are actually for?  What is the intent behind each callsite? 
>> >Figuring that out (and perhaps even documenting it!) would help us
>> >decide the most appropriate places from which to perform the drain.
>> 
>> With some rethinking we even could take drain_all_pages() out of the
>> repeat loop. Because after isolation, the page in this range will not be
>> put to pcp pageset. So we just need to drain pages once.
>> 
>> The change may look like this.
>
>No, this is incorrect. Draining pcp lists before scan_movable_pages is
>most likely sub-optimal, because scan_movable_pages will simply ignore
>pages being on the pcp lists. But we definitely want to drain before we
>terminate the offlining phase because we do not want to have isolated
>pages on those lists before we allow the final hotremove.
>
>The way how we retry the migration loop until there is no page in use
>just guarantees that drain_all_pages is called. If you put it out of the
>loop then you just break that assumption. Moving drain_all_pages down
>after the migration is done should work well AFAICS but I didn't really
>think through all potential side effects nor have time to do so now.
>

After discussion, do you agree with this proposal?

>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index 6910e0eea074..120e9fdfd055 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1590,6 +1590,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
>>         if (ret)
>>                 goto failed_removal;
>> 
>> +       drain_all_pages(zone);
>>         pfn = start_pfn;
>>  repeat:
>>         /* start memory hot removal */
>> @@ -1599,7 +1600,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>> 
>>         cond_resched();
>>         lru_add_drain_all();
>> -       drain_all_pages(zone);
>> 
>>         pfn = scan_movable_pages(start_pfn, end_pfn);
>>         if (pfn) { /* We have movable pages */
>> 
>> -- 
>> Wei Yang
>> Help you, Help me
>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
@ 2018-12-18 21:14   ` David Hildenbrand
  2018-12-18 21:49     ` Wei Yang
  2018-12-18 23:29   ` Oscar Salvador
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 38+ messages in thread
From: David Hildenbrand @ 2018-12-18 21:14 UTC (permalink / raw)
  To: Wei Yang, linux-mm; +Cc: akpm, mhocko, osalvador

On 18.12.18 21:46, Wei Yang wrote:
> Below is a brief call flow for __offline_pages() and
> alloc_contig_range():
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()
> 
> Current logic is: isolate and drain pcp list for each pageblock and
> drain pcp list again. This is not necessary and we could just drain pcp
> list once after isolate this whole range.
> 
> The reason is start_isolate_page_range() will set the migrate type of
> a range to MIGRATE_ISOLATE. After doing so, this range will never be
> allocated from Buddy, neither to a real user nor to pcp list.
> 
> Since drain_all_pages() is zone based, by reduce times of
> drain_all_pages() also reduce some contention on this particular zone.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

Yes, as far as I can see, when a MIGRATE_ISOLATE page gets freed, it
will not go onto the pcp list again.

However, start_isolate_page_range() is also called via
alloc_contig_range(). Are you sure we can effectively drop the
drain_all_pages() for that call path?

> 
> ---
> v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
> ---
>  mm/page_isolation.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 43e085608846..f44c0e333bed 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -83,8 +83,6 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
>  	}
>  
>  	spin_unlock_irqrestore(&zone->lock, flags);
> -	if (!ret)
> -		drain_all_pages(zone);
>  	return ret;
>  }
>  
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 21:14   ` David Hildenbrand
@ 2018-12-18 21:49     ` Wei Yang
  2018-12-18 22:18       ` David Hildenbrand
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-18 21:49 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Wei Yang, linux-mm, akpm, mhocko, osalvador

On Tue, Dec 18, 2018 at 10:14:25PM +0100, David Hildenbrand wrote:
>On 18.12.18 21:46, Wei Yang wrote:
>> Below is a brief call flow for __offline_pages() and
>> alloc_contig_range():
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()
>> 
>> Current logic is: isolate and drain pcp list for each pageblock and
>> drain pcp list again. This is not necessary and we could just drain pcp
>> list once after isolate this whole range.
>> 
>> The reason is start_isolate_page_range() will set the migrate type of
>> a range to MIGRATE_ISOLATE. After doing so, this range will never be
>> allocated from Buddy, neither to a real user nor to pcp list.
>> 
>> Since drain_all_pages() is zone based, by reduce times of
>> drain_all_pages() also reduce some contention on this particular zone.
>> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
>Yes, as far as I can see, when a MIGRATE_ISOLATE page gets freed, it
>will not go onto the pcp list again.
>
>However, start_isolate_page_range() is also called via
>alloc_contig_range(). Are you sure we can effectively drop the
>drain_all_pages() for that call path?
>

alloc_contig_range() does following now:

   - isolate page range
   - do reclaim and migration
   - drain lru
   - drain pcp list

If step 2 fails, it will not drain lru and pcp list.

I don't see we have to drain pcp list before step 2. And after this
change, it will save some effort if step 2 fails.

>> 
>> ---
>> v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
>> ---
>>  mm/page_isolation.c | 2 --
>>  1 file changed, 2 deletions(-)
>> 
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 43e085608846..f44c0e333bed 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -83,8 +83,6 @@ static int set_migratetype_isolate(struct page *page, int migratetype,
>>  	}
>>  
>>  	spin_unlock_irqrestore(&zone->lock, flags);
>> -	if (!ret)
>> -		drain_all_pages(zone);
>>  	return ret;
>>  }
>>  
>> 
>
>
>-- 
>
>Thanks,
>
>David / dhildenb

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 21:49     ` Wei Yang
@ 2018-12-18 22:18       ` David Hildenbrand
  0 siblings, 0 replies; 38+ messages in thread
From: David Hildenbrand @ 2018-12-18 22:18 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, mhocko, osalvador

On 18.12.18 22:49, Wei Yang wrote:
> On Tue, Dec 18, 2018 at 10:14:25PM +0100, David Hildenbrand wrote:
>> On 18.12.18 21:46, Wei Yang wrote:
>>> Below is a brief call flow for __offline_pages() and
>>> alloc_contig_range():
>>>
>>>   __offline_pages()/alloc_contig_range()
>>>       start_isolate_page_range()
>>>           set_migratetype_isolate()
>>>               drain_all_pages()
>>>       drain_all_pages()
>>>
>>> Current logic is: isolate and drain pcp list for each pageblock and
>>> drain pcp list again. This is not necessary and we could just drain pcp
>>> list once after isolate this whole range.
>>>
>>> The reason is start_isolate_page_range() will set the migrate type of
>>> a range to MIGRATE_ISOLATE. After doing so, this range will never be
>>> allocated from Buddy, neither to a real user nor to pcp list.
>>>
>>> Since drain_all_pages() is zone based, by reduce times of
>>> drain_all_pages() also reduce some contention on this particular zone.
>>>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>
>> Yes, as far as I can see, when a MIGRATE_ISOLATE page gets freed, it
>> will not go onto the pcp list again.
>>
>> However, start_isolate_page_range() is also called via
>> alloc_contig_range(). Are you sure we can effectively drop the
>> drain_all_pages() for that call path?
>>
> 
> alloc_contig_range() does following now:
> 
>    - isolate page range
>    - do reclaim and migration
>    - drain lru
>    - drain pcp list
> 
> If step 2 fails, it will not drain lru and pcp list.
> 
> I don't see we have to drain pcp list before step 2. And after this
> change, it will save some effort if step 2 fails.

Sorry, I missed that you actually documented the "alloc_contig_range"
scenario in you patch description. My fault!

Acked-by: David Hildenbrand <david@redhat.com>


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
  2018-12-18 21:14   ` David Hildenbrand
@ 2018-12-18 23:29   ` Oscar Salvador
  2018-12-19  9:51   ` Michal Hocko
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 38+ messages in thread
From: Oscar Salvador @ 2018-12-18 23:29 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, mhocko, david

On Wed, Dec 19, 2018 at 04:46:56AM +0800, Wei Yang wrote:
> Below is a brief call flow for __offline_pages() and
> alloc_contig_range():
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()
> 
> Current logic is: isolate and drain pcp list for each pageblock and
> drain pcp list again. This is not necessary and we could just drain pcp
> list once after isolate this whole range.
> 
> The reason is start_isolate_page_range() will set the migrate type of
> a range to MIGRATE_ISOLATE. After doing so, this range will never be
> allocated from Buddy, neither to a real user nor to pcp list.
> 
> Since drain_all_pages() is zone based, by reduce times of
> drain_all_pages() also reduce some contention on this particular zone.
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

It is a bit late and I hope I did not miss anything, but looks good to me.

Reviewed-by: Oscar Salvador <osalvador@suse.de>

Thanks!
-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
  2018-12-18 21:14   ` David Hildenbrand
  2018-12-18 23:29   ` Oscar Salvador
@ 2018-12-19  9:51   ` Michal Hocko
  2018-12-19  9:57     ` Oscar Salvador
  2018-12-19 13:29     ` Wei Yang
  2018-12-19 10:05   ` Michal Hocko
  2018-12-21 17:02     ` Wei Yang
  4 siblings, 2 replies; 38+ messages in thread
From: Michal Hocko @ 2018-12-19  9:51 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Wed 19-12-18 04:46:56, Wei Yang wrote:
> Below is a brief call flow for __offline_pages() and
> alloc_contig_range():
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()
> 
> Current logic is: isolate and drain pcp list for each pageblock and
> drain pcp list again. This is not necessary and we could just drain pcp
> list once after isolate this whole range.
> 
> The reason is start_isolate_page_range() will set the migrate type of
> a range to MIGRATE_ISOLATE. After doing so, this range will never be
> allocated from Buddy, neither to a real user nor to pcp list.

But it is important to note that those pages still can be allocated from
the pcp lists until we do drain_all_pages().

One thing that I really do not like about this patch (and I believe I
have mentioned that previously) that you rely on callers to do the right
thing. The proper fix would be to do the draining in
start_isolate_page_range and remove them from callers. Also what does
prevent start_isolate_page_range to work on multiple zones? At least
contiguous allocator can do that in principle.

So no I do not like this patch, it is not an improvement.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19  9:51   ` Michal Hocko
@ 2018-12-19  9:57     ` Oscar Salvador
  2018-12-19 13:53       ` Wei Yang
  2018-12-19 13:29     ` Wei Yang
  1 sibling, 1 reply; 38+ messages in thread
From: Oscar Salvador @ 2018-12-19  9:57 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, david

On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
> On Wed 19-12-18 04:46:56, Wei Yang wrote:
> > Below is a brief call flow for __offline_pages() and
> > alloc_contig_range():
> > 
> >   __offline_pages()/alloc_contig_range()
> >       start_isolate_page_range()
> >           set_migratetype_isolate()
> >               drain_all_pages()
> >       drain_all_pages()
> > 
> > Current logic is: isolate and drain pcp list for each pageblock and
> > drain pcp list again. This is not necessary and we could just drain pcp
> > list once after isolate this whole range.
> > 
> > The reason is start_isolate_page_range() will set the migrate type of
> > a range to MIGRATE_ISOLATE. After doing so, this range will never be
> > allocated from Buddy, neither to a real user nor to pcp list.
> 
> But it is important to note that those pages still can be allocated from
> the pcp lists until we do drain_all_pages().

I had the same fear, but then I saw that move_freepages_block()->move_freepages() moves
the pages to a new list:

<--
list_move(&page->lru,
			  &zone->free_area[order].free_list[migratetype]);
-->


But looking at it again, I see that this is only for BuddyPages, so I guess
that pcp-pages do not really get unlinked, so we could still allocate them.

Uhmf, I missed that.

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-18 20:46 ` [PATCH v2] " Wei Yang
                     ` (2 preceding siblings ...)
  2018-12-19  9:51   ` Michal Hocko
@ 2018-12-19 10:05   ` Michal Hocko
  2018-12-21 17:02     ` Wei Yang
  4 siblings, 0 replies; 38+ messages in thread
From: Michal Hocko @ 2018-12-19 10:05 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Wed 19-12-18 04:46:56, Wei Yang wrote:
[...]
> Since drain_all_pages() is zone based, by reduce times of
> drain_all_pages() also reduce some contention on this particular zone.

I forgot to add. As said before this is a really weak justification. If
there is really some contention then I would like to see some numbers
backing that claim.

A proper justification would be that reallying on draining in callers
just sucks. As we can see we are doing that suboptimally based on a weak
understanding of the functionality. So it makes sense to remove that
draining and rely on the isolation code do the right thing. Then it is a
clear cleanup.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19  9:51   ` Michal Hocko
  2018-12-19  9:57     ` Oscar Salvador
@ 2018-12-19 13:29     ` Wei Yang
  2018-12-19 13:40       ` Michal Hocko
  1 sibling, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-19 13:29 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
>On Wed 19-12-18 04:46:56, Wei Yang wrote:
>> Below is a brief call flow for __offline_pages() and
>> alloc_contig_range():
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()
>> 
>> Current logic is: isolate and drain pcp list for each pageblock and
>> drain pcp list again. This is not necessary and we could just drain pcp
>> list once after isolate this whole range.
>> 
>> The reason is start_isolate_page_range() will set the migrate type of
>> a range to MIGRATE_ISOLATE. After doing so, this range will never be
>> allocated from Buddy, neither to a real user nor to pcp list.
>
>But it is important to note that those pages still can be allocated from
>the pcp lists until we do drain_all_pages().
>
>One thing that I really do not like about this patch (and I believe I
>have mentioned that previously) that you rely on callers to do the right
>thing. The proper fix would be to do the draining in
>start_isolate_page_range and remove them from callers. Also what does

Well, I don't really understand this meaning previously.

So you prefer set_migratetype_isolate() do the drain instead of the
caller (__offline_pages) do the drain. Is my understanding correct?

>prevent start_isolate_page_range to work on multiple zones? At least
>contiguous allocator can do that in principle.

As the comment mentioned, in current implementation the range must be in
one zone.

>
>So no I do not like this patch, it is not an improvement.
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 13:29     ` Wei Yang
@ 2018-12-19 13:40       ` Michal Hocko
  2018-12-19 13:56         ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-19 13:40 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Wed 19-12-18 13:29:34, Wei Yang wrote:
> On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
> >On Wed 19-12-18 04:46:56, Wei Yang wrote:
> >> Below is a brief call flow for __offline_pages() and
> >> alloc_contig_range():
> >> 
> >>   __offline_pages()/alloc_contig_range()
> >>       start_isolate_page_range()
> >>           set_migratetype_isolate()
> >>               drain_all_pages()
> >>       drain_all_pages()
> >> 
> >> Current logic is: isolate and drain pcp list for each pageblock and
> >> drain pcp list again. This is not necessary and we could just drain pcp
> >> list once after isolate this whole range.
> >> 
> >> The reason is start_isolate_page_range() will set the migrate type of
> >> a range to MIGRATE_ISOLATE. After doing so, this range will never be
> >> allocated from Buddy, neither to a real user nor to pcp list.
> >
> >But it is important to note that those pages still can be allocated from
> >the pcp lists until we do drain_all_pages().
> >
> >One thing that I really do not like about this patch (and I believe I
> >have mentioned that previously) that you rely on callers to do the right
> >thing. The proper fix would be to do the draining in
> >start_isolate_page_range and remove them from callers. Also what does
> 
> Well, I don't really understand this meaning previously.
> 
> So you prefer set_migratetype_isolate() do the drain instead of the
> caller (__offline_pages) do the drain. Is my understanding correct?

Either set_migratetype_isolate or start_isolate_page_range. The later
only if this is guaranteed that we cannot intemix zones in the range.

> >prevent start_isolate_page_range to work on multiple zones? At least
> >contiguous allocator can do that in principle.
> 
> As the comment mentioned, in current implementation the range must be in
> one zone.

I do not see anything like that documented for set_migratetype_isolate.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19  9:57     ` Oscar Salvador
@ 2018-12-19 13:53       ` Wei Yang
  2018-12-19 14:13         ` Michal Hocko
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-19 13:53 UTC (permalink / raw)
  To: Oscar Salvador; +Cc: Michal Hocko, Wei Yang, linux-mm, akpm, david

On Wed, Dec 19, 2018 at 10:57:19AM +0100, Oscar Salvador wrote:
>On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
>> On Wed 19-12-18 04:46:56, Wei Yang wrote:
>> > Below is a brief call flow for __offline_pages() and
>> > alloc_contig_range():
>> > 
>> >   __offline_pages()/alloc_contig_range()
>> >       start_isolate_page_range()
>> >           set_migratetype_isolate()
>> >               drain_all_pages()
>> >       drain_all_pages()
>> > 
>> > Current logic is: isolate and drain pcp list for each pageblock and
>> > drain pcp list again. This is not necessary and we could just drain pcp
>> > list once after isolate this whole range.
>> > 
>> > The reason is start_isolate_page_range() will set the migrate type of
>> > a range to MIGRATE_ISOLATE. After doing so, this range will never be
>> > allocated from Buddy, neither to a real user nor to pcp list.
>> 
>> But it is important to note that those pages still can be allocated from
>> the pcp lists until we do drain_all_pages().
>
>I had the same fear, but then I saw that move_freepages_block()->move_freepages() moves
>the pages to a new list:
>
><--
>list_move(&page->lru,
>			  &zone->free_area[order].free_list[migratetype]);
>-->
>
>
>But looking at it again, I see that this is only for BuddyPages, so I guess
>that pcp-pages do not really get unlinked, so we could still allocate them.

Well, I think you are right. But with this in mind, current code looks
buggy.

Between has_unmovable_pages() and drain_all_pages(), others still could
allocate pages on pcp list, right? This means we thought we have
isolated the range, but not.

So even we do drain_all_pages(), we still missed some pages in this
range.

>
>Uhmf, I missed that.
>
>-- 
>Oscar Salvador
>SUSE L3

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 13:40       ` Michal Hocko
@ 2018-12-19 13:56         ` Wei Yang
  2018-12-19 14:12           ` Michal Hocko
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-19 13:56 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Wed, Dec 19, 2018 at 02:40:56PM +0100, Michal Hocko wrote:
>On Wed 19-12-18 13:29:34, Wei Yang wrote:
>> On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
>> >On Wed 19-12-18 04:46:56, Wei Yang wrote:
>> >> Below is a brief call flow for __offline_pages() and
>> >> alloc_contig_range():
>> >> 
>> >>   __offline_pages()/alloc_contig_range()
>> >>       start_isolate_page_range()
>> >>           set_migratetype_isolate()
>> >>               drain_all_pages()
>> >>       drain_all_pages()
>> >> 
>> >> Current logic is: isolate and drain pcp list for each pageblock and
>> >> drain pcp list again. This is not necessary and we could just drain pcp
>> >> list once after isolate this whole range.
>> >> 
>> >> The reason is start_isolate_page_range() will set the migrate type of
>> >> a range to MIGRATE_ISOLATE. After doing so, this range will never be
>> >> allocated from Buddy, neither to a real user nor to pcp list.
>> >
>> >But it is important to note that those pages still can be allocated from
>> >the pcp lists until we do drain_all_pages().
>> >
>> >One thing that I really do not like about this patch (and I believe I
>> >have mentioned that previously) that you rely on callers to do the right
>> >thing. The proper fix would be to do the draining in
>> >start_isolate_page_range and remove them from callers. Also what does
>> 
>> Well, I don't really understand this meaning previously.
>> 
>> So you prefer set_migratetype_isolate() do the drain instead of the
>> caller (__offline_pages) do the drain. Is my understanding correct?
>
>Either set_migratetype_isolate or start_isolate_page_range. The later
>only if this is guaranteed that we cannot intemix zones in the range.
>
>> >prevent start_isolate_page_range to work on multiple zones? At least
>> >contiguous allocator can do that in principle.
>> 
>> As the comment mentioned, in current implementation the range must be in
>> one zone.
>
>I do not see anything like that documented for set_migratetype_isolate.

The comment is not on set_migratetype_isolate, but for its two
(grandparent) callers:

   __offline_pages
   alloc_contig_range

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 13:56         ` Wei Yang
@ 2018-12-19 14:12           ` Michal Hocko
  2018-12-19 14:41             ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-19 14:12 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Wed 19-12-18 13:56:35, Wei Yang wrote:
> On Wed, Dec 19, 2018 at 02:40:56PM +0100, Michal Hocko wrote:
> >On Wed 19-12-18 13:29:34, Wei Yang wrote:
[...]
> >> As the comment mentioned, in current implementation the range must be in
> >> one zone.
> >
> >I do not see anything like that documented for set_migratetype_isolate.
> 
> The comment is not on set_migratetype_isolate, but for its two
> (grandparent) callers:
> 
>    __offline_pages
>    alloc_contig_range

But those are consumers while the main api here is
start_isolate_page_range. What happens if we grow a new user?
Go over the same problems? See the difference?

Please try to look at these things from a higher level. We really do not
want micro optimise on behalf of a sane API. Unless there is a very good
reason to do that - e.g. when the performance difference is really huge.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 13:53       ` Wei Yang
@ 2018-12-19 14:13         ` Michal Hocko
  2018-12-19 14:33           ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-19 14:13 UTC (permalink / raw)
  To: Wei Yang; +Cc: Oscar Salvador, linux-mm, akpm, david

On Wed 19-12-18 13:53:07, Wei Yang wrote:
> On Wed, Dec 19, 2018 at 10:57:19AM +0100, Oscar Salvador wrote:
> >On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
> >> On Wed 19-12-18 04:46:56, Wei Yang wrote:
> >> > Below is a brief call flow for __offline_pages() and
> >> > alloc_contig_range():
> >> > 
> >> >   __offline_pages()/alloc_contig_range()
> >> >       start_isolate_page_range()
> >> >           set_migratetype_isolate()
> >> >               drain_all_pages()
> >> >       drain_all_pages()
> >> > 
> >> > Current logic is: isolate and drain pcp list for each pageblock and
> >> > drain pcp list again. This is not necessary and we could just drain pcp
> >> > list once after isolate this whole range.
> >> > 
> >> > The reason is start_isolate_page_range() will set the migrate type of
> >> > a range to MIGRATE_ISOLATE. After doing so, this range will never be
> >> > allocated from Buddy, neither to a real user nor to pcp list.
> >> 
> >> But it is important to note that those pages still can be allocated from
> >> the pcp lists until we do drain_all_pages().
> >
> >I had the same fear, but then I saw that move_freepages_block()->move_freepages() moves
> >the pages to a new list:
> >
> ><--
> >list_move(&page->lru,
> >			  &zone->free_area[order].free_list[migratetype]);
> >-->
> >
> >
> >But looking at it again, I see that this is only for BuddyPages, so I guess
> >that pcp-pages do not really get unlinked, so we could still allocate them.
> 
> Well, I think you are right. But with this in mind, current code looks
> buggy.
> 
> Between has_unmovable_pages() and drain_all_pages(), others still could
> allocate pages on pcp list, right? This means we thought we have
> isolated the range, but not.

THere is no guarantee in that regards and I believe there is also no
demand for such a strong semantic. Or I do not see it at least. 
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 14:13         ` Michal Hocko
@ 2018-12-19 14:33           ` Wei Yang
  2018-12-19 14:39             ` Michal Hocko
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-19 14:33 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, Oscar Salvador, linux-mm, akpm, david

On Wed, Dec 19, 2018 at 03:13:43PM +0100, Michal Hocko wrote:
>On Wed 19-12-18 13:53:07, Wei Yang wrote:
>> On Wed, Dec 19, 2018 at 10:57:19AM +0100, Oscar Salvador wrote:
>> >On Wed, Dec 19, 2018 at 10:51:10AM +0100, Michal Hocko wrote:
>> >> On Wed 19-12-18 04:46:56, Wei Yang wrote:
>> >> > Below is a brief call flow for __offline_pages() and
>> >> > alloc_contig_range():
>> >> > 
>> >> >   __offline_pages()/alloc_contig_range()
>> >> >       start_isolate_page_range()
>> >> >           set_migratetype_isolate()
>> >> >               drain_all_pages()
>> >> >       drain_all_pages()
>> >> > 
>> >> > Current logic is: isolate and drain pcp list for each pageblock and
>> >> > drain pcp list again. This is not necessary and we could just drain pcp
>> >> > list once after isolate this whole range.
>> >> > 
>> >> > The reason is start_isolate_page_range() will set the migrate type of
>> >> > a range to MIGRATE_ISOLATE. After doing so, this range will never be
>> >> > allocated from Buddy, neither to a real user nor to pcp list.
>> >> 
>> >> But it is important to note that those pages still can be allocated from
>> >> the pcp lists until we do drain_all_pages().
>> >
>> >I had the same fear, but then I saw that move_freepages_block()->move_freepages() moves
>> >the pages to a new list:
>> >
>> ><--
>> >list_move(&page->lru,
>> >			  &zone->free_area[order].free_list[migratetype]);
>> >-->
>> >
>> >
>> >But looking at it again, I see that this is only for BuddyPages, so I guess
>> >that pcp-pages do not really get unlinked, so we could still allocate them.
>> 
>> Well, I think you are right. But with this in mind, current code looks
>> buggy.
>> 
>> Between has_unmovable_pages() and drain_all_pages(), others still could
>> allocate pages on pcp list, right? This means we thought we have
>> isolated the range, but not.
>
>THere is no guarantee in that regards and I believe there is also no
>demand for such a strong semantic. Or I do not see it at least. 

If there is no demand for this, allocating page before drain_all_pages()
is reasonable. The time gap between isolating page and drain_all_pages
is fine.

Then I am confused about the objection to this patch. Finally, we drain
all the pages in pcp list and the range is isolated.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 14:33           ` Wei Yang
@ 2018-12-19 14:39             ` Michal Hocko
  2018-12-20 15:58               ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-19 14:39 UTC (permalink / raw)
  To: Wei Yang; +Cc: Oscar Salvador, linux-mm, akpm, david

On Wed 19-12-18 14:33:27, Wei Yang wrote:
[...]
> Then I am confused about the objection to this patch. Finally, we drain
> all the pages in pcp list and the range is isolated.

Please read my emails more carefully. As I've said, the only reason to
do care about draining is to remove it from where it doesn't belong.

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 14:12           ` Michal Hocko
@ 2018-12-19 14:41             ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-19 14:41 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Wed, Dec 19, 2018 at 03:12:35PM +0100, Michal Hocko wrote:
>On Wed 19-12-18 13:56:35, Wei Yang wrote:
>> On Wed, Dec 19, 2018 at 02:40:56PM +0100, Michal Hocko wrote:
>> >On Wed 19-12-18 13:29:34, Wei Yang wrote:
>[...]
>> >> As the comment mentioned, in current implementation the range must be in
>> >> one zone.
>> >
>> >I do not see anything like that documented for set_migratetype_isolate.
>> 
>> The comment is not on set_migratetype_isolate, but for its two
>> (grandparent) callers:
>> 
>>    __offline_pages
>>    alloc_contig_range
>
>But those are consumers while the main api here is
>start_isolate_page_range. What happens if we grow a new user?
>Go over the same problems? See the difference?

I didn't intend to fight for my patch, just want to clarify current
implementation :-)

>
>Please try to look at these things from a higher level. We really do not
>want micro optimise on behalf of a sane API. Unless there is a very good
>reason to do that - e.g. when the performance difference is really huge.

Well, actually I get your idea and agree with you not rely on the caller
to drain the page is the proper way to handle this.

Again, I just want to clarify current situation and try to find a proper
way to make it better. Maybe I lost some point, while I am willing get
feedback and suggestions from all of you.

>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-19 14:39             ` Michal Hocko
@ 2018-12-20 15:58               ` Wei Yang
  2018-12-20 16:23                 ` Michal Hocko
  0 siblings, 1 reply; 38+ messages in thread
From: Wei Yang @ 2018-12-20 15:58 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, Oscar Salvador, linux-mm, akpm, david

On Wed, Dec 19, 2018 at 03:39:27PM +0100, Michal Hocko wrote:
>On Wed 19-12-18 14:33:27, Wei Yang wrote:
>[...]
>> Then I am confused about the objection to this patch. Finally, we drain
>> all the pages in pcp list and the range is isolated.
>
>Please read my emails more carefully. As I've said, the only reason to
>do care about draining is to remove it from where it doesn't belong.

I go through the thread again and classify two main opinions from you
and Oscar.

1) We can still allocate pages in a specific range from pcp list even we
   have already isolate this range.
2) We shouldn't rely on caller to drain pages and
   set_migratetype_isolate() may handle a range cross zones.

I understand the second one and agree it is not proper to rely on caller
and make the assumption on range for set_migratetype_isolate().

My confusion comes from the first one. As you and Oscar both mentioned
this and Oscar said "I had the same fear", this makes me think current
implementation is buggy. But your following reply said this is not. This
means current approach works fine.

If the above understanding is correct, and combining with previous
discussion, the improvement we can do is to remove the drain_all_pages()
in __offline_pages()/alloc_contig_range(). By doing so, the pcp list
drain doesn't rely on caller and the isolation/drain on each pageblock
ensures pcp list will not contain any page in this range now and future.
This imply the drain_all_pages() in
__offline_pages()/alloc_contig_range() is not necessary.

Is my understanding correct?

>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-20 15:58               ` Wei Yang
@ 2018-12-20 16:23                 ` Michal Hocko
  2018-12-21  3:37                   ` Wei Yang
  0 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2018-12-20 16:23 UTC (permalink / raw)
  To: Wei Yang; +Cc: Oscar Salvador, linux-mm, akpm, david

On Thu 20-12-18 15:58:03, Wei Yang wrote:
> On Wed, Dec 19, 2018 at 03:39:27PM +0100, Michal Hocko wrote:
> >On Wed 19-12-18 14:33:27, Wei Yang wrote:
> >[...]
> >> Then I am confused about the objection to this patch. Finally, we drain
> >> all the pages in pcp list and the range is isolated.
> >
> >Please read my emails more carefully. As I've said, the only reason to
> >do care about draining is to remove it from where it doesn't belong.
> 
> I go through the thread again and classify two main opinions from you
> and Oscar.
> 
> 1) We can still allocate pages in a specific range from pcp list even we
>    have already isolate this range.
> 2) We shouldn't rely on caller to drain pages and
>    set_migratetype_isolate() may handle a range cross zones.
> 
> I understand the second one and agree it is not proper to rely on caller
> and make the assumption on range for set_migratetype_isolate().
> 
> My confusion comes from the first one. As you and Oscar both mentioned
> this and Oscar said "I had the same fear", this makes me think current
> implementation is buggy. But your following reply said this is not. This
> means current approach works fine.
> 
> If the above understanding is correct, and combining with previous
> discussion, the improvement we can do is to remove the drain_all_pages()
> in __offline_pages()/alloc_contig_range(). By doing so, the pcp list
> drain doesn't rely on caller and the isolation/drain on each pageblock
> ensures pcp list will not contain any page in this range now and future.
> This imply the drain_all_pages() in
> __offline_pages()/alloc_contig_range() is not necessary.
> 
> Is my understanding correct?

Yes

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v2] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate()
  2018-12-20 16:23                 ` Michal Hocko
@ 2018-12-21  3:37                   ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-21  3:37 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, Oscar Salvador, linux-mm, akpm, david

On Thu, Dec 20, 2018 at 05:23:02PM +0100, Michal Hocko wrote:
>On Thu 20-12-18 15:58:03, Wei Yang wrote:
>> On Wed, Dec 19, 2018 at 03:39:27PM +0100, Michal Hocko wrote:
>> >On Wed 19-12-18 14:33:27, Wei Yang wrote:
>> >[...]
>> >> Then I am confused about the objection to this patch. Finally, we drain
>> >> all the pages in pcp list and the range is isolated.
>> >
>> >Please read my emails more carefully. As I've said, the only reason to
>> >do care about draining is to remove it from where it doesn't belong.
>> 
>> I go through the thread again and classify two main opinions from you
>> and Oscar.
>> 
>> 1) We can still allocate pages in a specific range from pcp list even we
>>    have already isolate this range.
>> 2) We shouldn't rely on caller to drain pages and
>>    set_migratetype_isolate() may handle a range cross zones.
>> 
>> I understand the second one and agree it is not proper to rely on caller
>> and make the assumption on range for set_migratetype_isolate().
>> 
>> My confusion comes from the first one. As you and Oscar both mentioned
>> this and Oscar said "I had the same fear", this makes me think current
>> implementation is buggy. But your following reply said this is not. This
>> means current approach works fine.
>> 
>> If the above understanding is correct, and combining with previous
>> discussion, the improvement we can do is to remove the drain_all_pages()
>> in __offline_pages()/alloc_contig_range(). By doing so, the pcp list
>> drain doesn't rely on caller and the isolation/drain on each pageblock
>> ensures pcp list will not contain any page in this range now and future.
>> This imply the drain_all_pages() in
>> __offline_pages()/alloc_contig_range() is not necessary.
>> 
>> Is my understanding correct?
>
>Yes

Thanks for your clarification:-)

I would come up with a patch to remove this one.

>
>-- 
>Michal Hocko
>SUSE Labs

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v3] mm: remove extra drain pages on pcp list
@ 2018-12-21 17:02     ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-21 17:02 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

In current implementation, there are two places to isolate a range of
page: __offline_pages() and alloc_contig_range(). During this procedure,
it will drain pages on pcp list.

Below is a brief call flow:

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()                 <--- A

>From this snippet we can see current logic is isolate and drain pcp list
for each pageblock and drain pcp list again for the whole range.

While the drain at A is not necessary. The reason is
start_isolate_page_range() will set the migrate type of a range to
MIGRATE_ISOLATE. After doing so, this range will never be allocated from
Buddy, neither to a real user nor to pcp list. This means the procedure
to drain pages on pcp list after start_isolate_page_range() will not
drain any page in the target range.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

---
v3:
  * it is not proper to rely on caller to drain pages, so keep to drain
    pages during iteration and remove the one in callers.
v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
---
 mm/memory_hotplug.c | 1 -
 mm/page_alloc.c     | 1 -
 2 files changed, 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6910e0eea074..d2fa6cbbb2db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	cond_resched();
 	lru_add_drain_all();
-	drain_all_pages(zone);
 
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f1edd36a1e2b..d9ee4bb3a1a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 */
 
 	lru_add_drain_all();
-	drain_all_pages(cc.zone);
 
 	order = 0;
 	outer_start = start;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v3] mm: remove extra drain pages on pcp list
@ 2018-12-21 17:02     ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2018-12-21 17:02 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

In current implementation, there are two places to isolate a range of
page: __offline_pages() and alloc_contig_range(). During this procedure,
it will drain pages on pcp list.

Below is a brief call flow:

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()                 <--- A

From this snippet we can see current logic is isolate and drain pcp list
for each pageblock and drain pcp list again for the whole range.

While the drain at A is not necessary. The reason is
start_isolate_page_range() will set the migrate type of a range to
MIGRATE_ISOLATE. After doing so, this range will never be allocated from
Buddy, neither to a real user nor to pcp list. This means the procedure
to drain pages on pcp list after start_isolate_page_range() will not
drain any page in the target range.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

---
v3:
  * it is not proper to rely on caller to drain pages, so keep to drain
    pages during iteration and remove the one in callers.
v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
---
 mm/memory_hotplug.c | 1 -
 mm/page_alloc.c     | 1 -
 2 files changed, 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6910e0eea074..d2fa6cbbb2db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	cond_resched();
 	lru_add_drain_all();
-	drain_all_pages(zone);
 
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f1edd36a1e2b..d9ee4bb3a1a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 */
 
 	lru_add_drain_all();
-	drain_all_pages(cc.zone);
 
 	order = 0;
 	outer_start = start;
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] mm: remove extra drain pages on pcp list
  2018-12-21 17:02     ` Wei Yang
  (?)
@ 2019-01-03 13:56     ` Michal Hocko
  2019-01-05 23:27       ` Wei Yang
  -1 siblings, 1 reply; 38+ messages in thread
From: Michal Hocko @ 2019-01-03 13:56 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, osalvador, david

On Sat 22-12-18 01:02:28, Wei Yang wrote:
> In current implementation, there are two places to isolate a range of
> page: __offline_pages() and alloc_contig_range(). During this procedure,
> it will drain pages on pcp list.
> 
> Below is a brief call flow:
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()                 <--- A
> 
> >From this snippet we can see current logic is isolate and drain pcp list
> for each pageblock and drain pcp list again for the whole range.
> 
> While the drain at A is not necessary. The reason is
> start_isolate_page_range() will set the migrate type of a range to
> MIGRATE_ISOLATE. After doing so, this range will never be allocated from
> Buddy, neither to a real user nor to pcp list. This means the procedure
> to drain pages on pcp list after start_isolate_page_range() will not
> drain any page in the target range.

I am still not happy with the changelog. I would suggest the following
instead

"
start_isolate_page_range is responsible for isolating the given pfn
range. One part of that job is to make sure that also pages that are on
the allocator pcp lists are properly isolated. Otherwise they could be
reused and the range wouldn't be completely isolated until the memory is
freed back.  While there is no strict guarantee here because pages might
get allocated at any time before drain_all_pages is called there doesn't
seem to be any strong demand for such a guarantee.

In any case, draining is already done at the isolation level and there
is no need to do it again later by start_isolate_page_range callers
(memory hotplug and CMA allocator currently). Therefore remove pointless
draining in existing callers to make the code more clear and
functionally correct.
"
 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>

With something like that, you can add
Acked-by: Michal Hocko <mhocko@suse.com>

> 
> ---
> v3:
>   * it is not proper to rely on caller to drain pages, so keep to drain
>     pages during iteration and remove the one in callers.
> v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
> ---
>  mm/memory_hotplug.c | 1 -
>  mm/page_alloc.c     | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6910e0eea074..d2fa6cbbb2db 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  
>  	cond_resched();
>  	lru_add_drain_all();
> -	drain_all_pages(zone);
>  
>  	pfn = scan_movable_pages(start_pfn, end_pfn);
>  	if (pfn) { /* We have movable pages */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f1edd36a1e2b..d9ee4bb3a1a7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	 */
>  
>  	lru_add_drain_all();
> -	drain_all_pages(cc.zone);
>  
>  	order = 0;
>  	outer_start = start;
> -- 
> 2.15.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v3] mm: remove extra drain pages on pcp list
  2019-01-03 13:56     ` Michal Hocko
@ 2019-01-05 23:27       ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2019-01-05 23:27 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Wei Yang, linux-mm, akpm, osalvador, david

On Thu, Jan 03, 2019 at 02:56:09PM +0100, Michal Hocko wrote:
>On Sat 22-12-18 01:02:28, Wei Yang wrote:
>> In current implementation, there are two places to isolate a range of
>> page: __offline_pages() and alloc_contig_range(). During this procedure,
>> it will drain pages on pcp list.
>> 
>> Below is a brief call flow:
>> 
>>   __offline_pages()/alloc_contig_range()
>>       start_isolate_page_range()
>>           set_migratetype_isolate()
>>               drain_all_pages()
>>       drain_all_pages()                 <--- A
>> 
>> >From this snippet we can see current logic is isolate and drain pcp list
>> for each pageblock and drain pcp list again for the whole range.
>> 
>> While the drain at A is not necessary. The reason is
>> start_isolate_page_range() will set the migrate type of a range to
>> MIGRATE_ISOLATE. After doing so, this range will never be allocated from
>> Buddy, neither to a real user nor to pcp list. This means the procedure
>> to drain pages on pcp list after start_isolate_page_range() will not
>> drain any page in the target range.
>
>I am still not happy with the changelog. I would suggest the following
>instead
>
>"
>start_isolate_page_range is responsible for isolating the given pfn
>range. One part of that job is to make sure that also pages that are on
>the allocator pcp lists are properly isolated. Otherwise they could be
>reused and the range wouldn't be completely isolated until the memory is
>freed back.  While there is no strict guarantee here because pages might
>get allocated at any time before drain_all_pages is called there doesn't
>seem to be any strong demand for such a guarantee.
>
>In any case, draining is already done at the isolation level and there
>is no need to do it again later by start_isolate_page_range callers
>(memory hotplug and CMA allocator currently). Therefore remove pointless
>draining in existing callers to make the code more clear and
>functionally correct.
>"
> 
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>
>With something like that, you can add
>Acked-by: Michal Hocko <mhocko@suse.com>
>

Thanks, would adjust it accordingly.

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v4] mm: remove extra drain pages on pcp list
@ 2019-01-05 23:31       ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2019-01-05 23:31 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

In current implementation, there are two places to isolate a range of
page: __offline_pages() and alloc_contig_range(). During this procedure,
it will drain pages on pcp list.

Below is a brief call flow:

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()                 <--- A

>From this snippet we can see current logic is isolate and drain pcp list
for each pageblock and drain pcp list again for the whole range.

start_isolate_page_range is responsible for isolating the given pfn
range. One part of that job is to make sure that also pages that are on
the allocator pcp lists are properly isolated. Otherwise they could be
reused and the range wouldn't be completely isolated until the memory is
freed back.  While there is no strict guarantee here because pages might
get allocated at any time before drain_all_pages is called there doesn't
seem to be any strong demand for such a guarantee.

In any case, draining is already done at the isolation level and there
is no need to do it again later by start_isolate_page_range callers
(memory hotplug and CMA allocator currently). Therefore remove pointless
draining in existing callers to make the code more clear and
functionally correct.

[mhocko@suse.com: provide a clearer changelog for the last two paragraph]

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>

---
v4:
  * adjust last two paragraph changelog from Michal's comment
v3:
  * it is not proper to rely on caller to drain pages, so keep to drain
    pages during iteration and remove the one in callers.
v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
---
 mm/memory_hotplug.c | 1 -
 mm/page_alloc.c     | 1 -
 2 files changed, 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6910e0eea074..d2fa6cbbb2db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	cond_resched();
 	lru_add_drain_all();
-	drain_all_pages(zone);
 
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f1edd36a1e2b..d9ee4bb3a1a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 */
 
 	lru_add_drain_all();
-	drain_all_pages(cc.zone);
 
 	order = 0;
 	outer_start = start;
-- 
2.15.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4] mm: remove extra drain pages on pcp list
@ 2019-01-05 23:31       ` Wei Yang
  0 siblings, 0 replies; 38+ messages in thread
From: Wei Yang @ 2019-01-05 23:31 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, mhocko, osalvador, david, Wei Yang

In current implementation, there are two places to isolate a range of
page: __offline_pages() and alloc_contig_range(). During this procedure,
it will drain pages on pcp list.

Below is a brief call flow:

  __offline_pages()/alloc_contig_range()
      start_isolate_page_range()
          set_migratetype_isolate()
              drain_all_pages()
      drain_all_pages()                 <--- A

From this snippet we can see current logic is isolate and drain pcp list
for each pageblock and drain pcp list again for the whole range.

start_isolate_page_range is responsible for isolating the given pfn
range. One part of that job is to make sure that also pages that are on
the allocator pcp lists are properly isolated. Otherwise they could be
reused and the range wouldn't be completely isolated until the memory is
freed back.  While there is no strict guarantee here because pages might
get allocated at any time before drain_all_pages is called there doesn't
seem to be any strong demand for such a guarantee.

In any case, draining is already done at the isolation level and there
is no need to do it again later by start_isolate_page_range callers
(memory hotplug and CMA allocator currently). Therefore remove pointless
draining in existing callers to make the code more clear and
functionally correct.

[mhocko@suse.com: provide a clearer changelog for the last two paragraph]

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Acked-by: Michal Hocko <mhocko@suse.com>

---
v4:
  * adjust last two paragraph changelog from Michal's comment
v3:
  * it is not proper to rely on caller to drain pages, so keep to drain
    pages during iteration and remove the one in callers.
v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
---
 mm/memory_hotplug.c | 1 -
 mm/page_alloc.c     | 1 -
 2 files changed, 2 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 6910e0eea074..d2fa6cbbb2db 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	cond_resched();
 	lru_add_drain_all();
-	drain_all_pages(zone);
 
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f1edd36a1e2b..d9ee4bb3a1a7 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	 */
 
 	lru_add_drain_all();
-	drain_all_pages(cc.zone);
 
 	order = 0;
 	outer_start = start;
-- 
2.15.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [PATCH v4] mm: remove extra drain pages on pcp list
  2019-01-05 23:31       ` Wei Yang
  (?)
@ 2019-01-07 11:34       ` David Hildenbrand
  -1 siblings, 0 replies; 38+ messages in thread
From: David Hildenbrand @ 2019-01-07 11:34 UTC (permalink / raw)
  To: Wei Yang, linux-mm; +Cc: akpm, mhocko, osalvador

On 06.01.19 00:31, Wei Yang wrote:
> In current implementation, there are two places to isolate a range of
> page: __offline_pages() and alloc_contig_range(). During this procedure,
> it will drain pages on pcp list.
> 
> Below is a brief call flow:
> 
>   __offline_pages()/alloc_contig_range()
>       start_isolate_page_range()
>           set_migratetype_isolate()
>               drain_all_pages()
>       drain_all_pages()                 <--- A
> 
> From this snippet we can see current logic is isolate and drain pcp list
> for each pageblock and drain pcp list again for the whole range.
> 
> start_isolate_page_range is responsible for isolating the given pfn
> range. One part of that job is to make sure that also pages that are on
> the allocator pcp lists are properly isolated. Otherwise they could be
> reused and the range wouldn't be completely isolated until the memory is
> freed back.  While there is no strict guarantee here because pages might
> get allocated at any time before drain_all_pages is called there doesn't
> seem to be any strong demand for such a guarantee.
> 
> In any case, draining is already done at the isolation level and there
> is no need to do it again later by start_isolate_page_range callers
> (memory hotplug and CMA allocator currently). Therefore remove pointless
> draining in existing callers to make the code more clear and
> functionally correct.
> 
> [mhocko@suse.com: provide a clearer changelog for the last two paragraph]
> 
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Acked-by: Michal Hocko <mhocko@suse.com>

Acked-by: David Hildenbrand <david@redhat.com>

> 
> ---
> v4:
>   * adjust last two paragraph changelog from Michal's comment
> v3:
>   * it is not proper to rely on caller to drain pages, so keep to drain
>     pages during iteration and remove the one in callers.
> v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
> ---
>  mm/memory_hotplug.c | 1 -
>  mm/page_alloc.c     | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6910e0eea074..d2fa6cbbb2db 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  
>  	cond_resched();
>  	lru_add_drain_all();
> -	drain_all_pages(zone);
>  
>  	pfn = scan_movable_pages(start_pfn, end_pfn);
>  	if (pfn) { /* We have movable pages */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f1edd36a1e2b..d9ee4bb3a1a7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	 */
>  
>  	lru_add_drain_all();
> -	drain_all_pages(cc.zone);
>  
>  	order = 0;
>  	outer_start = start;
> 


-- 

Thanks,

David / dhildenb

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4] mm: remove extra drain pages on pcp list
  2019-01-05 23:31       ` Wei Yang
  (?)
  (?)
@ 2019-01-08  9:10       ` Oscar Salvador
  -1 siblings, 0 replies; 38+ messages in thread
From: Oscar Salvador @ 2019-01-08  9:10 UTC (permalink / raw)
  To: Wei Yang; +Cc: linux-mm, akpm, mhocko, david

On Sun, Jan 06, 2019 at 07:31:41AM +0800, Wei Yang wrote:
> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Acked-by: Michal Hocko <mhocko@suse.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

> 
> ---
> v4:
>   * adjust last two paragraph changelog from Michal's comment
> v3:
>   * it is not proper to rely on caller to drain pages, so keep to drain
>     pages during iteration and remove the one in callers.
> v2: adjust changelog with MIGRATE_ISOLATE effects for the isolated range
> ---
>  mm/memory_hotplug.c | 1 -
>  mm/page_alloc.c     | 1 -
>  2 files changed, 2 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 6910e0eea074..d2fa6cbbb2db 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1599,7 +1599,6 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  
>  	cond_resched();
>  	lru_add_drain_all();
> -	drain_all_pages(zone);
>  
>  	pfn = scan_movable_pages(start_pfn, end_pfn);
>  	if (pfn) { /* We have movable pages */
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index f1edd36a1e2b..d9ee4bb3a1a7 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8041,7 +8041,6 @@ int alloc_contig_range(unsigned long start, unsigned long end,
>  	 */
>  
>  	lru_add_drain_all();
> -	drain_all_pages(cc.zone);
>  
>  	order = 0;
>  	outer_start = start;
> -- 
> 2.15.1
> 

-- 
Oscar Salvador
SUSE L3

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-01-08  9:10 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-14  2:39 [PATCH] mm, page_isolation: remove drain_all_pages() in set_migratetype_isolate() Wei Yang
2018-12-14  3:57 ` Andrew Morton
2018-12-14  7:01   ` Wei Yang
2018-12-14 15:17   ` Wei Yang
2018-12-17 12:21     ` Michal Hocko
2018-12-18 20:48       ` Wei Yang
2018-12-17 12:25 ` Michal Hocko
2018-12-17 15:08   ` Wei Yang
2018-12-17 15:48     ` Michal Hocko
2018-12-18 14:44       ` Wei Yang
2018-12-18 20:46 ` [PATCH v2] " Wei Yang
2018-12-18 21:14   ` David Hildenbrand
2018-12-18 21:49     ` Wei Yang
2018-12-18 22:18       ` David Hildenbrand
2018-12-18 23:29   ` Oscar Salvador
2018-12-19  9:51   ` Michal Hocko
2018-12-19  9:57     ` Oscar Salvador
2018-12-19 13:53       ` Wei Yang
2018-12-19 14:13         ` Michal Hocko
2018-12-19 14:33           ` Wei Yang
2018-12-19 14:39             ` Michal Hocko
2018-12-20 15:58               ` Wei Yang
2018-12-20 16:23                 ` Michal Hocko
2018-12-21  3:37                   ` Wei Yang
2018-12-19 13:29     ` Wei Yang
2018-12-19 13:40       ` Michal Hocko
2018-12-19 13:56         ` Wei Yang
2018-12-19 14:12           ` Michal Hocko
2018-12-19 14:41             ` Wei Yang
2018-12-19 10:05   ` Michal Hocko
2018-12-21 17:02   ` [PATCH v3] mm: remove extra drain pages on pcp list Wei Yang
2018-12-21 17:02     ` Wei Yang
2019-01-03 13:56     ` Michal Hocko
2019-01-05 23:27       ` Wei Yang
2019-01-05 23:31     ` [PATCH v4] " Wei Yang
2019-01-05 23:31       ` Wei Yang
2019-01-07 11:34       ` David Hildenbrand
2019-01-08  9:10       ` Oscar Salvador

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.