Linux-mm Archive on lore.kernel.org
* [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
@ 2018-11-06  9:55 Michal Hocko
  2018-11-06 11:00 ` osalvador
                   ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Michal Hocko @ 2018-11-06  9:55 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Baoquan He, Oscar Salvador, linux-mm, LKML, Michal Hocko

From: Michal Hocko <mhocko@suse.com>

Page state checks are racy. Under a heavy memory workload (e.g. stress
-m 200 -t 2h) it is quite easy to hit a race window when the page is
allocated but its state is not fully populated yet. A debugging patch to
dump the struct page state shows
: [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
: [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
: [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)

Note that the state has been checked for both PageLRU and PageSwapBacked
already. Closing this race completely would require some sort of retry
logic. This can be tricky and error prone (think of potential endless
or long taking loops).

Work around this problem for movable zones at least. Such a zone should
only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
has_unmovable_pages more robust") has told us that this is not strictly
true though. Bootmem pages, however, should be marked reserved, so we
can move the original check after the PageReserved check. Pages from
other zones are still prone to races, but we do not even pretend that
memory hotremove works for those, so premature failure doesn't hurt that
much.

Reported-and-tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---

Hi,
this has been reported [1] and we have tried multiple things to address
the issue. The only reliable way was to reintroduce the movable zone
check into has_unmovable_pages. This time it should be safe also for
the bug originally fixed by 15c30bc09085.

[1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 863d46da6586..c6d900ee4982 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if (PageReserved(page))
 			goto unmovable;
 
+		/*
+		 * If the zone is movable and we have ruled out all reserved
+		 * pages then it should be reasonably safe to assume the rest
+		 * is movable.
+		 */
+		if (zone_idx(zone) == ZONE_MOVABLE)
+			continue;
+
 		/*
 		 * Hugepages are not in LRU lists, but they're movable.
 		 * We need not scan over tail pages bacause we don't
-- 
2.19.1
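The check ordering the patch establishes can be pictured with a small
userspace simulation (a hypothetical toy model, not kernel code; the
struct and field names only mirror the kernel's PageReserved()/PageLRU()
helpers):

```c
#include <assert.h>
#include <stdbool.h>

enum zone_type { ZONE_NORMAL, ZONE_MOVABLE };

struct toy_page {
	bool reserved;	/* mirrors PageReserved() */
	bool lru;	/* mirrors PageLRU() -- may be racily unset */
};

/*
 * Simplified model of has_unmovable_pages() after the patch:
 * reserved pages always fail, and once those are ruled out,
 * everything in ZONE_MOVABLE is assumed movable -- the racy
 * per-page state checks are skipped entirely.
 */
static bool has_unmovable_pages(enum zone_type zone,
				const struct toy_page *pages, int n)
{
	for (int i = 0; i < n; i++) {
		if (pages[i].reserved)
			return true;	/* the "goto unmovable" path */
		if (zone == ZONE_MOVABLE)
			continue;	/* the check this patch adds */
		if (!pages[i].lru)
			return true;	/* racy state check, now skipped */
	}
	return false;
}
```

With this ordering, a ZONE_MOVABLE page whose LRU state has not been
populated yet (the race in the report) no longer causes a false
positive, while bootmem pages still trip the reserved check first.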


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-06  9:55 [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages Michal Hocko
@ 2018-11-06 11:00 ` osalvador
  2018-11-06 20:35 ` Balbir Singh
  2018-11-15  3:13 ` Baoquan He
  2 siblings, 0 replies; 12+ messages in thread
From: osalvador @ 2018-11-06 11:00 UTC (permalink / raw)
  To: Michal Hocko, Andrew Morton; +Cc: Baoquan He, linux-mm, LKML, Michal Hocko

On Tue, 2018-11-06 at 10:55 +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> Signed-off-by: Michal Hocko <mhocko@suse.com>

Looks good to me.

Reviewed-by: Oscar Salvador <osalvador@suse.de>


Oscar Salvador


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-06  9:55 [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages Michal Hocko
  2018-11-06 11:00 ` osalvador
@ 2018-11-06 20:35 ` Balbir Singh
  2018-11-07  7:35   ` Michal Hocko
  2018-11-15  3:13 ` Baoquan He
  2 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2018-11-06 20:35 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML, Michal Hocko

On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Page state checks are racy. Under a heavy memory workload (e.g. stress
> -m 200 -t 2h) it is quite easy to hit a race window when the page is
> allocated but its state is not fully populated yet. A debugging patch to
> dump the struct page state shows
> : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> 
> Note that the state has been checked for both PageLRU and PageSwapBacked
> already. Closing this race completely would require some sort of retry
> logic. This can be tricky and error prone (think of potential endless
> or long taking loops).
> 
> Workaround this problem for movable zones at least. Such a zone should
> only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> has_unmovable_pages more robust") has told us that this is not strictly
> true though. Bootmem pages should be marked reserved though so we can
> move the original check after the PageReserved check. Pages from other
> zones are still prone to races but we even do not pretend that memory
> hotremove works for those so pre-mature failure doesn't hurt that much.
> 
> Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
>
> Hi,
> this has been reported [1] and we have tried multiple things to address
> the issue. The only reliable way was to reintroduce the movable zone
> check into has_unmovable_pages. This time it should be safe also for
> the bug originally fixed by 15c30bc09085.
> 
> [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
>  mm/page_alloc.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 863d46da6586..c6d900ee4982 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  		if (PageReserved(page))
>  			goto unmovable;
>  
> +		/*
> +		 * If the zone is movable and we have ruled out all reserved
> +		 * pages then it should be reasonably safe to assume the rest
> +		 * is movable.
> +		 */
> +		if (zone_idx(zone) == ZONE_MOVABLE)
> +			continue;
> +
>  		/*


There is a WARN_ON() in case of failure at the end of the routine;
is that triggered when we hit the bug? If we're adding this patch,
the WARN_ON needs to go as well.

The check seems to be quite aggressive, and it is in a loop that
iterates over pages but has nothing to do with the page itself. Did you
mean to make the check

zone_idx(page_zone(page)) == ZONE_MOVABLE

It also skips all checks for pinned pages and other checks.


Balbir Singh. 


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-06 20:35 ` Balbir Singh
@ 2018-11-07  7:35   ` Michal Hocko
  2018-11-07  7:40     ` Michal Hocko
                       ` (2 more replies)
  0 siblings, 3 replies; 12+ messages in thread
From: Michal Hocko @ 2018-11-07  7:35 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML

On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Page state checks are racy. Under a heavy memory workload (e.g. stress
> > -m 200 -t 2h) it is quite easy to hit a race window when the page is
> > allocated but its state is not fully populated yet. A debugging patch to
> > dump the struct page state shows
> > : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> > : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> > : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> > 
> > Note that the state has been checked for both PageLRU and PageSwapBacked
> > already. Closing this race completely would require some sort of retry
> > logic. This can be tricky and error prone (think of potential endless
> > or long taking loops).
> > 
> > Workaround this problem for movable zones at least. Such a zone should
> > only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> > has_unmovable_pages more robust") has told us that this is not strictly
> > true though. Bootmem pages should be marked reserved though so we can
> > move the original check after the PageReserved check. Pages from other
> > zones are still prone to races but we even do not pretend that memory
> > hotremove works for those so pre-mature failure doesn't hurt that much.
> > 
> > Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> > Acked-by: Baoquan He <bhe@redhat.com>
> > Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> >
> > Hi,
> > this has been reported [1] and we have tried multiple things to address
> > the issue. The only reliable way was to reintroduce the movable zone
> > check into has_unmovable_pages. This time it should be safe also for
> > the bug originally fixed by 15c30bc09085.
> > 
> > [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
> >  mm/page_alloc.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 863d46da6586..c6d900ee4982 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> >  		if (PageReserved(page))
> >  			goto unmovable;
> >  
> > +		/*
> > +		 * If the zone is movable and we have ruled out all reserved
> > +		 * pages then it should be reasonably safe to assume the rest
> > +		 * is movable.
> > +		 */
> > +		if (zone_idx(zone) == ZONE_MOVABLE)
> > +			continue;
> > +
> >  		/*
> 
> 
> There is a WARN_ON() in case of failure at the end of the routine,
> is that triggered when we hit the bug? If we're adding this patch,
> the WARN_ON needs to go as well.

No the warning should stay in case we encounter reserved pages in zone
movable.

> The check seems to be quite aggressive and in a loop that iterates
> pages, but has nothing to do with the page, did you mean to make
> the check
> 
> zone_idx(page_zone(page)) == ZONE_MOVABLE

Does it make any difference? Can we actually encounter a page from a
different zone here?

> it also skips all checks for pinned pages and other checks

Yes, this is intentional and the comment tries to explain why. I wish we
could add more specific checks for movable pages - e.g. detect long-term
pins that would prevent migration - but we do not have any facility for
that. Please note that the worst case of a false positive is a repeated
migration failure, and the user has a way to break out of the migration
by a signal.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07  7:35   ` Michal Hocko
@ 2018-11-07  7:40     ` Michal Hocko
  2018-11-07  7:55     ` osalvador
  2018-11-07 12:53     ` Balbir Singh
  2 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2018-11-07  7:40 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML

On Wed 07-11-18 08:35:48, Michal Hocko wrote:
> On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > Page state checks are racy. Under a heavy memory workload (e.g. stress
> > > -m 200 -t 2h) it is quite easy to hit a race window when the page is
> > > allocated but its state is not fully populated yet. A debugging patch to
> > > dump the struct page state shows
> > > : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> > > : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> > > : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> > > 
> > > Note that the state has been checked for both PageLRU and PageSwapBacked
> > > already. Closing this race completely would require some sort of retry
> > > logic. This can be tricky and error prone (think of potential endless
> > > or long taking loops).
> > > 
> > > Workaround this problem for movable zones at least. Such a zone should
> > > only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> > > has_unmovable_pages more robust") has told us that this is not strictly
> > > true though. Bootmem pages should be marked reserved though so we can
> > > move the original check after the PageReserved check. Pages from other
> > > zones are still prone to races but we even do not pretend that memory
> > > hotremove works for those so pre-mature failure doesn't hurt that much.
> > > 
> > > Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> > > Acked-by: Baoquan He <bhe@redhat.com>
> > > Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >
> > > Hi,
> > > this has been reported [1] and we have tried multiple things to address
> > > the issue. The only reliable way was to reintroduce the movable zone
> > > check into has_unmovable_pages. This time it should be safe also for
> > > the bug originally fixed by 15c30bc09085.
> > > 
> > > [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
> > >  mm/page_alloc.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 863d46da6586..c6d900ee4982 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> > >  		if (PageReserved(page))
> > >  			goto unmovable;
> > >  
> > > +		/*
> > > +		 * If the zone is movable and we have ruled out all reserved
> > > +		 * pages then it should be reasonably safe to assume the rest
> > > +		 * is movable.
> > > +		 */
> > > +		if (zone_idx(zone) == ZONE_MOVABLE)
> > > +			continue;
> > > +
> > >  		/*
> > 
> > 
> > There is a WARN_ON() in case of failure at the end of the routine,
> > is that triggered when we hit the bug? If we're adding this patch,
> > the WARN_ON needs to go as well.
> 
> No the warning should stay in case we encounter reserved pages in zone
> movable.

And to clarify: I am OK with changing the WARN to pr_warn if the warning
is considered harmful, but we do want to note that something unexpected
is going on here.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07  7:35   ` Michal Hocko
  2018-11-07  7:40     ` Michal Hocko
@ 2018-11-07  7:55     ` osalvador
  2018-11-07  8:14       ` Michal Hocko
  2018-11-07 12:53     ` Balbir Singh
  2 siblings, 1 reply; 12+ messages in thread
From: osalvador @ 2018-11-07  7:55 UTC (permalink / raw)
  To: Michal Hocko, Balbir Singh; +Cc: Andrew Morton, Baoquan He, linux-mm, LKML

On Wed, 2018-11-07 at 08:35 +0100, Michal Hocko wrote:
> On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > The check seems to be quite aggressive and in a loop that iterates
> > pages, but has nothing to do with the page, did you mean to make
> > the check
> > 
> > zone_idx(page_zone(page)) == ZONE_MOVABLE
> 
> Does it make any difference? Can we actually encounter a page from a
> different zone here?

AFAIK, test_pages_in_a_zone() called from offline_pages() should ensure
that the range belongs to a unique zone, so we should not encounter
pages from other zones there, right?
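That uniqueness guarantee can be sketched as a userspace toy (a
hypothetical simplification, not the kernel's actual
test_pages_in_a_zone(), which also has to cope with holes in the pfn
range):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for walking a pfn range and checking that
 * page_zone(pfn_to_page(pfn)) is the same for every pfn:
 * zone_of_pfn[] holds a zone id per pfn.
 */
static bool pfns_in_one_zone(const int *zone_of_pfn, size_t start, size_t end)
{
	int zone = zone_of_pfn[start];

	for (size_t pfn = start + 1; pfn < end; pfn++) {
		if (zone_of_pfn[pfn] != zone)
			return false;	/* range straddles a zone boundary */
	}
	return true;
}
```

Because offline_pages() bails out early when this kind of check fails,
has_unmovable_pages() can reason about its single `zone` argument
instead of calling page_zone() on every page, which is the point being
made in this subthread.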

---
Oscar
Suse L3


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07  7:55     ` osalvador
@ 2018-11-07  8:14       ` Michal Hocko
  0 siblings, 0 replies; 12+ messages in thread
From: Michal Hocko @ 2018-11-07  8:14 UTC (permalink / raw)
  To: osalvador; +Cc: Balbir Singh, Andrew Morton, Baoquan He, linux-mm, LKML

On Wed 07-11-18 08:55:26, osalvador wrote:
> On Wed, 2018-11-07 at 08:35 +0100, Michal Hocko wrote:
> > On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > > The check seems to be quite aggressive and in a loop that iterates
> > > pages, but has nothing to do with the page, did you mean to make
> > > the check
> > > 
> > > zone_idx(page_zone(page)) == ZONE_MOVABLE
> > 
> > Does it make any difference? Can we actually encounter a page from a
> > different zone here?
> 
> AFAIK, test_pages_in_a_zone() called from offline_pages() should ensure
> that the range belongs to a unique zone, so we should not encounter
> pages from other zones there, right?

Yes, that is the case for memory hotplug. We do assume a single zone at
set_migratetype_isolate(), where we take the zone->lock. If contig_alloc
can span multiple zones, then it should perform a similar check.
-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07  7:35   ` Michal Hocko
  2018-11-07  7:40     ` Michal Hocko
  2018-11-07  7:55     ` osalvador
@ 2018-11-07 12:53     ` Balbir Singh
  2018-11-07 13:06       ` Michal Hocko
  2 siblings, 1 reply; 12+ messages in thread
From: Balbir Singh @ 2018-11-07 12:53 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML

On Wed, Nov 07, 2018 at 08:35:48AM +0100, Michal Hocko wrote:
> On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> > On Tue, Nov 06, 2018 at 10:55:24AM +0100, Michal Hocko wrote:
> > > From: Michal Hocko <mhocko@suse.com>
> > > 
> > > Page state checks are racy. Under a heavy memory workload (e.g. stress
> > > -m 200 -t 2h) it is quite easy to hit a race window when the page is
> > > allocated but its state is not fully populated yet. A debugging patch to
> > > dump the struct page state shows
> > > : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> > > : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> > > : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> > > 
> > > Note that the state has been checked for both PageLRU and PageSwapBacked
> > > already. Closing this race completely would require some sort of retry
> > > logic. This can be tricky and error prone (think of potential endless
> > > or long taking loops).
> > > 
> > > Workaround this problem for movable zones at least. Such a zone should
> > > only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> > > has_unmovable_pages more robust") has told us that this is not strictly
> > > true though. Bootmem pages should be marked reserved though so we can
> > > move the original check after the PageReserved check. Pages from other
> > > zones are still prone to races but we even do not pretend that memory
> > > hotremove works for those so pre-mature failure doesn't hurt that much.
> > > 
> > > Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> > > Acked-by: Baoquan He <bhe@redhat.com>
> > > Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> > > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > > ---
> > >
> > > Hi,
> > > this has been reported [1] and we have tried multiple things to address
> > > the issue. The only reliable way was to reintroduce the movable zone
> > > check into has_unmovable_pages. This time it should be safe also for
> > > the bug originally fixed by 15c30bc09085.
> > > 
> > > [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
> > >  mm/page_alloc.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 863d46da6586..c6d900ee4982 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> > >  		if (PageReserved(page))
> > >  			goto unmovable;
> > >  
> > > +		/*
> > > +		 * If the zone is movable and we have ruled out all reserved
> > > +		 * pages then it should be reasonably safe to assume the rest
> > > +		 * is movable.
> > > +		 */
> > > +		if (zone_idx(zone) == ZONE_MOVABLE)
> > > +			continue;
> > > +
> > >  		/*
> > 
> > 
> > There is a WARN_ON() in case of failure at the end of the routine,
> > is that triggered when we hit the bug? If we're adding this patch,
> > the WARN_ON needs to go as well.
> 
> No the warning should stay in case we encounter reserved pages in zone
> movable.
>

Fair enough!
 
> > The check seems to be quite aggressive and in a loop that iterates
> > pages, but has nothing to do with the page, did you mean to make
> > the check
> > 
> > zone_idx(page_zone(page)) == ZONE_MOVABLE
> 
> Does it make any difference? Can we actually encounter a page from a
> different zone here?
> 

Just to avoid page-state-related issues, do we want to go ahead
with the migration if zone_idx(page_zone(page)) != ZONE_MOVABLE?

> > it also skips all checks for pinned pages and other checks
> 
> Yes, this is intentional and the comment tries to explain why. I wish we
> could be add a more specific checks for movable pages - e.g. detect long
> term pins that would prevent migration - but we do not have any facility
> for that. Please note that the worst case of a false positive is a
> repeated migration failure and user has a way to break out of migration
> by a signal.
>

Basically isolate_pages() will fail as opposed to hotplug failing upfront.
The basic assertion this patch makes is that all ZONE_MOVABLE pages that
are not reserved are hotpluggable.

Balbir Singh.


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07 12:53     ` Balbir Singh
@ 2018-11-07 13:06       ` Michal Hocko
  2018-11-09 10:45         ` Balbir Singh
  0 siblings, 1 reply; 12+ messages in thread
From: Michal Hocko @ 2018-11-07 13:06 UTC (permalink / raw)
  To: Balbir Singh; +Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML

On Wed 07-11-18 23:53:24, Balbir Singh wrote:
> On Wed, Nov 07, 2018 at 08:35:48AM +0100, Michal Hocko wrote:
> > On Wed 07-11-18 07:35:18, Balbir Singh wrote:
[...]
> > > The check seems to be quite aggressive and in a loop that iterates
> > > pages, but has nothing to do with the page, did you mean to make
> > > the check
> > > 
> > > zone_idx(page_zone(page)) == ZONE_MOVABLE
> > 
> > Does it make any difference? Can we actually encounter a page from a
> > different zone here?
> > 
> 
> Just to avoid page state related issues, do we want to go ahead
> with the migration if zone_idx(page_zone(page)) != ZONE_MOVABLE.

Could you be more specific what kind of state related issues you have in
mind?

> > > it also skips all checks for pinned pages and other checks
> > 
> > Yes, this is intentional and the comment tries to explain why. I wish we
> > could be add a more specific checks for movable pages - e.g. detect long
> > term pins that would prevent migration - but we do not have any facility
> > for that. Please note that the worst case of a false positive is a
> > repeated migration failure and user has a way to break out of migration
> > by a signal.
> >
> 
> Basically isolate_pages() will fail as opposed to hotplug failing upfront.
> The basic assertion this patch makes is that all ZONE_MOVABLE pages that
> are not reserved are hotpluggable.

Yes, that is correct.

-- 
Michal Hocko
SUSE Labs


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-07 13:06       ` Michal Hocko
@ 2018-11-09 10:45         ` Balbir Singh
  0 siblings, 0 replies; 12+ messages in thread
From: Balbir Singh @ 2018-11-09 10:45 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Baoquan He, Oscar Salvador, linux-mm, LKML

On Wed, Nov 07, 2018 at 02:06:55PM +0100, Michal Hocko wrote:
> On Wed 07-11-18 23:53:24, Balbir Singh wrote:
> > On Wed, Nov 07, 2018 at 08:35:48AM +0100, Michal Hocko wrote:
> > > On Wed 07-11-18 07:35:18, Balbir Singh wrote:
> [...]
> > > > The check seems to be quite aggressive and in a loop that iterates
> > > > pages, but has nothing to do with the page, did you mean to make
> > > > the check
> > > > 
> > > > zone_idx(page_zone(page)) == ZONE_MOVABLE
> > > 
> > > Does it make any difference? Can we actually encounter a page from a
> > > different zone here?
> > > 
> > 
> > Just to avoid page state related issues, do we want to go ahead
> > with the migration if zone_idx(page_zone(page)) != ZONE_MOVABLE.
> 
> Could you be more specific what kind of state related issues you have in
> mind?
> 

I was wondering if page_zone() is setup correctly, but it's setup
upfront, so I don't think that is ever an issue.

> > > > it also skips all checks for pinned pages and other checks
> > > 
> > > Yes, this is intentional and the comment tries to explain why. I wish we
> > > could be add a more specific checks for movable pages - e.g. detect long
> > > term pins that would prevent migration - but we do not have any facility
> > > for that. Please note that the worst case of a false positive is a
> > > repeated migration failure and user has a way to break out of migration
> > > by a signal.
> > >
> > 
> > Basically isolate_pages() will fail as opposed to hotplug failing upfront.
> > The basic assertion this patch makes is that all ZONE_MOVABLE pages that
> > are not reserved are hotpluggable.
> 
> Yes, that is correct.
>

I wonder if it would be easier to catch a __SetPageReserved() on
ZONE_MOVABLE memory at set time; the downside is that we never know
whether that memory will ever be hot(un)plugged. The patch itself, I
think, is OK.
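The "catch it at set time" idea could look roughly like this (a
hypothetical userspace sketch; the real __SetPageReserved() is a
generated page-flag helper and has no such hook):

```c
#include <assert.h>
#include <stdbool.h>

enum zone_type { ZONE_NORMAL, ZONE_MOVABLE };

struct toy_page {
	bool reserved;
};

/*
 * Hypothetical check at reservation time: refuse (or warn about)
 * reserving a page that sits in ZONE_MOVABLE, instead of only
 * discovering the reserved page later in has_unmovable_pages().
 */
static bool reserve_page(enum zone_type zone, struct toy_page *page)
{
	if (zone == ZONE_MOVABLE)
		return false;	/* a WARN_ON() would go here */
	page->reserved = true;
	return true;
}
```

The trade-off noted above still applies: flagging the reservation
eagerly reports a problem even for memory that is never hot(un)plugged.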

Acked-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh.


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-06  9:55 [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages Michal Hocko
  2018-11-06 11:00 ` osalvador
  2018-11-06 20:35 ` Balbir Singh
@ 2018-11-15  3:13 ` Baoquan He
  2018-11-15  3:18   ` Baoquan He
  2 siblings, 1 reply; 12+ messages in thread
From: Baoquan He @ 2018-11-15  3:13 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Oscar Salvador, linux-mm, LKML, Michal Hocko

On 11/06/18 at 10:55am, Michal Hocko wrote:
> From: Michal Hocko <mhocko@suse.com>
> 
> Page state checks are racy. Under a heavy memory workload (e.g. stress
> -m 200 -t 2h) it is quite easy to hit a race window when the page is
> allocated but its state is not fully populated yet. A debugging patch to

The original symptom is that the value of /sys/devices/system/memory/memoryxxx/removable
is 0 on several memory blocks of a hotpluggable node. Almost every
hotpluggable node has one or several blocks with this zero value of the
removable attribute, which always caused hot removal to fail.

Only a cat of /sys/devices/system/memory/memoryxxx/removable will
trigger the call trace.

With this fix, the 'removable' attribute of all memory blocks on those
hotpluggable nodes is '1', and hot removal can succeed.

> dump the struct page state shows
> : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> 
> Note that the state has been checked for both PageLRU and PageSwapBacked
> already. Closing this race completely would require some sort of retry
> logic. This can be tricky and error prone (think of potential endless
> or long taking loops).
> 
> Workaround this problem for movable zones at least. Such a zone should
> only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> has_unmovable_pages more robust") has told us that this is not strictly
> true though. Bootmem pages should be marked reserved though so we can
> move the original check after the PageReserved check. Pages from other
> zones are still prone to races but we even do not pretend that memory
> hotremove works for those so pre-mature failure doesn't hurt that much.
> 
> Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> Acked-by: Baoquan He <bhe@redhat.com>
> Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")

Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")

> Signed-off-by: Michal Hocko <mhocko@suse.com>
> ---
> 
> Hi,
> this has been reported [1] and we have tried multiple things to address
> the issue. The only reliable way was to reintroduce the movable zone
> check into has_unmovable_pages. This time it should be safe also for
> the bug originally fixed by 15c30bc09085.
> 
> [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
>  mm/page_alloc.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 863d46da6586..c6d900ee4982 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
>  		if (PageReserved(page))
>  			goto unmovable;
>  
> +		/*
> +		 * If the zone is movable and we have ruled out all reserved
> +		 * pages then it should be reasonably safe to assume the rest
> +		 * is movable.
> +		 */
> +		if (zone_idx(zone) == ZONE_MOVABLE)
> +			continue;
> +
>  		/*
>  		 * Hugepages are not in LRU lists, but they're movable.
>  		 * We need not scan over tail pages bacause we don't
> -- 
> 2.19.1
> 


* Re: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
  2018-11-15  3:13 ` Baoquan He
@ 2018-11-15  3:18   ` Baoquan He
  0 siblings, 0 replies; 12+ messages in thread
From: Baoquan He @ 2018-11-15  3:18 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Andrew Morton, Oscar Salvador, linux-mm, LKML, Michal Hocko

On 11/15/18 at 11:13am, Baoquan He wrote:
> On 11/06/18 at 10:55am, Michal Hocko wrote:
> > From: Michal Hocko <mhocko@suse.com>
> > 
> > Page state checks are racy. Under a heavy memory workload (e.g. stress
> > -m 200 -t 2h) it is quite easy to hit a race window when the page is
> > allocated but its state is not fully populated yet. A debugging patch to
> 
> The original phenomenon is the value of /sys/devices/system/memory/memoryxxx/removable
> is 0 on several memory blocks of hotpluggable node. And almost on each
> hotpluggable node, there are one or several blocks which has this zero
> value of removable attribute. It caused the hot removing failure always.
> 
> And only cat /sys/devices/system/memory/memoryxxx/removable will trigger
> the call trace.
> 
> With this fix, all 'removable' of memory block on those hotpluggable
> nodes are '1', and hotplug can succeed.

Oh, by the way, hot removing/adding always succeeds when there is no
memory pressure.

The hot-removal failure under high memory pressure has been raised in
another thread.

Thanks
Baoquan

> 
> > dump the struct page state shows
> > : [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
> > : [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
> > : [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)
> > 
> > Note that the state has been checked for both PageLRU and PageSwapBacked
> > already. Closing this race completely would require some sort of retry
> > logic. This can be tricky and error prone (think of potential endless
> > or long taking loops).
> > 
> > Workaround this problem for movable zones at least. Such a zone should
> > only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
> > has_unmovable_pages more robust") has told us that this is not strictly
> > true though. Bootmem pages should be marked reserved though so we can
> > move the original check after the PageReserved check. Pages from other
> > zones are still prone to races but we even do not pretend that memory
> > hotremove works for those so pre-mature failure doesn't hurt that much.
> > 
> > Reported-and-tested-by: Baoquan He <bhe@redhat.com>
> > Acked-by: Baoquan He <bhe@redhat.com>
> > Fixes: "mm, memory_hotplug: make has_unmovable_pages more robust")
> 
> Fixes: 15c30bc09085 "mm, memory_hotplug: make has_unmovable_pages more robust")
> 
> > Signed-off-by: Michal Hocko <mhocko@suse.com>
> > ---
> > 
> > Hi,
> > this has been reported [1] and we have tried multiple things to address
> > the issue. The only reliable way was to reintroduce the movable zone
> > check into has_unmovable_pages. This time it should be safe also for
> > the bug originally fixed by 15c30bc09085.
> > 
> > [1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv
> >  mm/page_alloc.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 863d46da6586..c6d900ee4982 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
> >  		if (PageReserved(page))
> >  			goto unmovable;
> >  
> > +		/*
> > +		 * If the zone is movable and we have ruled out all reserved
> > +		 * pages then it should be reasonably safe to assume the rest
> > +		 * is movable.
> > +		 */
> > +		if (zone_idx(zone) == ZONE_MOVABLE)
> > +			continue;
> > +
> >  		/*
> >  		 * Hugepages are not in LRU lists, but they're movable.
> >  		 * We need not scan over tail pages bacause we don't
> > -- 
> > 2.19.1
> > 

