Linux-mm Archive on lore.kernel.org
From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Baoquan He <bhe@redhat.com>, Oscar Salvador <OSalvador@suse.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>
Subject: [PATCH] mm, memory_hotplug: check zone_movable in has_unmovable_pages
Date: Tue,  6 Nov 2018 10:55:24 +0100
Message-ID: <20181106095524.14629-1-mhocko@kernel.org> (raw)

From: Michal Hocko <mhocko@suse.com>

Page state checks are racy. Under a heavy memory workload (e.g. stress
-m 200 -t 2h) it is quite easy to hit a race window when the page is
allocated but its state is not fully populated yet. A debugging patch to
dump the struct page state shows
: [  476.575516] has_unmovable_pages: pfn:0x10dfec00, found:0x1, count:0x0
: [  476.582103] page:ffffea0437fb0000 count:1 mapcount:1 mapping:ffff880e05239841 index:0x7f26e5000 compound_mapcount: 1
: [  476.592645] flags: 0x5fffffc0090034(uptodate|lru|active|head|swapbacked)

Note that the state has been checked for both PageLRU and PageSwapBacked
already. Closing this race completely would require some sort of retry
logic. This can be tricky and error prone (think of potentially endless
or long-running loops).

Work around this problem for movable zones at least. Such a zone should
only contain movable pages. 15c30bc09085 ("mm, memory_hotplug: make
has_unmovable_pages more robust") has told us that this is not strictly
true, though. Bootmem pages should be marked reserved, however, so we
can move the original check after the PageReserved check. Pages from
other zones are still prone to races, but we do not even pretend that
memory hotremove works for those, so a premature failure doesn't hurt
that much.

Reported-and-tested-by: Baoquan He <bhe@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Fixes: 15c30bc09085 ("mm, memory_hotplug: make has_unmovable_pages more robust")
Signed-off-by: Michal Hocko <mhocko@suse.com>
---

Hi,
this has been reported [1] and we have tried multiple things to address
the issue. The only reliable way was to reintroduce the movable zone
check into has_unmovable_pages. This time it should be safe also for
the bug originally fixed by 15c30bc09085.
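
For readers who want to see the check ordering in isolation, here is a
minimal userspace sketch. It is not the kernel code: struct model_page,
enum model_zone and model_has_unmovable_pages() are hypothetical
stand-ins invented for illustration, the "looks_movable" flag is an
assumed simplification of the racy state checks (PageLRU and friends),
and the real function's count handling is left out entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
enum model_zone { MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE };

struct model_page {
	bool reserved;      /* models PageReserved() */
	bool looks_movable; /* models the racy state checks, e.g. PageLRU */
};

/*
 * Simplified model of the check order after this patch.
 * Returns true if any page in the range must be treated as unmovable.
 */
static bool model_has_unmovable_pages(enum model_zone zone,
				      const struct model_page *pages,
				      size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Reserved (e.g. bootmem) pages are unmovable in any zone. */
		if (pages[i].reserved)
			return true;

		/*
		 * In a movable zone, everything that is not reserved is
		 * assumed movable, so the racy state checks are skipped.
		 */
		if (zone == MODEL_ZONE_MOVABLE)
			continue;

		/* Other zones still depend on the racy page state. */
		if (!pages[i].looks_movable)
			return true;
	}
	return false;
}
```

With a page whose state is not fully populated yet (reserved == false,
looks_movable == false), the model reports the range as movable for
MODEL_ZONE_MOVABLE and as unmovable for MODEL_ZONE_NORMAL, which is the
behaviour the changelog describes.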

[1] http://lkml.kernel.org/r/20181101091055.GA15166@MiWiFi-R3L-srv

 mm/page_alloc.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 863d46da6586..c6d900ee4982 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7788,6 +7788,14 @@ bool has_unmovable_pages(struct zone *zone, struct page *page, int count,
 		if (PageReserved(page))
 			goto unmovable;
 
+		/*
+		 * If the zone is movable and we have ruled out all reserved
+		 * pages then it should be reasonably safe to assume the rest
+		 * is movable.
+		 */
+		if (zone_idx(zone) == ZONE_MOVABLE)
+			continue;
+
 		/*
 		 * Hugepages are not in LRU lists, but they're movable.
 		 * We need not scan over tail pages bacause we don't
-- 
2.19.1
