Linux-mm Archive on lore.kernel.org
From: Michal Hocko <mhocko@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH 1/2] mm, memory_hotplug: do not fail offlining too early
Date: Mon, 18 Sep 2017 09:08:33 +0200
Message-ID: <20170918070834.13083-2-mhocko@kernel.org>
In-Reply-To: <20170918070834.13083-1-mhocko@kernel.org>

From: Michal Hocko <mhocko@suse.com>

Memory offlining can fail too eagerly under heavy memory pressure.

[ 5410.336792] page:ffffea22a646bd00 count:255 mapcount:252 mapping:ffff88ff926c9f38 index:0x3
[ 5410.336809] flags: 0x9855fe40010048(uptodate|active|mappedtodisk)
[ 5410.336811] page dumped because: isolation failed
[ 5410.336813] page->mem_cgroup:ffff8801cd662000
[ 5420.655030] memory offlining [mem 0x18b580000000-0x18b5ffffffff] failed

Isolation has failed here because the page is not on the LRU, most
probably because it was on the pcp LRU cache or because it had already
been removed from the LRU but had not been freed yet. In both cases the
page does not look non-migratable, so retrying makes sense.

__offline_pages seems rather cluttered when it comes to the retry
logic. We have at most 5 retries and a timeout. We could argue
whether the timeout makes sense, but failing just because of a race when
somebody isolates a page from the LRU or puts it on a pcp LRU list is
just wrong. It only takes a race with a process which unmaps some pages
and removes them from the LRU list, and we can fail the whole offline
because of something that is a temporary condition and actually not
harmful for the offline.

Please note that unmovable pages should already be excluded during
start_isolate_page_range. We could argue that has_unmovable_pages is
racy and that the MIGRATE_MOVABLE check doesn't provide any hard
guarantee either, but kernel zones (aka < ZONE_MOVABLE) will very likely
detect unmovable pages in most cases, and the movable zone shouldn't
contain unmovable pages at all. Some of those pages might be pinned, but
not forever, because that would be a bug on its own. In any case the
context is still interruptible, so userspace can easily bail out when
the operation takes too long. This is certainly better behavior than a
hardcoded retry loop which is racy.
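
As an illustration of the userspace bail-out mentioned above, here is a
minimal sketch, not part of this patch. It assumes the usual memory
hotplug sysfs interface, where writing "offline" to a memory block's
state file (for example /sys/devices/system/memory/memoryN/state)
triggers the offline. The names offline_with_deadline and on_alarm are
hypothetical. A SIGALRM handler is installed without SA_RESTART so that
a pending signal can surface as -EINTR from the write.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Empty handler: its only job is to make SIGALRM interrupt the write. */
static void on_alarm(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper: write "offline" to state_path and give up after
 * `seconds`. Returns 0 on success or a negative errno value.
 */
static int offline_with_deadline(const char *state_path, unsigned int seconds)
{
	static const char cmd[] = "offline";
	struct sigaction sa, old;
	int fd, ret = 0;

	assert(state_path != NULL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;
	sigemptyset(&sa.sa_mask);
	/* Deliberately no SA_RESTART: an interrupted write should fail. */
	if (sigaction(SIGALRM, &sa, &old) < 0)
		return -errno;

	fd = open(state_path, O_WRONLY);
	if (fd < 0) {
		ret = -errno;
		goto restore;
	}

	alarm(seconds);			/* arm the userspace deadline */
	if (write(fd, cmd, strlen(cmd)) < 0)
		ret = -errno;		/* e.g. -EINTR, -EBUSY */
	alarm(0);			/* disarm */
	close(fd);
restore:
	sigaction(SIGALRM, &old, NULL);
	return ret;
}
```

A caller would pass the state file of the block it wants to offline and
treat -EINTR as "gave up after the deadline" rather than as a hard
failure.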

Fix this by removing the max retry count and relying only on the timeout
or on interruption by a signal from userspace. Also retry rather
than fail when check_pages_isolated sees some !free pages, because those
could be a result of the race as well.
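
The resulting control flow can be modelled in plain userspace C. This is
only a sketch of the exit conditions, not kernel code: struct
offline_model and model_offline are hypothetical names, ticks stand in
for jiffies, and a counter stands in for pages that are transiently
busy. It shows that the loop has exactly two failure exits, timeout
(-EBUSY) and a pending signal (-EINTR), and otherwise keeps retrying.

```c
#include <errno.h>

/* Hypothetical model state; none of these are kernel structures. */
struct offline_model {
	int busy_rounds;	/* rounds in which pages are still busy */
	long deadline;		/* tick after which the timeout expires */
	long now;		/* current tick, stands in for jiffies */
	long signal_at;		/* tick at which a signal is pending, -1 = never */
};

/* Returns 0 when the range is "offlined", or a negative errno value. */
static int model_offline(struct offline_model *m)
{
	for (;;) {
		/* Exit 1: the timeout expired. */
		if (m->now > m->deadline)
			return -EBUSY;
		/* Exit 2: userspace sent a signal. */
		if (m->signal_at >= 0 && m->now >= m->signal_at)
			return -EINTR;

		/* Draining and migration would happen here. */
		if (m->busy_rounds > 0) {
			/* Transient failure: retry, with no retry counter. */
			m->busy_rounds--;
			m->now++;
			continue;
		}
		return 0;
	}
}
```

With the old limit of 5 retries, a range that needed 20 rounds could
fail; in this model it succeeds as long as the deadline has not passed
and no signal is pending.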

Acked-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Michal Hocko <mhocko@suse.com>
---
 mm/memory_hotplug.c | 40 ++++++++++------------------------------
 1 file changed, 10 insertions(+), 30 deletions(-)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 459bbc182d10..c9dcbe6d2ac6 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1597,7 +1597,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
 {
 	unsigned long pfn, nr_pages, expire;
 	long offlined_pages;
-	int ret, drain, retry_max, node;
+	int ret, node;
 	unsigned long flags;
 	unsigned long valid_start, valid_end;
 	struct zone *zone;
@@ -1634,43 +1634,25 @@ static int __ref __offline_pages(unsigned long start_pfn,
 
 	pfn = start_pfn;
 	expire = jiffies + timeout;
-	drain = 0;
-	retry_max = 5;
 repeat:
 	/* start memory hot removal */
-	ret = -EAGAIN;
+	ret = -EBUSY;
 	if (time_after(jiffies, expire))
 		goto failed_removal;
 	ret = -EINTR;
 	if (signal_pending(current))
 		goto failed_removal;
-	ret = 0;
-	if (drain) {
-		lru_add_drain_all_cpuslocked();
-		cond_resched();
-		drain_all_pages(zone);
-	}
+
+	cond_resched();
+	lru_add_drain_all_cpuslocked();
+	drain_all_pages(zone);
 
 	pfn = scan_movable_pages(start_pfn, end_pfn);
 	if (pfn) { /* We have movable pages */
 		ret = do_migrate_range(pfn, end_pfn);
-		if (!ret) {
-			drain = 1;
-			goto repeat;
-		} else {
-			if (ret < 0)
-				if (--retry_max == 0)
-					goto failed_removal;
-			yield();
-			drain = 1;
-			goto repeat;
-		}
+		goto repeat;
 	}
-	/* drain all zone's lru pagevec, this is asynchronous... */
-	lru_add_drain_all_cpuslocked();
-	yield();
-	/* drain pcp pages, this is synchronous. */
-	drain_all_pages(zone);
+
 	/*
 	 * dissolve free hugepages in the memory block before doing offlining
 	 * actually in order to make hugetlbfs's object counting consistent.
@@ -1680,10 +1662,8 @@ static int __ref __offline_pages(unsigned long start_pfn,
 		goto failed_removal;
 	/* check again */
 	offlined_pages = check_pages_isolated(start_pfn, end_pfn);
-	if (offlined_pages < 0) {
-		ret = -EBUSY;
-		goto failed_removal;
-	}
+	if (offlined_pages < 0)
+		goto repeat;
 	pr_info("Offlined Pages %ld\n", offlined_pages);
 	/* Ok, all of our target is isolated.
 	   We cannot do rollback at this point. */
-- 
2.14.1


