linux-mm.kvack.org archive mirror
From: Vitaly Kuznetsov <vkuznets@redhat.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@suse.com>, Vlastimil Babka <vbabka@suse.cz>,
	Mel Gorman <mgorman@techsingularity.net>,
	YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>,
	Hillf Danton <hillf.zj@alibaba-inc.com>,
	Johannes Weiner <hannes@cmpxchg.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	Stephen Hemminger <sthemmin@microsoft.com>,
	Alex Ng <alexng@microsoft.com>
Subject: [PATCH RFC] mm/memory_hotplug: make it possible to offline blocks with reserved pages
Date: Wed,  8 Nov 2017 14:01:55 +0100
Message-ID: <20171108130155.25499-1-vkuznets@redhat.com>

The Hyper-V balloon driver needs to hotplug memory in smaller chunks, so to
work around Linux's 128MB alignment requirement it does a trick: partly
populated 128MB blocks are added and then a custom online_page_callback
hook checks whether each page is 'backed' during onlining; a page that is
not backed is left in the Reserved state. When the host adds more pages
to the block we bring them online from the driver (see
hv_bring_pgs_online()/hv_page_online_one() in drivers/hv/hv_balloon.c).
Eventually the whole block becomes fully populated and we hotplug the next
128MB. This has worked for quite some time.
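
The onlining trick described above can be illustrated with a small
user-space model. This is only a sketch: every name in it (model_block,
model_online_page() and so on) is hypothetical and none of it is the
hv_balloon code; it assumes, for simplicity, that the host backs pages in
order from the start of the block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; these names are not kernel APIs. */
enum model_page_state { MODEL_RESERVED, MODEL_ONLINE };

struct model_block {
	size_t nr_pages;              /* pages in the hot-added block */
	size_t nr_backed;             /* pages the host has backed so far */
	enum model_page_state *state; /* one entry per page */
};

/* Plays the role of the custom online hook: online only backed pages. */
static void model_online_page(struct model_block *blk, size_t idx)
{
	assert(idx < blk->nr_pages);
	if (idx < blk->nr_backed)
		blk->state[idx] = MODEL_ONLINE;
	else
		blk->state[idx] = MODEL_RESERVED;
}

/* Onlining the block invokes the hook once for every page. */
static void model_online_block(struct model_block *blk)
{
	for (size_t i = 0; i < blk->nr_pages; i++)
		model_online_page(blk, i);
}

/* The host backs 'count' more pages; the driver onlines them itself. */
static void model_host_adds_pages(struct model_block *blk, size_t count)
{
	size_t end = blk->nr_backed + count;

	if (end > blk->nr_pages)
		end = blk->nr_pages;
	for (size_t i = blk->nr_backed; i < end; i++)
		blk->state[i] = MODEL_ONLINE;
	blk->nr_backed = end;
}

/* Number of pages currently online in the block. */
static size_t model_count_online(const struct model_block *blk)
{
	size_t n = 0;

	for (size_t i = 0; i < blk->nr_pages; i++)
		if (blk->state[i] == MODEL_ONLINE)
			n++;
	return n;
}
```

In this model a block onlined with three of eight pages backed ends up with
three online pages and five Reserved ones, which is the state that later
trips up offlining.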

What does not work is offlining such partly populated blocks: the
check_pages_isolated_cb() callback fails on even a single Reserved page
and we end up with -EBUSY. However, there is no reason to fail offlining in
this case: these pages are already offline, so we may just skip them. Add
the appropriate workaround to test_pages_isolated().
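
The effect of the new parameter can be sketched in user space as follows.
This is a simplified model, not the kernel function: struct model_page and
the model_* helpers are hypothetical, and each free page is treated as a
single page, whereas the real loop advances over a whole buddy block at
once.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page flags for a user-space model, not kernel types. */
struct model_page {
	bool buddy;    /* free and isolated, in the spirit of PageBuddy() */
	bool hwpoison; /* in the spirit of PageHWPoison() */
	bool reserved; /* never onlined, in the spirit of PageReserved() */
};

/*
 * Walk [start, end) and return the index of the first page that blocks
 * offlining, or 'end' if every page is acceptable.
 */
static size_t model_first_busy(const struct model_page *pages,
			       size_t start, size_t end,
			       bool skip_hwpoisoned, bool skip_reserved)
{
	size_t i = start;

	assert(start <= end);
	while (i < end) {
		const struct model_page *p = &pages[i];

		if (p->buddy)
			i++; /* simplification: one page per step */
		else if (skip_hwpoisoned && p->hwpoison)
			i++;
		else if (skip_reserved && p->reserved)
			i++;
		else
			break;
	}
	return i;
}

/* Returns 0 if the whole range passed, -EBUSY otherwise. */
static int model_test_isolated(const struct model_page *pages,
			       size_t start, size_t end,
			       bool skip_hwpoisoned, bool skip_reserved)
{
	size_t busy = model_first_busy(pages, start, end,
				       skip_hwpoisoned, skip_reserved);

	return busy < end ? -EBUSY : 0;
}
```

With skip_reserved set, a range made of free and Reserved pages passes the
check; with it cleared, the first Reserved page stops the walk and the
caller sees -EBUSY, which matches the failure described above.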

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
RFC part:
- There may be other uses of Reserved pages that make offlining blocks
  containing them a no-go.
- I'm not exactly sure that adding another parameter to
  test_pages_isolated() is a good idea; we could go with a single flag for
  both Reserved and HwPoisoned pages: we have just two call sites and they
  have opposite needs (true, true in one case and false, false in the
  other).
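
To make the single-flag alternative concrete, here is a minimal user-space
sketch. It is not part of the patch: the flag name skip_unusable_pages and
all sketch_* identifiers are made up for illustration, and the page model is
reduced to three booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page model for illustration only. */
struct sketch_page {
	bool buddy;
	bool hwpoison;
	bool reserved;
};

/* One flag covers both HwPoisoned and Reserved pages. */
static bool sketch_range_isolated(const struct sketch_page *pages, size_t nr,
				  bool skip_unusable_pages)
{
	assert(pages != NULL || nr == 0);
	for (size_t i = 0; i < nr; i++) {
		const struct sketch_page *p = &pages[i];
		bool unusable = p->hwpoison || p->reserved;

		if (p->buddy)
			continue;
		if (skip_unusable_pages && unusable)
			continue;
		return false; /* an in-use page blocks the range */
	}
	return true;
}

/* Offlining call site: corresponds to (true, true) in the patch. */
static bool sketch_offline_check(const struct sketch_page *pages, size_t nr)
{
	return sketch_range_isolated(pages, nr, true);
}

/* Contiguous-allocation call site: corresponds to (false, false). */
static bool sketch_contig_check(const struct sketch_page *pages, size_t nr)
{
	return sketch_range_isolated(pages, nr, false);
}
```

The trade-off is that a single flag cannot express a caller that wants to
skip one kind of page but not the other; with only the two call sites
mentioned above, that case does not arise.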
---
 include/linux/page-isolation.h |  2 +-
 mm/memory_hotplug.c            |  2 +-
 mm/page_alloc.c                |  8 +++++++-
 mm/page_isolation.c            | 11 ++++++++---
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 05a04e603686..daba12a59574 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -61,7 +61,7 @@ undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  * Test all pages in [start_pfn, end_pfn) are isolated or not.
  */
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
-			bool skip_hwpoisoned_pages);
+			bool skip_hwpoisoned_pages, bool skip_reserved_pages);
 
 struct page *alloc_migrate_target(struct page *page, unsigned long private,
 				int **resultp);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index d4b5f29906b9..5b7d1482804f 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1467,7 +1467,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages,
 {
 	int ret;
 	long offlined = *(long *)data;
-	ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true);
+	ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true, true);
 	offlined = nr_pages;
 	if (!ret)
 		*(long *)data += offlined;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 77e4d3c5c57b..b475928c476c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -7632,7 +7632,7 @@ int alloc_contig_range(unsigned long start, unsigned long end,
 	}
 
 	/* Make sure the range is really isolated. */
-	if (test_pages_isolated(outer_start, end, false)) {
+	if (test_pages_isolated(outer_start, end, false, false)) {
 		pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n",
 			__func__, outer_start, end);
 		ret = -EBUSY;
@@ -7746,6 +7746,12 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn)
 			continue;
 		}
 
+		/* Some pages might never be online, skip them */
+		if (unlikely(PageReserved(page))) {
+			pfn++;
+			continue;
+		}
+
 		BUG_ON(page_count(page));
 		BUG_ON(!PageBuddy(page));
 		order = page_order(page);
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 44f213935bf6..fd9c18e00b92 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -233,7 +233,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  */
 static unsigned long
 __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
-				  bool skip_hwpoisoned_pages)
+				  bool skip_hwpoisoned_pages,
+				  bool skip_reserved_pages)
 {
 	struct page *page;
 
@@ -253,6 +254,9 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 		else if (skip_hwpoisoned_pages && PageHWPoison(page))
 			/* A HWPoisoned page cannot be also PageBuddy */
 			pfn++;
+		else if (skip_reserved_pages && PageReserved(page))
+			/* Skipping Reserved pages */
+			pfn++;
 		else
 			break;
 	}
@@ -262,7 +266,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 
 /* Caller should ensure that requested range is in a single zone */
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
-			bool skip_hwpoisoned_pages)
+			bool skip_hwpoisoned_pages, bool skip_reserved_pages)
 {
 	unsigned long pfn, flags;
 	struct page *page;
@@ -285,7 +289,8 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 	zone = page_zone(page);
 	spin_lock_irqsave(&zone->lock, flags);
 	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn,
-						skip_hwpoisoned_pages);
+						skip_hwpoisoned_pages,
+						skip_reserved_pages);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	trace_test_pages_isolated(start_pfn, end_pfn, pfn);
-- 
2.13.6


Thread overview: 8+ messages
2017-11-08 13:01 Vitaly Kuznetsov [this message]
2017-11-08 14:25 ` [PATCH RFC] mm/memory_hotplug: make it possible to offline blocks with reserved pages Michal Hocko
2017-11-08 15:39   ` Vitaly Kuznetsov
2017-11-08 15:57     ` Michal Hocko
2017-11-08 16:16       ` Vitaly Kuznetsov
2017-11-09 13:16         ` Michal Hocko
2017-11-09 13:30           ` Vitaly Kuznetsov
2017-11-09 13:42             ` Michal Hocko
