From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752632AbdKHPj4 (ORCPT ); Wed, 8 Nov 2017 10:39:56 -0500 Received: from mx1.redhat.com ([209.132.183.28]:51306 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752491AbdKHPjy (ORCPT ); Wed, 8 Nov 2017 10:39:54 -0500 DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com 90C16491 Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx05.extmail.prod.ext.phx2.redhat.com; spf=fail smtp.mailfrom=vkuznets@redhat.com From: Vitaly Kuznetsov To: Michal Hocko Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Andrew Morton , Vlastimil Babka , Mel Gorman , YASUAKI ISHIMATSU , Hillf Danton , Johannes Weiner , "K. Y. Srinivasan" , Stephen Hemminger , Alex Ng Subject: Re: [PATCH RFC] mm/memory_hotplug: make it possible to offline blocks with reserved pages References: <20171108130155.25499-1-vkuznets@redhat.com> <20171108142528.vsrkkqw6fihxdjio@dhcp22.suse.cz> Date: Wed, 08 Nov 2017 16:39:49 +0100 In-Reply-To: <20171108142528.vsrkkqw6fihxdjio@dhcp22.suse.cz> (Michal Hocko's message of "Wed, 8 Nov 2017 15:25:28 +0100") Message-ID: <87y3nglqyi.fsf@vitty.brq.redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Wed, 08 Nov 2017 15:39:54 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Michal Hocko writes: > On Wed 08-11-17 14:01:55, Vitaly Kuznetsov wrote: >> Hyper-V balloon driver needs to hotplug memory in smaller chunks and to >> workaround Linux's 128Mb allignment requirement so it does a trick: partly >> populated 128Mb blocks are added and then a custom online_page_callback >> hook checks if the particular page is 'backed' during onlining, in case it >> is not backed it is left in Reserved state. When the host adds more pages >> to the block we bring them online from the driver (see >> hv_bring_pgs_online()/hv_page_online_one() in drivers/hv/hv_balloon.c). >> Eventually the whole block becomes fully populated and we hotplug the next >> 128Mb. This all works for quite some time already. > > Why does HyperV needs to workaround the section size limit in the first > place? We are allocation memmap for the whole section anyway so it won't > save any memory. So the whole thing sounds rather dubious to me. > Memory hotplug requirements in Windows are different, they have 2Mb granularity, not 128Mb like we have in Linux x86. Imagine there's a request to add 32Mb of memory comming from the Hyper-V host. What can we do? Don't add anything at all and wait till we're suggested to add > 128Mb and then add a section or the current approach. >> What is not working is offlining of such partly populated blocks: >> check_pages_isolated_cb() callback will not pass with a sinle Reserved page >> and we end up with -EBUSY. However, there's no reason to fail offlining in >> this case: these pages are already offline, we may just skip them. Add the >> appropriate workaround to test_pages_isolated(). > > How do you recognize pages reserved by other users. You cannot simply > remove them, it would just blow up. > I exepcted sumothing like that, thus RFC. Is there a way to detect pages which were never onlined? E.g. it is Reserved and count == 0? >> Signed-off-by: Vitaly Kuznetsov >> --- >> RFC part: >> - Other usages of Reserved pages making offlining blocks with them a no-go >> may exist. >> - I'm not exactly sure that adding another parameter to >> test_pages_isolated() is a good idea, we may go with a single flag for >> both Reserved and HwPoisoned pages: we have just two call sites and they >> have opposite needs (true, true in one case and false, false in the >> other). >> --- >> include/linux/page-isolation.h | 2 +- >> mm/memory_hotplug.c | 2 +- >> mm/page_alloc.c | 8 +++++++- >> mm/page_isolation.c | 11 ++++++++--- >> 4 files changed, 17 insertions(+), 6 deletions(-) >> >> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h >> index 05a04e603686..daba12a59574 100644 >> --- a/include/linux/page-isolation.h >> +++ b/include/linux/page-isolation.h >> @@ -61,7 +61,7 @@ undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, >> * Test all pages in [start_pfn, end_pfn) are isolated or not. >> */ >> int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn, >> - bool skip_hwpoisoned_pages); >> + bool skip_hwpoisoned_pages, bool skip_reserved_pages); >> >> struct page *alloc_migrate_target(struct page *page, unsigned long private, >> int **resultp); >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c >> index d4b5f29906b9..5b7d1482804f 100644 >> --- a/mm/memory_hotplug.c >> +++ b/mm/memory_hotplug.c >> @@ -1467,7 +1467,7 @@ check_pages_isolated_cb(unsigned long start_pfn, unsigned long nr_pages, >> { >> int ret; >> long offlined = *(long *)data; >> - ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true); >> + ret = test_pages_isolated(start_pfn, start_pfn + nr_pages, true, true); >> offlined = nr_pages; >> if (!ret) >> *(long *)data += offlined; >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c >> index 77e4d3c5c57b..b475928c476c 100644 >> --- a/mm/page_alloc.c >> +++ b/mm/page_alloc.c >> @@ -7632,7 +7632,7 @@ int alloc_contig_range(unsigned long start, unsigned long end, >> } >> >> /* Make sure the range is really isolated. */ >> - if (test_pages_isolated(outer_start, end, false)) { >> + if (test_pages_isolated(outer_start, end, false, false)) { >> pr_info_ratelimited("%s: [%lx, %lx) PFNs busy\n", >> __func__, outer_start, end); >> ret = -EBUSY; >> @@ -7746,6 +7746,12 @@ __offline_isolated_pages(unsigned long start_pfn, unsigned long end_pfn) >> continue; >> } >> >> + /* Some pages might never be online, skip them */ >> + if (unlikely(PageReserved(page))) { >> + pfn++; >> + continue; >> + } >> + >> BUG_ON(page_count(page)); >> BUG_ON(!PageBuddy(page)); >> order = page_order(page); >> diff --git a/mm/page_isolation.c b/mm/page_isolation.c >> index 44f213935bf6..fd9c18e00b92 100644 >> --- a/mm/page_isolation.c >> +++ b/mm/page_isolation.c >> @@ -233,7 +233,8 @@ int undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn, >> */ >> static unsigned long >> __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn, >> - bool skip_hwpoisoned_pages) >> + bool skip_hwpoisoned_pages, >> + bool skip_reserved_pages) >> { >> struct page *page; >> >> @@ -253,6 +254,9 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn, >> else if (skip_hwpoisoned_pages && PageHWPoison(page)) >> /* A HWPoisoned page cannot be also PageBuddy */ >> pfn++; >> + else if (skip_reserved_pages && PageReserved(page)) >> + /* Skipping Reserved pages */ >> + pfn++; >> else >> break; >> } >> @@ -262,7 +266,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn, >> >> /* Caller should ensure that requested range is in a single zone */ >> int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn, >> - bool skip_hwpoisoned_pages) >> + bool skip_hwpoisoned_pages, bool skip_reserved_pages) >> { >> unsigned long pfn, flags; >> struct page *page; >> @@ -285,7 +289,8 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn, >> zone = page_zone(page); >> spin_lock_irqsave(&zone->lock, flags); >> pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, >> - skip_hwpoisoned_pages); >> + skip_hwpoisoned_pages, >> + skip_reserved_pages); >> spin_unlock_irqrestore(&zone->lock, flags); >> >> trace_test_pages_isolated(start_pfn, end_pfn, pfn); >> -- >> 2.13.6 >> -- Vitaly