From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.2 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS, USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5E93FC432C3 for ; Tue, 3 Dec 2019 15:10:42 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 0448C2073C for ; Tue, 3 Dec 2019 15:10:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 0448C2073C Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=suse.de Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 914A36B058D; Tue, 3 Dec 2019 10:10:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C4636B058E; Tue, 3 Dec 2019 10:10:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 765406B058F; Tue, 3 Dec 2019 10:10:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0089.hostedemail.com [216.40.44.89]) by kanga.kvack.org (Postfix) with ESMTP id 5794A6B058D for ; Tue, 3 Dec 2019 10:10:41 -0500 (EST) Received: from smtpin29.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with SMTP id D4DBA582D for ; Tue, 3 Dec 2019 15:10:40 +0000 (UTC) X-FDA: 76224167040.29.hat67_7a32de903a738 X-HE-Tag: hat67_7a32de903a738 X-Filterd-Recvd-Size: 14449 Received: from mx1.suse.de (mx2.suse.de [195.135.220.15]) by imf50.hostedemail.com (Postfix) with ESMTP for ; Tue, 3 Dec 2019 15:10:39 +0000 (UTC) X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 78381C1F9; Tue, 3 Dec 2019 15:10:37 +0000 (UTC) Date: Tue, 3 Dec 2019 16:10:30 +0100 From: Oscar Salvador To: David Hildenbrand Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org, linux-sh@vger.kernel.org, x86@kernel.org, Catalin Marinas , Will Deacon , Tony Luck , Fenghua Yu , Benjamin Herrenschmidt , Paul Mackerras , Michael Ellerman , Heiko Carstens , Vasily Gorbik , Christian Borntraeger , Yoshinori Sato , Rich Felker , Dave Hansen , Andy Lutomirski , Peter Zijlstra , Thomas Gleixner , Ingo Molnar , Borislav Petkov , "H. Peter Anvin" , Andrew Morton , Mark Rutland , Steve Capper , Mike Rapoport , Anshuman Khandual , Yu Zhao , Jun Yao , Robin Murphy , Michal Hocko , "Matthew Wilcox (Oracle)" , Christophe Leroy , "Aneesh Kumar K.V" , Pavel Tatashin , Gerald Schaefer , Halil Pasic , Tom Lendacky , Greg Kroah-Hartman , Masahiro Yamada , Dan Williams , Wei Yang , Qian Cai , Jason Gunthorpe , Logan Gunthorpe , Ira Weiny Subject: Re: [PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory Message-ID: <20191203151030.GB2600@linux> References: <20191006085646.5768-1-david@redhat.com> <20191006085646.5768-6-david@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191006085646.5768-6-david@redhat.com> User-Agent: Mutt/1.10.1 (2018-07-13) X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Sun, Oct 06, 2019 at 10:56:41AM +0200, David Hildenbrand wrote: > Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug") > Signed-off-by: David Hildenbrand I did not see anything wrong with the taken approach, and makes sense to me. The only thing that puzzles me is we seem to not balance spanned_pages for ZONE_DEVICE anymore. memremap_pages() increments them via move_pfn_range_to_zone, but we skip ZONE_DEVICE in remove_pfn_range_from_zone. That is not really related to this patch, so I might be missing something, but it caught my eye while reviewing this. Anyway, for this one: Reviewed-by: Oscar Salvador off-topic: I __think__ we really need to trim the CC list. > --- > arch/arm64/mm/mmu.c | 4 +--- > arch/ia64/mm/init.c | 4 +--- > arch/powerpc/mm/mem.c | 3 +-- > arch/s390/mm/init.c | 4 +--- > arch/sh/mm/init.c | 4 +--- > arch/x86/mm/init_32.c | 4 +--- > arch/x86/mm/init_64.c | 4 +--- > include/linux/memory_hotplug.h | 7 +++++-- > mm/memory_hotplug.c | 31 ++++++++++++++++--------------- > mm/memremap.c | 2 +- > 10 files changed, 29 insertions(+), 38 deletions(-) > > diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c > index 60c929f3683b..d10247fab0fd 100644 > --- a/arch/arm64/mm/mmu.c > +++ b/arch/arm64/mm/mmu.c > @@ -1069,7 +1069,6 @@ void arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct zone *zone; > > /* > * FIXME: Cleanup page tables (also in arch_add_memory() in case > @@ -1078,7 +1077,6 @@ void arch_remove_memory(int nid, u64 start, u64 size, > * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be > * unlocked yet. > */ > - zone = page_zone(pfn_to_page(start_pfn)); > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > } > #endif > diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c > index bf9df2625bc8..a6dd80a2c939 100644 > --- a/arch/ia64/mm/init.c > +++ b/arch/ia64/mm/init.c > @@ -689,9 +689,7 @@ void arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct zone *zone; > > - zone = page_zone(pfn_to_page(start_pfn)); > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > } > #endif > diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c > index be941d382c8d..97e5922cb52e 100644 > --- a/arch/powerpc/mm/mem.c > +++ b/arch/powerpc/mm/mem.c > @@ -130,10 +130,9 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap); > int ret; > > - __remove_pages(page_zone(page), start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > > /* Remove htab bolted mappings for this section of memory */ > start = (unsigned long)__va(start); > diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c > index a124f19f7b3c..c1d96e588152 100644 > --- a/arch/s390/mm/init.c > +++ b/arch/s390/mm/init.c > @@ -291,10 +291,8 @@ void arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct zone *zone; > > - zone = page_zone(pfn_to_page(start_pfn)); > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > vmem_remove_mapping(start, size); > } > #endif /* CONFIG_MEMORY_HOTPLUG */ > diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c > index dfdbaa50946e..d1b1ff2be17a 100644 > --- a/arch/sh/mm/init.c > +++ b/arch/sh/mm/init.c > @@ -434,9 +434,7 @@ void arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = PFN_DOWN(start); > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct zone *zone; > > - zone = page_zone(pfn_to_page(start_pfn)); > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > } > #endif /* CONFIG_MEMORY_HOTPLUG */ > diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c > index 930edeb41ec3..0a74407ef92e 100644 > --- a/arch/x86/mm/init_32.c > +++ b/arch/x86/mm/init_32.c > @@ -865,10 +865,8 @@ void arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct zone *zone; > > - zone = page_zone(pfn_to_page(start_pfn)); > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > } > #endif > > diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c > index a6b5c653727b..b8541d77452c 100644 > --- a/arch/x86/mm/init_64.c > +++ b/arch/x86/mm/init_64.c > @@ -1212,10 +1212,8 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size, > { > unsigned long start_pfn = start >> PAGE_SHIFT; > unsigned long nr_pages = size >> PAGE_SHIFT; > - struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap); > - struct zone *zone = page_zone(page); > > - __remove_pages(zone, start_pfn, nr_pages, altmap); > + __remove_pages(start_pfn, nr_pages, altmap); > kernel_physical_mapping_remove(start, start + size); > } > #endif /* CONFIG_MEMORY_HOTPLUG */ > diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h > index bc477e98a310..517b70943732 100644 > --- a/include/linux/memory_hotplug.h > +++ b/include/linux/memory_hotplug.h > @@ -126,8 +126,8 @@ static inline bool movable_node_is_enabled(void) > > extern void arch_remove_memory(int nid, u64 start, u64 size, > struct vmem_altmap *altmap); > -extern void __remove_pages(struct zone *zone, unsigned long start_pfn, > - unsigned long nr_pages, struct vmem_altmap *altmap); > +extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages, > + struct vmem_altmap *altmap); > > /* reasonably generic interface to expand the physical pages */ > extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages, > @@ -346,6 +346,9 @@ extern int add_memory(int nid, u64 start, u64 size); > extern int add_memory_resource(int nid, struct resource *resource); > extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn, > unsigned long nr_pages, struct vmem_altmap *altmap); > +extern void remove_pfn_range_from_zone(struct zone *zone, > + unsigned long start_pfn, > + unsigned long nr_pages); > extern bool is_memblock_offlined(struct memory_block *mem); > extern int sparse_add_section(int nid, unsigned long pfn, > unsigned long nr_pages, struct vmem_altmap *altmap); > diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c > index f96608d24f6a..5b003ffa5dc9 100644 > --- a/mm/memory_hotplug.c > +++ b/mm/memory_hotplug.c > @@ -457,8 +457,9 @@ static void update_pgdat_span(struct pglist_data *pgdat) > pgdat->node_spanned_pages = node_end_pfn - node_start_pfn; > } > > -static void __remove_zone(struct zone *zone, unsigned long start_pfn, > - unsigned long nr_pages) > +void __ref remove_pfn_range_from_zone(struct zone *zone, > + unsigned long start_pfn, > + unsigned long nr_pages) > { > struct pglist_data *pgdat = zone->zone_pgdat; > unsigned long flags; > @@ -473,28 +474,30 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn, > return; > #endif > > + clear_zone_contiguous(zone); > + > pgdat_resize_lock(zone->zone_pgdat, &flags); > shrink_zone_span(zone, start_pfn, start_pfn + nr_pages); > update_pgdat_span(pgdat); > pgdat_resize_unlock(zone->zone_pgdat, &flags); > + > + set_zone_contiguous(zone); > } > > -static void __remove_section(struct zone *zone, unsigned long pfn, > - unsigned long nr_pages, unsigned long map_offset, > - struct vmem_altmap *altmap) > +static void __remove_section(unsigned long pfn, unsigned long nr_pages, > + unsigned long map_offset, > + struct vmem_altmap *altmap) > { > struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn)); > > if (WARN_ON_ONCE(!valid_section(ms))) > return; > > - __remove_zone(zone, pfn, nr_pages); > sparse_remove_section(ms, pfn, nr_pages, map_offset, altmap); > } > > /** > - * __remove_pages() - remove sections of pages from a zone > - * @zone: zone from which pages need to be removed > + * __remove_pages() - remove sections of pages > * @pfn: starting pageframe (must be aligned to start of a section) > * @nr_pages: number of pages to remove (must be multiple of section size) > * @altmap: alternative device page map or %NULL if default memmap is used > @@ -504,16 +507,14 @@ static void __remove_section(struct zone *zone, unsigned long pfn, > * sure that pages are marked reserved and zones are adjust properly by > * calling offline_pages(). > */ > -void __remove_pages(struct zone *zone, unsigned long pfn, > - unsigned long nr_pages, struct vmem_altmap *altmap) > +void __remove_pages(unsigned long pfn, unsigned long nr_pages, > + struct vmem_altmap *altmap) > { > unsigned long map_offset = 0; > unsigned long nr, start_sec, end_sec; > > map_offset = vmem_altmap_offset(altmap); > > - clear_zone_contiguous(zone); > - > if (check_pfn_span(pfn, nr_pages, "remove")) > return; > > @@ -525,13 +526,11 @@ void __remove_pages(struct zone *zone, unsigned long pfn, > cond_resched(); > pfns = min(nr_pages, PAGES_PER_SECTION > - (pfn & ~PAGE_SECTION_MASK)); > - __remove_section(zone, pfn, pfns, map_offset, altmap); > + __remove_section(pfn, pfns, map_offset, altmap); > pfn += pfns; > nr_pages -= pfns; > map_offset = 0; > } > - > - set_zone_contiguous(zone); > } > > int set_online_page_callback(online_page_callback_t callback) > @@ -859,6 +858,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ > (unsigned long long) pfn << PAGE_SHIFT, > (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1); > memory_notify(MEM_CANCEL_ONLINE, &arg); > + remove_pfn_range_from_zone(zone, pfn, nr_pages); > mem_hotplug_done(); > return ret; > } > @@ -1605,6 +1605,7 @@ static int __ref __offline_pages(unsigned long start_pfn, > writeback_set_ratelimit(); > > memory_notify(MEM_OFFLINE, &arg); > + remove_pfn_range_from_zone(zone, start_pfn, nr_pages); > mem_hotplug_done(); > return 0; > > diff --git a/mm/memremap.c b/mm/memremap.c > index 8c2fb44c3b4d..70263e6f093e 100644 > --- a/mm/memremap.c > +++ b/mm/memremap.c > @@ -140,7 +140,7 @@ void memunmap_pages(struct dev_pagemap *pgmap) > > mem_hotplug_begin(); > if (pgmap->type == MEMORY_DEVICE_PRIVATE) { > - __remove_pages(page_zone(first_page), PHYS_PFN(res->start), > + __remove_pages(PHYS_PFN(res->start), > PHYS_PFN(resource_size(res)), NULL); > } else { > arch_remove_memory(nid, res->start, resource_size(res), > -- > 2.21.0 > -- Oscar Salvador SUSE L3