Linux-mm Archive on lore.kernel.org
 help / color / Atom feed
From: Oscar Salvador <osalvador@suse.de>
To: David Hildenbrand <david@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, linux-ia64@vger.kernel.org,
	linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
	linux-sh@vger.kernel.org, x86@kernel.org,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>, Tony Luck <tony.luck@intel.com>,
	Fenghua Yu <fenghua.yu@intel.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	Paul Mackerras <paulus@samba.org>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Vasily Gorbik <gor@linux.ibm.com>,
	Christian Borntraeger <borntraeger@de.ibm.com>,
	Yoshinori Sato <ysato@users.sourceforge.jp>,
	Rich Felker <dalias@libc.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Steve Capper <steve.capper@arm.com>,
	Mike Rapoport <rppt@linux.ibm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Yu Zhao <yuzhao@google.com>, Jun Yao <yaojun8558363@gmail.com>,
	Robin Murphy <robin.murphy@arm.com>,
	Michal Hocko <mhocko@suse.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	Christophe Leroy <christophe.leroy@c-s.fr>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	Gerald Schaefer <gerald.schaefer@de.ibm.com>,
	Halil Pasic <pasic@linux.ibm.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Masahiro Yamada <yamada.masahiro@socionext.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Wei Yang <richard.weiyang@gmail.com>, Qian Cai <cai@lca.pw>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Logan Gunthorpe <logang@deltatee.com>,
	Ira Weiny <ira.weiny@intel.com>
Subject: Re: [PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory
Date: Tue, 3 Dec 2019 16:10:30 +0100
Message-ID: <20191203151030.GB2600@linux> (raw)
In-Reply-To: <20191006085646.5768-6-david@redhat.com>

On Sun, Oct 06, 2019 at 10:56:41AM +0200, David Hildenbrand wrote:
> Fixes: d0dc12e86b31 ("mm/memory_hotplug: optimize memory hotplug")
> Signed-off-by: David Hildenbrand <david@redhat.com>

I did not see anything wrong with the taken approach, and makes sense to me.
The only thing that puzzles me is we seem to not balance spanned_pages
for ZONE_DEVICE anymore.
memremap_pages() increments them via move_pfn_range_to_zone, but we skip
ZONE_DEVICE in remove_pfn_range_from_zone.

That is not really related to this patch, so I might be missing something,
but it caught my eye while reviewing this.

Anyway, for this one:

Reviewed-by: Oscar Salvador <osalvador@suse.de>


off-topic: I __think__ we really need to trim the CC list.

> ---
>  arch/arm64/mm/mmu.c            |  4 +---
>  arch/ia64/mm/init.c            |  4 +---
>  arch/powerpc/mm/mem.c          |  3 +--
>  arch/s390/mm/init.c            |  4 +---
>  arch/sh/mm/init.c              |  4 +---
>  arch/x86/mm/init_32.c          |  4 +---
>  arch/x86/mm/init_64.c          |  4 +---
>  include/linux/memory_hotplug.h |  7 +++++--
>  mm/memory_hotplug.c            | 31 ++++++++++++++++---------------
>  mm/memremap.c                  |  2 +-
>  10 files changed, 29 insertions(+), 38 deletions(-)
> 
> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
> index 60c929f3683b..d10247fab0fd 100644
> --- a/arch/arm64/mm/mmu.c
> +++ b/arch/arm64/mm/mmu.c
> @@ -1069,7 +1069,6 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct zone *zone;
>  
>  	/*
>  	 * FIXME: Cleanup page tables (also in arch_add_memory() in case
> @@ -1078,7 +1077,6 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  	 * unplug. ARCH_ENABLE_MEMORY_HOTREMOVE must not be
>  	 * unlocked yet.
>  	 */
> -	zone = page_zone(pfn_to_page(start_pfn));
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  }
>  #endif
> diff --git a/arch/ia64/mm/init.c b/arch/ia64/mm/init.c
> index bf9df2625bc8..a6dd80a2c939 100644
> --- a/arch/ia64/mm/init.c
> +++ b/arch/ia64/mm/init.c
> @@ -689,9 +689,7 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct zone *zone;
>  
> -	zone = page_zone(pfn_to_page(start_pfn));
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  }
>  #endif
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index be941d382c8d..97e5922cb52e 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -130,10 +130,9 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
>  	int ret;
>  
> -	__remove_pages(page_zone(page), start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  
>  	/* Remove htab bolted mappings for this section of memory */
>  	start = (unsigned long)__va(start);
> diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
> index a124f19f7b3c..c1d96e588152 100644
> --- a/arch/s390/mm/init.c
> +++ b/arch/s390/mm/init.c
> @@ -291,10 +291,8 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct zone *zone;
>  
> -	zone = page_zone(pfn_to_page(start_pfn));
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  	vmem_remove_mapping(start, size);
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
> diff --git a/arch/sh/mm/init.c b/arch/sh/mm/init.c
> index dfdbaa50946e..d1b1ff2be17a 100644
> --- a/arch/sh/mm/init.c
> +++ b/arch/sh/mm/init.c
> @@ -434,9 +434,7 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = PFN_DOWN(start);
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct zone *zone;
>  
> -	zone = page_zone(pfn_to_page(start_pfn));
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
> diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c
> index 930edeb41ec3..0a74407ef92e 100644
> --- a/arch/x86/mm/init_32.c
> +++ b/arch/x86/mm/init_32.c
> @@ -865,10 +865,8 @@ void arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct zone *zone;
>  
> -	zone = page_zone(pfn_to_page(start_pfn));
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  }
>  #endif
>  
> diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
> index a6b5c653727b..b8541d77452c 100644
> --- a/arch/x86/mm/init_64.c
> +++ b/arch/x86/mm/init_64.c
> @@ -1212,10 +1212,8 @@ void __ref arch_remove_memory(int nid, u64 start, u64 size,
>  {
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> -	struct page *page = pfn_to_page(start_pfn) + vmem_altmap_offset(altmap);
> -	struct zone *zone = page_zone(page);
>  
> -	__remove_pages(zone, start_pfn, nr_pages, altmap);
> +	__remove_pages(start_pfn, nr_pages, altmap);
>  	kernel_physical_mapping_remove(start, start + size);
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index bc477e98a310..517b70943732 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -126,8 +126,8 @@ static inline bool movable_node_is_enabled(void)
>  
>  extern void arch_remove_memory(int nid, u64 start, u64 size,
>  			       struct vmem_altmap *altmap);
> -extern void __remove_pages(struct zone *zone, unsigned long start_pfn,
> -			   unsigned long nr_pages, struct vmem_altmap *altmap);
> +extern void __remove_pages(unsigned long start_pfn, unsigned long nr_pages,
> +			   struct vmem_altmap *altmap);
>  
>  /* reasonably generic interface to expand the physical pages */
>  extern int __add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
> @@ -346,6 +346,9 @@ extern int add_memory(int nid, u64 start, u64 size);
>  extern int add_memory_resource(int nid, struct resource *resource);
>  extern void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>  		unsigned long nr_pages, struct vmem_altmap *altmap);
> +extern void remove_pfn_range_from_zone(struct zone *zone,
> +				       unsigned long start_pfn,
> +				       unsigned long nr_pages);
>  extern bool is_memblock_offlined(struct memory_block *mem);
>  extern int sparse_add_section(int nid, unsigned long pfn,
>  		unsigned long nr_pages, struct vmem_altmap *altmap);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index f96608d24f6a..5b003ffa5dc9 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -457,8 +457,9 @@ static void update_pgdat_span(struct pglist_data *pgdat)
>  	pgdat->node_spanned_pages = node_end_pfn - node_start_pfn;
>  }
>  
> -static void __remove_zone(struct zone *zone, unsigned long start_pfn,
> -		unsigned long nr_pages)
> +void __ref remove_pfn_range_from_zone(struct zone *zone,
> +				      unsigned long start_pfn,
> +				      unsigned long nr_pages)
>  {
>  	struct pglist_data *pgdat = zone->zone_pgdat;
>  	unsigned long flags;
> @@ -473,28 +474,30 @@ static void __remove_zone(struct zone *zone, unsigned long start_pfn,
>  		return;
>  #endif
>  
> +	clear_zone_contiguous(zone);
> +
>  	pgdat_resize_lock(zone->zone_pgdat, &flags);
>  	shrink_zone_span(zone, start_pfn, start_pfn + nr_pages);
>  	update_pgdat_span(pgdat);
>  	pgdat_resize_unlock(zone->zone_pgdat, &flags);
> +
> +	set_zone_contiguous(zone);
>  }
>  
> -static void __remove_section(struct zone *zone, unsigned long pfn,
> -		unsigned long nr_pages, unsigned long map_offset,
> -		struct vmem_altmap *altmap)
> +static void __remove_section(unsigned long pfn, unsigned long nr_pages,
> +			     unsigned long map_offset,
> +			     struct vmem_altmap *altmap)
>  {
>  	struct mem_section *ms = __nr_to_section(pfn_to_section_nr(pfn));
>  
>  	if (WARN_ON_ONCE(!valid_section(ms)))
>  		return;
>  
> -	__remove_zone(zone, pfn, nr_pages);
>  	sparse_remove_section(ms, pfn, nr_pages, map_offset, altmap);
>  }
>  
>  /**
> - * __remove_pages() - remove sections of pages from a zone
> - * @zone: zone from which pages need to be removed
> + * __remove_pages() - remove sections of pages
>   * @pfn: starting pageframe (must be aligned to start of a section)
>   * @nr_pages: number of pages to remove (must be multiple of section size)
>   * @altmap: alternative device page map or %NULL if default memmap is used
> @@ -504,16 +507,14 @@ static void __remove_section(struct zone *zone, unsigned long pfn,
>   * sure that pages are marked reserved and zones are adjust properly by
>   * calling offline_pages().
>   */
> -void __remove_pages(struct zone *zone, unsigned long pfn,
> -		    unsigned long nr_pages, struct vmem_altmap *altmap)
> +void __remove_pages(unsigned long pfn, unsigned long nr_pages,
> +		    struct vmem_altmap *altmap)
>  {
>  	unsigned long map_offset = 0;
>  	unsigned long nr, start_sec, end_sec;
>  
>  	map_offset = vmem_altmap_offset(altmap);
>  
> -	clear_zone_contiguous(zone);
> -
>  	if (check_pfn_span(pfn, nr_pages, "remove"))
>  		return;
>  
> @@ -525,13 +526,11 @@ void __remove_pages(struct zone *zone, unsigned long pfn,
>  		cond_resched();
>  		pfns = min(nr_pages, PAGES_PER_SECTION
>  				- (pfn & ~PAGE_SECTION_MASK));
> -		__remove_section(zone, pfn, pfns, map_offset, altmap);
> +		__remove_section(pfn, pfns, map_offset, altmap);
>  		pfn += pfns;
>  		nr_pages -= pfns;
>  		map_offset = 0;
>  	}
> -
> -	set_zone_contiguous(zone);
>  }
>  
>  int set_online_page_callback(online_page_callback_t callback)
> @@ -859,6 +858,7 @@ int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_typ
>  		 (unsigned long long) pfn << PAGE_SHIFT,
>  		 (((unsigned long long) pfn + nr_pages) << PAGE_SHIFT) - 1);
>  	memory_notify(MEM_CANCEL_ONLINE, &arg);
> +	remove_pfn_range_from_zone(zone, pfn, nr_pages);
>  	mem_hotplug_done();
>  	return ret;
>  }
> @@ -1605,6 +1605,7 @@ static int __ref __offline_pages(unsigned long start_pfn,
>  	writeback_set_ratelimit();
>  
>  	memory_notify(MEM_OFFLINE, &arg);
> +	remove_pfn_range_from_zone(zone, start_pfn, nr_pages);
>  	mem_hotplug_done();
>  	return 0;
>  
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 8c2fb44c3b4d..70263e6f093e 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -140,7 +140,7 @@ void memunmap_pages(struct dev_pagemap *pgmap)
>  
>  	mem_hotplug_begin();
>  	if (pgmap->type == MEMORY_DEVICE_PRIVATE) {
> -		__remove_pages(page_zone(first_page), PHYS_PFN(res->start),
> +		__remove_pages(PHYS_PFN(res->start),
>  			       PHYS_PFN(resource_size(res)), NULL);
>  	} else {
>  		arch_remove_memory(nid, res->start, resource_size(res),
> -- 
> 2.21.0
> 

-- 
Oscar Salvador
SUSE L3


  parent reply index

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-06  8:56 [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 01/10] mm/memunmap: Don't access uninitialized memmap in memunmap_pages() David Hildenbrand
2019-10-06 19:58   ` Damian Tometzki
2019-10-06 20:13     ` David Hildenbrand
2019-10-14  9:05   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 02/10] mm/memmap_init: Update variable name in memmap_init_zone David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 03/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_pgdat_span() David Hildenbrand
2019-10-14  9:31   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 04/10] mm/memory_hotplug: Don't access uninitialized memmaps in shrink_zone_span() David Hildenbrand
2019-10-14  9:32   ` David Hildenbrand
2019-10-14 19:17     ` Andrew Morton
2019-11-19 14:16       ` David Hildenbrand
2019-11-19 20:44         ` Andrew Morton
2019-10-06  8:56 ` [PATCH v6 05/10] mm/memory_hotplug: Shrink zones when offlining memory David Hildenbrand
2019-10-14  9:39   ` David Hildenbrand
2019-10-14 19:16     ` Andrew Morton
2019-10-27 22:45   ` David Hildenbrand
2019-11-30 23:21     ` Andrew Morton
2019-11-30 23:43       ` David Hildenbrand
2019-12-18 17:08       ` David Hildenbrand
2019-12-18 20:16         ` Andrew Morton
2019-12-03 15:10   ` Oscar Salvador [this message]
2019-12-03 15:27     ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 06/10] mm/memory_hotplug: Poison memmap in remove_pfn_range_from_zone() David Hildenbrand
2019-10-16 14:01   ` David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 07/10] mm/memory_hotplug: We always have a zone in find_(smallest|biggest)_section_pfn David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 08/10] mm/memory_hotplug: Don't check for "all holes" in shrink_zone_span() David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 09/10] mm/memory_hotplug: Drop local variables " David Hildenbrand
2019-10-06  8:56 ` [PATCH v6 10/10] mm/memory_hotplug: Cleanup __remove_pages() David Hildenbrand
2019-12-02  9:09 ` [PATCH v6 00/10] mm/memory_hotplug: Shrink zones before removing memory David Hildenbrand
2019-12-03 13:36   ` Oscar Salvador

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20191203151030.GB2600@linux \
    --to=osalvador@suse.de \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=anshuman.khandual@arm.com \
    --cc=benh@kernel.crashing.org \
    --cc=borntraeger@de.ibm.com \
    --cc=bp@alien8.de \
    --cc=cai@lca.pw \
    --cc=catalin.marinas@arm.com \
    --cc=christophe.leroy@c-s.fr \
    --cc=dalias@libc.org \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=fenghua.yu@intel.com \
    --cc=gerald.schaefer@de.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=hpa@zytor.com \
    --cc=ira.weiny@intel.com \
    --cc=jgg@ziepe.ca \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-ia64@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=linux-sh@vger.kernel.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=logang@deltatee.com \
    --cc=luto@kernel.org \
    --cc=mark.rutland@arm.com \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=pasha.tatashin@soleen.com \
    --cc=pasic@linux.ibm.com \
    --cc=paulus@samba.org \
    --cc=peterz@infradead.org \
    --cc=richard.weiyang@gmail.com \
    --cc=robin.murphy@arm.com \
    --cc=rppt@linux.ibm.com \
    --cc=steve.capper@arm.com \
    --cc=tglx@linutronix.de \
    --cc=thomas.lendacky@amd.com \
    --cc=tony.luck@intel.com \
    --cc=will@kernel.org \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=yamada.masahiro@socionext.com \
    --cc=yaojun8558363@gmail.com \
    --cc=ysato@users.sourceforge.jp \
    --cc=yuzhao@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-mm Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-mm/0 linux-mm/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-mm linux-mm/ https://lore.kernel.org/linux-mm \
		linux-mm@kvack.org
	public-inbox-index linux-mm

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kvack.linux-mm


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git