linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1 0/2] mm/memory_hotplug: remove is_mem_section_removable()
@ 2020-04-07 13:54 David Hildenbrand
  2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
  2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
  0 siblings, 2 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-07 13:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, David Hildenbrand, Andrew Morton,
	Baoquan He, Benjamin Herrenschmidt, Michael Ellerman,
	Michal Hocko, Nathan Fontenot, Oscar Salvador, Paul Mackerras,
	Wei Yang

This is the follow-up of "[PATCH v1] drivers/base/memory.c: indicate all
memory blocks as removable" [1], which gets rid of
is_mem_section_removable().

More details can be found in [1] and [2]

[1] https://lkml.kernel.org/r/20200128093542.6908-1-david@redhat.com
[2] https://lkml.kernel.org/r/20200117105759.27905-1-david@redhat.com

David Hildenbrand (2):
  powerpc/pseries/hotplug-memory: stop checking
    is_mem_section_removable()
  mm/memory_hotplug: remove is_mem_section_removable()

 .../platforms/pseries/hotplug-memory.c        | 26 +------
 include/linux/memory_hotplug.h                |  7 --
 mm/memory_hotplug.c                           | 75 -------------------
 3 files changed, 3 insertions(+), 105 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-07 13:54 [PATCH v1 0/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
@ 2020-04-07 13:54 ` David Hildenbrand
  2020-04-07 13:58   ` Michal Hocko
                     ` (2 more replies)
  2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
  1 sibling, 3 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-07 13:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, David Hildenbrand, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Baoquan He,
	Wei Yang

In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
blocks as removable"), the user space interface to compute whether a memory
block can be offlined (exposed via
/sys/devices/system/memory/memoryX/removable) has effectively been
deprecated. We want to remove the leftovers of the kernel implementation.

When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
we'll start by:
1. Testing if it contains any holes, and reject if so
2. Testing if pages belong to different zones, and reject if so
3. Isolating the page range, checking if it contains any unmovable pages

Using is_mem_section_removable() before trying to offline is not only racy,
it can easily result in false positives/negatives. Let's stop manually
checking is_mem_section_removable(), and let device_offline() handle it
completely instead. We can remove the racy is_mem_section_removable()
implementation next.

We now take more locks (e.g., memory hotplug lock when offlining and the
zone lock when isolating), but maybe we should optimize that
implementation instead if this ever becomes a real problem (after all,
memory unplug is already an expensive operation). We started using
is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
Implement memory hotplug remove in the kernel"), with the initial
hotremove support of lmbs.

Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
 1 file changed, 3 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index b2cde1732301..5ace2f9a277e 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
 
 static bool lmb_is_removable(struct drmem_lmb *lmb)
 {
-	int i, scns_per_block;
-	bool rc = true;
-	unsigned long pfn, block_sz;
-	u64 phys_addr;
-
 	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
 		return false;
 
-	block_sz = memory_block_size_bytes();
-	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
-	phys_addr = lmb->base_addr;
-
 #ifdef CONFIG_FA_DUMP
 	/*
 	 * Don't hot-remove memory that falls in fadump boot memory area
 	 * and memory that is reserved for capturing old kernel memory.
 	 */
-	if (is_fadump_memory_area(phys_addr, block_sz))
+	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
 		return false;
 #endif
-
-	for (i = 0; i < scns_per_block; i++) {
-		pfn = PFN_DOWN(phys_addr);
-		if (!pfn_in_present_section(pfn)) {
-			phys_addr += MIN_MEMORY_BLOCK_SIZE;
-			continue;
-		}
-
-		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
-		phys_addr += MIN_MEMORY_BLOCK_SIZE;
-	}
-
-	return rc;
+	/* device_offline() will determine if we can actually remove this lmb */
+	return true;
 }
 
 static int dlpar_add_lmb(struct drmem_lmb *);
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()
  2020-04-07 13:54 [PATCH v1 0/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
  2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
@ 2020-04-07 13:54 ` David Hildenbrand
  2020-04-07 14:00   ` Michal Hocko
                     ` (2 more replies)
  1 sibling, 3 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-07 13:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, linuxppc-dev, David Hildenbrand, Michael Ellerman,
	Benjamin Herrenschmidt, Michal Hocko, Andrew Morton,
	Oscar Salvador, Baoquan He, Wei Yang

Fortunately, all users of is_mem_section_removable() are gone. Get rid of
it, including some now unnecessary functions.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Baoquan He <bhe@redhat.com>
Cc: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/memory_hotplug.h |  7 ----
 mm/memory_hotplug.c            | 75 ----------------------------------
 2 files changed, 82 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 93d9ada74ddd..7dca9cd6076b 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
 
-extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
 extern void try_offline_node(int nid);
 extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
 extern int remove_memory(int nid, u64 start, u64 size);
 extern void __remove_memory(int nid, u64 start, u64 size);
 
 #else
-static inline bool is_mem_section_removable(unsigned long pfn,
-					unsigned long nr_pages)
-{
-	return false;
-}
-
 static inline void try_offline_node(int nid) {}
 
 static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 47cf6036eb31..4d338d546d52 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
 EXPORT_SYMBOL_GPL(add_memory);
 
 #ifdef CONFIG_MEMORY_HOTREMOVE
-/*
- * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
- * set and the size of the free page is given by page_order(). Using this,
- * the function determines if the pageblock contains only free pages.
- * Due to buddy contraints, a free page at least the size of a pageblock will
- * be located at the start of the pageblock
- */
-static inline int pageblock_free(struct page *page)
-{
-	return PageBuddy(page) && page_order(page) >= pageblock_order;
-}
-
-/* Return the pfn of the start of the next active pageblock after a given pfn */
-static unsigned long next_active_pageblock(unsigned long pfn)
-{
-	struct page *page = pfn_to_page(pfn);
-
-	/* Ensure the starting page is pageblock-aligned */
-	BUG_ON(pfn & (pageblock_nr_pages - 1));
-
-	/* If the entire pageblock is free, move to the end of free page */
-	if (pageblock_free(page)) {
-		int order;
-		/* be careful. we don't have locks, page_order can be changed.*/
-		order = page_order(page);
-		if ((order < MAX_ORDER) && (order >= pageblock_order))
-			return pfn + (1 << order);
-	}
-
-	return pfn + pageblock_nr_pages;
-}
-
-static bool is_pageblock_removable_nolock(unsigned long pfn)
-{
-	struct page *page = pfn_to_page(pfn);
-	struct zone *zone;
-
-	/*
-	 * We have to be careful here because we are iterating over memory
-	 * sections which are not zone aware so we might end up outside of
-	 * the zone but still within the section.
-	 * We have to take care about the node as well. If the node is offline
-	 * its NODE_DATA will be NULL - see page_zone.
-	 */
-	if (!node_online(page_to_nid(page)))
-		return false;
-
-	zone = page_zone(page);
-	pfn = page_to_pfn(page);
-	if (!zone_spans_pfn(zone, pfn))
-		return false;
-
-	return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
-				    MEMORY_OFFLINE);
-}
-
-/* Checks if this range of memory is likely to be hot-removable. */
-bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
-{
-	unsigned long end_pfn, pfn;
-
-	end_pfn = min(start_pfn + nr_pages,
-			zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
-
-	/* Check the starting page of each pageblock within the range */
-	for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
-		if (!is_pageblock_removable_nolock(pfn))
-			return false;
-		cond_resched();
-	}
-
-	/* All pageblocks in the memory block are likely to be hot-removable */
-	return true;
-}
-
 /*
  * Confirm all pages in a range [start, end) belong to the same zone (skipping
  * memory holes). When true, return the zone.
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
@ 2020-04-07 13:58   ` Michal Hocko
  2020-04-08  2:46   ` Baoquan He
  2020-04-09  7:26   ` Michael Ellerman
  2 siblings, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2020-04-07 13:58 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Andrew Morton, Oscar Salvador, Baoquan He, Wei Yang

On Tue 07-04-20 15:54:15, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
> 
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
> 
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
> 
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

I am not familiar with this code but it makes sense to make it sync with
the global behavior.

> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>  1 file changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> -	int i, scns_per_block;
> -	bool rc = true;
> -	unsigned long pfn, block_sz;
> -	u64 phys_addr;
> -
>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>  		return false;
>  
> -	block_sz = memory_block_size_bytes();
> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> -	phys_addr = lmb->base_addr;
> -
>  #ifdef CONFIG_FA_DUMP
>  	/*
>  	 * Don't hot-remove memory that falls in fadump boot memory area
>  	 * and memory that is reserved for capturing old kernel memory.
>  	 */
> -	if (is_fadump_memory_area(phys_addr, block_sz))
> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>  		return false;
>  #endif
> -
> -	for (i = 0; i < scns_per_block; i++) {
> -		pfn = PFN_DOWN(phys_addr);
> -		if (!pfn_in_present_section(pfn)) {
> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
> -			continue;
> -		}
> -
> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
> -	}
> -
> -	return rc;
> +	/* device_offline() will determine if we can actually remove this lmb */
> +	return true;
>  }
>  
>  static int dlpar_add_lmb(struct drmem_lmb *);
> -- 
> 2.25.1

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
@ 2020-04-07 14:00   ` Michal Hocko
  2020-04-07 21:30   ` Wei Yang
  2020-04-08  2:48   ` Baoquan He
  2 siblings, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2020-04-07 14:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, Michael Ellerman,
	Benjamin Herrenschmidt, Andrew Morton, Oscar Salvador,
	Baoquan He, Wei Yang

On Tue 07-04-20 15:54:16, David Hildenbrand wrote:
> Fortunately, all users of is_mem_section_removable() are gone. Get rid of
> it, including some now unnecessary functions.
> 
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  include/linux/memory_hotplug.h |  7 ----
>  mm/memory_hotplug.c            | 75 ----------------------------------

\o/

Thanks!

>  2 files changed, 82 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 93d9ada74ddd..7dca9cd6076b 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  
> -extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
>  extern void try_offline_node(int nid);
>  extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
>  extern int remove_memory(int nid, u64 start, u64 size);
>  extern void __remove_memory(int nid, u64 start, u64 size);
>  
>  #else
> -static inline bool is_mem_section_removable(unsigned long pfn,
> -					unsigned long nr_pages)
> -{
> -	return false;
> -}
> -
>  static inline void try_offline_node(int nid) {}
>  
>  static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 47cf6036eb31..4d338d546d52 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
>  EXPORT_SYMBOL_GPL(add_memory);
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> -/*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> - */
> -static inline int pageblock_free(struct page *page)
> -{
> -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> -}
> -
> -/* Return the pfn of the start of the next active pageblock after a given pfn */
> -static unsigned long next_active_pageblock(unsigned long pfn)
> -{
> -	struct page *page = pfn_to_page(pfn);
> -
> -	/* Ensure the starting page is pageblock-aligned */
> -	BUG_ON(pfn & (pageblock_nr_pages - 1));
> -
> -	/* If the entire pageblock is free, move to the end of free page */
> -	if (pageblock_free(page)) {
> -		int order;
> -		/* be careful. we don't have locks, page_order can be changed.*/
> -		order = page_order(page);
> -		if ((order < MAX_ORDER) && (order >= pageblock_order))
> -			return pfn + (1 << order);
> -	}
> -
> -	return pfn + pageblock_nr_pages;
> -}
> -
> -static bool is_pageblock_removable_nolock(unsigned long pfn)
> -{
> -	struct page *page = pfn_to_page(pfn);
> -	struct zone *zone;
> -
> -	/*
> -	 * We have to be careful here because we are iterating over memory
> -	 * sections which are not zone aware so we might end up outside of
> -	 * the zone but still within the section.
> -	 * We have to take care about the node as well. If the node is offline
> -	 * its NODE_DATA will be NULL - see page_zone.
> -	 */
> -	if (!node_online(page_to_nid(page)))
> -		return false;
> -
> -	zone = page_zone(page);
> -	pfn = page_to_pfn(page);
> -	if (!zone_spans_pfn(zone, pfn))
> -		return false;
> -
> -	return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
> -				    MEMORY_OFFLINE);
> -}
> -
> -/* Checks if this range of memory is likely to be hot-removable. */
> -bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> -{
> -	unsigned long end_pfn, pfn;
> -
> -	end_pfn = min(start_pfn + nr_pages,
> -			zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
> -
> -	/* Check the starting page of each pageblock within the range */
> -	for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
> -		if (!is_pageblock_removable_nolock(pfn))
> -			return false;
> -		cond_resched();
> -	}
> -
> -	/* All pageblocks in the memory block are likely to be hot-removable */
> -	return true;
> -}
> -
>  /*
>   * Confirm all pages in a range [start, end) belong to the same zone (skipping
>   * memory holes). When true, return the zone.
> -- 
> 2.25.1
> 

-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
  2020-04-07 14:00   ` Michal Hocko
@ 2020-04-07 21:30   ` Wei Yang
  2020-04-08  2:48   ` Baoquan He
  2 siblings, 0 replies; 17+ messages in thread
From: Wei Yang @ 2020-04-07 21:30 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, Michael Ellerman,
	Benjamin Herrenschmidt, Michal Hocko, Andrew Morton,
	Oscar Salvador, Baoquan He, Wei Yang

On Tue, Apr 07, 2020 at 03:54:16PM +0200, David Hildenbrand wrote:
>Fortunately, all users of is_mem_section_removable() are gone. Get rid of
>it, including some now unnecessary functions.
>
>Cc: Michael Ellerman <mpe@ellerman.id.au>
>Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>Cc: Michal Hocko <mhocko@suse.com>
>Cc: Andrew Morton <akpm@linux-foundation.org>
>Cc: Oscar Salvador <osalvador@suse.de>
>Cc: Baoquan He <bhe@redhat.com>
>Cc: Wei Yang <richard.weiyang@gmail.com>
>Signed-off-by: David Hildenbrand <david@redhat.com>

Nice.

Reviewed-by: Wei Yang <richard.weiyang@gmail.com>

-- 
Wei Yang
Help you, Help me

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
  2020-04-07 13:58   ` Michal Hocko
@ 2020-04-08  2:46   ` Baoquan He
  2020-04-09  2:59     ` piliu
  2020-04-09  7:26   ` Michael Ellerman
  2 siblings, 1 reply; 17+ messages in thread
From: Baoquan He @ 2020-04-08  2:46 UTC (permalink / raw)
  To: David Hildenbrand, piliu
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Wei Yang

Add Pingfan to CC since he usually handles ppc related bugs for RHEL.

On 04/07/20 at 03:54pm, David Hildenbrand wrote:
> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.

Pingfan, can you have a look at this change on PPC?  Please feel free to
give comments if any concern, or offer ack if it's OK to you.

> 
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
> 
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
> 
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.
> 
> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>  1 file changed, 3 insertions(+), 23 deletions(-)
> 
> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
> index b2cde1732301..5ace2f9a277e 100644
> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>  
>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>  {
> -	int i, scns_per_block;
> -	bool rc = true;
> -	unsigned long pfn, block_sz;
> -	u64 phys_addr;
> -
>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>  		return false;
>  
> -	block_sz = memory_block_size_bytes();
> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
> -	phys_addr = lmb->base_addr;
> -
>  #ifdef CONFIG_FA_DUMP
>  	/*
>  	 * Don't hot-remove memory that falls in fadump boot memory area
>  	 * and memory that is reserved for capturing old kernel memory.
>  	 */
> -	if (is_fadump_memory_area(phys_addr, block_sz))
> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>  		return false;
>  #endif
> -
> -	for (i = 0; i < scns_per_block; i++) {
> -		pfn = PFN_DOWN(phys_addr);
> -		if (!pfn_in_present_section(pfn)) {
> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
> -			continue;
> -		}
> -
> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
> -	}
> -
> -	return rc;
> +	/* device_offline() will determine if we can actually remove this lmb */
> +	return true;
>  }
>  
>  static int dlpar_add_lmb(struct drmem_lmb *);
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
  2020-04-07 14:00   ` Michal Hocko
  2020-04-07 21:30   ` Wei Yang
@ 2020-04-08  2:48   ` Baoquan He
  2 siblings, 0 replies; 17+ messages in thread
From: Baoquan He @ 2020-04-08  2:48 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, Michael Ellerman,
	Benjamin Herrenschmidt, Michal Hocko, Andrew Morton,
	Oscar Salvador, Wei Yang

On 04/07/20 at 03:54pm, David Hildenbrand wrote:
> Fortunately, all users of is_mem_section_removable() are gone. Get rid of
> it, including some now unnecessary functions.
> 
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Baoquan He <bhe@redhat.com>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Assuming no issue to patch 1, this one looks good.

Reviewed-by: Baoquan He <bhe@redhat.com>

> ---
>  include/linux/memory_hotplug.h |  7 ----
>  mm/memory_hotplug.c            | 75 ----------------------------------
>  2 files changed, 82 deletions(-)
> 
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 93d9ada74ddd..7dca9cd6076b 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -314,19 +314,12 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
>  
> -extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
>  extern void try_offline_node(int nid);
>  extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
>  extern int remove_memory(int nid, u64 start, u64 size);
>  extern void __remove_memory(int nid, u64 start, u64 size);
>  
>  #else
> -static inline bool is_mem_section_removable(unsigned long pfn,
> -					unsigned long nr_pages)
> -{
> -	return false;
> -}
> -
>  static inline void try_offline_node(int nid) {}
>  
>  static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 47cf6036eb31..4d338d546d52 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1112,81 +1112,6 @@ int add_memory(int nid, u64 start, u64 size)
>  EXPORT_SYMBOL_GPL(add_memory);
>  
>  #ifdef CONFIG_MEMORY_HOTREMOVE
> -/*
> - * A free page on the buddy free lists (not the per-cpu lists) has PageBuddy
> - * set and the size of the free page is given by page_order(). Using this,
> - * the function determines if the pageblock contains only free pages.
> - * Due to buddy contraints, a free page at least the size of a pageblock will
> - * be located at the start of the pageblock
> - */
> -static inline int pageblock_free(struct page *page)
> -{
> -	return PageBuddy(page) && page_order(page) >= pageblock_order;
> -}
> -
> -/* Return the pfn of the start of the next active pageblock after a given pfn */
> -static unsigned long next_active_pageblock(unsigned long pfn)
> -{
> -	struct page *page = pfn_to_page(pfn);
> -
> -	/* Ensure the starting page is pageblock-aligned */
> -	BUG_ON(pfn & (pageblock_nr_pages - 1));
> -
> -	/* If the entire pageblock is free, move to the end of free page */
> -	if (pageblock_free(page)) {
> -		int order;
> -		/* be careful. we don't have locks, page_order can be changed.*/
> -		order = page_order(page);
> -		if ((order < MAX_ORDER) && (order >= pageblock_order))
> -			return pfn + (1 << order);
> -	}
> -
> -	return pfn + pageblock_nr_pages;
> -}
> -
> -static bool is_pageblock_removable_nolock(unsigned long pfn)
> -{
> -	struct page *page = pfn_to_page(pfn);
> -	struct zone *zone;
> -
> -	/*
> -	 * We have to be careful here because we are iterating over memory
> -	 * sections which are not zone aware so we might end up outside of
> -	 * the zone but still within the section.
> -	 * We have to take care about the node as well. If the node is offline
> -	 * its NODE_DATA will be NULL - see page_zone.
> -	 */
> -	if (!node_online(page_to_nid(page)))
> -		return false;
> -
> -	zone = page_zone(page);
> -	pfn = page_to_pfn(page);
> -	if (!zone_spans_pfn(zone, pfn))
> -		return false;
> -
> -	return !has_unmovable_pages(zone, page, MIGRATE_MOVABLE,
> -				    MEMORY_OFFLINE);
> -}
> -
> -/* Checks if this range of memory is likely to be hot-removable. */
> -bool is_mem_section_removable(unsigned long start_pfn, unsigned long nr_pages)
> -{
> -	unsigned long end_pfn, pfn;
> -
> -	end_pfn = min(start_pfn + nr_pages,
> -			zone_end_pfn(page_zone(pfn_to_page(start_pfn))));
> -
> -	/* Check the starting page of each pageblock within the range */
> -	for (pfn = start_pfn; pfn < end_pfn; pfn = next_active_pageblock(pfn)) {
> -		if (!is_pageblock_removable_nolock(pfn))
> -			return false;
> -		cond_resched();
> -	}
> -
> -	/* All pageblocks in the memory block are likely to be hot-removable */
> -	return true;
> -}
> -
>  /*
>   * Confirm all pages in a range [start, end) belong to the same zone (skipping
>   * memory holes). When true, return the zone.
> -- 
> 2.25.1
> 


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-08  2:46   ` Baoquan He
@ 2020-04-09  2:59     ` piliu
  2020-04-09  7:26       ` David Hildenbrand
  0 siblings, 1 reply; 17+ messages in thread
From: piliu @ 2020-04-09  2:59 UTC (permalink / raw)
  To: Baoquan He, David Hildenbrand
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Wei Yang



On 04/08/2020 10:46 AM, Baoquan He wrote:
> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
> 
> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>> blocks as removable"), the user space interface to compute whether a memory
>> block can be offlined (exposed via
>> /sys/devices/system/memory/memoryX/removable) has effectively been
>> deprecated. We want to remove the leftovers of the kernel implementation.
> 
> Pingfan, can you have a look at this change on PPC?  Please feel free to
> give comments if any concern, or offer ack if it's OK to you.
> 
>>
>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>> we'll start by:
>> 1. Testing if it contains any holes, and reject if so
>> 2. Testing if pages belong to different zones, and reject if so
>> 3. Isolating the page range, checking if it contains any unmovable pages
>>
>> Using is_mem_section_removable() before trying to offline is not only racy,
>> it can easily result in false positives/negatives. Let's stop manually
>> checking is_mem_section_removable(), and let device_offline() handle it
>> completely instead. We can remove the racy is_mem_section_removable()
>> implementation next.
>>
>> We now take more locks (e.g., memory hotplug lock when offlining and the
>> zone lock when isolating), but maybe we should optimize that
>> implementation instead if this ever becomes a real problem (after all,
>> memory unplug is already an expensive operation). We started using
>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>> Implement memory hotplug remove in the kernel"), with the initial
>> hotremove support of lmbs.
>>
>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: Oscar Salvador <osalvador@suse.de>
>> Cc: Baoquan He <bhe@redhat.com>
>> Cc: Wei Yang <richard.weiyang@gmail.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>
>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> index b2cde1732301..5ace2f9a277e 100644
>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>  
>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>  {
>> -	int i, scns_per_block;
>> -	bool rc = true;
>> -	unsigned long pfn, block_sz;
>> -	u64 phys_addr;
>> -
>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>  		return false;
>>  
>> -	block_sz = memory_block_size_bytes();
>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>> -	phys_addr = lmb->base_addr;
>> -
>>  #ifdef CONFIG_FA_DUMP
>>  	/*
>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>  	 * and memory that is reserved for capturing old kernel memory.
>>  	 */
>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>  		return false;
>>  #endif
>> -
>> -	for (i = 0; i < scns_per_block; i++) {
>> -		pfn = PFN_DOWN(phys_addr);
>> -		if (!pfn_in_present_section(pfn)) {
>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> -			continue;
>> -		}
>> -
>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>> -	}
>> -
>> -	return rc;
>> +	/* device_offline() will determine if we can actually remove this lmb */
>> +	return true;
So I think here swaps the check and do sequence. At least it breaks
dlpar_memory_remove_by_count(). It is doable to remove
is_mem_section_removable(), but here should be more effort to re-arrange
the code.

Thanks,
Pingfan
>>  }
>>  
>>  static int dlpar_add_lmb(struct drmem_lmb *);
>> -- 
>> 2.25.1
>>


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
  2020-04-07 13:58   ` Michal Hocko
  2020-04-08  2:46   ` Baoquan He
@ 2020-04-09  7:26   ` Michael Ellerman
  2020-04-09  7:32     ` David Hildenbrand
  2020-04-09  7:59     ` Michal Hocko
  2 siblings, 2 replies; 17+ messages in thread
From: Michael Ellerman @ 2020-04-09  7:26 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, linuxppc-dev, David Hildenbrand, Nathan Fontenot,
	Benjamin Herrenschmidt, Paul Mackerras, Michal Hocko,
	Andrew Morton, Oscar Salvador, Baoquan He, Wei Yang

David Hildenbrand <david@redhat.com> writes:

> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> blocks as removable"), the user space interface to compute whether a memory
> block can be offlined (exposed via
> /sys/devices/system/memory/memoryX/removable) has effectively been
> deprecated. We want to remove the leftovers of the kernel implementation.
>
> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> we'll start by:
> 1. Testing if it contains any holes, and reject if so
> 2. Testing if pages belong to different zones, and reject if so
> 3. Isolating the page range, checking if it contains any unmovable pages
>
> Using is_mem_section_removable() before trying to offline is not only racy,
> it can easily result in false positives/negatives. Let's stop manually
> checking is_mem_section_removable(), and let device_offline() handle it
> completely instead. We can remove the racy is_mem_section_removable()
> implementation next.
>
> We now take more locks (e.g., memory hotplug lock when offlining and the
> zone lock when isolating), but maybe we should optimize that
> implementation instead if this ever becomes a real problem (after all,
> memory unplug is already an expensive operation). We started using
> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> Implement memory hotplug remove in the kernel"), with the initial
> hotremove support of lmbs.

It's also not very pretty in dmesg.

Before:

  pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
  pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
  dlpar: Could not handle DLPAR request "memory add count 10"

After:

  pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
  page:c00c000001ca8200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000072a080180
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 c00c000001cffd48 c000000781c51410 c000000793327580
  raw: c00000072a080180 0000000001550001 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001cc4a80 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000079b110080
  flags: 0x7ffff000000000()
  raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
  raw: c00000079b110080 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001d08200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000074208ff00
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 c000000781c5f150 c00c000001d37f88 c000000798a24880
  raw: c00000074208ff00 0000000001550002 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001d40140 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000750059c00
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 c00c000001dfcfc8 c00c000001d3c288 c0000007851c2d00
  raw: c000000750059c00 0000000001000003 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001d9c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
  flags: 0x7ffff000000000()
  raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
  raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001dc0200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
  flags: 0x7ffff000000000()
  raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
  raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001e00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801f500
  raw: 0000000000000000 0000000008000800 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001e40440 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801e380
  raw: 0000000000000000 0000000000400040 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000001e80000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a0000640
  flags: 0x7ffff000000200(slab)
  raw: 007ffff000000200 c00c000001e5af48 c00c000001e80408 c000000f42d00a00
  raw: c0000007a0000640 00000000066600a2 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000003c89d40 refcount:2 mapcount:1 mapping:0000000018c4a547 index:0x10a41
  anon flags: 0x17ffff000080024(uptodate|active|swapbacked)
  raw: 017ffff000080024 5deadbeef0000100 5deadbeef0000122 c0000007986b03c9
  raw: 0000000000010a41 0000000000000000 0000000200000000 c00000000340b000
  page dumped because: unmovable page
  page->mem_cgroup:c00000000340b000
  page:c00c000003cc0000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f3000fd38
  flags: 0x17ffff000000200(slab)
  raw: 017ffff000000200 c000000f3c911890 c000000f3c911890 c00000079fffd980
  raw: c000000f3000fd38 0000000000700003 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000003d00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a2ec0000
  flags: 0x17ffff000000000()
  raw: 017ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
  raw: c0000007a2ec0000 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000003e2c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f8b008400
  flags: 0x27ffff000000200(slab)
  raw: 027ffff000000200 c000000f8e000190 c000000f8e000190 c0000007a801e380
  raw: c000000f8b008400 0000000000400038 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  page:c00c000003fec000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
  flags: 0x37ffff000000000()
  raw: 037ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
  raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
  page dumped because: unmovable page
  pseries-hotplug-mem: Memory hot-remove failed, adding LMB's back
  dlpar: Could not handle DLPAR request "memory remove count 10"



It looks like set_migratetype_isolate() and start_isolate_page_range()
can be told not to report those warnings, but we're just calling
device_offline() which doesn't let us specify that.

cheers

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  2:59     ` piliu
@ 2020-04-09  7:26       ` David Hildenbrand
  2020-04-09  8:56         ` David Hildenbrand
  2020-04-09 14:01         ` piliu
  0 siblings, 2 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-09  7:26 UTC (permalink / raw)
  To: piliu, Baoquan He
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Wei Yang

On 09.04.20 04:59, piliu wrote:
> 
> 
> On 04/08/2020 10:46 AM, Baoquan He wrote:
>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>
>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>
>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>> give comments if any concern, or offer ack if it's OK to you.
>>
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>>
>>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: Michal Hocko <mhocko@suse.com>
>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>> Cc: Baoquan He <bhe@redhat.com>
>>> Cc: Wei Yang <richard.weiyang@gmail.com>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>> ---
>>>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>>
>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> index b2cde1732301..5ace2f9a277e 100644
>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>  
>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>  {
>>> -	int i, scns_per_block;
>>> -	bool rc = true;
>>> -	unsigned long pfn, block_sz;
>>> -	u64 phys_addr;
>>> -
>>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>  		return false;
>>>  
>>> -	block_sz = memory_block_size_bytes();
>>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>> -	phys_addr = lmb->base_addr;
>>> -
>>>  #ifdef CONFIG_FA_DUMP
>>>  	/*
>>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>>  	 * and memory that is reserved for capturing old kernel memory.
>>>  	 */
>>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>  		return false;
>>>  #endif
>>> -
>>> -	for (i = 0; i < scns_per_block; i++) {
>>> -		pfn = PFN_DOWN(phys_addr);
>>> -		if (!pfn_in_present_section(pfn)) {
>>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -			continue;
>>> -		}
>>> -
>>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>> -	}
>>> -
>>> -	return rc;
>>> +	/* device_offline() will determine if we can actually remove this lmb */
>>> +	return true;
> So I think here swaps the check and do sequence. At least it breaks
> dlpar_memory_remove_by_count(). It is doable to remove
> is_mem_section_removable(), but here should be more effort to re-arrange
> the code.
> 

Thanks Pingfan,

1. "swaps the check and do sequence":

Partially. Any caller of dlpar_remove_lmb() already has to deal with
false positives. device_offline() can easily fail after
dlpar_remove_lmb() == true. It's inherently racy.

2. "breaks dlpar_memory_remove_by_count()"

Can you elaborate why it "breaks" it? It will simply try to
offline+remove lmbs, detect that it wasn't able to offline+remove as
much as it wanted (which could happen before as well easily), and re-add
the already offlined+removed ones.

3. "more effort to re-arrange the code"

What would be your suggestion?

We would rip out that racy check if we can remove as much memory as
requested in dlpar_memory_remove_by_count() and simply always try to
remove + recover.


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  7:26   ` Michael Ellerman
@ 2020-04-09  7:32     ` David Hildenbrand
  2020-04-09  7:59     ` Michal Hocko
  1 sibling, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-09  7:32 UTC (permalink / raw)
  To: Michael Ellerman, linux-kernel
  Cc: linux-mm, linuxppc-dev, Nathan Fontenot, Benjamin Herrenschmidt,
	Paul Mackerras, Michal Hocko, Andrew Morton, Oscar Salvador,
	Baoquan He, Wei Yang

> It's also not very pretty in dmesg.
> 
> Before:
> 
>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>   dlpar: Could not handle DLPAR request "memory add count 10"
> 

Thanks for running it through the mill.

Here you test "hotadd", below you test "hot-remove". But yeah, there is
a change in the amount of dmesg.

> After:
> 
>   pseries-hotplug-mem: Attempting to hot-remove 10 LMB(s)
>   page:c00c000001ca8200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000072a080180
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 c00c000001cffd48 c000000781c51410 c000000793327580
>   raw: c00000072a080180 0000000001550001 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001cc4a80 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000079b110080
>   flags: 0x7ffff000000000()
>   raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
>   raw: c00000079b110080 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001d08200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc00000074208ff00
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 c000000781c5f150 c00c000001d37f88 c000000798a24880
>   raw: c00000074208ff00 0000000001550002 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001d40140 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000750059c00
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 c00c000001dfcfc8 c00c000001d3c288 c0000007851c2d00
>   raw: c000000750059c00 0000000001000003 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001d9c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
>   flags: 0x7ffff000000000()
>   raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
>   raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001dc0200 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000002370080
>   flags: 0x7ffff000000000()
>   raw: 007ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
>   raw: c000000002370080 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001e00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801f500
>   raw: 0000000000000000 0000000008000800 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001e40440 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 5deadbeef0000100 5deadbeef0000122 c0000007a801e380
>   raw: 0000000000000000 0000000000400040 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000001e80000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a0000640
>   flags: 0x7ffff000000200(slab)
>   raw: 007ffff000000200 c00c000001e5af48 c00c000001e80408 c000000f42d00a00
>   raw: c0000007a0000640 00000000066600a2 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000003c89d40 refcount:2 mapcount:1 mapping:0000000018c4a547 index:0x10a41
>   anon flags: 0x17ffff000080024(uptodate|active|swapbacked)
>   raw: 017ffff000080024 5deadbeef0000100 5deadbeef0000122 c0000007986b03c9
>   raw: 0000000000010a41 0000000000000000 0000000200000000 c00000000340b000
>   page dumped because: unmovable page
>   page->mem_cgroup:c00000000340b000
>   page:c00c000003cc0000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f3000fd38
>   flags: 0x17ffff000000200(slab)
>   raw: 017ffff000000200 c000000f3c911890 c000000f3c911890 c00000079fffd980
>   raw: c000000f3000fd38 0000000000700003 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000003d00000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc0000007a2ec0000
>   flags: 0x17ffff000000000()
>   raw: 017ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
>   raw: c0000007a2ec0000 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000003e2c000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0xc000000f8b008400
>   flags: 0x27ffff000000200(slab)
>   raw: 027ffff000000200 c000000f8e000190 c000000f8e000190 c0000007a801e380
>   raw: c000000f8b008400 0000000000400038 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   page:c00c000003fec000 refcount:1 mapcount:0 mapping:0000000018c4a547 index:0x0
>   flags: 0x37ffff000000000()
>   raw: 037ffff000000000 5deadbeef0000100 5deadbeef0000122 0000000000000000
>   raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
>   page dumped because: unmovable page
>   pseries-hotplug-mem: Memory hot-remove failed, adding LMB's back
>   dlpar: Could not handle DLPAR request "memory remove count 10"
> 
> 
> 
> It looks like set_migratetype_isolate() and start_isolate_page_range()
> can be told not to report those warnings, but we're just calling
> device_offline() which doesn't let us specify that.

Yeah, but these messages can easily pop up (to a more limited degree)
with the current code as well, as checking for movable pages without
isolating the page range gives you no guarantees that no unmovable data
will end up on the lmb until you offline it. It's simply racy.

I discussed this output with Michal when we changed the
/sys/devices/system/memory/memoryX/removable behavior, and he had the
opinion that dmesg (debug) output should not really be an issue.

We could make this output

a) configurable at runtime and let powerpc code disable it while calling
device_offline(). So user space attempts will still trigger the messages

b) configurable at compile time


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  7:26   ` Michael Ellerman
  2020-04-09  7:32     ` David Hildenbrand
@ 2020-04-09  7:59     ` Michal Hocko
  2020-04-09  8:12       ` David Hildenbrand
  1 sibling, 1 reply; 17+ messages in thread
From: Michal Hocko @ 2020-04-09  7:59 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: David Hildenbrand, linux-kernel, linux-mm, linuxppc-dev,
	Nathan Fontenot, Benjamin Herrenschmidt, Paul Mackerras,
	Andrew Morton, Oscar Salvador, Baoquan He, Wei Yang

On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> David Hildenbrand <david@redhat.com> writes:
> 
> > In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> > blocks as removable"), the user space interface to compute whether a memory
> > block can be offlined (exposed via
> > /sys/devices/system/memory/memoryX/removable) has effectively been
> > deprecated. We want to remove the leftovers of the kernel implementation.
> >
> > When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> > we'll start by:
> > 1. Testing if it contains any holes, and reject if so
> > 2. Testing if pages belong to different zones, and reject if so
> > 3. Isolating the page range, checking if it contains any unmovable pages
> >
> > Using is_mem_section_removable() before trying to offline is not only racy,
> > it can easily result in false positives/negatives. Let's stop manually
> > checking is_mem_section_removable(), and let device_offline() handle it
> > completely instead. We can remove the racy is_mem_section_removable()
> > implementation next.
> >
> > We now take more locks (e.g., memory hotplug lock when offlining and the
> > zone lock when isolating), but maybe we should optimize that
> > implementation instead if this ever becomes a real problem (after all,
> > memory unplug is already an expensive operation). We started using
> > is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> > Implement memory hotplug remove in the kernel"), with the initial
> > hotremove support of lmbs.
> 
> It's also not very pretty in dmesg.
> 
> Before:
> 
>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>   dlpar: Could not handle DLPAR request "memory add count 10"

Yeah, there is more output but isn't that useful? Or put it differently
what is the actual problem from having those messages in the kernel log?

From the below you can clearly tell that there are kernel allocations
which prevent hot remove from happening.

If the overall size of the debugging output is a concern then we can
think of a way to reduce it. E.g. once you have a couple of pages
reported then all others from the same block are likely not interesting
much.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  7:59     ` Michal Hocko
@ 2020-04-09  8:12       ` David Hildenbrand
  2020-04-09  8:49         ` Michal Hocko
  0 siblings, 1 reply; 17+ messages in thread
From: David Hildenbrand @ 2020-04-09  8:12 UTC (permalink / raw)
  To: Michal Hocko, Michael Ellerman
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Benjamin Herrenschmidt, Paul Mackerras, Andrew Morton,
	Oscar Salvador, Baoquan He, Wei Yang

On 09.04.20 09:59, Michal Hocko wrote:
> On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
>> David Hildenbrand <david@redhat.com> writes:
>>
>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>> blocks as removable"), the user space interface to compute whether a memory
>>> block can be offlined (exposed via
>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>> we'll start by:
>>> 1. Testing if it contains any holes, and reject if so
>>> 2. Testing if pages belong to different zones, and reject if so
>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>
>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>> it can easily result in false positives/negatives. Let's stop manually
>>> checking is_mem_section_removable(), and let device_offline() handle it
>>> completely instead. We can remove the racy is_mem_section_removable()
>>> implementation next.
>>>
>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>> zone lock when isolating), but maybe we should optimize that
>>> implementation instead if this ever becomes a real problem (after all,
>>> memory unplug is already an expensive operation). We started using
>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>> Implement memory hotplug remove in the kernel"), with the initial
>>> hotremove support of lmbs.
>>
>> It's also not very pretty in dmesg.
>>
>> Before:
>>
>>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
>>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
>>   dlpar: Could not handle DLPAR request "memory add count 10"
> 
> Yeah, there is more output but isn't that useful? Or put it differently
> what is the actual problem from having those messages in the kernel log?
> 
> From the below you can clearly tell that there are kernel allocations
> which prevent hot remove from happening.
> 
> If the overall size of the debugging output is a concern then we can
> think of a way to reduce it. E.g. once you have a couple of pages
> reported then all others from the same block are likely not interesting
> much.
> 

IIRC, we only report one page per block already. (and stop, as we
detected something unmovable)

-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  8:12       ` David Hildenbrand
@ 2020-04-09  8:49         ` Michal Hocko
  0 siblings, 0 replies; 17+ messages in thread
From: Michal Hocko @ 2020-04-09  8:49 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Michael Ellerman, linux-kernel, linux-mm, linuxppc-dev,
	Nathan Fontenot, Benjamin Herrenschmidt, Paul Mackerras,
	Andrew Morton, Oscar Salvador, Baoquan He, Wei Yang

On Thu 09-04-20 10:12:20, David Hildenbrand wrote:
> On 09.04.20 09:59, Michal Hocko wrote:
> > On Thu 09-04-20 17:26:01, Michael Ellerman wrote:
> >> David Hildenbrand <david@redhat.com> writes:
> >>
> >>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
> >>> blocks as removable"), the user space interface to compute whether a memory
> >>> block can be offlined (exposed via
> >>> /sys/devices/system/memory/memoryX/removable) has effectively been
> >>> deprecated. We want to remove the leftovers of the kernel implementation.
> >>>
> >>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
> >>> we'll start by:
> >>> 1. Testing if it contains any holes, and reject if so
> >>> 2. Testing if pages belong to different zones, and reject if so
> >>> 3. Isolating the page range, checking if it contains any unmovable pages
> >>>
> >>> Using is_mem_section_removable() before trying to offline is not only racy,
> >>> it can easily result in false positives/negatives. Let's stop manually
> >>> checking is_mem_section_removable(), and let device_offline() handle it
> >>> completely instead. We can remove the racy is_mem_section_removable()
> >>> implementation next.
> >>>
> >>> We now take more locks (e.g., memory hotplug lock when offlining and the
> >>> zone lock when isolating), but maybe we should optimize that
> >>> implementation instead if this ever becomes a real problem (after all,
> >>> memory unplug is already an expensive operation). We started using
> >>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
> >>> Implement memory hotplug remove in the kernel"), with the initial
> >>> hotremove support of lmbs.
> >>
> >> It's also not very pretty in dmesg.
> >>
> >> Before:
> >>
> >>   pseries-hotplug-mem: Attempting to hot-add 10 LMB(s)
> >>   pseries-hotplug-mem: Memory hot-add failed, removing any added LMBs
> >>   dlpar: Could not handle DLPAR request "memory add count 10"
> > 
> > Yeah, there is more output but isn't that useful? Or put it differently
> > what is the actual problem from having those messages in the kernel log?
> > 
> > From the below you can clearly tell that there are kernel allocations
> > which prevent hot remove from happening.
> > 
> > If the overall size of the debugging output is a concern then we can
> > think of a way to reduce it. E.g. once you have a couple of pages
> > reported then all others from the same block are likely not interesting
> > much.
> > 
> 
> IIRC, we only report one page per block already. (and stop, as we
> detected something unmovable)

You are right.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  7:26       ` David Hildenbrand
@ 2020-04-09  8:56         ` David Hildenbrand
  2020-04-09 14:01         ` piliu
  1 sibling, 0 replies; 17+ messages in thread
From: David Hildenbrand @ 2020-04-09  8:56 UTC (permalink / raw)
  To: piliu, Baoquan He
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Wei Yang

On 09.04.20 09:26, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>>> blocks as removable"), the user space interface to compute whether a memory
>>>> block can be offlined (exposed via
>>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>
>>>>
>>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>>> we'll start by:
>>>> 1. Testing if it contains any holes, and reject if so
>>>> 2. Testing if pages belong to different zones, and reject if so
>>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>>
>>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>>> it can easily result in false positives/negatives. Let's stop manually
>>>> checking is_mem_section_removable(), and let device_offline() handle it
>>>> completely instead. We can remove the racy is_mem_section_removable()
>>>> implementation next.
>>>>
>>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>>> zone lock when isolating), but maybe we should optimize that
>>>> implementation instead if this ever becomes a real problem (after all,
>>>> memory unplug is already an expensive operation). We started using
>>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>>> Implement memory hotplug remove in the kernel"), with the initial
>>>> hotremove support of lmbs.
>>>>
>>>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>> Cc: Paul Mackerras <paulus@samba.org>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>> Cc: Baoquan He <bhe@redhat.com>
>>>> Cc: Wei Yang <richard.weiyang@gmail.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>>>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> index b2cde1732301..5ace2f9a277e 100644
>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>>  
>>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>>  {
>>>> -	int i, scns_per_block;
>>>> -	bool rc = true;
>>>> -	unsigned long pfn, block_sz;
>>>> -	u64 phys_addr;
>>>> -
>>>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>>  		return false;
>>>>  
>>>> -	block_sz = memory_block_size_bytes();
>>>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>>> -	phys_addr = lmb->base_addr;
>>>> -
>>>>  #ifdef CONFIG_FA_DUMP
>>>>  	/*
>>>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>>>  	 * and memory that is reserved for capturing old kernel memory.
>>>>  	 */
>>>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>>>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>>  		return false;
>>>>  #endif
>>>> -
>>>> -	for (i = 0; i < scns_per_block; i++) {
>>>> -		pfn = PFN_DOWN(phys_addr);
>>>> -		if (!pfn_in_present_section(pfn)) {
>>>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> -			continue;
>>>> -		}
>>>> -
>>>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> -	}
>>>> -
>>>> -	return rc;
>>>> +	/* device_offline() will determine if we can actually remove this lmb */
>>>> +	return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
> 
> Thanks Pingfan,
> 
> 1. "swaps the check and do sequence":
> 
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.

Sorry, s/dlpar_remove_lmb/lmb_is_removable/


-- 
Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable()
  2020-04-09  7:26       ` David Hildenbrand
  2020-04-09  8:56         ` David Hildenbrand
@ 2020-04-09 14:01         ` piliu
  1 sibling, 0 replies; 17+ messages in thread
From: piliu @ 2020-04-09 14:01 UTC (permalink / raw)
  To: David Hildenbrand, Baoquan He
  Cc: linux-kernel, linux-mm, linuxppc-dev, Nathan Fontenot,
	Michael Ellerman, Benjamin Herrenschmidt, Paul Mackerras,
	Michal Hocko, Andrew Morton, Oscar Salvador, Wei Yang



On 04/09/2020 03:26 PM, David Hildenbrand wrote:
> On 09.04.20 04:59, piliu wrote:
>>
>>
>> On 04/08/2020 10:46 AM, Baoquan He wrote:
>>> Add Pingfan to CC since he usually handles ppc related bugs for RHEL.
>>>
>>> On 04/07/20 at 03:54pm, David Hildenbrand wrote:
>>>> In commit 53cdc1cb29e8 ("drivers/base/memory.c: indicate all memory
>>>> blocks as removable"), the user space interface to compute whether a memory
>>>> block can be offlined (exposed via
>>>> /sys/devices/system/memory/memoryX/removable) has effectively been
>>>> deprecated. We want to remove the leftovers of the kernel implementation.
>>>
>>> Pingfan, can you have a look at this change on PPC?  Please feel free to
>>> give comments if any concern, or offer ack if it's OK to you.
>>>
>>>>
>>>> When offlining a memory block (mm/memory_hotplug.c:__offline_pages()),
>>>> we'll start by:
>>>> 1. Testing if it contains any holes, and reject if so
>>>> 2. Testing if pages belong to different zones, and reject if so
>>>> 3. Isolating the page range, checking if it contains any unmovable pages
>>>>
>>>> Using is_mem_section_removable() before trying to offline is not only racy,
>>>> it can easily result in false positives/negatives. Let's stop manually
>>>> checking is_mem_section_removable(), and let device_offline() handle it
>>>> completely instead. We can remove the racy is_mem_section_removable()
>>>> implementation next.
>>>>
>>>> We now take more locks (e.g., memory hotplug lock when offlining and the
>>>> zone lock when isolating), but maybe we should optimize that
>>>> implementation instead if this ever becomes a real problem (after all,
>>>> memory unplug is already an expensive operation). We started using
>>>> is_mem_section_removable() in commit 51925fb3c5c9 ("powerpc/pseries:
>>>> Implement memory hotplug remove in the kernel"), with the initial
>>>> hotremove support of lmbs.
>>>>
>>>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>>> Cc: Paul Mackerras <paulus@samba.org>
>>>> Cc: Michal Hocko <mhocko@suse.com>
>>>> Cc: Andrew Morton <akpm@linux-foundation.org>
>>>> Cc: Oscar Salvador <osalvador@suse.de>
>>>> Cc: Baoquan He <bhe@redhat.com>
>>>> Cc: Wei Yang <richard.weiyang@gmail.com>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>  .../platforms/pseries/hotplug-memory.c        | 26 +++----------------
>>>>  1 file changed, 3 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> index b2cde1732301..5ace2f9a277e 100644
>>>> --- a/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> +++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
>>>> @@ -337,39 +337,19 @@ static int pseries_remove_mem_node(struct device_node *np)
>>>>  
>>>>  static bool lmb_is_removable(struct drmem_lmb *lmb)
>>>>  {
>>>> -	int i, scns_per_block;
>>>> -	bool rc = true;
>>>> -	unsigned long pfn, block_sz;
>>>> -	u64 phys_addr;
>>>> -
>>>>  	if (!(lmb->flags & DRCONF_MEM_ASSIGNED))
>>>>  		return false;
>>>>  
>>>> -	block_sz = memory_block_size_bytes();
>>>> -	scns_per_block = block_sz / MIN_MEMORY_BLOCK_SIZE;
>>>> -	phys_addr = lmb->base_addr;
>>>> -
>>>>  #ifdef CONFIG_FA_DUMP
>>>>  	/*
>>>>  	 * Don't hot-remove memory that falls in fadump boot memory area
>>>>  	 * and memory that is reserved for capturing old kernel memory.
>>>>  	 */
>>>> -	if (is_fadump_memory_area(phys_addr, block_sz))
>>>> +	if (is_fadump_memory_area(lmb->base_addr, memory_block_size_bytes()))
>>>>  		return false;
>>>>  #endif
>>>> -
>>>> -	for (i = 0; i < scns_per_block; i++) {
>>>> -		pfn = PFN_DOWN(phys_addr);
>>>> -		if (!pfn_in_present_section(pfn)) {
>>>> -			phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> -			continue;
>>>> -		}
>>>> -
>>>> -		rc = rc && is_mem_section_removable(pfn, PAGES_PER_SECTION);
>>>> -		phys_addr += MIN_MEMORY_BLOCK_SIZE;
>>>> -	}
>>>> -
>>>> -	return rc;
>>>> +	/* device_offline() will determine if we can actually remove this lmb */
>>>> +	return true;
>> So I think here swaps the check and do sequence. At least it breaks
>> dlpar_memory_remove_by_count(). It is doable to remove
>> is_mem_section_removable(), but here should be more effort to re-arrange
>> the code.
>>
> 
> Thanks Pingfan,
> 
> 1. "swaps the check and do sequence":
> 
> Partially. Any caller of dlpar_remove_lmb() already has to deal with
> false positives. device_offline() can easily fail after
> dlpar_remove_lmb() == true. It's inherently racy.
> 
> 2. "breaks dlpar_memory_remove_by_count()"
> 
> Can you elaborate why it "breaks" it? It will simply try to
> offline+remove lmbs, detect that it wasn't able to offline+remove as
> much as it wanted (which could happen before as well easily), and re-add
> the already offlined+removed ones.
> 
I overlooked the re-add logic. Then I think
dlpar_memory_remove_by_count() is OK with this patch.
> 3. "more effort to re-arrange the code"
> 
> What would be your suggestion?
> 
I had thought about merging the two loop "for_each_drmem_lmb()", and do
check inside the loop. But now it is needless.

The only concerned left is "if (lmbs_available < lmbs_to_remove)" fails
to alarm due to the weaken checking in lmb_is_removable(). Then after
heavy migration in offline_pages, we encounters this limit, and need to
re-add them back.

But I think it is a rare case plus hot-remove is also not a quite
frequent event. So it is worth to simplify the code by this patch.

Thanks for your classification.

For [1/2]
Reviewed-by: Pingfan Liu <piliu@redhat.com>


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2020-04-09 14:01 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-07 13:54 [PATCH v1 0/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
2020-04-07 13:54 ` [PATCH v1 1/2] powerpc/pseries/hotplug-memory: stop checking is_mem_section_removable() David Hildenbrand
2020-04-07 13:58   ` Michal Hocko
2020-04-08  2:46   ` Baoquan He
2020-04-09  2:59     ` piliu
2020-04-09  7:26       ` David Hildenbrand
2020-04-09  8:56         ` David Hildenbrand
2020-04-09 14:01         ` piliu
2020-04-09  7:26   ` Michael Ellerman
2020-04-09  7:32     ` David Hildenbrand
2020-04-09  7:59     ` Michal Hocko
2020-04-09  8:12       ` David Hildenbrand
2020-04-09  8:49         ` Michal Hocko
2020-04-07 13:54 ` [PATCH v1 2/2] mm/memory_hotplug: remove is_mem_section_removable() David Hildenbrand
2020-04-07 14:00   ` Michal Hocko
2020-04-07 21:30   ` Wei Yang
2020-04-08  2:48   ` Baoquan He

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).