* [PATCH 0/4] powerpc/mm: Cleanups to hotplug memory path
@ 2016-02-09  3:32 David Gibson
  2016-02-09  3:32 ` [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping David Gibson
                   ` (3 more replies)
  0 siblings, 4 replies; 16+ messages in thread
From: David Gibson @ 2016-02-09  3:32 UTC (permalink / raw)
  To: mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

The cleanups to the (guest side) memory hotplug paths came up in the
context of allowing hash page table resizing for PAPR guests.
However, they stand on their own and can improve reporting of several
error conditions that could already happen.

Please apply.

David Gibson (4):
  powerpc/mm: Clean up error handling for htab_remove_mapping
  powerpc/mm: Handle removing maybe-present bolted HPTEs
  powerpc/mm: Clean up memory hotplug failure paths
  powerpc/mm: Split hash page table sizing heuristic into a helper

 arch/powerpc/include/asm/machdep.h    |  2 +-
 arch/powerpc/include/asm/mmu-hash64.h |  3 ++
 arch/powerpc/mm/hash_utils_64.c       | 73 ++++++++++++++++++++++-------------
 arch/powerpc/mm/init_64.c             | 47 ++++++++++++++--------
 arch/powerpc/mm/mem.c                 | 10 ++++-
 arch/powerpc/platforms/pseries/lpar.c |  9 +++--
 6 files changed, 96 insertions(+), 48 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping
  2016-02-09  3:32 [PATCH 0/4] powerpc/mm: Cleanups to hotplug memory path David Gibson
@ 2016-02-09  3:32 ` David Gibson
  2016-02-10  8:56   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [1/4] " Michael Ellerman
  2016-02-09  3:32 ` [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs David Gibson
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 16+ messages in thread
From: David Gibson @ 2016-02-09  3:32 UTC (permalink / raw)
  To: mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

Currently, the only error that htab_remove_mapping() can report is -EINVAL,
if removal of bolted HPTEs isn't implemented for this platform.  We make
a few cleanups to the handling of this:

 * EINVAL isn't really the right code - there's nothing wrong with the
   function's arguments - use ENODEV instead
 * We were also printing a warning message, but that's a decision better
   left up to the callers, so remove it
 * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
   error, making the warning message redundant, so no change is needed
   there.
 * The other caller is remove_section_mapping().  This is called in the
   memory hot remove path at a point after vmemmap_remove_mapping() so
   if hpte_removebolted isn't implemented, we'd expect to have already
   BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
   really shouldn't be happening.
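As a rough illustration (not part of the patch), the error-code choice can be sketched as a standalone C function; struct md_ops and remove_mapping_demo are invented stand-ins mirroring ppc_md.hpte_removebolted, not kernel code:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Invented mock of the ppc_md machdep table; only the hook used here. */
struct md_ops {
	int (*hpte_removebolted)(unsigned long ea, int psize, int ssize);
};

/*
 * The guard now reports a missing platform facility as -ENODEV rather
 * than -EINVAL (nothing is wrong with the arguments), and no longer
 * prints a warning itself - that decision is left to the callers.
 */
int remove_mapping_demo(const struct md_ops *md)
{
	if (!md || !md->hpte_removebolted)
		return -ENODEV;
	return 0;	/* the real code walks the range removing HPTEs */
}
```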

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/mm/hash_utils_64.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index ba59d59..9f7d727 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -273,11 +273,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;
 
-	if (!ppc_md.hpte_removebolted) {
-		printk(KERN_WARNING "Platform doesn't implement "
-				"hpte_removebolted\n");
-		return -EINVAL;
-	}
+	if (!ppc_md.hpte_removebolted)
+		return -ENODEV;
 
 	for (vaddr = vstart; vaddr < vend; vaddr += step)
 		ppc_md.hpte_removebolted(vaddr, psize, ssize);
@@ -641,8 +638,10 @@ int create_section_mapping(unsigned long start, unsigned long end)
 
 int remove_section_mapping(unsigned long start, unsigned long end)
 {
-	return htab_remove_mapping(start, end, mmu_linear_psize,
-			mmu_kernel_ssize);
+	int rc = htab_remove_mapping(start, end, mmu_linear_psize,
+				     mmu_kernel_ssize);
+	WARN_ON(rc < 0);
+	return rc;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
-- 
2.5.0


* [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs
  2016-02-09  3:32 [PATCH 0/4] powerpc/mm: Cleanups to hotplug memory path David Gibson
  2016-02-09  3:32 ` [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping David Gibson
@ 2016-02-09  3:32 ` David Gibson
  2016-02-10  8:58   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [2/4] " Michael Ellerman
  2016-02-09  3:32 ` [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths David Gibson
  2016-02-09  3:32 ` [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper David Gibson
  3 siblings, 2 replies; 16+ messages in thread
From: David Gibson @ 2016-02-09  3:32 UTC (permalink / raw)
  To: mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

At the moment the hpte_removebolted callback in ppc_md returns void and
will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
place.  This is awkward for the case of cleaning up a mapping which was
partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT
in the case that the HPTE to remove didn't exist in the first place.

In the (sole) caller, we propagate errors in hpte_removebolted to its
caller to handle.  However, we handle ENOENT specially, continuing to
complete the unmapping over the specified range before returning the error
to the caller.

This means that htab_remove_mapping() will work sanely on a partially
present mapping, removing any HPTEs which are present, while also returning
ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
   - In remove_section_mapping() we already WARN_ON() any error return,
     which is reasonable - in this case the mapping should be fully
     present
   - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
     just a WARN_ON() in the case of ENOENT, since failing to remove a
     mapping that wasn't there in the first place probably shouldn't be
     fatal.
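The skip-but-record behaviour described above can be modelled outside the kernel as a small loop; step_rc is an invented test scaffold standing in for the per-address hpte_removebolted() results:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Each entry mimics one hpte_removebolted() call over the range:
 * 0 on success, -ENOENT if the HPTE was never there, any other
 * negative value is a hard failure.  Mirrors the loop added to
 * htab_remove_mapping(): remember -ENOENT but finish the range,
 * bail out immediately on anything else.
 */
int remove_range_demo(const int *step_rc, size_t nsteps)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nsteps; i++) {
		int rc = step_rc[i];

		if (rc == -ENOENT) {
			ret = -ENOENT;	/* note it, keep unmapping */
			continue;
		}
		if (rc < 0)
			return rc;	/* hard error: stop immediately */
	}
	return ret;
}
```

So a partially present mapping is fully torn down, while the caller still learns (via -ENOENT) that some HPTEs were absent.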

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/machdep.h    |  2 +-
 arch/powerpc/mm/hash_utils_64.c       | 15 ++++++++++++---
 arch/powerpc/mm/init_64.c             |  9 +++++----
 arch/powerpc/platforms/pseries/lpar.c |  9 ++++++---
 4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 3f191f5..fa25643 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -54,7 +54,7 @@ struct machdep_calls {
 				       int psize, int apsize,
 				       int ssize);
 	long		(*hpte_remove)(unsigned long hpte_group);
-	void            (*hpte_removebolted)(unsigned long ea,
+	int             (*hpte_removebolted)(unsigned long ea,
 					     int psize, int ssize);
 	void		(*flush_hash_range)(unsigned long number, int local);
 	void		(*hugepage_invalidate)(unsigned long vsid,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 9f7d727..99fbee0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -269,6 +269,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 {
 	unsigned long vaddr;
 	unsigned int step, shift;
+	int rc;
+	int ret = 0;
 
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;
@@ -276,10 +278,17 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	if (!ppc_md.hpte_removebolted)
 		return -ENODEV;
 
-	for (vaddr = vstart; vaddr < vend; vaddr += step)
-		ppc_md.hpte_removebolted(vaddr, psize, ssize);
+	for (vaddr = vstart; vaddr < vend; vaddr += step) {
+		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
+		if (rc == -ENOENT) {
+			ret = -ENOENT;
+			continue;
+		}
+		if (rc < 0)
+			return rc;
+	}
 
-	return 0;
+	return ret;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 379a6a9..baa1a23 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 static void vmemmap_remove_mapping(unsigned long start,
 				   unsigned long page_size)
 {
-	int mapped = htab_remove_mapping(start, start + page_size,
-					 mmu_vmemmap_psize,
-					 mmu_kernel_ssize);
-	BUG_ON(mapped < 0);
+	int rc = htab_remove_mapping(start, start + page_size,
+				     mmu_vmemmap_psize,
+				     mmu_kernel_ssize);
+	BUG_ON((rc < 0) && (rc != -ENOENT));
+	WARN_ON(rc == -ENOENT);
 }
 #endif
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 477290a..2415a0d 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -505,8 +505,8 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
 }
 #endif
 
-static void pSeries_lpar_hpte_removebolted(unsigned long ea,
-					   int psize, int ssize)
+static int pSeries_lpar_hpte_removebolted(unsigned long ea,
+					  int psize, int ssize)
 {
 	unsigned long vpn;
 	unsigned long slot, vsid;
@@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea,
 	vpn = hpt_vpn(ea, vsid, ssize);
 
 	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
-	BUG_ON(slot == -1);
+	if (slot == -1)
+		return -ENOENT;
+
 	/*
 	 * lpar doesn't use the passed actual page size
 	 */
 	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
+	return 0;
 }
 
 /*
-- 
2.5.0


* [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths
  2016-02-09  3:32 [PATCH 0/4] powerpc/mm: Cleanups to hotplug memory path David Gibson
  2016-02-09  3:32 ` [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping David Gibson
  2016-02-09  3:32 ` [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs David Gibson
@ 2016-02-09  3:32 ` David Gibson
  2016-02-10  9:00   ` Aneesh Kumar K.V
                     ` (2 more replies)
  2016-02-09  3:32 ` [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper David Gibson
  3 siblings, 3 replies; 16+ messages in thread
From: David Gibson @ 2016-02-09  3:32 UTC (permalink / raw)
  To: mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

This makes a number of cleanups to handling of mapping failures during
memory hotplug on Power:

For errors creating the linear mapping for the hot-added region:
  * This is now reported with EFAULT which is more appropriate than the
    previous EINVAL (the failure is unlikely to be related to the
    function's parameters)
  * An error in this path now prints a warning message, rather than just
    silently failing to add the extra memory.
  * Previously a failure here could result in the region being partially
    mapped.  We now clean up any partial mapping before failing.

For errors creating the vmemmap for the hot-added region:
   * This is now reported with EFAULT instead of causing a BUG() - this
     could happen for external reasons (e.g. full hash table) so it's better
     to handle this non-fatally
   * An error message is also printed, so the failure won't be silent
   * As above a failure could cause a partially mapped region, we now
     clean this up.
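The new error-reporting shape in the hot-add path can be sketched as follows; add_memory_demo is an invented stand-in for the arch_add_memory() change, not the kernel function itself:

```c
#include <assert.h>
#include <errno.h>

/*
 * Sketch of the new failure path: a mapping failure is reported as
 * -EFAULT (the failure is about the address range, not the caller's
 * parameters) instead of the old -EINVAL, and is no longer silent.
 */
int add_memory_demo(int map_rc)
{
	if (map_rc) {
		/* the real code pr_warning()s the failed range here,
		 * after any partial mapping has been cleaned up */
		return -EFAULT;
	}
	return 0;
}
```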

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Paul Mackerras <paulus@samba.org>
---
 arch/powerpc/mm/hash_utils_64.c | 13 ++++++++++---
 arch/powerpc/mm/init_64.c       | 38 ++++++++++++++++++++++++++------------
 arch/powerpc/mm/mem.c           | 10 ++++++++--
 3 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 99fbee0..fdcf9d1 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -640,9 +640,16 @@ static unsigned long __init htab_get_table_size(void)
 #ifdef CONFIG_MEMORY_HOTPLUG
 int create_section_mapping(unsigned long start, unsigned long end)
 {
-	return htab_bolt_mapping(start, end, __pa(start),
-				 pgprot_val(PAGE_KERNEL), mmu_linear_psize,
-				 mmu_kernel_ssize);
+	int rc = htab_bolt_mapping(start, end, __pa(start),
+				   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
+				   mmu_kernel_ssize);
+
+	if (rc < 0) {
+		int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
+					      mmu_kernel_ssize);
+		BUG_ON(rc2 && (rc2 != -ENOENT));
+	}
+	return rc;
 }
 
 int remove_section_mapping(unsigned long start, unsigned long end)
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index baa1a23..fbc9448 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -188,9 +188,9 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size)
  */
 
 #ifdef CONFIG_PPC_BOOK3E
-static void __meminit vmemmap_create_mapping(unsigned long start,
-					     unsigned long page_size,
-					     unsigned long phys)
+static int __meminit vmemmap_create_mapping(unsigned long start,
+					    unsigned long page_size,
+					    unsigned long phys)
 {
 	/* Create a PTE encoding without page size */
 	unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED |
@@ -208,6 +208,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 	 */
 	for (i = 0; i < page_size; i += PAGE_SIZE)
 		BUG_ON(map_kernel_page(start + i, phys, flags));
+
+	return 0;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
 }
 #endif
 #else /* CONFIG_PPC_BOOK3E */
-static void __meminit vmemmap_create_mapping(unsigned long start,
-					     unsigned long page_size,
-					     unsigned long phys)
+static int __meminit vmemmap_create_mapping(unsigned long start,
+					    unsigned long page_size,
+					    unsigned long phys)
 {
-	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
-					pgprot_val(PAGE_KERNEL),
-					mmu_vmemmap_psize,
-					mmu_kernel_ssize);
-	BUG_ON(mapped < 0);
+	int rc = htab_bolt_mapping(start, start + page_size, phys,
+				   pgprot_val(PAGE_KERNEL),
+				   mmu_vmemmap_psize, mmu_kernel_ssize);
+	if (rc < 0) {
+		int rc2 = htab_remove_mapping(start, start + page_size,
+					      mmu_vmemmap_psize,
+					      mmu_kernel_ssize);
+		BUG_ON(rc2 && (rc2 != -ENOENT));
+	}
+	return rc;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -304,6 +311,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 
 	for (; start < end; start += page_size) {
 		void *p;
+		int rc;
 
 		if (vmemmap_populated(start, page_size))
 			continue;
@@ -317,7 +325,13 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 		pr_debug("      * %016lx..%016lx allocated at %p\n",
 			 start, start + page_size, p);
 
-		vmemmap_create_mapping(start, page_size, __pa(p));
+		rc = vmemmap_create_mapping(start, page_size, __pa(p));
+		if (rc < 0) {
+			pr_warning(
+				"vmemmap_populate: Unable to create vmemmap mapping: %d\n",
+				rc);
+			return -EFAULT;
+		}
 	}
 
 	return 0;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index d0f0a51..f980da6 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -119,12 +119,18 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
 	struct zone *zone;
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	int rc;
 
 	pgdata = NODE_DATA(nid);
 
 	start = (unsigned long)__va(start);
-	if (create_section_mapping(start, start + size))
-		return -EINVAL;
+	rc = create_section_mapping(start, start + size);
+	if (rc) {
+		pr_warning(
+			"Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
+			start, start + size, rc);
+		return -EFAULT;
+	}
 
 	/* this should work for most non-highmem platforms */
 	zone = pgdata->node_zones +
-- 
2.5.0


* [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper
  2016-02-09  3:32 [PATCH 0/4] powerpc/mm: Cleanups to hotplug memory path David Gibson
                   ` (2 preceding siblings ...)
  2016-02-09  3:32 ` [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths David Gibson
@ 2016-02-09  3:32 ` David Gibson
  2016-02-10  9:01   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [4/4] " Michael Ellerman
  3 siblings, 2 replies; 16+ messages in thread
From: David Gibson @ 2016-02-09  3:32 UTC (permalink / raw)
  To: mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

htab_get_table_size() either retrieves the size of the hash page table (HPT)
from the device tree - if the HPT size is determined by firmware - or
uses a heuristic to determine a good size based on RAM size if the kernel
is responsible for allocating the HPT.

To support a PAPR extension allowing resizing of the HPT, we're going to
want the memory size -> HPT size logic elsewhere, so split it out into a
helper function.
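For reference, the heuristic being split out can be restated as a self-contained function; htab_shift_demo is an invented name, and page_shift stands in for mmu_psize_defs[mmu_virtual_psize].shift (12 for 4K pages, 16 for 64K):

```c
#include <assert.h>

/* Standalone restatement of htab_shift_for_mem_size(): returns
 * log2 of the HPT size in bytes for a given memory size. */
unsigned htab_shift_demo(unsigned long mem_size, unsigned page_shift)
{
	unsigned memshift = 0;
	unsigned pteg_shift;

	/* floor(log2(mem_size)), i.e. __ilog2() */
	while ((mem_size >> memshift) > 1)
		memshift++;

	/* round mem_size up to the next power of 2 */
	if ((1UL << memshift) < mem_size)
		memshift += 1;

	/* aim for 2 pages per PTEG; each PTEG is 2^7 bytes */
	pteg_shift = memshift - (page_shift + 1);

	/* 2^11 PTEGs / 2^18 bytes is the architectural minimum */
	return pteg_shift + 7 > 18 ? pteg_shift + 7 : 18;
}
```

For example, 1GB of RAM with 64K pages gives 2^30 / 2^17 = 2^13 PTEGs, i.e. a 2^20-byte (1MB) hash table.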

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/mmu-hash64.h |  3 +++
 arch/powerpc/mm/hash_utils_64.c       | 32 +++++++++++++++++++-------------
 2 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 7352d3f..cf070fd 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
 	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
 	return get_vsid(context, ea, ssize);
 }
+
+unsigned htab_shift_for_mem_size(unsigned long mem_size);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index fdcf9d1..da5d279 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -611,10 +611,26 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
 	return 0;
 }
 
-static unsigned long __init htab_get_table_size(void)
+unsigned htab_shift_for_mem_size(unsigned long mem_size)
 {
-	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
+	unsigned memshift = __ilog2(mem_size);
+	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
+	unsigned pteg_shift;
+
+	/* round mem_size up to next power of 2 */
+	if ((1UL << memshift) < mem_size)
+		memshift += 1;
+
+	/* aim for 2 pages / pteg */
+	pteg_shift = memshift - (pshift + 1);
+
+	/* 2^11 PTEGS / 2^18 bytes is the minimum htab size permitted
+	 * by the architecture */
+	return max(pteg_shift + 7, 18U);
+}
 
+static unsigned long __init htab_get_table_size(void)
+{
 	/* If hash size isn't already provided by the platform, we try to
 	 * retrieve it from the device-tree. If it's not there neither, we
 	 * calculate it now based on the total RAM size
@@ -624,17 +640,7 @@ static unsigned long __init htab_get_table_size(void)
 	if (ppc64_pft_size)
 		return 1UL << ppc64_pft_size;
 
-	/* round mem_size up to next power of 2 */
-	mem_size = memblock_phys_mem_size();
-	rnd_mem_size = 1UL << __ilog2(mem_size);
-	if (rnd_mem_size < mem_size)
-		rnd_mem_size <<= 1;
-
-	/* # pages / 2 */
-	psize = mmu_psize_defs[mmu_virtual_psize].shift;
-	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
-
-	return pteg_count << 7;
+	return 1UL << htab_shift_for_mem_size(memblock_phys_mem_size());
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
2.5.0


* Re: [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping
  2016-02-09  3:32 ` [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping David Gibson
@ 2016-02-10  8:56   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [1/4] " Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-10  8:56 UTC (permalink / raw)
  To: David Gibson, mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

David Gibson <david@gibson.dropbear.id.au> writes:

> Currently, the only error that htab_remove_mapping() can report is -EINVAL,
> if removal of bolted HPTEs isn't implemented for this platform.  We make
> a few cleanups to the handling of this:
>
>  * EINVAL isn't really the right code - there's nothing wrong with the
>    function's arguments - use ENODEV instead
>  * We were also printing a warning message, but that's a decision better
>    left up to the callers, so remove it
>  * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
>    error, making the warning message redundant, so no change is needed
>    there.
>  * The other caller is remove_section_mapping().  This is called in the
>    memory hot remove path at a point after vmemmap_remove_mapping() so
>    if hpte_removebolted isn't implemented, we'd expect to have already
>    BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
>    really shouldn't be happening.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/mm/hash_utils_64.c | 13 ++++++-------
>  1 file changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index ba59d59..9f7d727 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -273,11 +273,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
>  
> -	if (!ppc_md.hpte_removebolted) {
> -		printk(KERN_WARNING "Platform doesn't implement "
> -				"hpte_removebolted\n");
> -		return -EINVAL;
> -	}
> +	if (!ppc_md.hpte_removebolted)
> +		return -ENODEV;
>  
>  	for (vaddr = vstart; vaddr < vend; vaddr += step)
>  		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> @@ -641,8 +638,10 @@ int create_section_mapping(unsigned long start, unsigned long end)
>  
>  int remove_section_mapping(unsigned long start, unsigned long end)
>  {
> -	return htab_remove_mapping(start, end, mmu_linear_psize,
> -			mmu_kernel_ssize);
> +	int rc = htab_remove_mapping(start, end, mmu_linear_psize,
> +				     mmu_kernel_ssize);
> +	WARN_ON(rc < 0);
> +	return rc;
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>  
> -- 
> 2.5.0


* Re: [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs
  2016-02-09  3:32 ` [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs David Gibson
@ 2016-02-10  8:58   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [2/4] " Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-10  8:58 UTC (permalink / raw)
  To: David Gibson, mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

David Gibson <david@gibson.dropbear.id.au> writes:

> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
>
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
>
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
>
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
>
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/machdep.h    |  2 +-
>  arch/powerpc/mm/hash_utils_64.c       | 15 ++++++++++++---
>  arch/powerpc/mm/init_64.c             |  9 +++++----
>  arch/powerpc/platforms/pseries/lpar.c |  9 ++++++---
>  4 files changed, 24 insertions(+), 11 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
> index 3f191f5..fa25643 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -54,7 +54,7 @@ struct machdep_calls {
>  				       int psize, int apsize,
>  				       int ssize);
>  	long		(*hpte_remove)(unsigned long hpte_group);
> -	void            (*hpte_removebolted)(unsigned long ea,
> +	int             (*hpte_removebolted)(unsigned long ea,
>  					     int psize, int ssize);
>  	void		(*flush_hash_range)(unsigned long number, int local);
>  	void		(*hugepage_invalidate)(unsigned long vsid,
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 9f7d727..99fbee0 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -269,6 +269,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  {
>  	unsigned long vaddr;
>  	unsigned int step, shift;
> +	int rc;
> +	int ret = 0;
>  
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
> @@ -276,10 +278,17 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  	if (!ppc_md.hpte_removebolted)
>  		return -ENODEV;
>  
> -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +		if (rc == -ENOENT) {
> +			ret = -ENOENT;
> +			continue;
> +		}
> +		if (rc < 0)
> +			return rc;
> +	}
>  
> -	return 0;
> +	return ret;
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>  
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 379a6a9..baa1a23 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
>  static void vmemmap_remove_mapping(unsigned long start,
>  				   unsigned long page_size)
>  {
> -	int mapped = htab_remove_mapping(start, start + page_size,
> -					 mmu_vmemmap_psize,
> -					 mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_remove_mapping(start, start + page_size,
> +				     mmu_vmemmap_psize,
> +				     mmu_kernel_ssize);
> +	BUG_ON((rc < 0) && (rc != -ENOENT));
> +	WARN_ON(rc == -ENOENT);
>  }
>  #endif
>  
> diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
> index 477290a..2415a0d 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -505,8 +505,8 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
>  }
>  #endif
>  
> -static void pSeries_lpar_hpte_removebolted(unsigned long ea,
> -					   int psize, int ssize)
> +static int pSeries_lpar_hpte_removebolted(unsigned long ea,
> +					  int psize, int ssize)
>  {
>  	unsigned long vpn;
>  	unsigned long slot, vsid;
> @@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea,
>  	vpn = hpt_vpn(ea, vsid, ssize);
>  
>  	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
> -	BUG_ON(slot == -1);
> +	if (slot == -1)
> +		return -ENOENT;
> +
>  	/*
>  	 * lpar doesn't use the passed actual page size
>  	 */
>  	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
> +	return 0;
>  }
>  
>  /*
> -- 
> 2.5.0
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev


* Re: [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths
  2016-02-09  3:32 ` [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths David Gibson
@ 2016-02-10  9:00   ` Aneesh Kumar K.V
  2016-03-01  1:59   ` [3/4] " Michael Ellerman
  2016-03-01 22:21   ` Michael Ellerman
  2 siblings, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-10  9:00 UTC (permalink / raw)
  To: David Gibson, mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

David Gibson <david@gibson.dropbear.id.au> writes:

> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
>
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
>
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reasons (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.
>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Paul Mackerras <paulus@samba.org>
> ---
>  arch/powerpc/mm/hash_utils_64.c | 13 ++++++++++---
>  arch/powerpc/mm/init_64.c       | 38 ++++++++++++++++++++++++++------------
>  arch/powerpc/mm/mem.c           | 10 ++++++++--
>  3 files changed, 44 insertions(+), 17 deletions(-)
>
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 99fbee0..fdcf9d1 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -640,9 +640,16 @@ static unsigned long __init htab_get_table_size(void)
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  int create_section_mapping(unsigned long start, unsigned long end)
>  {
> -	return htab_bolt_mapping(start, end, __pa(start),
> -				 pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> -				 mmu_kernel_ssize);
> +	int rc = htab_bolt_mapping(start, end, __pa(start),
> +				   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> +				   mmu_kernel_ssize);
> +
> +	if (rc < 0) {
> +		int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
> +					      mmu_kernel_ssize);
> +		BUG_ON(rc2 && (rc2 != -ENOENT));
> +	}
> +	return rc;
>  }
>  
>  int remove_section_mapping(unsigned long start, unsigned long end)
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index baa1a23..fbc9448 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -188,9 +188,9 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size)
>   */
>  
>  #ifdef CONFIG_PPC_BOOK3E
> -static void __meminit vmemmap_create_mapping(unsigned long start,
> -					     unsigned long page_size,
> -					     unsigned long phys)
> +static int __meminit vmemmap_create_mapping(unsigned long start,
> +					    unsigned long page_size,
> +					    unsigned long phys)
>  {
>  	/* Create a PTE encoding without page size */
>  	unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED |
> @@ -208,6 +208,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
>  	 */
>  	for (i = 0; i < page_size; i += PAGE_SIZE)
>  		BUG_ON(map_kernel_page(start + i, phys, flags));
> +
> +	return 0;
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
>  }
>  #endif
>  #else /* CONFIG_PPC_BOOK3E */
> -static void __meminit vmemmap_create_mapping(unsigned long start,
> -					     unsigned long page_size,
> -					     unsigned long phys)
> +static int __meminit vmemmap_create_mapping(unsigned long start,
> +					    unsigned long page_size,
> +					    unsigned long phys)
>  {
> -	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
> -					pgprot_val(PAGE_KERNEL),
> -					mmu_vmemmap_psize,
> -					mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_bolt_mapping(start, start + page_size, phys,
> +				   pgprot_val(PAGE_KERNEL),
> +				   mmu_vmemmap_psize, mmu_kernel_ssize);
> +	if (rc < 0) {
> +		int rc2 = htab_remove_mapping(start, start + page_size,
> +					      mmu_vmemmap_psize,
> +					      mmu_kernel_ssize);
> +		BUG_ON(rc2 && (rc2 != -ENOENT));
> +	}
> +	return rc;
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> @@ -304,6 +311,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>  
>  	for (; start < end; start += page_size) {
>  		void *p;
> +		int rc;
>  
>  		if (vmemmap_populated(start, page_size))
>  			continue;
> @@ -317,7 +325,13 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
>  		pr_debug("      * %016lx..%016lx allocated at %p\n",
>  			 start, start + page_size, p);
>  
> -		vmemmap_create_mapping(start, page_size, __pa(p));
> +		rc = vmemmap_create_mapping(start, page_size, __pa(p));
> +		if (rc < 0) {
> +			pr_warning(
> +				"vmemmap_populate: Unable to create vmemmap mapping: %d\n",
> +				rc);
> +			return -EFAULT;
> +		}
>  	}
>  
>  	return 0;
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index d0f0a51..f980da6 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -119,12 +119,18 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
>  	struct zone *zone;
>  	unsigned long start_pfn = start >> PAGE_SHIFT;
>  	unsigned long nr_pages = size >> PAGE_SHIFT;
> +	int rc;
>  
>  	pgdata = NODE_DATA(nid);
>  
>  	start = (unsigned long)__va(start);
> -	if (create_section_mapping(start, start + size))
> -		return -EINVAL;
> +	rc = create_section_mapping(start, start + size);
> +	if (rc) {
> +		pr_warning(
> +			"Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
> +			start, start + size, rc);
> +		return -EFAULT;
> +	}
>  
>  	/* this should work for most non-highmem platforms */
>  	zone = pgdata->node_zones +
> -- 
> 2.5.0
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper
  2016-02-09  3:32 ` [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper David Gibson
@ 2016-02-10  9:01   ` Aneesh Kumar K.V
  2016-03-01 22:21   ` [4/4] " Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Aneesh Kumar K.V @ 2016-02-10  9:01 UTC (permalink / raw)
  To: David Gibson, mpe, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

David Gibson <david@gibson.dropbear.id.au> writes:

> htab_get_table_size() either retrieves the size of the hash page table (HPT)
> from the device tree - if the HPT size is determined by firmware - or
> uses a heuristic to determine a good size based on RAM size if the kernel
> is responsible for allocating the HPT.
>
> To support a PAPR extension allowing resizing of the HPT, we're going to
> want the memory size -> HPT size logic elsewhere, so split it out into a
> helper function.
>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/mmu-hash64.h |  3 +++
>  arch/powerpc/mm/hash_utils_64.c       | 32 +++++++++++++++++++-------------
>  2 files changed, 22 insertions(+), 13 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index 7352d3f..cf070fd 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
>  	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
>  	return get_vsid(context, ea, ssize);
>  }
> +
> +unsigned htab_shift_for_mem_size(unsigned long mem_size);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index fdcf9d1..da5d279 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -611,10 +611,26 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
>  	return 0;
>  }
>  
> -static unsigned long __init htab_get_table_size(void)
> +unsigned htab_shift_for_mem_size(unsigned long mem_size)
>  {
> -	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
> +	unsigned memshift = __ilog2(mem_size);
> +	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
> +	unsigned pteg_shift;
> +
> +	/* round mem_size up to next power of 2 */
> +	if ((1UL << memshift) < mem_size)
> +		memshift += 1;
> +
> +	/* aim for 2 pages / pteg */
> +	pteg_shift = memshift - (pshift + 1);
> +
> +	/* 2^11 PTEGS / 2^18 bytes is the minimum htab size permitted
> +	 * by the architecture */
> +	return max(pteg_shift + 7, 18U);
> +}
>  
> +static unsigned long __init htab_get_table_size(void)
> +{
>  	/* If hash size isn't already provided by the platform, we try to
>  	 * retrieve it from the device-tree. If it's not there neither, we
>  	 * calculate it now based on the total RAM size
> @@ -624,17 +640,7 @@ static unsigned long __init htab_get_table_size(void)
>  	if (ppc64_pft_size)
>  		return 1UL << ppc64_pft_size;
>  
> -	/* round mem_size up to next power of 2 */
> -	mem_size = memblock_phys_mem_size();
> -	rnd_mem_size = 1UL << __ilog2(mem_size);
> -	if (rnd_mem_size < mem_size)
> -		rnd_mem_size <<= 1;
> -
> -	/* # pages / 2 */
> -	psize = mmu_psize_defs[mmu_virtual_psize].shift;
> -	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
> -
> -	return pteg_count << 7;
> +	return 1UL << htab_shift_for_mem_size(memblock_phys_mem_size());
>  }
>  
>  #ifdef CONFIG_MEMORY_HOTPLUG
> -- 
> 2.5.0

* Re: [3/4] powerpc/mm: Clean up memory hotplug failure paths
  2016-02-09  3:32 ` [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths David Gibson
  2016-02-10  9:00   ` Aneesh Kumar K.V
@ 2016-03-01  1:59   ` Michael Ellerman
  2016-03-01  2:27     ` David Gibson
  2016-03-01 22:21   ` Michael Ellerman
  2 siblings, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2016-03-01  1:59 UTC (permalink / raw)
  To: David Gibson, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

On Tue, 2016-09-02 at 03:32:42 UTC, David Gibson wrote:
> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
>
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
>
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reasons (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.
>
...
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index baa1a23..fbc9448 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
>  }
>  #endif
>  #else /* CONFIG_PPC_BOOK3E */
> -static void __meminit vmemmap_create_mapping(unsigned long start,
> -					     unsigned long page_size,
> -					     unsigned long phys)
> +static int __meminit vmemmap_create_mapping(unsigned long start,
> +					    unsigned long page_size,
> +					    unsigned long phys)
>  {
> -	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
> -					pgprot_val(PAGE_KERNEL),
> -					mmu_vmemmap_psize,
> -					mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_bolt_mapping(start, start + page_size, phys,
> +				   pgprot_val(PAGE_KERNEL),
> +				   mmu_vmemmap_psize, mmu_kernel_ssize);
> +	if (rc < 0) {
> +		int rc2 = htab_remove_mapping(start, start + page_size,
> +					      mmu_vmemmap_psize,
> +					      mmu_kernel_ssize);

This breaks the build when CONFIG_MEMORY_HOTPLUG=n, because
htab_remove_mapping() is not defined.

The obvious fix of moving htab_remove_mapping() out of CONFIG_MEMORY_HOTPLUG
works, so I'll do that unless anyone objects.

cheers

* Re: [3/4] powerpc/mm: Clean up memory hotplug failure paths
  2016-03-01  1:59   ` [3/4] " Michael Ellerman
@ 2016-03-01  2:27     ` David Gibson
  0 siblings, 0 replies; 16+ messages in thread
From: David Gibson @ 2016-03-01  2:27 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: paulus, benh, linuxppc-dev, linux-kernel

On Tue, Mar 01, 2016 at 12:59:07PM +1100, Michael Ellerman wrote:
> On Tue, 2016-09-02 at 03:32:42 UTC, David Gibson wrote:
> > This makes a number of cleanups to handling of mapping failures during
> > memory hotplug on Power:
> >
> > For errors creating the linear mapping for the hot-added region:
> >   * This is now reported with EFAULT which is more appropriate than the
> >     previous EINVAL (the failure is unlikely to be related to the
> >     function's parameters)
> >   * An error in this path now prints a warning message, rather than just
> >     silently failing to add the extra memory.
> >   * Previously a failure here could result in the region being partially
> >     mapped.  We now clean up any partial mapping before failing.
> >
> > For errors creating the vmemmap for the hot-added region:
> >    * This is now reported with EFAULT instead of causing a BUG() - this
> >      could happen for external reasons (e.g. full hash table) so it's better
> >      to handle this non-fatally
> >    * An error message is also printed, so the failure won't be silent
> >    * As above a failure could cause a partially mapped region, we now
> >      clean this up.
> >
> ...
> > diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> > index baa1a23..fbc9448 100644
> > --- a/arch/powerpc/mm/init_64.c
> > +++ b/arch/powerpc/mm/init_64.c
> > @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
> >  }
> >  #endif
> >  #else /* CONFIG_PPC_BOOK3E */
> > -static void __meminit vmemmap_create_mapping(unsigned long start,
> > -					     unsigned long page_size,
> > -					     unsigned long phys)
> > +static int __meminit vmemmap_create_mapping(unsigned long start,
> > +					    unsigned long page_size,
> > +					    unsigned long phys)
> >  {
> > -	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
> > -					pgprot_val(PAGE_KERNEL),
> > -					mmu_vmemmap_psize,
> > -					mmu_kernel_ssize);
> > -	BUG_ON(mapped < 0);
> > +	int rc = htab_bolt_mapping(start, start + page_size, phys,
> > +				   pgprot_val(PAGE_KERNEL),
> > +				   mmu_vmemmap_psize, mmu_kernel_ssize);
> > +	if (rc < 0) {
> > +		int rc2 = htab_remove_mapping(start, start + page_size,
> > +					      mmu_vmemmap_psize,
> > +					      mmu_kernel_ssize);
> 
> This breaks the build when CONFIG_MEMORY_HOTPLUG=n, because
> htab_remove_mapping() is not defined.
> 
> The obvious fix of moving htab_remove_mapping() out of CONFIG_MEMORY_HOTPLUG
> works, so I'll do that unless anyone objects.

Sounds good, thanks for the catch.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

* Re: [1/4] powerpc/mm: Clean up error handling for htab_remove_mapping
  2016-02-09  3:32 ` [PATCH 1/4] powerpc/mm: Clean up error handling for htab_remove_mapping David Gibson
  2016-02-10  8:56   ` Aneesh Kumar K.V
@ 2016-03-01 22:21   ` Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2016-03-01 22:21 UTC (permalink / raw)
  To: David Gibson, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

On Tue, 2016-09-02 at 03:32:40 UTC, David Gibson wrote:
> Currently, the only error that htab_remove_mapping() can report is -EINVAL,
> if removal of bolted HPTEs isn't implemented for this platform.  We make
> a few clean ups to the handling of this:
> 
>  * EINVAL isn't really the right code - there's nothing wrong with the
>    function's arguments - use ENODEV instead
>  * We were also printing a warning message, but that's a decision better
>    left up to the callers, so remove it
>  * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
>    error, making the warning message redundant, so no change is needed
>    there.
>  * The other caller is remove_section_mapping().  This is called in the
>    memory hot remove path at a point after vmemmap_remove_mapping() so
>    if hpte_removebolted isn't implemented, we'd expect to have already
>    BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
>    really shouldn't be happening.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/abd0a0e7914a1137973119ac3b

cheers

* Re: [4/4] powerpc/mm: Split hash page table sizing heuristic into a helper
  2016-02-09  3:32 ` [PATCH 4/4] powerpc/mm: Split hash page table sizing heuristic into a helper David Gibson
  2016-02-10  9:01   ` Aneesh Kumar K.V
@ 2016-03-01 22:21   ` Michael Ellerman
  2016-03-01 23:26     ` David Gibson
  1 sibling, 1 reply; 16+ messages in thread
From: Michael Ellerman @ 2016-03-01 22:21 UTC (permalink / raw)
  To: David Gibson, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

On Tue, 2016-09-02 at 03:32:43 UTC, David Gibson wrote:
> htab_get_table_size() either retrieves the size of the hash page table (HPT)
> from the device tree - if the HPT size is determined by firmware - or
> uses a heuristic to determine a good size based on RAM size if the kernel
> is responsible for allocating the HPT.
> 
> To support a PAPR extension allowing resizing of the HPT, we're going to
> want the memory size -> HPT size logic elsewhere, so split it out into a
> helper function.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/5c3c7ede2bdcb85fa2fd51c814

I reworded one comment a little, from:

	/* 2^11 PTEGS / 2^18 bytes is the minimum htab size permitted
	 * by the architecture */

to:
	/*
	 * 2^11 PTEGS of 128 bytes each, ie. 2^18 bytes is the minimum htab
	 * size permitted by the architecture.
	 */

To avoid any confusion about the "/" referring to division.

cheers

* Re: [2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs
  2016-02-09  3:32 ` [PATCH 2/4] powerpc/mm: Handle removing maybe-present bolted HPTEs David Gibson
  2016-02-10  8:58   ` Aneesh Kumar K.V
@ 2016-03-01 22:21   ` Michael Ellerman
  1 sibling, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2016-03-01 22:21 UTC (permalink / raw)
  To: David Gibson, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

On Tue, 2016-09-02 at 03:32:41 UTC, David Gibson wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
> 
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
> 
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
> 
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
> 
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/27828f98a0522ad4a745a80407

cheers

* Re: [3/4] powerpc/mm: Clean up memory hotplug failure paths
  2016-02-09  3:32 ` [PATCH 3/4] powerpc/mm: Clean up memory hotplug failure paths David Gibson
  2016-02-10  9:00   ` Aneesh Kumar K.V
  2016-03-01  1:59   ` [3/4] " Michael Ellerman
@ 2016-03-01 22:21   ` Michael Ellerman
  2 siblings, 0 replies; 16+ messages in thread
From: Michael Ellerman @ 2016-03-01 22:21 UTC (permalink / raw)
  To: David Gibson, paulus, benh; +Cc: linuxppc-dev, linux-kernel, David Gibson

On Tue, 2016-09-02 at 03:32:42 UTC, David Gibson wrote:
> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
> 
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
> 
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reasons (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> Reviewed-by: Paul Mackerras <paulus@samba.org>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/1dace6c665ec59bdc4eeafa4db

cheers

* Re: [4/4] powerpc/mm: Split hash page table sizing heuristic into a helper
  2016-03-01 22:21   ` [4/4] " Michael Ellerman
@ 2016-03-01 23:26     ` David Gibson
  0 siblings, 0 replies; 16+ messages in thread
From: David Gibson @ 2016-03-01 23:26 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: paulus, benh, linuxppc-dev, linux-kernel

On Wed, Mar 02, 2016 at 09:21:19AM +1100, Michael Ellerman wrote:
> On Tue, 2016-09-02 at 03:32:43 UTC, David Gibson wrote:
> > htab_get_table_size() either retrieves the size of the hash page table (HPT)
> > from the device tree - if the HPT size is determined by firmware - or
> > uses a heuristic to determine a good size based on RAM size if the kernel
> > is responsible for allocating the HPT.
> > 
> > To support a PAPR extension allowing resizing of the HPT, we're going to
> > want the memory size -> HPT size logic elsewhere, so split it out into a
> > helper function.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> 
> Applied to powerpc next, thanks.
> 
> https://git.kernel.org/powerpc/c/5c3c7ede2bdcb85fa2fd51c814
> 
> I reworded one comment a little, from:
> 
> 	/* 2^11 PTEGS / 2^18 bytes is the minimum htab size permitted
> 	 * by the architecture */
> 
> to:
> 	/*
> 	 * 2^11 PTEGS of 128 bytes each, ie. 2^18 bytes is the minimum htab
> 	 * size permitted by the architecture.
> 	 */
> 
> To avoid any confusion about the "/" referring to division.

Good call, thanks.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson
