All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFCv2 0/9] PAPR hash page table resizing (guest side)
@ 2016-01-29  5:23 David Gibson
  2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
                   ` (9 more replies)
  0 siblings, 10 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

Here's a second prototype of the guest side work for runtime resizing
of the has page table in PAPR guests.

This is now feature complete.  It implements the resizing, advertises
it with CAS, and will automatically invoke it to maintain a good HPT
size when memory is hot-added or hot-removed.

Patches 1-5 are standalone prerequisite cleanups that I'll be pushing
concurrently.

David Gibson (9):
  memblock: Don't mark memblock_phys_mem_size() as __init
  arch/powerpc: Clean up error handling for htab_remove_mapping
  arch/powerpc: Handle removing maybe-present bolted HPTEs
  arch/powerpc: Clean up memory hotplug failure paths
  arch/powerpc: Split hash page table sizing heuristic into a helper
  pseries: Add hypercall wrappers for hash page table resizing
  pseries: Add support for hash table resizing
  pseries: Advertise HPT resizing support via CAS
  pseries: Automatically resize HPT for memory hot add/remove

 arch/powerpc/include/asm/firmware.h       |   5 +-
 arch/powerpc/include/asm/hvcall.h         |   2 +
 arch/powerpc/include/asm/machdep.h        |   3 +-
 arch/powerpc/include/asm/mmu-hash64.h     |   3 +
 arch/powerpc/include/asm/plpar_wrappers.h |  12 +++
 arch/powerpc/include/asm/prom.h           |   1 +
 arch/powerpc/include/asm/sparsemem.h      |   1 +
 arch/powerpc/kernel/prom_init.c           |   2 +-
 arch/powerpc/mm/hash_utils_64.c           | 121 ++++++++++++++++++++++++------
 arch/powerpc/mm/init_64.c                 |  47 ++++++++----
 arch/powerpc/mm/mem.c                     |  14 +++-
 arch/powerpc/platforms/pseries/firmware.c |   1 +
 arch/powerpc/platforms/pseries/lpar.c     | 117 ++++++++++++++++++++++++++++-
 mm/memblock.c                             |   2 +-
 14 files changed, 281 insertions(+), 50 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 42+ messages in thread

* [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
@ 2016-01-29  5:23 ` David Gibson
  2016-02-01  5:50   ` Anshuman Khandual
  2016-02-08  2:46   ` Paul Mackerras
  2016-01-29  5:23 ` [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping David Gibson
                   ` (8 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

At the moment memblock_phys_mem_size() is marked as __init, and so is
discarded after boot.  This is different from most of the memblock
functions which are marked __init_memblock, and are only discarded after
boot if memory hotplug is not configured.

To allow for upcoming code which will need memblock_phys_mem_size() in the
hotplug path, change it from __init to __init_memblock.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 mm/memblock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memblock.c b/mm/memblock.c
index d2ed81e..dd79899 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1448,7 +1448,7 @@ void __init __memblock_free_late(phys_addr_t base, phys_addr_t size)
  * Remaining API functions
  */
 
-phys_addr_t __init memblock_phys_mem_size(void)
+phys_addr_t __init_memblock memblock_phys_mem_size(void)
 {
 	return memblock.memory.total_size;
 }
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
  2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
@ 2016-01-29  5:23 ` David Gibson
  2016-02-01  5:54   ` Anshuman Khandual
  2016-02-08  2:48   ` Paul Mackerras
  2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
                   ` (7 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

Currently, the only error that htab_remove_mapping() can report is -EINVAL,
if removal of bolted HPTEs isn't implemeted for this platform.  We make
a few clean ups to the handling of this:

 * EINVAL isn't really the right code - there's nothing wrong with the
   function's arguments - use ENODEV instead
 * We were also printing a warning message, but that's a decision better
   left up to the callers, so remove it
 * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
   error, making the warning message irrelevant, so no change is needed
   there.
 * The other caller is remove_section_mapping().  This is called in the
   memory hot remove path at a point after vmemmap_remove_mapping() so
   if hpte_removebolted isn't implemented, we'd expect to have already
   BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
   really shouldn't be happening.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/mm/hash_utils_64.c | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index ba59d59..9f7d727 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -273,11 +273,8 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;
 
-	if (!ppc_md.hpte_removebolted) {
-		printk(KERN_WARNING "Platform doesn't implement "
-				"hpte_removebolted\n");
-		return -EINVAL;
-	}
+	if (!ppc_md.hpte_removebolted)
+		return -ENODEV;
 
 	for (vaddr = vstart; vaddr < vend; vaddr += step)
 		ppc_md.hpte_removebolted(vaddr, psize, ssize);
@@ -641,8 +638,10 @@ int create_section_mapping(unsigned long start, unsigned long end)
 
 int remove_section_mapping(unsigned long start, unsigned long end)
 {
-	return htab_remove_mapping(start, end, mmu_linear_psize,
-			mmu_kernel_ssize);
+	int rc = htab_remove_mapping(start, end, mmu_linear_psize,
+				     mmu_kernel_ssize);
+	WARN_ON(rc < 0);
+	return rc;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
  2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
  2016-01-29  5:23 ` [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping David Gibson
@ 2016-01-29  5:23 ` David Gibson
  2016-02-01  5:58   ` Anshuman Khandual
                     ` (2 more replies)
  2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
                   ` (6 subsequent siblings)
  9 siblings, 3 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

At the moment the hpte_removebolted callback in ppc_md returns void and
will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
place.  This is awkward for the case of cleaning up a mapping which was
partially made before failing.

So, we add a return value to hpte_removebolted, and have it return ENOENT
in the case that the HPTE to remove didn't exist in the first place.

In the (sole) caller, we propagate errors in hpte_removebolted to its
caller to handle.  However, we handle ENOENT specially, continuing to
complete the unmapping over the specified range before returning the error
to the caller.

This means that htab_remove_mapping() will work sanely on a partially
present mapping, removing any HPTEs which are present, while also returning
ENOENT to its caller in case it's important there.

There are two callers of htab_remove_mapping():
   - In remove_section_mapping() we already WARN_ON() any error return,
     which is reasonable - in this case the mapping should be fully
     present
   - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
     just a WARN_ON() in the case of ENOENT, since failing to remove a
     mapping that wasn't there in the first place probably shouldn't be
     fatal.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/machdep.h    |  2 +-
 arch/powerpc/mm/hash_utils_64.c       | 10 +++++++---
 arch/powerpc/mm/init_64.c             |  9 +++++----
 arch/powerpc/platforms/pseries/lpar.c |  7 +++++--
 4 files changed, 18 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index 3f191f5..a7d3f66 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -54,7 +54,7 @@ struct machdep_calls {
 				       int psize, int apsize,
 				       int ssize);
 	long		(*hpte_remove)(unsigned long hpte_group);
-	void            (*hpte_removebolted)(unsigned long ea,
+	long            (*hpte_removebolted)(unsigned long ea,
 					     int psize, int ssize);
 	void		(*flush_hash_range)(unsigned long number, int local);
 	void		(*hugepage_invalidate)(unsigned long vsid,
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 9f7d727..0737eae 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 {
 	unsigned long vaddr;
 	unsigned int step, shift;
+	int rc = 0;
 
 	shift = mmu_psize_defs[psize].shift;
 	step = 1 << shift;
@@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
 	if (!ppc_md.hpte_removebolted)
 		return -ENODEV;
 
-	for (vaddr = vstart; vaddr < vend; vaddr += step)
-		ppc_md.hpte_removebolted(vaddr, psize, ssize);
+	for (vaddr = vstart; vaddr < vend; vaddr += step) {
+		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
+		if ((rc < 0) && (rc != -ENOENT))
+			return rc;
+	}
 
-	return 0;
+	return rc;
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index 379a6a9..baa1a23 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 static void vmemmap_remove_mapping(unsigned long start,
 				   unsigned long page_size)
 {
-	int mapped = htab_remove_mapping(start, start + page_size,
-					 mmu_vmemmap_psize,
-					 mmu_kernel_ssize);
-	BUG_ON(mapped < 0);
+	int rc = htab_remove_mapping(start, start + page_size,
+				     mmu_vmemmap_psize,
+				     mmu_kernel_ssize);
+	BUG_ON((rc < 0) && (rc != -ENOENT));
+	WARN_ON(rc == -ENOENT);
 }
 #endif
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 477290a..92d472d 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -505,7 +505,7 @@ static void pSeries_lpar_hugepage_invalidate(unsigned long vsid,
 }
 #endif
 
-static void pSeries_lpar_hpte_removebolted(unsigned long ea,
+static long pSeries_lpar_hpte_removebolted(unsigned long ea,
 					   int psize, int ssize)
 {
 	unsigned long vpn;
@@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned long ea,
 	vpn = hpt_vpn(ea, vsid, ssize);
 
 	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
-	BUG_ON(slot == -1);
+	if (slot == -1)
+		return -ENOENT;
+
 	/*
 	 * lpar doesn't use the passed actual page size
 	 */
 	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
+	return 0;
 }
 
 /*
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (2 preceding siblings ...)
  2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
@ 2016-01-29  5:23 ` David Gibson
  2016-02-01  6:29   ` Anshuman Khandual
                     ` (2 more replies)
  2016-01-29  5:23 ` [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper David Gibson
                   ` (5 subsequent siblings)
  9 siblings, 3 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

This makes a number of cleanups to handling of mapping failures during
memory hotplug on Power:

For errors creating the linear mapping for the hot-added region:
  * This is now reported with EFAULT which is more appropriate than the
    previous EINVAL (the failure is unlikely to be related to the
    function's parameters)
  * An error in this path now prints a warning message, rather than just
    silently failing to add the extra memory.
  * Previously a failure here could result in the region being partially
    mapped.  We now clean up any partial mapping before failing.

For errors creating the vmemmap for the hot-added region:
   * This is now reported with EFAULT instead of causing a BUG() - this
     could happen for external reason (e.g. full hash table) so it's better
     to handle this non-fatally
   * An error message is also printed, so the failure won't be silent
   * As above a failure could cause a partially mapped region, we now
     clean this up.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/mm/hash_utils_64.c | 13 ++++++++++---
 arch/powerpc/mm/init_64.c       | 38 ++++++++++++++++++++++++++------------
 arch/powerpc/mm/mem.c           | 10 ++++++++--
 3 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0737eae..e88a86e 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -635,9 +635,16 @@ static unsigned long __init htab_get_table_size(void)
 #ifdef CONFIG_MEMORY_HOTPLUG
 int create_section_mapping(unsigned long start, unsigned long end)
 {
-	return htab_bolt_mapping(start, end, __pa(start),
-				 pgprot_val(PAGE_KERNEL), mmu_linear_psize,
-				 mmu_kernel_ssize);
+	int rc = htab_bolt_mapping(start, end, __pa(start),
+				   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
+				   mmu_kernel_ssize);
+
+	if (rc < 0) {
+		int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
+					      mmu_kernel_ssize);
+		BUG_ON(rc2 && (rc2 != -ENOENT));
+	}
+	return rc;
 }
 
 int remove_section_mapping(unsigned long start, unsigned long end)
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index baa1a23..fbc9448 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -188,9 +188,9 @@ static int __meminit vmemmap_populated(unsigned long start, int page_size)
  */
 
 #ifdef CONFIG_PPC_BOOK3E
-static void __meminit vmemmap_create_mapping(unsigned long start,
-					     unsigned long page_size,
-					     unsigned long phys)
+static int __meminit vmemmap_create_mapping(unsigned long start,
+					    unsigned long page_size,
+					    unsigned long phys)
 {
 	/* Create a PTE encoding without page size */
 	unsigned long i, flags = _PAGE_PRESENT | _PAGE_ACCESSED |
@@ -208,6 +208,8 @@ static void __meminit vmemmap_create_mapping(unsigned long start,
 	 */
 	for (i = 0; i < page_size; i += PAGE_SIZE)
 		BUG_ON(map_kernel_page(start + i, phys, flags));
+
+	return 0;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
 }
 #endif
 #else /* CONFIG_PPC_BOOK3E */
-static void __meminit vmemmap_create_mapping(unsigned long start,
-					     unsigned long page_size,
-					     unsigned long phys)
+static int __meminit vmemmap_create_mapping(unsigned long start,
+					    unsigned long page_size,
+					    unsigned long phys)
 {
-	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
-					pgprot_val(PAGE_KERNEL),
-					mmu_vmemmap_psize,
-					mmu_kernel_ssize);
-	BUG_ON(mapped < 0);
+	int rc = htab_bolt_mapping(start, start + page_size, phys,
+				   pgprot_val(PAGE_KERNEL),
+				   mmu_vmemmap_psize, mmu_kernel_ssize);
+	if (rc < 0) {
+		int rc2 = htab_remove_mapping(start, start + page_size,
+					      mmu_vmemmap_psize,
+					      mmu_kernel_ssize);
+		BUG_ON(rc2 && (rc2 != -ENOENT));
+	}
+	return rc;
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
@@ -304,6 +311,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 
 	for (; start < end; start += page_size) {
 		void *p;
+		int rc;
 
 		if (vmemmap_populated(start, page_size))
 			continue;
@@ -317,7 +325,13 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 		pr_debug("      * %016lx..%016lx allocated at %p\n",
 			 start, start + page_size, p);
 
-		vmemmap_create_mapping(start, page_size, __pa(p));
+		rc = vmemmap_create_mapping(start, page_size, __pa(p));
+		if (rc < 0) {
+			pr_warning(
+				"vmemmap_populate: Unable to create vmemmap mapping: %d\n",
+				rc);
+			return -EFAULT;
+		}
 	}
 
 	return 0;
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 22d94c3..8ffc1e2 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -119,12 +119,18 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
 	struct zone *zone;
 	unsigned long start_pfn = start >> PAGE_SHIFT;
 	unsigned long nr_pages = size >> PAGE_SHIFT;
+	int rc;
 
 	pgdata = NODE_DATA(nid);
 
 	start = (unsigned long)__va(start);
-	if (create_section_mapping(start, start + size))
-		return -EINVAL;
+	rc = create_section_mapping(start, start + size);
+	if (rc) {
+		pr_warning(
+			"Unable to create mapping for hot added memory 0x%llx..0x%llx: %d\n",
+			start, start + size, rc);
+		return -EFAULT;
+	}
 
 	/* this should work for most non-highmem platforms */
 	zone = pgdata->node_zones +
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (3 preceding siblings ...)
  2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
@ 2016-01-29  5:23 ` David Gibson
  2016-02-01  7:04   ` Anshuman Khandual
  2016-01-29  5:24 ` [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing David Gibson
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:23 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

htab_get_table_size() either retrieve the size of the hash page table (HPT)
from the device tree - if the HPT size is determined by firmware - or
uses a heuristic to determine a good size based on RAM size if the kernel
is responsible for allocating the HPT.

To support a PAPR extension allowing resizing of the HPT, we're going to
want the memory size -> HPT size logic elsewhere, so split it out into a
helper function.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/mmu-hash64.h |  3 +++
 arch/powerpc/mm/hash_utils_64.c       | 30 +++++++++++++++++-------------
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
index 7352d3f..cf070fd 100644
--- a/arch/powerpc/include/asm/mmu-hash64.h
+++ b/arch/powerpc/include/asm/mmu-hash64.h
@@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
 	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
 	return get_vsid(context, ea, ssize);
 }
+
+unsigned htab_shift_for_mem_size(unsigned long mem_size);
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index e88a86e..d63f7dc 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
 	return 0;
 }
 
-static unsigned long __init htab_get_table_size(void)
+unsigned htab_shift_for_mem_size(unsigned long mem_size)
 {
-	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
+	unsigned memshift = __ilog2(mem_size);
+	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
+	unsigned pteg_shift;
+
+	/* round mem_size up to next power of 2 */
+	if ((1UL << memshift) < mem_size)
+		memshift += 1;
+
+	/* aim for 2 pages / pteg */
+	pteg_shift = memshift - (pshift + 1);
+
+	return max(pteg_shift + 7, 18U);
+}
 
+static unsigned long __init htab_get_table_size(void)
+{
 	/* If hash size isn't already provided by the platform, we try to
 	 * retrieve it from the device-tree. If it's not there neither, we
 	 * calculate it now based on the total RAM size
@@ -619,17 +633,7 @@ static unsigned long __init htab_get_table_size(void)
 	if (ppc64_pft_size)
 		return 1UL << ppc64_pft_size;
 
-	/* round mem_size up to next power of 2 */
-	mem_size = memblock_phys_mem_size();
-	rnd_mem_size = 1UL << __ilog2(mem_size);
-	if (rnd_mem_size < mem_size)
-		rnd_mem_size <<= 1;
-
-	/* # pages / 2 */
-	psize = mmu_psize_defs[mmu_virtual_psize].shift;
-	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
-
-	return pteg_count << 7;
+	return htab_shift_for_mem_size(memblock_phys_mem_size());
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (4 preceding siblings ...)
  2016-01-29  5:23 ` [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper David Gibson
@ 2016-01-29  5:24 ` David Gibson
  2016-02-01  7:11   ` Anshuman Khandual
  2016-02-08  5:58   ` Paul Mackerras
  2016-01-29  5:24 ` [RFCv2 7/9] pseries: Add support for hash " David Gibson
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:24 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

This adds the hypercall numbers and wrapper functions for the hash page
table resizing hypercalls.

These are experimental "platform specific" values for now, until we have a
formal PAPR update.

It also adds a new firmware feature flat to track the presence of the
HPT resizing calls.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/firmware.h       |  5 +++--
 arch/powerpc/include/asm/hvcall.h         |  2 ++
 arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
 arch/powerpc/platforms/pseries/firmware.c |  1 +
 4 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
index b062924..32435d2 100644
--- a/arch/powerpc/include/asm/firmware.h
+++ b/arch/powerpc/include/asm/firmware.h
@@ -42,7 +42,7 @@
 #define FW_FEATURE_SPLPAR	ASM_CONST(0x0000000000100000)
 #define FW_FEATURE_LPAR		ASM_CONST(0x0000000000400000)
 #define FW_FEATURE_PS3_LV1	ASM_CONST(0x0000000000800000)
-/* Free				ASM_CONST(0x0000000001000000) */
+#define FW_FEATURE_HPT_RESIZE	ASM_CONST(0x0000000001000000)
 #define FW_FEATURE_CMO		ASM_CONST(0x0000000002000000)
 #define FW_FEATURE_VPHN		ASM_CONST(0x0000000004000000)
 #define FW_FEATURE_XCMO		ASM_CONST(0x0000000008000000)
@@ -66,7 +66,8 @@ enum {
 		FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
 		FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
 		FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
-		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
+		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
+		FW_FEATURE_HPT_RESIZE,
 	FW_FEATURE_PSERIES_ALWAYS = 0,
 	FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
 	FW_FEATURE_POWERNV_ALWAYS = 0,
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index e3b54dd..195e080 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -293,6 +293,8 @@
 
 /* Platform specific hcalls, used by KVM */
 #define H_RTAS			0xf000
+#define H_RESIZE_HPT_PREPARE	0xf003
+#define H_RESIZE_HPT_COMMIT	0xf004
 
 /* "Platform specific hcalls", provided by PHYP */
 #define H_GET_24X7_CATALOG_PAGE	0xF078
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 1b39424..b7ee6d9 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -242,6 +242,18 @@ static inline long plpar_pte_protect(unsigned long flags, unsigned long ptex,
 	return plpar_hcall_norets(H_PROTECT, flags, ptex, avpn);
 }
 
+static inline long plpar_resize_hpt_prepare(unsigned long flags,
+					    unsigned long shift)
+{
+	return plpar_hcall_norets(H_RESIZE_HPT_PREPARE, flags, shift);
+}
+
+static inline long plpar_resize_hpt_commit(unsigned long flags,
+					   unsigned long shift)
+{
+	return plpar_hcall_norets(H_RESIZE_HPT_COMMIT, flags, shift);
+}
+
 static inline long plpar_tce_get(unsigned long liobn, unsigned long ioba,
 		unsigned long *tce_ret)
 {
diff --git a/arch/powerpc/platforms/pseries/firmware.c b/arch/powerpc/platforms/pseries/firmware.c
index 8c80588..7b287be 100644
--- a/arch/powerpc/platforms/pseries/firmware.c
+++ b/arch/powerpc/platforms/pseries/firmware.c
@@ -63,6 +63,7 @@ hypertas_fw_features_table[] = {
 	{FW_FEATURE_VPHN,		"hcall-vphn"},
 	{FW_FEATURE_SET_MODE,		"hcall-set-mode"},
 	{FW_FEATURE_BEST_ENERGY,	"hcall-best-energy-1*"},
+	{FW_FEATURE_HPT_RESIZE,		"hcall-hpt-resize"},
 };
 
 /* Build up the firmware features bitmask using the contents of
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 7/9] pseries: Add support for hash table resizing
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (5 preceding siblings ...)
  2016-01-29  5:24 ` [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing David Gibson
@ 2016-01-29  5:24 ` David Gibson
  2016-02-01  8:31   ` Anshuman Khandual
  2016-02-08  5:59   ` Paul Mackerras
  2016-01-29  5:24 ` [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS David Gibson
                   ` (2 subsequent siblings)
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:24 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

This adds support for using experimental hypercalls to change the size
of the main hash page table while running as a PAPR guest.  For now these
hypercalls are only in experimental qemu versions.

The interface is two part: first H_RESIZE_HPT_PREPARE is used to allocate
and prepare the new hash table.  This may be slow, but can be done
asynchronously.  Then, H_RESIZE_HPT_COMMIT is used to switch to the new
hash table.  This requires that no CPUs be concurrently updating the HPT,
and so must be run under stop_machine().

This also adds a debugfs file which can be used to manually control
HPT resizing or testing purposes.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/machdep.h    |   1 +
 arch/powerpc/mm/hash_utils_64.c       |  28 +++++++++
 arch/powerpc/platforms/pseries/lpar.c | 110 ++++++++++++++++++++++++++++++++++
 3 files changed, 139 insertions(+)

diff --git a/arch/powerpc/include/asm/machdep.h b/arch/powerpc/include/asm/machdep.h
index a7d3f66..532d795 100644
--- a/arch/powerpc/include/asm/machdep.h
+++ b/arch/powerpc/include/asm/machdep.h
@@ -61,6 +61,7 @@ struct machdep_calls {
 					       unsigned long addr,
 					       unsigned char *hpte_slot_array,
 					       int psize, int ssize, int local);
+	int		(*resize_hpt)(unsigned long shift);
 	/*
 	 * Special for kexec.
 	 * To be called in real mode with interrupts disabled. No locks are
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index d63f7dc..882e409 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -34,6 +34,7 @@
 #include <linux/signal.h>
 #include <linux/memblock.h>
 #include <linux/context_tracking.h>
+#include <linux/debugfs.h>
 
 #include <asm/processor.h>
 #include <asm/pgtable.h>
@@ -1578,3 +1579,30 @@ void setup_initial_memory_limit(phys_addr_t first_memblock_base,
 	/* Finally limit subsequent allocations */
 	memblock_set_current_limit(ppc64_rma_size);
 }
+
+static int ppc64_pft_size_get(void *data, u64 *val)
+{
+	*val = ppc64_pft_size;
+	return 0;
+}
+
+static int ppc64_pft_size_set(void *data, u64 val)
+{
+	if (!ppc_md.resize_hpt)
+		return -ENODEV;
+	return ppc_md.resize_hpt(val);
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_ppc64_pft_size,
+			ppc64_pft_size_get, ppc64_pft_size_set,	"%llu\n");
+
+static int __init hash64_debugfs(void)
+{
+	if (!debugfs_create_file("pft-size", 0600, powerpc_debugfs_root,
+				 NULL, &fops_ppc64_pft_size)) {
+		pr_err("lpar: unable to create ppc64_pft_size debugsfs file\n");
+	}
+
+	return 0;
+}
+machine_device_initcall(pseries, hash64_debugfs);
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 92d472d..ebf02e7 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -27,6 +27,8 @@
 #include <linux/console.h>
 #include <linux/export.h>
 #include <linux/jump_label.h>
+#include <linux/delay.h>
+#include <linux/stop_machine.h>
 #include <asm/processor.h>
 #include <asm/mmu.h>
 #include <asm/page.h>
@@ -603,6 +605,113 @@ static int __init disable_bulk_remove(char *str)
 
 __setup("bulk_remove=", disable_bulk_remove);
 
+#define HPT_RESIZE_TIMEOUT	10000 /* ms */
+
+struct hpt_resize_state {
+	unsigned long shift;
+	int commit_rc;
+};
+
+static int pseries_lpar_resize_hpt_commit(void *data)
+{
+	struct hpt_resize_state *state = data;
+
+	state->commit_rc = plpar_resize_hpt_commit(0, state->shift);
+	if (state->commit_rc != H_SUCCESS)
+		return -EIO;
+
+	/* Hypervisor has transitioned the HTAB, update our globals */
+	ppc64_pft_size = state->shift;
+	htab_size_bytes = 1UL << ppc64_pft_size;
+	htab_hash_mask = (htab_size_bytes >> 7) - 1;
+
+	return 0;
+}
+
+/* Must be called in user context */
+static int pseries_lpar_resize_hpt(unsigned long shift)
+{
+	struct hpt_resize_state state = {
+		.shift = shift,
+		.commit_rc = H_FUNCTION,
+	};
+	unsigned int delay, total_delay = 0;
+	int rc;
+	ktime_t t0, t1, t2;
+
+	might_sleep();
+
+	if (!firmware_has_feature(FW_FEATURE_HPT_RESIZE))
+		return -ENODEV;
+
+	printk(KERN_INFO "lpar: Attempting to resize HPT to shift %lu\n",
+	       shift);
+
+	t0 = ktime_get();
+
+	rc = plpar_resize_hpt_prepare(0, shift);
+	while (H_IS_LONG_BUSY(rc)) {
+		delay = get_longbusy_msecs(rc);
+		total_delay += delay;
+		if (total_delay > HPT_RESIZE_TIMEOUT) {
+			/* prepare call with shift==0 cancels an
+			 * in-progress resize */
+			rc = plpar_resize_hpt_prepare(0, 0);
+			if (rc != H_SUCCESS)
+				printk(KERN_WARNING
+				       "lpar: Unexpected error %d cancelling timed out HPT resize\n",
+				       rc);
+			return -ETIMEDOUT;
+		}
+		msleep(delay);
+		rc = plpar_resize_hpt_prepare(0, shift);
+	};
+
+	switch (rc) {
+	case H_SUCCESS:
+		/* Continue on */
+		break;
+
+	case H_PARAMETER:
+		return -EINVAL;
+	case H_RESOURCE:
+		return -EPERM;
+	default:
+		printk(KERN_WARNING
+		       "lpar: Unexpected error %d from H_RESIZE_HPT_PREPARE\n",
+		       rc);
+		return -EIO;
+	}
+
+	t1 = ktime_get();
+
+	rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);
+
+	t2 = ktime_get();
+
+	if (rc != 0) {
+		switch (state.commit_rc) {
+		case H_PTEG_FULL:
+			printk(KERN_WARNING
+			       "lpar: Hash collision while resizing HPT\n");
+			return -ENOSPC;
+
+		default:
+			printk(KERN_WARNING
+			       "lpar: Unexpected error %d from H_RESIZE_HPT_COMMIT\n",
+			       state.commit_rc);
+			return -EIO;
+		};
+	}
+
+	printk(KERN_INFO
+	       "lpar: HPT resize to shift %lu complete (%lld ms / %lld ms)\n",
+	       shift, (long long) ktime_ms_delta(t1, t0),
+	       (long long) ktime_ms_delta(t2, t1));
+
+	return 0;
+}
+
 void __init hpte_init_lpar(void)
 {
 	ppc_md.hpte_invalidate	= pSeries_lpar_hpte_invalidate;
@@ -614,6 +723,7 @@ void __init hpte_init_lpar(void)
 	ppc_md.flush_hash_range	= pSeries_lpar_flush_hash_range;
 	ppc_md.hpte_clear_all   = pSeries_lpar_hptab_clear;
 	ppc_md.hugepage_invalidate = pSeries_lpar_hugepage_invalidate;
+	ppc_md.resize_hpt = pseries_lpar_resize_hpt;
 }
 
 #ifdef CONFIG_PPC_SMLPAR
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (6 preceding siblings ...)
  2016-01-29  5:24 ` [RFCv2 7/9] pseries: Add support for hash " David Gibson
@ 2016-01-29  5:24 ` David Gibson
  2016-02-01  8:36   ` Anshuman Khandual
  2016-02-08  6:00   ` Paul Mackerras
  2016-01-29  5:24 ` [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove David Gibson
  2016-02-01  5:50 ` [RFCv2 0/9] PAPR hash page table resizing (guest side) Anshuman Khandual
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:24 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

The hypervisor needs to know a guest is capable of using the HPT resizing
PAPR extension in order to make full advantage of it for memory hotplug.

If the hypervisor knows the guest is HPT resize aware, it can size the
initial HPT based on the initial guest RAM size, relying on the guest to
resize the HPT when more memory is hot-added.  Without this, the hypervisor
must size the HPT for the maximum possible guest RAM, which can lead to
a huge waste of space if the guest never actually expends to that maximum
size.

This patch advertises the guest's support for HPT resizing via the
ibm,client-architecture-support OF interface.  Obviously, the actual
encoding in the CAS vector is tentative until the extension is officially
incorporated into PAPR.  For now we use bit 0 of (previously unused) byte 8
of option vector 5.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/prom.h | 1 +
 arch/powerpc/kernel/prom_init.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/prom.h b/arch/powerpc/include/asm/prom.h
index 7f436ba..ef08208 100644
--- a/arch/powerpc/include/asm/prom.h
+++ b/arch/powerpc/include/asm/prom.h
@@ -151,6 +151,7 @@ struct of_drconf_cell {
 #define OV5_XCMO		0x0440	/* Page Coalescing */
 #define OV5_TYPE1_AFFINITY	0x0580	/* Type 1 NUMA affinity */
 #define OV5_PRRN		0x0540	/* Platform Resource Reassignment */
+#define OV5_HPT_RESIZE		0x0880	/* Hash Page Table resizing */
 #define OV5_PFO_HW_RNG		0x0E80	/* PFO Random Number Generator */
 #define OV5_PFO_HW_842		0x0E40	/* PFO Compression Accelerator */
 #define OV5_PFO_HW_ENCR		0x0E20	/* PFO Encryption Accelerator */
diff --git a/arch/powerpc/kernel/prom_init.c b/arch/powerpc/kernel/prom_init.c
index da51925..c6feafb 100644
--- a/arch/powerpc/kernel/prom_init.c
+++ b/arch/powerpc/kernel/prom_init.c
@@ -713,7 +713,7 @@ unsigned char ibm_architecture_vec[] = {
 	OV5_FEAT(OV5_TYPE1_AFFINITY) | OV5_FEAT(OV5_PRRN),
 	0,
 	0,
-	0,
+	OV5_FEAT(OV5_HPT_RESIZE),
 	/* WARNING: The offset of the "number of cores" field below
 	 * must match by the macro below. Update the definition if
 	 * the structure layout changes.
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (7 preceding siblings ...)
  2016-01-29  5:24 ` [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS David Gibson
@ 2016-01-29  5:24 ` David Gibson
  2016-02-01  8:51   ` Anshuman Khandual
  2016-02-08  6:01   ` Paul Mackerras
  2016-02-01  5:50 ` [RFCv2 0/9] PAPR hash page table resizing (guest side) Anshuman Khandual
  9 siblings, 2 replies; 42+ messages in thread
From: David Gibson @ 2016-01-29  5:24 UTC (permalink / raw)
  To: paulus, mpe, benh; +Cc: linuxppc-dev, aik, thuth, lvivier, David Gibson

We've now implemented code in the pseries platform to use the new PAPR
interface to allow resizing the hash page table (HPT) at runtime.

This patch uses that interface to automatically attempt to resize the HPT
when memory is hot added or removed.  This tries to always keep the HPT at
a reasonable size for our current memory size.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
---
 arch/powerpc/include/asm/sparsemem.h |  1 +
 arch/powerpc/mm/hash_utils_64.c      | 29 +++++++++++++++++++++++++++++
 arch/powerpc/mm/mem.c                |  4 ++++
 3 files changed, 34 insertions(+)

diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index f6fc0ee..737335c 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -16,6 +16,7 @@
 #endif /* CONFIG_SPARSEMEM */
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+extern void resize_hpt_for_hotplug(unsigned long new_mem_size);
 extern int create_section_mapping(unsigned long start, unsigned long end);
 extern int remove_section_mapping(unsigned long start, unsigned long end);
 #ifdef CONFIG_NUMA
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 882e409..18cc851 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -638,6 +638,35 @@ static unsigned long __init htab_get_table_size(void)
 }
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+void resize_hpt_for_hotplug(unsigned long new_mem_size)
+{
+	unsigned target_hpt_shift;
+
+	if (!ppc_md.resize_hpt)
+		return;
+
+	target_hpt_shift = htab_shift_for_mem_size(new_mem_size);
+
+	/*
+	 * To avoid lots of HPT resizes if memory size is fluctuating
+	 * across a boundary, we deliberately have some hysterisis
+	 * here: we immediately increase the HPT size if the target
+	 * shift exceeds the current shift, but we won't attempt to
+	 * reduce unless the target shift is at least 2 below the
+	 * current shift
+	 */
+	if ((target_hpt_shift > ppc64_pft_size)
+	    || (target_hpt_shift < (ppc64_pft_size - 1))) {
+		int rc;
+
+		rc = ppc_md.resize_hpt(target_hpt_shift);
+		if (rc)
+			printk(KERN_WARNING
+			       "Unable to resize hash page table to target order %d: %d\n",
+			       target_hpt_shift, rc);
+	}
+}
+
 int create_section_mapping(unsigned long start, unsigned long end)
 {
 	int rc = htab_bolt_mapping(start, end, __pa(start),
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 8ffc1e2..e77f36c 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -121,6 +121,8 @@ int arch_add_memory(int nid, u64 start, u64 size, bool for_device)
 	unsigned long nr_pages = size >> PAGE_SHIFT;
 	int rc;
 
+	resize_hpt_for_hotplug(memblock_phys_mem_size());
+
 	pgdata = NODE_DATA(nid);
 
 	start = (unsigned long)__va(start);
@@ -161,6 +163,8 @@ int arch_remove_memory(u64 start, u64 size)
 	 */
 	vm_unmap_aliases();
 
+	resize_hpt_for_hotplug(memblock_phys_mem_size());
+
 	return ret;
 }
 #endif
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [RFCv2 0/9] PAPR hash page table resizing (guest side)
  2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
                   ` (8 preceding siblings ...)
  2016-01-29  5:24 ` [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove David Gibson
@ 2016-02-01  5:50 ` Anshuman Khandual
  2016-02-02  0:57   ` David Gibson
  9 siblings, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  5:50 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh
  Cc: aik, lvivier, thuth, linuxppc-dev, Nathan Fontenot

On 01/29/2016 10:53 AM, David Gibson wrote:
> Here's a second prototype of the guest side work for runtime resizing
> of the has page table in PAPR guests.
> 
> This is now feature complete.  It implements the resizing, advertises
> it with CAS, and will automatically invoke it to maintain a good HPT
> size when memory is hot-added or hot-removed.
> 
> Patches 1-5 are standalone prerequisite cleanups that I'll be pushing
> concurrently.
> 
> David Gibson (9):
>   memblock: Don't mark memblock_phys_mem_size() as __init
>   arch/powerpc: Clean up error handling for htab_remove_mapping
>   arch/powerpc: Handle removing maybe-present bolted HPTEs
>   arch/powerpc: Clean up memory hotplug failure paths
>   arch/powerpc: Split hash page table sizing heuristic into a helper

A small nit. Please start the above commit message headers as
"powerpc/mm:" instead, which sounds more clear and uniform with
patch series related to other subsystems.

Adding Nathan in the copy, may be he will have some inputs.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init
  2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
@ 2016-02-01  5:50   ` Anshuman Khandual
  2016-02-08  2:46   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  5:50 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:53 AM, David Gibson wrote:
> At the moment memblock_phys_mem_size() is marked as __init, and so is
> discarded after boot.  This is different from most of the memblock
> functions which are marked __init_memblock, and are only discarded after
> boot if memory hotplug is not configured.
> 
> To allow for upcoming code which will need memblock_phys_mem_size() in the
> hotplug path, change it from __init to __init_memblock.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping
  2016-01-29  5:23 ` [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping David Gibson
@ 2016-02-01  5:54   ` Anshuman Khandual
  2016-02-08  2:48   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  5:54 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:53 AM, David Gibson wrote:
> Currently, the only error that htab_remove_mapping() can report is -EINVAL,
> if removal of bolted HPTEs isn't implemeted for this platform.  We make
> a few clean ups to the handling of this:
> 
>  * EINVAL isn't really the right code - there's nothing wrong with the
>    function's arguments - use ENODEV instead

You are right, guess there are other places with this kind of problem as
well.

>  * We were also printing a warning message, but that's a decision better
>    left up to the callers, so remove it
>  * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
>    error, making the warning message irrelevant, so no change is needed
>    there.

It makes it redundant not irrelevant. It still prints a valid reason why
the remove operation failed.

>  * The other caller is remove_section_mapping().  This is called in the
>    memory hot remove path at a point after vmemmap_remove_mapping() so
>    if hpte_removebolted isn't implemented, we'd expect to have already
>    BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
>    really shouldn't be happening.

Right.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
@ 2016-02-01  5:58   ` Anshuman Khandual
  2016-02-02  1:08     ` David Gibson
  2016-02-02 13:49   ` Denis Kirjanov
  2016-02-08  2:54   ` Paul Mackerras
  2 siblings, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  5:58 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:53 AM, David Gibson wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
> 
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
> 
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
> 
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.

Yeah makes sense.

> 
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present

Right.

>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.

Provided the caller of vmemmap_remove_mapping() which is memory hotplug
path must be handling the returned -ENOENT error correctly. Just curious
and want to make sure that any of the memory sections or pages inside the
section must not be left in a state which makes the next call in the
hotplug path fail.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
  2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
@ 2016-02-01  6:29   ` Anshuman Khandual
  2016-02-02 15:04   ` Nathan Fontenot
  2016-02-08  5:47   ` Paul Mackerras
  2 siblings, 0 replies; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  6:29 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:53 AM, David Gibson wrote:
> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
> 
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
> 
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reason (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.

Yeah this greatly improves graceful fall back when when memory mapping
failure happens at the last level during memory hotplug.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
  2016-01-29  5:23 ` [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper David Gibson
@ 2016-02-01  7:04   ` Anshuman Khandual
  2016-02-02  1:04     ` David Gibson
  0 siblings, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  7:04 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:53 AM, David Gibson wrote:
> htab_get_table_size() either retrieve the size of the hash page table (HPT)
> from the device tree - if the HPT size is determined by firmware - or
> uses a heuristic to determine a good size based on RAM size if the kernel
> is responsible for allocating the HPT.
> 
> To support a PAPR extension allowing resizing of the HPT, we're going to
> want the memory size -> HPT size logic elsewhere, so split it out into a
> helper function.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/mmu-hash64.h |  3 +++
>  arch/powerpc/mm/hash_utils_64.c       | 30 +++++++++++++++++-------------
>  2 files changed, 20 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> index 7352d3f..cf070fd 100644
> --- a/arch/powerpc/include/asm/mmu-hash64.h
> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> @@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
>  	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
>  	return get_vsid(context, ea, ssize);
>  }
> +
> +unsigned htab_shift_for_mem_size(unsigned long mem_size);
> +
>  #endif /* __ASSEMBLY__ */
>  
>  #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index e88a86e..d63f7dc 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
>  	return 0;
>  }
>  
> -static unsigned long __init htab_get_table_size(void)
> +unsigned htab_shift_for_mem_size(unsigned long mem_size)
>  {
> -	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
> +	unsigned memshift = __ilog2(mem_size);
> +	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
> +	unsigned pteg_shift;
> +
> +	/* round mem_size up to next power of 2 */
> +	if ((1UL << memshift) < mem_size)
> +		memshift += 1;
> +
> +	/* aim for 2 pages / pteg */

While here I guess its a good opportunity to write couple of lines
about why one PTE group for every two physical pages on the system,
why minimum (1UL << 11 = 2048) number of PTE groups required, why
(1U << 7 = 128) entries per PTE group and also remove the existing
confusing comments above ? Just a suggestion.

> +	pteg_shift = memshift - (pshift + 1);
> +
> +	return max(pteg_shift + 7, 18U);
> +}
>  
> +static unsigned long __init htab_get_table_size(void)
> +{
>  	/* If hash size isn't already provided by the platform, we try to
>  	 * retrieve it from the device-tree. If it's not there neither, we
>  	 * calculate it now based on the total RAM size
> @@ -619,17 +633,7 @@ static unsigned long __init htab_get_table_size(void)
>  	if (ppc64_pft_size)
>  		return 1UL << ppc64_pft_size;
>  
> -	/* round mem_size up to next power of 2 */
> -	mem_size = memblock_phys_mem_size();
> -	rnd_mem_size = 1UL << __ilog2(mem_size);
> -	if (rnd_mem_size < mem_size)
> -		rnd_mem_size <<= 1;
> -
> -	/* # pages / 2 */
> -	psize = mmu_psize_defs[mmu_virtual_psize].shift;
> -	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
> -
> -	return pteg_count << 7;
> +	return htab_shift_for_mem_size(memblock_phys_mem_size());

Would it be 1UL << htab_shift_for_mem_size(memblock_phys_mem_size())
instead ? It was returning the size of the HPT not the shift of HPT
originally or I am missing something here.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-01-29  5:24 ` [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing David Gibson
@ 2016-02-01  7:11   ` Anshuman Khandual
  2016-02-02  0:58     ` David Gibson
  2016-02-08  5:58   ` Paul Mackerras
  1 sibling, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  7:11 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:54 AM, David Gibson wrote:
> This adds the hypercall numbers and wrapper functions for the hash page
> table resizing hypercalls.
> 
> These are experimental "platform specific" values for now, until we have a
> formal PAPR update.
> 
> It also adds a new firmware feature flat to track the presence of the
> HPT resizing calls.

Its a flag   ....................... ^^^^^^^ here.

> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/firmware.h       |  5 +++--
>  arch/powerpc/include/asm/hvcall.h         |  2 ++
>  arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
>  arch/powerpc/platforms/pseries/firmware.c |  1 +
>  4 files changed, 18 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
> index b062924..32435d2 100644
> --- a/arch/powerpc/include/asm/firmware.h
> +++ b/arch/powerpc/include/asm/firmware.h
> @@ -42,7 +42,7 @@
>  #define FW_FEATURE_SPLPAR	ASM_CONST(0x0000000000100000)
>  #define FW_FEATURE_LPAR		ASM_CONST(0x0000000000400000)
>  #define FW_FEATURE_PS3_LV1	ASM_CONST(0x0000000000800000)
> -/* Free				ASM_CONST(0x0000000001000000) */
> +#define FW_FEATURE_HPT_RESIZE	ASM_CONST(0x0000000001000000)
>  #define FW_FEATURE_CMO		ASM_CONST(0x0000000002000000)
>  #define FW_FEATURE_VPHN		ASM_CONST(0x0000000004000000)
>  #define FW_FEATURE_XCMO		ASM_CONST(0x0000000008000000)
> @@ -66,7 +66,8 @@ enum {
>  		FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
>  		FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
>  		FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
> -		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
> +		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
> +		FW_FEATURE_HPT_RESIZE,
>  	FW_FEATURE_PSERIES_ALWAYS = 0,
>  	FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
>  	FW_FEATURE_POWERNV_ALWAYS = 0,
> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> index e3b54dd..195e080 100644
> --- a/arch/powerpc/include/asm/hvcall.h
> +++ b/arch/powerpc/include/asm/hvcall.h
> @@ -293,6 +293,8 @@
>  
>  /* Platform specific hcalls, used by KVM */
>  #define H_RTAS			0xf000
> +#define H_RESIZE_HPT_PREPARE	0xf003
> +#define H_RESIZE_HPT_COMMIT	0xf004

This sound better and matches FW_FEATURE_HPT_RESIZE ?

#define H_HPT_RESIZE_PREPARE	0xf003
#define H_HPT_RESIZE_COMMIT	0xf004

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 7/9] pseries: Add support for hash table resizing
  2016-01-29  5:24 ` [RFCv2 7/9] pseries: Add support for hash " David Gibson
@ 2016-02-01  8:31   ` Anshuman Khandual
  2016-02-01 11:04     ` David Gibson
  2016-02-08  5:59   ` Paul Mackerras
  1 sibling, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  8:31 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:54 AM, David Gibson wrote:

> +
> +static int pseries_lpar_resize_hpt_commit(void *data)
> +{
> +	struct hpt_resize_state *state = data;
> +
> +	state->commit_rc = plpar_resize_hpt_commit(0, state->shift);
> +	if (state->commit_rc != H_SUCCESS)
> +		return -EIO;
> +
> +	/* Hypervisor has transitioned the HTAB, update our globals */
> +	ppc64_pft_size = state->shift;
> +	htab_size_bytes = 1UL << ppc64_pft_size;
> +	htab_hash_mask = (htab_size_bytes >> 7) - 1;
> +
> +	return 0;
> +}
> +

snip

> +/* Must be called in user context */
> +static int pseries_lpar_resize_hpt(unsigned long shift)
> +{
> +	struct hpt_resize_state state = {
> +		.shift = shift,
> +		.commit_rc = H_FUNCTION,

> +
> +	rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);

With my limited knowledge of stop_machine, wondering if the current
or any future version of 'pseries_lpar_resize_hpt_commit' function
can cause HPT change (page fault path) while stop is executing it.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS
  2016-01-29  5:24 ` [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS David Gibson
@ 2016-02-01  8:36   ` Anshuman Khandual
  2016-02-08  6:00   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  8:36 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:54 AM, David Gibson wrote:
> The hypervisor needs to know a guest is capable of using the HPT resizing
> PAPR extension in order to make full advantage of it for memory hotplug.
> 
> If the hypervisor knows the guest is HPT resize aware, it can size the
> initial HPT based on the initial guest RAM size, relying on the guest to
> resize the HPT when more memory is hot-added.  Without this, the hypervisor
> must size the HPT for the maximum possible guest RAM, which can lead to
> a huge waste of space if the guest never actually expends to that maximum
> size.
> 
> This patch advertises the guest's support for HPT resizing via the
> ibm,client-architecture-support OF interface.  Obviously, the actual
> encoding in the CAS vector is tentative until the extension is officially
> incorporated into PAPR.  For now we use bit 0 of (previously unused) byte 8
> of option vector 5.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Anshuman Khandual <khandual@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove
  2016-01-29  5:24 ` [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove David Gibson
@ 2016-02-01  8:51   ` Anshuman Khandual
  2016-02-01 10:55     ` David Gibson
  2016-02-08  6:01   ` Paul Mackerras
  1 sibling, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-01  8:51 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/29/2016 10:54 AM, David Gibson wrote:
>  #ifdef CONFIG_MEMORY_HOTPLUG
> +void resize_hpt_for_hotplug(unsigned long new_mem_size)
> +{
> +	unsigned target_hpt_shift;
> +
> +	if (!ppc_md.resize_hpt)
> +		return;
> +
> +	target_hpt_shift = htab_shift_for_mem_size(new_mem_size);
> +
> +	/*
> +	 * To avoid lots of HPT resizes if memory size is fluctuating
> +	 * across a boundary, we deliberately have some hysterisis


What do you mean by 'memory size is fluctuating across a boundary' ?
Through memory hotplug interface ? Why some one will do that ? I
can understand why we dont have this check in the sysfs debug path
as we would like to test any memory HPT re sizing scenario we want
in any sequence of increase or decrease we want.

Overall the RFC V2 looks pretty good. Looking forward to see the
host side of the code for this feature.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove
  2016-02-01  8:51   ` Anshuman Khandual
@ 2016-02-01 10:55     ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-01 10:55 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1511 bytes --]

On Mon, Feb 01, 2016 at 02:21:46PM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:54 AM, David Gibson wrote:
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > +void resize_hpt_for_hotplug(unsigned long new_mem_size)
> > +{
> > +	unsigned target_hpt_shift;
> > +
> > +	if (!ppc_md.resize_hpt)
> > +		return;
> > +
> > +	target_hpt_shift = htab_shift_for_mem_size(new_mem_size);
> > +
> > +	/*
> > +	 * To avoid lots of HPT resizes if memory size is fluctuating
> > +	 * across a boundary, we deliberately have some hysterisis
> 
> 
> What do you mean by 'memory size is fluctuating across a boundary' ?
> Through memory hotplug interface ? Why some one will do that ?

I was thinking it might be possible to have some management system
that automatically adjusts memory size based on load, and if that
happened to land on a boundary you could get nasty behaviour.

> I
> can understand why we dont have this check in the sysfs debug path
> as we would like to test any memory HPT re sizing scenario we want
> in any sequence of increase or decrease we want.
> 
> Overall the RFC V2 looks pretty good. Looking forward to see the
> host side of the code for this feature.

The qemu host side has been posted to qemu-devel@nongnu.org already.
I haven't started on a KVM HV implementation yet.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 7/9] pseries: Add support for hash table resizing
  2016-02-01  8:31   ` Anshuman Khandual
@ 2016-02-01 11:04     ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-01 11:04 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 1544 bytes --]

On Mon, Feb 01, 2016 at 02:01:09PM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:54 AM, David Gibson wrote:
> 
> > +
> > +static int pseries_lpar_resize_hpt_commit(void *data)
> > +{
> > +	struct hpt_resize_state *state = data;
> > +
> > +	state->commit_rc = plpar_resize_hpt_commit(0, state->shift);
> > +	if (state->commit_rc != H_SUCCESS)
> > +		return -EIO;
> > +
> > +	/* Hypervisor has transitioned the HTAB, update our globals */
> > +	ppc64_pft_size = state->shift;
> > +	htab_size_bytes = 1UL << ppc64_pft_size;
> > +	htab_hash_mask = (htab_size_bytes >> 7) - 1;
> > +
> > +	return 0;
> > +}
> > +
> 
> snip
> 
> > +/* Must be called in user context */
> > +static int pseries_lpar_resize_hpt(unsigned long shift)
> > +{
> > +	struct hpt_resize_state state = {
> > +		.shift = shift,
> > +		.commit_rc = H_FUNCTION,
> 
> > +
> > +	rc = stop_machine(pseries_lpar_resize_hpt_commit, &state, NULL);
> 
> With my limited knowledge of stop_machine, wondering if the current
> or any future version of 'pseries_lpar_resize_hpt_commit' function
> can cause HPT change (page fault path) while stop is executing it.

It can, but the H_RESIZE_HPT_COMMIT hypercall is synchronous so the
cpu executing it can't make any HPT updates during it.  The
stop_machine() prevents any other cpus doing HPT updates.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 0/9] PAPR hash page table resizing (guest side)
  2016-02-01  5:50 ` [RFCv2 0/9] PAPR hash page table resizing (guest side) Anshuman Khandual
@ 2016-02-02  0:57   ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-02  0:57 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev, Nathan Fontenot

[-- Attachment #1: Type: text/plain, Size: 1281 bytes --]

On Mon, Feb 01, 2016 at 11:20:03AM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:53 AM, David Gibson wrote:
> > Here's a second prototype of the guest side work for runtime resizing
> > of the has page table in PAPR guests.
> > 
> > This is now feature complete.  It implements the resizing, advertises
> > it with CAS, and will automatically invoke it to maintain a good HPT
> > size when memory is hot-added or hot-removed.
> > 
> > Patches 1-5 are standalone prerequisite cleanups that I'll be pushing
> > concurrently.
> > 
> > David Gibson (9):
> >   memblock: Don't mark memblock_phys_mem_size() as __init
> >   arch/powerpc: Clean up error handling for htab_remove_mapping
> >   arch/powerpc: Handle removing maybe-present bolted HPTEs
> >   arch/powerpc: Clean up memory hotplug failure paths
> >   arch/powerpc: Split hash page table sizing heuristic into a helper
> 
> A small nit. Please start the above commit message headers as
> "powerpc/mm:" instead, which sounds more clear and uniform with
> patch series related to other subsystems.

Ok.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-02-01  7:11   ` Anshuman Khandual
@ 2016-02-02  0:58     ` David Gibson
  2016-02-04 11:11       ` Anshuman Khandual
  0 siblings, 1 reply; 42+ messages in thread
From: David Gibson @ 2016-02-02  0:58 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2911 bytes --]

On Mon, Feb 01, 2016 at 12:41:31PM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:54 AM, David Gibson wrote:
> > This adds the hypercall numbers and wrapper functions for the hash page
> > table resizing hypercalls.
> > 
> > These are experimental "platform specific" values for now, until we have a
> > formal PAPR update.
> > 
> > It also adds a new firmware feature flat to track the presence of the
> > HPT resizing calls.
> 
> Its a flag   ....................... ^^^^^^^ here.

Oops, thanks.

> 
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  arch/powerpc/include/asm/firmware.h       |  5 +++--
> >  arch/powerpc/include/asm/hvcall.h         |  2 ++
> >  arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
> >  arch/powerpc/platforms/pseries/firmware.c |  1 +
> >  4 files changed, 18 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
> > index b062924..32435d2 100644
> > --- a/arch/powerpc/include/asm/firmware.h
> > +++ b/arch/powerpc/include/asm/firmware.h
> > @@ -42,7 +42,7 @@
> >  #define FW_FEATURE_SPLPAR	ASM_CONST(0x0000000000100000)
> >  #define FW_FEATURE_LPAR		ASM_CONST(0x0000000000400000)
> >  #define FW_FEATURE_PS3_LV1	ASM_CONST(0x0000000000800000)
> > -/* Free				ASM_CONST(0x0000000001000000) */
> > +#define FW_FEATURE_HPT_RESIZE	ASM_CONST(0x0000000001000000)
> >  #define FW_FEATURE_CMO		ASM_CONST(0x0000000002000000)
> >  #define FW_FEATURE_VPHN		ASM_CONST(0x0000000004000000)
> >  #define FW_FEATURE_XCMO		ASM_CONST(0x0000000008000000)
> > @@ -66,7 +66,8 @@ enum {
> >  		FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
> >  		FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
> >  		FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
> > -		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
> > +		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
> > +		FW_FEATURE_HPT_RESIZE,
> >  	FW_FEATURE_PSERIES_ALWAYS = 0,
> >  	FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
> >  	FW_FEATURE_POWERNV_ALWAYS = 0,
> > diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> > index e3b54dd..195e080 100644
> > --- a/arch/powerpc/include/asm/hvcall.h
> > +++ b/arch/powerpc/include/asm/hvcall.h
> > @@ -293,6 +293,8 @@
> >  
> >  /* Platform specific hcalls, used by KVM */
> >  #define H_RTAS			0xf000
> > +#define H_RESIZE_HPT_PREPARE	0xf003
> > +#define H_RESIZE_HPT_COMMIT	0xf004
> 
> This sound better and matches FW_FEATURE_HPT_RESIZE ?

I'm not quite sure what you're suggesting here.

> #define H_HPT_RESIZE_PREPARE	0xf003
> #define H_HPT_RESIZE_COMMIT	0xf004
> 
> 

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
  2016-02-01  7:04   ` Anshuman Khandual
@ 2016-02-02  1:04     ` David Gibson
  2016-02-04 10:56       ` Anshuman Khandual
  0 siblings, 1 reply; 42+ messages in thread
From: David Gibson @ 2016-02-02  1:04 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 4300 bytes --]

On Mon, Feb 01, 2016 at 12:34:32PM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:53 AM, David Gibson wrote:
> > htab_get_table_size() either retrieve the size of the hash page table (HPT)
> > from the device tree - if the HPT size is determined by firmware - or
> > uses a heuristic to determine a good size based on RAM size if the kernel
> > is responsible for allocating the HPT.
> > 
> > To support a PAPR extension allowing resizing of the HPT, we're going to
> > want the memory size -> HPT size logic elsewhere, so split it out into a
> > helper function.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  arch/powerpc/include/asm/mmu-hash64.h |  3 +++
> >  arch/powerpc/mm/hash_utils_64.c       | 30 +++++++++++++++++-------------
> >  2 files changed, 20 insertions(+), 13 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> > index 7352d3f..cf070fd 100644
> > --- a/arch/powerpc/include/asm/mmu-hash64.h
> > +++ b/arch/powerpc/include/asm/mmu-hash64.h
> > @@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
> >  	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
> >  	return get_vsid(context, ea, ssize);
> >  }
> > +
> > +unsigned htab_shift_for_mem_size(unsigned long mem_size);
> > +
> >  #endif /* __ASSEMBLY__ */
> >  
> >  #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
> > diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> > index e88a86e..d63f7dc 100644
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
> >  	return 0;
> >  }
> >  
> > -static unsigned long __init htab_get_table_size(void)
> > +unsigned htab_shift_for_mem_size(unsigned long mem_size)
> >  {
> > -	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
> > +	unsigned memshift = __ilog2(mem_size);
> > +	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
> > +	unsigned pteg_shift;
> > +
> > +	/* round mem_size up to next power of 2 */
> > +	if ((1UL << memshift) < mem_size)
> > +		memshift += 1;
> > +
> > +	/* aim for 2 pages / pteg */
> 
> While here I guess its a good opportunity to write couple of lines
> about why one PTE group for every two physical pages on the system,

Well, that don't really know, it's just copied from the existing code.

> why minimum (1UL << 11 = 2048) number of PTE groups required,

Ok.

> why
> (1U << 7 = 128) entries per PTE group

Um.. what?  Because that's how big a PTEG is, I don't think
re-explaining the HPT structure here is useful.

> and also remove the existing
> confusing comments above ? Just a suggestion.

Not sure which comment you mean.

> 
> > +	pteg_shift = memshift - (pshift + 1);
> > +
> > +	return max(pteg_shift + 7, 18U);
> > +}
> >  
> > +static unsigned long __init htab_get_table_size(void)
> > +{
> >  	/* If hash size isn't already provided by the platform, we try to
> >  	 * retrieve it from the device-tree. If it's not there neither, we
> >  	 * calculate it now based on the total RAM size
> > @@ -619,17 +633,7 @@ static unsigned long __init htab_get_table_size(void)
> >  	if (ppc64_pft_size)
> >  		return 1UL << ppc64_pft_size;
> >  
> > -	/* round mem_size up to next power of 2 */
> > -	mem_size = memblock_phys_mem_size();
> > -	rnd_mem_size = 1UL << __ilog2(mem_size);
> > -	if (rnd_mem_size < mem_size)
> > -		rnd_mem_size <<= 1;
> > -
> > -	/* # pages / 2 */
> > -	psize = mmu_psize_defs[mmu_virtual_psize].shift;
> > -	pteg_count = max(rnd_mem_size >> (psize + 1), 1UL << 11);
> > -
> > -	return pteg_count << 7;
> > +	return htab_shift_for_mem_size(memblock_phys_mem_size());
> 
> Would it be 1UL << htab_shift_for_mem_size(memblock_phys_mem_size())
> instead ? It was returning the size of the HPT not the shift of HPT
> originally or I am missing something here.

Oops, yes.  That would have broken all non-LPAR platforms.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-02-01  5:58   ` Anshuman Khandual
@ 2016-02-02  1:08     ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-02  1:08 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 2601 bytes --]

On Mon, Feb 01, 2016 at 11:28:54AM +0530, Anshuman Khandual wrote:
> On 01/29/2016 10:53 AM, David Gibson wrote:
> > At the moment the hpte_removebolted callback in ppc_md returns void and
> > will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> > place.  This is awkward for the case of cleaning up a mapping which was
> > partially made before failing.
> > 
> > So, we add a return value to hpte_removebolted, and have it return ENOENT
> > in the case that the HPTE to remove didn't exist in the first place.
> > 
> > In the (sole) caller, we propagate errors in hpte_removebolted to its
> > caller to handle.  However, we handle ENOENT specially, continuing to
> > complete the unmapping over the specified range before returning the error
> > to the caller.
> > 
> > This means that htab_remove_mapping() will work sanely on a partially
> > present mapping, removing any HPTEs which are present, while also returning
> > ENOENT to its caller in case it's important there.
> 
> Yeah makes sense.
> 
> > 
> > There are two callers of htab_remove_mapping():
> >    - In remove_section_mapping() we already WARN_ON() any error return,
> >      which is reasonable - in this case the mapping should be fully
> >      present
> 
> Right.
> 
> >    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
> >      just a WARN_ON() in the case of ENOENT, since failing to remove a
> >      mapping that wasn't there in the first place probably shouldn't be
> >      fatal.
> 
> Provided the caller of vmemmap_remove_mapping() which is memory hotplug
> path must be handling the returned -ENOENT error correctly.

vmemmap_remove_mapping() is void, so there's no -ENOENT returned, just
the WARN_ON().

> Just curious
> and want to make sure that any of the memory sections or pages inside the
> section must not be left in a state which makes the next call in the
> hotplug path fail.

So, this situation shouldn't happen - the mapping should be complete -
but there's nothing obvious that the caller should do extra.  It asked
that the mapping be removed, and we discovered that some of it wasn't
there to begin with.  Whether we can continue safely depends on
what exactly caused the mapping not to be fully present in the first
place, and whether that had other conseuqences, but we have no way of
knowing that here.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
  2016-02-01  5:58   ` Anshuman Khandual
@ 2016-02-02 13:49   ` Denis Kirjanov
  2016-02-08  2:54   ` Paul Mackerras
  2 siblings, 0 replies; 42+ messages in thread
From: Denis Kirjanov @ 2016-02-02 13:49 UTC (permalink / raw)
  To: David Gibson; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

On 1/29/16, David Gibson <david@gibson.dropbear.id.au> wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
>
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
>
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
>
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
>
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
>
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/include/asm/machdep.h    |  2 +-
>  arch/powerpc/mm/hash_utils_64.c       | 10 +++++++---
>  arch/powerpc/mm/init_64.c             |  9 +++++----
>  arch/powerpc/platforms/pseries/lpar.c |  7 +++++--
>  4 files changed, 18 insertions(+), 10 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/machdep.h
> b/arch/powerpc/include/asm/machdep.h
> index 3f191f5..a7d3f66 100644
> --- a/arch/powerpc/include/asm/machdep.h
> +++ b/arch/powerpc/include/asm/machdep.h
> @@ -54,7 +54,7 @@ struct machdep_calls {
>  				       int psize, int apsize,
>  				       int ssize);
>  	long		(*hpte_remove)(unsigned long hpte_group);
> -	void            (*hpte_removebolted)(unsigned long ea,
> +	long            (*hpte_removebolted)(unsigned long ea,
>  					     int psize, int ssize);
>  	void		(*flush_hash_range)(unsigned long number, int local);
>  	void		(*hugepage_invalidate)(unsigned long vsid,
> diff --git a/arch/powerpc/mm/hash_utils_64.c
> b/arch/powerpc/mm/hash_utils_64.c
> index 9f7d727..0737eae 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned
> long vend,
>  {
>  	unsigned long vaddr;
>  	unsigned int step, shift;
> +	int rc = 0;
>
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
> @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned
> long vend,
>  	if (!ppc_md.hpte_removebolted)
>  		return -ENODEV;
>
> -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
but the function proto return type is long.

> +		if ((rc < 0) && (rc != -ENOENT))
> +			return rc;
> +	}
>
> -	return 0;
> +	return rc;
>  }
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>
> diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
> index 379a6a9..baa1a23 100644
> --- a/arch/powerpc/mm/init_64.c
> +++ b/arch/powerpc/mm/init_64.c
> @@ -232,10 +232,11 @@ static void __meminit vmemmap_create_mapping(unsigned
> long start,
>  static void vmemmap_remove_mapping(unsigned long start,
>  				   unsigned long page_size)
>  {
> -	int mapped = htab_remove_mapping(start, start + page_size,
> -					 mmu_vmemmap_psize,
> -					 mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_remove_mapping(start, start + page_size,
> +				     mmu_vmemmap_psize,
> +				     mmu_kernel_ssize);
> +	BUG_ON((rc < 0) && (rc != -ENOENT));
> +	WARN_ON(rc == -ENOENT);
>  }
>  #endif
>
> diff --git a/arch/powerpc/platforms/pseries/lpar.c
> b/arch/powerpc/platforms/pseries/lpar.c
> index 477290a..92d472d 100644
> --- a/arch/powerpc/platforms/pseries/lpar.c
> +++ b/arch/powerpc/platforms/pseries/lpar.c
> @@ -505,7 +505,7 @@ static void pSeries_lpar_hugepage_invalidate(unsigned
> long vsid,
>  }
>  #endif
>
> -static void pSeries_lpar_hpte_removebolted(unsigned long ea,
> +static long pSeries_lpar_hpte_removebolted(unsigned long ea,
>  					   int psize, int ssize)
>  {
>  	unsigned long vpn;
> @@ -515,11 +515,14 @@ static void pSeries_lpar_hpte_removebolted(unsigned
> long ea,
>  	vpn = hpt_vpn(ea, vsid, ssize);
>
>  	slot = pSeries_lpar_hpte_find(vpn, psize, ssize);
> -	BUG_ON(slot == -1);
> +	if (slot == -1)
> +		return -ENOENT;
> +
>  	/*
>  	 * lpar doesn't use the passed actual page size
>  	 */
>  	pSeries_lpar_hpte_invalidate(slot, vpn, psize, 0, ssize, 0);
> +	return 0;
>  }
>
>  /*
> --
> 2.5.0
>
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
  2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
  2016-02-01  6:29   ` Anshuman Khandual
@ 2016-02-02 15:04   ` Nathan Fontenot
  2016-02-03  4:31     ` David Gibson
  2016-02-08  5:47   ` Paul Mackerras
  2 siblings, 1 reply; 42+ messages in thread
From: Nathan Fontenot @ 2016-02-02 15:04 UTC (permalink / raw)
  To: David Gibson, paulus, mpe, benh; +Cc: aik, lvivier, thuth, linuxppc-dev

On 01/28/2016 11:23 PM, David Gibson wrote:
> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
> 
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
> 
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reason (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> ---
>  arch/powerpc/mm/hash_utils_64.c | 13 ++++++++++---
>  arch/powerpc/mm/init_64.c       | 38 ++++++++++++++++++++++++++------------
>  arch/powerpc/mm/mem.c           | 10 ++++++++--
>  3 files changed, 44 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index 0737eae..e88a86e 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -635,9 +635,16 @@ static unsigned long __init htab_get_table_size(void)
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  int create_section_mapping(unsigned long start, unsigned long end)
>  {
> -	return htab_bolt_mapping(start, end, __pa(start),
> -				 pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> -				 mmu_kernel_ssize);
> +	int rc = htab_bolt_mapping(start, end, __pa(start),
> +				   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> +				   mmu_kernel_ssize);
> +
> +	if (rc < 0) {
> +		int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
> +					      mmu_kernel_ssize);
> +		BUG_ON(rc2 && (rc2 != -ENOENT));
> +	}
> +	return rc;
>  }
>  

<-- snip -->

>  #ifdef CONFIG_MEMORY_HOTPLUG
> @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
>  }
>  #endif
>  #else /* CONFIG_PPC_BOOK3E */
> -static void __meminit vmemmap_create_mapping(unsigned long start,
> -					     unsigned long page_size,
> -					     unsigned long phys)
> +static int __meminit vmemmap_create_mapping(unsigned long start,
> +					    unsigned long page_size,
> +					    unsigned long phys)
>  {
> -	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
> -					pgprot_val(PAGE_KERNEL),
> -					mmu_vmemmap_psize,
> -					mmu_kernel_ssize);
> -	BUG_ON(mapped < 0);
> +	int rc = htab_bolt_mapping(start, start + page_size, phys,
> +				   pgprot_val(PAGE_KERNEL),
> +				   mmu_vmemmap_psize, mmu_kernel_ssize);
> +	if (rc < 0) {
> +		int rc2 = htab_remove_mapping(start, start + page_size,
> +					      mmu_vmemmap_psize,
> +					      mmu_kernel_ssize);
> +		BUG_ON(rc2 && (rc2 != -ENOENT));
> +	}
> +	return rc;
>  }
>  

If I'm reading this correctly it appears that create_section_mapping() and
vmemmap_create_mapping() for !PPC_BOOK3E are identical. Any reason to not
have one routine, perhaps just have vmemmap_create_mapping() just call
create_section_mapping()?

-Nathan

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
  2016-02-02 15:04   ` Nathan Fontenot
@ 2016-02-03  4:31     ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-03  4:31 UTC (permalink / raw)
  To: Nathan Fontenot; +Cc: paulus, mpe, benh, aik, lvivier, thuth, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 4177 bytes --]

On Tue, Feb 02, 2016 at 09:04:23AM -0600, Nathan Fontenot wrote:
> On 01/28/2016 11:23 PM, David Gibson wrote:
> > This makes a number of cleanups to handling of mapping failures during
> > memory hotplug on Power:
> > 
> > For errors creating the linear mapping for the hot-added region:
> >   * This is now reported with EFAULT which is more appropriate than the
> >     previous EINVAL (the failure is unlikely to be related to the
> >     function's parameters)
> >   * An error in this path now prints a warning message, rather than just
> >     silently failing to add the extra memory.
> >   * Previously a failure here could result in the region being partially
> >     mapped.  We now clean up any partial mapping before failing.
> > 
> > For errors creating the vmemmap for the hot-added region:
> >    * This is now reported with EFAULT instead of causing a BUG() - this
> >      could happen for external reason (e.g. full hash table) so it's better
> >      to handle this non-fatally
> >    * An error message is also printed, so the failure won't be silent
> >    * As above a failure could cause a partially mapped region, we now
> >      clean this up.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> > ---
> >  arch/powerpc/mm/hash_utils_64.c | 13 ++++++++++---
> >  arch/powerpc/mm/init_64.c       | 38 ++++++++++++++++++++++++++------------
> >  arch/powerpc/mm/mem.c           | 10 ++++++++--
> >  3 files changed, 44 insertions(+), 17 deletions(-)
> > 
> > diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> > index 0737eae..e88a86e 100644
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -635,9 +635,16 @@ static unsigned long __init htab_get_table_size(void)
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> >  int create_section_mapping(unsigned long start, unsigned long end)
> >  {
> > -	return htab_bolt_mapping(start, end, __pa(start),
> > -				 pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> > -				 mmu_kernel_ssize);
> > +	int rc = htab_bolt_mapping(start, end, __pa(start),
> > +				   pgprot_val(PAGE_KERNEL), mmu_linear_psize,
> > +				   mmu_kernel_ssize);
> > +
> > +	if (rc < 0) {
> > +		int rc2 = htab_remove_mapping(start, end, mmu_linear_psize,
> > +					      mmu_kernel_ssize);
> > +		BUG_ON(rc2 && (rc2 != -ENOENT));
> > +	}
> > +	return rc;
> >  }
> >  
> 
> <-- snip -->
> 
> >  #ifdef CONFIG_MEMORY_HOTPLUG
> > @@ -217,15 +219,20 @@ static void vmemmap_remove_mapping(unsigned long start,
> >  }
> >  #endif
> >  #else /* CONFIG_PPC_BOOK3E */
> > -static void __meminit vmemmap_create_mapping(unsigned long start,
> > -					     unsigned long page_size,
> > -					     unsigned long phys)
> > +static int __meminit vmemmap_create_mapping(unsigned long start,
> > +					    unsigned long page_size,
> > +					    unsigned long phys)
> >  {
> > -	int  mapped = htab_bolt_mapping(start, start + page_size, phys,
> > -					pgprot_val(PAGE_KERNEL),
> > -					mmu_vmemmap_psize,
> > -					mmu_kernel_ssize);
> > -	BUG_ON(mapped < 0);
> > +	int rc = htab_bolt_mapping(start, start + page_size, phys,
> > +				   pgprot_val(PAGE_KERNEL),
> > +				   mmu_vmemmap_psize, mmu_kernel_ssize);
> > +	if (rc < 0) {
> > +		int rc2 = htab_remove_mapping(start, start + page_size,
> > +					      mmu_vmemmap_psize,
> > +					      mmu_kernel_ssize);
> > +		BUG_ON(rc2 && (rc2 != -ENOENT));
> > +	}
> > +	return rc;
> >  }
> >  
> 
> If I'm reading this correctly it appears that create_section_mapping() and
> vmemmap_create_mapping() for !PPC_BOOK3E are identical. Any reason to not
> have one routine, perhaps just have vmemmap_create_mapping() just call
> create_section_mapping()?

Not really, apart from documenting what they're used for.  They're
both fairly trivial wrappers around htab_bolt_mapping().  I think
cleaning this up is outside the scope of this series though.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
  2016-02-02  1:04     ` David Gibson
@ 2016-02-04 10:56       ` Anshuman Khandual
  2016-02-08  5:57         ` Paul Mackerras
  0 siblings, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-04 10:56 UTC (permalink / raw)
  To: David Gibson; +Cc: lvivier, thuth, aik, paulus, linuxppc-dev, Aneesh Kumar K.V

On 02/02/2016 06:34 AM, David Gibson wrote:
> On Mon, Feb 01, 2016 at 12:34:32PM +0530, Anshuman Khandual wrote:
>> On 01/29/2016 10:53 AM, David Gibson wrote:
>>> htab_get_table_size() either retrieve the size of the hash page table (HPT)
>>> from the device tree - if the HPT size is determined by firmware - or
>>> uses a heuristic to determine a good size based on RAM size if the kernel
>>> is responsible for allocating the HPT.
>>>
>>> To support a PAPR extension allowing resizing of the HPT, we're going to
>>> want the memory size -> HPT size logic elsewhere, so split it out into a
>>> helper function.
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>>  arch/powerpc/include/asm/mmu-hash64.h |  3 +++
>>>  arch/powerpc/mm/hash_utils_64.c       | 30 +++++++++++++++++-------------
>>>  2 files changed, 20 insertions(+), 13 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
>>> index 7352d3f..cf070fd 100644
>>> --- a/arch/powerpc/include/asm/mmu-hash64.h
>>> +++ b/arch/powerpc/include/asm/mmu-hash64.h
>>> @@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
>>>  	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
>>>  	return get_vsid(context, ea, ssize);
>>>  }
>>> +
>>> +unsigned htab_shift_for_mem_size(unsigned long mem_size);
>>> +
>>>  #endif /* __ASSEMBLY__ */
>>>  
>>>  #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
>>> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
>>> index e88a86e..d63f7dc 100644
>>> --- a/arch/powerpc/mm/hash_utils_64.c
>>> +++ b/arch/powerpc/mm/hash_utils_64.c
>>> @@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
>>>  	return 0;
>>>  }
>>>  
>>> -static unsigned long __init htab_get_table_size(void)
>>> +unsigned htab_shift_for_mem_size(unsigned long mem_size)
>>>  {
>>> -	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
>>> +	unsigned memshift = __ilog2(mem_size);
>>> +	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
>>> +	unsigned pteg_shift;
>>> +
>>> +	/* round mem_size up to next power of 2 */
>>> +	if ((1UL << memshift) < mem_size)
>>> +		memshift += 1;
>>> +
>>> +	/* aim for 2 pages / pteg */
>>
>> While here I guess its a good opportunity to write couple of lines
>> about why one PTE group for every two physical pages on the system,
> 
> Well, that don't really know, it's just copied from the existing code.

Aneesh, would you know why ?

> 
>> why minimum (1UL << 11 = 2048) number of PTE groups required,

Aneesh, would you know why ?

> 
> Ok.
> 
>> why
>> (1U << 7 = 128) entries per PTE group
> 
> Um.. what?  Because that's how big a PTEG is, I don't think
> re-explaining the HPT structure here is useful.

Agreed, though think some where these things should be macros not used
as hard coded numbers like this.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-02-02  0:58     ` David Gibson
@ 2016-02-04 11:11       ` Anshuman Khandual
  2016-02-07 22:33         ` David Gibson
  0 siblings, 1 reply; 42+ messages in thread
From: Anshuman Khandual @ 2016-02-04 11:11 UTC (permalink / raw)
  To: David Gibson; +Cc: lvivier, thuth, aik, paulus, linuxppc-dev

On 02/02/2016 06:28 AM, David Gibson wrote:
> On Mon, Feb 01, 2016 at 12:41:31PM +0530, Anshuman Khandual wrote:
>> On 01/29/2016 10:54 AM, David Gibson wrote:
>>> This adds the hypercall numbers and wrapper functions for the hash page
>>> table resizing hypercalls.
>>>
>>> These are experimental "platform specific" values for now, until we have a
>>> formal PAPR update.
>>>
>>> It also adds a new firmware feature flat to track the presence of the
>>> HPT resizing calls.
>>
>> Its a flag   ....................... ^^^^^^^ here.
> 
> Oops, thanks.
> 
>>
>>>
>>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
>>> ---
>>>  arch/powerpc/include/asm/firmware.h       |  5 +++--
>>>  arch/powerpc/include/asm/hvcall.h         |  2 ++
>>>  arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
>>>  arch/powerpc/platforms/pseries/firmware.c |  1 +
>>>  4 files changed, 18 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
>>> index b062924..32435d2 100644
>>> --- a/arch/powerpc/include/asm/firmware.h
>>> +++ b/arch/powerpc/include/asm/firmware.h
>>> @@ -42,7 +42,7 @@
>>>  #define FW_FEATURE_SPLPAR	ASM_CONST(0x0000000000100000)
>>>  #define FW_FEATURE_LPAR		ASM_CONST(0x0000000000400000)
>>>  #define FW_FEATURE_PS3_LV1	ASM_CONST(0x0000000000800000)
>>> -/* Free				ASM_CONST(0x0000000001000000) */
>>> +#define FW_FEATURE_HPT_RESIZE	ASM_CONST(0x0000000001000000)
>>>  #define FW_FEATURE_CMO		ASM_CONST(0x0000000002000000)
>>>  #define FW_FEATURE_VPHN		ASM_CONST(0x0000000004000000)
>>>  #define FW_FEATURE_XCMO		ASM_CONST(0x0000000008000000)
>>> @@ -66,7 +66,8 @@ enum {
>>>  		FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
>>>  		FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
>>>  		FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
>>> -		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
>>> +		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
>>> +		FW_FEATURE_HPT_RESIZE,
>>>  	FW_FEATURE_PSERIES_ALWAYS = 0,
>>>  	FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
>>>  	FW_FEATURE_POWERNV_ALWAYS = 0,
>>> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
>>> index e3b54dd..195e080 100644
>>> --- a/arch/powerpc/include/asm/hvcall.h
>>> +++ b/arch/powerpc/include/asm/hvcall.h
>>> @@ -293,6 +293,8 @@
>>>  
>>>  /* Platform specific hcalls, used by KVM */
>>>  #define H_RTAS			0xf000
>>> +#define H_RESIZE_HPT_PREPARE	0xf003
>>> +#define H_RESIZE_HPT_COMMIT	0xf004
>>
>> This sound better and matches FW_FEATURE_HPT_RESIZE ?
> 
> I'm not quite sure what you're suggesting here.
> 
>> #define H_HPT_RESIZE_PREPARE	0xf003
>> #define H_HPT_RESIZE_COMMIT	0xf004

Just little bit of change of name of the macro like this


H_RESIZE_HPT_PREPARE -->  H_HPT_RESIZE_PREPARE
H_RESIZE_HPT_COMMIT -->  H_HPT_RESIZE_COMMIT

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-02-04 11:11       ` Anshuman Khandual
@ 2016-02-07 22:33         ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-07 22:33 UTC (permalink / raw)
  To: Anshuman Khandual; +Cc: lvivier, thuth, aik, paulus, linuxppc-dev

[-- Attachment #1: Type: text/plain, Size: 3440 bytes --]

On Thu, Feb 04, 2016 at 04:41:10PM +0530, Anshuman Khandual wrote:
> On 02/02/2016 06:28 AM, David Gibson wrote:
> > On Mon, Feb 01, 2016 at 12:41:31PM +0530, Anshuman Khandual wrote:
> >> On 01/29/2016 10:54 AM, David Gibson wrote:
> >>> This adds the hypercall numbers and wrapper functions for the hash page
> >>> table resizing hypercalls.
> >>>
> >>> These are experimental "platform specific" values for now, until we have a
> >>> formal PAPR update.
> >>>
> >>> It also adds a new firmware feature flat to track the presence of the
> >>> HPT resizing calls.
> >>
> >> Its a flag   ....................... ^^^^^^^ here.
> > 
> > Oops, thanks.
> > 
> >>
> >>>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>> ---
> >>>  arch/powerpc/include/asm/firmware.h       |  5 +++--
> >>>  arch/powerpc/include/asm/hvcall.h         |  2 ++
> >>>  arch/powerpc/include/asm/plpar_wrappers.h | 12 ++++++++++++
> >>>  arch/powerpc/platforms/pseries/firmware.c |  1 +
> >>>  4 files changed, 18 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/include/asm/firmware.h b/arch/powerpc/include/asm/firmware.h
> >>> index b062924..32435d2 100644
> >>> --- a/arch/powerpc/include/asm/firmware.h
> >>> +++ b/arch/powerpc/include/asm/firmware.h
> >>> @@ -42,7 +42,7 @@
> >>>  #define FW_FEATURE_SPLPAR	ASM_CONST(0x0000000000100000)
> >>>  #define FW_FEATURE_LPAR		ASM_CONST(0x0000000000400000)
> >>>  #define FW_FEATURE_PS3_LV1	ASM_CONST(0x0000000000800000)
> >>> -/* Free				ASM_CONST(0x0000000001000000) */
> >>> +#define FW_FEATURE_HPT_RESIZE	ASM_CONST(0x0000000001000000)
> >>>  #define FW_FEATURE_CMO		ASM_CONST(0x0000000002000000)
> >>>  #define FW_FEATURE_VPHN		ASM_CONST(0x0000000004000000)
> >>>  #define FW_FEATURE_XCMO		ASM_CONST(0x0000000008000000)
> >>> @@ -66,7 +66,8 @@ enum {
> >>>  		FW_FEATURE_MULTITCE | FW_FEATURE_SPLPAR | FW_FEATURE_LPAR |
> >>>  		FW_FEATURE_CMO | FW_FEATURE_VPHN | FW_FEATURE_XCMO |
> >>>  		FW_FEATURE_SET_MODE | FW_FEATURE_BEST_ENERGY |
> >>> -		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN,
> >>> +		FW_FEATURE_TYPE1_AFFINITY | FW_FEATURE_PRRN |
> >>> +		FW_FEATURE_HPT_RESIZE,
> >>>  	FW_FEATURE_PSERIES_ALWAYS = 0,
> >>>  	FW_FEATURE_POWERNV_POSSIBLE = FW_FEATURE_OPAL,
> >>>  	FW_FEATURE_POWERNV_ALWAYS = 0,
> >>> diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
> >>> index e3b54dd..195e080 100644
> >>> --- a/arch/powerpc/include/asm/hvcall.h
> >>> +++ b/arch/powerpc/include/asm/hvcall.h
> >>> @@ -293,6 +293,8 @@
> >>>  
> >>>  /* Platform specific hcalls, used by KVM */
> >>>  #define H_RTAS			0xf000
> >>> +#define H_RESIZE_HPT_PREPARE	0xf003
> >>> +#define H_RESIZE_HPT_COMMIT	0xf004
> >>
> >> This sound better and matches FW_FEATURE_HPT_RESIZE ?
> > 
> > I'm not quite sure what you're suggesting here.
> > 
> >> #define H_HPT_RESIZE_PREPARE	0xf003
> >> #define H_HPT_RESIZE_COMMIT	0xf004
> 
> Just little bit of change of name of the macro like this
> 
> 
> H_RESIZE_HPT_PREPARE -->  H_HPT_RESIZE_PREPARE
> H_RESIZE_HPT_COMMIT -->  H_HPT_RESIZE_COMMIT

Oh, I see.  Actually, I'm trying to standardize on "resize hpt" rather
than "hpt resize" everywhere.


-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init
  2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
  2016-02-01  5:50   ` Anshuman Khandual
@ 2016-02-08  2:46   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  2:46 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:23:55PM +1100, David Gibson wrote:
> At the moment memblock_phys_mem_size() is marked as __init, and so is
> discarded after boot.  This is different from most of the memblock
> functions which are marked __init_memblock, and are only discarded after
> boot if memory hotplug is not configured.
> 
> To allow for upcoming code which will need memblock_phys_mem_size() in the
> hotplug path, change it from __init to __init_memblock.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping
  2016-01-29  5:23 ` [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping David Gibson
  2016-02-01  5:54   ` Anshuman Khandual
@ 2016-02-08  2:48   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  2:48 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:23:56PM +1100, David Gibson wrote:
> Currently, the only error that htab_remove_mapping() can report is -EINVAL,
> if removal of bolted HPTEs isn't implemeted for this platform.  We make
> a few clean ups to the handling of this:
> 
>  * EINVAL isn't really the right code - there's nothing wrong with the
>    function's arguments - use ENODEV instead
>  * We were also printing a warning message, but that's a decision better
>    left up to the callers, so remove it
>  * One caller is vmemmap_remove_mapping(), which will just BUG_ON() on
>    error, making the warning message irrelevant, so no change is needed
>    there.
>  * The other caller is remove_section_mapping().  This is called in the
>    memory hot remove path at a point after vmemmap_remove_mapping() so
>    if hpte_removebolted isn't implemented, we'd expect to have already
>    BUG()ed anyway.  Put a WARN_ON() here, in lieu of a printk() since this
>    really shouldn't be happening.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
  2016-02-01  5:58   ` Anshuman Khandual
  2016-02-02 13:49   ` Denis Kirjanov
@ 2016-02-08  2:54   ` Paul Mackerras
  2016-02-09  0:43     ` David Gibson
  2 siblings, 1 reply; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  2:54 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:23:57PM +1100, David Gibson wrote:
> At the moment the hpte_removebolted callback in ppc_md returns void and
> will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> place.  This is awkward for the case of cleaning up a mapping which was
> partially made before failing.
> 
> So, we add a return value to hpte_removebolted, and have it return ENOENT
> in the case that the HPTE to remove didn't exist in the first place.
> 
> In the (sole) caller, we propagate errors in hpte_removebolted to its
> caller to handle.  However, we handle ENOENT specially, continuing to
> complete the unmapping over the specified range before returning the error
> to the caller.
> 
> This means that htab_remove_mapping() will work sanely on a partially
> present mapping, removing any HPTEs which are present, while also returning
> ENOENT to its caller in case it's important there.
> 
> There are two callers of htab_remove_mapping():
>    - In remove_section_mapping() we already WARN_ON() any error return,
>      which is reasonable - in this case the mapping should be fully
>      present
>    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
>      just a WARN_ON() in the case of ENOENT, since failing to remove a
>      mapping that wasn't there in the first place probably shouldn't be
>      fatal.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

[snip]

> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  {
>  	unsigned long vaddr;
>  	unsigned int step, shift;
> +	int rc = 0;
>  
>  	shift = mmu_psize_defs[psize].shift;
>  	step = 1 << shift;
> @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
>  	if (!ppc_md.hpte_removebolted)
>  		return -ENODEV;
>  
> -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
> +		if ((rc < 0) && (rc != -ENOENT))
> +			return rc;
> +	}
>  
> -	return 0;
> +	return rc;

This will return the rc from the last hpte_removebolted call, which
might be 0 even if earlier calls had returned -ENOENT.  Or, if the
last call fails with -ENOENT, this will return -ENOENT.  Is that
exactly what you meant?  In the case where some calls to
hpte_removebolted return -ENOENT, I would think we would want a
consistent return value, which could be either 0 or -ENOENT, but it
shouldn't depend on which specific calls fail with -ENOENT, in my
opinion.

Paul.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths
  2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
  2016-02-01  6:29   ` Anshuman Khandual
  2016-02-02 15:04   ` Nathan Fontenot
@ 2016-02-08  5:47   ` Paul Mackerras
  2 siblings, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  5:47 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:23:58PM +1100, David Gibson wrote:
> This makes a number of cleanups to handling of mapping failures during
> memory hotplug on Power:
> 
> For errors creating the linear mapping for the hot-added region:
>   * This is now reported with EFAULT which is more appropriate than the
>     previous EINVAL (the failure is unlikely to be related to the
>     function's parameters)
>   * An error in this path now prints a warning message, rather than just
>     silently failing to add the extra memory.
>   * Previously a failure here could result in the region being partially
>     mapped.  We now clean up any partial mapping before failing.
> 
> For errors creating the vmemmap for the hot-added region:
>    * This is now reported with EFAULT instead of causing a BUG() - this
>      could happen for external reason (e.g. full hash table) so it's better
>      to handle this non-fatally
>    * An error message is also printed, so the failure won't be silent
>    * As above a failure could cause a partially mapped region, we now
>      clean this up.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper
  2016-02-04 10:56       ` Anshuman Khandual
@ 2016-02-08  5:57         ` Paul Mackerras
  0 siblings, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  5:57 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: David Gibson, lvivier, thuth, aik, linuxppc-dev, Aneesh Kumar K.V

On Thu, Feb 04, 2016 at 04:26:20PM +0530, Anshuman Khandual wrote:
> On 02/02/2016 06:34 AM, David Gibson wrote:
> > On Mon, Feb 01, 2016 at 12:34:32PM +0530, Anshuman Khandual wrote:
> >> On 01/29/2016 10:53 AM, David Gibson wrote:
> >>> htab_get_table_size() either retrieve the size of the hash page table (HPT)
> >>> from the device tree - if the HPT size is determined by firmware - or
> >>> uses a heuristic to determine a good size based on RAM size if the kernel
> >>> is responsible for allocating the HPT.
> >>>
> >>> To support a PAPR extension allowing resizing of the HPT, we're going to
> >>> want the memory size -> HPT size logic elsewhere, so split it out into a
> >>> helper function.
> >>>
> >>> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> >>> ---
> >>>  arch/powerpc/include/asm/mmu-hash64.h |  3 +++
> >>>  arch/powerpc/mm/hash_utils_64.c       | 30 +++++++++++++++++-------------
> >>>  2 files changed, 20 insertions(+), 13 deletions(-)
> >>>
> >>> diff --git a/arch/powerpc/include/asm/mmu-hash64.h b/arch/powerpc/include/asm/mmu-hash64.h
> >>> index 7352d3f..cf070fd 100644
> >>> --- a/arch/powerpc/include/asm/mmu-hash64.h
> >>> +++ b/arch/powerpc/include/asm/mmu-hash64.h
> >>> @@ -607,6 +607,9 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
> >>>  	context = (MAX_USER_CONTEXT) + ((ea >> 60) - 0xc) + 1;
> >>>  	return get_vsid(context, ea, ssize);
> >>>  }
> >>> +
> >>> +unsigned htab_shift_for_mem_size(unsigned long mem_size);
> >>> +
> >>>  #endif /* __ASSEMBLY__ */
> >>>  
> >>>  #endif /* _ASM_POWERPC_MMU_HASH64_H_ */
> >>> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> >>> index e88a86e..d63f7dc 100644
> >>> --- a/arch/powerpc/mm/hash_utils_64.c
> >>> +++ b/arch/powerpc/mm/hash_utils_64.c
> >>> @@ -606,10 +606,24 @@ static int __init htab_dt_scan_pftsize(unsigned long node,
> >>>  	return 0;
> >>>  }
> >>>  
> >>> -static unsigned long __init htab_get_table_size(void)
> >>> +unsigned htab_shift_for_mem_size(unsigned long mem_size)
> >>>  {
> >>> -	unsigned long mem_size, rnd_mem_size, pteg_count, psize;
> >>> +	unsigned memshift = __ilog2(mem_size);
> >>> +	unsigned pshift = mmu_psize_defs[mmu_virtual_psize].shift;
> >>> +	unsigned pteg_shift;
> >>> +
> >>> +	/* round mem_size up to next power of 2 */
> >>> +	if ((1UL << memshift) < mem_size)
> >>> +		memshift += 1;
> >>> +
> >>> +	/* aim for 2 pages / pteg */
> >>
> >> While here I guess its a good opportunity to write couple of lines
> >> about why one PTE group for every two physical pages on the system,
> > 
> > Well, that don't really know, it's just copied from the existing code.
> 
> Aneesh, would you know why ?

1 PTEG per 2 pages means 4 HPTEs per page, which means you can map
each page to an average of 4 different virtual addresses.  It's a
heuristic that has been around for a long time and dates back to the
early days of AIX.  For Linux, running on machines which typically
have quite a lot of memory, it's probably overkill.

> > 
> >> why minimum (1UL << 11 = 2048) number of PTE groups required,
> 
> Aneesh, would you know why ?

It's in the architecture, which specifies the minimum size of the HPT
as 256kB.  The reason is because not all of the virtual address bits
are present in the HPT.  That's OK because some of the virtual address
bits are implied by the HPTEG index in the hash table.  If the HPT was
less than 256kB (2048 HPTEGs) there would be the possibility of
collisions where two different virtual addresses could hash to the
same HPTEG and their HPTEs would be impossible to tell apart.

> 
> > 
> > Ok.
> > 
> >> why
> >> (1U << 7 = 128) entries per PTE group
> > 
> > Um.. what?  Because that's how big a PTEG is, I don't think
> > re-explaining the HPT structure here is useful.
> 
> Agreed, though think some where these things should be macros not used
> as hard coded numbers like this.

Using symbols instead of constant numbers is not always clearer.  The
symbol name can give some context (but so can a suitable comment) but
has the cost of obscuring the actual numeric value.

Paul.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing
  2016-01-29  5:24 ` [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing David Gibson
  2016-02-01  7:11   ` Anshuman Khandual
@ 2016-02-08  5:58   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  5:58 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:24:00PM +1100, David Gibson wrote:
> This adds the hypercall numbers and wrapper functions for the hash page
> table resizing hypercalls.
> 
> These are experimental "platform specific" values for now, until we have a
> formal PAPR update.
> 
> It also adds a new firmware feature flat to track the presence of the
> HPT resizing calls.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 7/9] pseries: Add support for hash table resizing
  2016-01-29  5:24 ` [RFCv2 7/9] pseries: Add support for hash " David Gibson
  2016-02-01  8:31   ` Anshuman Khandual
@ 2016-02-08  5:59   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  5:59 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:24:01PM +1100, David Gibson wrote:
> This adds support for using experimental hypercalls to change the size
> of the main hash page table while running as a PAPR guest.  For now these
> hypercalls are only in experimental qemu versions.
> 
> The interface is two part: first H_RESIZE_HPT_PREPARE is used to allocate
> and prepare the new hash table.  This may be slow, but can be done
> asynchronously.  Then, H_RESIZE_HPT_COMMIT is used to switch to the new
> hash table.  This requires that no CPUs be concurrently updating the HPT,
> and so must be run under stop_machine().
> 
> This also adds a debugfs file which can be used to manually control
> HPT resizing or testing purposes.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS
  2016-01-29  5:24 ` [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS David Gibson
  2016-02-01  8:36   ` Anshuman Khandual
@ 2016-02-08  6:00   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  6:00 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:24:02PM +1100, David Gibson wrote:
> The hypervisor needs to know a guest is capable of using the HPT resizing
> PAPR extension in order to make full advantage of it for memory hotplug.
> 
> If the hypervisor knows the guest is HPT resize aware, it can size the
> initial HPT based on the initial guest RAM size, relying on the guest to
> resize the HPT when more memory is hot-added.  Without this, the hypervisor
> must size the HPT for the maximum possible guest RAM, which can lead to
> a huge waste of space if the guest never actually expends to that maximum
> size.
> 
> This patch advertises the guest's support for HPT resizing via the
> ibm,client-architecture-support OF interface.  Obviously, the actual
> encoding in the CAS vector is tentative until the extension is officially
> incorporated into PAPR.  For now we use bit 0 of (previously unused) byte 8
> of option vector 5.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove
  2016-01-29  5:24 ` [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove David Gibson
  2016-02-01  8:51   ` Anshuman Khandual
@ 2016-02-08  6:01   ` Paul Mackerras
  1 sibling, 0 replies; 42+ messages in thread
From: Paul Mackerras @ 2016-02-08  6:01 UTC (permalink / raw)
  To: David Gibson; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

On Fri, Jan 29, 2016 at 04:24:03PM +1100, David Gibson wrote:
> We've now implemented code in the pseries platform to use the new PAPR
> interface to allow resizing the hash page table (HPT) at runtime.
> 
> This patch uses that interface to automatically attempt to resize the HPT
> when memory is hot added or removed.  This tries to always keep the HPT at
> a reasonable size for our current memory size.
> 
> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

Reviewed-by: Paul Mackerras <paulus@samba.org>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs
  2016-02-08  2:54   ` Paul Mackerras
@ 2016-02-09  0:43     ` David Gibson
  0 siblings, 0 replies; 42+ messages in thread
From: David Gibson @ 2016-02-09  0:43 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: mpe, benh, linuxppc-dev, aik, thuth, lvivier

[-- Attachment #1: Type: text/plain, Size: 3276 bytes --]

On Mon, Feb 08, 2016 at 01:54:04PM +1100, Paul Mackerras wrote:
> On Fri, Jan 29, 2016 at 04:23:57PM +1100, David Gibson wrote:
> > At the moment the hpte_removebolted callback in ppc_md returns void and
> > will BUG_ON() if the hpte it's asked to remove doesn't exist in the first
> > place.  This is awkward for the case of cleaning up a mapping which was
> > partially made before failing.
> > 
> > So, we add a return value to hpte_removebolted, and have it return ENOENT
> > in the case that the HPTE to remove didn't exist in the first place.
> > 
> > In the (sole) caller, we propagate errors in hpte_removebolted to its
> > caller to handle.  However, we handle ENOENT specially, continuing to
> > complete the unmapping over the specified range before returning the error
> > to the caller.
> > 
> > This means that htab_remove_mapping() will work sanely on a partially
> > present mapping, removing any HPTEs which are present, while also returning
> > ENOENT to its caller in case it's important there.
> > 
> > There are two callers of htab_remove_mapping():
> >    - In remove_section_mapping() we already WARN_ON() any error return,
> >      which is reasonable - in this case the mapping should be fully
> >      present
> >    - In vmemmap_remove_mapping() we BUG_ON() any error.  We change that to
> >      just a WARN_ON() in the case of ENOENT, since failing to remove a
> >      mapping that wasn't there in the first place probably shouldn't be
> >      fatal.
> > 
> > Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
> 
> [snip]
> 
> > --- a/arch/powerpc/mm/hash_utils_64.c
> > +++ b/arch/powerpc/mm/hash_utils_64.c
> > @@ -269,6 +269,7 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
> >  {
> >  	unsigned long vaddr;
> >  	unsigned int step, shift;
> > +	int rc = 0;
> >  
> >  	shift = mmu_psize_defs[psize].shift;
> >  	step = 1 << shift;
> > @@ -276,10 +277,13 @@ int htab_remove_mapping(unsigned long vstart, unsigned long vend,
> >  	if (!ppc_md.hpte_removebolted)
> >  		return -ENODEV;
> >  
> > -	for (vaddr = vstart; vaddr < vend; vaddr += step)
> > -		ppc_md.hpte_removebolted(vaddr, psize, ssize);
> > +	for (vaddr = vstart; vaddr < vend; vaddr += step) {
> > +		rc = ppc_md.hpte_removebolted(vaddr, psize, ssize);
> > +		if ((rc < 0) && (rc != -ENOENT))
> > +			return rc;
> > +	}
> >  
> > -	return 0;
> > +	return rc;
> 
> This will return the rc from the last hpte_removebolted call, which
> might be 0 even if earlier calls had returned -ENOENT.  Or, if the
> last call fails with -ENOENT, this will return -ENOENT.  Is that
> exactly what you meant?  In the case where some calls to
> hpte_removebolted return -ENOENT, I would think we would want a
> consistent return value, which could be either 0 or -ENOENT, but it
> shouldn't depend on which specific calls fail with -ENOENT, in my
> opinion.

I agree.  The intention was that this returned -ENOENT iff any of the
individual calls did, but I messed up the logic; thanks for the catch.

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2016-02-09  1:20 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-29  5:23 [RFCv2 0/9] PAPR hash page table resizing (guest side) David Gibson
2016-01-29  5:23 ` [RFCv2 1/9] memblock: Don't mark memblock_phys_mem_size() as __init David Gibson
2016-02-01  5:50   ` Anshuman Khandual
2016-02-08  2:46   ` Paul Mackerras
2016-01-29  5:23 ` [RFCv2 2/9] arch/powerpc: Clean up error handling for htab_remove_mapping David Gibson
2016-02-01  5:54   ` Anshuman Khandual
2016-02-08  2:48   ` Paul Mackerras
2016-01-29  5:23 ` [RFCv2 3/9] arch/powerpc: Handle removing maybe-present bolted HPTEs David Gibson
2016-02-01  5:58   ` Anshuman Khandual
2016-02-02  1:08     ` David Gibson
2016-02-02 13:49   ` Denis Kirjanov
2016-02-08  2:54   ` Paul Mackerras
2016-02-09  0:43     ` David Gibson
2016-01-29  5:23 ` [RFCv2 4/9] arch/powerpc: Clean up memory hotplug failure paths David Gibson
2016-02-01  6:29   ` Anshuman Khandual
2016-02-02 15:04   ` Nathan Fontenot
2016-02-03  4:31     ` David Gibson
2016-02-08  5:47   ` Paul Mackerras
2016-01-29  5:23 ` [RFCv2 5/9] arch/powerpc: Split hash page table sizing heuristic into a helper David Gibson
2016-02-01  7:04   ` Anshuman Khandual
2016-02-02  1:04     ` David Gibson
2016-02-04 10:56       ` Anshuman Khandual
2016-02-08  5:57         ` Paul Mackerras
2016-01-29  5:24 ` [RFCv2 6/9] pseries: Add hypercall wrappers for hash page table resizing David Gibson
2016-02-01  7:11   ` Anshuman Khandual
2016-02-02  0:58     ` David Gibson
2016-02-04 11:11       ` Anshuman Khandual
2016-02-07 22:33         ` David Gibson
2016-02-08  5:58   ` Paul Mackerras
2016-01-29  5:24 ` [RFCv2 7/9] pseries: Add support for hash " David Gibson
2016-02-01  8:31   ` Anshuman Khandual
2016-02-01 11:04     ` David Gibson
2016-02-08  5:59   ` Paul Mackerras
2016-01-29  5:24 ` [RFCv2 8/9] pseries: Advertise HPT resizing support via CAS David Gibson
2016-02-01  8:36   ` Anshuman Khandual
2016-02-08  6:00   ` Paul Mackerras
2016-01-29  5:24 ` [RFCv2 9/9] pseries: Automatically resize HPT for memory hot add/remove David Gibson
2016-02-01  8:51   ` Anshuman Khandual
2016-02-01 10:55     ` David Gibson
2016-02-08  6:01   ` Paul Mackerras
2016-02-01  5:50 ` [RFCv2 0/9] PAPR hash page table resizing (guest side) Anshuman Khandual
2016-02-02  0:57   ` David Gibson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.