* [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes
@ 2020-04-06  3:49 Bharata B Rao
  2020-04-06  3:49 ` [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem Bharata B Rao
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Memory unplug has a few bugs which I had attempted to fix earlier
at https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-July/194087.html

Now with Leonardo's patch for PAPR changes that add a separate flag bit
to LMB flags for explicitly identifying hot-removable memory
(https://lore.kernel.org/linuxppc-dev/f55a7b65a43cc9dc7b22385cf9960f8b11d5ce2e.camel@linux.ibm.com/T/#t),
a few other issues around memory unplug on radix can be fixed. This
series is a combination of those fixes.

This series works on top of above mentioned Leonardo's patch.

Bharata B Rao (5):
  powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for
    hot-plugged mem
  powerpc/mm/radix: Create separate mappings for hot-plugged memory
  powerpc/mm/radix: Fix PTE/PMD fragment count for early page table
    mappings
  powerpc/mm/radix: Free PUD table when freeing pagetable
  powerpc/mm/radix: Remove split_kernel_mapping()

 arch/powerpc/include/asm/book3s/64/pgalloc.h  |  11 +-
 arch/powerpc/include/asm/book3s/64/radix.h    |   1 +
 arch/powerpc/include/asm/sparsemem.h          |   1 +
 arch/powerpc/mm/book3s64/pgtable.c            |  31 ++-
 arch/powerpc/mm/book3s64/radix_pgtable.c      | 186 +++++++++++-------
 arch/powerpc/mm/mem.c                         |   5 +
 arch/powerpc/mm/pgtable-frag.c                |   9 +-
 .../platforms/pseries/hotplug-memory.c        |   6 +-
 8 files changed, 167 insertions(+), 83 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
@ 2020-04-06  3:49 ` Bharata B Rao
  2020-04-06  5:33   ` kbuild test robot
  2020-04-06  3:49 ` [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory Bharata B Rao
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

In addition to setting DRCONF_MEM_HOTREMOVABLE for boot-time hot-plugged
memory, we should also set it for memory that gets hot-plugged
post-boot. This ensures that the correct LMB flags value is reflected
in the ibm,dynamic-memory-vN property.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/platforms/pseries/hotplug-memory.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/hotplug-memory.c b/arch/powerpc/platforms/pseries/hotplug-memory.c
index a4d40a3ceea3..6d75f6e182ae 100644
--- a/arch/powerpc/platforms/pseries/hotplug-memory.c
+++ b/arch/powerpc/platforms/pseries/hotplug-memory.c
@@ -395,7 +395,8 @@ static int dlpar_remove_lmb(struct drmem_lmb *lmb)
 
 	invalidate_lmb_associativity_index(lmb);
 	lmb_clear_nid(lmb);
-	lmb->flags &= ~DRCONF_MEM_ASSIGNED;
+	lmb->flags &= ~(DRCONF_MEM_ASSIGNED |
+			DRCONF_MEM_HOTREMOVABLE);
 
 	return 0;
 }
@@ -678,7 +679,8 @@ static int dlpar_add_lmb(struct drmem_lmb *lmb)
 		invalidate_lmb_associativity_index(lmb);
 		lmb_clear_nid(lmb);
 	} else {
-		lmb->flags |= DRCONF_MEM_ASSIGNED;
+		lmb->flags |= (DRCONF_MEM_ASSIGNED |
+			       DRCONF_MEM_HOTREMOVABLE);
 	}
 
 	return rc;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
  2020-04-06  3:49 ` [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem Bharata B Rao
@ 2020-04-06  3:49 ` Bharata B Rao
  2020-06-22 12:46   ` Aneesh Kumar K.V
  2020-04-06  3:49 ` [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings Bharata B Rao
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Memory that gets hot-plugged _during_ boot (as opposed to memory
that gets plugged in after boot) is mapped with 1G mappings
and hence undergoes splitting when it is unplugged. The splitting
code has a few issues:

1. Recursive locking
--------------------
Memory unplug path takes cpu_hotplug_lock and calls stop_machine()
for splitting the mappings. However stop_machine() takes
cpu_hotplug_lock again causing deadlock.

2. BUG: sleeping function called from in_atomic() context
---------------------------------------------------------
Memory unplug path (remove_pagetable) takes the init_mm.page_table_lock
spinlock and later calls stop_machine(), which does wait_for_completion().

3. Bad unlock unbalance
-----------------------
Memory unplug path takes init_mm.page_table_lock spinlock and calls
stop_machine(). The stop_machine thread function runs in a different
thread context (migration thread) which tries to release and reacquire
ptl. Releasing ptl from a different thread than the one that acquired it
causes a bad unlock unbalance.

These problems can be avoided if we avoid mapping hot-plugged memory
with 1G mappings, thereby removing the need to split them during
unplug. During radix init, identify(*) the hot-plugged memory region
and create separate mappings for each LMB so that they don't get mapped
with 1G mappings.

To create separate mappings for every LMB in the hot-plugged
region, we need the lmb-size. I am currently using the
memory_block_size_bytes() API to get the lmb-size. Since this is early
init time code, the machine type isn't probed yet and hence
memory_block_size_bytes() returns the default LMB size of 16MB. Hence
we end up creating separate mappings at a much lower granularity than
what we could ideally do for a pseries machine.

(*) Identifying and differentiating hot-plugged memory from the
boot time memory is now possible with PAPR extension to LMB flags.
(Ref: https://lore.kernel.org/linuxppc-dev/f55a7b65a43cc9dc7b22385cf9960f8b11d5ce2e.camel@linux.ibm.com/T/#t)

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..4a4fb30f6c3d 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -16,6 +16,7 @@
 #include <linux/hugetlb.h>
 #include <linux/string_helpers.h>
 #include <linux/stop_machine.h>
+#include <linux/memory.h>
 
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
@@ -313,6 +314,8 @@ static void __init radix_init_pgtable(void)
 {
 	unsigned long rts_field;
 	struct memblock_region *reg;
+	phys_addr_t addr;
+	u64 lmb_size = memory_block_size_bytes();
 
 	/* We don't support slb for radix */
 	mmu_slb_size = 0;
@@ -331,9 +334,15 @@ static void __init radix_init_pgtable(void)
 			continue;
 		}
 
-		WARN_ON(create_physical_mapping(reg->base,
-						reg->base + reg->size,
-						-1));
+		if (memblock_is_hotpluggable(reg)) {
+			for (addr = reg->base; addr < (reg->base + reg->size);
+				addr += lmb_size)
+				WARN_ON(create_physical_mapping(addr,
+				addr + lmb_size, -1));
+		} else
+			WARN_ON(create_physical_mapping(reg->base,
+							reg->base + reg->size,
+							-1));
 	}
 
 	/* Find out how many PID bits are supported */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
  2020-04-06  3:49 ` [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem Bharata B Rao
  2020-04-06  3:49 ` [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory Bharata B Rao
@ 2020-04-06  3:49 ` Bharata B Rao
  2020-06-22 12:53   ` Aneesh Kumar K.V
  2020-06-22 13:22   ` Aneesh Kumar K.V
  2020-04-06  3:49 ` [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable Bharata B Rao
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

We can hit the following BUG_ON during memory unplug

kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:344!
Oops: Exception in kernel mode, sig: 5 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
NIP [c000000000097d48] pmd_fragment_free+0x48/0xd0
LR [c0000000016aaefc] remove_pagetable+0x494/0x530
Call Trace:
_raw_spin_lock+0x54/0x80 (unreliable)
remove_pagetable+0x2b0/0x530
radix__remove_section_mapping+0x18/0x2c
remove_section_mapping+0x38/0x5c
arch_remove_memory+0x124/0x190
try_remove_memory+0xd0/0x1c0
__remove_memory+0x20/0x40
dlpar_remove_lmb+0xbc/0x110
dlpar_memory+0xa90/0xd40
handle_dlpar_errorlog+0xa8/0x160
pseries_hp_work_fn+0x2c/0x60
process_one_work+0x47c/0x870
worker_thread+0x364/0x5e0
kthread+0x1b4/0x1c0
ret_from_kernel_thread+0x5c/0x74

This occurs when unplug is attempted for memory that was mapped
using memblock pages as part of early kernel page table setup,
since we wouldn't have initialized the PMD or PTE fragment
count for those PMD or PTE pages.

Fixing this includes 3 parts:

- Re-walk the init_mm page tables from mem_init() and initialize
  the PMD and PTE fragment count to 1.
- When freeing PUD, PMD and PTE page table pages, check explicitly
  if they come from memblock and if so, free them appropriately.
- When we do early memblock based allocation of PMD and PUD pages,
  allocate in PAGE_SIZE granularity so that we are sure the
  complete page is used as pagetable page.

Since we now do PAGE_SIZE allocations for both PUD table and
PMD table (note that PTE table allocation is already of PAGE_SIZE),
we end up allocating more memory for the same amount of system RAM.
Here is a comparison of how much more we need for 64T and 2G
systems after this patch:

1. 64T system
-------------
64T RAM would need 64G for vmemmap with struct page size being 64B.

128 PUD tables for 64T memory (1G mappings)
1 PUD table and 64 PMD tables for 64G vmemmap (2M mappings)

With default PUD[PMD]_TABLE_SIZE(4K), (128+1+64)*4K=772K
With PAGE_SIZE(64K) table allocations, (128+1+64)*64K=12352K

2. 2G system
------------
2G RAM would need 2M for vmemmap with struct page size being 64B.

1 PUD table for 2G memory (1G mapping)
1 PUD table and 1 PMD table for 2M vmemmap (2M mappings)

With default PUD[PMD]_TABLE_SIZE(4K), (1+1+1)*4K=12K
With new PAGE_SIZE(64K) table allocations, (1+1+1)*64K=192K

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 11 ++-
 arch/powerpc/include/asm/book3s/64/radix.h   |  1 +
 arch/powerpc/include/asm/sparsemem.h         |  1 +
 arch/powerpc/mm/book3s64/pgtable.c           | 31 ++++++++-
 arch/powerpc/mm/book3s64/radix_pgtable.c     | 72 ++++++++++++++++++--
 arch/powerpc/mm/mem.c                        |  5 ++
 arch/powerpc/mm/pgtable-frag.c               |  9 ++-
 7 files changed, 121 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index a41e91bd0580..e96572fb2871 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -109,7 +109,16 @@ static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
 
 static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 {
-	kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), pud);
+	struct page *page = virt_to_page(pud);
+
+	/*
+	 * Early pud pages allocated via memblock allocator
+	 * can't be directly freed to slab
+	 */
+	if (PageReserved(page))
+		free_reserved_page(page);
+	else
+		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), pud);
 }
 
 static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index d97db3ad9aae..0aff8750181a 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -291,6 +291,7 @@ static inline unsigned long radix__get_tree_size(void)
 #ifdef CONFIG_MEMORY_HOTPLUG
 int radix__create_section_mapping(unsigned long start, unsigned long end, int nid);
 int radix__remove_section_mapping(unsigned long start, unsigned long end);
+void radix__fixup_pgtable_fragments(void);
 #endif /* CONFIG_MEMORY_HOTPLUG */
 #endif /* __ASSEMBLY__ */
 #endif
diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
index 3192d454a733..e662f9232d35 100644
--- a/arch/powerpc/include/asm/sparsemem.h
+++ b/arch/powerpc/include/asm/sparsemem.h
@@ -15,6 +15,7 @@
 #ifdef CONFIG_MEMORY_HOTPLUG
 extern int create_section_mapping(unsigned long start, unsigned long end, int nid);
 extern int remove_section_mapping(unsigned long start, unsigned long end);
+void fixup_pgtable_fragments(void);
 
 #ifdef CONFIG_PPC_BOOK3S_64
 extern int resize_hpt_for_hotplug(unsigned long new_mem_size);
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 2bf7e1b4fd82..be7aa8786747 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -186,6 +186,13 @@ int __meminit remove_section_mapping(unsigned long start, unsigned long end)
 
 	return hash__remove_section_mapping(start, end);
 }
+
+void fixup_pgtable_fragments(void)
+{
+	if (radix_enabled())
+		radix__fixup_pgtable_fragments();
+}
+
 #endif /* CONFIG_MEMORY_HOTPLUG */
 
 void __init mmu_partition_table_init(void)
@@ -343,13 +350,23 @@ void pmd_fragment_free(unsigned long *pmd)
 
 	BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0);
 	if (atomic_dec_and_test(&page->pt_frag_refcount)) {
-		pgtable_pmd_page_dtor(page);
-		__free_page(page);
+		/*
+		 * Early pmd pages allocated via memblock
+		 * allocator wouldn't have called _ctor
+		 */
+		if (PageReserved(page))
+			free_reserved_page(page);
+		else {
+			pgtable_pmd_page_dtor(page);
+			__free_page(page);
+		}
 	}
 }
 
 static inline void pgtable_free(void *table, int index)
 {
+	struct page *page;
+
 	switch (index) {
 	case PTE_INDEX:
 		pte_fragment_free(table, 0);
@@ -358,7 +375,15 @@ static inline void pgtable_free(void *table, int index)
 		pmd_fragment_free(table);
 		break;
 	case PUD_INDEX:
-		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
+		page = virt_to_page(table);
+		/*
+		 * Early pud pages allocated via memblock
+		 * allocator need to be freed differently
+		 */
+		if (PageReserved(page))
+			free_reserved_page(page);
+		else
+			kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
 		break;
 #if defined(CONFIG_PPC_4K_PAGES) && defined(CONFIG_HUGETLB_PAGE)
 		/* 16M hugepd directory at pud level */
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 4a4fb30f6c3d..e675c0bbf9a4 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -36,6 +36,70 @@
 unsigned int mmu_pid_bits;
 unsigned int mmu_base_pid;
 
+static void fixup_pte_fragments(pmd_t *pmd)
+{
+	int i;
+
+	for (i = 0; i < PTRS_PER_PMD; i++, pmd++) {
+		pte_t *pte;
+		struct page *page;
+
+		if (pmd_none(*pmd))
+			continue;
+		if (pmd_is_leaf(*pmd))
+			continue;
+
+		pte = pte_offset_kernel(pmd, 0);
+		page = virt_to_page(pte);
+		atomic_inc(&page->pt_frag_refcount);
+	}
+}
+
+static void fixup_pmd_fragments(pud_t *pud)
+{
+	int i;
+
+	for (i = 0; i < PTRS_PER_PUD; i++, pud++) {
+		pmd_t *pmd;
+		struct page *page;
+
+		if (pud_none(*pud))
+			continue;
+		if (pud_is_leaf(*pud))
+			continue;
+
+		pmd = pmd_offset(pud, 0);
+		page = virt_to_page(pmd);
+		atomic_inc(&page->pt_frag_refcount);
+		fixup_pte_fragments(pmd);
+	}
+}
+
+/*
+ * Walk the init_mm page tables and fixup the PMD and PTE fragment
+ * counts. This allows the PUD, PMD and PTE pages to be freed
+ * back to buddy allocator properly during memory unplug.
+ */
+void radix__fixup_pgtable_fragments(void)
+{
+	int i;
+	pgd_t *pgd = pgd_offset_k(0UL);
+
+	spin_lock(&init_mm.page_table_lock);
+	for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
+		pud_t *pud;
+
+		if (pgd_none(*pgd))
+			continue;
+		if (pgd_is_leaf(*pgd))
+			continue;
+
+		pud = pud_offset(pgd, 0);
+		fixup_pmd_fragments(pud);
+	}
+	spin_unlock(&init_mm.page_table_lock);
+}
+
 static __ref void *early_alloc_pgtable(unsigned long size, int nid,
 			unsigned long region_start, unsigned long region_end)
 {
@@ -71,8 +135,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
 
 	pgdp = pgd_offset_k(ea);
 	if (pgd_none(*pgdp)) {
-		pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
-						region_start, region_end);
+		pudp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
+					   region_end);
 		pgd_populate(&init_mm, pgdp, pudp);
 	}
 	pudp = pud_offset(pgdp, ea);
@@ -81,8 +145,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
 		goto set_the_pte;
 	}
 	if (pud_none(*pudp)) {
-		pmdp = early_alloc_pgtable(PMD_TABLE_SIZE, nid,
-						region_start, region_end);
+		pmdp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
+					   region_end);
 		pud_populate(&init_mm, pudp, pmdp);
 	}
 	pmdp = pmd_offset(pudp, ea);
diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
index 1c07d5a3f543..d43ad701f693 100644
--- a/arch/powerpc/mm/mem.c
+++ b/arch/powerpc/mm/mem.c
@@ -53,6 +53,10 @@
 
 #include <mm/mmu_decl.h>
 
+void __weak fixup_pgtable_fragments(void)
+{
+}
+
 #ifndef CPU_FTR_COHERENT_ICACHE
 #define CPU_FTR_COHERENT_ICACHE	0	/* XXX for now */
 #define CPU_FTR_NOEXECUTE	0
@@ -307,6 +311,7 @@ void __init mem_init(void)
 
 	memblock_free_all();
 
+	fixup_pgtable_fragments();
 #ifdef CONFIG_HIGHMEM
 	{
 		unsigned long pfn, highmem_mapnr;
diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index ee4bd6d38602..16213c09896a 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -114,6 +114,13 @@ void pte_fragment_free(unsigned long *table, int kernel)
 	if (atomic_dec_and_test(&page->pt_frag_refcount)) {
 		if (!kernel)
 			pgtable_pte_page_dtor(page);
-		__free_page(page);
+		/*
+		 * Early pte pages allocated via memblock
+		 * allocator need to be freed differently
+		 */
+		if (PageReserved(page))
+			free_reserved_page(page);
+		else
+			__free_page(page);
 	}
 }
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
                   ` (2 preceding siblings ...)
  2020-04-06  3:49 ` [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings Bharata B Rao
@ 2020-04-06  3:49 ` Bharata B Rao
  2020-06-22 13:07   ` Aneesh Kumar K.V
  2020-04-06  3:49 ` [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping() Bharata B Rao
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

remove_pagetable() isn't freeing the PUD table. This causes a memory
leak during memory unplug. Fix this.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index e675c0bbf9a4..0d9ef3277579 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -767,6 +767,21 @@ static void free_pmd_table(pmd_t *pmd_start, pud_t *pud)
 	pud_clear(pud);
 }
 
+static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
+{
+	pud_t *pud;
+	int i;
+
+	for (i = 0; i < PTRS_PER_PUD; i++) {
+		pud = pud_start + i;
+		if (!pud_none(*pud))
+			return;
+	}
+
+	pud_free(&init_mm, pud_start);
+	pgd_clear(pgd);
+}
+
 struct change_mapping_params {
 	pte_t *pte;
 	unsigned long start;
@@ -937,6 +952,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
 
 		pud_base = (pud_t *)pgd_page_vaddr(*pgd);
 		remove_pud_table(pud_base, addr, next);
+		free_pud_table(pud_base, pgd);
 	}
 
 	spin_unlock(&init_mm.page_table_lock);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping()
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
                   ` (3 preceding siblings ...)
  2020-04-06  3:49 ` [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable Bharata B Rao
@ 2020-04-06  3:49 ` Bharata B Rao
  2020-06-22 13:07   ` Aneesh Kumar K.V
  2020-04-09  4:31 ` [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
  2020-05-20  4:34 ` Bharata B Rao
  6 siblings, 1 reply; 14+ messages in thread
From: Bharata B Rao @ 2020-04-06  3:49 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

With hot-plugged memory now always getting mapped with 2M mappings,
there is no longer any need to split mappings during unplug.

Hence remove split_kernel_mapping() and associated code. This is
essentially a revert of
commit 4dd5f8a99e791 ("powerpc/mm/radix: Split linear mapping on hot-unplug")

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
---
 arch/powerpc/mm/book3s64/radix_pgtable.c | 93 +++++-------------------
 1 file changed, 19 insertions(+), 74 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index 0d9ef3277579..56f2c698deac 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -15,7 +15,6 @@
 #include <linux/mm.h>
 #include <linux/hugetlb.h>
 #include <linux/string_helpers.h>
-#include <linux/stop_machine.h>
 #include <linux/memory.h>
 
 #include <asm/pgtable.h>
@@ -782,30 +781,6 @@ static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
 	pgd_clear(pgd);
 }
 
-struct change_mapping_params {
-	pte_t *pte;
-	unsigned long start;
-	unsigned long end;
-	unsigned long aligned_start;
-	unsigned long aligned_end;
-};
-
-static int __meminit stop_machine_change_mapping(void *data)
-{
-	struct change_mapping_params *params =
-			(struct change_mapping_params *)data;
-
-	if (!data)
-		return -1;
-
-	spin_unlock(&init_mm.page_table_lock);
-	pte_clear(&init_mm, params->aligned_start, params->pte);
-	create_physical_mapping(__pa(params->aligned_start), __pa(params->start), -1);
-	create_physical_mapping(__pa(params->end), __pa(params->aligned_end), -1);
-	spin_lock(&init_mm.page_table_lock);
-	return 0;
-}
-
 static void remove_pte_table(pte_t *pte_start, unsigned long addr,
 			     unsigned long end)
 {
@@ -834,52 +809,6 @@ static void remove_pte_table(pte_t *pte_start, unsigned long addr,
 	}
 }
 
-/*
- * clear the pte and potentially split the mapping helper
- */
-static void __meminit split_kernel_mapping(unsigned long addr, unsigned long end,
-				unsigned long size, pte_t *pte)
-{
-	unsigned long mask = ~(size - 1);
-	unsigned long aligned_start = addr & mask;
-	unsigned long aligned_end = addr + size;
-	struct change_mapping_params params;
-	bool split_region = false;
-
-	if ((end - addr) < size) {
-		/*
-		 * We're going to clear the PTE, but not flushed
-		 * the mapping, time to remap and flush. The
-		 * effects if visible outside the processor or
-		 * if we are running in code close to the
-		 * mapping we cleared, we are in trouble.
-		 */
-		if (overlaps_kernel_text(aligned_start, addr) ||
-			overlaps_kernel_text(end, aligned_end)) {
-			/*
-			 * Hack, just return, don't pte_clear
-			 */
-			WARN_ONCE(1, "Linear mapping %lx->%lx overlaps kernel "
-				  "text, not splitting\n", addr, end);
-			return;
-		}
-		split_region = true;
-	}
-
-	if (split_region) {
-		params.pte = pte;
-		params.start = addr;
-		params.end = end;
-		params.aligned_start = addr & ~(size - 1);
-		params.aligned_end = min_t(unsigned long, aligned_end,
-				(unsigned long)__va(memblock_end_of_DRAM()));
-		stop_machine(stop_machine_change_mapping, &params, NULL);
-		return;
-	}
-
-	pte_clear(&init_mm, addr, pte);
-}
-
 static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
 			     unsigned long end)
 {
@@ -895,7 +824,12 @@ static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
 			continue;
 
 		if (pmd_is_leaf(*pmd)) {
-			split_kernel_mapping(addr, end, PMD_SIZE, (pte_t *)pmd);
+			if (!IS_ALIGNED(addr, PMD_SIZE) ||
+			    !IS_ALIGNED(next, PMD_SIZE)) {
+				WARN_ONCE(1, "%s: unaligned range\n", __func__);
+				continue;
+			}
+			pte_clear(&init_mm, addr, (pte_t *)pmd);
 			continue;
 		}
 
@@ -920,7 +854,12 @@ static void remove_pud_table(pud_t *pud_start, unsigned long addr,
 			continue;
 
 		if (pud_is_leaf(*pud)) {
-			split_kernel_mapping(addr, end, PUD_SIZE, (pte_t *)pud);
+			if (!IS_ALIGNED(addr, PUD_SIZE) ||
+			    !IS_ALIGNED(next, PUD_SIZE)) {
+				WARN_ONCE(1, "%s: unaligned range\n", __func__);
+				continue;
+			}
+			pte_clear(&init_mm, addr, (pte_t *)pud);
 			continue;
 		}
 
@@ -946,7 +885,13 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
 			continue;
 
 		if (pgd_is_leaf(*pgd)) {
-			split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
+			if (!IS_ALIGNED(addr, PGDIR_SIZE) ||
+			    !IS_ALIGNED(next, PGDIR_SIZE)) {
+				WARN_ONCE(1, "%s: unaligned range\n", __func__);
+				continue;
+			}
+
+			pte_clear(&init_mm, addr, (pte_t *)pgd);
 			continue;
 		}
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem
  2020-04-06  3:49 ` [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem Bharata B Rao
@ 2020-04-06  5:33   ` kbuild test robot
  0 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2020-04-06  5:33 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 2821 bytes --]

Hi Bharata,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.6 next-20200405]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Bharata-B-Rao/powerpc-mm-radix-Memory-unplug-fixes/20200406-121704
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allmodconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   arch/powerpc/platforms/pseries/hotplug-memory.c: In function 'dlpar_remove_lmb':
>> arch/powerpc/platforms/pseries/hotplug-memory.c:399:4: error: 'DRCONF_MEM_HOTREMOVABLE' undeclared (first use in this function)
     399 |    DRCONF_MEM_HOTREMOVABLE);
         |    ^~~~~~~~~~~~~~~~~~~~~~~
   arch/powerpc/platforms/pseries/hotplug-memory.c:399:4: note: each undeclared identifier is reported only once for each function it appears in
   arch/powerpc/platforms/pseries/hotplug-memory.c: In function 'dlpar_add_lmb':
   arch/powerpc/platforms/pseries/hotplug-memory.c:683:11: error: 'DRCONF_MEM_HOTREMOVABLE' undeclared (first use in this function)
     683 |           DRCONF_MEM_HOTREMOVABLE);
         |           ^~~~~~~~~~~~~~~~~~~~~~~

vim +/DRCONF_MEM_HOTREMOVABLE +399 arch/powerpc/platforms/pseries/hotplug-memory.c

   376	
   377	static int dlpar_remove_lmb(struct drmem_lmb *lmb)
   378	{
   379		unsigned long block_sz;
   380		int rc;
   381	
   382		if (!lmb_is_removable(lmb))
   383			return -EINVAL;
   384	
   385		rc = dlpar_offline_lmb(lmb);
   386		if (rc)
   387			return rc;
   388	
   389		block_sz = pseries_memory_block_size();
   390	
   391		__remove_memory(lmb->nid, lmb->base_addr, block_sz);
   392	
   393		/* Update memory regions for memory remove */
   394		memblock_remove(lmb->base_addr, block_sz);
   395	
   396		invalidate_lmb_associativity_index(lmb);
   397		lmb_clear_nid(lmb);
   398		lmb->flags &= ~(DRCONF_MEM_ASSIGNED |
 > 399				DRCONF_MEM_HOTREMOVABLE);
   400	
   401		return 0;
   402	}
   403	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 64898 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
                   ` (4 preceding siblings ...)
  2020-04-06  3:49 ` [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping() Bharata B Rao
@ 2020-04-09  4:31 ` Bharata B Rao
  2020-05-20  4:34 ` Bharata B Rao
  6 siblings, 0 replies; 14+ messages in thread
From: Bharata B Rao @ 2020-04-09  4:31 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin

On Mon, Apr 06, 2020 at 09:19:20AM +0530, Bharata B Rao wrote:
> Memory unplug has a few bugs which I had attempted to fix earlier
> at https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-July/194087.html
> 
> Now with Leonardo's patch for PAPR changes that add a separate flag bit
> to LMB flags for explicitly identifying hot-removable memory
> (https://lore.kernel.org/linuxppc-dev/f55a7b65a43cc9dc7b22385cf9960f8b11d5ce2e.camel@linux.ibm.com/T/#t),
> a few other issues around memory unplug on radix can be fixed. This
> series is a combination of those fixes.
> 
> This series works on top of above mentioned Leonardo's patch.
> 
> Bharata B Rao (5):
>   powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for
>     hot-plugged mem
>   powerpc/mm/radix: Create separate mappings for hot-plugged memory
>   powerpc/mm/radix: Fix PTE/PMD fragment count for early page table
>     mappings
>   powerpc/mm/radix: Free PUD table when freeing pagetable
>   powerpc/mm/radix: Remove split_kernel_mapping()

3/5 in this series fixes a long-standing bug and multiple versions of it
have been posted outside of this series earlier.

4/5 fixes a memory leak.

I included the above two in this series because with the patches that
explicitly mark the hot-plugged memory (1/5 and 2/5), reproducing the bug
fixed by 3/5 becomes easier.

Hence 3/5 and 4/5 can also be considered standalone fixes.

Regards,
Bharata.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes
  2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
                   ` (5 preceding siblings ...)
  2020-04-09  4:31 ` [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
@ 2020-05-20  4:34 ` Bharata B Rao
  6 siblings, 0 replies; 14+ messages in thread
From: Bharata B Rao @ 2020-05-20  4:34 UTC (permalink / raw)
  To: linuxppc-dev; +Cc: leonardo, aneesh.kumar, npiggin

Aneesh,

Do these memory unplug fixes on radix look fine? Do you want them
rebased on a recent kernel? Would you like me to test any specific
scenario with these fixes?

Regards,
Bharata.
 
On Mon, Apr 06, 2020 at 09:19:20AM +0530, Bharata B Rao wrote:
> Memory unplug has a few bugs which I had attempted to fix earlier
> at https://lists.ozlabs.org/pipermail/linuxppc-dev/2019-July/194087.html
> 
> Now with Leonardo's patch for PAPR changes that add a separate flag bit
> to LMB flags for explicitly identifying hot-removable memory
> (https://lore.kernel.org/linuxppc-dev/f55a7b65a43cc9dc7b22385cf9960f8b11d5ce2e.camel@linux.ibm.com/T/#t),
> a few other issues around memory unplug on radix can be fixed. This
> series is a combination of those fixes.
> 
> This series works on top of above mentioned Leonardo's patch.
> 
> Bharata B Rao (5):
>   powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for
>     hot-plugged mem
>   powerpc/mm/radix: Create separate mappings for hot-plugged memory
>   powerpc/mm/radix: Fix PTE/PMD fragment count for early page table
>     mappings
>   powerpc/mm/radix: Free PUD table when freeing pagetable
>   powerpc/mm/radix: Remove split_kernel_mapping()
> 
>  arch/powerpc/include/asm/book3s/64/pgalloc.h  |  11 +-
>  arch/powerpc/include/asm/book3s/64/radix.h    |   1 +
>  arch/powerpc/include/asm/sparsemem.h          |   1 +
>  arch/powerpc/mm/book3s64/pgtable.c            |  31 ++-
>  arch/powerpc/mm/book3s64/radix_pgtable.c      | 186 +++++++++++-------
>  arch/powerpc/mm/mem.c                         |   5 +
>  arch/powerpc/mm/pgtable-frag.c                |   9 +-
>  .../platforms/pseries/hotplug-memory.c        |   6 +-
>  8 files changed, 167 insertions(+), 83 deletions(-)
> 
> -- 
> 2.21.0


* Re: [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory
  2020-04-06  3:49 ` [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory Bharata B Rao
@ 2020-06-22 12:46   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 14+ messages in thread
From: Aneesh Kumar K.V @ 2020-06-22 12:46 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev
  Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Bharata B Rao <bharata@linux.ibm.com> writes:

> Memory that gets hot-plugged _during_ boot (and not the memory
> that gets plugged in after boot), is mapped with 1G mappings
> and will undergo splitting when it is unplugged. The splitting
> code has a few issues:
>
> 1. Recursive locking
> --------------------
> Memory unplug path takes cpu_hotplug_lock and calls stop_machine()
> for splitting the mappings. However stop_machine() takes
> cpu_hotplug_lock again causing deadlock.
>
> 2. BUG: sleeping function called from in_atomic() context
> ---------------------------------------------------------
> Memory unplug path (remove_pagetable) takes init_mm.page_table_lock
> spinlock and later calls stop_machine() which does wait_for_completion()
>
> 3. Bad unlock unbalance
> -----------------------
> Memory unplug path takes init_mm.page_table_lock spinlock and calls
> stop_machine(). The stop_machine thread function runs in a different
> thread context (migration thread) which tries to release and reacquire
> ptl. Releasing ptl from a different thread than the one that acquired it
> causes a bad unlock unbalance.
>
> These problems can be avoided if we avoid mapping hot-plugged memory
> with 1G mapping, thereby removing the need for splitting them during
> unplug. During radix init, identify(*) the hot-plugged memory region
> and create separate mappings for each LMB so that they don't get mapped
> with 1G mappings.
>
> To create separate mappings for every LMB in the hot-plugged
> region, we need lmb-size. I am currently using memory_block_size_bytes()
> API to get the lmb-size. Since this is early init time code, the
> machine type isn't probed yet and hence memory_block_size_bytes()
> returns the default LMB size of 16MB. Hence we end up creating
> separate mappings at a much lower granularity than what we could
> ideally do for a pseries machine.
>
> (*) Identifying and differentiating hot-plugged memory from the
> boot time memory is now possible with PAPR extension to LMB flags.
> (Ref: https://lore.kernel.org/linuxppc-dev/f55a7b65a43cc9dc7b22385cf9960f8b11d5ce2e.camel@linux.ibm.com/T/#t)
>
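[Editor's note] The granularity trade-off described above can be illustrated with a small user-space sketch (the helper name and region sizes below are illustrative assumptions, not taken from the patch):

```python
# Sketch: number of create_physical_mapping() calls radix_init_pgtable()
# would make for a hot-plugged region, as a function of the LMB size it
# sees at early init.

GiB = 1 << 30
MiB = 1 << 20

def num_mappings(region_size, lmb_size):
    # one mapping call per LMB-sized chunk, rounding up
    return (region_size + lmb_size - 1) // lmb_size

# memory_block_size_bytes() at early init returns the 16 MiB default ...
print(num_mappings(4 * GiB, 16 * MiB))    # 256 separate mappings
# ... whereas a typical pseries LMB size of 256 MiB would need far fewer
print(num_mappings(4 * GiB, 256 * MiB))   # 16 separate mappings
```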

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index dd1bea45325c..4a4fb30f6c3d 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -16,6 +16,7 @@
>  #include <linux/hugetlb.h>
>  #include <linux/string_helpers.h>
>  #include <linux/stop_machine.h>
> +#include <linux/memory.h>
>  
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
> @@ -313,6 +314,8 @@ static void __init radix_init_pgtable(void)
>  {
>  	unsigned long rts_field;
>  	struct memblock_region *reg;
> +	phys_addr_t addr;
> +	u64 lmb_size = memory_block_size_bytes();
>  
>  	/* We don't support slb for radix */
>  	mmu_slb_size = 0;
> @@ -331,9 +334,15 @@ static void __init radix_init_pgtable(void)
>  			continue;
>  		}
>  
> -		WARN_ON(create_physical_mapping(reg->base,
> -						reg->base + reg->size,
> -						-1));
> +		if (memblock_is_hotpluggable(reg)) {
> +			for (addr = reg->base; addr < (reg->base + reg->size);
> +				addr += lmb_size)
> +				WARN_ON(create_physical_mapping(addr,
> +				addr + lmb_size, -1));

Is that indentation correct? 

> +		} else
> +			WARN_ON(create_physical_mapping(reg->base,
> +							reg->base + reg->size,
> +							-1));
>  	}
>  
>  	/* Find out how many PID bits are supported */
> -- 
> 2.21.0


* Re: [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings
  2020-04-06  3:49 ` [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings Bharata B Rao
@ 2020-06-22 12:53   ` Aneesh Kumar K.V
  2020-06-22 13:22   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 14+ messages in thread
From: Aneesh Kumar K.V @ 2020-06-22 12:53 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev
  Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Bharata B Rao <bharata@linux.ibm.com> writes:

> We can hit the following BUG_ON during memory unplug
>
> kernel BUG at arch/powerpc/mm/book3s64/pgtable.c:344!
> Oops: Exception in kernel mode, sig: 5 [#1]
> LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> NIP [c000000000097d48] pmd_fragment_free+0x48/0xd0
> LR [c0000000016aaefc] remove_pagetable+0x494/0x530
> Call Trace:
> _raw_spin_lock+0x54/0x80 (unreliable)
> remove_pagetable+0x2b0/0x530
> radix__remove_section_mapping+0x18/0x2c
> remove_section_mapping+0x38/0x5c
> arch_remove_memory+0x124/0x190
> try_remove_memory+0xd0/0x1c0
> __remove_memory+0x20/0x40
> dlpar_remove_lmb+0xbc/0x110
> dlpar_memory+0xa90/0xd40
> handle_dlpar_errorlog+0xa8/0x160
> pseries_hp_work_fn+0x2c/0x60
> process_one_work+0x47c/0x870
> worker_thread+0x364/0x5e0
> kthread+0x1b4/0x1c0
> ret_from_kernel_thread+0x5c/0x74
>
> This occurs when unplug is attempted for memory which has
> been mapped using memblock pages as part of early kernel page
> table setup. We wouldn't have initialized the PMD or PTE fragment
> count for those PMD or PTE pages.
>
> Fixing this includes 3 parts:
>
> - Re-walk the init_mm page tables from mem_init() and initialize
>   the PMD and PTE fragment count to 1.
> - When freeing PUD, PMD and PTE page table pages, check explicitly
>   if they come from memblock and if so free then appropriately.
> - When we do early memblock based allocation of PMD and PUD pages,
>   allocate in PAGE_SIZE granularity so that we are sure the
>   complete page is used as pagetable page.
>
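[Editor's note] As a rough user-space analogue of the first step (all names below are hypothetical, not kernel code), the fragment-count fixup is just a walk of the existing table tree, giving each interior table page an initial refcount of 1:

```python
# Toy model of radix__fixup_pgtable_fragments(): walk a nested page-table
# tree (dicts standing in for table pages) and bump a per-table refcount,
# so that a later free can drop it to 0 and release the page.

def fixup_fragments(pgd, refcount):
    for pud in pgd.values():              # each PUD table reachable from PGD
        for pmd in pud.values():          # each PMD table under that PUD
            refcount[id(pmd)] = refcount.get(id(pmd), 0) + 1
            for pte in pmd.values():      # each PTE table under that PMD
                refcount[id(pte)] = refcount.get(id(pte), 0) + 1

# one PUD with one PMD holding two PTE tables
pte_a, pte_b = {0: "page"}, {0: "page"}
pmd = {0: pte_a, 1: pte_b}
pgd = {0: {0: pmd}}
refcount = {}
fixup_fragments(pgd, refcount)
print(refcount[id(pmd)], refcount[id(pte_a)], refcount[id(pte_b)])  # 1 1 1
```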
> Since we now do PAGE_SIZE allocations for both PUD table and
> PMD table (Note that PTE table allocation is already of PAGE_SIZE),
> we end up allocating more memory for the same amount of system RAM.
> Here is a comparison of how much more we need for a 64T and 2G
> system after this patch:
>
> 1. 64T system
> -------------
> 64T RAM would need 64G for vmemmap with struct page size being 64B.
>
> 128 PUD tables for 64T memory (1G mappings)
> 1 PUD table and 64 PMD tables for 64G vmemmap (2M mappings)
>
> With default PUD[PMD]_TABLE_SIZE(4K), (128+1+64)*4K=772K
> With PAGE_SIZE(64K) table allocations, (128+1+64)*64K=12352K
>
> 2. 2G system
> ------------
> 2G RAM would need 2M for vmemmap with struct page size being 64B.
>
> 1 PUD table for 2G memory (1G mapping)
> 1 PUD table and 1 PMD table for 2M vmemmap (2M mappings)
>
> With default PUD[PMD]_TABLE_SIZE(4K), (1+1+1)*4K=12K
> With new PAGE_SIZE(64K) table allocations, (1+1+1)*64K=192K
>
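[Editor's note] The overhead arithmetic in the commit message checks out; a quick script confirming the numbers:

```python
# Verify the page-table overhead comparison quoted above
# (sizes in KiB: 4 KiB is the old PUD/PMD_TABLE_SIZE, 64 KiB is PAGE_SIZE).

def overhead_kib(num_tables, table_kib):
    return num_tables * table_kib

# 64T system: 128 PUD tables + 1 PUD table + 64 PMD tables for vmemmap
tables_64t = 128 + 1 + 64
print(overhead_kib(tables_64t, 4))    # 772 KiB before the patch
print(overhead_kib(tables_64t, 64))   # 12352 KiB after

# 2G system: 1 PUD + 1 PUD + 1 PMD table
tables_2g = 1 + 1 + 1
print(overhead_kib(tables_2g, 4))     # 12 KiB before
print(overhead_kib(tables_2g, 64))    # 192 KiB after
```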

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgalloc.h | 11 ++-
>  arch/powerpc/include/asm/book3s/64/radix.h   |  1 +
>  arch/powerpc/include/asm/sparsemem.h         |  1 +
>  arch/powerpc/mm/book3s64/pgtable.c           | 31 ++++++++-
>  arch/powerpc/mm/book3s64/radix_pgtable.c     | 72 ++++++++++++++++++--
>  arch/powerpc/mm/mem.c                        |  5 ++
>  arch/powerpc/mm/pgtable-frag.c               |  9 ++-
>  7 files changed, 121 insertions(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> index a41e91bd0580..e96572fb2871 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
> @@ -109,7 +109,16 @@ static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
>  
>  static inline void pud_free(struct mm_struct *mm, pud_t *pud)
>  {
> -	kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), pud);
> +	struct page *page = virt_to_page(pud);
> +
> +	/*
> +	 * Early pud pages allocated via memblock allocator
> +	 * can't be directly freed to slab
> +	 */
> +	if (PageReserved(page))
> +		free_reserved_page(page);
> +	else
> +		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), pud);
>  }
>  
>  static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
> diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
> index d97db3ad9aae..0aff8750181a 100644
> --- a/arch/powerpc/include/asm/book3s/64/radix.h
> +++ b/arch/powerpc/include/asm/book3s/64/radix.h
> @@ -291,6 +291,7 @@ static inline unsigned long radix__get_tree_size(void)
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  int radix__create_section_mapping(unsigned long start, unsigned long end, int nid);
>  int radix__remove_section_mapping(unsigned long start, unsigned long end);
> +void radix__fixup_pgtable_fragments(void);
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>  #endif /* __ASSEMBLY__ */
>  #endif
> diff --git a/arch/powerpc/include/asm/sparsemem.h b/arch/powerpc/include/asm/sparsemem.h
> index 3192d454a733..e662f9232d35 100644
> --- a/arch/powerpc/include/asm/sparsemem.h
> +++ b/arch/powerpc/include/asm/sparsemem.h
> @@ -15,6 +15,7 @@
>  #ifdef CONFIG_MEMORY_HOTPLUG
>  extern int create_section_mapping(unsigned long start, unsigned long end, int nid);
>  extern int remove_section_mapping(unsigned long start, unsigned long end);
> +void fixup_pgtable_fragments(void);
>  
>  #ifdef CONFIG_PPC_BOOK3S_64
>  extern int resize_hpt_for_hotplug(unsigned long new_mem_size);
> diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
> index 2bf7e1b4fd82..be7aa8786747 100644
> --- a/arch/powerpc/mm/book3s64/pgtable.c
> +++ b/arch/powerpc/mm/book3s64/pgtable.c
> @@ -186,6 +186,13 @@ int __meminit remove_section_mapping(unsigned long start, unsigned long end)
>  
>  	return hash__remove_section_mapping(start, end);
>  }
> +
> +void fixup_pgtable_fragments(void)
> +{
> +	if (radix_enabled())
> +		radix__fixup_pgtable_fragments();
> +}
> +
>  #endif /* CONFIG_MEMORY_HOTPLUG */
>  
>  void __init mmu_partition_table_init(void)
> @@ -343,13 +350,23 @@ void pmd_fragment_free(unsigned long *pmd)
>  
>  	BUG_ON(atomic_read(&page->pt_frag_refcount) <= 0);
>  	if (atomic_dec_and_test(&page->pt_frag_refcount)) {
> -		pgtable_pmd_page_dtor(page);
> -		__free_page(page);
> +		/*
> +		 * Early pmd pages allocated via memblock
> +		 * allocator wouldn't have called _ctor
> +		 */
> +		if (PageReserved(page))
> +			free_reserved_page(page);
> +		else {
> +			pgtable_pmd_page_dtor(page);
> +			__free_page(page);
> +		}
>  	}
>  }
>  
>  static inline void pgtable_free(void *table, int index)
>  {
> +	struct page *page;
> +
>  	switch (index) {
>  	case PTE_INDEX:
>  		pte_fragment_free(table, 0);
> @@ -358,7 +375,15 @@ static inline void pgtable_free(void *table, int index)
>  		pmd_fragment_free(table);
>  		break;
>  	case PUD_INDEX:
> -		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
> +		page = virt_to_page(table);
> +		/*
> +		 * Early pud pages allocated via memblock
> +		 * allocator need to be freed differently
> +		 */
> +		if (PageReserved(page))
> +			free_reserved_page(page);
> +		else
> +			kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
>  		break;
>  #if defined(CONFIG_PPC_4K_PAGES) && defined(CONFIG_HUGETLB_PAGE)
>  		/* 16M hugepd directory at pud level */
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 4a4fb30f6c3d..e675c0bbf9a4 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -36,6 +36,70 @@
>  unsigned int mmu_pid_bits;
>  unsigned int mmu_base_pid;
>  
> +static void fixup_pte_fragments(pmd_t *pmd)
> +{
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PMD; i++, pmd++) {
> +		pte_t *pte;
> +		struct page *page;
> +
> +		if (pmd_none(*pmd))
> +			continue;
> +		if (pmd_is_leaf(*pmd))
> +			continue;
> +
> +		pte = pte_offset_kernel(pmd, 0);
> +		page = virt_to_page(pte);
> +		atomic_inc(&page->pt_frag_refcount);
> +	}
> +}
> +
> +static void fixup_pmd_fragments(pud_t *pud)
> +{
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PUD; i++, pud++) {
> +		pmd_t *pmd;
> +		struct page *page;
> +
> +		if (pud_none(*pud))
> +			continue;
> +		if (pud_is_leaf(*pud))
> +			continue;
> +
> +		pmd = pmd_offset(pud, 0);
> +		page = virt_to_page(pmd);
> +		atomic_inc(&page->pt_frag_refcount);
> +		fixup_pte_fragments(pmd);
> +	}
> +}
> +
> +/*
> + * Walk the init_mm page tables and fixup the PMD and PTE fragment
> + * counts. This allows the PUD, PMD and PTE pages to be freed
> + * back to buddy allocator properly during memory unplug.
> + */
> +void radix__fixup_pgtable_fragments(void)
> +{
> +	int i;
> +	pgd_t *pgd = pgd_offset_k(0UL);
> +
> +	spin_lock(&init_mm.page_table_lock);
> +	for (i = 0; i < PTRS_PER_PGD; i++, pgd++) {
> +		pud_t *pud;
> +
> +		if (pgd_none(*pgd))
> +			continue;
> +		if (pgd_is_leaf(*pgd))
> +			continue;
> +
> +		pud = pud_offset(pgd, 0);
> +		fixup_pmd_fragments(pud);
> +	}
> +	spin_unlock(&init_mm.page_table_lock);
> +}
> +
>  static __ref void *early_alloc_pgtable(unsigned long size, int nid,
>  			unsigned long region_start, unsigned long region_end)
>  {
> @@ -71,8 +135,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
>  
>  	pgdp = pgd_offset_k(ea);
>  	if (pgd_none(*pgdp)) {
> -		pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
> -						region_start, region_end);
> +		pudp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
> +					   region_end);
>  		pgd_populate(&init_mm, pgdp, pudp);
>  	}
>  	pudp = pud_offset(pgdp, ea);
> @@ -81,8 +145,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
>  		goto set_the_pte;
>  	}
>  	if (pud_none(*pudp)) {
> -		pmdp = early_alloc_pgtable(PMD_TABLE_SIZE, nid,
> -						region_start, region_end);
> +		pmdp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
> +					   region_end);
>  		pud_populate(&init_mm, pudp, pmdp);
>  	}
>  	pmdp = pmd_offset(pudp, ea);
> diff --git a/arch/powerpc/mm/mem.c b/arch/powerpc/mm/mem.c
> index 1c07d5a3f543..d43ad701f693 100644
> --- a/arch/powerpc/mm/mem.c
> +++ b/arch/powerpc/mm/mem.c
> @@ -53,6 +53,10 @@
>  
>  #include <mm/mmu_decl.h>
>  
> +void __weak fixup_pgtable_fragments(void)
> +{
> +}
> +
>  #ifndef CPU_FTR_COHERENT_ICACHE
>  #define CPU_FTR_COHERENT_ICACHE	0	/* XXX for now */
>  #define CPU_FTR_NOEXECUTE	0
> @@ -307,6 +311,7 @@ void __init mem_init(void)
>  
>  	memblock_free_all();
>  
> +	fixup_pgtable_fragments();
>  #ifdef CONFIG_HIGHMEM
>  	{
>  		unsigned long pfn, highmem_mapnr;
> diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
> index ee4bd6d38602..16213c09896a 100644
> --- a/arch/powerpc/mm/pgtable-frag.c
> +++ b/arch/powerpc/mm/pgtable-frag.c
> @@ -114,6 +114,13 @@ void pte_fragment_free(unsigned long *table, int kernel)
>  	if (atomic_dec_and_test(&page->pt_frag_refcount)) {
>  		if (!kernel)
>  			pgtable_pte_page_dtor(page);
> -		__free_page(page);
> +		/*
> +		 * Early pte pages allocated via memblock
> +		 * allocator need to be freed differently
> +		 */
> +		if (PageReserved(page))
> +			free_reserved_page(page);
> +		else
> +			__free_page(page);
>  	}
>  }
> -- 
> 2.21.0


* Re: [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable
  2020-04-06  3:49 ` [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable Bharata B Rao
@ 2020-06-22 13:07   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 14+ messages in thread
From: Aneesh Kumar K.V @ 2020-06-22 13:07 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev
  Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Bharata B Rao <bharata@linux.ibm.com> writes:

> remove_pagetable() isn't freeing the PUD table. This causes a memory
> leak during memory unplug. Fix this.
>

We had changes w.r.t. p4d (the folded 5-level table). You may want to
update this for a recent kernel.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>


> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index e675c0bbf9a4..0d9ef3277579 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -767,6 +767,21 @@ static void free_pmd_table(pmd_t *pmd_start, pud_t *pud)
>  	pud_clear(pud);
>  }
>  
> +static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
> +{
> +	pud_t *pud;
> +	int i;
> +
> +	for (i = 0; i < PTRS_PER_PUD; i++) {
> +		pud = pud_start + i;
> +		if (!pud_none(*pud))
> +			return;
> +	}
> +
> +	pud_free(&init_mm, pud_start);
> +	pgd_clear(pgd);
> +}
> +
>  struct change_mapping_params {
>  	pte_t *pte;
>  	unsigned long start;
> @@ -937,6 +952,7 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
>  
>  		pud_base = (pud_t *)pgd_page_vaddr(*pgd);
>  		remove_pud_table(pud_base, addr, next);
> +		free_pud_table(pud_base, pgd);
>  	}
>  
>  	spin_unlock(&init_mm.page_table_lock);
> -- 
> 2.21.0


* Re: [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping()
  2020-04-06  3:49 ` [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping() Bharata B Rao
@ 2020-06-22 13:07   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 14+ messages in thread
From: Aneesh Kumar K.V @ 2020-06-22 13:07 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev
  Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao

Bharata B Rao <bharata@linux.ibm.com> writes:

> With hot-plugged memory getting mapped with 2M mappings always,
> there will be no need to split any mappings during unplug.
>
> Hence remove split_kernel_mapping() and associated code. This is
> essentially a revert of
> commit 4dd5f8a99e791 ("powerpc/mm/radix: Split linear mapping on hot-unplug")
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> ---
>  arch/powerpc/mm/book3s64/radix_pgtable.c | 93 +++++-------------------
>  1 file changed, 19 insertions(+), 74 deletions(-)
>
> diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
> index 0d9ef3277579..56f2c698deac 100644
> --- a/arch/powerpc/mm/book3s64/radix_pgtable.c
> +++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
> @@ -15,7 +15,6 @@
>  #include <linux/mm.h>
>  #include <linux/hugetlb.h>
>  #include <linux/string_helpers.h>
> -#include <linux/stop_machine.h>
>  #include <linux/memory.h>
>  
>  #include <asm/pgtable.h>
> @@ -782,30 +781,6 @@ static void free_pud_table(pud_t *pud_start, pgd_t *pgd)
>  	pgd_clear(pgd);
>  }
>  
> -struct change_mapping_params {
> -	pte_t *pte;
> -	unsigned long start;
> -	unsigned long end;
> -	unsigned long aligned_start;
> -	unsigned long aligned_end;
> -};
> -
> -static int __meminit stop_machine_change_mapping(void *data)
> -{
> -	struct change_mapping_params *params =
> -			(struct change_mapping_params *)data;
> -
> -	if (!data)
> -		return -1;
> -
> -	spin_unlock(&init_mm.page_table_lock);
> -	pte_clear(&init_mm, params->aligned_start, params->pte);
> -	create_physical_mapping(__pa(params->aligned_start), __pa(params->start), -1);
> -	create_physical_mapping(__pa(params->end), __pa(params->aligned_end), -1);
> -	spin_lock(&init_mm.page_table_lock);
> -	return 0;
> -}
> -
>  static void remove_pte_table(pte_t *pte_start, unsigned long addr,
>  			     unsigned long end)
>  {
> @@ -834,52 +809,6 @@ static void remove_pte_table(pte_t *pte_start, unsigned long addr,
>  	}
>  }
>  
> -/*
> - * clear the pte and potentially split the mapping helper
> - */
> -static void __meminit split_kernel_mapping(unsigned long addr, unsigned long end,
> -				unsigned long size, pte_t *pte)
> -{
> -	unsigned long mask = ~(size - 1);
> -	unsigned long aligned_start = addr & mask;
> -	unsigned long aligned_end = addr + size;
> -	struct change_mapping_params params;
> -	bool split_region = false;
> -
> -	if ((end - addr) < size) {
> -		/*
> -		 * We're going to clear the PTE, but not flushed
> -		 * the mapping, time to remap and flush. The
> -		 * effects if visible outside the processor or
> -		 * if we are running in code close to the
> -		 * mapping we cleared, we are in trouble.
> -		 */
> -		if (overlaps_kernel_text(aligned_start, addr) ||
> -			overlaps_kernel_text(end, aligned_end)) {
> -			/*
> -			 * Hack, just return, don't pte_clear
> -			 */
> -			WARN_ONCE(1, "Linear mapping %lx->%lx overlaps kernel "
> -				  "text, not splitting\n", addr, end);
> -			return;
> -		}
> -		split_region = true;
> -	}
> -
> -	if (split_region) {
> -		params.pte = pte;
> -		params.start = addr;
> -		params.end = end;
> -		params.aligned_start = addr & ~(size - 1);
> -		params.aligned_end = min_t(unsigned long, aligned_end,
> -				(unsigned long)__va(memblock_end_of_DRAM()));
> -		stop_machine(stop_machine_change_mapping, &params, NULL);
> -		return;
> -	}
> -
> -	pte_clear(&init_mm, addr, pte);
> -}
> -
>  static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
>  			     unsigned long end)
>  {
> @@ -895,7 +824,12 @@ static void remove_pmd_table(pmd_t *pmd_start, unsigned long addr,
>  			continue;
>  
>  		if (pmd_is_leaf(*pmd)) {
> -			split_kernel_mapping(addr, end, PMD_SIZE, (pte_t *)pmd);
> +			if (!IS_ALIGNED(addr, PMD_SIZE) ||
> +			    !IS_ALIGNED(next, PMD_SIZE)) {
> +				WARN_ONCE(1, "%s: unaligned range\n", __func__);
> +				continue;
> +			}
> +			pte_clear(&init_mm, addr, (pte_t *)pmd);
>  			continue;
>  		}
>  
> @@ -920,7 +854,12 @@ static void remove_pud_table(pud_t *pud_start, unsigned long addr,
>  			continue;
>  
>  		if (pud_is_leaf(*pud)) {
> -			split_kernel_mapping(addr, end, PUD_SIZE, (pte_t *)pud);
> +			if (!IS_ALIGNED(addr, PUD_SIZE) ||
> +			    !IS_ALIGNED(next, PUD_SIZE)) {
> +				WARN_ONCE(1, "%s: unaligned range\n", __func__);
> +				continue;
> +			}
> +			pte_clear(&init_mm, addr, (pte_t *)pud);
>  			continue;
>  		}
>  
> @@ -946,7 +885,13 @@ static void __meminit remove_pagetable(unsigned long start, unsigned long end)
>  			continue;
>  
>  		if (pgd_is_leaf(*pgd)) {
> -			split_kernel_mapping(addr, end, PGDIR_SIZE, (pte_t *)pgd);
> +			if (!IS_ALIGNED(addr, PGDIR_SIZE) ||
> +			    !IS_ALIGNED(next, PGDIR_SIZE)) {
> +				WARN_ONCE(1, "%s: unaligned range\n", __func__);
> +				continue;
> +			}
> +
> +			pte_clear(&init_mm, addr, (pte_t *)pgd);
>  			continue;
>  		}
>  
> -- 
> 2.21.0


* Re: [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings
  2020-04-06  3:49 ` [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings Bharata B Rao
  2020-06-22 12:53   ` Aneesh Kumar K.V
@ 2020-06-22 13:22   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 14+ messages in thread
From: Aneesh Kumar K.V @ 2020-06-22 13:22 UTC (permalink / raw)
  To: Bharata B Rao, linuxppc-dev
  Cc: leonardo, aneesh.kumar, npiggin, Bharata B Rao


....

 @@ -71,8 +135,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
>  
>  	pgdp = pgd_offset_k(ea);
>  	if (pgd_none(*pgdp)) {
> -		pudp = early_alloc_pgtable(PUD_TABLE_SIZE, nid,
> -						region_start, region_end);
> +		pudp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
> +					   region_end);
>  		pgd_populate(&init_mm, pgdp, pudp);


Add a comment here explaining why we are using PAGE_SIZE instead of the
required PUD_TABLE_SIZE.

>  	}
>  	pudp = pud_offset(pgdp, ea);
> @@ -81,8 +145,8 @@ static int early_map_kernel_page(unsigned long ea, unsigned long pa,
>  		goto set_the_pte;
>  	}
>  	if (pud_none(*pudp)) {
> -		pmdp = early_alloc_pgtable(PMD_TABLE_SIZE, nid,
> -						region_start, region_end);
> +		pmdp = early_alloc_pgtable(PAGE_SIZE, nid, region_start,
> +					   region_end);
>  		pud_populate(&init_mm, pudp, pmdp);
>  	}
>  	pmdp = pmd_offset(pudp, ea);


-aneesh


end of thread, other threads:[~2020-06-22 13:25 UTC | newest]

Thread overview: 14+ messages
2020-04-06  3:49 [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
2020-04-06  3:49 ` [RFC PATCH v0 1/5] powerpc/pseries/hotplug-memory: Set DRCONF_MEM_HOTREMOVABLE for hot-plugged mem Bharata B Rao
2020-04-06  5:33   ` kbuild test robot
2020-04-06  3:49 ` [RFC PATCH v0 2/5] powerpc/mm/radix: Create separate mappings for hot-plugged memory Bharata B Rao
2020-06-22 12:46   ` Aneesh Kumar K.V
2020-04-06  3:49 ` [RFC PATCH v0 3/5] powerpc/mm/radix: Fix PTE/PMD fragment count for early page table mappings Bharata B Rao
2020-06-22 12:53   ` Aneesh Kumar K.V
2020-06-22 13:22   ` Aneesh Kumar K.V
2020-04-06  3:49 ` [RFC PATCH v0 4/5] powerpc/mm/radix: Free PUD table when freeing pagetable Bharata B Rao
2020-06-22 13:07   ` Aneesh Kumar K.V
2020-04-06  3:49 ` [RFC PATCH v0 5/5] powerpc/mm/radix: Remove split_kernel_mapping() Bharata B Rao
2020-06-22 13:07   ` Aneesh Kumar K.V
2020-04-09  4:31 ` [RFC PATCH v0 0/5] powerpc/mm/radix: Memory unplug fixes Bharata B Rao
2020-05-20  4:34 ` Bharata B Rao
