* [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock
@ 2018-04-16 11:27 Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
                   ` (12 more replies)
  0 siblings, 13 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This patch series adds a split pmd page table lock for book3s64. nohash64 should
also be able to switch to this; I still need to work out the code dependencies. The
series may also have broken the build on platforms other than book3s64. I am sending
it early to get feedback on whether we should continue with this approach.

We switch the pmd allocator to a scheme similar to what we already use for
level 4 page table allocation: we take an order-0 page, divide it into fragments,
and hand out a fragment on each request for a pmd page table. The pmd lock is
now stashed in the struct page backing the allocated page.

The series helps reduce lock contention on mm->page_table_lock: in the mmap_bench
profiles below, time spent in do_raw_spin_lock drops from 32.72% to 12.89%.
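
The shape of that fragment allocator, as a minimal userspace sketch (hypothetical
names; a pthread mutex stands in for mm->page_table_lock, and the refcount-based
free path is omitted; the real kernel code is in the patches below):

#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SZ    4096UL
#define FRAG_SIZE  (PAGE_SZ / 8)	/* e.g. 8 pmd fragments per page */

static void *frag_cache;		/* stands in for mm->context.pmd_frag */
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hand out the next fragment from the cached page, if one is left. */
static void *get_fragment(void)
{
	void *ret;

	pthread_mutex_lock(&cache_lock);
	ret = frag_cache;
	if (ret) {
		char *next = (char *)ret + FRAG_SIZE;
		/* crossing a page boundary means the page is used up */
		if (((uintptr_t)next & (PAGE_SZ - 1)) == 0)
			next = NULL;
		frag_cache = next;
	}
	pthread_mutex_unlock(&cache_lock);
	return ret;
}

static void *fragment_alloc(void)
{
	void *page, *ret = get_fragment();

	if (ret)
		return ret;
	if (posix_memalign(&page, PAGE_SZ, PAGE_SZ))
		return NULL;
	pthread_mutex_lock(&cache_lock);
	/* lost a race? return our page whole instead of caching the rest */
	if (!frag_cache)
		frag_cache = (char *)page + FRAG_SIZE;
	pthread_mutex_unlock(&cache_lock);
	return page;
}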


without patch

    32.72%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --32.68%--0
                          |
                          |--15.82%--pte_fragment_alloc
                          |          |
                          |           --15.79%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--14.95%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone
                          |

with patch

    12.89%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --12.83%--0
                          |
                          |--3.21%--pagevec_lru_move_fn
                          |          __lru_cache_add
                          |          |
                          |           --2.74%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--3.11%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone

.....
                          |
                           --0.55%--pte_fragment_alloc
                                     |
                                      --0.55%--do_huge_pmd_anonymous_page
                                                __handle_mm_fault
                                                handle_mm_fault
                                                __do_page_fault
                                                handle_page_fault
                                                test_mmap
                                                test_mmap
                                                start_thread
                                                __clone



Aneesh Kumar K.V (11):
  powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
  powerpc/kvm: Switch kvm pmd allocator to custom allocator
  powerpc/mm: Use pmd_lockptr instead of opencoding it
  powerpc/mm: Rename pte fragment functions
  powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke
  powerpc/mm/nohash: Remove pte fragment dependency from nohash
  powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable
    fragment
  powerpc/book3s64/mm: Simplify the rcu callback for page table free
  powerpc/mm: Implement helpers for pagetable fragment support at PMD
    level
  powerpc/mm: Use page fragments for allocation page table at PMD level
  powerpc/book3s64: Enable split pmd ptlock.

 arch/powerpc/include/asm/book3s/64/hash-4k.h     |   8 +-
 arch/powerpc/include/asm/book3s/64/hash-64k.h    |   7 +
 arch/powerpc/include/asm/book3s/64/hash.h        |  10 -
 arch/powerpc/include/asm/book3s/64/mmu.h         |   7 +-
 arch/powerpc/include/asm/book3s/64/pgalloc.h     |  46 +---
 arch/powerpc/include/asm/book3s/64/pgtable.h     |  20 +-
 arch/powerpc/include/asm/book3s/64/radix-4k.h    |   3 +
 arch/powerpc/include/asm/book3s/64/radix-64k.h   |   4 +
 arch/powerpc/include/asm/mmu-book3e.h            |   6 -
 arch/powerpc/include/asm/nohash/64/pgalloc.h     |  95 +++-----
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h |  57 -----
 arch/powerpc/include/asm/nohash/64/pgtable.h     |   8 +-
 arch/powerpc/kvm/book3s_64_mmu_radix.c           |  36 ++-
 arch/powerpc/mm/hash_utils_64.c                  |   3 +-
 arch/powerpc/mm/mmu_context_book3s64.c           |  39 +++-
 arch/powerpc/mm/pgtable-book3s64.c               | 267 ++++++++++++++++++++++-
 arch/powerpc/mm/pgtable-hash64.c                 |   8 +-
 arch/powerpc/mm/pgtable-radix.c                  |   5 +-
 arch/powerpc/mm/pgtable_64.c                     | 171 ---------------
 arch/powerpc/platforms/Kconfig.cputype           |   4 +
 20 files changed, 427 insertions(+), 377 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/nohash/64/pgtable-64k.h

-- 
2.14.3

* [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-17 12:48   ` Christophe LEROY
  2018-04-18  5:42   ` Michael Ellerman
  2018-04-16 11:27 ` [PATCH V1 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64 Aneesh Kumar K.V
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V, Christophe LEROY

8xx uses the slice code when hugetlbfs is enabled. We missed a header include on
8xx, which resulted in the build failure below.

config: mpc885_ads_defconfig + CONFIG_HUGETLBFS

   CC      arch/powerpc/mm/slice.o
arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context' [-Werror=implicit-function-declaration]
arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context' [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[1]: *** [arch/powerpc/mm/slice.o] Error 1
make: *** [arch/powerpc/mm] Error 2

On PPC64, mmu_context.h was already pulled in indirectly via linux/pkeys.h, so the build did not break there.

CC: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/mm/slice.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 9cd87d11fe4e..205fe557ca10 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -35,6 +35,7 @@
 #include <asm/mmu.h>
 #include <asm/copro.h>
 #include <asm/hugetlb.h>
+#include <asm/mmu_context.h>
 
 static DEFINE_SPINLOCK(slice_convert_lock);
 
-- 
2.14.3

* [PATCH V1 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-05-16 13:38   ` [V1, " Michael Ellerman
  2018-04-16 11:27 ` [PATCH V1 02/11] powerpc/kvm: Switch kvm pmd allocator to custom allocator Aneesh Kumar K.V
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Code movement only. Moving the code into the book3s64-specific file also lets us drop the CONFIG_PPC_BOOK3S_64 #ifdef.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/pgtable-book3s64.c | 54 ++++++++++++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable_64.c       | 56 --------------------------------------
 2 files changed, 54 insertions(+), 56 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 518518fb7c45..35913b0b6d56 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -9,10 +9,13 @@
 
 #include <linux/sched.h>
 #include <linux/mm_types.h>
+#include <linux/memblock.h>
 #include <misc/cxl-base.h>
 
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
+#include <asm/trace.h>
+#include <asm/powernv.h>
 
 #include "mmu_decl.h"
 #include <trace/events/thp.h>
@@ -171,3 +174,54 @@ int __meminit remove_section_mapping(unsigned long start, unsigned long end)
 	return hash__remove_section_mapping(start, end);
 }
 #endif /* CONFIG_MEMORY_HOTPLUG */
+
+void __init mmu_partition_table_init(void)
+{
+	unsigned long patb_size = 1UL << PATB_SIZE_SHIFT;
+	unsigned long ptcr;
+
+	BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 36), "Partition table size too large.");
+	partition_tb = __va(memblock_alloc_base(patb_size, patb_size,
+						MEMBLOCK_ALLOC_ANYWHERE));
+
+	/* Initialize the Partition Table with no entries */
+	memset((void *)partition_tb, 0, patb_size);
+
+	/*
+	 * update partition table control register,
+	 * 64 K size.
+	 */
+	ptcr = __pa(partition_tb) | (PATB_SIZE_SHIFT - 12);
+	mtspr(SPRN_PTCR, ptcr);
+	powernv_set_nmmu_ptcr(ptcr);
+}
+
+void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
+				   unsigned long dw1)
+{
+	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
+
+	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
+	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
+
+	/*
+	 * Global flush of TLBs and partition table caches for this lpid.
+	 * The type of flush (hash or radix) depends on what the previous
+	 * use of this partition ID was, not the new use.
+	 */
+	asm volatile("ptesync" : : : "memory");
+	if (old & PATB_HR) {
+		asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
+			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
+		asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
+			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
+		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1);
+	} else {
+		asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
+			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
+		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
+	}
+	/* do we need fixup here ?*/
+	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
+}
+EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 9bf659d5078c..a41784dd2042 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -33,7 +33,6 @@
 #include <linux/swap.h>
 #include <linux/stddef.h>
 #include <linux/vmalloc.h>
-#include <linux/memblock.h>
 #include <linux/slab.h>
 #include <linux/hugetlb.h>
 
@@ -47,13 +46,11 @@
 #include <asm/smp.h>
 #include <asm/machdep.h>
 #include <asm/tlb.h>
-#include <asm/trace.h>
 #include <asm/processor.h>
 #include <asm/cputable.h>
 #include <asm/sections.h>
 #include <asm/firmware.h>
 #include <asm/dma.h>
-#include <asm/powernv.h>
 
 #include "mmu_decl.h"
 
@@ -429,59 +426,6 @@ void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
 }
 #endif
 
-#ifdef CONFIG_PPC_BOOK3S_64
-void __init mmu_partition_table_init(void)
-{
-	unsigned long patb_size = 1UL << PATB_SIZE_SHIFT;
-	unsigned long ptcr;
-
-	BUILD_BUG_ON_MSG((PATB_SIZE_SHIFT > 36), "Partition table size too large.");
-	partition_tb = __va(memblock_alloc_base(patb_size, patb_size,
-						MEMBLOCK_ALLOC_ANYWHERE));
-
-	/* Initialize the Partition Table with no entries */
-	memset((void *)partition_tb, 0, patb_size);
-
-	/*
-	 * update partition table control register,
-	 * 64 K size.
-	 */
-	ptcr = __pa(partition_tb) | (PATB_SIZE_SHIFT - 12);
-	mtspr(SPRN_PTCR, ptcr);
-	powernv_set_nmmu_ptcr(ptcr);
-}
-
-void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
-				   unsigned long dw1)
-{
-	unsigned long old = be64_to_cpu(partition_tb[lpid].patb0);
-
-	partition_tb[lpid].patb0 = cpu_to_be64(dw0);
-	partition_tb[lpid].patb1 = cpu_to_be64(dw1);
-
-	/*
-	 * Global flush of TLBs and partition table caches for this lpid.
-	 * The type of flush (hash or radix) depends on what the previous
-	 * use of this partition ID was, not the new use.
-	 */
-	asm volatile("ptesync" : : : "memory");
-	if (old & PATB_HR) {
-		asm volatile(PPC_TLBIE_5(%0,%1,2,0,1) : :
-			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
-		asm volatile(PPC_TLBIE_5(%0,%1,2,1,1) : :
-			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
-		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 1);
-	} else {
-		asm volatile(PPC_TLBIE_5(%0,%1,2,0,0) : :
-			     "r" (TLBIEL_INVAL_SET_LPID), "r" (lpid));
-		trace_tlbie(lpid, 0, TLBIEL_INVAL_SET_LPID, lpid, 2, 0, 0);
-	}
-	/* do we need fixup here ?*/
-	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
-}
-EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
-#endif /* CONFIG_PPC_BOOK3S_64 */
-
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mark_rodata_ro(void)
 {
-- 
2.14.3

* [PATCH V1 02/11] powerpc/kvm: Switch kvm pmd allocator to custom allocator
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64 Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 03/11] powerpc/mm: Use pmd_lockptr instead of opencoding it Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

In the next set of patches, we will switch the pmd allocator to use page fragments,
and the locking will be updated to a split pmd ptlock. We want to avoid using
fragments for the partition-scoped tables, so use a slab cache, similar to what we
already do for the level 4 table.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 36 +++++++++++++++++++++++++++++-----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index a57eafec4dc2..ccdf3761eec0 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -200,6 +200,7 @@ void kvmppc_radix_set_pte_at(struct kvm *kvm, unsigned long addr,
 }
 
 static struct kmem_cache *kvm_pte_cache;
+static struct kmem_cache *kvm_pmd_cache;
 
 static pte_t *kvmppc_pte_alloc(void)
 {
@@ -217,6 +218,16 @@ static inline int pmd_is_leaf(pmd_t pmd)
 	return !!(pmd_val(pmd) & _PAGE_PTE);
 }
 
+static pmd_t *kvmppc_pmd_alloc(void)
+{
+	return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+}
+
+static void kvmppc_pmd_free(pmd_t *pmdp)
+{
+	kmem_cache_free(kvm_pmd_cache, pmdp);
+}
+
 static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
 			     unsigned int level, unsigned long mmu_seq)
 {
@@ -239,7 +250,7 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
 	if (pud && pud_present(*pud) && !pud_huge(*pud))
 		pmd = pmd_offset(pud, gpa);
 	else if (level <= 1)
-		new_pmd = pmd_alloc_one(kvm->mm, gpa);
+		new_pmd = kvmppc_pmd_alloc();
 
 	if (level == 0 && !(pmd && pmd_present(*pmd) && !pmd_is_leaf(*pmd)))
 		new_ptep = kvmppc_pte_alloc();
@@ -382,7 +393,7 @@ static int kvmppc_create_pte(struct kvm *kvm, pte_t pte, unsigned long gpa,
 	if (new_pud)
 		pud_free(kvm->mm, new_pud);
 	if (new_pmd)
-		pmd_free(kvm->mm, new_pmd);
+		kvmppc_pmd_free(new_pmd);
 	if (new_ptep)
 		kvmppc_pte_free(new_ptep);
 	return ret;
@@ -758,7 +769,7 @@ void kvmppc_free_radix(struct kvm *kvm)
 				kvmppc_pte_free(pte);
 				pmd_clear(pmd);
 			}
-			pmd_free(kvm->mm, pmd_offset(pud, 0));
+			kvmppc_pmd_free(pmd_offset(pud, 0));
 			pud_clear(pud);
 		}
 		pud_free(kvm->mm, pud_offset(pgd, 0));
@@ -770,20 +781,35 @@ void kvmppc_free_radix(struct kvm *kvm)
 
 static void pte_ctor(void *addr)
 {
-	memset(addr, 0, PTE_TABLE_SIZE);
+	memset(addr, 0, RADIX_PTE_TABLE_SIZE);
+}
+
+static void pmd_ctor(void *addr)
+{
+	memset(addr, 0, RADIX_PMD_TABLE_SIZE);
 }
 
 int kvmppc_radix_init(void)
 {
-	unsigned long size = sizeof(void *) << PTE_INDEX_SIZE;
+	unsigned long size = sizeof(void *) << RADIX_PTE_INDEX_SIZE;
 
 	kvm_pte_cache = kmem_cache_create("kvm-pte", size, size, 0, pte_ctor);
 	if (!kvm_pte_cache)
 		return -ENOMEM;
+
+	size = sizeof(void *) << RADIX_PMD_INDEX_SIZE;
+
+	kvm_pmd_cache = kmem_cache_create("kvm-pmd", size, size, 0, pmd_ctor);
+	if (!kvm_pmd_cache) {
+		kmem_cache_destroy(kvm_pte_cache);
+		return -ENOMEM;
+	}
+
 	return 0;
 }
 
 void kvmppc_radix_exit(void)
 {
 	kmem_cache_destroy(kvm_pte_cache);
+	kmem_cache_destroy(kvm_pmd_cache);
 }
-- 
2.14.3

* [PATCH V1 03/11] powerpc/mm: Use pmd_lockptr instead of opencoding it
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 02/11] powerpc/kvm: Switch kvm pmd allocator to custom allocator Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 04/11] powerpc/mm: Rename pte fragment functions Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

In a later patch we switch the pmd lock from mm->page_table_lock to a split pmd
ptlock. To avoid compilation issues when that happens, use the pmd_lockptr() helper
instead of open-coding the lock pointer.
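
For context, the generic helper picks the right lock in either configuration;
roughly, paraphrasing include/linux/mm.h of this era (the dynamically allocated
ptlock case is omitted):

#if USE_SPLIT_PMD_PTLOCKS
static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
{
	return ptlock_ptr(pmd_to_page(pmd));	/* lock lives with the pmd page */
}
#else
static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
{
	return &mm->page_table_lock;		/* single per-mm lock */
}
#endif

So assert_spin_locked(pmd_lockptr(mm, pmdp)) stays correct both before and after
the ptlock switch.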

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/pgtable-book3s64.c | 4 ++--
 arch/powerpc/mm/pgtable-hash64.c   | 8 +++++---
 arch/powerpc/mm/pgtable-radix.c    | 2 +-
 3 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 35913b0b6d56..e1c304183172 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -37,7 +37,7 @@ int pmdp_set_access_flags(struct vm_area_struct *vma, unsigned long address,
 	int changed;
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(!pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
-	assert_spin_locked(&vma->vm_mm->page_table_lock);
+	assert_spin_locked(pmd_lockptr(vma->vm_mm, pmdp));
 #endif
 	changed = !pmd_same(*(pmdp), entry);
 	if (changed) {
@@ -62,7 +62,7 @@ void set_pmd_at(struct mm_struct *mm, unsigned long addr,
 {
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(pte_present(pmd_pte(*pmdp)) && !pte_protnone(pmd_pte(*pmdp)));
-	assert_spin_locked(&mm->page_table_lock);
+	assert_spin_locked(pmd_lockptr(mm, pmdp));
 	WARN_ON(!(pmd_trans_huge(pmd) || pmd_devmap(pmd)));
 #endif
 	trace_hugepage_set_pmd(addr, pmd_val(pmd));
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index 199bfda5f0d9..692bfc9e372c 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -193,7 +193,7 @@ unsigned long hash__pmd_hugepage_update(struct mm_struct *mm, unsigned long addr
 
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(!hash__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
-	assert_spin_locked(&mm->page_table_lock);
+	assert_spin_locked(pmd_lockptr(mm, pmdp));
 #endif
 
 	__asm__ __volatile__(
@@ -265,7 +265,8 @@ void hash__pgtable_trans_huge_deposit(struct mm_struct *mm, pmd_t *pmdp,
 				  pgtable_t pgtable)
 {
 	pgtable_t *pgtable_slot;
-	assert_spin_locked(&mm->page_table_lock);
+
+	assert_spin_locked(pmd_lockptr(mm, pmdp));
 	/*
 	 * we store the pgtable in the second half of PMD
 	 */
@@ -285,7 +286,8 @@ pgtable_t hash__pgtable_trans_huge_withdraw(struct mm_struct *mm, pmd_t *pmdp)
 	pgtable_t pgtable;
 	pgtable_t *pgtable_slot;
 
-	assert_spin_locked(&mm->page_table_lock);
+	assert_spin_locked(pmd_lockptr(mm, pmdp));
+
 	pgtable_slot = (pgtable_t *)pmdp + PTRS_PER_PMD;
 	pgtable = *pgtable_slot;
 	/*
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index f1891e215e39..473415750cbf 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -975,7 +975,7 @@ unsigned long radix__pmd_hugepage_update(struct mm_struct *mm, unsigned long add
 
 #ifdef CONFIG_DEBUG_VM
 	WARN_ON(!radix__pmd_trans_huge(*pmdp) && !pmd_devmap(*pmdp));
-	assert_spin_locked(&mm->page_table_lock);
+	assert_spin_locked(pmd_lockptr(mm, pmdp));
 #endif
 
 	old = radix__pte_update(mm, addr, (pte_t *)pmdp, clr, set, 1);
-- 
2.14.3

* [PATCH V1 04/11] powerpc/mm: Rename pte fragment functions
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 03/11] powerpc/mm: Use pmd_lockptr instead of opencoding it Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 05/11] powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Rename __alloc_for_cache() and get_from_cache() to indicate that they operate on
pte fragments. A later patch will add pmd fragment support.

No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/pgtable_64.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index a41784dd2042..3873f94a3ae9 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -314,7 +314,7 @@ struct page *pmd_page(pmd_t pmd)
 }
 
 #ifdef CONFIG_PPC_64K_PAGES
-static pte_t *get_from_cache(struct mm_struct *mm)
+static pte_t *get_pte_from_cache(struct mm_struct *mm)
 {
 	void *pte_frag, *ret;
 
@@ -333,7 +333,7 @@ static pte_t *get_from_cache(struct mm_struct *mm)
 	return (pte_t *)ret;
 }
 
-static pte_t *__alloc_for_cache(struct mm_struct *mm, int kernel)
+static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
 {
 	void *ret = NULL;
 	struct page *page;
@@ -372,12 +372,13 @@ pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel
 {
 	pte_t *pte;
 
-	pte = get_from_cache(mm);
+	pte = get_pte_from_cache(mm);
 	if (pte)
 		return pte;
 
-	return __alloc_for_cache(mm, kernel);
+	return __alloc_for_ptecache(mm, kernel);
 }
+
 #endif /* CONFIG_PPC_64K_PAGES */
 
 void pte_fragment_free(unsigned long *table, int kernel)
-- 
2.14.3

* [PATCH V1 05/11] powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 04/11] powerpc/mm: Rename pte fragment functions Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 06/11] powerpc/mm/nohash: Remove pte fragment dependency from nohash Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

We have in Kconfig

config PPC_64K_PAGES
	bool "64k page size"
	depends on !PPC_FSL_BOOK3E && (44x || PPC_BOOK3S_64 || PPC_BOOK3E_64)
	select HAVE_ARCH_SOFT_DIRTY if PPC_BOOK3S_64

The only supported 64-bit BOOK3E platform is FSL_BOOK3E, and PPC_64K_PAGES depends
on !PPC_FSL_BOOK3E, so the 64K page support in 64-bit nohash is dead code. Remove it.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mmu-book3e.h            |  6 ---
 arch/powerpc/include/asm/nohash/64/pgalloc.h     | 60 ------------------------
 arch/powerpc/include/asm/nohash/64/pgtable-64k.h | 57 ----------------------
 arch/powerpc/include/asm/nohash/64/pgtable.h     |  8 ++--
 4 files changed, 4 insertions(+), 127 deletions(-)
 delete mode 100644 arch/powerpc/include/asm/nohash/64/pgtable-64k.h

diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index cda94a0f5146..e20072972e35 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -230,10 +230,6 @@ typedef struct {
 	unsigned int	id;
 	unsigned int	active;
 	unsigned long	vdso_base;
-#ifdef CONFIG_PPC_64K_PAGES
-	/* for 4K PTE fragment support */
-	void *pte_frag;
-#endif
 } mm_context_t;
 
 /* Page size definitions, common between 32 and 64-bit
@@ -275,8 +271,6 @@ static inline unsigned int mmu_psize_to_shift(unsigned int mmu_psize)
  */
 #if defined(CONFIG_PPC_4K_PAGES)
 #define mmu_virtual_psize	MMU_PAGE_4K
-#elif defined(CONFIG_PPC_64K_PAGES)
-#define mmu_virtual_psize	MMU_PAGE_64K
 #else
 #error Unsupported page size
 #endif
diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index 9721c7867b9c..a6baf3c13bb5 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -52,8 +52,6 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
 	kmem_cache_free(PGT_CACHE(PGD_INDEX_SIZE), pgd);
 }
 
-#ifndef CONFIG_PPC_64K_PAGES
-
 #define pgd_populate(MM, PGD, PUD)	pgd_set(PGD, (unsigned long)PUD)
 
 static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long addr)
@@ -131,64 +129,6 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 	pgtable_free_tlb(tlb, page_address(table), 0);
 }
 
-#else /* if CONFIG_PPC_64K_PAGES */
-
-extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int);
-extern void pte_fragment_free(unsigned long *, int);
-extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
-#ifdef CONFIG_SMP
-extern void __tlb_remove_table(void *_table);
-#endif
-
-#define pud_populate(mm, pud, pmd)	pud_set(pud, (unsigned long)pmd)
-
-static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
-				       pte_t *pte)
-{
-	pmd_set(pmd, (unsigned long)pte);
-}
-
-static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
-				pgtable_t pte_page)
-{
-	pmd_set(pmd, (unsigned long)pte_page);
-}
-
-static inline pgtable_t pmd_pgtable(pmd_t pmd)
-{
-	return (pgtable_t)(pmd_val(pmd) & ~PMD_MASKED_BITS);
-}
-
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
-					  unsigned long address)
-{
-	return (pte_t *)pte_fragment_alloc(mm, address, 1);
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
-					unsigned long address)
-{
-	return (pgtable_t)pte_fragment_alloc(mm, address, 0);
-}
-
-static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
-{
-	pte_fragment_free((unsigned long *)pte, 1);
-}
-
-static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
-{
-	pte_fragment_free((unsigned long *)ptepage, 0);
-}
-
-static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
-				  unsigned long address)
-{
-	tlb_flush_pgtable(tlb, address);
-	pgtable_free_tlb(tlb, table, 0);
-}
-#endif /* CONFIG_PPC_64K_PAGES */
-
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
 	return kmem_cache_alloc(PGT_CACHE(PMD_CACHE_INDEX),
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h b/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
deleted file mode 100644
index 7210c2818e41..000000000000
--- a/arch/powerpc/include/asm/nohash/64/pgtable-64k.h
+++ /dev/null
@@ -1,57 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
-#define _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H
-
-#define __ARCH_USE_5LEVEL_HACK
-#include <asm-generic/pgtable-nopud.h>
-
-
-#define PTE_INDEX_SIZE  8
-#define PMD_INDEX_SIZE  10
-#define PUD_INDEX_SIZE	0
-#define PGD_INDEX_SIZE  12
-
-/*
- * we support 32 fragments per PTE page of 64K size
- */
-#define PTE_FRAG_NR	32
-/*
- * We use a 2K PTE page fragment and another 2K for storing
- * real_pte_t hash index
- */
-#define PTE_FRAG_SIZE_SHIFT  11
-#define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
-
-#ifndef __ASSEMBLY__
-#define PTE_TABLE_SIZE	PTE_FRAG_SIZE
-#define PMD_TABLE_SIZE	(sizeof(pmd_t) << PMD_INDEX_SIZE)
-#define PUD_TABLE_SIZE	(0)
-#define PGD_TABLE_SIZE	(sizeof(pgd_t) << PGD_INDEX_SIZE)
-#endif	/* __ASSEMBLY__ */
-
-#define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
-#define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
-#define PTRS_PER_PGD	(1 << PGD_INDEX_SIZE)
-
-/* PMD_SHIFT determines what a second-level page table entry can map */
-#define PMD_SHIFT	(PAGE_SHIFT + PTE_INDEX_SIZE)
-#define PMD_SIZE	(1UL << PMD_SHIFT)
-#define PMD_MASK	(~(PMD_SIZE-1))
-
-/* PGDIR_SHIFT determines what a third-level page table entry can map */
-#define PGDIR_SHIFT	(PMD_SHIFT + PMD_INDEX_SIZE)
-#define PGDIR_SIZE	(1UL << PGDIR_SHIFT)
-#define PGDIR_MASK	(~(PGDIR_SIZE-1))
-
-/*
- * Bits to mask out from a PMD to get to the PTE page
- * PMDs point to PTE table fragments which are PTE_FRAG_SIZE aligned.
- */
-#define PMD_MASKED_BITS		(PTE_FRAG_SIZE - 1)
-/* Bits to mask out from a PGD/PUD to get to the PMD page */
-#define PUD_MASKED_BITS		0x1ff
-
-#define pgd_pte(pgd)	(pud_pte(((pud_t){ pgd })))
-#define pte_pgd(pte)	((pgd_t)pte_pud(pte))
-
-#endif /* _ASM_POWERPC_NOHASH_64_PGTABLE_64K_H */
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index 5c5f75d005ad..404283d9ec6b 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -6,13 +6,13 @@
  * the ppc64 hashed page table.
  */
 
-#ifdef CONFIG_PPC_64K_PAGES
-#include <asm/nohash/64/pgtable-64k.h>
-#else
 #include <asm/nohash/64/pgtable-4k.h>
-#endif
 #include <asm/barrier.h>
 
+#ifdef CONFIG_PPC_64K_PAGES
+#error "Page size not supported"
+#endif
+
 #define FIRST_USER_ADDRESS	0UL
 
 /*
-- 
2.14.3

* [PATCH V1 06/11] powerpc/mm/nohash: Remove pte fragment dependency from nohash
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 05/11] powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 07/11] powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Now that we have removed 64K page size support, the RCU page table free can be much
simpler for nohash. Make a copy of the RCU callback in the pgalloc.h header, similar
to nohash 32. We could possibly merge the 32-bit and 64-bit versions there, but
that is for a later patch.

We also move the book3s-specific handlers to pgtable-book3s64.c. They will be
updated in a later patch to handle the split pmd ptlock.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/nohash/64/pgalloc.h |  57 +++++++++++---
 arch/powerpc/mm/pgtable-book3s64.c           | 114 +++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable_64.c                 | 114 ---------------------------
 3 files changed, 159 insertions(+), 126 deletions(-)

diff --git a/arch/powerpc/include/asm/nohash/64/pgalloc.h b/arch/powerpc/include/asm/nohash/64/pgalloc.h
index a6baf3c13bb5..21624ff1f065 100644
--- a/arch/powerpc/include/asm/nohash/64/pgalloc.h
+++ b/arch/powerpc/include/asm/nohash/64/pgalloc.h
@@ -84,6 +84,18 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
 
 #define pmd_pgtable(pmd) pmd_page(pmd)
 
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+	return kmem_cache_alloc(PGT_CACHE(PMD_CACHE_INDEX),
+			pgtable_gfp_flags(mm, GFP_KERNEL));
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+	kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
+}
+
+
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
 					  unsigned long address)
 {
@@ -118,26 +130,47 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t ptepage)
 	__free_page(ptepage);
 }
 
-extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
+static inline void pgtable_free(void *table, int shift)
+{
+	if (!shift) {
+		pgtable_page_dtor(table);
+		free_page((unsigned long)table);
+	} else {
+		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+		kmem_cache_free(PGT_CACHE(shift), table);
+	}
+}
+
 #ifdef CONFIG_SMP
-extern void __tlb_remove_table(void *_table);
-#endif
-static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
-				  unsigned long address)
+static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
 {
-	tlb_flush_pgtable(tlb, address);
-	pgtable_free_tlb(tlb, page_address(table), 0);
+	unsigned long pgf = (unsigned long)table;
+
+	BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+	pgf |= shift;
+	tlb_remove_table(tlb, (void *)pgf);
 }
 
-static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+static inline void __tlb_remove_table(void *_table)
 {
-	return kmem_cache_alloc(PGT_CACHE(PMD_CACHE_INDEX),
-			pgtable_gfp_flags(mm, GFP_KERNEL));
+	void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+	unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+	pgtable_free(table, shift);
 }
 
-static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+#else
+static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
 {
-	kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
+	pgtable_free(table, shift);
+}
+#endif
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
+				  unsigned long address)
+{
+	tlb_flush_pgtable(tlb, address);
+	pgtable_free_tlb(tlb, page_address(table), 0);
 }
 
 #define __pmd_free_tlb(tlb, pmd, addr)		      \
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index e1c304183172..e4e1b2d4ca27 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -225,3 +225,117 @@ void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
 	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
 }
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
+#ifdef CONFIG_PPC_64K_PAGES
+static pte_t *get_pte_from_cache(struct mm_struct *mm)
+{
+	void *pte_frag, *ret;
+
+	spin_lock(&mm->page_table_lock);
+	ret = mm->context.pte_frag;
+	if (ret) {
+		pte_frag = ret + PTE_FRAG_SIZE;
+		/*
+		 * If we have taken up all the fragments mark PTE page NULL
+		 */
+		if (((unsigned long)pte_frag & ~PAGE_MASK) == 0)
+			pte_frag = NULL;
+		mm->context.pte_frag = pte_frag;
+	}
+	spin_unlock(&mm->page_table_lock);
+	return (pte_t *)ret;
+}
+
+static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
+{
+	void *ret = NULL;
+	struct page *page;
+
+	if (!kernel) {
+		page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT);
+		if (!page)
+			return NULL;
+		if (!pgtable_page_ctor(page)) {
+			__free_page(page);
+			return NULL;
+		}
+	} else {
+		page = alloc_page(PGALLOC_GFP);
+		if (!page)
+			return NULL;
+	}
+
+	ret = page_address(page);
+	spin_lock(&mm->page_table_lock);
+	/*
+	 * If we find pgtable_page set, we return
+	 * the allocated page with single fragement
+	 * count.
+	 */
+	if (likely(!mm->context.pte_frag)) {
+		set_page_count(page, PTE_FRAG_NR);
+		mm->context.pte_frag = ret + PTE_FRAG_SIZE;
+	}
+	spin_unlock(&mm->page_table_lock);
+
+	return (pte_t *)ret;
+}
+
+pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
+{
+	pte_t *pte;
+
+	pte = get_pte_from_cache(mm);
+	if (pte)
+		return pte;
+
+	return __alloc_for_ptecache(mm, kernel);
+}
+
+#endif /* CONFIG_PPC_64K_PAGES */
+
+void pte_fragment_free(unsigned long *table, int kernel)
+{
+	struct page *page = virt_to_page(table);
+
+	if (put_page_testzero(page)) {
+		if (!kernel)
+			pgtable_page_dtor(page);
+		free_unref_page(page);
+	}
+}
+
+#ifdef CONFIG_SMP
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+{
+	unsigned long pgf = (unsigned long)table;
+
+	BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+	pgf |= shift;
+	tlb_remove_table(tlb, (void *)pgf);
+}
+
+void __tlb_remove_table(void *_table)
+{
+	void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
+	unsigned int shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+
+	if (!shift)
+		/* PTE page needs special handling */
+		pte_fragment_free(table, 0);
+	else {
+		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+		kmem_cache_free(PGT_CACHE(shift), table);
+	}
+}
+#else
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+{
+	if (!shift) {
+		/* PTE page needs special handling */
+		pte_fragment_free(table, 0);
+	} else {
+		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
+		kmem_cache_free(PGT_CACHE(shift), table);
+	}
+}
+#endif
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 3873f94a3ae9..f0208f8f9dbb 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -313,120 +313,6 @@ struct page *pmd_page(pmd_t pmd)
 	return virt_to_page(pmd_page_vaddr(pmd));
 }
 
-#ifdef CONFIG_PPC_64K_PAGES
-static pte_t *get_pte_from_cache(struct mm_struct *mm)
-{
-	void *pte_frag, *ret;
-
-	spin_lock(&mm->page_table_lock);
-	ret = mm->context.pte_frag;
-	if (ret) {
-		pte_frag = ret + PTE_FRAG_SIZE;
-		/*
-		 * If we have taken up all the fragments mark PTE page NULL
-		 */
-		if (((unsigned long)pte_frag & ~PAGE_MASK) == 0)
-			pte_frag = NULL;
-		mm->context.pte_frag = pte_frag;
-	}
-	spin_unlock(&mm->page_table_lock);
-	return (pte_t *)ret;
-}
-
-static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
-{
-	void *ret = NULL;
-	struct page *page;
-
-	if (!kernel) {
-		page = alloc_page(PGALLOC_GFP | __GFP_ACCOUNT);
-		if (!page)
-			return NULL;
-		if (!pgtable_page_ctor(page)) {
-			__free_page(page);
-			return NULL;
-		}
-	} else {
-		page = alloc_page(PGALLOC_GFP);
-		if (!page)
-			return NULL;
-	}
-
-	ret = page_address(page);
-	spin_lock(&mm->page_table_lock);
-	/*
-	 * If we find pgtable_page set, we return
-	 * the allocated page with single fragement
-	 * count.
-	 */
-	if (likely(!mm->context.pte_frag)) {
-		set_page_count(page, PTE_FRAG_NR);
-		mm->context.pte_frag = ret + PTE_FRAG_SIZE;
-	}
-	spin_unlock(&mm->page_table_lock);
-
-	return (pte_t *)ret;
-}
-
-pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel)
-{
-	pte_t *pte;
-
-	pte = get_pte_from_cache(mm);
-	if (pte)
-		return pte;
-
-	return __alloc_for_ptecache(mm, kernel);
-}
-
-#endif /* CONFIG_PPC_64K_PAGES */
-
-void pte_fragment_free(unsigned long *table, int kernel)
-{
-	struct page *page = virt_to_page(table);
-	if (put_page_testzero(page)) {
-		if (!kernel)
-			pgtable_page_dtor(page);
-		free_unref_page(page);
-	}
-}
-
-#ifdef CONFIG_SMP
-void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
-{
-	unsigned long pgf = (unsigned long)table;
-
-	BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-	pgf |= shift;
-	tlb_remove_table(tlb, (void *)pgf);
-}
-
-void __tlb_remove_table(void *_table)
-{
-	void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
-	unsigned shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
-
-	if (!shift)
-		/* PTE page needs special handling */
-		pte_fragment_free(table, 0);
-	else {
-		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-		kmem_cache_free(PGT_CACHE(shift), table);
-	}
-}
-#else
-void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
-{
-	if (!shift) {
-		/* PTE page needs special handling */
-		pte_fragment_free(table, 0);
-	} else {
-		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-		kmem_cache_free(PGT_CACHE(shift), table);
-	}
-}
-#endif
-
 #ifdef CONFIG_STRICT_KERNEL_RWX
 void mark_rodata_ro(void)
 {
-- 
2.14.3

* [PATCH V1 07/11] powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 06/11] powerpc/mm/nohash: Remove pte fragment dependency from nohash Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 08/11] powerpc/book3s64/mm: Simplify the rcu callback for page table free Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

The 4K config uses one full page at level 4 of the page table. Add support for
single-fragment allocation in the page table fragment code and use that for the 4K
config. This makes both the 4K and 64K configs use the same code path. Later we
will switch pmd to use the page table fragment code as well. This is done only for
64-bit platforms, which are the ones using page table fragment support.
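
A worked example of the single-fragment case, assuming H_PTE_INDEX_SIZE = 9 for
the 4K hash config (the value is an assumption here; the formula is from the hunk
below):

/*
 * H_PTE_FRAG_SIZE_SHIFT = H_PTE_INDEX_SIZE + 3 = 9 + 3 = 12
 * fragment size         = 1 << 12 = 4096 bytes = PAGE_SIZE
 * H_PTE_FRAG_NR         = PAGE_SIZE >> 12 = 1
 *
 * One fragment per page, so the PTE_FRAG_NR == 1 fast path in
 * __alloc_for_ptecache() just returns the whole page, matching the old
 * CONFIG_PPC_4K_PAGES pte_alloc_one() behaviour.
 */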

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h |  6 ++++--
 arch/powerpc/include/asm/book3s/64/mmu.h     |  6 +++---
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 26 --------------------------
 arch/powerpc/mm/mmu_context_book3s64.c       | 10 ----------
 arch/powerpc/mm/pgtable-book3s64.c           | 11 ++++++++---
 5 files changed, 15 insertions(+), 44 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 4b5423030d4b..00c4db2a7682 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -38,8 +38,10 @@
 #define H_PAGE_4K_PFN	0x0
 #define H_PAGE_THP_HUGE 0x0
 #define H_PAGE_COMBO	0x0
-#define H_PTE_FRAG_NR	0
-#define H_PTE_FRAG_SIZE_SHIFT  0
+
+/* 8 bytes per each pte entry */
+#define H_PTE_FRAG_SIZE_SHIFT  (H_PTE_INDEX_SIZE + 3)
+#define H_PTE_FRAG_NR	(PAGE_SIZE >> H_PTE_FRAG_SIZE_SHIFT)
 
 /* memory key bits, only 8 keys supported */
 #define H_PTE_PKEY_BIT0	0
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 5094696eecd6..fde7803a2261 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -134,10 +134,10 @@ typedef struct {
 #ifdef CONFIG_PPC_SUBPAGE_PROT
 	struct subpage_prot_table spt;
 #endif /* CONFIG_PPC_SUBPAGE_PROT */
-#ifdef CONFIG_PPC_64K_PAGES
-	/* for 4K PTE fragment support */
+	/*
+	 * pagetable fragment support
+	 */
 	void *pte_frag;
-#endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 558a159600ad..826171568192 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -173,31 +173,6 @@ static inline pgtable_t pmd_pgtable(pmd_t pmd)
 	return (pgtable_t)pmd_page_vaddr(pmd);
 }
 
-#ifdef CONFIG_PPC_4K_PAGES
-static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
-					  unsigned long address)
-{
-	return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
-}
-
-static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
-				      unsigned long address)
-{
-	struct page *page;
-	pte_t *pte;
-
-	pte = (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO | __GFP_ACCOUNT);
-	if (!pte)
-		return NULL;
-	page = virt_to_page(pte);
-	if (!pgtable_page_ctor(page)) {
-		__free_page(page);
-		return NULL;
-	}
-	return pte;
-}
-#else /* if CONFIG_PPC_64K_PAGES */
-
 static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm,
 					  unsigned long address)
 {
@@ -209,7 +184,6 @@ static inline pgtable_t pte_alloc_one(struct mm_struct *mm,
 {
 	return (pgtable_t)pte_fragment_alloc(mm, address, 0);
 }
-#endif
 
 static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
 {
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index b75194dff64c..87ee78973a35 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -159,9 +159,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 
 	mm->context.id = index;
 
-#ifdef CONFIG_PPC_64K_PAGES
 	mm->context.pte_frag = NULL;
-#endif
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	mm_iommu_init(mm);
 #endif
@@ -192,7 +190,6 @@ static void destroy_contexts(mm_context_t *ctx)
 	spin_unlock(&mmu_context_lock);
 }
 
-#ifdef CONFIG_PPC_64K_PAGES
 static void destroy_pagetable_page(struct mm_struct *mm)
 {
 	int count;
@@ -213,13 +210,6 @@ static void destroy_pagetable_page(struct mm_struct *mm)
 	}
 }
 
-#else
-static inline void destroy_pagetable_page(struct mm_struct *mm)
-{
-	return;
-}
-#endif
-
 void destroy_context(struct mm_struct *mm)
 {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index e4e1b2d4ca27..fc42cccb96c7 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -225,7 +225,7 @@ void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
 	asm volatile("eieio; tlbsync; ptesync" : : : "memory");
 }
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
-#ifdef CONFIG_PPC_64K_PAGES
+
 static pte_t *get_pte_from_cache(struct mm_struct *mm)
 {
 	void *pte_frag, *ret;
@@ -264,7 +264,14 @@ static pte_t *__alloc_for_ptecache(struct mm_struct *mm, int kernel)
 			return NULL;
 	}
 
+
 	ret = page_address(page);
+	/*
+	 * if we support only one fragment just return the
+	 * allocated page.
+	 */
+	if (PTE_FRAG_NR == 1)
+		return ret;
 	spin_lock(&mm->page_table_lock);
 	/*
 	 * If we find pgtable_page set, we return
@@ -291,8 +298,6 @@ pte_t *pte_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr, int kernel
 	return __alloc_for_ptecache(mm, kernel);
 }
 
-#endif /* CONFIG_PPC_64K_PAGES */
-
 void pte_fragment_free(unsigned long *table, int kernel)
 {
 	struct page *page = virt_to_page(table);
-- 
2.14.3

* [PATCH V1 08/11] powerpc/book3s64/mm: Simplify the rcu callback for page table free
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 07/11] powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 09/11] powerpc/mm: Implement helpers for pagetable fragment support at PMD level Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Instead of encoding the shift in the table address, use an enumerated index value.
This allows us to do different things in the callback for pte and pmd tables.
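
The index travels in the low bits of the table pointer, just as the shift did:
pgtable_free_tlb() ORs it in (page tables are aligned, so those bits are zero) and
__tlb_remove_table() masks it back out. A standalone sketch of the trick, with a
hypothetical mask value:

#include <assert.h>
#include <stdint.h>

#define INDEX_MASK 0xfUL	/* stand-in for MAX_PGTABLE_INDEX_SIZE */

/* Tag an aligned table pointer with a small enum index. */
static void *stash_index(void *table, unsigned long index)
{
	assert(((uintptr_t)table & INDEX_MASK) == 0 && index <= INDEX_MASK);
	return (void *)((uintptr_t)table | index);
}

/* Recover the clean pointer and the index in the (RCU) callback. */
static void *unstash_index(void *tagged, unsigned long *index)
{
	*index = (uintptr_t)tagged & INDEX_MASK;
	return (void *)((uintptr_t)tagged & ~INDEX_MASK);
}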

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgalloc.h | 10 +++----
 arch/powerpc/include/asm/book3s/64/pgtable.h | 10 +++++++
 arch/powerpc/mm/pgtable-book3s64.c           | 45 ++++++++++++++++------------
 3 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 826171568192..ed313b8d3fac 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -124,14 +124,14 @@ static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
 }
 
 static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
-                                  unsigned long address)
+				  unsigned long address)
 {
 	/*
 	 * By now all the pud entries should be none entries. So go
 	 * ahead and flush the page walk cache
 	 */
 	flush_tlb_pgtable(tlb, address);
-	pgtable_free_tlb(tlb, pud, PUD_CACHE_INDEX);
+	pgtable_free_tlb(tlb, pud, PUD_INDEX);
 }
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
@@ -146,14 +146,14 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 }
 
 static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
-                                  unsigned long address)
+				  unsigned long address)
 {
 	/*
 	 * By now all the pud entries should be none entries. So go
 	 * ahead and flush the page walk cache
 	 */
 	flush_tlb_pgtable(tlb, address);
-        return pgtable_free_tlb(tlb, pmd, PMD_CACHE_INDEX);
+	return pgtable_free_tlb(tlb, pmd, PMD_INDEX);
 }
 
 static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
@@ -203,7 +203,7 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t table,
 	 * ahead and flush the page walk cache
 	 */
 	flush_tlb_pgtable(tlb, address);
-	pgtable_free_tlb(tlb, table, 0);
+	pgtable_free_tlb(tlb, table, PTE_INDEX);
 }
 
 #define check_pgt_cache()	do { } while (0)
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 47b5ffc8715d..9462bc18806c 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -273,6 +273,16 @@ extern unsigned long __pte_frag_size_shift;
 /* Bits to mask out from a PGD to get to the PUD page */
 #define PGD_MASKED_BITS		0xc0000000000000ffUL
 
+/*
+ * Used as an indicator for rcu callback functions
+ */
+enum pgtable_index {
+	PTE_INDEX = 0,
+	PMD_INDEX,
+	PUD_INDEX,
+	PGD_INDEX,
+};
+
 extern unsigned long __vmalloc_start;
 extern unsigned long __vmalloc_end;
 #define VMALLOC_START	__vmalloc_start
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index fc42cccb96c7..0a05e99b54a1 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -309,38 +309,45 @@ void pte_fragment_free(unsigned long *table, int kernel)
 	}
 }
 
+static inline void pgtable_free(void *table, int index)
+{
+	switch (index) {
+	case PTE_INDEX:
+		pte_fragment_free(table, 0);
+		break;
+	case PMD_INDEX:
+		kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), table);
+		break;
+	case PUD_INDEX:
+		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
+		break;
+		/* We don't free pgd table via RCU callback */
+	default:
+		BUG();
+	}
+}
+
 #ifdef CONFIG_SMP
-void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index)
 {
 	unsigned long pgf = (unsigned long)table;
 
-	BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-	pgf |= shift;
+	BUG_ON(index > MAX_PGTABLE_INDEX_SIZE);
+	pgf |= index;
 	tlb_remove_table(tlb, (void *)pgf);
 }
 
 void __tlb_remove_table(void *_table)
 {
 	void *table = (void *)((unsigned long)_table & ~MAX_PGTABLE_INDEX_SIZE);
-	unsigned int shift = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
+	unsigned int index = (unsigned long)_table & MAX_PGTABLE_INDEX_SIZE;
 
-	if (!shift)
-		/* PTE page needs special handling */
-		pte_fragment_free(table, 0);
-	else {
-		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-		kmem_cache_free(PGT_CACHE(shift), table);
-	}
+	return pgtable_free(table, index);
 }
 #else
-void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift)
+void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index)
 {
-	if (!shift) {
-		/* PTE page needs special handling */
-		pte_fragment_free(table, 0);
-	} else {
-		BUG_ON(shift > MAX_PGTABLE_INDEX_SIZE);
-		kmem_cache_free(PGT_CACHE(shift), table);
-	}
+
+	return pgtable_free(table, index);
 }
 #endif
-- 
2.14.3

* [PATCH V1 09/11] powerpc/mm: Implement helpers for pagetable fragment support at PMD level
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (8 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 08/11] powerpc/book3s64/mm: Simplify the rcu callback for page table free Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 10/11] powerpc/mm: Use page fragments for allocation page table " Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h   |  2 +
 arch/powerpc/include/asm/book3s/64/hash-64k.h  |  7 +++
 arch/powerpc/include/asm/book3s/64/mmu.h       |  1 +
 arch/powerpc/include/asm/book3s/64/pgalloc.h   |  2 +
 arch/powerpc/include/asm/book3s/64/pgtable.h   |  6 ++
 arch/powerpc/include/asm/book3s/64/radix-4k.h  |  3 +
 arch/powerpc/include/asm/book3s/64/radix-64k.h |  4 ++
 arch/powerpc/mm/hash_utils_64.c                |  2 +
 arch/powerpc/mm/mmu_context_book3s64.c         | 37 ++++++++++--
 arch/powerpc/mm/pgtable-book3s64.c             | 84 ++++++++++++++++++++++++++
 arch/powerpc/mm/pgtable-radix.c                |  2 +
 11 files changed, 144 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 00c4db2a7682..9a3798660cef 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -42,6 +42,8 @@
 /* 8 bytes per each pte entry */
 #define H_PTE_FRAG_SIZE_SHIFT  (H_PTE_INDEX_SIZE + 3)
 #define H_PTE_FRAG_NR	(PAGE_SIZE >> H_PTE_FRAG_SIZE_SHIFT)
+#define H_PMD_FRAG_SIZE_SHIFT  (H_PMD_INDEX_SIZE + 3)
+#define H_PMD_FRAG_NR	(PAGE_SIZE >> H_PMD_FRAG_SIZE_SHIFT)
 
 /* memory key bits, only 8 keys supported */
 #define H_PTE_PKEY_BIT0	0
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index cc82745355b3..c81793d47af9 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -46,6 +46,13 @@
 #define H_PTE_FRAG_SIZE_SHIFT  (H_PTE_INDEX_SIZE + 3 + 1)
 #define H_PTE_FRAG_NR	(PAGE_SIZE >> H_PTE_FRAG_SIZE_SHIFT)
 
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLB_PAGE)
+#define H_PMD_FRAG_SIZE_SHIFT  (H_PMD_INDEX_SIZE + 3 + 1)
+#else
+#define H_PMD_FRAG_SIZE_SHIFT  (H_PMD_INDEX_SIZE + 3)
+#endif
+#define H_PMD_FRAG_NR	(PAGE_SIZE >> H_PMD_FRAG_SIZE_SHIFT)
+
 #ifndef __ASSEMBLY__
 #include <asm/errno.h>
 
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index fde7803a2261..9c8c669a6b6a 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -138,6 +138,7 @@ typedef struct {
 	 * pagetable fragment support
 	 */
 	void *pte_frag;
+	void *pmd_frag;
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	struct list_head iommu_group_mem_list;
 #endif
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index ed313b8d3fac..005f400cbf30 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -42,7 +42,9 @@ extern struct kmem_cache *pgtable_cache[];
 		})
 
 extern pte_t *pte_fragment_alloc(struct mm_struct *, unsigned long, int);
+extern pmd_t *pmd_fragment_alloc(struct mm_struct *, unsigned long);
 extern void pte_fragment_free(unsigned long *, int);
+extern void pmd_fragment_free(unsigned long *);
 extern void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift);
 #ifdef CONFIG_SMP
 extern void __tlb_remove_table(void *_table);
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 9462bc18806c..c9db19512b3c 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -246,6 +246,12 @@ extern unsigned long __pte_frag_size_shift;
 #define PTE_FRAG_SIZE_SHIFT __pte_frag_size_shift
 #define PTE_FRAG_SIZE (1UL << PTE_FRAG_SIZE_SHIFT)
 
+extern unsigned long __pmd_frag_nr;
+#define PMD_FRAG_NR __pmd_frag_nr
+extern unsigned long __pmd_frag_size_shift;
+#define PMD_FRAG_SIZE_SHIFT __pmd_frag_size_shift
+#define PMD_FRAG_SIZE (1UL << PMD_FRAG_SIZE_SHIFT)
+
 #define PTRS_PER_PTE	(1 << PTE_INDEX_SIZE)
 #define PTRS_PER_PMD	(1 << PMD_INDEX_SIZE)
 #define PTRS_PER_PUD	(1 << PUD_INDEX_SIZE)
diff --git a/arch/powerpc/include/asm/book3s/64/radix-4k.h b/arch/powerpc/include/asm/book3s/64/radix-4k.h
index ca366ec86310..863c3e8286fb 100644
--- a/arch/powerpc/include/asm/book3s/64/radix-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/radix-4k.h
@@ -15,4 +15,7 @@
 #define RADIX_PTE_FRAG_SIZE_SHIFT  (RADIX_PTE_INDEX_SIZE + 3)
 #define RADIX_PTE_FRAG_NR	(PAGE_SIZE >> RADIX_PTE_FRAG_SIZE_SHIFT)
 
+#define RADIX_PMD_FRAG_SIZE_SHIFT  (RADIX_PMD_INDEX_SIZE + 3)
+#define RADIX_PMD_FRAG_NR	(PAGE_SIZE >> RADIX_PMD_FRAG_SIZE_SHIFT)
+
 #endif /* _ASM_POWERPC_PGTABLE_RADIX_4K_H */
diff --git a/arch/powerpc/include/asm/book3s/64/radix-64k.h b/arch/powerpc/include/asm/book3s/64/radix-64k.h
index 830082496876..ccb78ca9d0c5 100644
--- a/arch/powerpc/include/asm/book3s/64/radix-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/radix-64k.h
@@ -16,4 +16,8 @@
  */
 #define RADIX_PTE_FRAG_SIZE_SHIFT  (RADIX_PTE_INDEX_SIZE + 3)
 #define RADIX_PTE_FRAG_NR	(PAGE_SIZE >> RADIX_PTE_FRAG_SIZE_SHIFT)
+
+#define RADIX_PMD_FRAG_SIZE_SHIFT  (RADIX_PMD_INDEX_SIZE + 3)
+#define RADIX_PMD_FRAG_NR	(PAGE_SIZE >> RADIX_PMD_FRAG_SIZE_SHIFT)
+
 #endif /* _ASM_POWERPC_PGTABLE_RADIX_64K_H */
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0bd3790d35df..63b1c1882e22 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1010,6 +1010,8 @@ void __init hash__early_init_mmu(void)
 	 */
 	__pte_frag_nr = H_PTE_FRAG_NR;
 	__pte_frag_size_shift = H_PTE_FRAG_SIZE_SHIFT;
+	__pmd_frag_nr = H_PMD_FRAG_NR;
+	__pmd_frag_size_shift = H_PMD_FRAG_SIZE_SHIFT;
 
 	__pte_index_size = H_PTE_INDEX_SIZE;
 	__pmd_index_size = H_PMD_INDEX_SIZE;
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 87ee78973a35..f3d4b4a0e561 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -160,6 +160,7 @@ int init_new_context(struct task_struct *tsk, struct mm_struct *mm)
 	mm->context.id = index;
 
 	mm->context.pte_frag = NULL;
+	mm->context.pmd_frag = NULL;
 #ifdef CONFIG_SPAPR_TCE_IOMMU
 	mm_iommu_init(mm);
 #endif
@@ -190,16 +191,11 @@ static void destroy_contexts(mm_context_t *ctx)
 	spin_unlock(&mmu_context_lock);
 }
 
-static void destroy_pagetable_page(struct mm_struct *mm)
+static void pte_frag_destroy(void *pte_frag)
 {
 	int count;
-	void *pte_frag;
 	struct page *page;
 
-	pte_frag = mm->context.pte_frag;
-	if (!pte_frag)
-		return;
-
 	page = virt_to_page(pte_frag);
 	/* drop all the pending references */
 	count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
@@ -210,6 +206,35 @@ static void destroy_pagetable_page(struct mm_struct *mm)
 	}
 }
 
+static void pmd_frag_destroy(void *pmd_frag)
+{
+	int count;
+	struct page *page;
+
+	page = virt_to_page(pmd_frag);
+	/* drop all the pending references */
+	count = ((unsigned long)pmd_frag & ~PAGE_MASK) >> PMD_FRAG_SIZE_SHIFT;
+	/* We allow PMD_FRAG_NR fragments from a PMD page */
+	if (page_ref_sub_and_test(page, PMD_FRAG_NR - count)) {
+		pgtable_pmd_page_dtor(page);
+		free_unref_page(page);
+	}
+}
+
+static void destroy_pagetable_page(struct mm_struct *mm)
+{
+	void *frag;
+
+	frag = mm->context.pte_frag;
+	if (frag)
+		pte_frag_destroy(frag);
+
+	frag = mm->context.pmd_frag;
+	if (frag)
+		pmd_frag_destroy(frag);
+	return;
+}
+
 void destroy_context(struct mm_struct *mm)
 {
 #ifdef CONFIG_SPAPR_TCE_IOMMU
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 0a05e99b54a1..47323ed8d7b5 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -20,6 +20,11 @@
 #include "mmu_decl.h"
 #include <trace/events/thp.h>
 
+unsigned long __pmd_frag_nr;
+EXPORT_SYMBOL(__pmd_frag_nr);
+unsigned long __pmd_frag_size_shift;
+EXPORT_SYMBOL(__pmd_frag_size_shift);
+
 int (*register_process_table)(unsigned long base, unsigned long page_size,
 			      unsigned long tbl_size);
 
@@ -226,6 +231,85 @@ void mmu_partition_table_set_entry(unsigned int lpid, unsigned long dw0,
 }
 EXPORT_SYMBOL_GPL(mmu_partition_table_set_entry);
 
+static pmd_t *get_pmd_from_cache(struct mm_struct *mm)
+{
+	void *pmd_frag, *ret;
+
+	spin_lock(&mm->page_table_lock);
+	ret = mm->context.pmd_frag;
+	if (ret) {
+		pmd_frag = ret + PMD_FRAG_SIZE;
+		/*
+	 * If we have taken up all the fragments, mark the cached pmd_frag NULL
+		 */
+		if (((unsigned long)pmd_frag & ~PAGE_MASK) == 0)
+			pmd_frag = NULL;
+		mm->context.pmd_frag = pmd_frag;
+	}
+	spin_unlock(&mm->page_table_lock);
+	return (pmd_t *)ret;
+}
+
+static pmd_t *__alloc_for_pmdcache(struct mm_struct *mm)
+{
+	void *ret = NULL;
+	struct page *page;
+	gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO;
+
+	if (mm == &init_mm)
+		gfp &= ~__GFP_ACCOUNT;
+	page = alloc_page(gfp);
+	if (!page)
+		return NULL;
+	if (!pgtable_pmd_page_ctor(page)) {
+		__free_pages(page, 0);
+		return NULL;
+	}
+
+	ret = page_address(page);
+	/*
+	 * If we support only one fragment, just return the
+	 * allocated page.
+	 */
+	if (PMD_FRAG_NR == 1)
+		return ret;
+
+	spin_lock(&mm->page_table_lock);
+	/*
+	 * If we find mm->context.pmd_frag already set (we raced
+	 * with another thread), return the allocated page with
+	 * a single fragment count.
+	 */
+	if (likely(!mm->context.pmd_frag)) {
+		set_page_count(page, PMD_FRAG_NR);
+		mm->context.pmd_frag = ret + PMD_FRAG_SIZE;
+	}
+	spin_unlock(&mm->page_table_lock);
+
+	return (pmd_t *)ret;
+}
+
+pmd_t *pmd_fragment_alloc(struct mm_struct *mm, unsigned long vmaddr)
+{
+	pmd_t *pmd;
+
+	pmd = get_pmd_from_cache(mm);
+	if (pmd)
+		return pmd;
+
+	return __alloc_for_pmdcache(mm);
+}
+
+void pmd_fragment_free(unsigned long *pmd)
+{
+	struct page *page = virt_to_page(pmd);
+
+	if (put_page_testzero(page)) {
+		pgtable_pmd_page_dtor(page);
+		free_unref_page(page);
+	}
+}
+
 static pte_t *get_pte_from_cache(struct mm_struct *mm)
 {
 	void *pte_frag, *ret;
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 473415750cbf..32e58024e7cb 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -640,6 +640,8 @@ void __init radix__early_init_mmu(void)
 #endif
 	__pte_frag_nr = RADIX_PTE_FRAG_NR;
 	__pte_frag_size_shift = RADIX_PTE_FRAG_SIZE_SHIFT;
+	__pmd_frag_nr = RADIX_PMD_FRAG_NR;
+	__pmd_frag_size_shift = RADIX_PMD_FRAG_SIZE_SHIFT;
 
 	if (!firmware_has_feature(FW_FEATURE_LPAR)) {
 		radix_init_native();
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread
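
The allocator in patch 09 above carves an order-0 page into PMD_FRAG_NR
fragments of PMD_FRAG_SIZE bytes (8 bytes per PMD entry, hence the "+ 3" in
the shift macros) and remembers the next free fragment in
mm->context.pmd_frag. A user-space model of that carving (refcounting and
locking omitted; the geometry of 64K pages with a 9-bit PMD index is assumed
for illustration, giving 4K fragments, 16 per page):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE		(64 * 1024)
#define PMD_INDEX_SIZE		9			/* assumed */
#define PMD_FRAG_SIZE_SHIFT	(PMD_INDEX_SIZE + 3)	/* 8 bytes per entry */
#define PMD_FRAG_SIZE		(1UL << PMD_FRAG_SIZE_SHIFT)
#define PMD_FRAG_NR		(PAGE_SIZE >> PMD_FRAG_SIZE_SHIFT)

struct toy_mm {
	void *pmd_frag;		/* next unhanded fragment, or NULL */
};

/* Fast path: hand out the next fragment of the cached page, if any. */
static void *get_pmd_from_cache(struct toy_mm *mm)
{
	void *ret = mm->pmd_frag;

	if (ret) {
		void *next = (char *)ret + PMD_FRAG_SIZE;
		/* All fragments used once we hit the next page boundary. */
		if (((uintptr_t)next & (PAGE_SIZE - 1)) == 0)
			next = NULL;
		mm->pmd_frag = next;
	}
	return ret;
}

static void *pmd_fragment_alloc(struct toy_mm *mm)
{
	void *page, *ret = get_pmd_from_cache(mm);

	if (ret)
		return ret;
	/*
	 * Slow path: a fresh zeroed page; stash the second fragment.
	 * (Exhausted pages are leaked in this toy; the kernel refcounts
	 * the backing page per fragment instead.)
	 */
	page = aligned_alloc(PAGE_SIZE, PAGE_SIZE);
	memset(page, 0, PAGE_SIZE);
	mm->pmd_frag = (char *)page + PMD_FRAG_SIZE;
	return page;
}

int main(void)
{
	struct toy_mm mm = { 0 };

	printf("frag size %lu, %lu frags per page\n",
	       PMD_FRAG_SIZE, (unsigned long)PMD_FRAG_NR);
	for (int i = 0; i < PMD_FRAG_NR + 1; i++)	/* 17th takes a new page */
		printf("pmd %d -> %p\n", i, pmd_fragment_alloc(&mm));
	return 0;
}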

* [PATCH V1 10/11] powerpc/mm: Use page fragments for allocation page table at PMD level
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (9 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 09/11] powerpc/mm: Implement helpers for pagetable fragment support at PMD level Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-16 11:27 ` [PATCH V1 11/11] powerpc/book3s64: Enable split pmd ptlock Aneesh Kumar K.V
  2018-04-17 22:43 ` [PATCH V1 00/11] powerpc/mm/book3s64: Support for " Balbir Singh
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h    | 10 ----------
 arch/powerpc/include/asm/book3s/64/pgalloc.h |  8 +++-----
 arch/powerpc/include/asm/book3s/64/pgtable.h |  4 ++--
 arch/powerpc/mm/hash_utils_64.c              |  1 -
 arch/powerpc/mm/pgtable-book3s64.c           |  3 +--
 arch/powerpc/mm/pgtable-radix.c              |  1 -
 arch/powerpc/mm/pgtable_64.c                 |  2 --
 7 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index cc8cd656ccfe..0387b155f13d 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -23,16 +23,6 @@
 				 H_PUD_INDEX_SIZE + H_PGD_INDEX_SIZE + PAGE_SHIFT)
 #define H_PGTABLE_RANGE		(ASM_CONST(1) << H_PGTABLE_EADDR_SIZE)
 
-#if (defined(CONFIG_TRANSPARENT_HUGEPAGE) || defined(CONFIG_HUGETLB_PAGE)) && \
-	defined(CONFIG_PPC_64K_PAGES)
-/*
- * only with hash 64k we need to use the second half of pmd page table
- * to store pointer to deposited pgtable_t
- */
-#define H_PMD_CACHE_INDEX	(H_PMD_INDEX_SIZE + 1)
-#else
-#define H_PMD_CACHE_INDEX	H_PMD_INDEX_SIZE
-#endif
 /*
  * We store the slot details in the second half of page table.
  * Increase the pud level table so that hugetlb ptes can be stored
diff --git a/arch/powerpc/include/asm/book3s/64/pgalloc.h b/arch/powerpc/include/asm/book3s/64/pgalloc.h
index 005f400cbf30..01ee40f11f3a 100644
--- a/arch/powerpc/include/asm/book3s/64/pgalloc.h
+++ b/arch/powerpc/include/asm/book3s/64/pgalloc.h
@@ -90,8 +90,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
 	 * need to do this for 4k.
 	 */
 #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_PPC_64K_PAGES) && \
-	((H_PGD_INDEX_SIZE == H_PUD_CACHE_INDEX) ||		     \
-	 (H_PGD_INDEX_SIZE == H_PMD_CACHE_INDEX))
+	(H_PGD_INDEX_SIZE == H_PUD_CACHE_INDEX)
 	memset(pgd, 0, PGD_TABLE_SIZE);
 #endif
 	return pgd;
@@ -138,13 +137,12 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
 
 static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
 {
-	return kmem_cache_alloc(PGT_CACHE(PMD_CACHE_INDEX),
-		pgtable_gfp_flags(mm, GFP_KERNEL));
+	return pmd_fragment_alloc(mm, addr);
 }
 
 static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 {
-	kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), pmd);
+	pmd_fragment_free((unsigned long *)pmd);
 }
 
 static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c9db19512b3c..c233915abb68 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -212,13 +212,13 @@ extern unsigned long __pte_index_size;
 extern unsigned long __pmd_index_size;
 extern unsigned long __pud_index_size;
 extern unsigned long __pgd_index_size;
-extern unsigned long __pmd_cache_index;
 extern unsigned long __pud_cache_index;
 #define PTE_INDEX_SIZE  __pte_index_size
 #define PMD_INDEX_SIZE  __pmd_index_size
 #define PUD_INDEX_SIZE  __pud_index_size
 #define PGD_INDEX_SIZE  __pgd_index_size
-#define PMD_CACHE_INDEX __pmd_cache_index
+/* pmd tables use page table fragments */
+#define PMD_CACHE_INDEX  0
 #define PUD_CACHE_INDEX __pud_cache_index
 /*
  * Because of use of pte fragments and THP, size of page table
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 63b1c1882e22..e25a6b0cd01e 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1018,7 +1018,6 @@ void __init hash__early_init_mmu(void)
 	__pud_index_size = H_PUD_INDEX_SIZE;
 	__pgd_index_size = H_PGD_INDEX_SIZE;
 	__pud_cache_index = H_PUD_CACHE_INDEX;
-	__pmd_cache_index = H_PMD_CACHE_INDEX;
 	__pte_table_size = H_PTE_TABLE_SIZE;
 	__pmd_table_size = H_PMD_TABLE_SIZE;
 	__pud_table_size = H_PUD_TABLE_SIZE;
diff --git a/arch/powerpc/mm/pgtable-book3s64.c b/arch/powerpc/mm/pgtable-book3s64.c
index 47323ed8d7b5..abda2b92f1ba 100644
--- a/arch/powerpc/mm/pgtable-book3s64.c
+++ b/arch/powerpc/mm/pgtable-book3s64.c
@@ -400,7 +400,7 @@ static inline void pgtable_free(void *table, int index)
 		pte_fragment_free(table, 0);
 		break;
 	case PMD_INDEX:
-		kmem_cache_free(PGT_CACHE(PMD_CACHE_INDEX), table);
+		pmd_fragment_free(table);
 		break;
 	case PUD_INDEX:
 		kmem_cache_free(PGT_CACHE(PUD_CACHE_INDEX), table);
@@ -431,7 +431,6 @@ void __tlb_remove_table(void *_table)
 #else
 void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int index)
 {
-
 	return pgtable_free(table, index);
 }
 #endif
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index 32e58024e7cb..ce24d72ea679 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -617,7 +617,6 @@ void __init radix__early_init_mmu(void)
 	__pud_index_size = RADIX_PUD_INDEX_SIZE;
 	__pgd_index_size = RADIX_PGD_INDEX_SIZE;
 	__pud_cache_index = RADIX_PUD_INDEX_SIZE;
-	__pmd_cache_index = RADIX_PMD_INDEX_SIZE;
 	__pte_table_size = RADIX_PTE_TABLE_SIZE;
 	__pmd_table_size = RADIX_PMD_TABLE_SIZE;
 	__pud_table_size = RADIX_PUD_TABLE_SIZE;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index f0208f8f9dbb..53e9eeecd5d4 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -72,8 +72,6 @@ unsigned long __pud_index_size;
 EXPORT_SYMBOL(__pud_index_size);
 unsigned long __pgd_index_size;
 EXPORT_SYMBOL(__pgd_index_size);
-unsigned long __pmd_cache_index;
-EXPORT_SYMBOL(__pmd_cache_index);
 unsigned long __pud_cache_index;
 EXPORT_SYMBOL(__pud_cache_index);
 unsigned long __pte_table_size;
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread
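
With pmd_alloc_one()/pmd_free() now backed by fragments, every PMD table
lives inside an order-0 page whose struct page can carry a per-page spinlock,
which is what the generic split-lock helpers expect. Roughly, from
include/linux/mm.h of this era (a simplified excerpt for context, not part
of this series):

#if USE_SPLIT_PMD_PTLOCKS
static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
{
	/* The lock lives in the struct page backing the PMD table. */
	return ptlock_ptr(pmd_to_page(pmd));
}
#else
static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
{
	/* Fallback: one lock for the whole mm. */
	return &mm->page_table_lock;
}
#endif

static inline spinlock_t *pmd_lock(struct mm_struct *mm, pmd_t *pmd)
{
	spinlock_t *ptl = pmd_lockptr(mm, pmd);

	spin_lock(ptl);
	return ptl;
}

One consequence of the fragment scheme is that PMD tables cut from the same
page share a single split lock; that is still far finer-grained than one
mm->page_table_lock per address space.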

* [PATCH V1 11/11] powerpc/book3s64: Enable split pmd ptlock.
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (10 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 10/11] powerpc/mm: Use page fragments for allocation page table " Aneesh Kumar K.V
@ 2018-04-16 11:27 ` Aneesh Kumar K.V
  2018-04-17 22:43 ` [PATCH V1 00/11] powerpc/mm/book3s64: Support for " Balbir Singh
  12 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-16 11:27 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>

Testing with a threaded version of mmap_bench, which allocates 1G chunks from
a large number of threads, we find the following (a minimal stand-in for such
a benchmark is sketched after this message):

without patch

    32.72%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --32.68%--0
                          |
                          |--15.82%--pte_fragment_alloc
                          |          |
                          |           --15.79%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--14.95%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone
                          |

with patch

    12.89%  mmap_bench  [kernel.vmlinux]            [k] do_raw_spin_lock
            |
            ---do_raw_spin_lock
               |
                --12.83%--0
                          |
                          |--3.21%--pagevec_lru_move_fn
                          |          __lru_cache_add
                          |          |
                          |           --2.74%--do_huge_pmd_anonymous_page
                          |                     __handle_mm_fault
                          |                     handle_mm_fault
                          |                     __do_page_fault
                          |                     handle_page_fault
                          |                     test_mmap
                          |                     test_mmap
                          |                     start_thread
                          |                     __clone
                          |
                          |--3.11%--do_huge_pmd_anonymous_page
                          |          __handle_mm_fault
                          |          handle_mm_fault
                          |          __do_page_fault
                          |          handle_page_fault
                          |          test_mmap
                          |          test_mmap
                          |          start_thread
                          |          __clone

.....
                          |
                           --0.55%--pte_fragment_alloc
                                     |
                                      --0.55%--do_huge_pmd_anonymous_page
                                                __handle_mm_fault
                                                handle_mm_fault
                                                __do_page_fault
                                                handle_page_fault
                                                test_mmap
                                                test_mmap
                                                start_thread
                                                __clone

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/Kconfig.cputype | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 67d3125d0610..cc892dcfa114 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -292,6 +292,10 @@ config PPC_STD_MMU_32
 	def_bool y
 	depends on PPC_STD_MMU && PPC32
 
+config ARCH_ENABLE_SPLIT_PMD_PTLOCK
+	def_bool y
+	depends on PPC_BOOK3S_64
+
 config PPC_RADIX_MMU
 	bool "Radix MMU Support"
 	depends on PPC_BOOK3S_64
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread
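
The mmap_bench source is not part of this posting. A minimal threaded
stand-in consistent with the backtraces above (a test_mmap thread entry
reached via start_thread/__clone) could look like the sketch below; the
chunk count and default thread count are assumptions:

/* gcc -O2 -pthread toy_mmap_bench.c */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

#define CHUNK		(1UL << 30)	/* 1G per mmap, per the commit message */
#define NR_CHUNKS	8		/* assumed iterations per thread */

static void *test_mmap(void *arg)
{
	(void)arg;
	for (int i = 0; i < NR_CHUNKS; i++) {
		char *p = mmap(NULL, CHUNK, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED)
			return NULL;
		/* Touch one byte per 2M region to drive THP faults. */
		for (size_t off = 0; off < CHUNK; off += 2UL << 20)
			p[off] = 1;
		munmap(p, CHUNK);
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int nthreads = argc > 1 ? atoi(argv[1]) : 64;	/* assumed default */
	pthread_t *tids = calloc(nthreads, sizeof(*tids));

	for (int i = 0; i < nthreads; i++)
		pthread_create(&tids[i], NULL, test_mmap, NULL);
	for (int i = 0; i < nthreads; i++)
		pthread_join(tids[i], NULL);
	free(tids);
	return 0;
}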

* Re: [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled
  2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
@ 2018-04-17 12:48   ` Christophe LEROY
  2018-04-18  5:42   ` Michael Ellerman
  1 sibling, 0 replies; 19+ messages in thread
From: Christophe LEROY @ 2018-04-17 12:48 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev



On 16/04/2018 at 13:27, Aneesh Kumar K.V wrote:
> 8xx uses the slice code when hugetlbfs is enabled. We missed a header include
> on 8xx, which resulted in the build failure below.
> 
> config: mpc885_ads_defconfig + CONFIG_HUGETLBFS
> 
>     CC      arch/powerpc/mm/slice.o
> arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
> arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context' [-Werror=implicit-function-declaration]
> arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context' [-Werror=implicit-function-declaration]
> cc1: all warnings being treated as errors
> make[1]: *** [arch/powerpc/mm/slice.o] Error 1
> make: *** [arch/powerpc/mm] Error 2
> 
> On PPC64, mmu_context.h was included via linux/pkeys.h.
> 
> CC: Christophe LEROY <christophe.leroy@c-s.fr>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

Tested-by: Christophe Leroy <christophe.leroy@c-s.fr>

> ---
>   arch/powerpc/mm/slice.c | 1 +
>   1 file changed, 1 insertion(+)
> 
> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index 9cd87d11fe4e..205fe557ca10 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -35,6 +35,7 @@
>   #include <asm/mmu.h>
>   #include <asm/copro.h>
>   #include <asm/hugetlb.h>
> +#include <asm/mmu_context.h>
>   
>   static DEFINE_SPINLOCK(slice_convert_lock);
>   
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock
  2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
                   ` (11 preceding siblings ...)
  2018-04-16 11:27 ` [PATCH V1 11/11] powerpc/book3s64: Enable split pmd ptlock Aneesh Kumar K.V
@ 2018-04-17 22:43 ` Balbir Singh
  12 siblings, 0 replies; 19+ messages in thread
From: Balbir Singh @ 2018-04-17 22:43 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, paulus, mpe, linuxppc-dev

On Mon, 16 Apr 2018 16:57:12 +0530
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> wrote:

> This patch series add split pmd pagetable lock for book3s64. nohash64 also should
> be able to switch to this. I need to workout the code dependency. This series
> also migh have broken the build on platforms otherthan book3s64. I am sending this early
> to get feedback on whether we should continue with the approach.
> 
> We switch the pmd allocator to use something similar to what we already use for
> level 4 pagetable allocation. We get an order 0 page and divide that to fragments
> and hand over fragments when we get request for a pmd pagetable. The pmd lock is
> now stashed in the struct page backing the allocated page.

That's only for the THP case, right?

> 
> The series helps in reducing lock contention on mm->page_table_lock.
>

The numbers look good.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled
  2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
  2018-04-17 12:48   ` Christophe LEROY
@ 2018-04-18  5:42   ` Michael Ellerman
  2018-04-30  5:33     ` Aneesh Kumar K.V
  1 sibling, 1 reply; 19+ messages in thread
From: Michael Ellerman @ 2018-04-18  5:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus
  Cc: linuxppc-dev, Aneesh Kumar K.V, Christophe LEROY

"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:

> 8xx uses the slice code when hugetlbfs is enabled. We missed a header include
> on 8xx, which resulted in the build failure below.
>
> config: mpc885_ads_defconfig + CONFIG_HUGETLBFS
>
>    CC      arch/powerpc/mm/slice.o
> arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
> arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context' [-Werror=implicit-function-declaration]
> arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context' [-Werror=implicit-function-declaration]
> cc1: all warnings being treated as errors
> make[1]: *** [arch/powerpc/mm/slice.o] Error 1
> make: *** [arch/powerpc/mm] Error 2
>
> On PPC64, mmu_context.h was included via linux/pkeys.h.
>
> CC: Christophe LEROY <christophe.leroy@c-s.fr>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> ---
>  arch/powerpc/mm/slice.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index 9cd87d11fe4e..205fe557ca10 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -35,6 +35,7 @@
>  #include <asm/mmu.h>
>  #include <asm/copro.h>
>  #include <asm/hugetlb.h>
> +#include <asm/mmu_context.h>

I already merged this, didn't I?

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled
  2018-04-18  5:42   ` Michael Ellerman
@ 2018-04-30  5:33     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-30  5:33 UTC (permalink / raw)
  To: Michael Ellerman, Aneesh Kumar K.V, benh, paulus
  Cc: Aneesh Kumar K.V, linuxppc-dev

Michael Ellerman <mpe@ellerman.id.au> writes:

> "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com> writes:
>
>> 8xx uses the slice code when hugetlbfs is enabled. We missed a header include
>> on 8xx, which resulted in the build failure below.
>>
>> config: mpc885_ads_defconfig + CONFIG_HUGETLBFS
>>
>>    CC      arch/powerpc/mm/slice.o
>> arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
>> arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context' [-Werror=implicit-function-declaration]
>> arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context' [-Werror=implicit-function-declaration]
>> cc1: all warnings being treated as errors
>> make[1]: *** [arch/powerpc/mm/slice.o] Error 1
>> make: *** [arch/powerpc/mm] Error 2
>>
>> On PPC64, mmu_context.h was included via linux/pkeys.h.
>>
>> CC: Christophe LEROY <christophe.leroy@c-s.fr>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> ---
>>  arch/powerpc/mm/slice.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
>> index 9cd87d11fe4e..205fe557ca10 100644
>> --- a/arch/powerpc/mm/slice.c
>> +++ b/arch/powerpc/mm/slice.c
>> @@ -35,6 +35,7 @@
>>  #include <asm/mmu.h>
>>  #include <asm/copro.h>
>>  #include <asm/hugetlb.h>
>> +#include <asm/mmu_context.h>
>
> I already merged this, didn't I?
>

yes. This patch was wrongly included in this series.

-aneesh

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [V1, 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64
  2018-04-16 11:27 ` [PATCH V1 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64 Aneesh Kumar K.V
@ 2018-05-16 13:38   ` Michael Ellerman
  0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2018-05-16 13:38 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus; +Cc: linuxppc-dev, Aneesh Kumar K.V

On Mon, 2018-04-16 at 11:27:14 UTC, "Aneesh Kumar K.V" wrote:
> From: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> 
> Only code movement and avoid #ifdef.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/59879d542a8e880391863d82cddf38

cheers

^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled
@ 2018-04-10  8:51 Aneesh Kumar K.V
  0 siblings, 0 replies; 19+ messages in thread
From: Aneesh Kumar K.V @ 2018-04-10  8:51 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V, Christophe LEROY

8xx uses the slice code when hugetlbfs is enabled. We missed a header include
on 8xx, which resulted in the build failure below.

config: mpc885_ads_defconfig + CONFIG_HUGETLBFS

   CC      arch/powerpc/mm/slice.o
arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
arch/powerpc/mm/slice.c:655:2: error: implicit declaration of function 'need_extra_context' [-Werror=implicit-function-declaration]
arch/powerpc/mm/slice.c:656:3: error: implicit declaration of function 'alloc_extended_context' [-Werror=implicit-function-declaration]
cc1: all warnings being treated as errors
make[1]: *** [arch/powerpc/mm/slice.o] Error 1
make: *** [arch/powerpc/mm] Error 2

On PPC64, mmu_context.h was included via linux/pkeys.h.

CC: Christophe LEROY <christophe.leroy@c-s.fr>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
 arch/powerpc/mm/slice.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 9cd87d11fe4e..205fe557ca10 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -35,6 +35,7 @@
 #include <asm/mmu.h>
 #include <asm/copro.h>
 #include <asm/hugetlb.h>
+#include <asm/mmu_context.h>
 
 static DEFINE_SPINLOCK(slice_convert_lock);
 
-- 
2.14.3

^ permalink raw reply related	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2018-05-16 13:38 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-04-16 11:27 [PATCH V1 00/11] powerpc/mm/book3s64: Support for split pmd ptlock Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
2018-04-17 12:48   ` Christophe LEROY
2018-04-18  5:42   ` Michael Ellerman
2018-04-30  5:33     ` Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 01/11] powerpc/mm/book3s64: Move book3s64 code to pgtable-book3s64 Aneesh Kumar K.V
2018-05-16 13:38   ` [V1, " Michael Ellerman
2018-04-16 11:27 ` [PATCH V1 02/11] powerpc/kvm: Switch kvm pmd allocator to custom allocator Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 03/11] powerpc/mm: Use pmd_lockptr instead of opencoding it Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 04/11] powerpc/mm: Rename pte fragment functions Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 05/11] powerpc/mm/book3e/64: Remove unsupported 64Kpage size from 64bit booke Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 06/11] powerpc/mm/nohash: Remove pte fragment dependency from nohash Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 07/11] powerpc/mm/book3s64/4k: Switch 4k pagesize config to use pagetable fragment Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 08/11] powerpc/book3s64/mm: Simplify the rcu callback for page table free Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 09/11] powerpc/mm: Implement helpers for pagetable fragment support at PMD level Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 10/11] powerpc/mm: Use page fragments for allocation page table " Aneesh Kumar K.V
2018-04-16 11:27 ` [PATCH V1 11/11] powerpc/book3s64: Enable split pmd ptlock Aneesh Kumar K.V
2018-04-17 22:43 ` [PATCH V1 00/11] powerpc/mm/book3s64: Support for " Balbir Singh
  -- strict thread matches above, loose matches on Subject: below --
2018-04-10  8:51 [PATCH] powerpc/8xx: Build fix with Hugetlbfs enabled Aneesh Kumar K.V
