* [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes
@ 2017-03-16 10:31 Aneesh Kumar K.V
  2017-03-16 10:31 ` [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64 Aneesh Kumar K.V
                   ` (10 more replies)
  0 siblings, 11 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:31 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Hi,

This series collects the patches sent earlier into a single patch series.
It is rebased on top of the latest upstream (Linus' tree) and also contains
fixes to the patches posted earlier.

Aneesh Kumar K.V (11):
  powerpc/mm/nohash: MM_SLICE is only used by book3s 64
  powerpc/mm/slice: when computing slice mask limit lowe slice max addr
    correctly
  powerpc/mm: Cleanup bits definition between hash and radix.
  powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE
  powerpc/mm: Add translation mode information in /proc/cpuinfo
  powerpc/mm/hugetlb: Filter out hugepage size not supported by page
    table layout
  powerpc/mm: Conditional defines of pte bits are messy
  powerpc/mm: Express everything based on Radix page table defines
  powerpc/mm: Lower the max real address to 51 bits
  powerpc/mm/radix: Make max pfn bits a variable
  powerpc/mm: Move hash specific pte bits to be top bits of RPN

 arch/powerpc/include/asm/book3s/64/hash-64k.h |  8 ++++--
 arch/powerpc/include/asm/book3s/64/hash.h     | 35 ++++++++++++++++--------
 arch/powerpc/include/asm/book3s/64/hugetlb.h  |  2 +-
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 39 ++++++++++++++++-----------
 arch/powerpc/include/asm/book3s/64/radix.h    | 12 ++++++++-
 arch/powerpc/include/asm/mmu-book3e.h         |  5 ----
 arch/powerpc/include/asm/nohash/64/pgtable.h  |  5 ----
 arch/powerpc/mm/hash_native_64.c              |  1 +
 arch/powerpc/mm/hash_utils_64.c               |  1 +
 arch/powerpc/mm/hugetlbpage-book3e.c          |  7 -----
 arch/powerpc/mm/hugetlbpage.c                 | 20 ++++++++++++++
 arch/powerpc/mm/mmu_context_nohash.c          |  5 ----
 arch/powerpc/mm/pgtable-radix.c               |  1 +
 arch/powerpc/mm/pgtable_64.c                  |  3 +++
 arch/powerpc/mm/slice.c                       |  5 ++--
 arch/powerpc/mm/tlb-radix.c                   |  2 +-
 arch/powerpc/platforms/Kconfig.cputype        |  2 +-
 arch/powerpc/platforms/powernv/setup.c        |  4 +++
 arch/powerpc/platforms/pseries/setup.c        |  4 +++
 19 files changed, 103 insertions(+), 58 deletions(-)

-- 
2.7.4

* [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
@ 2017-03-16 10:31 ` Aneesh Kumar K.V
  2017-03-16 22:00   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly Aneesh Kumar K.V
                   ` (9 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:31 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

The BookE code is dead code as per the Kconfig details, so simplify things
by enabling MM_SLICE only for book3s_64. The nohash changes just remove
dead code. On ppc64, a 4K configuration without hugetlb will now enable
MM_SLICE. That is fine, because it removes one extra variant that is
probably not getting much testing.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/mmu-book3e.h        | 5 -----
 arch/powerpc/include/asm/nohash/64/pgtable.h | 5 -----
 arch/powerpc/mm/hugetlbpage-book3e.c         | 7 -------
 arch/powerpc/mm/mmu_context_nohash.c         | 5 -----
 arch/powerpc/platforms/Kconfig.cputype       | 2 +-
 5 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/mmu-book3e.h b/arch/powerpc/include/asm/mmu-book3e.h
index b62a8d43a06c..7ca8d8e80ffa 100644
--- a/arch/powerpc/include/asm/mmu-book3e.h
+++ b/arch/powerpc/include/asm/mmu-book3e.h
@@ -229,11 +229,6 @@ typedef struct {
 	unsigned int	id;
 	unsigned int	active;
 	unsigned long	vdso_base;
-#ifdef CONFIG_PPC_MM_SLICES
-	u64 low_slices_psize;   /* SLB page size encodings */
-	u64 high_slices_psize;  /* 4 bits per slice for now */
-	u16 user_psize;         /* page size index */
-#endif
 #ifdef CONFIG_PPC_64K_PAGES
 	/* for 4K PTE fragment support */
 	void *pte_frag;
diff --git a/arch/powerpc/include/asm/nohash/64/pgtable.h b/arch/powerpc/include/asm/nohash/64/pgtable.h
index c7f927e67d14..f0ff384d4ca5 100644
--- a/arch/powerpc/include/asm/nohash/64/pgtable.h
+++ b/arch/powerpc/include/asm/nohash/64/pgtable.h
@@ -88,11 +88,6 @@
 #include <asm/nohash/pte-book3e.h>
 #include <asm/pte-common.h>
 
-#ifdef CONFIG_PPC_MM_SLICES
-#define HAVE_ARCH_UNMAPPED_AREA
-#define HAVE_ARCH_UNMAPPED_AREA_TOPDOWN
-#endif /* CONFIG_PPC_MM_SLICES */
-
 #ifndef __ASSEMBLY__
 /* pte_clear moved to later in this file */
 
diff --git a/arch/powerpc/mm/hugetlbpage-book3e.c b/arch/powerpc/mm/hugetlbpage-book3e.c
index 83a8be791e06..bfe4e8526b2d 100644
--- a/arch/powerpc/mm/hugetlbpage-book3e.c
+++ b/arch/powerpc/mm/hugetlbpage-book3e.c
@@ -148,16 +148,9 @@ void book3e_hugetlb_preload(struct vm_area_struct *vma, unsigned long ea,
 
 	mm = vma->vm_mm;
 
-#ifdef CONFIG_PPC_MM_SLICES
-	psize = get_slice_psize(mm, ea);
-	tsize = mmu_get_tsize(psize);
-	shift = mmu_psize_defs[psize].shift;
-#else
 	psize = vma_mmu_pagesize(vma);
 	shift = __ilog2(psize);
 	tsize = shift - 10;
-#endif
-
 	/*
 	 * We can't be interrupted while we're setting up the MAS
 	 * regusters or after we've confirmed that no tlb exists.
diff --git a/arch/powerpc/mm/mmu_context_nohash.c b/arch/powerpc/mm/mmu_context_nohash.c
index c491f2c8f2b9..4554d6527682 100644
--- a/arch/powerpc/mm/mmu_context_nohash.c
+++ b/arch/powerpc/mm/mmu_context_nohash.c
@@ -333,11 +333,6 @@ int init_new_context(struct task_struct *t, struct mm_struct *mm)
 
 	mm->context.id = MMU_NO_CONTEXT;
 	mm->context.active = 0;
-
-#ifdef CONFIG_PPC_MM_SLICES
-	slice_set_user_psize(mm, mmu_virtual_psize);
-#endif
-
 	return 0;
 }
 
diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype
index 99b0ae8acb78..a7c0c1fafe68 100644
--- a/arch/powerpc/platforms/Kconfig.cputype
+++ b/arch/powerpc/platforms/Kconfig.cputype
@@ -359,7 +359,7 @@ config PPC_BOOK3E_MMU
 
 config PPC_MM_SLICES
 	bool
-	default y if (!PPC_FSL_BOOK3E && PPC64 && HUGETLB_PAGE) || (PPC_STD_MMU_64 && PPC_64K_PAGES)
+	default y if PPC_STD_MMU_64
 	default n
 
 config PPC_HAVE_PMU_SUPPORT
-- 
2.7.4

* [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
  2017-03-16 10:31 ` [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64 Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:03   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix Aneesh Kumar K.V
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

For the low slices, the max address used to compute the mask should be less
than 4G.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/slice.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 2b27458902ee..bf150557dba8 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -83,11 +83,10 @@ static struct slice_mask slice_range_to_mask(unsigned long start,
 	struct slice_mask ret = { 0, 0 };
 
 	if (start < SLICE_LOW_TOP) {
-		unsigned long mend = min(end, SLICE_LOW_TOP);
-		unsigned long mstart = min(start, SLICE_LOW_TOP);
+		unsigned long mend = min(end, (SLICE_LOW_TOP - 1));
 
 		ret.low_slices = (1u << (GET_LOW_SLICE_INDEX(mend) + 1))
-			- (1u << GET_LOW_SLICE_INDEX(mstart));
+			- (1u << GET_LOW_SLICE_INDEX(start));
 	}
 
 	if ((start + len) > SLICE_LOW_TOP)
-- 
2.7.4
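
A minimal standalone sketch of the low-slice part of this computation,
contrasting the old and new clamping. The constants (a low-slice shift of 28,
i.e. 256MB slices, and a 4G low top) and the helper names low_mask_old() and
low_mask_new() are assumptions made for illustration; they are not the
kernel's definitions.

```c
#include <stdint.h>

/* Assumed values for illustration: 256MB low slices below 4G. */
#define SLICE_LOW_SHIFT 28
#define SLICE_LOW_TOP (1ULL << 32)
#define GET_LOW_SLICE_INDEX(addr) ((addr) >> SLICE_LOW_SHIFT)

static uint64_t min_u64(uint64_t a, uint64_t b)
{
	return a < b ? a : b;
}

/* Old behaviour: the end is clamped to SLICE_LOW_TOP itself (index 16). */
static uint32_t low_mask_old(uint64_t start, uint64_t len)
{
	uint64_t end = start + len - 1;
	uint32_t mask = 0;

	if (start < SLICE_LOW_TOP) {
		uint64_t mend = min_u64(end, SLICE_LOW_TOP);
		uint64_t mstart = min_u64(start, SLICE_LOW_TOP);

		mask = (1u << (GET_LOW_SLICE_INDEX(mend) + 1))
			- (1u << GET_LOW_SLICE_INDEX(mstart));
	}
	return mask;
}

/* New behaviour: the end is clamped to the last byte below SLICE_LOW_TOP. */
static uint32_t low_mask_new(uint64_t start, uint64_t len)
{
	uint64_t end = start + len - 1;
	uint32_t mask = 0;

	if (start < SLICE_LOW_TOP) {
		uint64_t mend = min_u64(end, SLICE_LOW_TOP - 1);

		mask = (1u << (GET_LOW_SLICE_INDEX(mend) + 1))
			- (1u << GET_LOW_SLICE_INDEX(start));
	}
	return mask;
}
```

Under these assumptions, a range starting at 0 and covering 8G gives 0x1ffff
with the old clamp (bit 16 names a low slice that does not exist) and 0xffff
with the new one. A range that lies entirely below 4G gives the same mask in
both versions.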

* [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix.
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
  2017-03-16 10:31 ` [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64 Aneesh Kumar K.V
  2017-03-16 10:32 ` [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:16   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE Aneesh Kumar K.V
                   ` (7 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Define everything based on bits present in pgtable.h. This will help in easily
identifying overlapping bits between hash/radix.

No functional change with this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h |  4 ++++
 arch/powerpc/include/asm/book3s/64/hash.h     |  9 +++++----
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 10 ++--------
 arch/powerpc/include/asm/book3s/64/radix.h    |  6 ++++++
 4 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index f3dd21efa2ea..b39f0b86405e 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -6,6 +6,10 @@
 #define H_PUD_INDEX_SIZE  5
 #define H_PGD_INDEX_SIZE  12
 
+/*
+ * 64k aligned address free up few of the lower bits of RPN for us
+ * We steal that here. For more deatils look at pte_pfn/pfn_pte()
+ */
 #define H_PAGE_COMBO	0x00001000 /* this is a combo 4k page */
 #define H_PAGE_4K_PFN	0x00002000 /* PFN is for a single 4k page */
 /*
diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index f7b721bbf918..ec2828b1db07 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -13,12 +13,13 @@
  * We could create separate kernel read-only if we used the 3 PP bits
  * combinations that newer processors provide but we currently don't.
  */
-#define H_PAGE_BUSY		0x00800 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY		_RPAGE_SW1 /* software: PTE & hash are busy */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
 #define H_PAGE_F_GIX_SHIFT	57
-#define H_PAGE_F_GIX		(7ul << 57)	/* HPTE index within HPTEG */
-#define H_PAGE_F_SECOND		(1ul << 60)	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_HASHPTE		(1ul << 61)	/* PTE has associated HPTE */
+/* (7ul << 57) HPTE index within HPTEG */
+#define H_PAGE_F_GIX		(_RPAGE_RSV2 | _RPAGE_RSV3 | _RPAGE_RSV4)
+#define H_PAGE_F_SECOND		_RPAGE_RSV1	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_HASHPTE		_RPAGE_SW0	/* PTE has associated HPTE */
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8f4d41936e5a..c39bc4cb9247 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -44,14 +44,8 @@
 #endif
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
 
-/*
- * For P9 DD1 only, we need to track whether the pte's huge.
- */
-#define _PAGE_LARGE	_RPAGE_RSV1
-
-
-#define _PAGE_PTE		(1ul << 62)	/* distinguishes PTEs from pointers */
-#define _PAGE_PRESENT		(1ul << 63)	/* pte contains a translation */
+#define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
+#define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
 /*
  * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
  * Instead of fixing all of them, add an alternate define which
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 9e0bb7cd6e22..2a2ea47a9bd2 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -11,6 +11,12 @@
 #include <asm/book3s/64/radix-4k.h>
 #endif
 
+/*
+ * For P9 DD1 only, we need to track whether the pte's huge.
+ */
+#define _PAGE_LARGE	_RPAGE_RSV1
+
+
 #ifndef __ASSEMBLY__
 #include <asm/book3s/64/tlbflush-radix.h>
 #include <asm/cpu_has_feature.h>
-- 
2.7.4
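
The "no functional change" claim for H_PAGE_F_GIX can be checked at compile
time. The sketch below copies the _RPAGE_RSV2.._RPAGE_RSV4 values that appear
as context elsewhere in this series, but drops the leading underscore and uses
ULL so that it builds as a standalone C11 file; gix_index() is a hypothetical
helper added only for illustration.

```c
#include <stdint.h>

/* Values as they appear in pgtable.h context lines in this series. */
#define RPAGE_RSV2 0x0800000000000000ULL
#define RPAGE_RSV3 0x0400000000000000ULL
#define RPAGE_RSV4 0x0200000000000000ULL

#define H_PAGE_F_GIX_SHIFT 57
#define H_PAGE_F_GIX_OLD (7ULL << 57)
#define H_PAGE_F_GIX_NEW (RPAGE_RSV2 | RPAGE_RSV3 | RPAGE_RSV4)

/* Fails the build if the symbolic definition changes the bit pattern. */
_Static_assert(H_PAGE_F_GIX_OLD == H_PAGE_F_GIX_NEW,
	       "H_PAGE_F_GIX must keep its value");

/* Hypothetical helper: extract the HPTE group index from a PTE value. */
static inline unsigned int gix_index(uint64_t pte)
{
	return (unsigned int)((pte & H_PAGE_F_GIX_NEW) >> H_PAGE_F_GIX_SHIFT);
}
```

The same pattern could be applied to the other renamed bits, given the values
of the _RPAGE_* constants they now refer to.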

* [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:16   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo Aneesh Kumar K.V
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This bit is only used by radix, and it is nice to follow the naming style in
which bit names start with H_ or R_ depending on the translation mode that
uses them.

No functional change in this patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hugetlb.h | 2 +-
 arch/powerpc/include/asm/book3s/64/radix.h   | 4 ++--
 arch/powerpc/mm/tlb-radix.c                  | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index c62f14d0bec1..6666cd366596 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -46,7 +46,7 @@ static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
 	 */
 	VM_WARN_ON(page_shift == mmu_psize_defs[MMU_PAGE_1G].shift);
 	if (page_shift == mmu_psize_defs[MMU_PAGE_2M].shift)
-		return __pte(pte_val(entry) | _PAGE_LARGE);
+		return __pte(pte_val(entry) | R_PAGE_LARGE);
 	else
 		return entry;
 }
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index 2a2ea47a9bd2..ac16d1943022 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -14,7 +14,7 @@
 /*
  * For P9 DD1 only, we need to track whether the pte's huge.
  */
-#define _PAGE_LARGE	_RPAGE_RSV1
+#define R_PAGE_LARGE	_RPAGE_RSV1
 
 
 #ifndef __ASSEMBLY__
@@ -258,7 +258,7 @@ static inline int radix__pmd_trans_huge(pmd_t pmd)
 static inline pmd_t radix__pmd_mkhuge(pmd_t pmd)
 {
 	if (cpu_has_feature(CPU_FTR_POWER9_DD1))
-		return __pmd(pmd_val(pmd) | _PAGE_PTE | _PAGE_LARGE);
+		return __pmd(pmd_val(pmd) | _PAGE_PTE | R_PAGE_LARGE);
 	return __pmd(pmd_val(pmd) | _PAGE_PTE);
 }
 static inline void radix__pmdp_huge_split_prepare(struct vm_area_struct *vma,
diff --git a/arch/powerpc/mm/tlb-radix.c b/arch/powerpc/mm/tlb-radix.c
index 952713d6cf04..83dc1ccc2fa1 100644
--- a/arch/powerpc/mm/tlb-radix.c
+++ b/arch/powerpc/mm/tlb-radix.c
@@ -437,7 +437,7 @@ void radix__flush_tlb_pte_p9_dd1(unsigned long old_pte, struct mm_struct *mm,
 		return;
 	}
 
-	if (old_pte & _PAGE_LARGE)
+	if (old_pte & R_PAGE_LARGE)
 		radix__flush_tlb_page_psize(mm, address, MMU_PAGE_2M);
 	else
 		radix__flush_tlb_page_psize(mm, address, mmu_virtual_psize);
-- 
2.7.4

* [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:17   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout Aneesh Kumar K.V
                   ` (5 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

With this, /proc/cpuinfo on powernv and pseries reports the MMU mode:

timebase        : 512000000
platform        : PowerNV
model           : 8247-22L
machine         : PowerNV 8247-22L
firmware        : OPAL
MMU		: Hash

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/setup.c | 4 ++++
 arch/powerpc/platforms/pseries/setup.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index d50c7d99baaf..2d937f6d9260 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -95,6 +95,10 @@ static void pnv_show_cpuinfo(struct seq_file *m)
 	else
 		seq_printf(m, "firmware\t: BML\n");
 	of_node_put(root);
+	if (radix_enabled())
+		seq_printf(m, "MMU\t\t: Radix\n");
+	else
+		seq_printf(m, "MMU\t\t: Hash\n");
 }
 
 static void pnv_prepare_going_down(void)
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index b4d362ed03a1..b5d86426e97b 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -87,6 +87,10 @@ static void pSeries_show_cpuinfo(struct seq_file *m)
 		model = of_get_property(root, "model", NULL);
 	seq_printf(m, "machine\t\t: CHRP %s\n", model);
 	of_node_put(root);
+	if (radix_enabled())
+		seq_printf(m, "MMU\t\t: Radix\n");
+	else
+		seq_printf(m, "MMU\t\t: Hash\n");
 }
 
 /* Initialize firmware assisted non-maskable interrupts if
-- 
2.7.4
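
For userspace that wants to consume the new field, a small parser over the
cpuinfo text might look like the sketch below. The function name
cpuinfo_mmu_is_radix() and its return convention are hypothetical; only the
"MMU" key and the "Hash"/"Radix" values come from this patch.

```c
#include <string.h>

/*
 * Scan cpuinfo-style text line by line for the "MMU" key.
 * Returns 1 for Radix, 0 for Hash, -1 if no usable MMU line is found.
 */
static int cpuinfo_mmu_is_radix(const char *text)
{
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, "MMU", 3) == 0) {
			const char *colon = strchr(line, ':');
			const char *eol = strchr(line, '\n');

			/* Only accept a colon that belongs to this line. */
			if (colon && (!eol || colon < eol)) {
				colon++;
				while (*colon == ' ' || *colon == '\t')
					colon++;
				if (strncmp(colon, "Radix", 5) == 0)
					return 1;
				if (strncmp(colon, "Hash", 4) == 0)
					return 0;
			}
		}
		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}
```

A caller would read /proc/cpuinfo into a NUL-terminated buffer and pass it in;
kernels without this patch simply yield -1.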

* [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (4 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:19   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy Aneesh Kumar K.V
                   ` (4 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Without this, if firmware reports 1MB page size support, we will crash when
trying to use 1MB as a hugetlb page size.

echo 300 > /sys/kernel/mm/hugepages/hugepages-1024kB/nr_hugepages

kernel BUG at ./arch/powerpc/include/asm/hugetlb.h:19!
.....
....
[c0000000e2c27b30] c00000000029dae8 .hugetlb_fault+0x638/0xda0
[c0000000e2c27c30] c00000000026fb64 .handle_mm_fault+0x844/0x1d70
[c0000000e2c27d70] c00000000004805c .do_page_fault+0x3dc/0x7c0
[c0000000e2c27e30] c00000000000ac98 handle_page_fault+0x10/0x30

With the fix, we don't enable 1MB as a hugepage size.

bash-4.2# cd /sys/kernel/mm/hugepages/
bash-4.2# ls
hugepages-16384kB  hugepages-16777216kB

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hugetlbpage.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 8c3389cbcd12..eb8d42bac00b 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -738,6 +738,7 @@ static int __init add_huge_page_size(unsigned long long size)
 	int shift = __ffs(size);
 	int mmu_psize;
 
+#ifndef CONFIG_PPC_BOOK3S_64
 	/* Check that it is a page size supported by the hardware and
 	 * that it fits within pagetable and slice limits. */
 	if (size <= PAGE_SIZE)
@@ -749,10 +750,29 @@ static int __init add_huge_page_size(unsigned long long size)
 	if (!is_power_of_2(size) || (shift > SLICE_HIGH_SHIFT))
 		return -EINVAL;
 #endif
+#endif /* CONFIG_PPC_BOOK3S_64 */
 
 	if ((mmu_psize = shift_to_mmu_psize(shift)) < 0)
 		return -EINVAL;
 
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * We need to make sure that for different page sizes reported by
+	 * firmware we only add hugetlb support for page sizes that can be
+	 * supported by linux page table layout.
+	 * For now we have
+	 * Radix: 2M
+	 * Hash: 16M and 16G
+	 */
+	if (radix_enabled()) {
+		if (mmu_psize != MMU_PAGE_2M)
+			return -EINVAL;
+	} else {
+		if (mmu_psize != MMU_PAGE_16M && mmu_psize != MMU_PAGE_16G)
+			return -EINVAL;
+	}
+#endif
+
 	BUG_ON(mmu_psize_defs[mmu_psize].shift != shift);
 
 	/* Return if huge page size has already been setup */
-- 
2.7.4
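
The filtering rule added above can be restated as a small predicate on the
page shift. This is a sketch only: hugepage_shift_supported() is a
hypothetical name, and the shifts (21 for 2M, 24 for 16M, 34 for 16G) stand in
for the MMU_PAGE_* indices the kernel code compares against.

```c
#include <stdbool.h>

/* Page shifts corresponding to the sizes listed in the patch comment. */
#define SHIFT_2M 21
#define SHIFT_16M 24
#define SHIFT_16G 34

/*
 * Hypothetical restatement of the book3s 64 filter:
 *   Radix: only 2M
 *   Hash:  only 16M and 16G
 */
static bool hugepage_shift_supported(bool radix, unsigned int shift)
{
	if (radix)
		return shift == SHIFT_2M;
	return shift == SHIFT_16M || shift == SHIFT_16G;
}
```

With this rule, a firmware-reported 1MB size (shift 20) is rejected in both
translation modes, which matches the directory listing shown in the commit
message for hash.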

* [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (5 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:21   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 4 ----
 1 file changed, 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c39bc4cb9247..4d4ff9a324f0 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -37,11 +37,7 @@
 #define _RPAGE_RSV3		0x0400000000000000UL
 #define _RPAGE_RSV4		0x0200000000000000UL
 
-#ifdef CONFIG_MEM_SOFT_DIRTY
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
-#else
-#define _PAGE_SOFT_DIRTY	0x00000
-#endif
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
 
 #define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
-- 
2.7.4

* [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (6 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:24   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 4 ++--
 arch/powerpc/include/asm/book3s/64/pgtable.h  | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index b39f0b86405e..7be54f9590a3 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -10,8 +10,8 @@
  * 64k aligned address free up few of the lower bits of RPN for us
  * We steal that here. For more deatils look at pte_pfn/pfn_pte()
  */
-#define H_PAGE_COMBO	0x00001000 /* this is a combo 4k page */
-#define H_PAGE_4K_PFN	0x00002000 /* PFN is for a single 4k page */
+#define H_PAGE_COMBO	_RPAGE_RPN0 /* this is a combo 4k page */
+#define H_PAGE_4K_PFN	_RPAGE_RPN1 /* PFN is for a single 4k page */
 /*
  * We need to differentiate between explicit huge page and THP huge
  * page, since THP huge page also need to track real subpage details
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 4d4ff9a324f0..96566df547a8 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -36,6 +36,8 @@
 #define _RPAGE_RSV2		0x0800000000000000UL
 #define _RPAGE_RSV3		0x0400000000000000UL
 #define _RPAGE_RSV4		0x0200000000000000UL
+#define _RPAGE_RPN0		0x01000
+#define _RPAGE_RPN1		0x02000
 
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
-- 
2.7.4

* [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (7 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 21:26   ` Benjamin Herrenschmidt
  2017-03-16 22:27   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable Aneesh Kumar K.V
  2017-03-16 10:32 ` [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN Aneesh Kumar K.V
  10 siblings, 2 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

The maximum real address supported by hardware is 51 bits. The radix page
table defines a slot of 57 bits for future expansion. We restrict the value
supported in the Linux kernel to 51 bits, so that the bits between 51 and 57
can be used for storing hash Linux page table bits. That move is done later
in this series.

This will free up the software page table bits to be used for features
that are needed for both hash and radix. The current hash Linux page table
format doesn't have any free software bits. Moving the hash-specific Linux
page table bits to the top of the RPN field frees up the software bits for
other purposes.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 96566df547a8..c470dcc815d5 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -38,6 +38,25 @@
 #define _RPAGE_RSV4		0x0200000000000000UL
 #define _RPAGE_RPN0		0x01000
 #define _RPAGE_RPN1		0x02000
+/* Max physicall address bit as per radix table */
+#define _RPAGE_PA_MAX		57
+/*
+ * Max physical address bit we will use for now.
+ *
+ * This is mostly a hardware limitation and for now Power9 has
+ * a 51 bit limit.
+ *
+ * This is different from the number of physical bit required to address
+ * the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
+ * MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
+ * number of sections we can support (SECTIONS_SHIFT).
+ *
+ * This is different from Radix page table limitation above and
+ * should always be less than that. The limit is done such that
+ * we can overload the bits between _RPAGE_PA_MAX and _PAGE_PA_MAX
+ * for hash linux page table specific bits.
+ */
+#define _PAGE_PA_MAX		51
 
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
@@ -51,10 +70,11 @@
  */
 #define _PAGE_NO_CACHE		_PAGE_TOLERANT
 /*
- * We support 57 bit real address in pte. Clear everything above 57, and
+ * We support _RPAGE_PA_MAX bit real address in pte. On the linux side
+ * we are limited by _PAGE_PA_MAX. Clear everything above _PAGE_PA_MAX
  * every thing below PAGE_SHIFT;
  */
-#define PTE_RPN_MASK	(((1UL << 57) - 1) & (PAGE_MASK))
+#define PTE_RPN_MASK	(((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
 /*
  * set of bits not changed in pmd_modify. Even though we have hash specific bits
  * in here, on radix we expect them to be zero.
-- 
2.7.4
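
To make the effect of lowering the limit concrete, the sketch below computes
the RPN mask for both limits and the bits that become available in between.
It assumes a 64K page size (PAGE_SHIFT of 16); rpn_mask() is a hypothetical
helper that mirrors the PTE_RPN_MASK expression in the diff.

```c
#include <stdint.h>

/* Assumption for illustration: 64K pages. */
#define PAGE_SHIFT 16
#define PAGE_MASK (~((1ULL << PAGE_SHIFT) - 1))

/* Same shape as PTE_RPN_MASK: keep bits [PAGE_SHIFT, pa_max_bits). */
static uint64_t rpn_mask(unsigned int pa_max_bits)
{
	return ((1ULL << pa_max_bits) - 1) & PAGE_MASK;
}

/* Bits covered by the radix slot (57) but not by the Linux limit (51). */
static uint64_t freed_rpn_bits(void)
{
	return rpn_mask(57) & ~rpn_mask(51);
}
```

Under these assumptions the mask shrinks from 0x01ffffffffff0000 to
0x0007ffffffff0000, and the difference, 0x01f8000000000000, is the six-bit
range (bits 51 to 56) that the commit message proposes to reuse for
hash-specific bits.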

* [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (8 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:29   ` Paul Mackerras
  2017-03-16 10:32 ` [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN Aneesh Kumar K.V
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This makes the max physical address bits a variable so that the hash and
radix translation modes can each choose what value to use. In this patch we
also switch the radix translation mode to use 57 bits. This makes it resilient
to future changes to the max pfn supported by platforms.

This patch is split from the previous one to make the review easier.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h    | 18 ++++++++++++++++++
 arch/powerpc/include/asm/book3s/64/pgtable.h | 28 +++++-----------------------
 arch/powerpc/include/asm/book3s/64/radix.h   |  4 ++++
 arch/powerpc/mm/hash_utils_64.c              |  1 +
 arch/powerpc/mm/pgtable-radix.c              |  1 +
 arch/powerpc/mm/pgtable_64.c                 |  3 +++
 6 files changed, 32 insertions(+), 23 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index ec2828b1db07..af3c88624d3a 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -20,6 +20,24 @@
 #define H_PAGE_F_GIX		(_RPAGE_RSV2 | _RPAGE_RSV3 | _RPAGE_RSV4)
 #define H_PAGE_F_SECOND		_RPAGE_RSV1	/* HPTE is in 2ndary HPTEG */
 #define H_PAGE_HASHPTE		_RPAGE_SW0	/* PTE has associated HPTE */
+/*
+ * Max physical address bit we will use for now.
+ *
+ * This is mostly a hardware limitation and for now Power9 has
+ * a 51 bit limit.
+ *
+ * This is different from the number of physical bit required to address
+ * the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
+ * MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
+ * number of sections we can support (SECTIONS_SHIFT).
+ *
+ * This is different from Radix page table limitation and
+ * should always be less than that. The limit is done such that
+ * we can overload the bits between _RPAGE_PA_MAX and H_PAGE_PA_MAX
+ * for hash linux page table specific bits.
+ */
+#define H_PAGE_PA_MAX		51
+#define H_PTE_RPN_MASK	(((1UL << H_PAGE_PA_MAX) - 1) & (PAGE_MASK))
 
 #ifdef CONFIG_PPC_64K_PAGES
 #include <asm/book3s/64/hash-64k.h>
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index c470dcc815d5..eb82b60b5c89 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -40,23 +40,6 @@
 #define _RPAGE_RPN1		0x02000
 /* Max physicall address bit as per radix table */
 #define _RPAGE_PA_MAX		57
-/*
- * Max physical address bit we will use for now.
- *
- * This is mostly a hardware limitation and for now Power9 has
- * a 51 bit limit.
- *
- * This is different from the number of physical bit required to address
- * the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
- * MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
- * number of sections we can support (SECTIONS_SHIFT).
- *
- * This is different from Radix page table limitation above and
- * should always be less than that. The limit is done such that
- * we can overload the bits between _RPAGE_PA_MAX and _PAGE_PA_MAX
- * for hash linux page table specific bits.
- */
-#define _PAGE_PA_MAX		51
 
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
@@ -70,12 +53,6 @@
  */
 #define _PAGE_NO_CACHE		_PAGE_TOLERANT
 /*
- * We support _RPAGE_PA_MAX bit real address in pte. On the linux side
- * we are limited by _PAGE_PA_MAX. Clear everything above _PAGE_PA_MAX
- * every thing below PAGE_SHIFT;
- */
-#define PTE_RPN_MASK	(((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
-/*
  * set of bits not changed in pmd_modify. Even though we have hash specific bits
  * in here, on radix we expect them to be zero.
  */
@@ -180,6 +157,11 @@
 
 #ifndef __ASSEMBLY__
 /*
+ * based on max physical address bit that we want to encode in page table
+ */
+extern unsigned long __pte_rpn_mask;
+#define PTE_RPN_MASK __pte_rpn_mask
+/*
  * page table defines
  */
 extern unsigned long __pte_index_size;
diff --git a/arch/powerpc/include/asm/book3s/64/radix.h b/arch/powerpc/include/asm/book3s/64/radix.h
index ac16d1943022..142739b31174 100644
--- a/arch/powerpc/include/asm/book3s/64/radix.h
+++ b/arch/powerpc/include/asm/book3s/64/radix.h
@@ -24,6 +24,10 @@
 
 /* An empty PTE can still have a R or C writeback */
 #define RADIX_PTE_NONE_MASK		(_PAGE_DIRTY | _PAGE_ACCESSED)
+/*
+ * Clear everything above _RPAGE_PA_MAX every thing below PAGE_SHIFT
+ */
+#define RADIX_PTE_RPN_MASK		(((1UL << _RPAGE_PA_MAX) - 1) & (PAGE_MASK))
 
 /* Bits to set in a RPMD/RPUD/RPGD */
 #define RADIX_PMD_VAL_BITS		(0x8000000000000000UL | RADIX_PTE_INDEX_SIZE)
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index c554768b1fa2..d990c3332057 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -986,6 +986,7 @@ void __init hash__early_init_mmu(void)
 	/*
 	 * initialize page table size
 	 */
+	__pte_rpn_mask = H_PTE_RPN_MASK;
 	__pte_frag_nr = H_PTE_FRAG_NR;
 	__pte_frag_size_shift = H_PTE_FRAG_SIZE_SHIFT;
 
diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
index c28165d8970b..6eecbbc7c8af 100644
--- a/arch/powerpc/mm/pgtable-radix.c
+++ b/arch/powerpc/mm/pgtable-radix.c
@@ -379,6 +379,7 @@ void __init radix__early_init_mmu(void)
 	/*
 	 * initialize page table size
 	 */
+	__pte_rpn_mask  = RADIX_PTE_RPN_MASK;
 	__pte_index_size = RADIX_PTE_INDEX_SIZE;
 	__pmd_index_size = RADIX_PMD_INDEX_SIZE;
 	__pud_index_size = RADIX_PUD_INDEX_SIZE;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index db93cf747a03..ac0c7ee60de0 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -68,6 +68,9 @@
  */
 struct prtb_entry *process_tb;
 struct patb_entry *partition_tb;
+
+unsigned long __pte_rpn_mask;
+EXPORT_SYMBOL(__pte_rpn_mask);
 /*
  * page table size
  */
-- 
2.7.4

* [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN
  2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
                   ` (9 preceding siblings ...)
  2017-03-16 10:32 ` [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable Aneesh Kumar K.V
@ 2017-03-16 10:32 ` Aneesh Kumar K.V
  2017-03-16 22:34   ` Paul Mackerras
  10 siblings, 1 reply; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-16 10:32 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

We don't support the full 57 bits of physical address and hence can overload
the top bits of RPN as hash specific pte bits.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash.h    | 18 ++++++------------
 arch/powerpc/include/asm/book3s/64/pgtable.h | 19 ++++++++++++++++---
 arch/powerpc/mm/hash_native_64.c             |  1 +
 3 files changed, 23 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
index af3c88624d3a..33eb1a650317 100644
--- a/arch/powerpc/include/asm/book3s/64/hash.h
+++ b/arch/powerpc/include/asm/book3s/64/hash.h
@@ -6,20 +6,14 @@
  * Common bits between 4K and 64K pages in a linux-style PTE.
  * Additional bits may be defined in pgtable-hash64-*.h
  *
- * Note: We only support user read/write permissions. Supervisor always
- * have full read/write to pages above PAGE_OFFSET (pages below that
- * always use the user access permissions).
- *
- * We could create separate kernel read-only if we used the 3 PP bits
- * combinations that newer processors provide but we currently don't.
  */
-#define H_PAGE_BUSY		_RPAGE_SW1 /* software: PTE & hash are busy */
+#define H_PAGE_BUSY		_RPAGE_RPN45 /* software: PTE & hash are busy */
 #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
-#define H_PAGE_F_GIX_SHIFT	57
-/* (7ul << 57) HPTE index within HPTEG */
-#define H_PAGE_F_GIX		(_RPAGE_RSV2 | _RPAGE_RSV3 | _RPAGE_RSV4)
-#define H_PAGE_F_SECOND		_RPAGE_RSV1	/* HPTE is in 2ndary HPTEG */
-#define H_PAGE_HASHPTE		_RPAGE_SW0	/* PTE has associated HPTE */
+#define H_PAGE_F_GIX_SHIFT	52
+/* (7ul << 52) HPTE index within HPTEG */
+#define H_PAGE_F_SECOND		_RPAGE_RPN44	/* HPTE is in 2ndary HPTEG */
+#define H_PAGE_F_GIX		(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
+#define H_PAGE_HASHPTE		_RPAGE_RPN40	/* PTE has associated HPTE */
 /*
  * Max physical address bit we will use for now.
  *
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index eb82b60b5c89..3d104f8ad891 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -36,16 +36,29 @@
 #define _RPAGE_RSV2		0x0800000000000000UL
 #define _RPAGE_RSV3		0x0400000000000000UL
 #define _RPAGE_RSV4		0x0200000000000000UL
+
+#define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
+#define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
+
+/*
+ * Top and bottom bits of RPN which can be used by hash
+ * translation mode, because we expect them to be zero
+ * otherwise.
+ */
 #define _RPAGE_RPN0		0x01000
 #define _RPAGE_RPN1		0x02000
+#define _RPAGE_RPN45		0x0100000000000000UL
+#define _RPAGE_RPN44		0x0080000000000000UL
+#define _RPAGE_RPN43		0x0040000000000000UL
+#define _RPAGE_RPN42		0x0020000000000000UL
+#define _RPAGE_RPN41		0x0010000000000000UL
+#define _RPAGE_RPN40		0x0008000000000000UL
+
 /* Max physicall address bit as per radix table */
 #define _RPAGE_PA_MAX		57
 
 #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
 #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
-
-#define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
-#define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
 /*
  * Drivers request for cache inhibited pte mapping using _PAGE_NO_CACHE
  * Instead of fixing all of them, add an alternate define which
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c
index cc332608e656..917a5a336441 100644
--- a/arch/powerpc/mm/hash_native_64.c
+++ b/arch/powerpc/mm/hash_native_64.c
@@ -246,6 +246,7 @@ static long native_hpte_insert(unsigned long hpte_group, unsigned long vpn,
 
 	__asm__ __volatile__ ("ptesync" : : : "memory");
 
+	BUILD_BUG_ON(H_PAGE_F_SECOND != (1ul  << (H_PAGE_F_GIX_SHIFT + 3)));
 	return i | (!!(vflags & HPTE_V_SECONDARY) << 3);
 }
 
-- 
2.7.4

* Re: [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits
  2017-03-16 10:32 ` [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits Aneesh Kumar K.V
@ 2017-03-16 21:26   ` Benjamin Herrenschmidt
  2017-03-17  3:39     ` Aneesh Kumar K.V
  2017-03-16 22:27   ` Paul Mackerras
  1 sibling, 1 reply; 28+ messages in thread
From: Benjamin Herrenschmidt @ 2017-03-16 21:26 UTC (permalink / raw)
  To: Aneesh Kumar K.V, paulus, mpe; +Cc: linuxppc-dev

On Thu, 2017-03-16 at 16:02 +0530, Aneesh Kumar K.V wrote:
> Max value supported by hardware is 51 bits address. Radix page table define
> a slot of 57 bits for future expansion. We restrict the value supported in
> linux kernel 51 bits, so that we can use the bits between 57-51 for storing
> hash linux page table bits. This is done in the next patch.

All of them ? I would keep some for future backward compatibility. It's likely
that a successor to P9 will have more physical address bits. I feel nervous
limiting to precisely what P9 supports.

> This will free up the software page table bits to be used for features
> that are needed for both hash and radix. The current hash linux page table
> format doesn't have any free software bits. Moving hash linux page table
> specific bits to top of RPN field free up the software bits for other purpose.
> 
> > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 96566df547a8..c470dcc815d5 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -38,6 +38,25 @@
> >  #define _RPAGE_RSV4		0x0200000000000000UL
> >  #define _RPAGE_RPN0		0x01000
> >  #define _RPAGE_RPN1		0x02000
> +/* Max physicall address bit as per radix table */
> > +#define _RPAGE_PA_MAX		57
> +/*
> + * Max physical address bit we will use for now.
> + *
> + * This is mostly a hardware limitation and for now Power9 has
> + * a 51 bit limit.
> + *
> + * This is different from the number of physical bit required to address
> + * the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
> + * MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
> + * number of sections we can support (SECTIONS_SHIFT).
> + *
> + * This is different from Radix page table limitation above and
> + * should always be less than that. The limit is done such that
> + * we can overload the bits between _RPAGE_PA_MAX and _PAGE_PA_MAX
> + * for hash linux page table specific bits.
> + */
> > +#define _PAGE_PA_MAX		51
>  
> >  #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
> >  #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
> @@ -51,10 +70,11 @@
>   */
> >  #define _PAGE_NO_CACHE		_PAGE_TOLERANT
>  /*
> - * We support 57 bit real address in pte. Clear everything above 57, and
> + * We support _RPAGE_PA_MAX bit real address in pte. On the linux side
> + * we are limited by _PAGE_PA_MAX. Clear everything above _PAGE_PA_MAX
>   * every thing below PAGE_SHIFT;
>   */
> > -#define PTE_RPN_MASK	(((1UL << 57) - 1) & (PAGE_MASK))
> > +#define PTE_RPN_MASK	(((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
>  /*
>   * set of bits not changed in pmd_modify. Even though we have hash specific bits
>   * in here, on radix we expect them to be zero.

* Re: [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64
  2017-03-16 10:31 ` [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64 Aneesh Kumar K.V
@ 2017-03-16 22:00   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:00 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:01:59PM +0530, Aneesh Kumar K.V wrote:
> BOOKE code is dead code as per the Kconfig details. So make it simpler
> by enabling MM_SLICE only for book3s_64. The changes w.r.t nohash is just
> removing deadcode. W.r.t ppc64, 4k without hugetlb will now enable MM_SLICE.
> But that is good, because we reduce one extra variant which probably is not
> getting tested much.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly
  2017-03-16 10:32 ` [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly Aneesh Kumar K.V
@ 2017-03-16 22:03   ` Paul Mackerras
  2017-03-17  6:55     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:03 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:00PM +0530, Aneesh Kumar K.V wrote:
> For low slice max addr should be less that 4G
                                        ^^^^ than

A more verbose explanation of the off-by-1 error that you are fixing
is needed here.  Tell us what goes wrong with the current code and why
your fix is the correct one.

> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

For the code change:

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix.
  2017-03-16 10:32 ` [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix Aneesh Kumar K.V
@ 2017-03-16 22:16   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:01PM +0530, Aneesh Kumar K.V wrote:
> Define everything based on bits present in pgtable.h. This will help in easily
> identifying overlapping bits between hash/radix.
> 
> No functional change with this patch.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE
  2017-03-16 10:32 ` [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE Aneesh Kumar K.V
@ 2017-03-16 22:16   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:02PM +0530, Aneesh Kumar K.V wrote:
> This bit is only used by radix and it is nice to follow the naming style of having
> bit name start with H_/R_ depending on which translation mode they are used.
> 
> No functional change in this patch.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo
  2017-03-16 10:32 ` [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo Aneesh Kumar K.V
@ 2017-03-16 22:17   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:17 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:03PM +0530, Aneesh Kumar K.V wrote:
> With this we have on powernv and pseries /proc/cpuinfo reporting
> 
> timebase        : 512000000
> platform        : PowerNV
> model           : 8247-22L
> machine         : PowerNV 8247-22L
> firmware        : OPAL
> MMU		: Hash
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout
  2017-03-16 10:32 ` [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout Aneesh Kumar K.V
@ 2017-03-16 22:19   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:19 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:04PM +0530, Aneesh Kumar K.V wrote:
> Without this if firmware reports 1MB page size support we will crash
> trying to use 1MB as hugetlb page size.
> 
> echo 300 > /sys/kernel/mm/hugepages/hugepages-1024kB/nr_hugepages
> 
> kernel BUG at ./arch/powerpc/include/asm/hugetlb.h:19!
> .....
> ....
> [c0000000e2c27b30] c00000000029dae8 .hugetlb_fault+0x638/0xda0
> [c0000000e2c27c30] c00000000026fb64 .handle_mm_fault+0x844/0x1d70
> [c0000000e2c27d70] c00000000004805c .do_page_fault+0x3dc/0x7c0
> [c0000000e2c27e30] c00000000000ac98 handle_page_fault+0x10/0x30
> 
> With fix, we don't enable 1MB as hugepage size.
> 
> bash-4.2# cd /sys/kernel/mm/hugepages/
> bash-4.2# ls
> hugepages-16384kB  hugepages-16777216kB
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hugetlbpage.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 8c3389cbcd12..eb8d42bac00b 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -738,6 +738,7 @@ static int __init add_huge_page_size(unsigned long long size)
>  	int shift = __ffs(size);
>  	int mmu_psize;
>  
> +#ifndef CONFIG_PPC_BOOK3S_64

This #ifndef doesn't really seem necessary.  All it is removing is a
check for size <= PAGE_SIZE.  Yes that check is subsumed by the checks
you are adding below, but on the other hand, #if[n]defs inside
functions are ugly and make the code harder to read.  Since this is
not a hot path, let's not have the ifndef.

>  	/* Check that it is a page size supported by the hardware and
>  	 * that it fits within pagetable and slice limits. */
>  	if (size <= PAGE_SIZE)
> @@ -749,10 +750,29 @@ static int __init add_huge_page_size(unsigned long long size)
>  	if (!is_power_of_2(size) || (shift > SLICE_HIGH_SHIFT))
>  		return -EINVAL;
>  #endif
> +#endif /* CONFIG_PPC_BOOK3S_64 */
>  
>  	if ((mmu_psize = shift_to_mmu_psize(shift)) < 0)
>  		return -EINVAL;
>  
> +#ifdef CONFIG_PPC_BOOK3S_64
> +	/*
> +	 * We need to make sure that for different page sizes reported by
> +	 * firmware we only add hugetlb support for page sizes that can be
> +	 * supported by linux page table layout.
> +	 * For now we have
> +	 * Radix: 2M
> +	 * Hash: 16M and 16G
> +	 */
> +	if (radix_enabled()) {
> +		if (mmu_psize != MMU_PAGE_2M)
> +			return -EINVAL;
> +	} else {
> +		if (mmu_psize != MMU_PAGE_16M && mmu_psize != MMU_PAGE_16G)
> +			return -EINVAL;
> +	}
> +#endif
> +
>  	BUG_ON(mmu_psize_defs[mmu_psize].shift != shift);
>  
>  	/* Return if huge page size has already been setup */
> -- 
> 2.7.4

Paul.

* Re: [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy
  2017-03-16 10:32 ` [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy Aneesh Kumar K.V
@ 2017-03-16 22:21   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:21 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:05PM +0530, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

I think it would be better if the subject was something like "Define
_PAGE_SOFT_DIRTY unconditionally" and the comment about conditional
defines was the patch description.

For the code change:

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines
  2017-03-16 10:32 ` [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines Aneesh Kumar K.V
@ 2017-03-16 22:24   ` Paul Mackerras
  0 siblings, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:24 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:06PM +0530, Aneesh Kumar K.V wrote:
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

This change seems correct, but of minimal benefit.

The subject could be better expressed.  How about "Define all PTE bits
based on radix definitions" or something like that?  "Everything" is a
bit too broad.

For the code change:

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

* Re: [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits
  2017-03-16 10:32 ` [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits Aneesh Kumar K.V
  2017-03-16 21:26   ` Benjamin Herrenschmidt
@ 2017-03-16 22:27   ` Paul Mackerras
  1 sibling, 0 replies; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:27 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:07PM +0530, Aneesh Kumar K.V wrote:
> Max value supported by hardware is 51 bits address. Radix page table define
> a slot of 57 bits for future expansion. We restrict the value supported in
> linux kernel 51 bits, so that we can use the bits between 57-51 for storing
> hash linux page table bits. This is done in the next patch.
> 
> This will free up the software page table bits to be used for features
> that are needed for both hash and radix. The current hash linux page table
> format doesn't have any free software bits. Moving hash linux page table
> specific bits to top of RPN field free up the software bits for other purpose.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---

There are a couple of comment typos below, but for the actual code change:

Reviewed-by: Paul Mackerras <paulus@ozlabs.org>

>  arch/powerpc/include/asm/book3s/64/pgtable.h | 24 ++++++++++++++++++++++--
>  1 file changed, 22 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index 96566df547a8..c470dcc815d5 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -38,6 +38,25 @@
>  #define _RPAGE_RSV4		0x0200000000000000UL
>  #define _RPAGE_RPN0		0x01000
>  #define _RPAGE_RPN1		0x02000
> +/* Max physicall address bit as per radix table */

physical not physicall

> +#define _RPAGE_PA_MAX		57
> +/*
> + * Max physical address bit we will use for now.
> + *
> + * This is mostly a hardware limitation and for now Power9 has
> + * a 51 bit limit.
> + *
> + * This is different from the number of physical bit required to address
> + * the last byte of memory. That is defined by MAX_PHYSMEM_BITS.
> + * MAX_PHYSMEM_BITS is a linux limitation imposed by the maximum
> + * number of sections we can support (SECTIONS_SHIFT).
> + *
> + * This is different from Radix page table limitation above and
> + * should always be less than that. The limit is done such that
> + * we can overload the bits between _RPAGE_PA_MAX and _PAGE_PA_MAX
> + * for hash linux page table specific bits.
> + */
> +#define _PAGE_PA_MAX		51
>  
>  #define _PAGE_SOFT_DIRTY	_RPAGE_SW3 /* software: software dirty tracking */
>  #define _PAGE_SPECIAL		_RPAGE_SW2 /* software: special page */
> @@ -51,10 +70,11 @@
>   */
>  #define _PAGE_NO_CACHE		_PAGE_TOLERANT
>  /*
> - * We support 57 bit real address in pte. Clear everything above 57, and
> + * We support _RPAGE_PA_MAX bit real address in pte. On the linux side
> + * we are limited by _PAGE_PA_MAX. Clear everything above _PAGE_PA_MAX
>   * every thing below PAGE_SHIFT;

You lost an "and" in that last sentence.

>   */
> -#define PTE_RPN_MASK	(((1UL << 57) - 1) & (PAGE_MASK))
> +#define PTE_RPN_MASK	(((1UL << _PAGE_PA_MAX) - 1) & (PAGE_MASK))
>  /*
>   * set of bits not changed in pmd_modify. Even though we have hash specific bits
>   * in here, on radix we expect them to be zero.
> -- 
> 2.7.4

Paul.

* Re: [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable
  2017-03-16 10:32 ` [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable Aneesh Kumar K.V
@ 2017-03-16 22:29   ` Paul Mackerras
  2017-03-17  8:54     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:29 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:08PM +0530, Aneesh Kumar K.V wrote:
> This makes max pysical address bits a variable so that hash and radix
> translation mode can choose what value to use. In this patch we also switch the
> radix translation mode to use 57 bits. This make it resilient to future changes
> to max pfn supported by platforms.
> 
> This patch is split from the previous one to make the review easier.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Why do we need to do this now?  It seems like this will add overhead
every time we set a PTE for no current benefit.
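
To make the trade-off concrete, here is a minimal sketch of the two masks and of a mask chosen at MMU init time. It assumes 64K pages (PAGE_SHIFT of 16); every EX_/ex_ name is hypothetical and only stands in for PAGE_MASK, H_PTE_RPN_MASK, RADIX_PTE_RPN_MASK and __pte_rpn_mask from the patch.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT	16	/* assumption: 64K base page size */
#define EX_PAGE_MASK	(~((1ULL << EX_PAGE_SHIFT) - 1))

/* Clear everything at or above pa_max_bits and everything below the page shift. */
static inline uint64_t ex_rpn_mask(unsigned int pa_max_bits)
{
	return ((1ULL << pa_max_bits) - 1) & EX_PAGE_MASK;
}

/* Stand-in for __pte_rpn_mask: a variable that is loaded on every use. */
static uint64_t ex_pte_rpn_mask;

/* Stand-in for the early init hooks: radix uses 57 bits, hash uses 51. */
static inline void ex_init_mmu(int radix)
{
	ex_pte_rpn_mask = ex_rpn_mask(radix ? 57 : 51);
}

/* Extract the page frame number using the runtime-selected mask. */
static inline uint64_t ex_pte_pfn(uint64_t pte)
{
	return (pte & ex_pte_rpn_mask) >> EX_PAGE_SHIFT;
}
```

With a compile-time constant the mask folds into the instruction stream, whereas the variable costs a memory load wherever the mask is applied, which is the overhead being questioned here.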

Paul.

* Re: [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN
  2017-03-16 10:32 ` [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN Aneesh Kumar K.V
@ 2017-03-16 22:34   ` Paul Mackerras
  2017-03-17  3:37     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 28+ messages in thread
From: Paul Mackerras @ 2017-03-16 22:34 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: benh, mpe, linuxppc-dev

On Thu, Mar 16, 2017 at 04:02:09PM +0530, Aneesh Kumar K.V wrote:
> We don't support the full 57 bits of physical address and hence can overload
> the top bits of RPN as hash specific pte bits.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/book3s/64/hash.h    | 18 ++++++------------
>  arch/powerpc/include/asm/book3s/64/pgtable.h | 19 ++++++++++++++++---
>  arch/powerpc/mm/hash_native_64.c             |  1 +
>  3 files changed, 23 insertions(+), 15 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h
> index af3c88624d3a..33eb1a650317 100644
> --- a/arch/powerpc/include/asm/book3s/64/hash.h
> +++ b/arch/powerpc/include/asm/book3s/64/hash.h
> @@ -6,20 +6,14 @@
>   * Common bits between 4K and 64K pages in a linux-style PTE.
>   * Additional bits may be defined in pgtable-hash64-*.h
>   *
> - * Note: We only support user read/write permissions. Supervisor always
> - * have full read/write to pages above PAGE_OFFSET (pages below that
> - * always use the user access permissions).
> - *
> - * We could create separate kernel read-only if we used the 3 PP bits
> - * combinations that newer processors provide but we currently don't.
>   */
> -#define H_PAGE_BUSY		_RPAGE_SW1 /* software: PTE & hash are busy */
> +#define H_PAGE_BUSY		_RPAGE_RPN45 /* software: PTE & hash are busy */
>  #define H_PTE_NONE_MASK		_PAGE_HPTEFLAGS
> -#define H_PAGE_F_GIX_SHIFT	57
> -/* (7ul << 57) HPTE index within HPTEG */
> -#define H_PAGE_F_GIX		(_RPAGE_RSV2 | _RPAGE_RSV3 | _RPAGE_RSV4)
> -#define H_PAGE_F_SECOND		_RPAGE_RSV1	/* HPTE is in 2ndary HPTEG */
> -#define H_PAGE_HASHPTE		_RPAGE_SW0	/* PTE has associated HPTE */
> +#define H_PAGE_F_GIX_SHIFT	52
> +/* (7ul << 52) HPTE index within HPTEG */
> +#define H_PAGE_F_SECOND		_RPAGE_RPN44	/* HPTE is in 2ndary HPTEG */
> +#define H_PAGE_F_GIX		(_RPAGE_RPN43 | _RPAGE_RPN42 | _RPAGE_RPN41)
> +#define H_PAGE_HASHPTE		_RPAGE_RPN40	/* PTE has associated HPTE */
>  /*
>   * Max physical address bit we will use for now.
>   *
> diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
> index eb82b60b5c89..3d104f8ad891 100644
> --- a/arch/powerpc/include/asm/book3s/64/pgtable.h
> +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
> @@ -36,16 +36,29 @@
>  #define _RPAGE_RSV2		0x0800000000000000UL
>  #define _RPAGE_RSV3		0x0400000000000000UL
>  #define _RPAGE_RSV4		0x0200000000000000UL
> +
> +#define _PAGE_PTE		0x4000000000000000UL	/* distinguishes PTEs from pointers */
> +#define _PAGE_PRESENT		0x8000000000000000UL	/* pte contains a translation */
> +
> +/*
> + * Top and bottom bits of RPN which can be used by hash
> + * translation mode, because we expect them to be zero
> + * otherwise.
> + */
>  #define _RPAGE_RPN0		0x01000
>  #define _RPAGE_RPN1		0x02000
> +#define _RPAGE_RPN45		0x0100000000000000UL
> +#define _RPAGE_RPN44		0x0080000000000000UL
> +#define _RPAGE_RPN43		0x0040000000000000UL
> +#define _RPAGE_RPN42		0x0020000000000000UL
> +#define _RPAGE_RPN41		0x0010000000000000UL
> +#define _RPAGE_RPN40		0x0008000000000000UL

If RPN0 is 0x1000, then this is actually RPN39 as far as I can see,
and the other RPN4* bits are likewise off by one.

Paul.

* Re: [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN
  2017-03-16 22:34   ` Paul Mackerras
@ 2017-03-17  3:37     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-17  3:37 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: benh, mpe, linuxppc-dev



On Friday 17 March 2017 04:04 AM, Paul Mackerras wrote:
> On Thu, Mar 16, 2017 at 04:02:09PM +0530, Aneesh Kumar K.V wrote:

.....


/* pte contains a translation */
>> +
>> +/*
>> + * Top and bottom bits of RPN which can be used by hash
>> + * translation mode, because we expect them to be zero
>> + * otherwise.
>> + */
>>  #define _RPAGE_RPN0		0x01000
>>  #define _RPAGE_RPN1		0x02000
>> +#define _RPAGE_RPN45		0x0100000000000000UL
>> +#define _RPAGE_RPN44		0x0080000000000000UL
>> +#define _RPAGE_RPN43		0x0040000000000000UL
>> +#define _RPAGE_RPN42		0x0020000000000000UL
>> +#define _RPAGE_RPN41		0x0010000000000000UL
>> +#define _RPAGE_RPN40		0x0008000000000000UL
>
> If RPN0 is 0x1000, then this is actually RPN39 as far as I can see,
> and the other RPN4* bits are likewise off by one.

0x0100000000000000 >> 12 = 0x100000000000

I guess I got that naming wrong. That is the 45th bit by count, hence the
numbering should be RPN44. I will fix it up in the next update.
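
The arithmetic can be checked with a small sketch. It assumes only that RPN bit 0 sits at PTE bit 12, which is what _RPAGE_RPN0 == 0x01000 implies; example_rpn_bit() is a hypothetical helper, not a kernel function.

```c
#include <stdint.h>

/* Assumption: RPN bit 0 is PTE bit 12 (_RPAGE_RPN0 == 0x01000). */
#define EXAMPLE_RPN_SHIFT	12

/*
 * Hypothetical helper: return the zero-based RPN bit number of a
 * single-bit PTE mask by locating the set bit and subtracting the
 * position of RPN bit 0.
 */
static inline int example_rpn_bit(uint64_t mask)
{
	int pte_bit = 0;

	while ((mask >>= 1) != 0)
		pte_bit++;
	return pte_bit - EXAMPLE_RPN_SHIFT;
}
```

Applied to the masks in the patch, 0x0100000000000000UL is PTE bit 56 and therefore RPN bit 44, and 0x0008000000000000UL is PTE bit 51 and therefore RPN bit 39, so each _RPAGE_RPN4x name is one higher than the bit it selects.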

-aneesh

* Re: [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits
  2017-03-16 21:26   ` Benjamin Herrenschmidt
@ 2017-03-17  3:39     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-17  3:39 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, paulus, mpe; +Cc: linuxppc-dev



On Friday 17 March 2017 02:56 AM, Benjamin Herrenschmidt wrote:
> On Thu, 2017-03-16 at 16:02 +0530, Aneesh Kumar K.V wrote:
>> Max value supported by hardware is 51 bits address. Radix page table define
>> a slot of 57 bits for future expansion. We restrict the value supported in
>> linux kernel 51 bits, so that we can use the bits between 57-51 for storing
>> hash linux page table bits. This is done in the next patch.
>
> All of them ? I would keep some for future backward compatibility. It's likely
> that a successor to P9 will have more physical address bits. I feel nervous
> limiting to precisely what P9 supports.
>

What do you want to keep as the MAX PFN bits here? Any new expansion will
eat into the software bits defined by Radix, and hence those bits can't be
used for generic features.
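
To make the trade-off concrete, here is a rough sketch of which bits of the
57-bit radix RPN slot are left over for a given max real address width. The
57 and 51 come from the patch description; the macro and function names are
hypothetical and only meant for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RADIX_RPN_SLOT_BITS 57 /* slot width quoted in the patch description */
#define MAX_REAL_ADDR_BITS  51 /* limit proposed by this patch */

/* Hypothetical helper: mask with bits [lo, hi) set; requires lo <= hi < 64. */
static uint64_t bit_range(unsigned int lo, unsigned int hi)
{
	return ((UINT64_C(1) << hi) - 1) & ~((UINT64_C(1) << lo) - 1);
}

/* Bits of the RPN slot that stay unused for a given real address width. */
static uint64_t freed_rpn_bits(unsigned int max_real_addr_bits)
{
	return bit_range(max_real_addr_bits, RADIX_RPN_SLOT_BITS);
}
```

With a 51-bit limit this leaves six bits (51 to 56); every extra physical
address bit kept in reserve removes one of them from the low end.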


>> This will free up the software page table bits to be used for features
>> that are needed for both hash and radix. The current hash linux page table
>> format doesn't have any free software bits. Moving hash linux page table
>> specific bits to top of RPN field free up the software bits for other purpose.
>>

-aneesh


* Re: [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly
  2017-03-16 22:03   ` Paul Mackerras
@ 2017-03-17  6:55     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-17  6:55 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: benh, mpe, linuxppc-dev

Paul Mackerras <paulus@ozlabs.org> writes:

> On Thu, Mar 16, 2017 at 04:02:00PM +0530, Aneesh Kumar K.V wrote:
>> For low slice max addr should be less that 4G
>                                         ^^^^ than
>
> A more verbose explanation of the off-by-1 error that you are fixing
> is needed here.  Tell us what goes wrong with the current code and why
> your fix is the correct one.

How about

powerpc/mm/slice: when computing slice mask limit low slice max addr correctly

For a low slice, max addr should be less than 4G. Without limiting this correctly
we will end up with a low slice mask which has the 17th bit set. This is not
a problem with the current code because our low slice mask is of type u16. But
in a later patch I am switching the low slice mask to u64 type, and having the
17th bit set results in a wrong slice mask, which in turn results in mmap failures.
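
The off-by-one can be sketched as below. This is a simplified stand-in for the
slice mask computation, not the kernel code itself: it assumes 256MB low
slices (a shift of 28) and a 4G low-slice limit, and the names LOW_SHIFT,
LOW_TOP and low_mask() are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define LOW_SHIFT 28                    /* assumed: 256MB per low slice */
#define LOW_TOP   (UINT64_C(1) << 32)   /* assumed: low slices cover below 4G */

/*
 * Hypothetical helper: set one bit per low slice touched by the
 * addresses from 'start' up to and including 'last'.
 */
static uint64_t low_mask(uint64_t start, uint64_t last)
{
	uint64_t lo = start >> LOW_SHIFT;
	uint64_t hi = last >> LOW_SHIFT;

	return (UINT64_C(1) << (hi + 1)) - (UINT64_C(1) << lo);
}
```

Clamping the upper address to LOW_TOP gives slice index 16, so the mask is
0x1ffff with the 17th bit set; a u16 silently drops that bit, a u64 does not.
Clamping to LOW_TOP - 1 gives index 15 and the expected 0xffff.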


>
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> For the code change:
>
> Reviewed-by: Paul Mackerras <paulus@ozlabs.org>


* Re: [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable
  2017-03-16 22:29   ` Paul Mackerras
@ 2017-03-17  8:54     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 28+ messages in thread
From: Aneesh Kumar K.V @ 2017-03-17  8:54 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: benh, mpe, linuxppc-dev

Paul Mackerras <paulus@ozlabs.org> writes:

> On Thu, Mar 16, 2017 at 04:02:08PM +0530, Aneesh Kumar K.V wrote:
>> This makes max pysical address bits a variable so that hash and radix
>> translation mode can choose what value to use. In this patch we also switch the
>> radix translation mode to use 57 bits. This make it resilient to future changes
>> to max pfn supported by platforms.
>> 
>> This patch is split from the previous one to make the review easier.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>
> Why do we need to do this now?  It seems like this will add overhead
> every time we set a PTE for no current benefit.

I was trying to make sure that a radix kernel can run on future versions of
hardware where the number of max pfn bits is different.

-aneesh


end of thread, other threads:[~2017-03-17  8:54 UTC | newest]

Thread overview: 28+ messages
2017-03-16 10:31 [PATCH V2 00/11] powerpc/mm/hash: Cleanup and fixes Aneesh Kumar K.V
2017-03-16 10:31 ` [PATCH V2 01/11] powerpc/mm/nohash: MM_SLICE is only used by book3s 64 Aneesh Kumar K.V
2017-03-16 22:00   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 02/11] powerpc/mm/slice: when computing slice mask limit lowe slice max addr correctly Aneesh Kumar K.V
2017-03-16 22:03   ` Paul Mackerras
2017-03-17  6:55     ` Aneesh Kumar K.V
2017-03-16 10:32 ` [PATCH V2 03/11] powerpc/mm: Cleanup bits definition between hash and radix Aneesh Kumar K.V
2017-03-16 22:16   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 04/11] powerpc/mm/radix: rename _PAGE_LARGE to R_PAGE_LARGE Aneesh Kumar K.V
2017-03-16 22:16   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 05/11] powerpc/mm: Add translation mode information in /proc/cpuinfo Aneesh Kumar K.V
2017-03-16 22:17   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 06/11] powerpc/mm/hugetlb: Filter out hugepage size not supported by page table layout Aneesh Kumar K.V
2017-03-16 22:19   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 07/11] powerpc/mm: Conditional defines of pte bits are messy Aneesh Kumar K.V
2017-03-16 22:21   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 08/11] powerpc/mm: Express everything based on Radix page table defines Aneesh Kumar K.V
2017-03-16 22:24   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 09/11] powerpc/mm: Lower the max real address to 51 bits Aneesh Kumar K.V
2017-03-16 21:26   ` Benjamin Herrenschmidt
2017-03-17  3:39     ` Aneesh Kumar K.V
2017-03-16 22:27   ` Paul Mackerras
2017-03-16 10:32 ` [PATCH V2 10/11] powerpc/mm/radix: Make max pfn bits a variable Aneesh Kumar K.V
2017-03-16 22:29   ` Paul Mackerras
2017-03-17  8:54     ` Aneesh Kumar K.V
2017-03-16 10:32 ` [PATCH V2 11/11] powerpc/mm: Move hash specific pte bits to be top bits of RPN Aneesh Kumar K.V
2017-03-16 22:34   ` Paul Mackerras
2017-03-17  3:37     ` Aneesh Kumar K.V
