linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH V2 0/4] Add support for 4PB virtual address space on hash
@ 2018-02-26 14:08 Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 1/4] powerpc/mm/slice: Update documentation in the file Aneesh Kumar K.V
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:08 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This patch series extends the max virtual address space from 512TB
to 4PB with 64K page size. We do that by allocating one VSID context for
each 512TB range. More details are explained in patch 3.
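
A minimal illustration of the idea (standalone C, not kernel code; the
H_BITS_FIRST_CONTEXT value and the id/extended_id layout are the ones
introduced in patch 3):

	#include <stdio.h>

	/* 64K pages: each VSID context covers 512TB (2^49 bytes) */
	#define H_BITS_FIRST_CONTEXT	49

	int main(void)
	{
		unsigned long ea = 0x0003000000000000UL;	/* 768TB */
		int slot = ea >> H_BITS_FIRST_CONTEXT;		/* slot 1 */

		/*
		 * Slot 0 maps to mm->context.id (also used as PIDR);
		 * slot N maps to mm->context.extended_id[N - 1].
		 */
		printf("EA 0x%lx uses context slot %d\n", ea, slot);
		return 0;
	}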


Aneesh Kumar K.V (4):
  powerpc/mm/slice: Update documentation in the file.
  powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area
  powerpc/mm: Add support for handling > 512TB address in SLB miss
  powerpc/mm/hash64: Increase the VA range

 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   6 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   7 +-
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   6 +-
 arch/powerpc/include/asm/book3s/64/mmu.h      |  24 +++++
 arch/powerpc/include/asm/processor.h          |  16 ++-
 arch/powerpc/kernel/exceptions-64s.S          |  12 ++-
 arch/powerpc/mm/copro_fault.c                 |   2 +-
 arch/powerpc/mm/hash_utils_64.c               |   4 +-
 arch/powerpc/mm/init_64.c                     |   6 --
 arch/powerpc/mm/mmu_context_book3s64.c        |  17 ++-
 arch/powerpc/mm/pgtable-hash64.c              |   2 +-
 arch/powerpc/mm/pgtable_64.c                  |   5 -
 arch/powerpc/mm/slb.c                         | 150 ++++++++++++++++++++++++++
 arch/powerpc/mm/slb_low.S                     |   6 +-
 arch/powerpc/mm/slice.c                       |  61 ++++++-----
 arch/powerpc/mm/tlb_hash64.c                  |   2 +-
 16 files changed, 273 insertions(+), 53 deletions(-)

-- 
2.14.3


* [PATCH V2 1/4] powerpc/mm/slice: Update documentation in the file.
  2018-02-26 14:08 [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
@ 2018-02-26 14:08 ` Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area Aneesh Kumar K.V
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:08 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

We will make code changes in the next patch. To make the review easier, split
the documentation update into a separate patch.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/slice.c | 27 +++++++++++++++++++--------
 1 file changed, 19 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 98b53d48968f..259bbda9a222 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -478,7 +478,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * already
 	 */
 	slice_mask_for_size(mm, psize, &good_mask, high_limit);
-	slice_print_mask(" good_mask", good_mask);
+	slice_print_mask("Mask for page size", good_mask);
 
 	/*
 	 * Here "good" means slices that are already the right page size,
@@ -507,15 +507,17 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 			slice_or_mask(&good_mask, &compat_mask);
 	}
 #endif
-
 	/* First check hint if it's valid or if we have MAP_FIXED */
 	if (addr != 0 || fixed) {
-		/* Build a mask for the requested range */
+		/*
+		 * Build a mask for the requested range
+		 */
 		slice_range_to_mask(addr, len, &mask);
-		slice_print_mask(" mask", mask);
+		slice_print_mask("Request range mask", mask);
 
-		/* Check if we fit in the good mask. If we do, we just return,
-		 * nothing else to do
+		/*
+		 * Check if we fit in the good mask. If we do, we just
+		 * return, nothing else to do
 		 */
 		if (slice_check_fit(mm, mask, good_mask)) {
 			slice_dbg(" fits good !\n");
@@ -553,8 +555,8 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		return -EBUSY;
 
 	slice_dbg(" search...\n");
-
-	/* If we had a hint that didn't work out, see if we can fit
+	/*
+	 * If we had a hint that didn't work out, see if we can fit
 	 * anywhere in the good area.
 	 */
 	if (addr) {
@@ -573,7 +575,16 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 			       psize, topdown, high_limit);
 
 #ifdef CONFIG_PPC_64K_PAGES
+	/*
+	 * If we didn't request a fixed mapping, we never looked at
+	 * the compat area. Now that we are not finding space, let's
+	 * look at the 4K slices too.
+	 */
 	if (addr == -ENOMEM && psize == MMU_PAGE_64K) {
+		/*
+		 * The mask variable is free here. Use it for the
+		 * compat size mask.
+		 */
 		/* retry the search with 4k-page slices included */
 		slice_or_mask(&potential_mask, &compat_mask);
 		addr = slice_find_area(mm, len, potential_mask,
-- 
2.14.3


* [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area
  2018-02-26 14:08 [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 1/4] powerpc/mm/slice: Update documentation in the file Aneesh Kumar K.V
@ 2018-02-26 14:08 ` Aneesh Kumar K.V
  2018-02-26 22:24   ` Nicholas Piggin
  2018-02-26 14:08 ` [PATCH V2 3/4] powerpc/mm: Add support for handling > 512TB address in SLB miss Aneesh Kumar K.V
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:08 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

This patch kills the potential_mask and compat_mask variables and instead
uses tmp_mask, so that we can reduce the stack usage. This is required so
that we can increase the high_slices bitmap to a larger value.

The patch does result in extra computation in the final stage, where it ends
up recomputing the compat mask again.
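
To see the stack cost, here is a back-of-the-envelope sketch (standalone C;
the SLICE_HIGH_SHIFT value of 40, i.e. 1TB high slices, is an assumption for
illustration, and the kernel's slice_mask uses a DECLARE_BITMAP of the same
size):

	#include <stdio.h>

	#define BITS_PER_LONG		64
	#define SLICE_HIGH_SHIFT	40		/* assumed: 1TB slices */
	#define H_PGTABLE_RANGE		(1UL << 52)	/* 4PB */
	#define SLICE_NUM_HIGH		(H_PGTABLE_RANGE >> SLICE_HIGH_SHIFT)

	struct slice_mask {
		unsigned long low_slices;
		unsigned long high_slices[SLICE_NUM_HIGH / BITS_PER_LONG];
	};

	int main(void)
	{
		/*
		 * With 4096 high slices each mask is ~520 bytes, so
		 * three on-stack copies already consume ~1.5KB of the
		 * 2KB frame-size warning budget.
		 */
		printf("one mask: %zu bytes, three masks: %zu bytes\n",
		       sizeof(struct slice_mask),
		       3 * sizeof(struct slice_mask));
		return 0;
	}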

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/mm/slice.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 259bbda9a222..832c681c341a 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -413,8 +413,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 {
 	struct slice_mask mask;
 	struct slice_mask good_mask;
-	struct slice_mask potential_mask;
-	struct slice_mask compat_mask;
+	struct slice_mask tmp_mask;
 	int fixed = (flags & MAP_FIXED);
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
 	unsigned long page_size = 1UL << pshift;
@@ -449,11 +448,8 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	bitmap_zero(mask.high_slices, SLICE_NUM_HIGH);
 
 	/* silence stupid warning */;
-	potential_mask.low_slices = 0;
-	bitmap_zero(potential_mask.high_slices, SLICE_NUM_HIGH);
-
-	compat_mask.low_slices = 0;
-	bitmap_zero(compat_mask.high_slices, SLICE_NUM_HIGH);
+	tmp_mask.low_slices = 0;
+	bitmap_zero(tmp_mask.high_slices, SLICE_NUM_HIGH);
 
 	/* Sanity checks */
 	BUG_ON(mm->task_size == 0);
@@ -502,9 +498,11 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 #ifdef CONFIG_PPC_64K_PAGES
 	/* If we support combo pages, we can allow 64k pages in 4k slices */
 	if (psize == MMU_PAGE_64K) {
-		slice_mask_for_size(mm, MMU_PAGE_4K, &compat_mask, high_limit);
+		slice_mask_for_size(mm, MMU_PAGE_4K, &tmp_mask, high_limit);
 		if (fixed)
-			slice_or_mask(&good_mask, &compat_mask);
+			slice_or_mask(&good_mask, &tmp_mask);
+
+		slice_print_mask("Mask for compat page size", tmp_mask);
 	}
 #endif
 	/* First check hint if it's valid or if we have MAP_FIXED */
@@ -541,11 +539,11 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	 * We don't fit in the good mask, check what other slices are
 	 * empty and thus can be converted
 	 */
-	slice_mask_for_free(mm, &potential_mask, high_limit);
-	slice_or_mask(&potential_mask, &good_mask);
-	slice_print_mask(" potential", potential_mask);
+	slice_mask_for_free(mm, &tmp_mask, high_limit);
+	slice_or_mask(&tmp_mask, &good_mask);
+	slice_print_mask("Free area/potential ", tmp_mask);
 
-	if ((addr != 0 || fixed) && slice_check_fit(mm, mask, potential_mask)) {
+	if ((addr != 0 || fixed) && slice_check_fit(mm, mask, tmp_mask)) {
 		slice_dbg(" fits potential !\n");
 		goto convert;
 	}
@@ -571,7 +569,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	/* Now let's see if we can find something in the existing slices
 	 * for that size plus free slices
 	 */
-	addr = slice_find_area(mm, len, potential_mask,
+	addr = slice_find_area(mm, len, tmp_mask,
 			       psize, topdown, high_limit);
 
 #ifdef CONFIG_PPC_64K_PAGES
@@ -585,9 +583,10 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		 * mask variable is free here. Use that for compat
 		 * size mask.
 		 */
+		slice_mask_for_size(mm, MMU_PAGE_4K, &mask, high_limit);
 		/* retry the search with 4k-page slices included */
-		slice_or_mask(&potential_mask, &compat_mask);
-		addr = slice_find_area(mm, len, potential_mask,
+		slice_or_mask(&tmp_mask, &mask);
+		addr = slice_find_area(mm, len, tmp_mask,
 				       psize, topdown, high_limit);
 	}
 #endif
@@ -600,8 +599,9 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	slice_print_mask(" mask", mask);
 
  convert:
+	slice_mask_for_size(mm, MMU_PAGE_4K, &tmp_mask, high_limit);
 	slice_andnot_mask(&mask, &good_mask);
-	slice_andnot_mask(&mask, &compat_mask);
+	slice_andnot_mask(&mask, &tmp_mask);
 	if (mask.low_slices || !bitmap_empty(mask.high_slices, SLICE_NUM_HIGH)) {
 		slice_convert(mm, mask, psize);
 		if (psize > MMU_PAGE_BASE)
-- 
2.14.3


* [PATCH V2 3/4] powerpc/mm: Add support for handling > 512TB address in SLB miss
  2018-02-26 14:08 [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 1/4] powerpc/mm/slice: Update documentation in the file Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area Aneesh Kumar K.V
@ 2018-02-26 14:08 ` Aneesh Kumar K.V
  2018-02-26 14:08 ` [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range Aneesh Kumar K.V
  2018-02-26 14:15 ` [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
  4 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:08 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

For addresses above 512TB we allocate additional MMU contexts. To keep it
simple, addresses above 512TB are handled with IR/DR=1 and with a stack
frame set up.

We do the additional context allocation in the SLB miss handler. If the
context is not yet allocated, we enable interrupts, allocate the context and
retry the access, which will again result in an SLB miss.
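
The handler added below behaves roughly like this (a simplified sketch; the
wrapper name here is illustrative, the helpers are the ones this patch adds):

	static void handle_large_addr_slb_miss(struct mm_struct *mm,
					       struct pt_regs *regs,
					       unsigned long ea)
	{
		int context = get_esid_context(&mm->context, ea);

		if (!context) {
			/*
			 * First miss in this 512TB range: allocate the
			 * context with interrupts enabled. The retried
			 * access faults again and takes the path below.
			 */
			if (!arch_irq_disabled_regs(regs))
				local_irq_enable();
			alloc_extended_context(mm, ea);
			return;
		}
		/* Second miss: the context exists, insert the SLB entry */
		insert_slb_entry(__get_vsid(context, ea, mmu_highuser_ssize),
				 ea, get_slice_psize(mm, ea),
				 mmu_highuser_ssize);
	}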

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/book3s/64/hash-4k.h  |   6 ++
 arch/powerpc/include/asm/book3s/64/hash-64k.h |   5 +
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |   6 +-
 arch/powerpc/include/asm/book3s/64/mmu.h      |  24 +++++
 arch/powerpc/include/asm/processor.h          |   7 ++
 arch/powerpc/kernel/exceptions-64s.S          |  12 ++-
 arch/powerpc/mm/copro_fault.c                 |   2 +-
 arch/powerpc/mm/hash_utils_64.c               |   4 +-
 arch/powerpc/mm/mmu_context_book3s64.c        |  17 ++-
 arch/powerpc/mm/pgtable-hash64.c              |   2 +-
 arch/powerpc/mm/slb.c                         | 150 ++++++++++++++++++++++++++
 arch/powerpc/mm/slb_low.S                     |   6 +-
 arch/powerpc/mm/tlb_hash64.c                  |   2 +-
 13 files changed, 228 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h
index 67c5475311ee..af2ba9875f18 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-4k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h
@@ -11,6 +11,12 @@
 #define H_PUD_INDEX_SIZE  9
 #define H_PGD_INDEX_SIZE  9
 
+/*
+ * Number of address bits below which we use the default context
+ * for SLB allocation. For 4K this is 64TB.
+ */
+#define H_BITS_FIRST_CONTEXT	46
+
 #ifndef __ASSEMBLY__
 #define H_PTE_TABLE_SIZE	(sizeof(pte_t) << H_PTE_INDEX_SIZE)
 #define H_PMD_TABLE_SIZE	(sizeof(pmd_t) << H_PMD_INDEX_SIZE)
diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 3bcf269f8f55..0ee0fc1ad675 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -6,6 +6,11 @@
 #define H_PMD_INDEX_SIZE  10
 #define H_PUD_INDEX_SIZE  7
 #define H_PGD_INDEX_SIZE  8
+/*
+ * Number of address bits below which we use the default context
+ * for SLB allocation. For 64K this is 512TB.
+ */
+#define H_BITS_FIRST_CONTEXT	49
 
 /*
  * 64k aligned address free up few of the lower bits of RPN for us
diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 50ed64fba4ae..8ee83f6e9c84 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -691,8 +691,8 @@ static inline int user_segment_size(unsigned long addr)
 	return MMU_SEGSIZE_256M;
 }
 
-static inline unsigned long get_vsid(unsigned long context, unsigned long ea,
-				     int ssize)
+static inline unsigned long __get_vsid(unsigned long context, unsigned long ea,
+				       int ssize)
 {
 	unsigned long va_bits = VA_BITS;
 	unsigned long vsid_bits;
@@ -744,7 +744,7 @@ static inline unsigned long get_kernel_vsid(unsigned long ea, int ssize)
 	 */
 	context = (ea >> 60) - KERNEL_REGION_CONTEXT_OFFSET;
 
-	return get_vsid(context, ea, ssize);
+	return __get_vsid(context, ea, ssize);
 }
 
 unsigned htab_shift_for_mem_size(unsigned long mem_size);
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 0abeb0e2d616..f3fe5c772e32 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -82,6 +82,12 @@ struct spinlock;
 
 typedef struct {
 	mm_context_id_t id;
+	/*
+	 * One context for each 512TB above the first 512TB.
+	 * First 512TB context is saved in id and is also used
+	 * as PIDR.
+	 */
+	mm_context_id_t extended_id[(TASK_SIZE_USER64/TASK_CONTEXT_SIZE) - 1];
 	u16 user_psize;		/* page size index */
 
 	/* Number of bits in the mm_cpumask */
@@ -174,5 +180,23 @@ extern void radix_init_pseries(void);
 static inline void radix_init_pseries(void) { };
 #endif
 
+static inline int get_esid_context(mm_context_t *ctx, unsigned long ea)
+{
+	int index = ea >> H_BITS_FIRST_CONTEXT;
+
+	if (index == 0)
+		return ctx->id;
+	return ctx->extended_id[--index];
+}
+
+static inline unsigned long get_user_vsid(mm_context_t *ctx,
+					  unsigned long ea, int ssize)
+{
+	unsigned long context = get_esid_context(ctx, ea);
+
+	return __get_vsid(context, ea, ssize);
+}
+
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_BOOK3S_64_MMU_H_ */
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 01299cdc9806..70d65b482504 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -119,9 +119,16 @@ void release_thread(struct task_struct *);
  */
 #define TASK_SIZE_USER64		TASK_SIZE_512TB
 #define DEFAULT_MAP_WINDOW_USER64	TASK_SIZE_128TB
+#define TASK_CONTEXT_SIZE		TASK_SIZE_512TB
 #else
 #define TASK_SIZE_USER64		TASK_SIZE_64TB
 #define DEFAULT_MAP_WINDOW_USER64	TASK_SIZE_64TB
+/*
+ * We don't need to allocate extended context ids for 4K
+ * page size. We limit the max address on this config to
+ * 64TB.
+ */
+#define TASK_CONTEXT_SIZE		TASK_SIZE_64TB
 #endif
 
 /*
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 3ac87e53b3da..166b8c0f1830 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -620,8 +620,12 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
 	ld	r10,PACA_EXSLB+EX_LR(r13)
 	lwz	r9,PACA_EXSLB+EX_CCR(r13)	/* get saved CR */
 	mtlr	r10
+	/*
+	 * Large address, check whether we have to allocate new
+	 * contexts.
+	 */
+	beq-	8f
 
-	beq-	8f		/* if bad address, make full stack frame */
 
 	bne-	cr5,2f		/* if unrecoverable exception, oops */
 
@@ -685,7 +689,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_TYPE_RADIX)
 	mr	r3,r12
 	mfspr	r11,SPRN_SRR0
 	mfspr	r12,SPRN_SRR1
-	LOAD_HANDLER(r10,bad_addr_slb)
+	LOAD_HANDLER(r10, multi_context_slb)
 	mtspr	SPRN_SRR0,r10
 	ld	r10,PACAKMSR(r13)
 	mtspr	SPRN_SRR1,r10
@@ -700,7 +704,7 @@ EXC_COMMON_BEGIN(unrecov_slb)
 	bl	unrecoverable_exception
 	b	1b
 
-EXC_COMMON_BEGIN(bad_addr_slb)
+EXC_COMMON_BEGIN(multi_context_slb)
 	EXCEPTION_PROLOG_COMMON(0x380, PACA_EXSLB)
 	RECONCILE_IRQ_STATE(r10, r11)
 	ld	r3, PACA_EXSLB+EX_DAR(r13)
@@ -710,7 +714,7 @@ EXC_COMMON_BEGIN(bad_addr_slb)
 	std	r10, _TRAP(r1)
 2:	bl	save_nvgprs
 	addi	r3, r1, STACK_FRAME_OVERHEAD
-	bl	slb_miss_bad_addr
+	bl	handle_multi_context_slb_miss
 	b	ret_from_except
 
 EXC_REAL_BEGIN(hardware_interrupt, 0x500, 0x100)
diff --git a/arch/powerpc/mm/copro_fault.c b/arch/powerpc/mm/copro_fault.c
index 697b70ad1195..7d0945bd3a61 100644
--- a/arch/powerpc/mm/copro_fault.c
+++ b/arch/powerpc/mm/copro_fault.c
@@ -112,7 +112,7 @@ int copro_calculate_slb(struct mm_struct *mm, u64 ea, struct copro_slb *slb)
 			return 1;
 		psize = get_slice_psize(mm, ea);
 		ssize = user_segment_size(ea);
-		vsid = get_vsid(mm->context.id, ea, ssize);
+		vsid = get_user_vsid(&mm->context, ea, ssize);
 		vsidkey = SLB_VSID_USER;
 		break;
 	case VMALLOC_REGION_ID:
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index cf290d415dcd..d6fdf20cdd7b 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1262,7 +1262,7 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea,
 		}
 		psize = get_slice_psize(mm, ea);
 		ssize = user_segment_size(ea);
-		vsid = get_vsid(mm->context.id, ea, ssize);
+		vsid = get_user_vsid(&mm->context, ea, ssize);
 		break;
 	case VMALLOC_REGION_ID:
 		vsid = get_kernel_vsid(ea, mmu_kernel_ssize);
@@ -1527,7 +1527,7 @@ void hash_preload(struct mm_struct *mm, unsigned long ea,
 
 	/* Get VSID */
 	ssize = user_segment_size(ea);
-	vsid = get_vsid(mm->context.id, ea, ssize);
+	vsid = get_user_vsid(&mm->context, ea, ssize);
 	if (!vsid)
 		return;
 	/*
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 929d9ef7083f..9cea7f33a00c 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -185,6 +185,19 @@ void __destroy_context(int context_id)
 }
 EXPORT_SYMBOL_GPL(__destroy_context);
 
+void destroy_extended_context(mm_context_t *ctx)
+{
+	int index, context_id;
+
+	spin_lock(&mmu_context_lock);
+	for (index = 1; index < (TASK_SIZE_USER64/TASK_CONTEXT_SIZE); index++) {
+		context_id = ctx->extended_id[index - 1];
+		if (context_id)
+			ida_remove(&mmu_context_ida, context_id);
+	}
+	spin_unlock(&mmu_context_lock);
+}
+
 #ifdef CONFIG_PPC_64K_PAGES
 static void destroy_pagetable_page(struct mm_struct *mm)
 {
@@ -220,8 +233,10 @@ void destroy_context(struct mm_struct *mm)
 #endif
 	if (radix_enabled())
 		WARN_ON(process_tb[mm->context.id].prtb0 != 0);
-	else
+	else {
 		subpage_prot_free(mm);
+		destroy_extended_context(&mm->context);
+	}
 	destroy_pagetable_page(mm);
 	__destroy_context(mm->context.id);
 	mm->context.id = MMU_NO_CONTEXT;
diff --git a/arch/powerpc/mm/pgtable-hash64.c b/arch/powerpc/mm/pgtable-hash64.c
index 469808e77e58..a87b18cf6749 100644
--- a/arch/powerpc/mm/pgtable-hash64.c
+++ b/arch/powerpc/mm/pgtable-hash64.c
@@ -320,7 +320,7 @@ void hpte_do_hugepage_flush(struct mm_struct *mm, unsigned long addr,
 
 	if (!is_kernel_addr(addr)) {
 		ssize = user_segment_size(addr);
-		vsid = get_vsid(mm->context.id, addr, ssize);
+		vsid = get_user_vsid(&mm->context, addr, ssize);
 		WARN_ON(vsid == 0);
 	} else {
 		vsid = get_kernel_vsid(addr, mmu_kernel_ssize);
diff --git a/arch/powerpc/mm/slb.c b/arch/powerpc/mm/slb.c
index 13cfe413b40d..917438457b2f 100644
--- a/arch/powerpc/mm/slb.c
+++ b/arch/powerpc/mm/slb.c
@@ -23,6 +23,7 @@
 #include <asm/smp.h>
 #include <linux/compiler.h>
 #include <linux/mm_types.h>
+#include <linux/context_tracking.h>
 
 #include <asm/udbg.h>
 #include <asm/code-patching.h>
@@ -340,3 +341,152 @@ void slb_initialize(void)
 
 	asm volatile("isync":::"memory");
 }
+
+/*
+ * Only handle insertion of 1TB SLB entries.
+ */
+static void insert_slb_entry(unsigned long vsid, unsigned long ea,
+			     int bpsize, int ssize)
+{
+	int slb_cache_index;
+	unsigned long flags;
+	enum slb_index index;
+	unsigned long vsid_data, esid_data;
+
+	/*
+	 * We are called with irqs disabled, hence it is safe
+	 * to access the PACA.
+	 */
+	index =  get_paca()->stab_rr;
+	/*
+	 * Simple round-robin replacement of SLB entries.
+	 */
+	if (index < mmu_slb_size)
+		index++;
+	else
+		index = SLB_NUM_BOLTED;
+	get_paca()->stab_rr = index;
+
+	flags = SLB_VSID_USER | mmu_psize_defs[bpsize].sllp;
+	vsid_data =  (vsid << SLB_VSID_SHIFT_1T) | flags |
+		((unsigned long) ssize << SLB_VSID_SSIZE_SHIFT);
+	esid_data = mk_esid_data(ea, mmu_highuser_ssize, index);
+
+	asm volatile("slbmte %0, %1" : : "r" (vsid_data), "r" (esid_data)
+		     : "memory");
+	/*
+	 * Now update slb cache entries
+	 */
+	slb_cache_index = get_paca()->slb_cache_ptr;
+	if (slb_cache_index < SLB_CACHE_ENTRIES) {
+		/*
+		 * We have space in slb cache for optimized switch_slb().
+		 * Top 36 bits from esid_data as per ISA
+		 */
+		get_paca()->slb_cache[slb_cache_index++] = esid_data >> 28;
+	}
+	/*
+	 * if we are full, just increment and return.
+	 */
+	get_paca()->slb_cache_ptr++;
+	return;
+}
+
+static void alloc_extended_context(struct mm_struct *mm, unsigned long ea)
+{
+	int context_id;
+
+	int index = (ea >> H_BITS_FIRST_CONTEXT) - 1;
+
+	/*
+	 * We need to do locking only here. If this value was not set before,
+	 * we will have taken an SLB miss and will reach here. The value will
+	 * be either 0 or a valid extended context. We need to make sure two
+	 * parallel SLB misses don't end up allocating an extended context for
+	 * the same range. The locking below ensures that. For now we take the
+	 * heavy mmap_sem, but it can be changed to a per-mm_context_t custom
+	 * lock if needed.
+	 */
+	down_read(&mm->mmap_sem);
+	context_id = hash__alloc_context_id();
+	if (context_id < 0) {
+		up_read(&mm->mmap_sem);
+		pagefault_out_of_memory();
+		return;
+	}
+	/* Check for parallel allocation after holding lock */
+	if (!mm->context.extended_id[index])
+		mm->context.extended_id[index] = context_id;
+	else
+		__destroy_context(context_id);
+	up_read(&mm->mmap_sem);
+	return;
+}
+
+static void __handle_multi_context_slb_miss(struct pt_regs *regs,
+					    unsigned long ea)
+{
+	int context, bpsize;
+	unsigned long vsid;
+	struct mm_struct *mm = current->mm;
+
+	context = get_esid_context(&mm->context, ea);
+	if (!context) {
+		/*
+		 * haven't allocated context yet for this range.
+		 * Enable irq and allo context and return. We will
+		 * take an slb miss on this again and come here with
+		 * allocated context.
+		 */
+		/* We restore the interrupt state now */
+		if (!arch_irq_disabled_regs(regs))
+			local_irq_enable();
+		return alloc_extended_context(mm, ea);
+	}
+	/*
+	 * We are always above 1TB, hence use high user segment size.
+	 */
+	vsid = __get_vsid(context, ea, mmu_highuser_ssize);
+	bpsize = get_slice_psize(mm, ea);
+
+	insert_slb_entry(vsid, ea, bpsize, mmu_highuser_ssize);
+	return;
+}
+
+/*
+ * exception_enter() handling? FIXME!!
+ */
+void handle_multi_context_slb_miss(struct pt_regs *regs)
+{
+	enum ctx_state prev_state = exception_enter();
+	unsigned long ea = regs->dar;
+
+	/*
+	 * The kernel always runs with a single context. Hence
+	 * anything that requests multi context handling is
+	 * considered a bad SLB request.
+	 */
+	if (!user_mode(regs))
+		return bad_page_fault(regs, ea, SIGSEGV);
+
+	if (REGION_ID(ea) != USER_REGION_ID)
+		goto slb_bad_addr;
+	/*
+	 * Are we beyond what the page table layout supports?
+	 */
+	if ((ea & ~REGION_MASK) >= H_PGTABLE_RANGE)
+		goto slb_bad_addr;
+
+	/* Lower addresses should be handled by the asm code */
+	if (ea <= (1UL << H_BITS_FIRST_CONTEXT))
+		goto slb_bad_addr;
+
+	__handle_multi_context_slb_miss(regs, ea);
+	exception_exit(prev_state);
+	return;
+
+slb_bad_addr:
+	_exception(SIGSEGV, regs, SEGV_BNDERR, ea);
+	exception_exit(prev_state);
+	return;
+}
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index 2cf5ef3fc50d..425b4c5ec1e9 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -75,10 +75,12 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_68_BIT_VA)
  */
 _GLOBAL(slb_allocate)
 	/*
-	 * check for bad kernel/user address
+	 * Check the address range for which we need multi context handling.
+	 * For the default context we allocate the SLB via the fast path; for
+	 * larger addresses we branch out to C code and use the extra contexts.
 	 * (ea & ~REGION_MASK) >= PGTABLE_RANGE
 	 */
-	rldicr. r9,r3,4,(63 - H_PGTABLE_EADDR_SIZE - 4)
+	rldicr. r9,r3,4,(63 - H_BITS_FIRST_CONTEXT - 4)
 	bne-	8f
 
 	srdi	r9,r3,60		/* get region */
diff --git a/arch/powerpc/mm/tlb_hash64.c b/arch/powerpc/mm/tlb_hash64.c
index 9b23f12e863c..87d71dd25441 100644
--- a/arch/powerpc/mm/tlb_hash64.c
+++ b/arch/powerpc/mm/tlb_hash64.c
@@ -89,7 +89,7 @@ void hpte_need_flush(struct mm_struct *mm, unsigned long addr,
 	/* Build full vaddr */
 	if (!is_kernel_addr(addr)) {
 		ssize = user_segment_size(addr);
-		vsid = get_vsid(mm->context.id, addr, ssize);
+		vsid = get_user_vsid(&mm->context, addr, ssize);
 	} else {
 		vsid = get_kernel_vsid(addr, mmu_kernel_ssize);
 		ssize = mmu_kernel_ssize;
-- 
2.14.3


* [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range
  2018-02-26 14:08 [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
                   ` (2 preceding siblings ...)
  2018-02-26 14:08 ` [PATCH V2 3/4] powerpc/mm: Add support for handling > 512TB address in SLB miss Aneesh Kumar K.V
@ 2018-02-26 14:08 ` Aneesh Kumar K.V
  2018-02-26 18:11   ` Murilo Opsfelder Araujo
  2018-02-28 23:06   ` kbuild test robot
  2018-02-26 14:15 ` [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
  4 siblings, 2 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:08 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev, Aneesh Kumar K.V

---
 arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
 arch/powerpc/include/asm/processor.h          | 9 ++++++++-
 arch/powerpc/mm/init_64.c                     | 6 ------
 arch/powerpc/mm/pgtable_64.c                  | 5 -----
 4 files changed, 9 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hash-64k.h b/arch/powerpc/include/asm/book3s/64/hash-64k.h
index 0ee0fc1ad675..02098d7fe177 100644
--- a/arch/powerpc/include/asm/book3s/64/hash-64k.h
+++ b/arch/powerpc/include/asm/book3s/64/hash-64k.h
@@ -4,7 +4,7 @@
 
 #define H_PTE_INDEX_SIZE  8
 #define H_PMD_INDEX_SIZE  10
-#define H_PUD_INDEX_SIZE  7
+#define H_PUD_INDEX_SIZE  10
 #define H_PGD_INDEX_SIZE  8
 /*
  * No of address bits below which we use the default context
diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index 70d65b482504..a621a068880a 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -109,6 +109,13 @@ void release_thread(struct task_struct *);
 #define TASK_SIZE_64TB  (0x0000400000000000UL)
 #define TASK_SIZE_128TB (0x0000800000000000UL)
 #define TASK_SIZE_512TB (0x0002000000000000UL)
+#define TASK_SIZE_1PB   (0x0004000000000000UL)
+#define TASK_SIZE_2PB   (0x0008000000000000UL)
+/*
+ * With 52 bits in the address we can support
+ * up to 4PB of address space.
+ */
+#define TASK_SIZE_4PB   (0x0010000000000000UL)
 
 /*
  * For now 512TB is only supported with book3s and 64K linux page size.
@@ -117,7 +124,7 @@ void release_thread(struct task_struct *);
 /*
  * Max value currently used:
  */
-#define TASK_SIZE_USER64		TASK_SIZE_512TB
+#define TASK_SIZE_USER64		TASK_SIZE_4PB
 #define DEFAULT_MAP_WINDOW_USER64	TASK_SIZE_128TB
 #define TASK_CONTEXT_SIZE		TASK_SIZE_512TB
 #else
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index fdb424a29f03..63470b06c502 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -68,12 +68,6 @@
 
 #include "mmu_decl.h"
 
-#ifdef CONFIG_PPC_BOOK3S_64
-#if H_PGTABLE_RANGE > USER_VSID_RANGE
-#warning Limited user VSID range means pagetable space is wasted
-#endif
-#endif /* CONFIG_PPC_BOOK3S_64 */
-
 phys_addr_t memstart_addr = ~0;
 EXPORT_SYMBOL_GPL(memstart_addr);
 phys_addr_t kernstart_addr;
diff --git a/arch/powerpc/mm/pgtable_64.c b/arch/powerpc/mm/pgtable_64.c
index 28c980eb4422..16636bdf3331 100644
--- a/arch/powerpc/mm/pgtable_64.c
+++ b/arch/powerpc/mm/pgtable_64.c
@@ -57,11 +57,6 @@
 
 #include "mmu_decl.h"
 
-#ifdef CONFIG_PPC_BOOK3S_64
-#if TASK_SIZE_USER64 > (1UL << (ESID_BITS + SID_SHIFT))
-#error TASK_SIZE_USER64 exceeds user VSID range
-#endif
-#endif
 
 #ifdef CONFIG_PPC_BOOK3S_64
 /*
-- 
2.14.3


* Re: [PATCH V2 0/4] Add support for 4PB virtual address space on hash
  2018-02-26 14:08 [PATCH V2 0/4] Add support for 4PB virtual address space on hash Aneesh Kumar K.V
                   ` (3 preceding siblings ...)
  2018-02-26 14:08 ` [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range Aneesh Kumar K.V
@ 2018-02-26 14:15 ` Aneesh Kumar K.V
  4 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-26 14:15 UTC (permalink / raw)
  To: benh, paulus, mpe; +Cc: linuxppc-dev



On 02/26/2018 07:38 PM, Aneesh Kumar K.V wrote:
> This patch series extends the max virtual address space from 512TB
> to 4PB with 64K page size. We do that by allocating one VSID context for
> each 512TB range. More details are explained in patch 3.

[...]

This series depends on the patch:

"[PATCH v5 1/6] powerpc/mm/slice: Remove intermediate bitmap copy"

https://marc.info/?l=linux-kernel&m=151930964805502&w=2

-aneesh


* Re: [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range
  2018-02-26 14:08 ` [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range Aneesh Kumar K.V
@ 2018-02-26 18:11   ` Murilo Opsfelder Araujo
  2018-02-27  4:02     ` Aneesh Kumar K.V
  2018-02-28 23:06   ` kbuild test robot
  1 sibling, 1 reply; 11+ messages in thread
From: Murilo Opsfelder Araujo @ 2018-02-26 18:11 UTC (permalink / raw)
  To: Aneesh Kumar K.V, benh, paulus, mpe; +Cc: linuxppc-dev

On 02/26/2018 11:08 AM, Aneesh Kumar K.V wrote:
> ---
>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
>  arch/powerpc/include/asm/processor.h          | 9 ++++++++-
>  arch/powerpc/mm/init_64.c                     | 6 ------
>  arch/powerpc/mm/pgtable_64.c                  | 5 -----
>  4 files changed, 9 insertions(+), 13 deletions(-)

Hi, Aneesh.

This patch is missing a Signed-off-by: line. You're encouraged to run
checkpatch.pl to inspect your patches.

You may also want to add a brief paragraph in the commit message
explaining the "why" of this change.

Cheers
Murilo


* Re: [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area
  2018-02-26 14:08 ` [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area Aneesh Kumar K.V
@ 2018-02-26 22:24   ` Nicholas Piggin
  2018-02-27  4:01     ` Aneesh Kumar K.V
  0 siblings, 1 reply; 11+ messages in thread
From: Nicholas Piggin @ 2018-02-26 22:24 UTC (permalink / raw)
  To: Aneesh Kumar K.V; +Cc: Benjamin Herrenschmidt, paulus, mpe, linuxppc-dev


I had a series which goes significantly further with stack reduction. What
do you think about just going with that?

I wonder if we should switch to dynamically allocating the slice stuff on
ppc64.

On 27 Feb. 2018 00:28, "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
wrote:

> This patch kills the potential_mask and compat_mask variables and instead
> uses tmp_mask, so that we can reduce the stack usage.

[...]



* Re: [PATCH V2 2/4] powerpc/mm/slice: Reduce the stack usage in slice_get_unmapped_area
  2018-02-26 22:24   ` Nicholas Piggin
@ 2018-02-27  4:01     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-27  4:01 UTC (permalink / raw)
  To: Nicholas Piggin; +Cc: Benjamin Herrenschmidt, paulus, mpe, linuxppc-dev

Nicholas Piggin <nicholas.piggin@gmail.com> writes:

> I had a series which goes significantly further with stack reduction. What
> do you think about just going with that?


I am yet to review that. What I did here is the minimum required to
get the 4PB series compiled.

>
> I wonder if we should switch to dynamically allocating the slice stuff on
> ppc64

-aneesh


* Re: [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range
  2018-02-26 18:11   ` Murilo Opsfelder Araujo
@ 2018-02-27  4:02     ` Aneesh Kumar K.V
  0 siblings, 0 replies; 11+ messages in thread
From: Aneesh Kumar K.V @ 2018-02-27  4:02 UTC (permalink / raw)
  To: Murilo Opsfelder Araujo, benh, paulus, mpe; +Cc: linuxppc-dev

Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com> writes:

> On 02/26/2018 11:08 AM, Aneesh Kumar K.V wrote:
>> ---
>>  arch/powerpc/include/asm/book3s/64/hash-64k.h | 2 +-
>>  arch/powerpc/include/asm/processor.h          | 9 ++++++++-
>>  arch/powerpc/mm/init_64.c                     | 6 ------
>>  arch/powerpc/mm/pgtable_64.c                  | 5 -----
>>  4 files changed, 9 insertions(+), 13 deletions(-)
>
> Hi, Aneesh.
>
> This patch is missing Signed-off-by: line. You're encouraged to run
> checkpatch.pl to inspect your patches.
>
> You may also want to add a brief paragraph in the commit message
> explaining the "why" of this change.
>

Will update.

-aneesh


* Re: [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range
  2018-02-26 14:08 ` [PATCH V2 4/4] powerpc/mm/hash64: Increase the VA range Aneesh Kumar K.V
  2018-02-26 18:11   ` Murilo Opsfelder Araujo
@ 2018-02-28 23:06   ` kbuild test robot
  1 sibling, 0 replies; 11+ messages in thread
From: kbuild test robot @ 2018-02-28 23:06 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: kbuild-all, benh, paulus, mpe, linuxppc-dev, Aneesh Kumar K.V


Hi Aneesh,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v4.16-rc3 next-20180228]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Aneesh-Kumar-K-V/Add-support-for-4PB-virtual-address-space-on-hash/20180301-032452
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 7.2.0-11) 7.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc 

All errors (new ones prefixed by >>):

   arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
>> arch/powerpc/mm/slice.c:616:1: error: the frame size of 2080 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
    }
    ^
   arch/powerpc/mm/slice.c: In function 'is_hugepage_only_range':
   arch/powerpc/mm/slice.c:790:1: error: the frame size of 2080 bytes is larger than 2048 bytes [-Werror=frame-larger-than=]
    }
    ^
   cc1: all warnings being treated as errors

vim +616 arch/powerpc/mm/slice.c

3a8247cc2 Paul Mackerras         2008-06-18  597  
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  598  	if (addr == -ENOMEM)
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  599  		return -ENOMEM;
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  600  
a4d362150 Aneesh Kumar K.V       2017-03-22  601  	slice_range_to_mask(addr, len, &mask);
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  602  	slice_dbg(" found potential area at 0x%lx\n", addr);
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  603  	slice_print_mask(" mask", mask);
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  604  
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  605   convert:
4500a5c6a Aneesh Kumar K.V       2018-02-26  606  	slice_mask_for_size(mm, MMU_PAGE_4K, &tmp_mask, high_limit);
f3207c124 Aneesh Kumar K.V       2017-03-22  607  	slice_andnot_mask(&mask, &good_mask);
4500a5c6a Aneesh Kumar K.V       2018-02-26  608  	slice_andnot_mask(&mask, &tmp_mask);
f3207c124 Aneesh Kumar K.V       2017-03-22  609  	if (mask.low_slices || !bitmap_empty(mask.high_slices, SLICE_NUM_HIGH)) {
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  610  		slice_convert(mm, mask, psize);
3a8247cc2 Paul Mackerras         2008-06-18  611  		if (psize > MMU_PAGE_BASE)
84c3d4aae Benjamin Herrenschmidt 2008-07-16  612  			on_each_cpu(slice_flush_segments, mm, 1);
3a8247cc2 Paul Mackerras         2008-06-18  613  	}
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  614  	return addr;
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  615  
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08 @616  }
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  617  EXPORT_SYMBOL_GPL(slice_get_unmapped_area);
d0f13e3c2 Benjamin Herrenschmidt 2007-05-08  618  

:::::: The code at line 616 was first introduced by commit
:::::: d0f13e3c20b6fb73ccb467bdca97fa7cf5a574cd [POWERPC] Introduce address space "slices"

:::::: TO: Benjamin Herrenschmidt <benh@kernel.crashing.org>
:::::: CC: Paul Mackerras <paulus@samba.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


