* [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support
@ 2017-11-09 17:27 Nicholas Piggin
  2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

After the intended semantics were clarified, it turned out the previous
patch series went the wrong way with MAP_FIXED handling, so I have
fixed that here.

This series is not quite ready to merge. I'd prefer to first see
exactly what x86 does, because it also has some fixes to make. But time
is getting short before 4.14, so I'd like to get some more review and
testing done so we can be ready.

Thanks,
Nick

Nicholas Piggin (5):
  powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case
    allocation
  powerpc/64s/hash: Fix fork() with 512TB process address space
  powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case
    allocation
  powerpc/64s: mm_context.addr_limit is only used on hash

 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
 arch/powerpc/include/asm/book3s/64/mmu.h      |  2 +-
 arch/powerpc/include/asm/paca.h               |  2 +-
 arch/powerpc/kernel/asm-offsets.c             |  2 +-
 arch/powerpc/kernel/paca.c                    |  4 +-
 arch/powerpc/kernel/setup-common.c            |  3 +-
 arch/powerpc/mm/hugetlbpage-radix.c           | 20 +++++----
 arch/powerpc/mm/mmap.c                        | 49 ++++++++++-----------
 arch/powerpc/mm/mmu_context_book3s64.c        |  8 ++--
 arch/powerpc/mm/slb_low.S                     |  2 +-
 arch/powerpc/mm/slice.c                       | 62 +++++++++++++--------------
 11 files changed, 79 insertions(+), 77 deletions(-)

-- 
2.15.0


* [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
@ 2017-11-09 17:27 ` Nicholas Piggin
  2017-11-13  4:59   ` Aneesh Kumar K.V
  2017-11-14 11:12   ` [v2, " Michael Ellerman
  2017-11-09 17:27 ` [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space Nicholas Piggin
                   ` (3 subsequent siblings)
  4 siblings, 2 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

When allocating VA space with a hint that crosses 128TB, the SLB addr_limit
variable is not expanded if addr is not > 128TB, but the slice allocation
looks at task_size, which is 512TB. This results in slice_check_fit()
incorrectly succeeding because the slice_count truncates off bit 128 of the
requested mask, so the comparison to the available mask succeeds.

Fix this by using mm->context.addr_limit instead of mm->task_size for
testing allocation limits. This causes such allocations to fail.
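
As a rough illustration (an editorial sketch, not part of the original
mail; the 1UL << 47 value for the 128TB DEFAULT_MAP_WINDOW is an
assumption based on the figures above), this is the shape of request
affected:

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	unsigned long boundary = 1UL << 47;	/* assumed 128TB window */
	size_t len = 1UL << 20;			/* range straddles the boundary */
	void *hint = (void *)(boundary - len / 2);
	void *p;

	/* Hinted (non-MAP_FIXED) request crossing 128TB: with the bug the
	 * hint could be granted without the SLB addr_limit expanding; with
	 * the fix the hint is not honoured and the kernel falls back to a
	 * normal search below the default window. */
	p = mmap(hint, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		perror("mmap");
	else
		printf("hint %p -> mapped at %p\n", hint, p);
	return 0;
}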

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 50 ++++++++++++++++++++++++-------------------------
 1 file changed, 24 insertions(+), 26 deletions(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 45f6740dd407..3889201b560c 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -96,7 +96,7 @@ static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
 {
 	struct vm_area_struct *vma;
 
-	if ((mm->task_size - len) < addr)
+	if ((mm->context.addr_limit - len) < addr)
 		return 0;
 	vma = find_vma(mm, addr);
 	return (!vma || (addr + len) <= vm_start_gap(vma));
@@ -133,7 +133,7 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret)
 		if (!slice_low_has_vma(mm, i))
 			ret->low_slices |= 1u << i;
 
-	if (mm->task_size <= SLICE_LOW_TOP)
+	if (mm->context.addr_limit <= SLICE_LOW_TOP)
 		return;
 
 	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++)
@@ -412,25 +412,31 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	struct slice_mask compat_mask;
 	int fixed = (flags & MAP_FIXED);
 	int pshift = max_t(int, mmu_psize_defs[psize].shift, PAGE_SHIFT);
+	unsigned long page_size = 1UL << pshift;
 	struct mm_struct *mm = current->mm;
 	unsigned long newaddr;
 	unsigned long high_limit;
 
-	/*
-	 * Check if we need to expland slice area.
-	 */
-	if (unlikely(addr > mm->context.addr_limit &&
-		     mm->context.addr_limit != TASK_SIZE)) {
-		mm->context.addr_limit = TASK_SIZE;
+	high_limit = DEFAULT_MAP_WINDOW;
+	if (addr >= high_limit)
+		high_limit = TASK_SIZE;
+
+	if (len > high_limit)
+		return -ENOMEM;
+	if (len & (page_size - 1))
+		return -EINVAL;
+	if (fixed) {
+		if (addr & (page_size - 1))
+			return -EINVAL;
+		if (addr > high_limit - len)
+			return -ENOMEM;
+	}
+
+	if (high_limit > mm->context.addr_limit) {
+		mm->context.addr_limit = high_limit;
 		on_each_cpu(slice_flush_segments, mm, 1);
 	}
-	/*
-	 * This mmap request can allocate upt to 512TB
-	 */
-	if (addr > DEFAULT_MAP_WINDOW)
-		high_limit = mm->context.addr_limit;
-	else
-		high_limit = DEFAULT_MAP_WINDOW;
+
 	/*
 	 * init different masks
 	 */
@@ -446,27 +452,19 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 
 	/* Sanity checks */
 	BUG_ON(mm->task_size == 0);
+	BUG_ON(mm->context.addr_limit == 0);
 	VM_BUG_ON(radix_enabled());
 
 	slice_dbg("slice_get_unmapped_area(mm=%p, psize=%d...\n", mm, psize);
 	slice_dbg(" addr=%lx, len=%lx, flags=%lx, topdown=%d\n",
 		  addr, len, flags, topdown);
 
-	if (len > mm->task_size)
-		return -ENOMEM;
-	if (len & ((1ul << pshift) - 1))
-		return -EINVAL;
-	if (fixed && (addr & ((1ul << pshift) - 1)))
-		return -EINVAL;
-	if (fixed && addr > (mm->task_size - len))
-		return -ENOMEM;
-
 	/* If hint, make sure it matches our alignment restrictions */
 	if (!fixed && addr) {
-		addr = _ALIGN_UP(addr, 1ul << pshift);
+		addr = _ALIGN_UP(addr, page_size);
 		slice_dbg(" aligned addr=%lx\n", addr);
 		/* Ignore hint if it's too large or overlaps a VMA */
-		if (addr > mm->task_size - len ||
+		if (addr > high_limit - len ||
 		    !slice_area_is_free(mm, addr, len))
 			addr = 0;
 	}
-- 
2.15.0


* [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space
  2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
  2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
@ 2017-11-09 17:27 ` Nicholas Piggin
  2017-11-13  4:59   ` Aneesh Kumar K.V
  2017-11-09 17:27 ` [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary Nicholas Piggin
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

Hash unconditionally resets the addr_limit to default (128TB) when
the mm context is initialised. If a process has > 128TB mappings when
it forks, the child will not get the 512TB addr_limit, so accesses to
valid > 128TB mappings will fail in the child.

Fix this by only resetting the addr_limit to default if it was 0.
Non-zero indicates it was duplicated from the parent (0 means exec()).
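
A minimal sketch of the failure mode (an illustrative addition, not
from the original mail; it assumes a hint at 1UL << 47 is honoured so
that the parent's limit expands to 512TB):

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>
#include <sys/wait.h>

int main(void)
{
	/* Map at 128TB so the parent's addr_limit grows to 512TB. */
	char *p = mmap((void *)(1UL << 47), 65536, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 1;
	strcpy(p, "hello");

	if (fork() == 0) {
		/* With the bug, the child's addr_limit was reset to 128TB,
		 * so touching this valid >128TB mapping faulted. */
		printf("child reads: %s\n", p);
		_exit(0);
	}
	wait(NULL);
	return 0;
}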

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/mmu_context_book3s64.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 6d724dab27c2..846cbad45fce 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -93,11 +93,11 @@ static int hash__init_new_context(struct mm_struct *mm)
 		return index;
 
 	/*
-	 * We do switch_slb() early in fork, even before we setup the
-	 * mm->context.addr_limit. Default to max task size so that we copy the
-	 * default values to paca which will help us to handle slb miss early.
+	 * In the case of exec, use the default limit,
+	 * otherwise inherit it from the mm we are duplicating.
 	 */
-	mm->context.addr_limit = DEFAULT_MAP_WINDOW_USER64;
+	if (!mm->context.addr_limit)
+		mm->context.addr_limit = DEFAULT_MAP_WINDOW_USER64;
 
 	/*
 	 * The old code would re-promote on fork, we don't do that when using
-- 
2.15.0


* [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
  2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
  2017-11-09 17:27 ` [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space Nicholas Piggin
@ 2017-11-09 17:27 ` Nicholas Piggin
  2017-11-13  4:59   ` Aneesh Kumar K.V
  2017-11-09 17:27 ` [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
  2017-11-09 17:27 ` [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash Nicholas Piggin
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

While hinted mappings with a length that crosses 128TB are disallowed,
MAP_FIXED allocations that cross 128TB are allowed. These are failing
on hash (on radix they succeed). Add an additional case for fixed
mappings to expand the addr_limit when crossing 128TB.
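
A sketch of the case this patch enables (an illustrative addition;
assumes a 64K base page and the 128TB window at 1UL << 47):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	unsigned long boundary = 1UL << 47;	/* assumed 128TB window */
	size_t page = 1UL << 16;		/* assumed 64K base page */
	/* Fixed mapping starting just below 128TB and ending above it. */
	void *addr = (void *)(boundary - page);
	void *p;

	p = mmap(addr, 2 * page, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
	printf("MAP_FIXED across 128TB: %s\n",
	       p == MAP_FAILED ? "failed" : "ok");
	return 0;
}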

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/slice.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 3889201b560c..a4f93699194b 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -418,7 +418,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	unsigned long high_limit;
 
 	high_limit = DEFAULT_MAP_WINDOW;
-	if (addr >= high_limit)
+	if (addr >= high_limit || (fixed && (addr + len > high_limit)))
 		high_limit = TASK_SIZE;
 
 	if (len > high_limit)
-- 
2.15.0


* [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
                   ` (2 preceding siblings ...)
  2017-11-09 17:27 ` [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary Nicholas Piggin
@ 2017-11-09 17:27 ` Nicholas Piggin
  2017-11-13  5:01   ` Aneesh Kumar K.V
  2017-11-09 17:27 ` [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash Nicholas Piggin
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

Radix VA space allocations test addresses against mm->task_size which is
512TB, even in cases where the intention is to limit allocation to below
128TB.

This results in mmap with a hint address below 128TB but address + length
above 128TB succeeding when it should fail (as hash does after the
previous patch).

Set the high address limit to be considered up front, and base subsequent
allocation checks on that consistently.
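
The same boundary case from userspace, on radix this time (an
illustrative addition; boundary value assumed as above):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	unsigned long boundary = 1UL << 47;	/* assumed 128TB window */
	size_t len = 1UL << 20;
	void *hint = (void *)(boundary - 65536);	/* below 128TB... */
	void *p;

	/* ...but hint + len ends above 128TB. After this patch, radix
	 * ignores such a hint (as hash now does) instead of placing the
	 * mapping across the boundary. */
	p = mmap(hint, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p != MAP_FAILED)
		printf("mapped at %p (crosses 128TB: %s)\n", p,
		       (unsigned long)p + len > boundary ? "yes" : "no");
	return 0;
}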

Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/mm/hugetlbpage-radix.c | 26 ++++++++++++------
 arch/powerpc/mm/mmap.c              | 55 ++++++++++++++++++++++---------------
 2 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index 558e9d3891bf..bd022d16745c 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -49,17 +49,28 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	struct hstate *h = hstate_file(file);
+	int fixed = (flags & MAP_FIXED);
+	unsigned long high_limit;
 	struct vm_unmapped_area_info info;
 
-	if (unlikely(addr > mm->context.addr_limit && addr < TASK_SIZE))
-		mm->context.addr_limit = TASK_SIZE;
+	high_limit = DEFAULT_MAP_WINDOW;
+	if (addr >= high_limit || (fixed && (addr + len > high_limit)))
+		high_limit = TASK_SIZE;
 
 	if (len & ~huge_page_mask(h))
 		return -EINVAL;
-	if (len > mm->task_size)
+	if (len > high_limit)
 		return -ENOMEM;
+	if (fixed) {
+		if (addr > high_limit - len)
+			return -ENOMEM;
+	}
 
-	if (flags & MAP_FIXED) {
+	if (unlikely(addr > mm->context.addr_limit &&
+		     mm->context.addr_limit != TASK_SIZE))
+		mm->context.addr_limit = TASK_SIZE;
+
+	if (fixed) {
 		if (prepare_hugepage_range(file, addr, len))
 			return -EINVAL;
 		return addr;
@@ -68,7 +79,7 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	if (addr) {
 		addr = ALIGN(addr, huge_page_size(h));
 		vma = find_vma(mm, addr);
-		if (mm->task_size - len >= addr &&
+		if (high_limit - len >= addr &&
 		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
@@ -79,12 +90,9 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = PAGE_SIZE;
-	info.high_limit = current->mm->mmap_base;
+	info.high_limit = mm->mmap_base + (high_limit - DEFAULT_MAP_WINDOW);
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
 	info.align_offset = 0;
 
-	if (addr > DEFAULT_MAP_WINDOW)
-		info.high_limit += mm->context.addr_limit - DEFAULT_MAP_WINDOW;
-
 	return vm_unmapped_area(&info);
 }
diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c
index 5d78b193fec4..6d476a7b5611 100644
--- a/arch/powerpc/mm/mmap.c
+++ b/arch/powerpc/mm/mmap.c
@@ -106,22 +106,32 @@ radix__arch_get_unmapped_area(struct file *filp, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
+	int fixed = (flags & MAP_FIXED);
+	unsigned long high_limit;
 	struct vm_unmapped_area_info info;
 
+	high_limit = DEFAULT_MAP_WINDOW;
+	if (addr >= high_limit || (fixed && (addr + len > high_limit)))
+		high_limit = TASK_SIZE;
+
+	if (len > high_limit)
+		return -ENOMEM;
+	if (fixed) {
+		if (addr > high_limit - len)
+			return -ENOMEM;
+	}
+
 	if (unlikely(addr > mm->context.addr_limit &&
 		     mm->context.addr_limit != TASK_SIZE))
 		mm->context.addr_limit = TASK_SIZE;
 
-	if (len > mm->task_size - mmap_min_addr)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED)
+	if (fixed)
 		return addr;
 
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
-		if (mm->task_size - len >= addr && addr >= mmap_min_addr &&
+		if (high_limit - len >= addr && addr >= mmap_min_addr &&
 		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
@@ -129,13 +139,9 @@ radix__arch_get_unmapped_area(struct file *filp, unsigned long addr,
 	info.flags = 0;
 	info.length = len;
 	info.low_limit = mm->mmap_base;
+	info.high_limit = high_limit;
 	info.align_mask = 0;
 
-	if (unlikely(addr > DEFAULT_MAP_WINDOW))
-		info.high_limit = mm->context.addr_limit;
-	else
-		info.high_limit = DEFAULT_MAP_WINDOW;
-
 	return vm_unmapped_area(&info);
 }
 
@@ -149,37 +155,42 @@ radix__arch_get_unmapped_area_topdown(struct file *filp,
 	struct vm_area_struct *vma;
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
+	int fixed = (flags & MAP_FIXED);
+	unsigned long high_limit;
 	struct vm_unmapped_area_info info;
 
+	high_limit = DEFAULT_MAP_WINDOW;
+	if (addr >= high_limit || (fixed && (addr + len > high_limit)))
+		high_limit = TASK_SIZE;
+
+	if (len > high_limit)
+		return -ENOMEM;
+	if (fixed) {
+		if (addr > high_limit - len)
+			return -ENOMEM;
+	}
+
 	if (unlikely(addr > mm->context.addr_limit &&
 		     mm->context.addr_limit != TASK_SIZE))
 		mm->context.addr_limit = TASK_SIZE;
 
-	/* requested length too big for entire address space */
-	if (len > mm->task_size - mmap_min_addr)
-		return -ENOMEM;
-
-	if (flags & MAP_FIXED)
+	if (fixed)
 		return addr;
 
-	/* requesting a specific address */
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
-		if (mm->task_size - len >= addr && addr >= mmap_min_addr &&
-				(!vma || addr + len <= vm_start_gap(vma)))
+		if (high_limit - len >= addr && addr >= mmap_min_addr &&
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
 	info.length = len;
 	info.low_limit = max(PAGE_SIZE, mmap_min_addr);
-	info.high_limit = mm->mmap_base;
+	info.high_limit = mm->mmap_base + (high_limit - DEFAULT_MAP_WINDOW);
 	info.align_mask = 0;
 
-	if (addr > DEFAULT_MAP_WINDOW)
-		info.high_limit += mm->context.addr_limit - DEFAULT_MAP_WINDOW;
-
 	addr = vm_unmapped_area(&info);
 	if (!(addr & ~PAGE_MASK))
 		return addr;
-- 
2.15.0


* [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash
  2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
                   ` (3 preceding siblings ...)
  2017-11-09 17:27 ` [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
@ 2017-11-09 17:27 ` Nicholas Piggin
  2017-11-13  5:01   ` Aneesh Kumar K.V
  4 siblings, 1 reply; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-09 17:27 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Aneesh Kumar K . V,
	Florian Weimer, Kirill A. Shutemov

Radix keeps no meaningful state in addr_limit, so remove it from
radix code and rename to slb_addr_limit to make it clear it applies
to hash only.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/book3s/64/mmu-hash.h |  2 +-
 arch/powerpc/include/asm/book3s/64/mmu.h      |  2 +-
 arch/powerpc/include/asm/paca.h               |  2 +-
 arch/powerpc/kernel/asm-offsets.c             |  2 +-
 arch/powerpc/kernel/paca.c                    |  4 ++--
 arch/powerpc/kernel/setup-common.c            |  3 ++-
 arch/powerpc/mm/hugetlbpage-radix.c           |  8 +-------
 arch/powerpc/mm/mmap.c                        | 18 ++++--------------
 arch/powerpc/mm/mmu_context_book3s64.c        |  4 ++--
 arch/powerpc/mm/slb_low.S                     |  2 +-
 arch/powerpc/mm/slice.c                       | 22 +++++++++++-----------
 11 files changed, 27 insertions(+), 42 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu-hash.h b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
index 508275bb05d5..e91e115a816f 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu-hash.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu-hash.h
@@ -606,7 +606,7 @@ extern void slb_set_size(u16 size);
 
 /* 4 bits per slice and we have one slice per 1TB */
 #define SLICE_ARRAY_SIZE	(H_PGTABLE_RANGE >> 41)
-#define TASK_SLICE_ARRAY_SZ(x)	((x)->context.addr_limit >> 41)
+#define TASK_SLICE_ARRAY_SZ(x)	((x)->context.slb_addr_limit >> 41)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 37fdede5a24c..c9448e19847a 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -93,7 +93,7 @@ typedef struct {
 #ifdef CONFIG_PPC_MM_SLICES
 	u64 low_slices_psize;	/* SLB page size encodings */
 	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
-	unsigned long addr_limit;
+	unsigned long slb_addr_limit;
 #else
 	u16 sllp;		/* SLB page size encoding */
 #endif
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index c907ae23c956..3892db93b837 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -143,7 +143,7 @@ struct paca_struct {
 #ifdef CONFIG_PPC_MM_SLICES
 	u64 mm_ctx_low_slices_psize;
 	unsigned char mm_ctx_high_slices_psize[SLICE_ARRAY_SIZE];
-	unsigned long addr_limit;
+	unsigned long mm_ctx_slb_addr_limit;
 #else
 	u16 mm_ctx_user_psize;
 	u16 mm_ctx_sllp;
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 200623e71474..9aace433491a 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -185,7 +185,7 @@ int main(void)
 #ifdef CONFIG_PPC_MM_SLICES
 	OFFSET(PACALOWSLICESPSIZE, paca_struct, mm_ctx_low_slices_psize);
 	OFFSET(PACAHIGHSLICEPSIZE, paca_struct, mm_ctx_high_slices_psize);
-	DEFINE(PACA_ADDR_LIMIT, offsetof(struct paca_struct, addr_limit));
+	OFFSET(PACA_SLB_ADDR_LIMIT, paca_struct, mm_ctx_slb_addr_limit);
 	DEFINE(MMUPSIZEDEFSIZE, sizeof(struct mmu_psize_def));
 #endif /* CONFIG_PPC_MM_SLICES */
 #endif
diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 5d38d5ea9a24..d6597038931d 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -262,8 +262,8 @@ void copy_mm_to_paca(struct mm_struct *mm)
 
 	get_paca()->mm_ctx_id = context->id;
 #ifdef CONFIG_PPC_MM_SLICES
-	VM_BUG_ON(!mm->context.addr_limit);
-	get_paca()->addr_limit = mm->context.addr_limit;
+	VM_BUG_ON(!mm->context.slb_addr_limit);
+	get_paca()->mm_ctx_slb_addr_limit = mm->context.slb_addr_limit;
 	get_paca()->mm_ctx_low_slices_psize = context->low_slices_psize;
 	memcpy(&get_paca()->mm_ctx_high_slices_psize,
 	       &context->high_slices_psize, TASK_SLICE_ARRAY_SZ(mm));
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index fa661ed616f5..2075322cd225 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -898,7 +898,8 @@ void __init setup_arch(char **cmdline_p)
 
 #ifdef CONFIG_PPC_MM_SLICES
 #ifdef CONFIG_PPC64
-	init_mm.context.addr_limit = DEFAULT_MAP_WINDOW_USER64;
+	if (!radix_enabled())
+		init_mm.context.slb_addr_limit = DEFAULT_MAP_WINDOW_USER64;
 #else
 #error	"context.addr_limit not initialized."
 #endif
diff --git a/arch/powerpc/mm/hugetlbpage-radix.c b/arch/powerpc/mm/hugetlbpage-radix.c
index bd022d16745c..2486bee0f93e 100644
--- a/arch/powerpc/mm/hugetlbpage-radix.c
+++ b/arch/powerpc/mm/hugetlbpage-radix.c
@@ -61,16 +61,10 @@ radix__hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 		return -EINVAL;
 	if (len > high_limit)
 		return -ENOMEM;
+
 	if (fixed) {
 		if (addr > high_limit - len)
 			return -ENOMEM;
-	}
-
-	if (unlikely(addr > mm->context.addr_limit &&
-		     mm->context.addr_limit != TASK_SIZE))
-		mm->context.addr_limit = TASK_SIZE;
-
-	if (fixed) {
 		if (prepare_hugepage_range(file, addr, len))
 			return -EINVAL;
 		return addr;
diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c
index 6d476a7b5611..d503f344e476 100644
--- a/arch/powerpc/mm/mmap.c
+++ b/arch/powerpc/mm/mmap.c
@@ -116,17 +116,12 @@ radix__arch_get_unmapped_area(struct file *filp, unsigned long addr,
 
 	if (len > high_limit)
 		return -ENOMEM;
+
 	if (fixed) {
 		if (addr > high_limit - len)
 			return -ENOMEM;
-	}
-
-	if (unlikely(addr > mm->context.addr_limit &&
-		     mm->context.addr_limit != TASK_SIZE))
-		mm->context.addr_limit = TASK_SIZE;
-
-	if (fixed)
 		return addr;
+	}
 
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
@@ -165,17 +160,12 @@ radix__arch_get_unmapped_area_topdown(struct file *filp,
 
 	if (len > high_limit)
 		return -ENOMEM;
+
 	if (fixed) {
 		if (addr > high_limit - len)
 			return -ENOMEM;
-	}
-
-	if (unlikely(addr > mm->context.addr_limit &&
-		     mm->context.addr_limit != TASK_SIZE))
-		mm->context.addr_limit = TASK_SIZE;
-
-	if (fixed)
 		return addr;
+	}
 
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
diff --git a/arch/powerpc/mm/mmu_context_book3s64.c b/arch/powerpc/mm/mmu_context_book3s64.c
index 846cbad45fce..5e193e444ee8 100644
--- a/arch/powerpc/mm/mmu_context_book3s64.c
+++ b/arch/powerpc/mm/mmu_context_book3s64.c
@@ -96,8 +96,8 @@ static int hash__init_new_context(struct mm_struct *mm)
 	 * In the case of exec, use the default limit,
 	 * otherwise inherit it from the mm we are duplicating.
 	 */
-	if (!mm->context.addr_limit)
-		mm->context.addr_limit = DEFAULT_MAP_WINDOW_USER64;
+	if (!mm->context.slb_addr_limit)
+		mm->context.slb_addr_limit = DEFAULT_MAP_WINDOW_USER64;
 
 	/*
 	 * The old code would re-promote on fork, we don't do that when using
diff --git a/arch/powerpc/mm/slb_low.S b/arch/powerpc/mm/slb_low.S
index 906a86fe457b..7046bb389704 100644
--- a/arch/powerpc/mm/slb_low.S
+++ b/arch/powerpc/mm/slb_low.S
@@ -167,7 +167,7 @@ END_MMU_FTR_SECTION_IFCLR(MMU_FTR_1T_SEGMENT)
         /*
          * user space make sure we are within the allowed limit
 	 */
-	ld	r11,PACA_ADDR_LIMIT(r13)
+	ld	r11,PACA_SLB_ADDR_LIMIT(r13)
 	cmpld	r3,r11
 	bge-	8f
 
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index a4f93699194b..564fff06f5c1 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -96,7 +96,7 @@ static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
 {
 	struct vm_area_struct *vma;
 
-	if ((mm->context.addr_limit - len) < addr)
+	if ((mm->context.slb_addr_limit - len) < addr)
 		return 0;
 	vma = find_vma(mm, addr);
 	return (!vma || (addr + len) <= vm_start_gap(vma));
@@ -133,10 +133,10 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret)
 		if (!slice_low_has_vma(mm, i))
 			ret->low_slices |= 1u << i;
 
-	if (mm->context.addr_limit <= SLICE_LOW_TOP)
+	if (mm->context.slb_addr_limit <= SLICE_LOW_TOP)
 		return;
 
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++)
+	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++)
 		if (!slice_high_has_vma(mm, i))
 			__set_bit(i, ret->high_slices);
 }
@@ -157,7 +157,7 @@ static void slice_mask_for_size(struct mm_struct *mm, int psize, struct slice_ma
 			ret->low_slices |= 1u << i;
 
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++) {
+	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
 		if (((hpsizes[index] >> (mask_index * 4)) & 0xf) == psize)
@@ -169,7 +169,7 @@ static int slice_check_fit(struct mm_struct *mm,
 			   struct slice_mask mask, struct slice_mask available)
 {
 	DECLARE_BITMAP(result, SLICE_NUM_HIGH);
-	unsigned long slice_count = GET_HIGH_SLICE_INDEX(mm->context.addr_limit);
+	unsigned long slice_count = GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit);
 
 	bitmap_and(result, mask.high_slices,
 		   available.high_slices, slice_count);
@@ -219,7 +219,7 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 	mm->context.low_slices_psize = lpsizes;
 
 	hpsizes = mm->context.high_slices_psize;
-	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++) {
+	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.slb_addr_limit); i++) {
 		mask_index = i & 0x1;
 		index = i >> 1;
 		if (test_bit(i, mask.high_slices))
@@ -329,8 +329,8 @@ static unsigned long slice_find_area_topdown(struct mm_struct *mm,
 	 * Only for that request for which high_limit is above
 	 * DEFAULT_MAP_WINDOW we should apply this.
 	 */
-	if (high_limit  > DEFAULT_MAP_WINDOW)
-		addr += mm->context.addr_limit - DEFAULT_MAP_WINDOW;
+	if (high_limit > DEFAULT_MAP_WINDOW)
+		addr += mm->context.slb_addr_limit - DEFAULT_MAP_WINDOW;
 
 	while (addr > PAGE_SIZE) {
 		info.high_limit = addr;
@@ -432,8 +432,8 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 			return -ENOMEM;
 	}
 
-	if (high_limit > mm->context.addr_limit) {
-		mm->context.addr_limit = high_limit;
+	if (high_limit > mm->context.slb_addr_limit) {
+		mm->context.slb_addr_limit = high_limit;
 		on_each_cpu(slice_flush_segments, mm, 1);
 	}
 
@@ -452,7 +452,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 
 	/* Sanity checks */
 	BUG_ON(mm->task_size == 0);
-	BUG_ON(mm->context.addr_limit == 0);
+	BUG_ON(mm->context.slb_addr_limit == 0);
 	VM_BUG_ON(radix_enabled());
 
 	slice_dbg("slice_get_unmapped_area(mm=%p, psize=%d...\n", mm, psize);
-- 
2.15.0


* Re: [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
@ 2017-11-13  4:59   ` Aneesh Kumar K.V
  2017-11-13  7:36     ` Nicholas Piggin
  2017-11-14 11:12   ` [v2, " Michael Ellerman
  1 sibling, 1 reply; 13+ messages in thread
From: Aneesh Kumar K.V @ 2017-11-13  4:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

Nicholas Piggin <npiggin@gmail.com> writes:

> When allocating VA space with a hint that crosses 128TB, the SLB addr_limit
> variable is not expanded if addr is not > 128TB, but the slice allocation
> looks at task_size, which is 512TB. This results in slice_check_fit()
> incorrectly succeeding because the slice_count truncates off bit 128 of the
> requested mask, so the comparison to the available mask succeeds.
>
> Fix this by using mm->context.addr_limit instead of mm->task_size for
> testing allocation limits. This causes such allocations to fail.
>

Also note that this changes the rule from > 128TB to >= 128TB for
selecting the larger address space. I guess that is correct, because
without '>=' we won't be able to allocate anything starting from 128TB
(except with MAP_FIXED).
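
To illustrate that point (a sketch added here, not from the mail;
boundary value assumed):

#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	/* Hint exactly at 128TB: with the old '>' test the request stayed
	 * confined to the 128TB space, so this hint could never be
	 * satisfied; with '>=' it selects the 512TB space. */
	void *hint = (void *)(1UL << 47);
	void *p = mmap(hint, 65536, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	printf("hint %p -> %p\n", hint, p);
	return 0;
}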

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

* Re: [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space
  2017-11-09 17:27 ` [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space Nicholas Piggin
@ 2017-11-13  4:59   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 13+ messages in thread
From: Aneesh Kumar K.V @ 2017-11-13  4:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

Nicholas Piggin <npiggin@gmail.com> writes:

> Hash unconditionally resets the addr_limit to default (128TB) when
> the mm context is initialised. If a process has > 128TB mappings when
> it forks, the child will not get the 512TB addr_limit, so accesses to
> valid > 128TB mappings will fail in the child.
>
> Fix this by only resetting the addr_limit to default if it was 0.
> Non-zero indicates it was duplicated from the parent (0 means exec()).
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

* Re: [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary
  2017-11-09 17:27 ` [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary Nicholas Piggin
@ 2017-11-13  4:59   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 13+ messages in thread
From: Aneesh Kumar K.V @ 2017-11-13  4:59 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

Nicholas Piggin <npiggin@gmail.com> writes:

> While hinted mappings with a length that crosses 128TB are disallowed,
> MAP_FIXED allocations that cross 128TB are allowed. These are failing
> on hash (on radix they succeed). Add an additional case for fixed
> mappings to expand the addr_limit when crossing 128TB.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

* Re: [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-09 17:27 ` [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
@ 2017-11-13  5:01   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 13+ messages in thread
From: Aneesh Kumar K.V @ 2017-11-13  5:01 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

Nicholas Piggin <npiggin@gmail.com> writes:

> Radix VA space allocations test addresses against mm->task_size which is
> 512TB, even in cases where the intention is to limit allocation to below
> 128TB.
>
> This results in mmap with a hint address below 128TB but address + length
> above 128TB succeeding when it should fail (as hash does after the
> previous patch).
>
> Set the high address limit to be considered up front, and base subsequent
> allocation checks on that consistently.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

* Re: [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash
  2017-11-09 17:27 ` [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash Nicholas Piggin
@ 2017-11-13  5:01   ` Aneesh Kumar K.V
  0 siblings, 0 replies; 13+ messages in thread
From: Aneesh Kumar K.V @ 2017-11-13  5:01 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Nicholas Piggin, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

Nicholas Piggin <npiggin@gmail.com> writes:

> Radix keeps no meaningful state in addr_limit, so remove it from
> radix code and rename to slb_addr_limit to make it clear it applies
> to hash only.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

* Re: [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-13  4:59   ` Aneesh Kumar K.V
@ 2017-11-13  7:36     ` Nicholas Piggin
  0 siblings, 0 replies; 13+ messages in thread
From: Nicholas Piggin @ 2017-11-13  7:36 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: linuxppc-dev, Michael Ellerman, Florian Weimer, Kirill A. Shutemov

On Mon, 13 Nov 2017 10:29:19 +0530
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> wrote:

> Nicholas Piggin <npiggin@gmail.com> writes:
> 
> > When allocating VA space with a hint that crosses 128TB, the SLB addr_limit
> > variable is not expanded if addr is not > 128TB, but the slice allocation
> > looks at task_size, which is 512TB. This results in slice_check_fit()
> > incorrectly succeeding because the slice_count truncates off bit 128 of the
> > requested mask, so the comparison to the available mask succeeds.
> >
> > Fix this by using mm->context.addr_limit instead of mm->task_size for
> > testing allocation limits. This causes such allocations to fail.
> >  
> 
> Also note that this changes the rule from > 128TB to >= 128TB for
> selecting the larger address space. I guess that is correct, because
> without '>=' we won't be able to allocate anything starting from 128TB
> (except with MAP_FIXED).

Oh yes, thanks. That should at least be in the changelog. Probably
split into its own patch really.

Thanks,
Nick


* Re: [v2, 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation
  2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
  2017-11-13  4:59   ` Aneesh Kumar K.V
@ 2017-11-14 11:12   ` Michael Ellerman
  1 sibling, 0 replies; 13+ messages in thread
From: Michael Ellerman @ 2017-11-14 11:12 UTC (permalink / raw)
  To: Nicholas Piggin, linuxppc-dev
  Cc: Florian Weimer, Kirill A. Shutemov, Aneesh Kumar K . V, Nicholas Piggin

On Thu, 2017-11-09 at 17:27:36 UTC, Nicholas Piggin wrote:
> When allocating VA space with a hint that crosses 128TB, the SLB addr_limit
> variable is not expanded if addr is not > 128TB, but the slice allocation
> looks at task_size, which is 512TB. This results in slice_check_fit()
> incorrectly succeeding because the slice_count truncates off bit 128 of the
> requested mask, so the comparison to the available mask succeeds.
> 
> Fix this by using mm->context.addr_limit instead of mm->task_size for
> testing allocation limits. This causes such allocations to fail.
> 
> Cc: "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>
> Fixes: f4ea6dcb08 ("powerpc/mm: Enable mappings above 128TB")
> Reported-by: Florian Weimer <fweimer@redhat.com>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>

Series applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6a72dc038b615229a1b285829d6c83

cheers


Thread overview: 13+ messages
2017-11-09 17:27 [PATCH v2 0/5] powerpc VA allocator fixes for 512TB support Nicholas Piggin
2017-11-09 17:27 ` [PATCH v2 1/5] powerpc/64s/hash: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
2017-11-13  4:59   ` Aneesh Kumar K.V
2017-11-13  7:36     ` Nicholas Piggin
2017-11-14 11:12   ` [v2, " Michael Ellerman
2017-11-09 17:27 ` [PATCH v2 2/5] powerpc/64s/hash: Fix fork() with 512TB process address space Nicholas Piggin
2017-11-13  4:59   ` Aneesh Kumar K.V
2017-11-09 17:27 ` [PATCH v2 3/5] powerpc/64s/hash: Allow MAP_FIXED allocations to cross 128TB boundary Nicholas Piggin
2017-11-13  4:59   ` Aneesh Kumar K.V
2017-11-09 17:27 ` [PATCH v2 4/5] powerpc/64s/radix: Fix 128TB-512TB virtual address boundary case allocation Nicholas Piggin
2017-11-13  5:01   ` Aneesh Kumar K.V
2017-11-09 17:27 ` [PATCH v2 5/5] powerpc/64s: mm_context.addr_limit is only used on hash Nicholas Piggin
2017-11-13  5:01   ` Aneesh Kumar K.V
