* [PATCH 3.16 0/7] 3.16.45-rc1 review
@ 2017-06-27 11:10 Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 6/7] rxrpc: Fix several cases where a padded len isn't checked in ticket decode Ben Hutchings
                   ` (7 more replies)
  0 siblings, 8 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: torvalds, Guenter Roeck, akpm

This is the start of the stable review cycle for the 3.16.45 release.
There are 7 patches in this series, which will be posted as responses
to this one.  If anyone has any issues with these being applied, please
let me know.

Responses should be made by Thu Jun 29 11:10:33 UTC 2017.
Anything received after that time might be too late.

A combined patch relative to 3.16.44 will be posted as an additional
response to this.  A shortlog and diffstat can be found below.

Ben.

-------------

David Howells (1):
      rxrpc: Fix several cases where a padded len isn't checked in ticket decode
         [5f2f97656ada8d811d3c1bef503ced266fcd53a0]

Helge Deller (1):
      Allow stack to grow up to address space limit
         [bd726c90b6b8ce87602208701b208a208e6d5600]

Hugh Dickins (2):
      mm: fix new crash in unmapped_area_topdown()
         [f4cb767d76cf7ee72f97dd76f6cfa6c76a5edc89]
      mm: larger stack guard gap, between vmas
         [1be7107fbe18eed3e319a6c3e83c78254b693acb]

Paolo Bonzini (1):
      KVM: x86: fix singlestepping over syscall
         [c8401dda2f0a00cd25c0af6a95ed50e478d25de4]

Seung-Woo Kim (1):
      regulator: core: Fix regualtor_ena_gpio_free not to access pin after freeing
         [60a2362f769cf549dc466134efe71c8bf9fbaaba]

Vladis Dronov (1):
      drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
         [ee9c4e681ec4f58e42a83cb0c22a0289ade1aacf]

 Documentation/kernel-parameters.txt     |   7 ++
 Makefile                                |   4 +-
 arch/arc/mm/mmap.c                      |   2 +-
 arch/arm/mm/mmap.c                      |   4 +-
 arch/frv/mm/elf-fdpic.c                 |   2 +-
 arch/mips/mm/mmap.c                     |   2 +-
 arch/parisc/kernel/sys_parisc.c         |  15 +--
 arch/powerpc/mm/slice.c                 |   2 +-
 arch/sh/mm/mmap.c                       |   4 +-
 arch/sparc/kernel/sys_sparc_64.c        |   4 +-
 arch/sparc/mm/hugetlbpage.c             |   2 +-
 arch/tile/mm/hugetlbpage.c              |   2 +-
 arch/x86/include/asm/kvm_emulate.h      |   1 +
 arch/x86/kernel/sys_x86_64.c            |   4 +-
 arch/x86/kvm/emulate.c                  |   1 +
 arch/x86/kvm/x86.c                      |  53 +++++------
 arch/x86/mm/hugetlbpage.c               |   2 +-
 arch/xtensa/kernel/syscall.c            |   2 +-
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c |   3 +
 drivers/regulator/core.c                |   2 +
 fs/hugetlbfs/inode.c                    |   2 +-
 fs/proc/task_mmu.c                      |   4 -
 include/linux/mm.h                      |  53 +++++------
 mm/gup.c                                |   5 -
 mm/memory.c                             |  38 --------
 mm/mmap.c                               | 160 +++++++++++++++++++-------------
 net/rxrpc/ar-key.c                      |  64 +++++++------
 27 files changed, 220 insertions(+), 224 deletions(-)

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

* [PATCH 3.16 3/7] Allow stack to grow up to address space limit
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 6/7] rxrpc: Fix several cases where a padded len isn't checked in ticket decode Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 7/7] KVM: x86: fix singlestepping over syscall Ben Hutchings
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Helge Deller, Hugh Dickins, Linus Torvalds

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Helge Deller <deller@gmx.de>

commit bd726c90b6b8ce87602208701b208a208e6d5600 upstream.

Fix expand_upwards() on architectures with an upward-growing stack (parisc,
metag and partly IA-64) to allow the stack to reliably grow exactly up to
the address space limit given by TASK_SIZE.
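
To illustrate the boundary arithmetic, here is a minimal userspace model
of the old and new checks; the constants are illustrative assumptions,
not parisc's real layout:

	#include <stdio.h>
	#include <stdint.h>

	#define PAGE_SIZE  4096u
	#define PAGE_MASK  (~(PAGE_SIZE - 1))
	#define TASK_SIZE  0xfffff000u          /* illustrative limit near the top */
	#define GUARD_GAP  (256u * PAGE_SIZE)   /* default 1MB guard gap */

	/* old logic: growing into the last pages made gap_addr wrap,
	 * so the expansion failed with -ENOMEM */
	static int old_ok(uint32_t address)
	{
		address = (address & PAGE_MASK) + PAGE_SIZE;
		if (!address)
			return 0;
		if (address + GUARD_GAP < address)
			return 0;
		return 1;
	}

	/* new logic: check TASK_SIZE up front and clamp the gap to it,
	 * since nothing can be mapped beyond the limit anyway */
	static int new_ok(uint32_t address)
	{
		uint32_t gap_addr;

		address &= PAGE_MASK;
		if (address >= TASK_SIZE)
			return 0;
		address += PAGE_SIZE;
		gap_addr = address + GUARD_GAP;
		if (gap_addr < address || gap_addr > TASK_SIZE)
			gap_addr = TASK_SIZE;   /* clamp instead of failing */
		(void)gap_addr;  /* the kernel then checks vm_next against this */
		return 1;
	}

	int main(void)
	{
		uint32_t last = TASK_SIZE - PAGE_SIZE;  /* last usable page */

		printf("old: %d  new: %d\n", old_ok(last), new_ok(last));
		/* prints "old: 0  new: 1" */
		return 0;
	}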

Signed-off-by: Helge Deller <deller@gmx.de>
Acked-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 mm/mmap.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2139,16 +2139,19 @@ int expand_upwards(struct vm_area_struct
 	if (!(vma->vm_flags & VM_GROWSUP))
 		return -EFAULT;
 
-	/* Guard against wrapping around to address 0. */
+	/* Guard against exceeding limits of the address space. */
 	address &= PAGE_MASK;
-	address += PAGE_SIZE;
-	if (!address)
+	if (address >= TASK_SIZE)
 		return -ENOMEM;
+	address += PAGE_SIZE;
 
 	/* Enforce stack_guard_gap */
 	gap_addr = address + stack_guard_gap;
-	if (gap_addr < address)
-		return -ENOMEM;
+
+	/* Guard against overflow */
+	if (gap_addr < address || gap_addr > TASK_SIZE)
+		gap_addr = TASK_SIZE;
+
 	next = vma->vm_next;
 	if (next && next->vm_start < gap_addr) {
 		if (!(next->vm_flags & VM_GROWSUP))

* [PATCH 3.16 4/7] regulator: core: Fix regualtor_ena_gpio_free not to access pin after freeing
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
                   ` (2 preceding siblings ...)
  2017-06-27 11:10 ` [PATCH 3.16 7/7] KVM: x86: fix singlestepping over syscall Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 2/7] mm: fix new crash in unmapped_area_topdown() Ben Hutchings
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Seung-Woo Kim, Mark Brown

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Seung-Woo Kim <sw0312.kim@samsung.com>

commit 60a2362f769cf549dc466134efe71c8bf9fbaaba upstream.

After freeing a pin in regulator_ena_gpio_free(), the loop could still
access it.  This patch stops the loop from touching the pin once it has
been freed.
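
The underlying pattern, as a hedged standalone sketch rather than the
driver code itself: once a list node has been freed, neither the node
nor its next pointer may be touched, so the walk has to stop there.

	#include <stdlib.h>

	struct pin { struct pin *next; int gpio; };

	static void free_pin(struct pin **head, int gpio)
	{
		struct pin **pp;

		for (pp = head; *pp; pp = &(*pp)->next) {
			if ((*pp)->gpio == gpio) {
				struct pin *dead = *pp;

				*pp = dead->next;	/* unlink first */
				free(dead);
				return;	/* don't iterate past freed memory */
			}
		}
	}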

Signed-off-by: Seung-Woo Kim <sw0312.kim@samsung.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/regulator/core.c | 2 ++
 1 file changed, 2 insertions(+)

--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -1709,6 +1709,8 @@ static void regulator_ena_gpio_free(stru
 				gpio_free(pin->gpio);
 				list_del(&pin->list);
 				kfree(pin);
+				rdev->ena_pin = NULL;
+				return;
 			} else {
 				pin->request_count--;
 			}

* [PATCH 3.16 2/7] mm: fix new crash in unmapped_area_topdown()
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
                   ` (3 preceding siblings ...)
  2017-06-27 11:10 ` [PATCH 3.16 4/7] regulator: core: Fix regualtor_ena_gpio_free not to access pin after freeing Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 5/7] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() Ben Hutchings
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Michal Hocko, Dave Jones, Linus Torvalds, Hugh Dickins

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hughd@google.com>

commit f4cb767d76cf7ee72f97dd76f6cfa6c76a5edc89 upstream.

Trinity gets kernel BUG at mm/mmap.c:1963! in about 3 minutes of
mmap testing.  That's the VM_BUG_ON(gap_end < gap_start) at the
end of unmapped_area_topdown().  Linus points out how MAP_FIXED
(which does not have to respect our stack guard gap intentions)
could result in gap_end below gap_start there.  Fix that, and
the similar case in its alternative, unmapped_area().
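
A hedged worked example of the underflow (addresses illustrative): when
MAP_FIXED has placed a vma inside a neighbour's guard gap, gap_end can
fall below gap_start, and the unsigned subtraction makes the "gap" look
enormous.

	#include <stdio.h>

	int main(void)
	{
		unsigned long gap_start = 0x7f0000100000UL; /* vm_end_gap(prev) */
		unsigned long gap_end   = 0x7f0000000000UL; /* vm_start_gap(vma) */
		unsigned long length    = 0x10000UL;

		/* old test: underflows and wrongly reports a fit */
		printf("old: %d\n", gap_end - gap_start >= length);
		/* fixed test: the extra comparison rejects it */
		printf("new: %d\n", gap_end > gap_start &&
				    gap_end - gap_start >= length);
		return 0;
	}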

Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Reported-by: Dave Jones <davej@codemonkey.org.uk>
Debugged-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 mm/mmap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1732,7 +1732,8 @@ check_current:
 		/* Check if current node has a suitable gap */
 		if (gap_start > high_limit)
 			return -ENOMEM;
-		if (gap_end >= low_limit && gap_end - gap_start >= length)
+		if (gap_end >= low_limit &&
+		    gap_end > gap_start && gap_end - gap_start >= length)
 			goto found;
 
 		/* Visit right subtree if it looks promising */
@@ -1835,7 +1836,8 @@ check_current:
 		gap_end = vm_start_gap(vma);
 		if (gap_end < low_limit)
 			return -ENOMEM;
-		if (gap_start <= high_limit && gap_end - gap_start >= length)
+		if (gap_start <= high_limit &&
+		    gap_end > gap_start && gap_end - gap_start >= length)
 			goto found;
 
 		/* Visit left subtree if it looks promising */

* [PATCH 3.16 7/7] KVM: x86: fix singlestepping over syscall
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 6/7] rxrpc: Fix several cases where a padded len isn't checked in ticket decode Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 3/7] Allow stack to grow up to address space limit Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 4/7] regulator: core: Fix regualtor_ena_gpio_free not to access pin after freeing Ben Hutchings
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Paolo Bonzini, Radim Krčmář, Andy Lutomirski

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Paolo Bonzini <pbonzini@redhat.com>

commit c8401dda2f0a00cd25c0af6a95ed50e478d25de4 upstream.

TF is handled a bit differently for syscall and sysret, compared
to the other instructions: TF is checked after the instruction completes,
so that the OS can disable #DB at a syscall by adding TF to FMASK.
When the sysret is executed the #DB is taken "as if" the syscall insn
just completed.

KVM emulates syscall so that it can trap 32-bit syscalls on Intel
processors.  Fix the behaviour, since otherwise a #DB could be taken on
the user stack, which is not nice.  This does not affect Linux guests,
as they use an IST or task gate for #DB.

This fixes CVE-2017-7518.
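
A hedged userspace model of the timing difference (flag values
illustrative): for ordinary instructions the trap is decided by TF
before execution, while for syscall FMASK is applied first and TF is
sampled afterwards, which is what ctxt->tf captures below.

	#include <stdio.h>

	#define X86_EFLAGS_TF 0x100UL

	int main(void)
	{
		unsigned long eflags = X86_EFLAGS_TF;	/* guest sets TF */
		unsigned long fmask  = X86_EFLAGS_TF;	/* OS masks TF via FMASK */
		int tf_before, tf_after;

		/* ordinary instruction: TF sampled before execution */
		tf_before = (eflags & X86_EFLAGS_TF) != 0;

		/* syscall: FMASK applied first, then TF sampled */
		eflags &= ~fmask;
		tf_after = (eflags & X86_EFLAGS_TF) != 0;

		printf("before=%d after=%d\n", tf_before, tf_after); /* 1 0 */
		return 0;
	}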

Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
[bwh: Backported to 3.16:
 - kvm_vcpu_check_singlestep() did not take an rflags parameter but
   called get_rflags() itself; delete that code
 - kvm_vcpu_check_singlestep() sets some flags differently
 - Drop changes to kvm_skip_emulated_instruction()]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/arch/x86/include/asm/kvm_emulate.h
+++ b/arch/x86/include/asm/kvm_emulate.h
@@ -274,6 +274,7 @@ struct x86_emulate_ctxt {
 	bool guest_mode; /* guest running a nested guest */
 	bool perm_ok; /* do not check permissions if true */
 	bool ud;	/* inject an #UD if host doesn't support insn */
+	bool tf;	/* TF value before instruction (after for syscall/sysret) */
 
 	bool have_exception;
 	struct x86_exception exception;
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -2310,6 +2310,7 @@ static int em_syscall(struct x86_emulate
 		ctxt->eflags &= ~(EFLG_VM | EFLG_IF | EFLG_RF);
 	}
 
+	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
 	return X86EMUL_CONTINUE;
 }
 
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4942,6 +4942,8 @@ static void init_emulate_ctxt(struct kvm
 	kvm_x86_ops->get_cs_db_l_bits(vcpu, &cs_db, &cs_l);
 
 	ctxt->eflags = kvm_get_rflags(vcpu);
+	ctxt->tf = (ctxt->eflags & X86_EFLAGS_TF) != 0;
+
 	ctxt->eip = kvm_rip_read(vcpu);
 	ctxt->mode = (!is_protmode(vcpu))		? X86EMUL_MODE_REAL :
 		     (ctxt->eflags & X86_EFLAGS_VM)	? X86EMUL_MODE_VM86 :
@@ -5132,38 +5134,26 @@ static int kvm_vcpu_check_hw_bp(unsigned
 	return dr6;
 }
 
-static void kvm_vcpu_check_singlestep(struct kvm_vcpu *vcpu, int *r)
+static void kvm_vcpu_do_singlestep(struct kvm_vcpu *vcpu, int *r)
 {
 	struct kvm_run *kvm_run = vcpu->run;
 
-	/*
-	 * Use the "raw" value to see if TF was passed to the processor.
-	 * Note that the new value of the flags has not been saved yet.
-	 *
-	 * This is correct even for TF set by the guest, because "the
-	 * processor will not generate this exception after the instruction
-	 * that sets the TF flag".
-	 */
-	unsigned long rflags = kvm_x86_ops->get_rflags(vcpu);
-
-	if (unlikely(rflags & X86_EFLAGS_TF)) {
-		if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
-			kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1;
-			kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
-			kvm_run->debug.arch.exception = DB_VECTOR;
-			kvm_run->exit_reason = KVM_EXIT_DEBUG;
-			*r = EMULATE_USER_EXIT;
-		} else {
-			vcpu->arch.emulate_ctxt.eflags &= ~X86_EFLAGS_TF;
-			/*
-			 * "Certain debug exceptions may clear bit 0-3.  The
-			 * remaining contents of the DR6 register are never
-			 * cleared by the processor".
-			 */
-			vcpu->arch.dr6 &= ~15;
-			vcpu->arch.dr6 |= DR6_BS;
-			kvm_queue_exception(vcpu, DB_VECTOR);
-		}
+	if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
+		kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1;
+		kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
+		kvm_run->debug.arch.exception = DB_VECTOR;
+		kvm_run->exit_reason = KVM_EXIT_DEBUG;
+		*r = EMULATE_USER_EXIT;
+	} else {
+		vcpu->arch.emulate_ctxt.eflags &= ~X86_EFLAGS_TF;
+		/*
+		 * "Certain debug exceptions may clear bit 0-3.  The
+		 * remaining contents of the DR6 register are never
+		 * cleared by the processor".
+		 */
+		vcpu->arch.dr6 &= ~15;
+		vcpu->arch.dr6 |= DR6_BS;
+		kvm_queue_exception(vcpu, DB_VECTOR);
 	}
 }
 
@@ -5316,8 +5306,9 @@ restart:
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
 		vcpu->arch.emulate_regs_need_sync_to_vcpu = false;
 		kvm_rip_write(vcpu, ctxt->eip);
-		if (r == EMULATE_DONE)
-			kvm_vcpu_check_singlestep(vcpu, &r);
+		if (r == EMULATE_DONE &&
+		    (ctxt->tf || (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP)))
+			kvm_vcpu_do_singlestep(vcpu, &r);
 		kvm_set_rflags(vcpu, ctxt->eflags);
 	} else
 		vcpu->arch.emulate_regs_need_sync_to_vcpu = true;

* [PATCH 3.16 5/7] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl()
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
                   ` (4 preceding siblings ...)
  2017-06-27 11:10 ` [PATCH 3.16 2/7] mm: fix new crash in unmapped_area_topdown() Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 1/7] mm: larger stack guard gap, between vmas Ben Hutchings
  2017-06-27 13:37 ` [PATCH 3.16 0/7] 3.16.45-rc1 review Guenter Roeck
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable; +Cc: akpm, Sinclair Yeh, Vladis Dronov

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Vladis Dronov <vdronov@redhat.com>

commit ee9c4e681ec4f58e42a83cb0c22a0289ade1aacf upstream.

The 'req->mip_levels' parameter in vmw_gb_surface_define_ioctl() is
a user-controlled 'uint32_t' value which is used as a loop count limit.
This can lead to a kernel lockup and denial of service.  Add a check for
'req->mip_levels'.
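
The general pattern, as a hedged sketch (MAX_MIP_LEVELS here is an
illustrative stand-in for the real DRM_VMW_MAX_MIP_LEVELS limit):
validate a user-supplied count at the ioctl boundary before it drives
any loop.

	#include <errno.h>
	#include <stdint.h>

	#define MAX_MIP_LEVELS 32u	/* illustrative stand-in */

	static int define_surface(uint32_t mip_levels)
	{
		if (mip_levels > MAX_MIP_LEVELS)
			return -EINVAL;	/* reject before any loop runs */
		for (uint32_t i = 0; i < mip_levels; i++) {
			/* bounded per-level work */
		}
		return 0;
	}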

References:
https://bugzilla.redhat.com/show_bug.cgi?id=1437431

Signed-off-by: Vladis Dronov <vdronov@redhat.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 3 +++
 1 file changed, 3 insertions(+)

--- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c
@@ -1251,6 +1251,9 @@ int vmw_gb_surface_define_ioctl(struct d
 	const struct svga3d_surface_desc *desc;
 	uint32_t backup_handle;
 
+	if (req->mip_levels > DRM_VMW_MAX_MIP_LEVELS)
+		return -EINVAL;
+
 	if (unlikely(vmw_user_surface_size == 0))
 		vmw_user_surface_size = ttm_round_pot(sizeof(*user_srf)) +
 			128;

* [PATCH 3.16 6/7] rxrpc: Fix several cases where a padded len isn't checked in ticket decode
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 11:10 ` [PATCH 3.16 3/7] Allow stack to grow up to address space limit Ben Hutchings
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, 石磊,
	David S. Miller, David Howells, Dan Carpenter, Marc Dionne

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: David Howells <dhowells@redhat.com>

commit 5f2f97656ada8d811d3c1bef503ced266fcd53a0 upstream.

This fixes CVE-2017-7482.

When a kerberos 5 ticket is being decoded so that it can be loaded into an
rxrpc-type key, there are several places in which the length of a
variable-length field is checked to make sure that it's not going to
overrun the available data - but the data is padded to the nearest
four-byte boundary and the code doesn't check for this extra.  This could
lead to the size-remaining variable wrapping and the data pointer going
over the end of the buffer.

Fix this by making the various variable-length data checks use the padded
length.
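
A hedged worked example of the wrap (values illustrative): a field
claiming len = 5 with only toklen = 6 bytes remaining passes the old
raw-length check, but consuming the padded length then wraps the
remaining count.

	#include <stdio.h>

	int main(void)
	{
		unsigned int toklen = 6, len = 5;
		unsigned int paddedlen = (len + 3) & ~3;	/* 8 */

		if (!(len > toklen))	/* old check passes: 5 <= 6 */
			printf("remaining wraps to %u\n", toklen - paddedlen);

		if (paddedlen > toklen)	/* fixed check rejects: 8 > 6 */
			printf("token rejected\n");
		return 0;
	}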

Reported-by: 石磊 <shilei-c@360.cn>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Marc Dionne <marc.c.dionne@auristor.com>
Reviewed-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
[bwh: Backported to 3.16: adjust filename, context]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
 net/rxrpc/ar-key.c | 64 ++++++++++++++++++++++++++++++---------------------------
 1 file changed, 34 insertions(+), 30 deletions(-)

--- a/net/rxrpc/ar-key.c
+++ b/net/rxrpc/ar-key.c
@@ -213,7 +213,7 @@ static int rxrpc_krb5_decode_principal(s
 				       unsigned int *_toklen)
 {
 	const __be32 *xdr = *_xdr;
-	unsigned int toklen = *_toklen, n_parts, loop, tmp;
+	unsigned int toklen = *_toklen, n_parts, loop, tmp, paddedlen;
 
 	/* there must be at least one name, and at least #names+1 length
 	 * words */
@@ -243,16 +243,16 @@ static int rxrpc_krb5_decode_principal(s
 		toklen -= 4;
 		if (tmp <= 0 || tmp > AFSTOKEN_STRING_MAX)
 			return -EINVAL;
-		if (tmp > toklen)
+		paddedlen = (tmp + 3) & ~3;
+		if (paddedlen > toklen)
 			return -EINVAL;
 		princ->name_parts[loop] = kmalloc(tmp + 1, GFP_KERNEL);
 		if (!princ->name_parts[loop])
 			return -ENOMEM;
 		memcpy(princ->name_parts[loop], xdr, tmp);
 		princ->name_parts[loop][tmp] = 0;
-		tmp = (tmp + 3) & ~3;
-		toklen -= tmp;
-		xdr += tmp >> 2;
+		toklen -= paddedlen;
+		xdr += paddedlen >> 2;
 	}
 
 	if (toklen < 4)
@@ -261,16 +261,16 @@ static int rxrpc_krb5_decode_principal(s
 	toklen -= 4;
 	if (tmp <= 0 || tmp > AFSTOKEN_K5_REALM_MAX)
 		return -EINVAL;
-	if (tmp > toklen)
+	paddedlen = (tmp + 3) & ~3;
+	if (paddedlen > toklen)
 		return -EINVAL;
 	princ->realm = kmalloc(tmp + 1, GFP_KERNEL);
 	if (!princ->realm)
 		return -ENOMEM;
 	memcpy(princ->realm, xdr, tmp);
 	princ->realm[tmp] = 0;
-	tmp = (tmp + 3) & ~3;
-	toklen -= tmp;
-	xdr += tmp >> 2;
+	toklen -= paddedlen;
+	xdr += paddedlen >> 2;
 
 	_debug("%s/...@%s", princ->name_parts[0], princ->realm);
 
@@ -289,7 +289,7 @@ static int rxrpc_krb5_decode_tagged_data
 					 unsigned int *_toklen)
 {
 	const __be32 *xdr = *_xdr;
-	unsigned int toklen = *_toklen, len;
+	unsigned int toklen = *_toklen, len, paddedlen;
 
 	/* there must be at least one tag and one length word */
 	if (toklen <= 8)
@@ -303,15 +303,17 @@ static int rxrpc_krb5_decode_tagged_data
 	toklen -= 8;
 	if (len > max_data_size)
 		return -EINVAL;
+	paddedlen = (len + 3) & ~3;
+	if (paddedlen > toklen)
+		return -EINVAL;
 	td->data_len = len;
 
 	if (len > 0) {
 		td->data = kmemdup(xdr, len, GFP_KERNEL);
 		if (!td->data)
 			return -ENOMEM;
-		len = (len + 3) & ~3;
-		toklen -= len;
-		xdr += len >> 2;
+		toklen -= paddedlen;
+		xdr += paddedlen >> 2;
 	}
 
 	_debug("tag %x len %x", td->tag, td->data_len);
@@ -383,7 +385,7 @@ static int rxrpc_krb5_decode_ticket(u8 *
 				    const __be32 **_xdr, unsigned int *_toklen)
 {
 	const __be32 *xdr = *_xdr;
-	unsigned int toklen = *_toklen, len;
+	unsigned int toklen = *_toklen, len, paddedlen;
 
 	/* there must be at least one length word */
 	if (toklen <= 4)
@@ -395,6 +397,9 @@ static int rxrpc_krb5_decode_ticket(u8 *
 	toklen -= 4;
 	if (len > AFSTOKEN_K5_TIX_MAX)
 		return -EINVAL;
+	paddedlen = (len + 3) & ~3;
+	if (paddedlen > toklen)
+		return -EINVAL;
 	*_tktlen = len;
 
 	_debug("ticket len %u", len);
@@ -403,9 +408,8 @@ static int rxrpc_krb5_decode_ticket(u8 *
 		*_ticket = kmemdup(xdr, len, GFP_KERNEL);
 		if (!*_ticket)
 			return -ENOMEM;
-		len = (len + 3) & ~3;
-		toklen -= len;
-		xdr += len >> 2;
+		toklen -= paddedlen;
+		xdr += paddedlen >> 2;
 	}
 
 	*_xdr = xdr;
@@ -549,7 +553,7 @@ static int rxrpc_instantiate_xdr(struct
 {
 	const __be32 *xdr = data, *token;
 	const char *cp;
-	unsigned int len, tmp, loop, ntoken, toklen, sec_ix;
+	unsigned int len, paddedlen, loop, ntoken, toklen, sec_ix;
 	int ret;
 
 	_enter(",{%x,%x,%x,%x},%zu",
@@ -574,22 +578,21 @@ static int rxrpc_instantiate_xdr(struct
 	if (len < 1 || len > AFSTOKEN_CELL_MAX)
 		goto not_xdr;
 	datalen -= 4;
-	tmp = (len + 3) & ~3;
-	if (tmp > datalen)
+	paddedlen = (len + 3) & ~3;
+	if (paddedlen > datalen)
 		goto not_xdr;
 
 	cp = (const char *) xdr;
 	for (loop = 0; loop < len; loop++)
 		if (!isprint(cp[loop]))
 			goto not_xdr;
-	if (len < tmp)
-		for (; loop < tmp; loop++)
-			if (cp[loop])
-				goto not_xdr;
+	for (; loop < paddedlen; loop++)
+		if (cp[loop])
+			goto not_xdr;
 	_debug("cellname: [%u/%u] '%*.*s'",
-	       len, tmp, len, len, (const char *) xdr);
-	datalen -= tmp;
-	xdr += tmp >> 2;
+	       len, paddedlen, len, len, (const char *) xdr);
+	datalen -= paddedlen;
+	xdr += paddedlen >> 2;
 
 	/* get the token count */
 	if (datalen < 12)
@@ -610,10 +613,11 @@ static int rxrpc_instantiate_xdr(struct
 		sec_ix = ntohl(*xdr);
 		datalen -= 4;
 		_debug("token: [%x/%zx] %x", toklen, datalen, sec_ix);
-		if (toklen < 20 || toklen > datalen)
+		paddedlen = (toklen + 3) & ~3;
+		if (toklen < 20 || toklen > datalen || paddedlen > datalen)
 			goto not_xdr;
-		datalen -= (toklen + 3) & ~3;
-		xdr += (toklen + 3) >> 2;
+		datalen -= paddedlen;
+		xdr += paddedlen >> 2;
 
 	} while (--loop > 0);
 

* [PATCH 3.16 1/7] mm: larger stack guard gap, between vmas
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
                   ` (5 preceding siblings ...)
  2017-06-27 11:10 ` [PATCH 3.16 5/7] drm/vmwgfx: limit the number of mip levels in vmw_gb_surface_define_ioctl() Ben Hutchings
@ 2017-06-27 11:10 ` Ben Hutchings
  2017-06-27 13:37 ` [PATCH 3.16 0/7] 3.16.45-rc1 review Guenter Roeck
  7 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 11:10 UTC (permalink / raw)
  To: linux-kernel, stable
  Cc: akpm, Linus Torvalds, Hugh Dickins, Michal Hocko, Helge Deller

3.16.45-rc1 review patch.  If anyone has any objections, please let me know.

------------------

From: Hugh Dickins <hughd@google.com>

commit 1be7107fbe18eed3e319a6c3e83c78254b693acb upstream.

Stack guard page is a useful feature to reduce a risk of stack smashing
into a different mapping. We have been using a single page gap which
is sufficient to prevent having stack adjacent to a different mapping.
But this seems to be insufficient in the light of the stack usage in
userspace.  E.g. glibc uses alloca() allocations as large as 64kB in
many commonly used functions.  Others use constructs like gid_t
buffer[NGROUPS_MAX], which is 256kB, or stack strings with
MAX_ARG_STRLEN.

This is especially dangerous for suid binaries run with the default
unlimited stack size, because those applications can be tricked into
consuming a large portion of the stack and a single glibc call could
then jump over the guard page.  These attacks are not theoretical,
unfortunately.

Make those attacks less probable by increasing the stack guard gap
to 1MB (on systems with 4k pages; but make it depend on the page size
because systems with larger base pages might cap stack allocations in
the PAGE_SIZE units) which should cover larger alloca() and VLA stack
allocations.  It is obviously not a full fix because the problem is
somewhat inherent, but it should reduce the attack surface a lot.

One could argue that the gap size should be configurable from userspace,
but that can be done later when somebody finds that the new 1MB is wrong
for some special case applications.  For now, add a kernel command line
option (stack_guard_gap) to specify the stack gap size (in page units).
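
For example, a system wanting a 2MB gap with 4kB pages could boot with
(illustrative):

	stack_guard_gap=512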

Implementation-wise, first delete all the old code for the stack guard page:
because although we could get away with accounting one extra page in a
stack vma, accounting a larger gap can break userspace - case in point,
a program run with "ulimit -S -v 20000" failed when the 1MB gap was
counted for RLIMIT_AS; similar problems could come with RLIMIT_MLOCK
and strict non-overcommit mode.

Instead of keeping gap inside the stack vma, maintain the stack guard
gap as a gap between vmas: using vm_start_gap() in place of vm_start
(or vm_end_gap() in place of vm_end if VM_GROWSUP) in just those few
places which need to respect the gap - mainly arch_get_unmapped_area(),
and the vma tree's subtree_gap support for that.
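
A hedged worked example of the clamping in vm_start_gap() (numbers
illustrative): a GROWSDOWN vma starting below the gap size would
underflow, so the helper clamps the effective start to 0.

	#include <stdio.h>

	int main(void)
	{
		unsigned long stack_guard_gap = 256UL << 12;	/* 1MB, 4kB pages */
		unsigned long vm_start = 0x80000UL;		/* vma starts at 512kB */
		unsigned long gap_start = vm_start - stack_guard_gap;

		if (gap_start > vm_start)	/* wrapped below zero */
			gap_start = 0;
		printf("effective start: %#lx\n", gap_start);	/* 0 */
		return 0;
	}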

Original-patch-by: Oleg Nesterov <oleg@redhat.com>
Original-patch-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[Hugh Dickins: Backported to 3.16]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
---
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -3154,6 +3154,13 @@ bytes respectively. Such letter suffixes
 	spia_pedr=
 	spia_peddr=
 
+	stack_guard_gap=	[MM]
+			override the default stack gap protection. The value
+			is in page units and it defines how many pages prior
+			to (for stacks growing down) resp. after (for stacks
+			growing up) the main stack are reserved for no other
+			mapping. Default value is 256 pages.
+
 	stacktrace	[FTRACE]
 			Enabled the stack tracer on boot up.
 
--- a/arch/arc/mm/mmap.c
+++ b/arch/arc/mm/mmap.c
@@ -64,7 +64,7 @@ arch_get_unmapped_area(struct file *filp
 
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/arm/mm/mmap.c
+++ b/arch/arm/mm/mmap.c
@@ -89,7 +89,7 @@ arch_get_unmapped_area(struct file *filp
 
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
@@ -140,7 +140,7 @@ arch_get_unmapped_area_topdown(struct fi
 			addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
+				(!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/frv/mm/elf-fdpic.c
+++ b/arch/frv/mm/elf-fdpic.c
@@ -74,7 +74,7 @@ unsigned long arch_get_unmapped_area(str
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(current->mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			goto success;
 	}
 
--- a/arch/mips/mm/mmap.c
+++ b/arch/mips/mm/mmap.c
@@ -92,7 +92,7 @@ static unsigned long arch_get_unmapped_a
 
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/parisc/kernel/sys_parisc.c
+++ b/arch/parisc/kernel/sys_parisc.c
@@ -88,7 +88,7 @@ unsigned long arch_get_unmapped_area(str
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma, *prev;
 	unsigned long task_size = TASK_SIZE;
 	int do_color_align, last_mmap;
 	struct vm_unmapped_area_info info;
@@ -115,9 +115,10 @@ unsigned long arch_get_unmapped_area(str
 		else
 			addr = PAGE_ALIGN(addr);
 
-		vma = find_vma(mm, addr);
+		vma = find_vma_prev(mm, addr, &prev);
 		if (task_size - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)) &&
+		    (!prev || addr >= vm_end_gap(prev)))
 			goto found_addr;
 	}
 
@@ -141,7 +142,7 @@ arch_get_unmapped_area_topdown(struct fi
 			  const unsigned long len, const unsigned long pgoff,
 			  const unsigned long flags)
 {
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma, *prev;
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
 	int do_color_align, last_mmap;
@@ -175,9 +176,11 @@ arch_get_unmapped_area_topdown(struct fi
 			addr = COLOR_ALIGN(addr, last_mmap, pgoff);
 		else
 			addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
+
+		vma = find_vma_prev(mm, addr, &prev);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)) &&
+		    (!prev || addr >= vm_end_gap(prev)))
 			goto found_addr;
 	}
 
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -103,7 +103,7 @@ static int slice_area_is_free(struct mm_
 	if ((mm->task_size - len) < addr)
 		return 0;
 	vma = find_vma(mm, addr);
-	return (!vma || (addr + len) <= vma->vm_start);
+	return (!vma || (addr + len) <= vm_start_gap(vma));
 }
 
 static int slice_low_has_vma(struct mm_struct *mm, unsigned long slice)
--- a/arch/sh/mm/mmap.c
+++ b/arch/sh/mm/mmap.c
@@ -63,7 +63,7 @@ unsigned long arch_get_unmapped_area(str
 
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
@@ -113,7 +113,7 @@ arch_get_unmapped_area_topdown(struct fi
 
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/sparc/kernel/sys_sparc_64.c
+++ b/arch/sparc/kernel/sys_sparc_64.c
@@ -118,7 +118,7 @@ unsigned long arch_get_unmapped_area(str
 
 		vma = find_vma(mm, addr);
 		if (task_size - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
@@ -181,7 +181,7 @@ arch_get_unmapped_area_topdown(struct fi
 
 		vma = find_vma(mm, addr);
 		if (task_size - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/sparc/mm/hugetlbpage.c
+++ b/arch/sparc/mm/hugetlbpage.c
@@ -115,7 +115,7 @@ hugetlb_get_unmapped_area(struct file *f
 		addr = ALIGN(addr, HPAGE_SIZE);
 		vma = find_vma(mm, addr);
 		if (task_size - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 	if (mm->get_unmapped_area == arch_get_unmapped_area)
--- a/arch/tile/mm/hugetlbpage.c
+++ b/arch/tile/mm/hugetlbpage.c
@@ -265,7 +265,7 @@ unsigned long hugetlb_get_unmapped_area(
 		addr = ALIGN(addr, huge_page_size(h));
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 	if (current->mm->get_unmapped_area == arch_get_unmapped_area)
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -127,7 +127,7 @@ arch_get_unmapped_area(struct file *filp
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
 		if (end - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
@@ -166,7 +166,7 @@ arch_get_unmapped_area_topdown(struct fi
 		addr = PAGE_ALIGN(addr);
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-				(!vma || addr + len <= vma->vm_start))
+				(!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -156,7 +156,7 @@ hugetlb_get_unmapped_area(struct file *f
 		addr = ALIGN(addr, huge_page_size(h));
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 	if (mm->get_unmapped_area == arch_get_unmapped_area)
--- a/arch/xtensa/kernel/syscall.c
+++ b/arch/xtensa/kernel/syscall.c
@@ -86,7 +86,7 @@ unsigned long arch_get_unmapped_area(str
 		/* At this point:  (!vmm || addr < vmm->vm_end). */
 		if (TASK_SIZE - len < addr)
 			return -ENOMEM;
-		if (!vmm || addr + len <= vmm->vm_start)
+		if (!vmm || addr + len <= vm_start_gap(vmm))
 			return addr;
 		addr = vmm->vm_end;
 		if (flags & MAP_SHARED)
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -171,7 +171,7 @@ hugetlb_get_unmapped_area(struct file *f
 		addr = ALIGN(addr, huge_page_size(h));
 		vma = find_vma(mm, addr);
 		if (TASK_SIZE - len >= addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)))
 			return addr;
 	}
 
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -273,11 +273,7 @@ show_map_vma(struct seq_file *m, struct
 
 	/* We don't show the stack guard page in /proc/maps */
 	start = vma->vm_start;
-	if (stack_guard_page_start(vma, start))
-		start += PAGE_SIZE;
 	end = vma->vm_end;
-	if (stack_guard_page_end(vma, end))
-		end -= PAGE_SIZE;
 
 	seq_setwidth(m, 25 + sizeof(void *) * 6 - 1);
 	seq_printf(m, "%08lx-%08lx %c%c%c%c %08llx %02x:%02x %lu ",
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1241,34 +1241,6 @@ int set_page_dirty_lock(struct page *pag
 int clear_page_dirty_for_io(struct page *page);
 int get_cmdline(struct task_struct *task, char *buffer, int buflen);
 
-/* Is the vma a continuation of the stack vma above it? */
-static inline int vma_growsdown(struct vm_area_struct *vma, unsigned long addr)
-{
-	return vma && (vma->vm_end == addr) && (vma->vm_flags & VM_GROWSDOWN);
-}
-
-static inline int stack_guard_page_start(struct vm_area_struct *vma,
-					     unsigned long addr)
-{
-	return (vma->vm_flags & VM_GROWSDOWN) &&
-		(vma->vm_start == addr) &&
-		!vma_growsdown(vma->vm_prev, addr);
-}
-
-/* Is the vma a continuation of the stack vma below it? */
-static inline int vma_growsup(struct vm_area_struct *vma, unsigned long addr)
-{
-	return vma && (vma->vm_start == addr) && (vma->vm_flags & VM_GROWSUP);
-}
-
-static inline int stack_guard_page_end(struct vm_area_struct *vma,
-					   unsigned long addr)
-{
-	return (vma->vm_flags & VM_GROWSUP) &&
-		(vma->vm_end == addr) &&
-		!vma_growsup(vma->vm_next, addr);
-}
-
 extern pid_t
 vm_is_stack(struct task_struct *task, struct vm_area_struct *vma, int in_group);
 
@@ -1914,6 +1886,7 @@ void page_cache_async_readahead(struct a
 
 unsigned long max_sane_readahead(unsigned long nr);
 
+extern unsigned long stack_guard_gap;
 /* Generic expand stack which grows the stack according to GROWS{UP,DOWN} */
 extern int expand_stack(struct vm_area_struct *vma, unsigned long address);
 
@@ -1942,6 +1915,30 @@ static inline struct vm_area_struct * fi
 	return vma;
 }
 
+static inline unsigned long vm_start_gap(struct vm_area_struct *vma)
+{
+	unsigned long vm_start = vma->vm_start;
+
+	if (vma->vm_flags & VM_GROWSDOWN) {
+		vm_start -= stack_guard_gap;
+		if (vm_start > vma->vm_start)
+			vm_start = 0;
+	}
+	return vm_start;
+}
+
+static inline unsigned long vm_end_gap(struct vm_area_struct *vma)
+{
+	unsigned long vm_end = vma->vm_end;
+
+	if (vma->vm_flags & VM_GROWSUP) {
+		vm_end += stack_guard_gap;
+		if (vm_end < vma->vm_end)
+			vm_end = -PAGE_SIZE;
+	}
+	return vm_end;
+}
+
 static inline unsigned long vma_pages(struct vm_area_struct *vma)
 {
 	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -266,11 +266,6 @@ static int faultin_page(struct task_stru
 	unsigned int fault_flags = 0;
 	int ret;
 
-	/* For mlock, just skip the stack guard page. */
-	if ((*flags & FOLL_MLOCK) &&
-			(stack_guard_page_start(vma, address) ||
-			 stack_guard_page_end(vma, address + PAGE_SIZE)))
-		return -ENOENT;
 	if (*flags & FOLL_WRITE)
 		fault_flags |= FAULT_FLAG_WRITE;
 	if (nonblocking)
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2589,40 +2589,6 @@ out_release:
 }
 
 /*
- * This is like a special single-page "expand_{down|up}wards()",
- * except we must first make sure that 'address{-|+}PAGE_SIZE'
- * doesn't hit another vma.
- */
-static inline int check_stack_guard_page(struct vm_area_struct *vma, unsigned long address)
-{
-	address &= PAGE_MASK;
-	if ((vma->vm_flags & VM_GROWSDOWN) && address == vma->vm_start) {
-		struct vm_area_struct *prev = vma->vm_prev;
-
-		/*
-		 * Is there a mapping abutting this one below?
-		 *
-		 * That's only ok if it's the same stack mapping
-		 * that has gotten split..
-		 */
-		if (prev && prev->vm_end == address)
-			return prev->vm_flags & VM_GROWSDOWN ? 0 : -ENOMEM;
-
-		return expand_downwards(vma, address - PAGE_SIZE);
-	}
-	if ((vma->vm_flags & VM_GROWSUP) && address + PAGE_SIZE == vma->vm_end) {
-		struct vm_area_struct *next = vma->vm_next;
-
-		/* As VM_GROWSDOWN but s/below/above/ */
-		if (next && next->vm_start == address + PAGE_SIZE)
-			return next->vm_flags & VM_GROWSUP ? 0 : -ENOMEM;
-
-		return expand_upwards(vma, address + PAGE_SIZE);
-	}
-	return 0;
-}
-
-/*
  * We enter with non-exclusive mmap_sem (to exclude vma changes,
  * but allow concurrent faults), and pte mapped but not yet locked.
  * We return with mmap_sem still held, but pte unmapped and unlocked.
@@ -2641,10 +2607,6 @@ static int do_anonymous_page(struct mm_s
 	if (vma->vm_flags & VM_SHARED)
 		return VM_FAULT_SIGBUS;
 
-	/* Check if we need to add a guard page to the stack */
-	if (check_stack_guard_page(vma, address) < 0)
-		return VM_FAULT_SIGSEGV;
-
 	/* Use the zero-page for reads */
 	if (!(flags & FAULT_FLAG_WRITE)) {
 		entry = pte_mkspecial(pfn_pte(my_zero_pfn(address),
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -266,6 +266,7 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 	unsigned long rlim, retval;
 	unsigned long newbrk, oldbrk;
 	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *next;
 	unsigned long min_brk;
 	bool populate;
 
@@ -311,7 +312,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
 	}
 
 	/* Check against existing mmap mappings. */
-	if (find_vma_intersection(mm, oldbrk, newbrk+PAGE_SIZE))
+	next = find_vma(mm, oldbrk);
+	if (next && newbrk + PAGE_SIZE > vm_start_gap(next))
 		goto out;
 
 	/* Ok, looks good - let it rip. */
@@ -334,10 +336,22 @@ out:
 
 static long vma_compute_subtree_gap(struct vm_area_struct *vma)
 {
-	unsigned long max, subtree_gap;
-	max = vma->vm_start;
-	if (vma->vm_prev)
-		max -= vma->vm_prev->vm_end;
+	unsigned long max, prev_end, subtree_gap;
+
+	/*
+	 * Note: in the rare case of a VM_GROWSDOWN above a VM_GROWSUP, we
+	 * allow two stack_guard_gaps between them here, and when choosing
+	 * an unmapped area; whereas when expanding we only require one.
+	 * That's a little inconsistent, but keeps the code here simpler.
+	 */
+	max = vm_start_gap(vma);
+	if (vma->vm_prev) {
+		prev_end = vm_end_gap(vma->vm_prev);
+		if (max > prev_end)
+			max -= prev_end;
+		else
+			max = 0;
+	}
 	if (vma->vm_rb.rb_left) {
 		subtree_gap = rb_entry(vma->vm_rb.rb_left,
 				struct vm_area_struct, vm_rb)->rb_subtree_gap;
@@ -426,7 +440,7 @@ static void validate_mm(struct mm_struct
 			anon_vma_unlock_read(anon_vma);
 		}
 
-		highest_address = vma->vm_end;
+		highest_address = vm_end_gap(vma);
 		vma = vma->vm_next;
 		i++;
 	}
@@ -594,7 +608,7 @@ void __vma_link_rb(struct mm_struct *mm,
 	if (vma->vm_next)
 		vma_gap_update(vma->vm_next);
 	else
-		mm->highest_vm_end = vma->vm_end;
+		mm->highest_vm_end = vm_end_gap(vma);
 
 	/*
 	 * vma->vm_prev wasn't known when we followed the rbtree to find the
@@ -846,7 +860,7 @@ again:			remove_next = 1 + (end > next->
 			vma_gap_update(vma);
 		if (end_changed) {
 			if (!next)
-				mm->highest_vm_end = end;
+				mm->highest_vm_end = vm_end_gap(vma);
 			else if (!adjust_next)
 				vma_gap_update(next);
 		}
@@ -889,7 +903,7 @@ again:			remove_next = 1 + (end > next->
 		else if (next)
 			vma_gap_update(next);
 		else
-			mm->highest_vm_end = end;
+			VM_WARN_ON(mm->highest_vm_end != vm_end_gap(vma));
 	}
 	if (insert && file)
 		uprobe_mmap(insert);
@@ -1702,7 +1716,7 @@ unsigned long unmapped_area(struct vm_un
 
 	while (true) {
 		/* Visit left subtree if it looks promising */
-		gap_end = vma->vm_start;
+		gap_end = vm_start_gap(vma);
 		if (gap_end >= low_limit && vma->vm_rb.rb_left) {
 			struct vm_area_struct *left =
 				rb_entry(vma->vm_rb.rb_left,
@@ -1713,7 +1727,7 @@ unsigned long unmapped_area(struct vm_un
 			}
 		}
 
-		gap_start = vma->vm_prev ? vma->vm_prev->vm_end : 0;
+		gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : 0;
 check_current:
 		/* Check if current node has a suitable gap */
 		if (gap_start > high_limit)
@@ -1740,8 +1754,8 @@ check_current:
 			vma = rb_entry(rb_parent(prev),
 				       struct vm_area_struct, vm_rb);
 			if (prev == vma->vm_rb.rb_left) {
-				gap_start = vma->vm_prev->vm_end;
-				gap_end = vma->vm_start;
+				gap_start = vm_end_gap(vma->vm_prev);
+				gap_end = vm_start_gap(vma);
 				goto check_current;
 			}
 		}
@@ -1805,7 +1819,7 @@ unsigned long unmapped_area_topdown(stru
 
 	while (true) {
 		/* Visit right subtree if it looks promising */
-		gap_start = vma->vm_prev ? vma->vm_prev->vm_end : 0;
+		gap_start = vma->vm_prev ? vm_end_gap(vma->vm_prev) : 0;
 		if (gap_start <= high_limit && vma->vm_rb.rb_right) {
 			struct vm_area_struct *right =
 				rb_entry(vma->vm_rb.rb_right,
@@ -1818,7 +1832,7 @@ unsigned long unmapped_area_topdown(stru
 
 check_current:
 		/* Check if current node has a suitable gap */
-		gap_end = vma->vm_start;
+		gap_end = vm_start_gap(vma);
 		if (gap_end < low_limit)
 			return -ENOMEM;
 		if (gap_start <= high_limit && gap_end - gap_start >= length)
@@ -1844,7 +1858,7 @@ check_current:
 				       struct vm_area_struct, vm_rb);
 			if (prev == vma->vm_rb.rb_right) {
 				gap_start = vma->vm_prev ?
-					vma->vm_prev->vm_end : 0;
+					vm_end_gap(vma->vm_prev) : 0;
 				goto check_current;
 			}
 		}
@@ -1882,7 +1896,7 @@ arch_get_unmapped_area(struct file *filp
 		unsigned long len, unsigned long pgoff, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma, *prev;
 	struct vm_unmapped_area_info info;
 
 	if (len > TASK_SIZE - mmap_min_addr)
@@ -1893,9 +1907,10 @@ arch_get_unmapped_area(struct file *filp
 
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
+		vma = find_vma_prev(mm, addr, &prev);
 		if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
-		    (!vma || addr + len <= vma->vm_start))
+		    (!vma || addr + len <= vm_start_gap(vma)) &&
+		    (!prev || addr >= vm_end_gap(prev)))
 			return addr;
 	}
 
@@ -1918,7 +1933,7 @@ arch_get_unmapped_area_topdown(struct fi
 			  const unsigned long len, const unsigned long pgoff,
 			  const unsigned long flags)
 {
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma, *prev;
 	struct mm_struct *mm = current->mm;
 	unsigned long addr = addr0;
 	struct vm_unmapped_area_info info;
@@ -1933,9 +1948,10 @@ arch_get_unmapped_area_topdown(struct fi
 	/* requesting a specific address */
 	if (addr) {
 		addr = PAGE_ALIGN(addr);
-		vma = find_vma(mm, addr);
+		vma = find_vma_prev(mm, addr, &prev);
 		if (TASK_SIZE - len >= addr && addr >= mmap_min_addr &&
-				(!vma || addr + len <= vma->vm_start))
+				(!vma || addr + len <= vm_start_gap(vma)) &&
+				(!prev || addr >= vm_end_gap(prev)))
 			return addr;
 	}
 
@@ -2061,21 +2077,19 @@ find_vma_prev(struct mm_struct *mm, unsi
  * update accounting. This is shared with both the
  * grow-up and grow-down cases.
  */
-static int acct_stack_growth(struct vm_area_struct *vma, unsigned long size, unsigned long grow)
+static int acct_stack_growth(struct vm_area_struct *vma,
+			     unsigned long size, unsigned long grow)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	struct rlimit *rlim = current->signal->rlim;
-	unsigned long new_start, actual_size;
+	unsigned long new_start;
 
 	/* address space limit tests */
 	if (!may_expand_vm(mm, grow))
 		return -ENOMEM;
 
 	/* Stack limit test */
-	actual_size = size;
-	if (size && (vma->vm_flags & (VM_GROWSUP | VM_GROWSDOWN)))
-		actual_size -= PAGE_SIZE;
-	if (actual_size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur))
+	if (size > ACCESS_ONCE(rlim[RLIMIT_STACK].rlim_cur))
 		return -ENOMEM;
 
 	/* mlock limit tests */
@@ -2116,17 +2130,30 @@ static int acct_stack_growth(struct vm_a
  */
 int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 {
+	struct vm_area_struct *next;
+	unsigned long gap_addr;
 	int error = 0;
 
 	if (!(vma->vm_flags & VM_GROWSUP))
 		return -EFAULT;
 
 	/* Guard against wrapping around to address 0. */
-	if (address < PAGE_ALIGN(address+4))
-		address = PAGE_ALIGN(address+4);
-	else
+	address &= PAGE_MASK;
+	address += PAGE_SIZE;
+	if (!address)
 		return -ENOMEM;
 
+	/* Enforce stack_guard_gap */
+	gap_addr = address + stack_guard_gap;
+	if (gap_addr < address)
+		return -ENOMEM;
+	next = vma->vm_next;
+	if (next && next->vm_start < gap_addr) {
+		if (!(next->vm_flags & VM_GROWSUP))
+			return -ENOMEM;
+		/* Check that both stack segments have the same anon_vma? */
+	}
+
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma)))
 		return -ENOMEM;
@@ -2167,7 +2194,7 @@ int expand_upwards(struct vm_area_struct
 				if (vma->vm_next)
 					vma_gap_update(vma->vm_next);
 				else
-					vma->vm_mm->highest_vm_end = address;
+					vma->vm_mm->highest_vm_end = vm_end_gap(vma);
 				spin_unlock(&vma->vm_mm->page_table_lock);
 
 				perf_event_mmap(vma);
@@ -2187,6 +2214,8 @@ int expand_upwards(struct vm_area_struct
 int expand_downwards(struct vm_area_struct *vma,
 				   unsigned long address)
 {
+	struct vm_area_struct *prev;
+	unsigned long gap_addr;
 	int error;
 
 	address &= PAGE_MASK;
@@ -2194,6 +2223,17 @@ int expand_downwards(struct vm_area_stru
 	if (error)
 		return error;
 
+	/* Enforce stack_guard_gap */
+	gap_addr = address - stack_guard_gap;
+	if (gap_addr > address)
+		return -ENOMEM;
+	prev = vma->vm_prev;
+	if (prev && prev->vm_end > gap_addr) {
+		if (!(prev->vm_flags & VM_GROWSDOWN))
+			return -ENOMEM;
+		/* Check that both stack segments have the same anon_vma? */
+	}
+
 	/* We must make sure the anon_vma is allocated. */
 	if (unlikely(anon_vma_prepare(vma)))
 		return -ENOMEM;
@@ -2245,28 +2285,25 @@ int expand_downwards(struct vm_area_stru
 	return error;
 }
 
-/*
- * Note how expand_stack() refuses to expand the stack all the way to
- * abut the next virtual mapping, *unless* that mapping itself is also
- * a stack mapping. We want to leave room for a guard page, after all
- * (the guard page itself is not added here, that is done by the
- * actual page faulting logic)
- *
- * This matches the behavior of the guard page logic (see mm/memory.c:
- * check_stack_guard_page()), which only allows the guard page to be
- * removed under these circumstances.
- */
+/* enforced gap between the expanding stack and other mappings. */
+unsigned long stack_guard_gap = 256UL<<PAGE_SHIFT;
+
+static int __init cmdline_parse_stack_guard_gap(char *p)
+{
+	unsigned long val;
+	char *endptr;
+
+	val = simple_strtoul(p, &endptr, 10);
+	if (!*endptr)
+		stack_guard_gap = val << PAGE_SHIFT;
+
+	return 0;
+}
+__setup("stack_guard_gap=", cmdline_parse_stack_guard_gap);
+
 #ifdef CONFIG_STACK_GROWSUP
 int expand_stack(struct vm_area_struct *vma, unsigned long address)
 {
-	struct vm_area_struct *next;
-
-	address &= PAGE_MASK;
-	next = vma->vm_next;
-	if (next && next->vm_start == address + PAGE_SIZE) {
-		if (!(next->vm_flags & VM_GROWSUP))
-			return -ENOMEM;
-	}
 	return expand_upwards(vma, address);
 }
 
@@ -2288,14 +2325,6 @@ find_extend_vma(struct mm_struct *mm, un
 #else
 int expand_stack(struct vm_area_struct *vma, unsigned long address)
 {
-	struct vm_area_struct *prev;
-
-	address &= PAGE_MASK;
-	prev = vma->vm_prev;
-	if (prev && prev->vm_end == address) {
-		if (!(prev->vm_flags & VM_GROWSDOWN))
-			return -ENOMEM;
-	}
 	return expand_downwards(vma, address);
 }
 
@@ -2391,7 +2420,7 @@ detach_vmas_to_be_unmapped(struct mm_str
 		vma->vm_prev = prev;
 		vma_gap_update(vma);
 	} else
-		mm->highest_vm_end = prev ? prev->vm_end : 0;
+		mm->highest_vm_end = prev ? vm_end_gap(prev) : 0;
 	tail_vma->vm_next = NULL;
 
 	/* Kill the cache */

* Re: [PATCH 3.16 0/7] 3.16.45-rc1 review
  2017-06-27 11:10 [PATCH 3.16 0/7] 3.16.45-rc1 review Ben Hutchings
                   ` (6 preceding siblings ...)
  2017-06-27 11:10 ` [PATCH 3.16 1/7] mm: larger stack guard gap, between vmas Ben Hutchings
@ 2017-06-27 13:37 ` Guenter Roeck
  2017-06-27 17:15   ` Ben Hutchings
  7 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2017-06-27 13:37 UTC (permalink / raw)
  To: Ben Hutchings, linux-kernel, stable; +Cc: torvalds, akpm

On 06/27/2017 04:10 AM, Ben Hutchings wrote:
> This is the start of the stable review cycle for the 3.16.45 release.
> There are 7 patches in this series, which will be posted as responses
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Thu Jun 29 11:10:33 UTC 2017.
> Anything received after that time might be too late.
> 

Build results:
	total: 136 pass: 136 fail: 0
Qemu test results:
	total: 107 pass: 107 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter

* Re: [PATCH 3.16 0/7] 3.16.45-rc1 review
  2017-06-27 13:37 ` [PATCH 3.16 0/7] 3.16.45-rc1 review Guenter Roeck
@ 2017-06-27 17:15   ` Ben Hutchings
  0 siblings, 0 replies; 10+ messages in thread
From: Ben Hutchings @ 2017-06-27 17:15 UTC (permalink / raw)
  To: Guenter Roeck, linux-kernel, stable; +Cc: torvalds, akpm

On Tue, 2017-06-27 at 06:37 -0700, Guenter Roeck wrote:
> On 06/27/2017 04:10 AM, Ben Hutchings wrote:
> > This is the start of the stable review cycle for the 3.16.45 release.
> > There are 7 patches in this series, which will be posted as responses
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Thu Jun 29 11:10:33 UTC 2017.
> > Anything received after that time might be too late.
> > 
> 
> Build results:
> 	total: 136 pass: 136 fail: 0
> Qemu test results:
> 	total: 107 pass: 107 fail: 0
> 
> Details are available at http://kerneltests.org/builders.

Thanks for checking these.

Ben.

-- 
Ben Hutchings
Absolutum obsoletum. (If it works, it's out of date.) - Stafford Beer

