From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755799AbbGCTpJ (ORCPT <rfc822;w@1wt.eu>);
	Fri, 3 Jul 2015 15:45:09 -0400
Received: from mail.kernel.org ([198.145.29.136]:43809 "EHLO mail.kernel.org"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752286AbbGCTom (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Fri, 3 Jul 2015 15:44:42 -0400
From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: =?UTF-8?q?Fr=C3=A9d=C3=A9ric=20Weisbecker?= <fweisbec@gmail.com>,
        Rik van Riel <riel@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
        Denys Vlasenko <vda.linux@googlemail.com>,
        Borislav Petkov <bp@alien8.de>, Kees Cook <keescook@chromium.org>,
        Brian Gerst <brgerst@gmail.com>, paulmck@linux.vnet.ibm.com,
        Andy Lutomirski <luto@kernel.org>
Subject: [PATCH v5 02/17] x86/entry/64/compat: Fix bad fast syscall arg failure path
Date: Fri,  3 Jul 2015 12:44:19 -0700
Message-Id: <903010762c07a3d67df914fea2da84b52b0f8f1d.1435952415.git.luto@kernel.org>
X-Mailer: git-send-email 2.4.3
In-Reply-To: <cover.1435952415.git.luto@kernel.org>
References: <cover.1435952415.git.luto@kernel.org>
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

If user code does SYSCALL32 or SYSENTER without a valid stack, then
our attempt to determine the syscall args will fault in uaccess.
Previously, we would try to recover by jumping to the syscall exit
code, which ran the syscall exit work even though we had never run
the syscall entry work.

Clean it up by treating the failure path as a non-syscall entry and
exit pair.

This fixes strace's output when running the syscall_arg_fault test.
Without this fix, strace would get out of sync and would fail to
associate syscall entries with syscall exits.

Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
 arch/x86/entry/entry_64.S        |  2 +-
 arch/x86/entry/entry_64_compat.S | 35 +++++++++++++++++++++++++++++++++--
 2 files changed, 34 insertions(+), 3 deletions(-)

diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 3bb2c4302df1..141a5d49dddc 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -613,7 +613,7 @@ ret_from_intr:
 	testb	$3, CS(%rsp)
 	jz	retint_kernel
 	/* Interrupt came from user space */
-retint_user:
+GLOBAL(retint_user)
 	GET_THREAD_INFO(%rcx)
 
 	/* %rcx: thread info. Interrupts are off. */
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index bb187a6a877c..efe0b1e499fa 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -425,8 +425,39 @@ cstar_tracesys:
 END(entry_SYSCALL_compat)
 
 ia32_badarg:
-	ASM_CLAC
-	movq	$-EFAULT, RAX(%rsp)
+	/*
+	 * So far, we've entered kernel mode, set AC, turned on IRQs, and
+	 * saved C regs except r8-r11.  We haven't done any of the other
+	 * standard entry work, though.  We want to bail, but we shouldn't
+	 * treat this as a syscall entry since we don't even know what the
+	 * args are.  Instead, treat this as a non-syscall entry, finish
+	 * the entry work, and immediately exit after setting AX = -EFAULT.
+	 *
+	 * We're really just being polite here.  Killing the task outright
+	 * would be a reasonable action, too.  Given that the only valid
+	 * way to have gotten here is through the vDSO, and we already know
+	 * that the stack pointer is bad, the task isn't going to survive
+	 * for long no matter what we do.
+	 */
+
+	ASM_CLAC			/* undo STAC */
+	movq	$-EFAULT, RAX(%rsp)	/* return -EFAULT if possible */
+
+	/* Fill in the rest of pt_regs */
+	xorl	%eax, %eax
+	movq	%rax, R11(%rsp)
+	movq	%rax, R10(%rsp)
+	movq	%rax, R9(%rsp)
+	movq	%rax, R8(%rsp)
+	SAVE_EXTRA_REGS
+
+	/* Turn IRQs back off. */
+	DISABLE_INTERRUPTS(CLBR_NONE)
+	TRACE_IRQS_OFF
+
+	/* And exit again. */
+	jmp retint_user
+
 ia32_ret_from_sys_call:
 	xorl	%eax, %eax		/* Do not leak kernel information */
 	movq	%rax, R11(%rsp)
-- 
2.4.3