From: Andy Lutomirski <luto@kernel.org>
To: x86@kernel.org, linux-kernel@vger.kernel.org
Cc: Borislav Petkov <bp@alien8.de>, Andy Lutomirski <luto@kernel.org>,
Pedro Alves <palves@redhat.com>, Oleg Nesterov <oleg@redhat.com>,
Kees Cook <keescook@chromium.org>
Subject: [PATCH v3 3/3] x86/ptrace, x86/signal: Remove TS_I386_REGS_POKED
Date: Mon, 20 Jun 2016 16:39:54 -0700 [thread overview]
Message-ID: <9c5c3fd519dcc2e4596ecb074e1f8967f83080ef.1466464928.git.luto@kernel.org> (raw)
In-Reply-To: <cover.1466464928.git.luto@kernel.org>
In-Reply-To: <cover.1466464928.git.luto@kernel.org>
System call restart has some oddities wrt ptrace:
1. For whatever reason, the kernel delivers signals and triggers
ptrace before handling syscall restart. This means that
-ERESTART_RESTARTBLOCK, etc is visible to userspace. We could
plausibly get away with changing that, but it seems quite risky.
2. As a result of (1), gdb (quite reasonably) expects that it can
snapshot user state on signal delivery, adjust regs to call a
function, and then restore user state.
3. Presumably as a result of (2), we do syscall restart if indicated
by the register state on ptrace resume even if we're *not* resuming
a syscall.
4. Also as a result of (1), gdb expects that writing -1 to orig_eax
via POKEUSER or similar will *disable* syscall restart, which is
necessary to get function calling on syscall exit to work.
The combination of (1) and (4) means that, if we have a 32-bit tracer,
we need to skip syscall restart if orig_eax == -1 (in a 32-bit signed
sense). The combination of (1) and (2) means that, if we have a
32-bit tracer, we need to enable syscall restart if orig_eax > 0 (in a
32-bit signed sense) and eax contains a -ERESTART* code (again in a
signed sense).
The current state of affairs is a mess. Setting a temporary per-task
flag when ptrace changes orig_eax is messy. It does the wrong thing
when ptrace only writes eax. It's also seriously overcomplicated IMO.
Instead, just unconditionally sign-extending them in the ptrace code
and not worrying about ptrace in the signal handling code.
Cc: Pedro Alves <palves@redhat.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
---
arch/x86/entry/common.c | 6 +-----
arch/x86/include/asm/syscall.h | 2 +-
arch/x86/include/asm/thread_info.h | 3 ---
arch/x86/kernel/ptrace.c | 37 +++++++++++++++++++++----------------
4 files changed, 23 insertions(+), 25 deletions(-)
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 0db497a8ff19..ec138e538c44 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -270,12 +270,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
* handling, because syscall restart has a fixup for compat
* syscalls. The fixup is exercised by the ptrace_syscall_32
* selftest.
- *
- * We also need to clear TS_REGS_POKED_I386: the 32-bit tracer
- * special case only applies after poking regs and before the
- * very next return to user mode.
*/
- ti->status &= ~(TS_COMPAT|TS_I386_REGS_POKED);
+ ti->status &= ~TS_COMPAT;
#endif
user_enter();
diff --git a/arch/x86/include/asm/syscall.h b/arch/x86/include/asm/syscall.h
index 4e23dd15c661..4216bb7cbcba 100644
--- a/arch/x86/include/asm/syscall.h
+++ b/arch/x86/include/asm/syscall.h
@@ -60,7 +60,7 @@ static inline long syscall_get_error(struct task_struct *task,
* TS_COMPAT is set for 32-bit syscall entries and then
* remains set until we return to user mode.
*/
- if (task_thread_info(task)->status & (TS_COMPAT|TS_I386_REGS_POKED))
+ if (task_thread_info(task)->status & TS_COMPAT)
/*
* Sign-extend the value so (int)-EFOO becomes (long)-EFOO
* and will match correctly in comparisons.
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 4bca518d11f4..30c133ac05cd 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -228,9 +228,6 @@ static inline unsigned long current_stack_pointer(void)
* have to worry about atomic accesses.
*/
#define TS_COMPAT 0x0002 /* 32bit syscall active (64BIT)*/
-#ifdef CONFIG_COMPAT
-#define TS_I386_REGS_POKED 0x0004 /* regs poked by 32-bit ptracer */
-#endif
#define TS_RESTORE_SIGMASK 0x0008 /* restore signal mask in do_signal() */
#ifndef __ASSEMBLY__
diff --git a/arch/x86/kernel/ptrace.c b/arch/x86/kernel/ptrace.c
index f79576a541ff..c95aba795f88 100644
--- a/arch/x86/kernel/ptrace.c
+++ b/arch/x86/kernel/ptrace.c
@@ -891,6 +891,10 @@ long arch_ptrace(struct task_struct *child, long request,
case offsetof(struct user32, regs.l): \
regs->q = value; break
+#define R32_SIGNED(l,q) \
+ case offsetof(struct user32, regs.l): \
+ regs->q = (long)(s32)value; break
+
#define SEG32(rs) \
case offsetof(struct user32, regs.rs): \
return set_segment_reg(child, \
@@ -917,25 +921,26 @@ static int putreg32(struct task_struct *child, unsigned regno, u32 value)
R32(edi, di);
R32(esi, si);
R32(ebp, bp);
- R32(eax, ax);
R32(eip, ip);
R32(esp, sp);
- case offsetof(struct user32, regs.orig_eax):
- /*
- * Warning: bizarre corner case fixup here. A 32-bit
- * debugger setting orig_eax to -1 wants to disable
- * syscall restart. Make sure that the syscall
- * restart code sign-extends orig_ax. Also make sure
- * we interpret the -ERESTART* codes correctly if
- * loaded into regs->ax in case the task is not
- * actually still sitting at the exit from a 32-bit
- * syscall with TS_COMPAT still set.
- */
- regs->orig_ax = value;
- if (syscall_get_nr(child, regs) >= 0)
- task_thread_info(child)->status |= TS_I386_REGS_POKED;
- break;
+ /*
+ * A 32-bit ptracer has the following expectations:
+ *
+ * - Storing -1 (i.e. 0xffffffff) to orig_eax will prevent
+ * syscall restart handling.
+ *
+ * - Restoring regs saved on exit from an interrupted
+ * restartable syscall will trigger syscall restart. Such
+ * regs will have non-negative orig_eax and negative eax.
+ *
+ * The kernel's syscall restart code treats regs->orig_ax and
+ * regs->ax as 64-bit signed quantities. 32-bit user code
+ * doesn't care about the high bits. Keep it simple and just
+ * sign-extend both values.
+ */
+ R32_SIGNED(orig_eax, orig_ax);
+ R32_SIGNED(eax, ax);
case offsetof(struct user32, regs.eflags):
return set_flags(child, value);
--
2.5.5
next prev parent reply other threads:[~2016-06-21 0:05 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-06-20 23:39 [PATCH v3 0/3] ptrace-vs-syscall-restart fixes, v3 Andy Lutomirski
2016-06-20 23:39 ` [PATCH v3 1/3] x86/ptrace: Stop setting TS_COMPAT in ptrace code Andy Lutomirski
2016-06-22 22:13 ` Oleg Nesterov
2016-07-24 18:47 ` Andy Lutomirski
2016-07-25 6:38 ` Ingo Molnar
2016-07-25 16:38 ` Oleg Nesterov
2016-07-25 16:57 ` Andy Lutomirski
2016-07-26 0:21 ` Andy Lutomirski
2016-06-20 23:39 ` [PATCH v3 2/3] x86/signal: Rewire the restart_block() syscall to have a constant nr Andy Lutomirski
2016-06-21 12:39 ` Pedro Alves
2016-06-21 16:32 ` Andy Lutomirski
2016-06-22 12:00 ` Pedro Alves
2016-06-22 15:20 ` Andy Lutomirski
2016-06-23 21:21 ` Oleg Nesterov
2016-06-20 23:39 ` Andy Lutomirski [this message]
2016-06-23 21:26 ` [PATCH v3 3/3] x86/ptrace, x86/signal: Remove TS_I386_REGS_POKED Oleg Nesterov
2016-06-23 21:53 ` Andy Lutomirski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9c5c3fd519dcc2e4596ecb074e1f8967f83080ef.1466464928.git.luto@kernel.org \
--to=luto@kernel.org \
--cc=bp@alien8.de \
--cc=keescook@chromium.org \
--cc=linux-kernel@vger.kernel.org \
--cc=oleg@redhat.com \
--cc=palves@redhat.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).