linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame
@ 2018-06-26 21:16 Mathieu Desnoyers
  2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2018-06-26 21:16 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner
  Cc: linux-kernel, Mathieu Desnoyers, Joel Fernandes, Peter Zijlstra,
	Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen,
	H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter,
	Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng,
	Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch,
	x86, Andrew Morton, Linus Torvalds

x86 is moving from is_compat_task() to in_compat_syscall(). However,
in_compat_syscall cannot be used to check whether a signal is being
delivered on a compat task.

Introduce is_compat_frame to allow performing this check from
architecture agnostic code. On all architectures except x86, it
invokes is_compat_task(). On x86, it uses is_ia32_frame() and
is_x32_frame() to check whether the signal frame is 32-bit.

This is needed by restartable sequences to detect whether it needs
to clear the top bits of the start_ip, abort_ip, and post_commit_offset
rseq_cs fields on signal delivery, thus ensuring identical behavior
for a 32-bit binary executed on 32-bit and 64-bit kernels.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Hunter <ahh@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 arch/x86/include/asm/compat.h | 24 ++++++++++++++++++++++++
 arch/x86/kernel/signal.c      | 17 -----------------
 include/linux/compat.h        | 17 +++++++++++++++++
 3 files changed, 41 insertions(+), 17 deletions(-)

diff --git a/arch/x86/include/asm/compat.h b/arch/x86/include/asm/compat.h
index fb97cf7c4137..1405a8df5215 100644
--- a/arch/x86/include/asm/compat.h
+++ b/arch/x86/include/asm/compat.h
@@ -8,6 +8,7 @@
 #include <linux/types.h>
 #include <linux/sched.h>
 #include <linux/sched/task_stack.h>
+#include <linux/signal.h>
 #include <asm/processor.h>
 #include <asm/user32.h>
 #include <asm/unistd.h>
@@ -242,4 +243,27 @@ struct compat_siginfo;
 int __copy_siginfo_to_user32(struct compat_siginfo __user *to,
 		const siginfo_t *from, bool x32_ABI);
 
+static inline int is_ia32_compat_frame(struct ksignal *ksig)
+{
+	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
+		ksig->ka.sa.sa_flags & SA_IA32_ABI;
+}
+
+static inline int is_ia32_frame(struct ksignal *ksig)
+{
+	return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig);
+}
+
+static inline int is_x32_frame(struct ksignal *ksig)
+{
+	return IS_ENABLED(CONFIG_X86_X32_ABI) &&
+		ksig->ka.sa.sa_flags & SA_X32_ABI;
+}
+
+static inline bool is_compat_frame(struct ksignal *ksig)
+{
+	return is_ia32_frame(ksig) || is_x32_frame(ksig);
+}
+#define is_compat_frame	is_compat_frame		/* override the generic impl */
+
 #endif /* _ASM_X86_COMPAT_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 92a3b312a53c..cb488e3e952d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -664,23 +664,6 @@ SYSCALL_DEFINE0(rt_sigreturn)
 	return 0;
 }
 
-static inline int is_ia32_compat_frame(struct ksignal *ksig)
-{
-	return IS_ENABLED(CONFIG_IA32_EMULATION) &&
-		ksig->ka.sa.sa_flags & SA_IA32_ABI;
-}
-
-static inline int is_ia32_frame(struct ksignal *ksig)
-{
-	return IS_ENABLED(CONFIG_X86_32) || is_ia32_compat_frame(ksig);
-}
-
-static inline int is_x32_frame(struct ksignal *ksig)
-{
-	return IS_ENABLED(CONFIG_X86_X32_ABI) &&
-		ksig->ka.sa.sa_flags & SA_X32_ABI;
-}
-
 static int
 setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
 {
diff --git a/include/linux/compat.h b/include/linux/compat.h
index b1a5562b3215..2e1ffba65117 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -1022,6 +1022,19 @@ static inline struct compat_timeval ns_to_compat_timeval(s64 nsec)
 	return ctv;
 }
 
+/*
+ * For most but not all architectures, "is this a compat sigframe?" and
+ * "am I a compat task?" are the same question.  For architectures on which
+ * they aren't the same question, arch code can override is_compat_frame.
+ */
+
+#ifndef is_compat_frame
+static inline bool is_compat_frame(struct ksignal *ksig)
+{
+	return is_compat_task();
+}
+#endif
+
 #else /* !CONFIG_COMPAT */
 
 #define is_compat_task() (0)
@@ -1029,6 +1042,10 @@ static inline struct compat_timeval ns_to_compat_timeval(s64 nsec)
 static inline bool in_compat_syscall(void) { return false; }
 #endif
 
+#ifndef is_compat_frame
+static inline bool is_compat_frame(struct ksignal *ksig) { return false; }
+#endif
+
 #endif /* CONFIG_COMPAT */
 
 #endif /* _LINUX_COMPAT_H */
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields
  2018-06-26 21:16 [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame Mathieu Desnoyers
@ 2018-06-26 21:16 ` Mathieu Desnoyers
  2018-06-26 21:58   ` Andy Lutomirski
  0 siblings, 1 reply; 5+ messages in thread
From: Mathieu Desnoyers @ 2018-06-26 21:16 UTC (permalink / raw)
  To: Andy Lutomirski, Thomas Gleixner
  Cc: linux-kernel, Mathieu Desnoyers, Joel Fernandes, Peter Zijlstra,
	Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen,
	H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter,
	Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng,
	Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch,
	x86, Andrew Morton, Linus Torvalds

Make the behavior rseq on compat tasks more robust by ensuring that
kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of
rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset
when a 32-bit binary is run on a 64-bit kernel.

The intent here is that if user-space has garbage rather than zeroes
in its struct rseq_cs fields padding, the behavior will be the same
whether the binary is run on 32-bit or 64-bit kernels.

Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from
system call context, and use is_compat_frame() when invoked from
signal delivery.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Joel Fernandes <joelaf@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Watson <davejwatson@fb.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Chris Lameter <cl@linux.com>
Cc: Russell King <linux@arm.linux.org.uk>
Cc: Andrew Hunter <ahh@google.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: "Paul E . McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Paul Turner <pjt@google.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Josh Triplett <josh@joshtriplett.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Maurer <bmaurer@fb.com>
Cc: linux-api@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
---
 kernel/rseq.c | 28 +++++++++++++++++++++++-----
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/kernel/rseq.c b/kernel/rseq.c
index 22b6acf1ad63..7b1d51b965fc 100644
--- a/kernel/rseq.c
+++ b/kernel/rseq.c
@@ -13,6 +13,7 @@
 #include <linux/syscalls.h>
 #include <linux/rseq.h>
 #include <linux/types.h>
+#include <linux/compat.h>
 #include <asm/ptrace.h>
 
 #define CREATE_TRACE_POINTS
@@ -112,7 +113,23 @@ static int rseq_reset_rseq_cpu_id(struct task_struct *t)
 	return 0;
 }
 
-static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs)
+#ifdef CONFIG_COMPAT
+static void rseq_cs_compat(struct ksignal *ksig, struct rseq_cs *rseq_cs)
+{
+	if (!(ksig ? is_compat_frame(ksig) : in_compat_syscall()))
+		return;
+
+	rseq_cs->abort_ip = (compat_uptr_t) rseq_cs->abort_ip;
+	rseq_cs->start_ip = (compat_uptr_t) rseq_cs->start_ip;
+	rseq_cs->post_commit_offset =
+		(compat_uptr_t) rseq_cs->post_commit_offset;
+}
+#else
+static void rseq_cs_compat(struct ksignal *ksig, struct rseq_cs *rseq_cs) { }
+#endif
+
+static int rseq_get_rseq_cs(struct ksignal *ksig, struct task_struct *t,
+			    struct rseq_cs *rseq_cs)
 {
 	struct rseq_cs __user *urseq_cs;
 	unsigned long ptr;
@@ -132,6 +149,7 @@ static int rseq_get_rseq_cs(struct task_struct *t, struct rseq_cs *rseq_cs)
 		return -EFAULT;
 	if (rseq_cs->version > 0)
 		return -EINVAL;
+	rseq_cs_compat(ksig, rseq_cs);
 
 	/* Ensure that abort_ip is not in the critical section. */
 	if (rseq_cs->abort_ip - rseq_cs->start_ip < rseq_cs->post_commit_offset)
@@ -209,14 +227,14 @@ static bool in_rseq_cs(unsigned long ip, struct rseq_cs *rseq_cs)
 	return ip - rseq_cs->start_ip < rseq_cs->post_commit_offset;
 }
 
-static int rseq_ip_fixup(struct pt_regs *regs)
+static int rseq_ip_fixup(struct ksignal *ksig, struct pt_regs *regs)
 {
 	unsigned long ip = instruction_pointer(regs);
 	struct task_struct *t = current;
 	struct rseq_cs rseq_cs;
 	int ret;
 
-	ret = rseq_get_rseq_cs(t, &rseq_cs);
+	ret = rseq_get_rseq_cs(ksig, t, &rseq_cs);
 	if (ret)
 		return ret;
 
@@ -260,7 +278,7 @@ void __rseq_handle_notify_resume(struct ksignal *ksig, struct pt_regs *regs)
 		return;
 	if (unlikely(!access_ok(VERIFY_WRITE, t->rseq, sizeof(*t->rseq))))
 		goto error;
-	ret = rseq_ip_fixup(regs);
+	ret = rseq_ip_fixup(ksig, regs);
 	if (unlikely(ret < 0))
 		goto error;
 	if (unlikely(rseq_update_cpu_id(t)))
@@ -287,7 +305,7 @@ void rseq_syscall(struct pt_regs *regs)
 	if (!t->rseq)
 		return;
 	if (!access_ok(VERIFY_READ, t->rseq, sizeof(*t->rseq)) ||
-	    rseq_get_rseq_cs(t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
+	    rseq_get_rseq_cs(NULL, t, &rseq_cs) || in_rseq_cs(ip, &rseq_cs))
 		force_sig(SIGSEGV, t);
 }
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields
  2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers
@ 2018-06-26 21:58   ` Andy Lutomirski
  2018-06-26 22:17     ` Mathieu Desnoyers
  2018-06-28  8:04     ` Thomas Gleixner
  0 siblings, 2 replies; 5+ messages in thread
From: Andy Lutomirski @ 2018-06-26 21:58 UTC (permalink / raw)
  To: Mathieu Desnoyers
  Cc: Thomas Gleixner, linux-kernel, Joel Fernandes, Peter Zijlstra,
	Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen,
	H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter,
	Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng,
	Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch,
	x86, Andrew Morton, Linus Torvalds



> On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> 
> Make the behavior rseq on compat tasks more robust by ensuring that
> kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of
> rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset
> when a 32-bit binary is run on a 64-bit kernel.
> 
> The intent here is that if user-space has garbage rather than zeroes
> in its struct rseq_cs fields padding, the behavior will be the same
> whether the binary is run on 32-bit or 64-bit kernels.
> 
> Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from
> system call context, and use is_compat_frame() when invoked from
> signal delivery.
> 

And when it’s invoked due to preemption unrelated to a syscall or signal, you malfunction?

I think the only sane solution is to make these fields be u64, delete the LINUX_FIELD_ macros, and possibly teach the x86 slowpath return to inject a signal if it’s trying to return to a 32-bit context with garbage in the high bits of regs->ip so that we determistically fail if the user screws up.

Rseq is brand new. It should not need compat code at all.

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields
  2018-06-26 21:58   ` Andy Lutomirski
@ 2018-06-26 22:17     ` Mathieu Desnoyers
  2018-06-28  8:04     ` Thomas Gleixner
  1 sibling, 0 replies; 5+ messages in thread
From: Mathieu Desnoyers @ 2018-06-26 22:17 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Thomas Gleixner, linux-kernel, Joel Fernandes, Peter Zijlstra,
	Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen,
	H. Peter Anvin, Chris Lameter, Russell King, Andrew Hunter,
	Michael Kerrisk, Paul E. McKenney, Paul Turner, Boqun Feng,
	Josh Triplett, rostedt, Ben Maurer, linux-api, linux-arch, x86,
	Andrew Morton, Linus Torvalds

----- On Jun 26, 2018, at 5:58 PM, Andy Lutomirski luto@amacapital.net wrote:

>> On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> wrote:
>> 
>> Make the behavior rseq on compat tasks more robust by ensuring that
>> kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of
>> rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset
>> when a 32-bit binary is run on a 64-bit kernel.
>> 
>> The intent here is that if user-space has garbage rather than zeroes
>> in its struct rseq_cs fields padding, the behavior will be the same
>> whether the binary is run on 32-bit or 64-bit kernels.
>> 
>> Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from
>> system call context, and use is_compat_frame() when invoked from
>> signal delivery.
>> 
> 
> And when it’s invoked due to preemption unrelated to a syscall or signal, you
> malfunction?

Fair point! Hence the "RFC". ;)

So I understand better your intent to use the pt_regs to figure out whether it
is compat or not. My is_compat_frame()+in_compat_syscall() approach does not
handle this correctly.

> 
> I think the only sane solution is to make these fields be u64,

I'm OK with turning the rseq_cs start_ip, post_commit_offset, and abort_ip
fields into normal u64.

> delete the
> LINUX_FIELD_ macros,

The LINUX_FIELD_ macros are still needed to ensure single-copy updates of
the (struct rseq *__tls_abi)->rseq_cs pointer by 32-bit user-space.

> and possibly teach the x86 slowpath return to inject a
> signal if it’s trying to return to a 32-bit context with garbage in the high
> bits of regs->ip so that we determistically fail if the user screws up.

I like the approach of dealing with the rseq_cs fields as u64 even on 32-bit
architectures. As a downside, it will require 32-bit architectures to do
arithmetic on 64-bit values, but it's not a fast-path. As you point out, the
tricky bit is to decide what happens when architecture code returns to
userspace with regs->ip containing garbage in the high bits.

An alternative approach is to ensure the high bits are cleared when returning
to an IP with garbage in the high bits.

> Rseq is brand new. It should not need compat code at all.

Dealing with u64 for start_ip, post_commit_offset, and abort_ip at the kernel
level would indeed provide this characteristic. However, I'm uneasy adding
64-bit arithmetic on operations really caring about 32 bits on 32-bit archs,
even though those are not fast paths.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields
  2018-06-26 21:58   ` Andy Lutomirski
  2018-06-26 22:17     ` Mathieu Desnoyers
@ 2018-06-28  8:04     ` Thomas Gleixner
  1 sibling, 0 replies; 5+ messages in thread
From: Thomas Gleixner @ 2018-06-28  8:04 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Mathieu Desnoyers, linux-kernel, Joel Fernandes, Peter Zijlstra,
	Catalin Marinas, Dave Watson, Will Deacon, Andi Kleen,
	H . Peter Anvin, Chris Lameter, Russell King, Andrew Hunter,
	Michael Kerrisk, Paul E . McKenney, Paul Turner, Boqun Feng,
	Josh Triplett, Steven Rostedt, Ben Maurer, linux-api, linux-arch,
	x86, Andrew Morton, Linus Torvalds

[-- Attachment #1: Type: text/plain, Size: 1434 bytes --]

On Tue, 26 Jun 2018, Andy Lutomirski wrote:
> > On Jun 26, 2018, at 2:16 PM, Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:
> > 
> > Make the behavior rseq on compat tasks more robust by ensuring that
> > kernel/rseq.c:rseq_get_rseq_cs() clears the high bits of
> > rseq_cs->abort_ip, rseq_cs->start_ip and rseq_cs->post_commit_offset
> > when a 32-bit binary is run on a 64-bit kernel.
> > 
> > The intent here is that if user-space has garbage rather than zeroes
> > in its struct rseq_cs fields padding, the behavior will be the same
> > whether the binary is run on 32-bit or 64-bit kernels.
> > 
> > Use in_compat_syscall() when rseq_get_rseq_cs() is invoked from
> > system call context, and use is_compat_frame() when invoked from
> > signal delivery.
> > 
> 
> And when it’s invoked due to preemption unrelated to a syscall or signal,
> you malfunction?
>
> I think the only sane solution is to make these fields be u64, delete the
> LINUX_FIELD_ macros, and possibly teach the x86 slowpath return to inject
> a signal if it’s trying to return to a 32-bit context with garbage in the
> high bits of regs->ip so that we determistically fail if the user screws
> up.

Right. That's the only sane solution. Trying to play games with 32/64bit
for a dubious value is going to bite us within no time and just create ugly
workarounds left and right. Forcing a clear handling upfront avoids all of
that.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2018-06-28  8:06 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-06-26 21:16 [RFC PATCH for 4.18 1/2] compat: Introduce is_compat_frame Mathieu Desnoyers
2018-06-26 21:16 ` [RFC PATCH for 4.18 2/2] rseq: compat: clear high bits of rseq_cs fields Mathieu Desnoyers
2018-06-26 21:58   ` Andy Lutomirski
2018-06-26 22:17     ` Mathieu Desnoyers
2018-06-28  8:04     ` Thomas Gleixner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).