All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter()
@ 2015-07-15  7:37 Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO Michael Ellerman
                   ` (11 more replies)
  0 siblings, 12 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

To call do_syscall_trace_enter() we need pt_regs in r3, but we don't need
to recalculate it based on r1, it's already in r9.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/entry_64.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 579e0f9a2d57..0796c487d3db 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -243,7 +243,9 @@ syscall_error:
 /* Traced system call support */
 syscall_dotrace:
 	bl	save_nvgprs
-	addi	r3,r1,STACK_FRAME_OVERHEAD
+
+	/* Get pt_regs into r3 */
+	mr	r3, r9
 	bl	do_syscall_trace_enter
 	/*
 	 * Restore argument registers possibly just changed.
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-16 22:42   ` Benjamin Herrenschmidt
  2015-07-15  7:37 ` [RFC PATCH 03/12] powerpc/kernel: Change the do_syscall_trace_enter() API Michael Ellerman
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

Currently on powerpc we have our own #define for the highest (negative)
errno value, called _LAST_ERRNO. This is defined to be 516, for reasons
which are not clear.

The generic code, and x86, use MAX_ERRNO, which is defined to be 4095.

In particular seccomp uses MAX_ERRNO to restrict the value that a
seccomp filter can return.

Currently with the mismatch between _LAST_ERRNO and MAX_ERRNO, a seccomp
tracer wanting to return 600, expecting it to be seen as an error, would
instead find on powerpc that userspace sees a successful syscall with a
return value of 600.

To avoid this inconsistency, switch powerpc to use MAX_ERRNO.

We are somewhat confident that generic syscalls that can return a
non-error value above negative MAX_ERRNO have already been updated to
use force_successful_syscall_return().

I have also checked all the powerpc specific syscalls, and believe that
none of them expect to return a non-error value between -MAX_ERRNO and
-516. So this change should be safe ...

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/uapi/asm/errno.h | 2 --
 arch/powerpc/kernel/entry_32.S        | 3 ++-
 arch/powerpc/kernel/entry_64.S        | 5 +++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/errno.h b/arch/powerpc/include/uapi/asm/errno.h
index 8c145fd17d86..e8b6b5f7de7c 100644
--- a/arch/powerpc/include/uapi/asm/errno.h
+++ b/arch/powerpc/include/uapi/asm/errno.h
@@ -6,6 +6,4 @@
 #undef	EDEADLOCK
 #define	EDEADLOCK	58	/* File locking deadlock error */
 
-#define _LAST_ERRNO	516
-
 #endif	/* _ASM_POWERPC_ERRNO_H */
diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 46fc0f4d8982..67ecdf61f4e3 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -20,6 +20,7 @@
  */
 
 #include <linux/errno.h>
+#include <linux/err.h>
 #include <linux/sys.h>
 #include <linux/threads.h>
 #include <asm/reg.h>
@@ -354,7 +355,7 @@ ret_from_syscall:
 	SYNC
 	MTMSRD(r10)
 	lwz	r9,TI_FLAGS(r12)
-	li	r8,-_LAST_ERRNO
+	li	r8,-MAX_ERRNO
 	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
 	bne-	syscall_exit_work
 	cmplw	0,r3,r8
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 0796c487d3db..8292581a42f1 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -19,6 +19,7 @@
  */
 
 #include <linux/errno.h>
+#include <linux/err.h>
 #include <asm/unistd.h>
 #include <asm/processor.h>
 #include <asm/page.h>
@@ -207,7 +208,7 @@ system_call:			/* label this so stack traces look sane */
 #endif /* CONFIG_PPC_BOOK3E */
 
 	ld	r9,TI_FLAGS(r12)
-	li	r11,-_LAST_ERRNO
+	li	r11,-MAX_ERRNO
 	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
 	bne-	syscall_exit_work
 	cmpld	r3,r11
@@ -279,7 +280,7 @@ syscall_exit_work:
 	beq+	0f
 	REST_NVGPRS(r1)
 	b	2f
-0:	cmpld	r3,r11		/* r10 is -LAST_ERRNO */
+0:	cmpld	r3,r11		/* r11 is -MAX_ERRNO */
 	blt+	1f
 	andi.	r0,r9,_TIF_NOERROR
 	bne-	1f
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 03/12] powerpc/kernel: Change the do_syscall_trace_enter() API
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 04/12] powerpc: Drop unused syscall_get_error() Michael Ellerman
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

The API for calling do_syscall_trace_enter() is currently sensible
enough, it just returns the (modified) syscall number.

However once we enable seccomp filter it will get more complicated. When
seccomp filter runs, the seccomp kernel code (via SECCOMP_RET_ERRNO), or
a ptracer (via SECCOMP_RET_TRACE), may reject the syscall and *may* or may
*not* set a return value in r3.

That means the assembler that calls do_syscall_trace_enter() can not
blindly return ENOSYS, it needs to only return ENOSYS if a return value
has not already been set.

There is no way to implement that logic with the current API. So change
the do_syscall_trace_enter() API to make it deal with the return code
juggling, and the assembler can then just return whatever return code it
is given.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/kernel/entry_32.S |  4 ++++
 arch/powerpc/kernel/entry_64.S | 23 ++++++++++++++------
 arch/powerpc/kernel/ptrace.c   | 48 ++++++++++++++++++++++++++++++++----------
 3 files changed, 58 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 67ecdf61f4e3..2405631e91a2 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -458,6 +458,10 @@ syscall_dotrace:
 	lwz	r7,GPR7(r1)
 	lwz	r8,GPR8(r1)
 	REST_NVGPRS(r1)
+
+	cmplwi	r0,NR_syscalls
+	/* Return code is already in r3 thanks to do_syscall_trace_enter() */
+	bge-	ret_from_syscall
 	b	syscall_dotrace_cont
 
 syscall_exit_work:
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 8292581a42f1..41f8fcf615d7 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -151,8 +151,7 @@ END_FW_FTR_SECTION_IFSET(FW_FEATURE_SPLPAR)
 	CURRENT_THREAD_INFO(r11, r1)
 	ld	r10,TI_FLAGS(r11)
 	andi.	r11,r10,_TIF_SYSCALL_DOTRACE
-	bne	syscall_dotrace
-.Lsyscall_dotrace_cont:
+	bne	syscall_dotrace		/* does not return */
 	cmpldi	0,r0,NR_syscalls
 	bge-	syscall_enosys
 
@@ -248,22 +247,34 @@ syscall_dotrace:
 	/* Get pt_regs into r3 */
 	mr	r3, r9
 	bl	do_syscall_trace_enter
+
 	/*
-	 * Restore argument registers possibly just changed.
-	 * We use the return value of do_syscall_trace_enter
-	 * for the call number to look up in the table (r0).
+	 * We use the return value of do_syscall_trace_enter() as the syscall
+	 * number. If the syscall was rejected for any reason do_syscall_trace_enter()
+	 * returns an invalid syscall number and the test below against
+	 * NR_syscalls will fail.
 	 */
 	mr	r0,r3
+
+	/* Restore argument registers just clobbered and/or possibly changed. */
 	ld	r3,GPR3(r1)
 	ld	r4,GPR4(r1)
 	ld	r5,GPR5(r1)
 	ld	r6,GPR6(r1)
 	ld	r7,GPR7(r1)
 	ld	r8,GPR8(r1)
+
+	/* Repopulate r9 and r10 for the system_call path */
 	addi	r9,r1,STACK_FRAME_OVERHEAD
 	CURRENT_THREAD_INFO(r10, r1)
 	ld	r10,TI_FLAGS(r10)
-	b	.Lsyscall_dotrace_cont
+
+	cmpldi	r0,NR_syscalls
+	blt+	system_call
+
+	/* Return code is already in r3 thanks to do_syscall_trace_enter() */
+	b	.Lsyscall_exit
+
 
 syscall_enosys:
 	li	r3,-ENOSYS
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index f21897b42057..7484221bb3f8 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1762,26 +1762,42 @@ long arch_ptrace(struct task_struct *child, long request,
 	return ret;
 }
 
-/*
- * We must return the syscall number to actually look up in the table.
- * This can be -1L to skip running any syscall at all.
+/**
+ * do_syscall_trace_enter() - Do syscall tracing on kernel entry.
+ * @regs: the pt_regs of the task to trace (current)
+ *
+ * Performs various types of tracing on syscall entry. This includes seccomp,
+ * ptrace, syscall tracepoints and audit.
+ *
+ * The pt_regs are potentially visible to userspace via ptrace, so their
+ * contents is ABI.
+ *
+ * One or more of the tracers may modify the contents of pt_regs, in particular
+ * to modify arguments or even the syscall number itself.
+ *
+ * It's also possible that a tracer can choose to reject the system call. In
+ * that case this function will return an illegal syscall number, and will put
+ * an appropriate return value in regs->r3.
+ *
+ * Return: the (possibly changed) syscall number.
  */
 long do_syscall_trace_enter(struct pt_regs *regs)
 {
-	long ret = 0;
+	bool abort = false;
 
 	user_exit();
 
 	secure_computing_strict(regs->gpr[0]);
 
-	if (test_thread_flag(TIF_SYSCALL_TRACE) &&
-	    tracehook_report_syscall_entry(regs))
+	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
 		/*
-		 * Tracing decided this syscall should not happen.
-		 * We'll return a bogus call number to get an ENOSYS
-		 * error, but leave the original number in regs->gpr[0].
+		 * The tracer may decide to abort the syscall, if so tracehook
+		 * will return !0. Note that the tracer may also just change
+		 * regs->gpr[0] to an invalid syscall number, that is handled
+		 * below on the exit path.
 		 */
-		ret = -1L;
+		abort = tracehook_report_syscall_entry(regs) != 0;
+	}
 
 	if (unlikely(test_thread_flag(TIF_SYSCALL_TRACEPOINT)))
 		trace_sys_enter(regs, regs->gpr[0]);
@@ -1798,7 +1814,17 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 				    regs->gpr[5] & 0xffffffff,
 				    regs->gpr[6] & 0xffffffff);
 
-	return ret ?: regs->gpr[0];
+	if (abort || regs->gpr[0] >= NR_syscalls) {
+		/*
+		 * If we are aborting explicitly, or if the syscall number is
+		 * now invalid, set the return value to -ENOSYS.
+		 */
+		regs->gpr[3] = -ENOSYS;
+		return -1;
+	}
+
+	/* Return the possibly modified but valid syscall number */
+	return regs->gpr[0];
 }
 
 void do_syscall_trace_leave(struct pt_regs *regs)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 04/12] powerpc: Drop unused syscall_get_error()
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 03/12] powerpc/kernel: Change the do_syscall_trace_enter() API Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 05/12] powerpc: Don't negate error in syscall_set_return_value() Michael Ellerman
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

syscall_get_error() is unused, and never has been.

It's also probably wrong, as it negates r3 before returning it, but that
depends on what the caller is expecting.

It also doesn't deal with compat, and doesn't deal with TIF_NOERROR.

Although we could fix those, until it has a caller and it's clear what
semantics the caller wants it's just untested code. So drop it.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/syscall.h | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index ff21b7a2f0cc..c6239dabcfb1 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -34,12 +34,6 @@ static inline void syscall_rollback(struct task_struct *task,
 	regs->gpr[3] = regs->orig_gpr3;
 }
 
-static inline long syscall_get_error(struct task_struct *task,
-				     struct pt_regs *regs)
-{
-	return (regs->ccr & 0x10000000) ? -regs->gpr[3] : 0;
-}
-
 static inline long syscall_get_return_value(struct task_struct *task,
 					    struct pt_regs *regs)
 {
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 05/12] powerpc: Don't negate error in syscall_set_return_value()
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (2 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 04/12] powerpc: Drop unused syscall_get_error() Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 06/12] powerpc: Rework syscall_get_arguments() so there is only one loop Michael Ellerman
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

Currently the only caller of syscall_set_return_value() is seccomp
filter, which is not enabled on powerpc.

This means we have not noticed that our implementation of
syscall_set_return_value() negates error, even though the value passed
in is already negative.

So remove the negation in syscall_set_return_value(), and expect the
caller to do it like all other implementations do.

Also add a comment about the ccr handling.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/syscall.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index c6239dabcfb1..cabe90133e69 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -44,9 +44,15 @@ static inline void syscall_set_return_value(struct task_struct *task,
 					    struct pt_regs *regs,
 					    int error, long val)
 {
+	/*
+	 * In the general case it's not obvious that we must deal with CCR
+	 * here, as the syscall exit path will also do that for us. However
+	 * there are some places, eg. the signal code, which check ccr to
+	 * decide if the value in r3 is actually an error.
+	 */
 	if (error) {
 		regs->ccr |= 0x10000000L;
-		regs->gpr[3] = -error;
+		regs->gpr[3] = error;
 	} else {
 		regs->ccr &= ~0x10000000L;
 		regs->gpr[3] = val;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 06/12] powerpc: Rework syscall_get_arguments() so there is only one loop
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (3 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 05/12] powerpc: Don't negate error in syscall_set_return_value() Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 07/12] powerpc: Use orig_gpr3 in syscall_get_arguments() Michael Ellerman
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

Currently syscall_get_arguments() has two loops, one for compat and one
for regular tasks. In prepartion for the next patch, which changes which
registers we use, switch it to only have one loop, so we only have one
place to update.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/syscall.h | 19 ++++++++-----------
 1 file changed, 8 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index cabe90133e69..403e2303fe18 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -64,19 +64,16 @@ static inline void syscall_get_arguments(struct task_struct *task,
 					 unsigned int i, unsigned int n,
 					 unsigned long *args)
 {
+	unsigned long mask = -1UL;
+
 	BUG_ON(i + n > 6);
-#ifdef CONFIG_PPC64
-	if (test_tsk_thread_flag(task, TIF_32BIT)) {
-		/*
-		 * Zero-extend 32-bit argument values.  The high bits are
-		 * garbage ignored by the actual syscall dispatch.
-		 */
-		while (n-- > 0)
-			args[n] = (u32) regs->gpr[3 + i + n];
-		return;
-	}
+
+#ifdef CONFIG_COMPAT
+	if (test_tsk_thread_flag(task, TIF_32BIT))
+		mask = 0xffffffff;
 #endif
-	memcpy(args, &regs->gpr[3 + i], n * sizeof(args[0]));
+	while (n--)
+		args[n] = regs->gpr[3 + i + n] & mask;
 }
 
 static inline void syscall_set_arguments(struct task_struct *task,
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 07/12] powerpc: Use orig_gpr3 in syscall_get_arguments()
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (4 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 06/12] powerpc: Rework syscall_get_arguments() so there is only one loop Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 08/12] powerpc: Change syscall_get_nr() to return int Michael Ellerman
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

Currently syscall_get_arguments() is used by syscall tracepoints, and
collect_syscall() which is used in some debugging as well as
/proc/pid/syscall.

The current implementation just copies regs->gpr[3 .. 5] out, which is
fine for all the current use cases.

When we enable seccomp filter, that will also start using
syscall_get_arguments(). However for seccomp filter we want to use r3
as the return value of the syscall, and orig_gpr3 as the first
parameter. This will allow seccomp to modify the return value in r3.

To support this we need to modify syscall_get_arguments() to return
orig_gpr3 instead of r3. This is safe for all uses because orig_gpr3
always contains the r3 value that was passed to the syscall. We store it
in the syscall entry path and never modify it.

Update syscall_set_arguments() while we're here, even though it's never
used.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/syscall.h | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index 403e2303fe18..8d79a87c0511 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -64,7 +64,7 @@ static inline void syscall_get_arguments(struct task_struct *task,
 					 unsigned int i, unsigned int n,
 					 unsigned long *args)
 {
-	unsigned long mask = -1UL;
+	unsigned long val, mask = -1UL;
 
 	BUG_ON(i + n > 6);
 
@@ -72,8 +72,14 @@ static inline void syscall_get_arguments(struct task_struct *task,
 	if (test_tsk_thread_flag(task, TIF_32BIT))
 		mask = 0xffffffff;
 #endif
-	while (n--)
-		args[n] = regs->gpr[3 + i + n] & mask;
+	while (n--) {
+		if (n == 0 && i == 0)
+			val = regs->orig_gpr3;
+		else
+			val = regs->gpr[3 + i + n];
+
+		args[n] = val & mask;
+	}
 }
 
 static inline void syscall_set_arguments(struct task_struct *task,
@@ -83,6 +89,10 @@ static inline void syscall_set_arguments(struct task_struct *task,
 {
 	BUG_ON(i + n > 6);
 	memcpy(&regs->gpr[3 + i], args, n * sizeof(args[0]));
+
+	/* Also copy the first argument into orig_gpr3 */
+	if (i == 0 && n > 0)
+		regs->orig_gpr3 = args[0];
 }
 
 static inline int syscall_get_arch(void)
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 08/12] powerpc: Change syscall_get_nr() to return int
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (5 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 07/12] powerpc: Use orig_gpr3 in syscall_get_arguments() Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks Michael Ellerman
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

The documentation for syscall_get_nr() in asm-generic says:

 Note this returns int even on 64-bit machines. Only 32 bits of
 system call number can be meaningful. If the actual arch value
 is 64 bits, this truncates to 32 bits so 0xffffffff means -1.

However our implementation was never updated to reflect this.

Generally it's not important, but there is once case where it matters.

For seccomp filter with SECCOMP_RET_TRACE, the tracer will set
regs->gpr[0] to -1 to reject the syscall. When the task is a compat
task, this means we end up with 0xffffffff in r0 because ptrace will
zero extend the 32-bit value.

If syscall_get_nr() returns an unsigned long, then a 64-bit kernel will
see a positive value in r0 and will incorrectly allow the syscall
through seccomp.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/syscall.h | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index 8d79a87c0511..ab9f3f0a8637 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -22,10 +22,15 @@
 extern const unsigned long sys_call_table[];
 #endif /* CONFIG_FTRACE_SYSCALLS */
 
-static inline long syscall_get_nr(struct task_struct *task,
-				  struct pt_regs *regs)
+static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
 {
-	return TRAP(regs) == 0xc00 ? regs->gpr[0] : -1L;
+	/*
+	 * Note that we are returning an int here. That means 0xffffffff, ie.
+	 * 32-bit negative 1, will be interpreted as -1 on a 64-bit kernel.
+	 * This is important for seccomp so that compat tasks can set r0 = -1
+	 * to reject the syscall.
+	 */
+	return TRAP(regs) == 0xc00 ? regs->gpr[0] : -1;
 }
 
 static inline void syscall_rollback(struct task_struct *task,
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (6 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 08/12] powerpc: Change syscall_get_nr() to return int Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15 15:12   ` Kees Cook
  2015-07-15  7:37 ` [RFC PATCH 10/12] powerpc/kernel: Enable seccomp filter Michael Ellerman
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

SIG_SYS was added in commit a0727e8ce513 "signal, x86: add SIGSYS info
and make it synchronous."

Because we use the asm-generic struct siginfo, we got support for
SIG_SYS for free as part of that commit.

However there was no compat handling added for powerpc. That means we've
been advertising the existence of signfo._sifields._sigsys to compat
tasks, but not actually filling in the fields correctly.

Luckily it looks like no one has noticed, presumably because the only
user of SIGSYS in the kernel is seccomp filter, which we don't support
yet.

So before we enable seccomp filter, add compat handling for SIGSYS.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/include/asm/compat.h             | 7 +++++++
 arch/powerpc/kernel/signal_32.c               | 5 +++++
 tools/testing/selftests/seccomp/seccomp_bpf.c | 4 ++++
 3 files changed, 16 insertions(+)

diff --git a/arch/powerpc/include/asm/compat.h b/arch/powerpc/include/asm/compat.h
index b142b8e0ed9e..4f2df589ec1d 100644
--- a/arch/powerpc/include/asm/compat.h
+++ b/arch/powerpc/include/asm/compat.h
@@ -174,6 +174,13 @@ typedef struct compat_siginfo {
 			int _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
 			int _fd;
 		} _sigpoll;
+
+		/* SIGSYS */
+		struct {
+			unsigned int _call_addr; /* calling insn */
+			int _syscall;		 /* triggering system call number */
+			unsigned int _arch;	 /* AUDIT_ARCH_* of syscall */
+		} _sigsys;
 	} _sifields;
 } compat_siginfo_t;
 
diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
index d3a831ac0f92..77f97284124e 100644
--- a/arch/powerpc/kernel/signal_32.c
+++ b/arch/powerpc/kernel/signal_32.c
@@ -949,6 +949,11 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *d, const siginfo_t *s)
 		err |= __put_user(s->si_overrun, &d->si_overrun);
 		err |= __put_user(s->si_int, &d->si_int);
 		break;
+	case __SI_SYS >> 16:
+		err |= __put_user(ptr_to_compat(s->si_call_addr), &d->si_call_addr);
+		err |= __put_user(s->si_syscall, &d->si_syscall);
+		err |= __put_user(s->si_arch, &d->si_arch);
+		break;
 	case __SI_RT >> 16: /* This is not generated by the kernel as of now.  */
 	case __SI_MESGQ >> 16:
 		err |= __put_user(s->si_int, &d->si_int);
diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index c5abe7fd7590..b2374c131340 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
 static volatile int TRAP_nr;
 static void TRAP_action(int nr, siginfo_t *info, void *void_context)
 {
+	fprintf(stderr, "in TRAP_action\n");
+	fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
+	fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
+	fprintf(stderr, "info->si_arch %u\n", info->si_arch);
 	memcpy(&TRAP_info, info, sizeof(TRAP_info));
 	TRAP_nr = nr;
 }
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 10/12] powerpc/kernel: Enable seccomp filter
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (7 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15  7:37 ` [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian Michael Ellerman
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

This commit enables seccomp filter on powerpc, now that we have all the
necessary pieces in place.

To support seccomp's desire to modify the syscall return value under
some circumstances, we use a different ABI to the ptrace ABI. That is we
use r3 as the syscall return value, and orig_gpr3 is the first syscall
parameter.

This means the seccomp code, or a ptracer via SECCOMP_RET_TRACE, will
see -ENOSYS preloaded in r3. This is identical to the behaviour on x86,
and allows seccomp or the ptracer to either leave the -ENOSYS or change
it to something else, as well as rejecting or not the syscall by
modifying r0.

If seccomp does not reject the syscall, we restore the register state to
match what ptrace and audit expect, ie. r3 is the first syscall
parameter again. We do this restore using orig_gpr3, which may have been
modified by seccomp, which allows seccomp to modify the first syscall
paramater and allow the syscall to proceed.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 arch/powerpc/Kconfig         |  1 +
 arch/powerpc/kernel/ptrace.c | 28 +++++++++++++++++++++++++++-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 5ef27113b898..b6cb6a87b7a2 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -155,6 +155,7 @@ config PPC
 	select HAVE_PERF_EVENTS_NMI if PPC64
 	select EDAC_SUPPORT
 	select EDAC_ATOMIC_SCRUB
+	select HAVE_ARCH_SECCOMP_FILTER
 
 config GENERIC_CSUM
 	def_bool CPU_LITTLE_ENDIAN
diff --git a/arch/powerpc/kernel/ptrace.c b/arch/powerpc/kernel/ptrace.c
index 7484221bb3f8..de79eb5218c6 100644
--- a/arch/powerpc/kernel/ptrace.c
+++ b/arch/powerpc/kernel/ptrace.c
@@ -1787,7 +1787,33 @@ long do_syscall_trace_enter(struct pt_regs *regs)
 
 	user_exit();
 
-	secure_computing_strict(regs->gpr[0]);
+	if (test_thread_flag(TIF_SECCOMP)) {
+		/*
+		 * The ABI we present to seccomp tracers is that r3 contains
+		 * the syscall return value and orig_gpr3 contains the first
+		 * syscall parameter. This is different to the ptrace ABI where
+		 * both r3 and orig_gpr3 contain the first syscall parameter.
+		 */
+		regs->gpr[3] = -ENOSYS;
+
+		/*
+		 * We use the __ version here because we have already checked
+		 * TIF_SECCOMP. If this fails, there is nothing left to do, we
+		 * have already loaded -ENOSYS into r3, or seccomp has put
+		 * something else in r3 (via SECCOMP_RET_ERRNO/TRACE).
+		 */
+		if (__secure_computing())
+			return -1;
+
+		/*
+		 * The syscall was allowed by seccomp, restore the register
+		 * state to what ptrace and audit expect.
+		 * Note that we use orig_gpr3, which means a seccomp tracer can
+		 * modify the first syscall parameter (in orig_gpr3) and also
+		 * allow the syscall to proceed.
+		 */
+		regs->gpr[3] = regs->orig_gpr3;
+	}
 
 	if (test_thread_flag(TIF_SYSCALL_TRACE)) {
 		/*
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (8 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 10/12] powerpc/kernel: Enable seccomp filter Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15 15:16   ` Kees Cook
  2015-07-15  7:37 ` [RFC PATCH 12/12] selftests/seccomp: Add powerpc support Michael Ellerman
  2015-07-16 22:40 ` [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Benjamin Herrenschmidt
  11 siblings, 1 reply; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

The seccomp_bpf test uses BPF_LD|BPF_W|BPF_ABS to load 32-bit values
from seccomp_data->args. On big endian machines this will load the high
word of the argument, which is not what the test wants.

Borrow a hack from samples/seccomp/bpf-helper.h which changes the offset
on big endian to account for this.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index b2374c131340..51adb9afb511 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -82,7 +82,13 @@ struct seccomp_data {
 };
 #endif
 
+#if __BYTE_ORDER == __LITTLE_ENDIAN
 #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
+#elif __BYTE_ORDER == __BIG_ENDIAN
+#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + sizeof(__u32))
+#else
+#error "wut?"
+#endif
 
 #define SIBLING_EXIT_UNKILLED	0xbadbeef
 #define SIBLING_EXIT_FAILURE	0xbadface
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH 12/12] selftests/seccomp: Add powerpc support
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (9 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian Michael Ellerman
@ 2015-07-15  7:37 ` Michael Ellerman
  2015-07-15 15:16   ` Kees Cook
  2015-07-16 22:40 ` [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Benjamin Herrenschmidt
  11 siblings, 1 reply; 21+ messages in thread
From: Michael Ellerman @ 2015-07-15  7:37 UTC (permalink / raw)
  To: linuxppc-dev
  Cc: linux-kernel, keescook, luto, wad, strosake, bogdan.purcareata

Wire up the syscall number and regs so the tests work on powerpc.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
---
 tools/testing/selftests/seccomp/seccomp_bpf.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
index 51adb9afb511..05fcdb974df6 100644
--- a/tools/testing/selftests/seccomp/seccomp_bpf.c
+++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
@@ -14,6 +14,7 @@
 #include <linux/filter.h>
 #include <sys/prctl.h>
 #include <sys/ptrace.h>
+#include <sys/types.h>
 #include <sys/user.h>
 #include <linux/prctl.h>
 #include <linux/ptrace.h>
@@ -1209,6 +1210,10 @@ TEST_F(TRACE_poke, getpid_runs_normally)
 # define ARCH_REGS	struct user_pt_regs
 # define SYSCALL_NUM	regs[8]
 # define SYSCALL_RET	regs[0]
+#elif defined(__powerpc__)
+# define ARCH_REGS	struct pt_regs
+# define SYSCALL_NUM	gpr[0]
+# define SYSCALL_RET	gpr[3]
 #else
 # error "Do not know how to find your architecture's registers and syscalls"
 #endif
@@ -1242,7 +1247,7 @@ void change_syscall(struct __test_metadata *_metadata,
 	ret = ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov);
 	EXPECT_EQ(0, ret);
 
-#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__)
+#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__) || defined(__powerpc__)
 	{
 		regs.SYSCALL_NUM = syscall;
 	}
@@ -1406,6 +1411,8 @@ TEST_F(TRACE_syscall, syscall_dropped)
 #  define __NR_seccomp 383
 # elif defined(__aarch64__)
 #  define __NR_seccomp 277
+# elif defined(__powerpc__)
+#  define __NR_seccomp 358
 # else
 #  warning "seccomp syscall number unknown for this architecture"
 #  define __NR_seccomp 0xffff
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks
  2015-07-15  7:37 ` [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks Michael Ellerman
@ 2015-07-15 15:12   ` Kees Cook
  2015-07-16  3:38     ` Michael Ellerman
  0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2015-07-15 15:12 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, LKML, Andy Lutomirski, Will Drewry, strosake,
	bogdan.purcareata

On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> SIG_SYS was added in commit a0727e8ce513 "signal, x86: add SIGSYS info
> and make it synchronous."
>
> Because we use the asm-generic struct siginfo, we got support for
> SIG_SYS for free as part of that commit.
>
> However there was no compat handling added for powerpc. That means we've
> been advertising the existence of signfo._sifields._sigsys to compat
> tasks, but not actually filling in the fields correctly.
>
> Luckily it looks like no one has noticed, presumably because the only
> user of SIGSYS in the kernel is seccomp filter, which we don't support
> yet.
>
> So before we enable seccomp filter, add compat handling for SIGSYS.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  arch/powerpc/include/asm/compat.h             | 7 +++++++
>  arch/powerpc/kernel/signal_32.c               | 5 +++++
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 4 ++++
>  3 files changed, 16 insertions(+)
>
> diff --git a/arch/powerpc/include/asm/compat.h b/arch/powerpc/include/asm/compat.h
> index b142b8e0ed9e..4f2df589ec1d 100644
> --- a/arch/powerpc/include/asm/compat.h
> +++ b/arch/powerpc/include/asm/compat.h
> @@ -174,6 +174,13 @@ typedef struct compat_siginfo {
>                         int _band;      /* POLL_IN, POLL_OUT, POLL_MSG */
>                         int _fd;
>                 } _sigpoll;
> +
> +               /* SIGSYS */
> +               struct {
> +                       unsigned int _call_addr; /* calling insn */
> +                       int _syscall;            /* triggering system call number */
> +                       unsigned int _arch;      /* AUDIT_ARCH_* of syscall */
> +               } _sigsys;
>         } _sifields;
>  } compat_siginfo_t;
>
> diff --git a/arch/powerpc/kernel/signal_32.c b/arch/powerpc/kernel/signal_32.c
> index d3a831ac0f92..77f97284124e 100644
> --- a/arch/powerpc/kernel/signal_32.c
> +++ b/arch/powerpc/kernel/signal_32.c
> @@ -949,6 +949,11 @@ int copy_siginfo_to_user32(struct compat_siginfo __user *d, const siginfo_t *s)
>                 err |= __put_user(s->si_overrun, &d->si_overrun);
>                 err |= __put_user(s->si_int, &d->si_int);
>                 break;
> +       case __SI_SYS >> 16:
> +               err |= __put_user(ptr_to_compat(s->si_call_addr), &d->si_call_addr);
> +               err |= __put_user(s->si_syscall, &d->si_syscall);
> +               err |= __put_user(s->si_arch, &d->si_arch);
> +               break;
>         case __SI_RT >> 16: /* This is not generated by the kernel as of now.  */
>         case __SI_MESGQ >> 16:
>                 err |= __put_user(s->si_int, &d->si_int);
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index c5abe7fd7590..b2374c131340 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
>  static volatile int TRAP_nr;
>  static void TRAP_action(int nr, siginfo_t *info, void *void_context)
>  {
> +       fprintf(stderr, "in TRAP_action\n");
> +       fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
> +       fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
> +       fprintf(stderr, "info->si_arch %u\n", info->si_arch);
>         memcpy(&TRAP_info, info, sizeof(TRAP_info));
>         TRAP_nr = nr;
>  }

This chunk looks like left-over debugging?

-Kees

> --
> 2.1.0
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian
  2015-07-15  7:37 ` [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian Michael Ellerman
@ 2015-07-15 15:16   ` Kees Cook
  2015-07-16  3:41     ` Michael Ellerman
  0 siblings, 1 reply; 21+ messages in thread
From: Kees Cook @ 2015-07-15 15:16 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, LKML, Andy Lutomirski, Will Drewry, strosake,
	bogdan.purcareata

On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> The seccomp_bpf test uses BPF_LD|BPF_W|BPF_ABS to load 32-bit values
> from seccomp_data->args. On big endian machines this will load the high
> word of the argument, which is not what the test wants.
>
> Borrow a hack from samples/seccomp/bpf-helper.h which changes the offset
> on big endian to account for this.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index b2374c131340..51adb9afb511 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -82,7 +82,13 @@ struct seccomp_data {
>  };
>  #endif
>
> +#if __BYTE_ORDER == __LITTLE_ENDIAN
>  #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
> +#elif __BYTE_ORDER == __BIG_ENDIAN
> +#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + sizeof(__u32))
> +#else
> +#error "wut?"
> +#endif

Ah-ha! Yes, thanks. Could you change the #error to something that
describes the particular (impossible) failure condition? "wut? Unknown
__BYTE_ORDER?!". Not a huge deal, but I always like verbose errors. :)
Especially for "impossible" situations. :)

-Kees

>
>  #define SIBLING_EXIT_UNKILLED  0xbadbeef
>  #define SIBLING_EXIT_FAILURE   0xbadface
> --
> 2.1.0
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 12/12] selftests/seccomp: Add powerpc support
  2015-07-15  7:37 ` [RFC PATCH 12/12] selftests/seccomp: Add powerpc support Michael Ellerman
@ 2015-07-15 15:16   ` Kees Cook
  0 siblings, 0 replies; 21+ messages in thread
From: Kees Cook @ 2015-07-15 15:16 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, LKML, Andy Lutomirski, Will Drewry, strosake,
	bogdan.purcareata

On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> Wire up the syscall number and regs so the tests work on powerpc.
>
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Acked-by: Kees Cook <keescook@chromium.org>

-Kees

> ---
>  tools/testing/selftests/seccomp/seccomp_bpf.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> index 51adb9afb511..05fcdb974df6 100644
> --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> @@ -14,6 +14,7 @@
>  #include <linux/filter.h>
>  #include <sys/prctl.h>
>  #include <sys/ptrace.h>
> +#include <sys/types.h>
>  #include <sys/user.h>
>  #include <linux/prctl.h>
>  #include <linux/ptrace.h>
> @@ -1209,6 +1210,10 @@ TEST_F(TRACE_poke, getpid_runs_normally)
>  # define ARCH_REGS     struct user_pt_regs
>  # define SYSCALL_NUM   regs[8]
>  # define SYSCALL_RET   regs[0]
> +#elif defined(__powerpc__)
> +# define ARCH_REGS     struct pt_regs
> +# define SYSCALL_NUM   gpr[0]
> +# define SYSCALL_RET   gpr[3]
>  #else
>  # error "Do not know how to find your architecture's registers and syscalls"
>  #endif
> @@ -1242,7 +1247,7 @@ void change_syscall(struct __test_metadata *_metadata,
>         ret = ptrace(PTRACE_GETREGSET, tracee, NT_PRSTATUS, &iov);
>         EXPECT_EQ(0, ret);
>
> -#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__)
> +#if defined(__x86_64__) || defined(__i386__) || defined(__aarch64__) || defined(__powerpc__)
>         {
>                 regs.SYSCALL_NUM = syscall;
>         }
> @@ -1406,6 +1411,8 @@ TEST_F(TRACE_syscall, syscall_dropped)
>  #  define __NR_seccomp 383
>  # elif defined(__aarch64__)
>  #  define __NR_seccomp 277
> +# elif defined(__powerpc__)
> +#  define __NR_seccomp 358
>  # else
>  #  warning "seccomp syscall number unknown for this architecture"
>  #  define __NR_seccomp 0xffff
> --
> 2.1.0
>



-- 
Kees Cook
Chrome OS Security

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks
  2015-07-15 15:12   ` Kees Cook
@ 2015-07-16  3:38     ` Michael Ellerman
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-16  3:38 UTC (permalink / raw)
  To: Kees Cook
  Cc: linuxppc-dev, LKML, Andy Lutomirski, Will Drewry, strosake,
	bogdan.purcareata

On Wed, 2015-07-15 at 08:12 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index c5abe7fd7590..b2374c131340 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -645,6 +645,10 @@ static struct siginfo TRAP_info;
> >  static volatile int TRAP_nr;
> >  static void TRAP_action(int nr, siginfo_t *info, void *void_context)
> >  {
> > +       fprintf(stderr, "in TRAP_action\n");
> > +       fprintf(stderr, "info->si_call_addr %p\n", info->si_call_addr);
> > +       fprintf(stderr, "info->si_syscall %u\n", info->si_syscall);
> > +       fprintf(stderr, "info->si_arch %u\n", info->si_arch);
> >         memcpy(&TRAP_info, info, sizeof(TRAP_info));
> >         TRAP_nr = nr;
> >  }
> 
> This chunk looks like left-over debugging?

Urgh yep, that's ugly. Thanks for noticing.

Will remove before merging :)

cheers



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian
  2015-07-15 15:16   ` Kees Cook
@ 2015-07-16  3:41     ` Michael Ellerman
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-16  3:41 UTC (permalink / raw)
  To: Kees Cook
  Cc: linuxppc-dev, LKML, Andy Lutomirski, Will Drewry, strosake,
	bogdan.purcareata

On Wed, 2015-07-15 at 08:16 -0700, Kees Cook wrote:
> On Wed, Jul 15, 2015 at 12:37 AM, Michael Ellerman <mpe@ellerman.id.au> wrote:
> > diff --git a/tools/testing/selftests/seccomp/seccomp_bpf.c b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > index b2374c131340..51adb9afb511 100644
> > --- a/tools/testing/selftests/seccomp/seccomp_bpf.c
> > +++ b/tools/testing/selftests/seccomp/seccomp_bpf.c
> > @@ -82,7 +82,13 @@ struct seccomp_data {
> >  };
> >  #endif
> >
> > +#if __BYTE_ORDER == __LITTLE_ENDIAN
> >  #define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]))
> > +#elif __BYTE_ORDER == __BIG_ENDIAN
> > +#define syscall_arg(_n) (offsetof(struct seccomp_data, args[_n]) + sizeof(__u32))
> > +#else
> > +#error "wut?"
> > +#endif
> 
> Ah-ha! Yes, thanks. Could you change the #error to something that
> describes the particular (impossible) failure condition? "wut? Unknown
> __BYTE_ORDER?!". Not a huge deal, but I always like verbose errors. :)
> Especially for "impossible" situations. :)

Yeah sorry that was a "quick hack" which got promoted into an actual patch.

Fixed to use your message.

cheers



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter()
  2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
                   ` (10 preceding siblings ...)
  2015-07-15  7:37 ` [RFC PATCH 12/12] selftests/seccomp: Add powerpc support Michael Ellerman
@ 2015-07-16 22:40 ` Benjamin Herrenschmidt
  2015-07-17  4:41     ` Michael Ellerman
  11 siblings, 1 reply; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-16 22:40 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, wad, keescook, linux-kernel, luto, strosake,
	bogdan.purcareata

On Wed, 2015-07-15 at 17:37 +1000, Michael Ellerman wrote:
> To call do_syscall_trace_enter() we need pt_regs in r3, but we don't need
> to recalculate it based on r1, it's already in r9.
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Is there any performance difference ?

I find the addi a bit more robust in case the code gets moved around or
the "previous" code gets changed to either not use r9 or clobber it,
which would have the potential to
introduce a subtle bug ...

Ben.

> ---
>  arch/powerpc/kernel/entry_64.S | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 579e0f9a2d57..0796c487d3db 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -243,7 +243,9 @@ syscall_error:
>  /* Traced system call support */
>  syscall_dotrace:
>  	bl	save_nvgprs
> -	addi	r3,r1,STACK_FRAME_OVERHEAD
> +
> +	/* Get pt_regs into r3 */
> +	mr	r3, r9
>  	bl	do_syscall_trace_enter
>  	/*
>  	 * Restore argument registers possibly just changed.



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO
  2015-07-15  7:37 ` [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO Michael Ellerman
@ 2015-07-16 22:42   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 21+ messages in thread
From: Benjamin Herrenschmidt @ 2015-07-16 22:42 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: linuxppc-dev, wad, keescook, linux-kernel, luto, strosake,
	bogdan.purcareata

On Wed, 2015-07-15 at 17:37 +1000, Michael Ellerman wrote:
> Currently on powerpc we have our own #define for the highest (negative)
> errno value, called _LAST_ERRNO. This is defined to be 516, for reasons
> which are not clear.
> 
> The generic code, and x86, use MAX_ERRNO, which is defined to be 4095.
> 
> In particular seccomp uses MAX_ERRNO to restrict the value that a
> seccomp filter can return.
> 
> Currently with the mismatch between _LAST_ERRNO and MAX_ERRNO, a seccomp
> tracer wanting to return 600, expecting it to be seen as an error, would
> instead find on powerpc that userspace sees a successful syscall with a
> return value of 600.
> 
> To avoid this inconsistency, switch powerpc to use MAX_ERRNO.
> 
> We are somewhat confident that generic syscalls that can return a
> non-error value above negative MAX_ERRNO have already been updated to
> use force_successful_syscall_return().
> 
> I have also checked all the powerpc specific syscalls, and believe that
> none of them expect to return a non-error value between -MAX_ERRNO and
> -516. So this change should be safe ...
> 
> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>

Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

> ---
>  arch/powerpc/include/uapi/asm/errno.h | 2 --
>  arch/powerpc/kernel/entry_32.S        | 3 ++-
>  arch/powerpc/kernel/entry_64.S        | 5 +++--
>  3 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/include/uapi/asm/errno.h b/arch/powerpc/include/uapi/asm/errno.h
> index 8c145fd17d86..e8b6b5f7de7c 100644
> --- a/arch/powerpc/include/uapi/asm/errno.h
> +++ b/arch/powerpc/include/uapi/asm/errno.h
> @@ -6,6 +6,4 @@
>  #undef	EDEADLOCK
>  #define	EDEADLOCK	58	/* File locking deadlock error */
>  
> -#define _LAST_ERRNO	516
> -
>  #endif	/* _ASM_POWERPC_ERRNO_H */
> diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
> index 46fc0f4d8982..67ecdf61f4e3 100644
> --- a/arch/powerpc/kernel/entry_32.S
> +++ b/arch/powerpc/kernel/entry_32.S
> @@ -20,6 +20,7 @@
>   */
>  
>  #include <linux/errno.h>
> +#include <linux/err.h>
>  #include <linux/sys.h>
>  #include <linux/threads.h>
>  #include <asm/reg.h>
> @@ -354,7 +355,7 @@ ret_from_syscall:
>  	SYNC
>  	MTMSRD(r10)
>  	lwz	r9,TI_FLAGS(r12)
> -	li	r8,-_LAST_ERRNO
> +	li	r8,-MAX_ERRNO
>  	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
>  	bne-	syscall_exit_work
>  	cmplw	0,r3,r8
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 0796c487d3db..8292581a42f1 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -19,6 +19,7 @@
>   */
>  
>  #include <linux/errno.h>
> +#include <linux/err.h>
>  #include <asm/unistd.h>
>  #include <asm/processor.h>
>  #include <asm/page.h>
> @@ -207,7 +208,7 @@ system_call:			/* label this so stack traces look sane */
>  #endif /* CONFIG_PPC_BOOK3E */
>  
>  	ld	r9,TI_FLAGS(r12)
> -	li	r11,-_LAST_ERRNO
> +	li	r11,-MAX_ERRNO
>  	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
>  	bne-	syscall_exit_work
>  	cmpld	r3,r11
> @@ -279,7 +280,7 @@ syscall_exit_work:
>  	beq+	0f
>  	REST_NVGPRS(r1)
>  	b	2f
> -0:	cmpld	r3,r11		/* r10 is -LAST_ERRNO */
> +0:	cmpld	r3,r11		/* r11 is -MAX_ERRNO */
>  	blt+	1f
>  	andi.	r0,r9,_TIF_NOERROR
>  	bne-	1f



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter()
  2015-07-16 22:40 ` [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Benjamin Herrenschmidt
@ 2015-07-17  4:41     ` Michael Ellerman
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-17  4:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev, wad, keescook, linux-kernel, luto, strosake,
	bogdan.purcareata

On Fri, 2015-07-17 at 08:40 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2015-07-15 at 17:37 +1000, Michael Ellerman wrote:
> > To call do_syscall_trace_enter() we need pt_regs in r3, but we don't need
> > to recalculate it based on r1, it's already in r9.
> > 
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> 
> Is there any performance difference ?

No.

I'm not going to bother measuring it :)

> I find the addi a bit more robust in case the code gets moved around or
> the "previous" code gets changed to either not use r9 or clobber it,
> which would have the potential to
> introduce a subtle bug ...

Yeah true.

There is an "invariant" in that entry code that r9 contains pt_regs, you can
see for example the DTL code goes to pains to ensure it puts pt_regs back in r9
after it clobbers it. As does the current syscall_dotrace.

But looking closer I don't see where we actually use that (prior to this
patch).

So yeah I'll drop this and send a clean up to just get rid of all the r9
reloading.

cheers




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter()
@ 2015-07-17  4:41     ` Michael Ellerman
  0 siblings, 0 replies; 21+ messages in thread
From: Michael Ellerman @ 2015-07-17  4:41 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: linuxppc-dev, wad, keescook, linux-kernel, luto, strosake,
	bogdan.purcareata

On Fri, 2015-07-17 at 08:40 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2015-07-15 at 17:37 +1000, Michael Ellerman wrote:
> > To call do_syscall_trace_enter() we need pt_regs in r3, but we don't need
> > to recalculate it based on r1, it's already in r9.
> > 
> > Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
> 
> Is there any performance difference ?

No.

I'm not going to bother measuring it :)

> I find the addi a bit more robust in case the code gets moved around or
> the "previous" code gets changed to either not use r9 or clobber it,
> which would have the potential to
> introduce a subtle bug ...

Yeah true.

There is an "invariant" in that entry code that r9 contains pt_regs, you can
see for example the DTL code goes to pains to ensure it puts pt_regs back in r9
after it clobbers it. As does the current syscall_dotrace.

But looking closer I don't see where we actually use that (prior to this
patch).

So yeah I'll drop this and send a clean up to just get rid of all the r9
reloading.

cheers

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2015-07-17  4:41 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-15  7:37 [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 02/12] powerpc/kernel: Switch to using MAX_ERRNO Michael Ellerman
2015-07-16 22:42   ` Benjamin Herrenschmidt
2015-07-15  7:37 ` [RFC PATCH 03/12] powerpc/kernel: Change the do_syscall_trace_enter() API Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 04/12] powerpc: Drop unused syscall_get_error() Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 05/12] powerpc: Don't negate error in syscall_set_return_value() Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 06/12] powerpc: Rework syscall_get_arguments() so there is only one loop Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 07/12] powerpc: Use orig_gpr3 in syscall_get_arguments() Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 08/12] powerpc: Change syscall_get_nr() to return int Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 09/12] powerpc/kernel: Add SIG_SYS support for compat tasks Michael Ellerman
2015-07-15 15:12   ` Kees Cook
2015-07-16  3:38     ` Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 10/12] powerpc/kernel: Enable seccomp filter Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 11/12] selftests/seccomp: Make seccomp tests work on big endian Michael Ellerman
2015-07-15 15:16   ` Kees Cook
2015-07-16  3:41     ` Michael Ellerman
2015-07-15  7:37 ` [RFC PATCH 12/12] selftests/seccomp: Add powerpc support Michael Ellerman
2015-07-15 15:16   ` Kees Cook
2015-07-16 22:40 ` [RFC PATCH 01/12] powerpc/kernel: Get pt_regs from r9 before calling do_syscall_trace_enter() Benjamin Herrenschmidt
2015-07-17  4:41   ` Michael Ellerman
2015-07-17  4:41     ` Michael Ellerman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.