* [kernel-hardening] [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 1:24 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, Mark Rutland, James Morse, Pratyush Anand,
Vladimir Murzin, Chris Metcalf, Andre Przywara
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
This patch ensures a syscall does not return to user-mode with a kernel
address limit. If that happened, a process can corrupt kernel-mode
memory and elevate privileges.
For example, it would mitigation this bug:
- https://bugs.chromium.org/p/project-zero/issues/detail?id=990
If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
state will result in a BUG_ON.
The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
added so each architecture can optimize this change.
Signed-off-by: Thomas Garnier <thgarnie@google.com>
---
Based on next-20170308
---
include/linux/syscalls.h | 19 +++++++++++++++++++
init/Kconfig | 7 +++++++
kernel/sys.c | 8 ++++++++
3 files changed, 34 insertions(+)
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 980c3c9b06f8..78a2268ecd6e 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
SYSCALL_METADATA(sname, x, __VA_ARGS__) \
__SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
+asmlinkage void verify_pre_usermode_state(void);
+
+#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
+static inline bool has_user_ds(void) {
+ bool ret = segment_eq(get_fs(), USER_DS);
+ // Prevent re-ordering the call
+ barrier();
+ return ret;
+}
+#else
+static inline bool has_user_ds(void) {
+ return false;
+}
+#endif
+
+
#define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
#define __SYSCALL_DEFINEx(x, name, ...) \
asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
@@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
{ \
+ bool user_caller = has_user_ds(); \
long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
+ if (user_caller) \
+ verify_pre_usermode_state(); \
__MAP(x,__SC_TEST,__VA_ARGS__); \
__PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
return ret; \
diff --git a/init/Kconfig b/init/Kconfig
index c859c993c26f..c4efc3a95e4a 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1929,6 +1929,13 @@ config PROFILING
config TRACEPOINTS
bool
+#
+# Set by each architecture that want to optimize how verify_pre_usermode_state
+# is called.
+#
+config ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
+ bool
+
source "arch/Kconfig"
endmenu # General setup
diff --git a/kernel/sys.c b/kernel/sys.c
index 196c7134bee6..411163ac9dc3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2459,3 +2459,11 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
return 0;
}
#endif /* CONFIG_COMPAT */
+
+/* Called before coming back to user-mode */
+asmlinkage void verify_pre_usermode_state(void)
+{
+ if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
+ "incorrect get_fs() on user-mode return"))
+ set_fs(USER_DS);
+}
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH v2 2/4] x86/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
@ 2017-03-09 1:24 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for x86.
---
Based on next-20170308
---
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 3 +++
arch/x86/entry/entry_64.S | 19 +++++++++++++++++++
arch/x86/include/asm/pgtable_64_types.h | 11 +++++++++++
arch/x86/include/asm/processor.h | 11 -----------
5 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 005df7c825f5..6d48e18e6f09 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -63,6 +63,7 @@ config X86
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 370c42c7f046..525edbb77f03 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -22,6 +22,7 @@
#include <linux/context_tracking.h>
#include <linux/user-return-notifier.h>
#include <linux/uprobes.h>
+#include <linux/syscalls.h>
#include <asm/desc.h>
#include <asm/traps.h>
@@ -180,6 +181,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
struct thread_info *ti = current_thread_info();
u32 cached_flags;
+ verify_pre_usermode_state();
+
if (IS_ENABLED(CONFIG_PROVE_LOCKING) && WARN_ON(!irqs_disabled()))
local_irq_disable();
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d2b2a2948ffe..b3527d31b91b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -96,6 +96,19 @@ ENDPROC(native_usergs_sysret64)
# define TRACE_IRQS_IRETQ_DEBUG TRACE_IRQS_IRETQ
#endif
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+ call verify_pre_usermode_state
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ movq PER_CPU_VAR(current_task), %rax
+ movq $TASK_SIZE_MAX, %rcx
+ movq %rcx, TASK_addr_limit(%rax)
+.endm
+#endif
+
/*
* 64-bit SYSCALL instruction entry. Up to 6 arguments in registers.
*
@@ -201,6 +214,7 @@ entry_SYSCALL_64_fastpath:
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
*/
+
call *sys_call_table(, %rax, 8)
.Lentry_SYSCALL_64_after_fastpath_call:
@@ -218,6 +232,11 @@ entry_SYSCALL_64_fastpath:
testl $_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
jnz 1f
+ /*
+ * Check user-mode state on fast path return, the same check is done
+ * under the slow path through syscall_return_slowpath.
+ */
+ VERIFY_PRE_USERMODE_STATE
LOCKDEP_SYS_EXIT
TRACE_IRQS_ON /* user mode is traced as IRQs on */
movq RIP(%rsp), %rcx
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 3a264200c62f..0fbbb79d058c 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -76,4 +76,15 @@ typedef struct { pteval_t pte; } pte_t;
#define EARLY_DYNAMIC_PAGE_TABLES 64
+/*
+ * User space process size. 47bits minus one guard page. The guard
+ * page is necessary on Intel CPUs: if a SYSCALL instruction is at
+ * the highest possible canonical userspace address, then that
+ * syscall will enter the kernel with a non-canonical return
+ * address, and SYSRET will explode dangerously. We avoid this
+ * particular problem by preventing anything from being mapped
+ * at the maximum canonical address.
+ */
+#define TASK_SIZE_MAX ((_AC(1, UL) << 47) - PAGE_SIZE)
+
#endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca5407a..9bc99d37133e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -829,17 +829,6 @@ static inline void spin_lock_prefetch(const void *x)
#define KSTK_ESP(task) (task_pt_regs(task)->sp)
#else
-/*
- * User space process size. 47bits minus one guard page. The guard
- * page is necessary on Intel CPUs: if a SYSCALL instruction is at
- * the highest possible canonical userspace address, then that
- * syscall will enter the kernel with a non-canonical return
- * address, and SYSRET will explode dangerously. We avoid this
- * particular problem by preventing anything from being mapped
- * at the maximum canonical address.
- */
-#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)
-
/* This decides where the kernel will search for a free chunk of vm
* space during mmap's.
*/
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [kernel-hardening] [PATCH v2 2/4] x86/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 1:24 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, Mark Rutland, James Morse, Pratyush Anand,
Vladimir Murzin, Chris Metcalf, Andre Przywara
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for x86.
---
Based on next-20170308
---
arch/x86/Kconfig | 1 +
arch/x86/entry/common.c | 3 +++
arch/x86/entry/entry_64.S | 19 +++++++++++++++++++
arch/x86/include/asm/pgtable_64_types.h | 11 +++++++++++
arch/x86/include/asm/processor.h | 11 -----------
5 files changed, 34 insertions(+), 11 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 005df7c825f5..6d48e18e6f09 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -63,6 +63,7 @@ config X86
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 370c42c7f046..525edbb77f03 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -22,6 +22,7 @@
#include <linux/context_tracking.h>
#include <linux/user-return-notifier.h>
#include <linux/uprobes.h>
+#include <linux/syscalls.h>
#include <asm/desc.h>
#include <asm/traps.h>
@@ -180,6 +181,8 @@ __visible inline void prepare_exit_to_usermode(struct pt_regs *regs)
struct thread_info *ti = current_thread_info();
u32 cached_flags;
+ verify_pre_usermode_state();
+
if (IS_ENABLED(CONFIG_PROVE_LOCKING) && WARN_ON(!irqs_disabled()))
local_irq_disable();
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index d2b2a2948ffe..b3527d31b91b 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -96,6 +96,19 @@ ENDPROC(native_usergs_sysret64)
# define TRACE_IRQS_IRETQ_DEBUG TRACE_IRQS_IRETQ
#endif
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+ call verify_pre_usermode_state
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ movq PER_CPU_VAR(current_task), %rax
+ movq $TASK_SIZE_MAX, %rcx
+ movq %rcx, TASK_addr_limit(%rax)
+.endm
+#endif
+
/*
* 64-bit SYSCALL instruction entry. Up to 6 arguments in registers.
*
@@ -201,6 +214,7 @@ entry_SYSCALL_64_fastpath:
* It might end up jumping to the slow path. If it jumps, RAX
* and all argument registers are clobbered.
*/
+
call *sys_call_table(, %rax, 8)
.Lentry_SYSCALL_64_after_fastpath_call:
@@ -218,6 +232,11 @@ entry_SYSCALL_64_fastpath:
testl $_TIF_ALLWORK_MASK, TASK_TI_flags(%r11)
jnz 1f
+ /*
+ * Check user-mode state on fast path return, the same check is done
+ * under the slow path through syscall_return_slowpath.
+ */
+ VERIFY_PRE_USERMODE_STATE
LOCKDEP_SYS_EXIT
TRACE_IRQS_ON /* user mode is traced as IRQs on */
movq RIP(%rsp), %rcx
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 3a264200c62f..0fbbb79d058c 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -76,4 +76,15 @@ typedef struct { pteval_t pte; } pte_t;
#define EARLY_DYNAMIC_PAGE_TABLES 64
+/*
+ * User space process size. 47bits minus one guard page. The guard
+ * page is necessary on Intel CPUs: if a SYSCALL instruction is at
+ * the highest possible canonical userspace address, then that
+ * syscall will enter the kernel with a non-canonical return
+ * address, and SYSRET will explode dangerously. We avoid this
+ * particular problem by preventing anything from being mapped
+ * at the maximum canonical address.
+ */
+#define TASK_SIZE_MAX ((_AC(1, UL) << 47) - PAGE_SIZE)
+
#endif /* _ASM_X86_PGTABLE_64_DEFS_H */
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index f385eca5407a..9bc99d37133e 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -829,17 +829,6 @@ static inline void spin_lock_prefetch(const void *x)
#define KSTK_ESP(task) (task_pt_regs(task)->sp)
#else
-/*
- * User space process size. 47bits minus one guard page. The guard
- * page is necessary on Intel CPUs: if a SYSCALL instruction is at
- * the highest possible canonical userspace address, then that
- * syscall will enter the kernel with a non-canonical return
- * address, and SYSRET will explode dangerously. We avoid this
- * particular problem by preventing anything from being mapped
- * at the maximum canonical address.
- */
-#define TASK_SIZE_MAX ((1UL << 47) - PAGE_SIZE)
-
/* This decides where the kernel will search for a free chunk of vm
* space during mmap's.
*/
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH v2 3/4] arm/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
@ 2017-03-09 1:24 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for arm.
---
Based on next-20170308
---
arch/arm/Kconfig | 1 +
arch/arm/kernel/entry-common.S | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0d4e71b42c77..704fd8f197fa 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -12,6 +12,7 @@ config ARM
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_MIGHT_HAVE_PC_PARPORT
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7
select ARCH_SUPPORTS_ATOMIC_RMW
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index eb5cd77bf1d8..a2cdf3d7bcff 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -12,6 +12,7 @@
#include <asm/unistd.h>
#include <asm/ftrace.h>
#include <asm/unwind.h>
+#include <asm/memory.h>
#ifdef CONFIG_AEABI
#include <asm/unistd-oabi.h>
#endif
@@ -27,6 +28,23 @@
#include "entry-header.S"
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
+ stmdb sp!, {r0}
+ bl verify_pre_usermode_state
+ ldmia sp!, {r0}
+#else
+ bl verify_pre_usermode_state
+#endif
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ mov r1, $TASK_SIZE
+ str r1, [tsk, #TI_ADDR_LIMIT]
+.endm
+#endif
.align 5
#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
@@ -39,6 +57,7 @@
ret_fast_syscall:
UNWIND(.fnstart )
UNWIND(.cantunwind )
+ VERIFY_PRE_USERMODE_STATE
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
@@ -65,6 +84,7 @@ ret_fast_syscall:
UNWIND(.fnstart )
UNWIND(.cantunwind )
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
+ VERIFY_PRE_USERMODE_STATE
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
@@ -263,6 +283,7 @@ __sys_trace:
__sys_trace_return:
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
+ VERIFY_PRE_USERMODE_STATE
mov r0, sp
bl syscall_trace_exit
b ret_slow_syscall
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [kernel-hardening] [PATCH v2 3/4] arm/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 1:24 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, Mark Rutland, James Morse, Pratyush Anand,
Vladimir Murzin, Chris Metcalf, Andre Przywara
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for arm.
---
Based on next-20170308
---
arch/arm/Kconfig | 1 +
arch/arm/kernel/entry-common.S | 21 +++++++++++++++++++++
2 files changed, 22 insertions(+)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0d4e71b42c77..704fd8f197fa 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -12,6 +12,7 @@ config ARM
select ARCH_HAVE_CUSTOM_GPIO_H
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_MIGHT_HAVE_PC_PARPORT
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARCH_OPTIONAL_KERNEL_RWX if ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_OPTIONAL_KERNEL_RWX_DEFAULT if CPU_V7
select ARCH_SUPPORTS_ATOMIC_RMW
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index eb5cd77bf1d8..a2cdf3d7bcff 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -12,6 +12,7 @@
#include <asm/unistd.h>
#include <asm/ftrace.h>
#include <asm/unwind.h>
+#include <asm/memory.h>
#ifdef CONFIG_AEABI
#include <asm/unistd-oabi.h>
#endif
@@ -27,6 +28,23 @@
#include "entry-header.S"
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
+ stmdb sp!, {r0}
+ bl verify_pre_usermode_state
+ ldmia sp!, {r0}
+#else
+ bl verify_pre_usermode_state
+#endif
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ mov r1, $TASK_SIZE
+ str r1, [tsk, #TI_ADDR_LIMIT]
+.endm
+#endif
.align 5
#if !(IS_ENABLED(CONFIG_TRACE_IRQFLAGS) || IS_ENABLED(CONFIG_CONTEXT_TRACKING))
@@ -39,6 +57,7 @@
ret_fast_syscall:
UNWIND(.fnstart )
UNWIND(.cantunwind )
+ VERIFY_PRE_USERMODE_STATE
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
@@ -65,6 +84,7 @@ ret_fast_syscall:
UNWIND(.fnstart )
UNWIND(.cantunwind )
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
+ VERIFY_PRE_USERMODE_STATE
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
@@ -263,6 +283,7 @@ __sys_trace:
__sys_trace_return:
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
+ VERIFY_PRE_USERMODE_STATE
mov r0, sp
bl syscall_trace_exit
b ret_slow_syscall
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
@ 2017-03-09 1:24 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for arm64.
---
Based on next-20170308
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/entry.S | 15 +++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 896eba61e5ed..da54774838d8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -24,6 +24,7 @@ config ARM64
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select ARCH_WANT_FRAME_POINTERS
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 43512d4d7df2..eca392ae63e9 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -737,6 +737,19 @@ ENTRY(cpu_switch_to)
ret
ENDPROC(cpu_switch_to)
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+ bl verify_pre_usermode_state
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ mov x1, #TASK_SIZE_64
+ str x1, [tsk, #TSK_TI_ADDR_LIMIT]
+.endm
+#endif
+
+
/*
* This is the fast syscall return path. We do as little as possible here,
* and this includes saving x0 back into the kernel stack.
@@ -744,6 +757,7 @@ ENDPROC(cpu_switch_to)
ret_fast_syscall:
disable_irq // disable interrupts
str x0, [sp, #S_X0] // returned x0
+ VERIFY_PRE_USERMODE_STATE
ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
and x2, x1, #_TIF_SYSCALL_WORK
cbnz x2, ret_fast_syscall_trace
@@ -771,6 +785,7 @@ work_pending:
*/
ret_to_user:
disable_irq // disable interrupts
+ VERIFY_PRE_USERMODE_STATE
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [kernel-hardening] [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 1:24 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 1:24 UTC (permalink / raw)
To: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Thomas Garnier, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, Mark Rutland, James Morse, Pratyush Anand,
Vladimir Murzin, Chris Metcalf, Andre Przywara
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening
Implement specific usage of verify_pre_usermode_state for user-mode
returns for arm64.
---
Based on next-20170308
---
arch/arm64/Kconfig | 1 +
arch/arm64/kernel/entry.S | 15 +++++++++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 896eba61e5ed..da54774838d8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -24,6 +24,7 @@ config ARM64
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
select ARCH_WANT_FRAME_POINTERS
select ARCH_HAS_UBSAN_SANITIZE_ALL
+ select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
select ARM_AMBA
select ARM_ARCH_TIMER
select ARM_GIC
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 43512d4d7df2..eca392ae63e9 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -737,6 +737,19 @@ ENTRY(cpu_switch_to)
ret
ENDPROC(cpu_switch_to)
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+.macro VERIFY_PRE_USERMODE_STATE
+ bl verify_pre_usermode_state
+.endm
+#else
+/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
+.macro VERIFY_PRE_USERMODE_STATE
+ mov x1, #TASK_SIZE_64
+ str x1, [tsk, #TSK_TI_ADDR_LIMIT]
+.endm
+#endif
+
+
/*
* This is the fast syscall return path. We do as little as possible here,
* and this includes saving x0 back into the kernel stack.
@@ -744,6 +757,7 @@ ENDPROC(cpu_switch_to)
ret_fast_syscall:
disable_irq // disable interrupts
str x0, [sp, #S_X0] // returned x0
+ VERIFY_PRE_USERMODE_STATE
ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
and x2, x1, #_TIF_SYSCALL_WORK
cbnz x2, ret_fast_syscall_trace
@@ -771,6 +785,7 @@ work_pending:
*/
ret_to_user:
disable_irq // disable interrupts
+ VERIFY_PRE_USERMODE_STATE
ldr x1, [tsk, #TSK_TI_FLAGS]
and x2, x1, #_TIF_WORK_MASK
cbnz x2, work_pending
--
2.12.0.246.ga2ecc84866-goog
^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
(?)
@ 2017-03-09 12:23 ` Mark Rutland
-1 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:23 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Wed, Mar 08, 2017 at 05:24:56PM -0800, Thomas Garnier wrote:
> Implement specific usage of verify_pre_usermode_state for user-mode
> returns for arm64.
> ---
> Based on next-20170308
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/kernel/entry.S | 15 +++++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 896eba61e5ed..da54774838d8 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -24,6 +24,7 @@ config ARM64
> select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
> select ARCH_WANT_FRAME_POINTERS
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> + select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> select ARM_AMBA
> select ARM_ARCH_TIMER
> select ARM_GIC
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 43512d4d7df2..eca392ae63e9 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -737,6 +737,19 @@ ENTRY(cpu_switch_to)
> ret
> ENDPROC(cpu_switch_to)
>
> +#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
> +.macro VERIFY_PRE_USERMODE_STATE
> + bl verify_pre_usermode_state
> +.endm
> +#else
We generally stick to lower case for the arm64 assembly macros. If we
need this, we should stick to the existing convention.
> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> +.macro VERIFY_PRE_USERMODE_STATE
> + mov x1, #TASK_SIZE_64
> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> +.endm
We need arm64's set_fs() to configure UAO, too, so this is much weaker
than set_fs(), and will leave __{get,put}_user and
__copy_{to,from}_user() able to access kernel memory.
We don't currently have an asm helper to clear UAO, and unconditionally
poking that on exception return is liable to be somewhat expensive.
Also, given we're only trying to catch this in syscalls, I'm afraid I
don't see what we gain by doing this in the entry assembly.
Thanks,
Mark.
> +#endif
> +
> +
> /*
> * This is the fast syscall return path. We do as little as possible here,
> * and this includes saving x0 back into the kernel stack.
> @@ -744,6 +757,7 @@ ENDPROC(cpu_switch_to)
> ret_fast_syscall:
> disable_irq // disable interrupts
> str x0, [sp, #S_X0] // returned x0
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
> and x2, x1, #_TIF_SYSCALL_WORK
> cbnz x2, ret_fast_syscall_trace
> @@ -771,6 +785,7 @@ work_pending:
> */
> ret_to_user:
> disable_irq // disable interrupts
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS]
> and x2, x1, #_TIF_WORK_MASK
> cbnz x2, work_pending
> --
> 2.12.0.246.ga2ecc84866-goog
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 12:23 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:23 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, linux-api, linux-kernel, x86,
linux-arm-kernel, kernel-hardening
On Wed, Mar 08, 2017 at 05:24:56PM -0800, Thomas Garnier wrote:
> Implement specific usage of verify_pre_usermode_state for user-mode
> returns for arm64.
> ---
> Based on next-20170308
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/kernel/entry.S | 15 +++++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 896eba61e5ed..da54774838d8 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -24,6 +24,7 @@ config ARM64
> select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
> select ARCH_WANT_FRAME_POINTERS
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> + select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> select ARM_AMBA
> select ARM_ARCH_TIMER
> select ARM_GIC
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 43512d4d7df2..eca392ae63e9 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -737,6 +737,19 @@ ENTRY(cpu_switch_to)
> ret
> ENDPROC(cpu_switch_to)
>
> +#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
> +.macro VERIFY_PRE_USERMODE_STATE
> + bl verify_pre_usermode_state
> +.endm
> +#else
We generally stick to lower case for the arm64 assembly macros. If we
need this, we should stick to the existing convention.
> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> +.macro VERIFY_PRE_USERMODE_STATE
> + mov x1, #TASK_SIZE_64
> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> +.endm
We need arm64's set_fs() to configure UAO, too, so this is much weaker
than set_fs(), and will leave __{get,put}_user and
__copy_{to,from}_user() able to access kernel memory.
We don't currently have an asm helper to clear UAO, and unconditionally
poking that on exception return is liable to be somewhat expensive.
Also, given we're only trying to catch this in syscalls, I'm afraid I
don't see what we gain by doing this in the entry assembly.
Thanks,
Mark.
> +#endif
> +
> +
> /*
> * This is the fast syscall return path. We do as little as possible here,
> * and this includes saving x0 back into the kernel stack.
> @@ -744,6 +757,7 @@ ENDPROC(cpu_switch_to)
> ret_fast_syscall:
> disable_irq // disable interrupts
> str x0, [sp, #S_X0] // returned x0
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
> and x2, x1, #_TIF_SYSCALL_WORK
> cbnz x2, ret_fast_syscall_trace
> @@ -771,6 +785,7 @@ work_pending:
> */
> ret_to_user:
> disable_irq // disable interrupts
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS]
> and x2, x1, #_TIF_WORK_MASK
> cbnz x2, work_pending
> --
> 2.12.0.246.ga2ecc84866-goog
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 12:23 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:23 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 08, 2017 at 05:24:56PM -0800, Thomas Garnier wrote:
> Implement specific usage of verify_pre_usermode_state for user-mode
> returns for arm64.
> ---
> Based on next-20170308
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/kernel/entry.S | 15 +++++++++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 896eba61e5ed..da54774838d8 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -24,6 +24,7 @@ config ARM64
> select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
> select ARCH_WANT_FRAME_POINTERS
> select ARCH_HAS_UBSAN_SANITIZE_ALL
> + select ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> select ARM_AMBA
> select ARM_ARCH_TIMER
> select ARM_GIC
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 43512d4d7df2..eca392ae63e9 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -737,6 +737,19 @@ ENTRY(cpu_switch_to)
> ret
> ENDPROC(cpu_switch_to)
>
> +#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
> +.macro VERIFY_PRE_USERMODE_STATE
> + bl verify_pre_usermode_state
> +.endm
> +#else
We generally stick to lower case for the arm64 assembly macros. If we
need this, we should stick to the existing convention.
> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> +.macro VERIFY_PRE_USERMODE_STATE
> + mov x1, #TASK_SIZE_64
> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> +.endm
We need arm64's set_fs() to configure UAO, too, so this is much weaker
than set_fs(), and will leave __{get,put}_user and
__copy_{to,from}_user() able to access kernel memory.
We don't currently have an asm helper to clear UAO, and unconditionally
poking that on exception return is liable to be somewhat expensive.
Also, given we're only trying to catch this in syscalls, I'm afraid I
don't see what we gain by doing this in the entry assembly.
Thanks,
Mark.
> +#endif
> +
> +
> /*
> * This is the fast syscall return path. We do as little as possible here,
> * and this includes saving x0 back into the kernel stack.
> @@ -744,6 +757,7 @@ ENDPROC(cpu_switch_to)
> ret_fast_syscall:
> disable_irq // disable interrupts
> str x0, [sp, #S_X0] // returned x0
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS] // re-check for syscall tracing
> and x2, x1, #_TIF_SYSCALL_WORK
> cbnz x2, ret_fast_syscall_trace
> @@ -771,6 +785,7 @@ work_pending:
> */
> ret_to_user:
> disable_irq // disable interrupts
> + VERIFY_PRE_USERMODE_STATE
> ldr x1, [tsk, #TSK_TI_FLAGS]
> and x2, x1, #_TIF_WORK_MASK
> cbnz x2, work_pending
> --
> 2.12.0.246.ga2ecc84866-goog
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 12:23 ` Mark Rutland
(?)
@ 2017-03-09 15:56 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:56 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> We generally stick to lower case for the arm64 assembly macros. If we
> need this, we should stick to the existing convention.
>
>> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> +.macro VERIFY_PRE_USERMODE_STATE
>> + mov x1, #TASK_SIZE_64
>> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> +.endm
>
> We need arm64's set_fs() to configure UAO, too, so this is much weaker
> than set_fs(), and will leave __{get,put}_user and
> __copy_{to,from}_user() able to access kernel memory.
>
> We don't currently have an asm helper to clear UAO, and unconditionally
> poking that on exception return is liable to be somewhat expensive.
>
> Also, given we're only trying to catch this in syscalls, I'm afraid I
> don't see what we gain by doing this in the entry assembly.
>
I optimized all architectures from the arm (32-bit) discussion. I will
come back to a simple bl to the verify function. Thanks!
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 15:56 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:56 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, Linux API, LKML,
the arch/x86 maintainers, linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> We generally stick to lower case for the arm64 assembly macros. If we
> need this, we should stick to the existing convention.
>
>> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> +.macro VERIFY_PRE_USERMODE_STATE
>> + mov x1, #TASK_SIZE_64
>> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> +.endm
>
> We need arm64's set_fs() to configure UAO, too, so this is much weaker
> than set_fs(), and will leave __{get,put}_user and
> __copy_{to,from}_user() able to access kernel memory.
>
> We don't currently have an asm helper to clear UAO, and unconditionally
> poking that on exception return is liable to be somewhat expensive.
>
> Also, given we're only trying to catch this in syscalls, I'm afraid I
> don't see what we gain by doing this in the entry assembly.
>
I optimized all architectures from the arm (32-bit) discussion. I will
come back to a simple bl to the verify function. Thanks!
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 15:56 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:56 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> We generally stick to lower case for the arm64 assembly macros. If we
> need this, we should stick to the existing convention.
>
>> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> +.macro VERIFY_PRE_USERMODE_STATE
>> + mov x1, #TASK_SIZE_64
>> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> +.endm
>
> We need arm64's set_fs() to configure UAO, too, so this is much weaker
> than set_fs(), and will leave __{get,put}_user and
> __copy_{to,from}_user() able to access kernel memory.
>
> We don't currently have an asm helper to clear UAO, and unconditionally
> poking that on exception return is liable to be somewhat expensive.
>
> Also, given we're only trying to catch this in syscalls, I'm afraid I
> don't see what we gain by doing this in the entry assembly.
>
I optimized all architectures from the arm (32-bit) discussion. I will
come back to a simple bl to the verify function. Thanks!
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 15:56 ` Thomas Garnier
(?)
@ 2017-03-09 16:05 ` Mark Rutland
-1 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 16:05 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
What I was trying to ask was do we need to touch the assembly at all
here?
Are we trying to protect the non-syscall cases by doing this in
assembly? If so, it'd be worth calling out in the commit message.
If so, we could add the necessary helper to clear UAO.
If not, doing this in the entry assembly only saves the small overhead
of reading and comparing the addr_limit for in-kernel use of the
syscalls (e.g. in the compat wrappers), and we may as well rely on the
common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:05 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 16:05 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, Linux API, LKML,
the arch/x86 maintainers, linux-arm-kernel, Kernel Hardening
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
What I was trying to ask was do we need to touch the assembly at all
here?
Are we trying to protect the non-syscall cases by doing this in
assembly? If so, it'd be worth calling out in the commit message.
If so, we could add the necessary helper to clear UAO.
If not, doing this in the entry assembly only saves the small overhead
of reading and comparing the addr_limit for in-kernel use of the
syscalls (e.g. in the compat wrappers), and we may as well rely on the
common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:05 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 16:05 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
What I was trying to ask was do we need to touch the assembly at all
here?
Are we trying to protect the non-syscall cases by doing this in
assembly? If so, it'd be worth calling out in the commit message.
If so, we could add the necessary helper to clear UAO.
If not, doing this in the entry assembly only saves the small overhead
of reading and comparing the addr_limit for in-kernel use of the
syscalls (e.g. in the compat wrappers), and we may as well rely on the
common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 16:05 ` Mark Rutland
(?)
@ 2017-03-09 16:19 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:19 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Thu, Mar 9, 2017 at 8:05 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> What I was trying to ask was do we need to touch the assembly at all
> here?
You don't but he generic solution add code to every single syscall.
> Are we trying to protect the non-syscall cases by doing this in
> assembly? If so, it'd be worth calling out in the commit message.
It is an added benefit but not required.
> If so, we could add the necessary helper to clear UAO.
I can look at set_fs and fix it on the next iteraiton.
> If not, doing this in the entry assembly only saves the small overhead
> of reading and comparing the addr_limit for in-kernel use of the
> syscalls (e.g. in the compat wrappers), and we may as well rely on the
> common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
You also don't have the code added for each syscall and a call.
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:19 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:19 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, Linux API, LKML,
the arch/x86 maintainers, linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 8:05 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> What I was trying to ask was do we need to touch the assembly at all
> here?
You don't but he generic solution add code to every single syscall.
> Are we trying to protect the non-syscall cases by doing this in
> assembly? If so, it'd be worth calling out in the commit message.
It is an added benefit but not required.
> If so, we could add the necessary helper to clear UAO.
I can look at set_fs and fix it on the next iteraiton.
> If not, doing this in the entry assembly only saves the small overhead
> of reading and comparing the addr_limit for in-kernel use of the
> syscalls (e.g. in the compat wrappers), and we may as well rely on the
> common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
You also don't have the code added for each syscall and a call.
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:19 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:19 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 8:05 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> What I was trying to ask was do we need to touch the assembly at all
> here?
You don't but he generic solution add code to every single syscall.
> Are we trying to protect the non-syscall cases by doing this in
> assembly? If so, it'd be worth calling out in the commit message.
It is an added benefit but not required.
> If so, we could add the necessary helper to clear UAO.
I can look at set_fs and fix it on the next iteraiton.
> If not, doing this in the entry assembly only saves the small overhead
> of reading and comparing the addr_limit for in-kernel use of the
> syscalls (e.g. in the compat wrappers), and we may as well rely on the
> common !ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE implementation.
You also don't have the code added for each syscall and a call.
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 15:56 ` Thomas Garnier
(?)
@ 2017-03-09 16:26 ` Russell King - ARM Linux
-1 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 16:26 UTC (permalink / raw)
To: Thomas Garnier
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
> >
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
I wouldn't call what you've done on ARM an "optimisation", because my
comment about making the fast path worthless still stands.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:26 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 16:26 UTC (permalink / raw)
To: Thomas Garnier
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
> >
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
I wouldn't call what you've done on ARM an "optimisation", because my
comment about making the fast path worthless still stands.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:26 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 16:26 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> > We generally stick to lower case for the arm64 assembly macros. If we
> > need this, we should stick to the existing convention.
> >
> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
> >> +.macro VERIFY_PRE_USERMODE_STATE
> >> + mov x1, #TASK_SIZE_64
> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
> >> +.endm
> >
> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
> > than set_fs(), and will leave __{get,put}_user and
> > __copy_{to,from}_user() able to access kernel memory.
> >
> > We don't currently have an asm helper to clear UAO, and unconditionally
> > poking that on exception return is liable to be somewhat expensive.
> >
> > Also, given we're only trying to catch this in syscalls, I'm afraid I
> > don't see what we gain by doing this in the entry assembly.
> >
>
> I optimized all architectures from the arm (32-bit) discussion. I will
> come back to a simple bl to the verify function. Thanks!
I wouldn't call what you've done on ARM an "optimisation", because my
comment about making the fast path worthless still stands.
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 16:26 ` Russell King - ARM Linux
(?)
@ 2017-03-09 16:35 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:35 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>> >
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> I wouldn't call what you've done on ARM an "optimisation", because my
> comment about making the fast path worthless still stands.
Why does it still stands on the latest proposal?
>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:35 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:35 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>> >
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> I wouldn't call what you've done on ARM an "optimisation", because my
> comment about making the fast path worthless still stands.
Why does it still stands on the latest proposal?
>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 16:35 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 16:35 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Thu, Mar 09, 2017 at 07:56:49AM -0800, Thomas Garnier wrote:
>> On Thu, Mar 9, 2017 at 4:23 AM, Mark Rutland <mark.rutland@arm.com> wrote:
>> > We generally stick to lower case for the arm64 assembly macros. If we
>> > need this, we should stick to the existing convention.
>> >
>> >> +/* Similar to set_fs(USER_DS) in verify_pre_usermode_state without a warning. */
>> >> +.macro VERIFY_PRE_USERMODE_STATE
>> >> + mov x1, #TASK_SIZE_64
>> >> + str x1, [tsk, #TSK_TI_ADDR_LIMIT]
>> >> +.endm
>> >
>> > We need arm64's set_fs() to configure UAO, too, so this is much weaker
>> > than set_fs(), and will leave __{get,put}_user and
>> > __copy_{to,from}_user() able to access kernel memory.
>> >
>> > We don't currently have an asm helper to clear UAO, and unconditionally
>> > poking that on exception return is liable to be somewhat expensive.
>> >
>> > Also, given we're only trying to catch this in syscalls, I'm afraid I
>> > don't see what we gain by doing this in the entry assembly.
>> >
>>
>> I optimized all architectures from the arm (32-bit) discussion. I will
>> come back to a simple bl to the verify function. Thanks!
>
> I wouldn't call what you've done on ARM an "optimisation", because my
> comment about making the fast path worthless still stands.
Why does it still stands on the latest proposal?
>
> --
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
2017-03-09 16:35 ` Thomas Garnier
(?)
@ 2017-03-09 17:05 ` Russell King - ARM Linux
-1 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 17:05 UTC (permalink / raw)
To: Thomas Garnier
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 09, 2017 at 08:35:30AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > I wouldn't call what you've done on ARM an "optimisation", because my
> > comment about making the fast path worthless still stands.
>
> Why does it still stands on the latest proposal?
It's still having to needlessly save stuff when there's nothing wrong.
Remember, syscalls are a fast path, so the minimum we can do is good.
Calling into C functions is not ideal, because they will tend to be
_very_ expensive compared to hand crafted assembly, especially for
something like this.
It's possible to check the address limit in just three instructions,
which is way less than will be incurred by calling a C function.
Note: This patch is completely untested.
arch/arm/kernel/entry-common.S | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index eb5cd77bf1d8..6b43c6d73117 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -41,8 +41,11 @@
UNWIND(.cantunwind )
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
bne fast_work_pending
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
/* perform architecture specific actions before user return */
arch_ret_to_user r1, lr
@@ -67,6 +70,7 @@ ENDPROC(ret_fast_syscall)
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
beq no_work_pending
UNWIND(.fnend )
@@ -82,6 +86,7 @@ ENDPROC(ret_fast_syscall)
mov r2, why @ 'syscall'
bl do_work_pending
cmp r0, #0
+ ldreq r2, [tsk, #TI_ADDR_LIMIT]
beq no_work_pending
movlt scno, #(__NR_restart_syscall - __NR_SYSCALL_BASE)
ldmia sp, {r0 - r6} @ have to reload r0 - r6
@@ -99,9 +104,12 @@ ENTRY(ret_to_user)
disable_irq_notrace @ disable interrupts
ENTRY(ret_to_user_from_irq)
ldr r1, [tsk, #TI_FLAGS]
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_WORK_MASK
bne slow_work_pending
no_work_pending:
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
asm_trace_hardirqs_on save = 0
/* perform architecture specific actions before user return */
@@ -125,6 +133,16 @@ ENTRY(ret_from_fork)
b ret_slow_syscall
ENDPROC(ret_from_fork)
+addr_limit_fail:
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+ stmfd sp!, {r0, lr}
+ bl verify_pre_usermode_state
+ ldmfd sp!, {r0, lr}
+#endif
+ mov r2, #TASK_SIZE
+ str r2, [tsk, #TI_ADDR_LIMIT]
+ ret lr
+
/*=============================================================================
* SWI handler
*-----------------------------------------------------------------------------
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 17:05 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 17:05 UTC (permalink / raw)
To: Thomas Garnier
Cc: Mark Rutland, David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening
On Thu, Mar 09, 2017 at 08:35:30AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > I wouldn't call what you've done on ARM an "optimisation", because my
> > comment about making the fast path worthless still stands.
>
> Why does it still stands on the latest proposal?
It's still having to needlessly save stuff when there's nothing wrong.
Remember, syscalls are a fast path, so the minimum we can do is good.
Calling into C functions is not ideal, because they will tend to be
_very_ expensive compared to hand crafted assembly, especially for
something like this.
It's possible to check the address limit in just three instructions,
which is way less than will be incurred by calling a C function.
Note: This patch is completely untested.
arch/arm/kernel/entry-common.S | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index eb5cd77bf1d8..6b43c6d73117 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -41,8 +41,11 @@
UNWIND(.cantunwind )
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
bne fast_work_pending
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
/* perform architecture specific actions before user return */
arch_ret_to_user r1, lr
@@ -67,6 +70,7 @@ ENDPROC(ret_fast_syscall)
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
beq no_work_pending
UNWIND(.fnend )
@@ -82,6 +86,7 @@ ENDPROC(ret_fast_syscall)
mov r2, why @ 'syscall'
bl do_work_pending
cmp r0, #0
+ ldreq r2, [tsk, #TI_ADDR_LIMIT]
beq no_work_pending
movlt scno, #(__NR_restart_syscall - __NR_SYSCALL_BASE)
ldmia sp, {r0 - r6} @ have to reload r0 - r6
@@ -99,9 +104,12 @@ ENTRY(ret_to_user)
disable_irq_notrace @ disable interrupts
ENTRY(ret_to_user_from_irq)
ldr r1, [tsk, #TI_FLAGS]
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_WORK_MASK
bne slow_work_pending
no_work_pending:
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
asm_trace_hardirqs_on save = 0
/* perform architecture specific actions before user return */
@@ -125,6 +133,16 @@ ENTRY(ret_from_fork)
b ret_slow_syscall
ENDPROC(ret_from_fork)
+addr_limit_fail:
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+ stmfd sp!, {r0, lr}
+ bl verify_pre_usermode_state
+ ldmfd sp!, {r0, lr}
+#endif
+ mov r2, #TASK_SIZE
+ str r2, [tsk, #TI_ADDR_LIMIT]
+ ret lr
+
/*=============================================================================
* SWI handler
*-----------------------------------------------------------------------------
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH v2 4/4] arm64/syscalls: Specific usage of verify_pre_usermode_state
@ 2017-03-09 17:05 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 17:05 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 08:35:30AM -0800, Thomas Garnier wrote:
> On Thu, Mar 9, 2017 at 8:26 AM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > I wouldn't call what you've done on ARM an "optimisation", because my
> > comment about making the fast path worthless still stands.
>
> Why does it still stands on the latest proposal?
It's still having to needlessly save stuff when there's nothing wrong.
Remember, syscalls are a fast path, so the minimum we can do is good.
Calling into C functions is not ideal, because they will tend to be
_very_ expensive compared to hand crafted assembly, especially for
something like this.
It's possible to check the address limit in just three instructions,
which is way less than will be incurred by calling a C function.
Note: This patch is completely untested.
arch/arm/kernel/entry-common.S | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index eb5cd77bf1d8..6b43c6d73117 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -41,8 +41,11 @@
UNWIND(.cantunwind )
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
bne fast_work_pending
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
/* perform architecture specific actions before user return */
arch_ret_to_user r1, lr
@@ -67,6 +70,7 @@ ENDPROC(ret_fast_syscall)
str r0, [sp, #S_R0 + S_OFF]! @ save returned r0
disable_irq_notrace @ disable interrupts
ldr r1, [tsk, #TI_FLAGS] @ re-check for syscall tracing
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_SYSCALL_WORK | _TIF_WORK_MASK
beq no_work_pending
UNWIND(.fnend )
@@ -82,6 +86,7 @@ ENDPROC(ret_fast_syscall)
mov r2, why @ 'syscall'
bl do_work_pending
cmp r0, #0
+ ldreq r2, [tsk, #TI_ADDR_LIMIT]
beq no_work_pending
movlt scno, #(__NR_restart_syscall - __NR_SYSCALL_BASE)
ldmia sp, {r0 - r6} @ have to reload r0 - r6
@@ -99,9 +104,12 @@ ENTRY(ret_to_user)
disable_irq_notrace @ disable interrupts
ENTRY(ret_to_user_from_irq)
ldr r1, [tsk, #TI_FLAGS]
+ ldr r2, [tsk, #TI_ADDR_LIMIT]
tst r1, #_TIF_WORK_MASK
bne slow_work_pending
no_work_pending:
+ cmp r2, #TASK_SIZE
+ blgt addr_limit_fail
asm_trace_hardirqs_on save = 0
/* perform architecture specific actions before user return */
@@ -125,6 +133,16 @@ ENTRY(ret_from_fork)
b ret_slow_syscall
ENDPROC(ret_from_fork)
+addr_limit_fail:
+#ifdef CONFIG_BUG_ON_DATA_CORRUPTION
+ stmfd sp!, {r0, lr}
+ bl verify_pre_usermode_state
+ ldmfd sp!, {r0, lr}
+#endif
+ mov r2, #TASK_SIZE
+ str r2, [tsk, #TI_ADDR_LIMIT]
+ ret lr
+
/*=============================================================================
* SWI handler
*-----------------------------------------------------------------------------
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
(?)
@ 2017-03-09 8:42 ` Borislav Petkov
-1 siblings, 0 replies; 60+ messages in thread
From: Borislav Petkov @ 2017-03-09 8:42 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
> +asmlinkage void verify_pre_usermode_state(void);
> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
This is not the kernel comments style. Use /* */ instead.
> + barrier();
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
... and then you could slim down the ifdeffery a bit:
static inline bool has_user_ds(void) {
bool ret = false;
#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
ret = segment_eq(get_fs(), USER_DS);
/* Prevent re-ordering the call. */
barrier();
#endif
return ret;
}
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 8:42 ` Borislav Petkov
0 siblings, 0 replies; 60+ messages in thread
From: Borislav Petkov @ 2017-03-09 8:42 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Josh Poimboeuf, Brian Gerst,
Jan Beulich, Christian Borntraeger, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, linux-api, linux-kernel, x86, linux-arm-kernel,
kernel-hardening
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
> +asmlinkage void verify_pre_usermode_state(void);
> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
This is not the kernel comments style. Use /* */ instead.
> + barrier();
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
... and then you could slim down the ifdeffery a bit:
static inline bool has_user_ds(void) {
bool ret = false;
#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
ret = segment_eq(get_fs(), USER_DS);
/* Prevent re-ordering the call. */
barrier();
#endif
return ret;
}
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 8:42 ` Borislav Petkov
0 siblings, 0 replies; 60+ messages in thread
From: Borislav Petkov @ 2017-03-09 8:42 UTC (permalink / raw)
To: linux-arm-kernel
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
> +asmlinkage void verify_pre_usermode_state(void);
> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
This is not the kernel comments style. Use /* */ instead.
> + barrier();
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
... and then you could slim down the ifdeffery a bit:
static inline bool has_user_ds(void) {
bool ret = false;
#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
ret = segment_eq(get_fs(), USER_DS);
/* Prevent re-ordering the call. */
barrier();
#endif
return ret;
}
--
Regards/Gruss,
Boris.
Good mailing practices for 400: avoid top-posting and trim the reply.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 8:42 ` Borislav Petkov
(?)
@ 2017-03-09 15:48 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:48 UTC (permalink / raw)
To: Borislav Petkov
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>
>> +asmlinkage void verify_pre_usermode_state(void);
>> +
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>
> This is not the kernel comments style. Use /* */ instead.
>
>> + barrier();
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> ... and then you could slim down the ifdeffery a bit:
>
> static inline bool has_user_ds(void) {
> bool ret = false;
>
> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> ret = segment_eq(get_fs(), USER_DS);
> /* Prevent re-ordering the call. */
> barrier();
> #endif
>
> return ret;
> }
>
I agree, cleaner. I will look to do this change on next iteration.
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:48 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:48 UTC (permalink / raw)
To: Borislav Petkov
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Josh Poimboeuf, Brian Gerst,
Jan Beulich, Christian Borntraeger, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>
>> +asmlinkage void verify_pre_usermode_state(void);
>> +
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>
> This is not the kernel comments style. Use /* */ instead.
>
>> + barrier();
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> ... and then you could slim down the ifdeffery a bit:
>
> static inline bool has_user_ds(void) {
> bool ret = false;
>
> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> ret = segment_eq(get_fs(), USER_DS);
> /* Prevent re-ordering the call. */
> barrier();
> #endif
>
> return ret;
> }
>
I agree, cleaner. I will look to do this change on next iteration.
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:48 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:48 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>
>> +asmlinkage void verify_pre_usermode_state(void);
>> +
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>
> This is not the kernel comments style. Use /* */ instead.
>
>> + barrier();
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> ... and then you could slim down the ifdeffery a bit:
>
> static inline bool has_user_ds(void) {
> bool ret = false;
>
> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> ret = segment_eq(get_fs(), USER_DS);
> /* Prevent re-ordering the call. */
> barrier();
> #endif
>
> return ret;
> }
>
I agree, cleaner. I will look to do this change on next iteration.
> --
> Regards/Gruss,
> Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 8:42 ` Borislav Petkov
@ 2017-03-09 17:27 ` Andy Lutomirski
-1 siblings, 0 replies; 60+ messages in thread
From: Andy Lutomirski @ 2017-03-09 17:27 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>
>> +asmlinkage void verify_pre_usermode_state(void);
>> +
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>
> This is not the kernel comments style. Use /* */ instead.
>
>> + barrier();
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> ... and then you could slim down the ifdeffery a bit:
>
> static inline bool has_user_ds(void) {
> bool ret = false;
>
> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> ret = segment_eq(get_fs(), USER_DS);
> /* Prevent re-ordering the call. */
> barrier();
> #endif
>
> return ret;
> }
I don't like having any kernel configuration in which has_user_ds()
unconditionally returns false. Can we put the ifdeffery in the caller
instead?
--Andy
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 17:27 ` Andy Lutomirski
0 siblings, 0 replies; 60+ messages in thread
From: Andy Lutomirski @ 2017-03-09 17:27 UTC (permalink / raw)
To: Borislav Petkov
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Josh Poimboeuf, Brian Gerst,
Jan Beulich, Christian Borntraeger, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, linux-kernel, X86 ML,
linux-arm-kernel, kernel-hardening
On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>
>> +asmlinkage void verify_pre_usermode_state(void);
>> +
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>
> This is not the kernel comments style. Use /* */ instead.
>
>> + barrier();
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> ... and then you could slim down the ifdeffery a bit:
>
> static inline bool has_user_ds(void) {
> bool ret = false;
>
> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> ret = segment_eq(get_fs(), USER_DS);
> /* Prevent re-ordering the call. */
> barrier();
> #endif
>
> return ret;
> }
I don't like having any kernel configuration in which has_user_ds()
unconditionally returns false. Can we put the ifdeffery in the caller
instead?
--Andy
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 17:27 ` [kernel-hardening] " Andy Lutomirski
@ 2017-03-09 17:41 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 17:41 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Borislav Petkov, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 9, 2017 at 9:27 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
>> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>>
>>> +asmlinkage void verify_pre_usermode_state(void);
>>> +
>>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>>> +static inline bool has_user_ds(void) {
>>> + bool ret = segment_eq(get_fs(), USER_DS);
>>> + // Prevent re-ordering the call
>>
>> This is not the kernel comments style. Use /* */ instead.
>>
>>> + barrier();
>>> + return ret;
>>> +}
>>> +#else
>>> +static inline bool has_user_ds(void) {
>>> + return false;
>>> +}
>>> +#endif
>>
>> ... and then you could slim down the ifdeffery a bit:
>>
>> static inline bool has_user_ds(void) {
>> bool ret = false;
>>
>> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> ret = segment_eq(get_fs(), USER_DS);
>> /* Prevent re-ordering the call. */
>> barrier();
>> #endif
>>
>> return ret;
>> }
>
> I don't like having any kernel configuration in which has_user_ds()
> unconditionally returns false. Can we put the ifdeffery in the caller
> instead?
I don't like it either to be honest. We could have something like:
#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
#define CHECK_USER_CALLER(_x) bool _x = segment_eq(get_fs(), USER_DS)
#else
#define CHECK_USER_CALLER(_x) bool _x = false
#endif
// In the syscall macro:
CHECK_CALLED_BY_USER(user_caller);
>
> --Andy
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 17:41 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 17:41 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Borislav Petkov, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Josh Poimboeuf, Brian Gerst,
Jan Beulich, Christian Borntraeger, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, linux-kernel, X86 ML,
linux-arm-kernel, kernel-hardening
On Thu, Mar 9, 2017 at 9:27 AM, Andy Lutomirski <luto@amacapital.net> wrote:
> On Thu, Mar 9, 2017 at 12:42 AM, Borislav Petkov <bp@alien8.de> wrote:
>> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>>> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>>> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
>>> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>>>
>>> +asmlinkage void verify_pre_usermode_state(void);
>>> +
>>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>>> +static inline bool has_user_ds(void) {
>>> + bool ret = segment_eq(get_fs(), USER_DS);
>>> + // Prevent re-ordering the call
>>
>> This is not the kernel comments style. Use /* */ instead.
>>
>>> + barrier();
>>> + return ret;
>>> +}
>>> +#else
>>> +static inline bool has_user_ds(void) {
>>> + return false;
>>> +}
>>> +#endif
>>
>> ... and then you could slim down the ifdeffery a bit:
>>
>> static inline bool has_user_ds(void) {
>> bool ret = false;
>>
>> #ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> ret = segment_eq(get_fs(), USER_DS);
>> /* Prevent re-ordering the call. */
>> barrier();
>> #endif
>>
>> return ret;
>> }
>
> I don't like having any kernel configuration in which has_user_ds()
> unconditionally returns false. Can we put the ifdeffery in the caller
> instead?
I don't like it either to be honest. We could have something like:
#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
#define CHECK_USER_CALLER(_x) bool _x = segment_eq(get_fs(), USER_DS)
#else
#define CHECK_USER_CALLER(_x) bool _x = false
#endif
// In the syscall macro:
CHECK_CALLED_BY_USER(user_caller);
>
> --Andy
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
@ 2017-03-09 10:39 ` Sergey Senozhatsky
-1 siblings, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2017-03-09 10:39 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
Hello,
On (03/08/17 17:24), Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
I like the patch set.
a side note (perhaps a bit irrelevant), the WARN backtrace does not
really tell more than "incorrect get_fs() on user-mode return" message
does
incorrect get_fs() on user-mode return
------------[ cut here ]------------
kernel BUG at kernel/sys.c:2467!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: FOO
CPU: 2 PID: 355 Comm: BAR
Hardware name: BUZ
task: ffff8801329f4e00 task.stack: ffffc900005d8000
RIP: 0010:verify_pre_usermode_state+0x31/0x34
RSP: 0018:ffffc900005dbf48 EFLAGS: 00010096
RAX: 0000000000000026 RBX: 0000000000000002 RCX: 0000000000000001
RDX: 0000000000000046 RSI: ffff880130cead88 RDI: ffffffff81095594
RBP: ffffc900005dbf48 R08: 0000000000000001 R09: 0000000000000001
R10: ffffc900005dbd58 R11: ffff8801329f4e00 R12: 0000000000000002
R13: 0000000000000001 R14: 00007fb6c6a7b5e0 R15: 0000000000000002
FS: 00007fb6c70d3b40(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd9d94f8f8 CR3: 00000001295b2000 CR4: 00000000000006e0
Call Trace:
entry_SYSCALL_64_fastpath+0x3a/0xb2
RIP: 0033:0x7fb6c67ba3c0
RSP: 002b:00007ffd9d94f5c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: 0000000000000002 RBX: 0000000000000000 RCX: 00007fb6c67ba3c0
RDX: 0000000000000002 RSI: 0000000000ba7310 RDI: 0000000000000001
RBP: 0000000000000001 R08: 00007fb6c6a7c740 R09: 00007fb6c70d3b40
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Code: 48 8b 14 25 40 c5 00 00 48 b8 00 f0 ff ff ff 7f 00 00 48 39 82 28 11 00 00 74 12 55 48 c7 c7 9b f5 78 81 48 89 e5 e8 14 19 0b 00 <0f> 0b c3 66 66 66 66 90 55 48 89 e5 53 48 8b 47 50 48 89 fb 48
may be some day someone would be interested in something like
"incorrect get_fs() on user-mode return from %pS"
and set_fs() would save _RET_IP_.
just a side note.
-ss
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 10:39 ` Sergey Senozhatsky
0 siblings, 0 replies; 60+ messages in thread
From: Sergey Senozhatsky @ 2017-03-09 10:39 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, Mark Rutland, James Morse, Pratyush Anand,
Vladimir Murzin, Chris Metcalf, Andre Przywara, linux-api,
linux-kernel, x86, linux-arm-kernel, kernel-hardening
Hello,
On (03/08/17 17:24), Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
I like the patch set.
a side note (perhaps a bit irrelevant), the WARN backtrace does not
really tell more than "incorrect get_fs() on user-mode return" message
does
incorrect get_fs() on user-mode return
------------[ cut here ]------------
kernel BUG at kernel/sys.c:2467!
invalid opcode: 0000 [#1] PREEMPT SMP
Modules linked in: FOO
CPU: 2 PID: 355 Comm: BAR
Hardware name: BUZ
task: ffff8801329f4e00 task.stack: ffffc900005d8000
RIP: 0010:verify_pre_usermode_state+0x31/0x34
RSP: 0018:ffffc900005dbf48 EFLAGS: 00010096
RAX: 0000000000000026 RBX: 0000000000000002 RCX: 0000000000000001
RDX: 0000000000000046 RSI: ffff880130cead88 RDI: ffffffff81095594
RBP: ffffc900005dbf48 R08: 0000000000000001 R09: 0000000000000001
R10: ffffc900005dbd58 R11: ffff8801329f4e00 R12: 0000000000000002
R13: 0000000000000001 R14: 00007fb6c6a7b5e0 R15: 0000000000000002
FS: 00007fb6c70d3b40(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffd9d94f8f8 CR3: 00000001295b2000 CR4: 00000000000006e0
Call Trace:
entry_SYSCALL_64_fastpath+0x3a/0xb2
RIP: 0033:0x7fb6c67ba3c0
RSP: 002b:00007ffd9d94f5c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
RAX: 0000000000000002 RBX: 0000000000000000 RCX: 00007fb6c67ba3c0
RDX: 0000000000000002 RSI: 0000000000ba7310 RDI: 0000000000000001
RBP: 0000000000000001 R08: 00007fb6c6a7c740 R09: 00007fb6c70d3b40
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
Code: 48 8b 14 25 40 c5 00 00 48 b8 00 f0 ff ff ff 7f 00 00 48 39 82 28 11 00 00 74 12 55 48 c7 c7 9b f5 78 81 48 89 e5 e8 14 19 0b 00 <0f> 0b c3 66 66 66 66 90 55 48 89 e5 53 48 8b 47 50 48 89 fb 48
may be some day someone would be interested in something like
"incorrect get_fs() on user-mode return from %pS"
and set_fs() would save _RET_IP_.
just a side note.
-ss
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
(?)
@ 2017-03-09 12:09 ` Mark Rutland
-1 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:09 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
Hi,
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
>
> For example, it would mitigation this bug:
>
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>
> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> state will result in a BUG_ON.
>
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
> + barrier();
What ordering are we trying to ensure, that isn't otherwise given?
We expect get_fs() and set_fs() to be ordered w.r.t. each other and
w.r.t. uaccess uses, or we'd need barriers all over the place.
Given that, I can't see why we need a barrier here. So this needs a
better comment, at least.
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
It would be simpler to wrap the call entirely, e.g. have:
#ifdef CONFIG_WHATEVER
static inline void verify_pre_usermode_state(void)
{
if (segment_eq(get_fs(), USER_DS))
__verify_pre_usermode_state();
}
#else
static inline void verify_pre_usermode_state(void) { }
#endif
> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> { \
> + bool user_caller = has_user_ds(); \
> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> + if (user_caller) \
> + verify_pre_usermode_state(); \
... then we can unconditionally use verify_pre_usermode_state() here ...
> __MAP(x,__SC_TEST,__VA_ARGS__); \
> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
> return ret; \
[...]
> +/* Called before coming back to user-mode */
> +asmlinkage void verify_pre_usermode_state(void)
... and we just prepend a couple of underscores here.
> +{
> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + "incorrect get_fs() on user-mode return"))
> + set_fs(USER_DS);
> +}
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 12:09 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:09 UTC (permalink / raw)
To: Thomas Garnier
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, linux-api, linux-kernel, x86,
linux-arm-kernel, kernel-hardening
Hi,
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
>
> For example, it would mitigation this bug:
>
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>
> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> state will result in a BUG_ON.
>
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
> + barrier();
What ordering are we trying to ensure, that isn't otherwise given?
We expect get_fs() and set_fs() to be ordered w.r.t. each other and
w.r.t. uaccess uses, or we'd need barriers all over the place.
Given that, I can't see why we need a barrier here. So this needs a
better comment, at least.
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
It would be simpler to wrap the call entirely, e.g. have:
#ifdef CONFIG_WHATEVER
static inline void verify_pre_usermode_state(void)
{
if (segment_eq(get_fs(), USER_DS))
__verify_pre_usermode_state();
}
#else
static inline void verify_pre_usermode_state(void) { }
#endif
> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> { \
> + bool user_caller = has_user_ds(); \
> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> + if (user_caller) \
> + verify_pre_usermode_state(); \
... then we can unconditionally use verify_pre_usermode_state() here ...
> __MAP(x,__SC_TEST,__VA_ARGS__); \
> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
> return ret; \
[...]
> +/* Called before coming back to user-mode */
> +asmlinkage void verify_pre_usermode_state(void)
... and we just prepend a couple of underscores here.
> +{
> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + "incorrect get_fs() on user-mode return"))
> + set_fs(USER_DS);
> +}
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 12:09 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 12:09 UTC (permalink / raw)
To: linux-arm-kernel
Hi,
On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
>
> For example, it would mitigation this bug:
>
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>
> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> state will result in a BUG_ON.
>
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
> + barrier();
What ordering are we trying to ensure, that isn't otherwise given?
We expect get_fs() and set_fs() to be ordered w.r.t. each other and
w.r.t. uaccess uses, or we'd need barriers all over the place.
Given that, I can't see why we need a barrier here. So this needs a
better comment, at least.
> + return ret;
> +}
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
It would be simpler to wrap the call entirely, e.g. have:
#ifdef CONFIG_WHATEVER
static inline void verify_pre_usermode_state(void)
{
if (segment_eq(get_fs(), USER_DS))
__verify_pre_usermode_state();
}
#else
static inline void verify_pre_usermode_state(void) { }
#endif
> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> { \
> + bool user_caller = has_user_ds(); \
> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> + if (user_caller) \
> + verify_pre_usermode_state(); \
... then we can unconditionally use verify_pre_usermode_state() here ...
> __MAP(x,__SC_TEST,__VA_ARGS__); \
> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
> return ret; \
[...]
> +/* Called before coming back to user-mode */
> +asmlinkage void verify_pre_usermode_state(void)
... and we just prepend a couple of underscores here.
> +{
> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + "incorrect get_fs() on user-mode return"))
> + set_fs(USER_DS);
> +}
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 12:09 ` Mark Rutland
(?)
@ 2017-03-09 13:44 ` Russell King - ARM Linux
-1 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 13:44 UTC (permalink / raw)
To: Mark Rutland
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > This patch ensures a syscall does not return to user-mode with a kernel
> > address limit. If that happened, a process can corrupt kernel-mode
> > memory and elevate privileges.
> >
> > For example, it would mitigation this bug:
> >
> > - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
> >
> > If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> > state will result in a BUG_ON.
> >
> > The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> > added so each architecture can optimize this change.
>
> > +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> > +static inline bool has_user_ds(void) {
> > + bool ret = segment_eq(get_fs(), USER_DS);
> > + // Prevent re-ordering the call
> > + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
> > + return ret;
> > +}
> > +#else
> > +static inline bool has_user_ds(void) {
> > + return false;
> > +}
> > +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
That's utterly pointless - you've missed a detail.
> > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > { \
> > + bool user_caller = has_user_ds(); \
> > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > + if (user_caller) \
> > + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Look at this closely. has_user_ds() is called _before_ the syscall code
is invoked. It's checking what conditions the syscall was entered from.
If the syscall was entered with the user segment selected, then we run
a check on the system state _after_ the syscall code has returned.
Putting both after the syscall code has returned is completely pointless -
it turns it into this code:
if (segment_eq(get_fs(), USER_DS))
if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
"incorrect get_fs() on user-mode return"))
set_fs(USER_DS);
which is obviously bogus (it'll never fire.)
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 13:44 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 13:44 UTC (permalink / raw)
To: Mark Rutland
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, linux-api, linux-kernel, x86, linux-arm-kernel,
kernel-hardening
On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > This patch ensures a syscall does not return to user-mode with a kernel
> > address limit. If that happened, a process can corrupt kernel-mode
> > memory and elevate privileges.
> >
> > For example, it would mitigation this bug:
> >
> > - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
> >
> > If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> > state will result in a BUG_ON.
> >
> > The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> > added so each architecture can optimize this change.
>
> > +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> > +static inline bool has_user_ds(void) {
> > + bool ret = segment_eq(get_fs(), USER_DS);
> > + // Prevent re-ordering the call
> > + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
> > + return ret;
> > +}
> > +#else
> > +static inline bool has_user_ds(void) {
> > + return false;
> > +}
> > +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
That's utterly pointless - you've missed a detail.
> > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > { \
> > + bool user_caller = has_user_ds(); \
> > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > + if (user_caller) \
> > + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Look at this closely. has_user_ds() is called _before_ the syscall code
is invoked. It's checking what conditions the syscall was entered from.
If the syscall was entered with the user segment selected, then we run
a check on the system state _after_ the syscall code has returned.
Putting both after the syscall code has returned is completely pointless -
it turns it into this code:
if (segment_eq(get_fs(), USER_DS))
if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
"incorrect get_fs() on user-mode return"))
set_fs(USER_DS);
which is obviously bogus (it'll never fire.)
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 13:44 ` Russell King - ARM Linux
0 siblings, 0 replies; 60+ messages in thread
From: Russell King - ARM Linux @ 2017-03-09 13:44 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > This patch ensures a syscall does not return to user-mode with a kernel
> > address limit. If that happened, a process can corrupt kernel-mode
> > memory and elevate privileges.
> >
> > For example, it would mitigation this bug:
> >
> > - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
> >
> > If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> > state will result in a BUG_ON.
> >
> > The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> > added so each architecture can optimize this change.
>
> > +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> > +static inline bool has_user_ds(void) {
> > + bool ret = segment_eq(get_fs(), USER_DS);
> > + // Prevent re-ordering the call
> > + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
> > + return ret;
> > +}
> > +#else
> > +static inline bool has_user_ds(void) {
> > + return false;
> > +}
> > +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
That's utterly pointless - you've missed a detail.
> > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > { \
> > + bool user_caller = has_user_ds(); \
> > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > + if (user_caller) \
> > + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Look at this closely. has_user_ds() is called _before_ the syscall code
is invoked. It's checking what conditions the syscall was entered from.
If the syscall was entered with the user segment selected, then we run
a check on the system state _after_ the syscall code has returned.
Putting both after the syscall code has returned is completely pointless -
it turns it into this code:
if (segment_eq(get_fs(), USER_DS))
if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
"incorrect get_fs() on user-mode return"))
set_fs(USER_DS);
which is obviously bogus (it'll never fire.)
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 13:44 ` Russell King - ARM Linux
(?)
@ 2017-03-09 15:21 ` Mark Rutland
-1 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 15:21 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > It would be simpler to wrap the call entirely, e.g. have:
> >
> > #ifdef CONFIG_WHATEVER
> > static inline void verify_pre_usermode_state(void)
> > {
> > if (segment_eq(get_fs(), USER_DS))
> > __verify_pre_usermode_state();
> > }
> > #else
> > static inline void verify_pre_usermode_state(void) { }
> > #endif
>
> That's utterly pointless - you've missed a detail.
>
> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > > { \
> > > + bool user_caller = has_user_ds(); \
> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > > + if (user_caller) \
> > > + verify_pre_usermode_state(); \
> >
> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>
> Look at this closely. has_user_ds() is called _before_ the syscall code
> is invoked. It's checking what conditions the syscall was entered from.
> If the syscall was entered with the user segment selected, then we run
> a check on the system state _after_ the syscall code has returned.
Indeed; I clearly did not consider this correctly.
Sorry for the noise.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:21 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 15:21 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, linux-api, linux-kernel, x86, linux-arm-kernel,
kernel-hardening
On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > It would be simpler to wrap the call entirely, e.g. have:
> >
> > #ifdef CONFIG_WHATEVER
> > static inline void verify_pre_usermode_state(void)
> > {
> > if (segment_eq(get_fs(), USER_DS))
> > __verify_pre_usermode_state();
> > }
> > #else
> > static inline void verify_pre_usermode_state(void) { }
> > #endif
>
> That's utterly pointless - you've missed a detail.
>
> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > > { \
> > > + bool user_caller = has_user_ds(); \
> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > > + if (user_caller) \
> > > + verify_pre_usermode_state(); \
> >
> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>
> Look at this closely. has_user_ds() is called _before_ the syscall code
> is invoked. It's checking what conditions the syscall was entered from.
> If the syscall was entered with the user segment selected, then we run
> a check on the system state _after_ the syscall code has returned.
Indeed; I clearly did not consider this correctly.
Sorry for the noise.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:21 ` Mark Rutland
0 siblings, 0 replies; 60+ messages in thread
From: Mark Rutland @ 2017-03-09 15:21 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
> > It would be simpler to wrap the call entirely, e.g. have:
> >
> > #ifdef CONFIG_WHATEVER
> > static inline void verify_pre_usermode_state(void)
> > {
> > if (segment_eq(get_fs(), USER_DS))
> > __verify_pre_usermode_state();
> > }
> > #else
> > static inline void verify_pre_usermode_state(void) { }
> > #endif
>
> That's utterly pointless - you've missed a detail.
>
> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> > > { \
> > > + bool user_caller = has_user_ds(); \
> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> > > + if (user_caller) \
> > > + verify_pre_usermode_state(); \
> >
> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>
> Look at this closely. has_user_ds() is called _before_ the syscall code
> is invoked. It's checking what conditions the syscall was entered from.
> If the syscall was entered with the user segment selected, then we run
> a check on the system state _after_ the syscall code has returned.
Indeed; I clearly did not consider this correctly.
Sorry for the noise.
Thanks,
Mark.
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 15:21 ` Mark Rutland
(?)
@ 2017-03-09 15:54 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:54 UTC (permalink / raw)
To: Mark Rutland
Cc: Russell King - ARM Linux, David Howells, Dave Hansen,
Arnd Bergmann, Al Viro, René Nyffenegger, Andrew Morton,
Kees Cook, Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner
On Thu, Mar 9, 2017 at 7:21 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
>> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
>> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>
>> > It would be simpler to wrap the call entirely, e.g. have:
>> >
>> > #ifdef CONFIG_WHATEVER
>> > static inline void verify_pre_usermode_state(void)
>> > {
>> > if (segment_eq(get_fs(), USER_DS))
>> > __verify_pre_usermode_state();
>> > }
>> > #else
>> > static inline void verify_pre_usermode_state(void) { }
>> > #endif
>>
>> That's utterly pointless - you've missed a detail.
>>
>> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> > > { \
>> > > + bool user_caller = has_user_ds(); \
>> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> > > + if (user_caller) \
>> > > + verify_pre_usermode_state(); \
>> >
>> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>>
>> Look at this closely. has_user_ds() is called _before_ the syscall code
>> is invoked. It's checking what conditions the syscall was entered from.
>> If the syscall was entered with the user segment selected, then we run
>> a check on the system state _after_ the syscall code has returned.
>
> Indeed; I clearly did not consider this correctly.
>
> Sorry for the noise.
>
No problem, I missed that reply so discard my question on the email
few seconds ago.
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:54 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:54 UTC (permalink / raw)
To: Mark Rutland
Cc: Russell King - ARM Linux, David Howells, Dave Hansen,
Arnd Bergmann, Al Viro, René Nyffenegger, Andrew Morton,
Kees Cook, Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Will Deacon, Catalin Marinas,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 7:21 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
>> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
>> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>
>> > It would be simpler to wrap the call entirely, e.g. have:
>> >
>> > #ifdef CONFIG_WHATEVER
>> > static inline void verify_pre_usermode_state(void)
>> > {
>> > if (segment_eq(get_fs(), USER_DS))
>> > __verify_pre_usermode_state();
>> > }
>> > #else
>> > static inline void verify_pre_usermode_state(void) { }
>> > #endif
>>
>> That's utterly pointless - you've missed a detail.
>>
>> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> > > { \
>> > > + bool user_caller = has_user_ds(); \
>> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> > > + if (user_caller) \
>> > > + verify_pre_usermode_state(); \
>> >
>> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>>
>> Look at this closely. has_user_ds() is called _before_ the syscall code
>> is invoked. It's checking what conditions the syscall was entered from.
>> If the syscall was entered with the user segment selected, then we run
>> a check on the system state _after_ the syscall code has returned.
>
> Indeed; I clearly did not consider this correctly.
>
> Sorry for the noise.
>
No problem, I missed that reply so discard my question on the email
few seconds ago.
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:54 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:54 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 7:21 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> On Thu, Mar 09, 2017 at 01:44:56PM +0000, Russell King - ARM Linux wrote:
>> On Thu, Mar 09, 2017 at 12:09:55PM +0000, Mark Rutland wrote:
>> > On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>
>> > It would be simpler to wrap the call entirely, e.g. have:
>> >
>> > #ifdef CONFIG_WHATEVER
>> > static inline void verify_pre_usermode_state(void)
>> > {
>> > if (segment_eq(get_fs(), USER_DS))
>> > __verify_pre_usermode_state();
>> > }
>> > #else
>> > static inline void verify_pre_usermode_state(void) { }
>> > #endif
>>
>> That's utterly pointless - you've missed a detail.
>>
>> > > @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> > > asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> > > { \
>> > > + bool user_caller = has_user_ds(); \
>> > > long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> > > + if (user_caller) \
>> > > + verify_pre_usermode_state(); \
>> >
>> > ... then we can unconditionally use verify_pre_usermode_state() here ...
>>
>> Look at this closely. has_user_ds() is called _before_ the syscall code
>> is invoked. It's checking what conditions the syscall was entered from.
>> If the syscall was entered with the user segment selected, then we run
>> a check on the system state _after_ the syscall code has returned.
>
> Indeed; I clearly did not consider this correctly.
>
> Sorry for the noise.
>
No problem, I missed that reply so discard my question on the email
few seconds ago.
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 12:09 ` Mark Rutland
(?)
@ 2017-03-09 15:52 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:52 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
On Thu, Mar 9, 2017 at 4:09 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> This patch ensures a syscall does not return to user-mode with a kernel
>> address limit. If that happened, a process can corrupt kernel-mode
>> memory and elevate privileges.
>>
>> For example, it would mitigation this bug:
>>
>> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>>
>> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
>> state will result in a BUG_ON.
>>
>> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
>> added so each architecture can optimize this change.
>
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>> + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
I was half sure of that so that's why I added the barrier. If it is
not needed then I can remove it. Thanks!
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
>
>> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> { \
>> + bool user_caller = has_user_ds(); \
>> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> + if (user_caller) \
>> + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Not sure I understood that point. The goal is to see if get_fs was
changed, that's why I check before the syscall and I want to ensure
the call is not shuffled after the syscall, therefore the original
barrier.
>
>> __MAP(x,__SC_TEST,__VA_ARGS__); \
>> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
>> return ret; \
>
> [...]
>
>> +/* Called before coming back to user-mode */
>> +asmlinkage void verify_pre_usermode_state(void)
>
> ... and we just prepend a couple of underscores here.
>
>> +{
>> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
>> + "incorrect get_fs() on user-mode return"))
>> + set_fs(USER_DS);
>> +}
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:52 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:52 UTC (permalink / raw)
To: Mark Rutland
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Christian Borntraeger,
Luis R . Rodriguez, He Chen, Russell King, Will Deacon,
Catalin Marinas, James Morse, Pratyush Anand, Vladimir Murzin,
Chris Metcalf, Andre Przywara, Linux API, LKML,
the arch/x86 maintainers, linux-arm-kernel, Kernel Hardening
On Thu, Mar 9, 2017 at 4:09 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> This patch ensures a syscall does not return to user-mode with a kernel
>> address limit. If that happened, a process can corrupt kernel-mode
>> memory and elevate privileges.
>>
>> For example, it would mitigation this bug:
>>
>> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>>
>> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
>> state will result in a BUG_ON.
>>
>> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
>> added so each architecture can optimize this change.
>
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>> + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
I was half sure of that so that's why I added the barrier. If it is
not needed then I can remove it. Thanks!
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
>
>> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> { \
>> + bool user_caller = has_user_ds(); \
>> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> + if (user_caller) \
>> + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Not sure I understood that point. The goal is to see if get_fs was
changed, that's why I check before the syscall and I want to ensure
the call is not shuffled after the syscall, therefore the original
barrier.
>
>> __MAP(x,__SC_TEST,__VA_ARGS__); \
>> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
>> return ret; \
>
> [...]
>
>> +/* Called before coming back to user-mode */
>> +asmlinkage void verify_pre_usermode_state(void)
>
> ... and we just prepend a couple of underscores here.
>
>> +{
>> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
>> + "incorrect get_fs() on user-mode return"))
>> + set_fs(USER_DS);
>> +}
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:52 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:52 UTC (permalink / raw)
To: linux-arm-kernel
On Thu, Mar 9, 2017 at 4:09 AM, Mark Rutland <mark.rutland@arm.com> wrote:
> Hi,
>
> On Wed, Mar 08, 2017 at 05:24:53PM -0800, Thomas Garnier wrote:
>> This patch ensures a syscall does not return to user-mode with a kernel
>> address limit. If that happened, a process can corrupt kernel-mode
>> memory and elevate privileges.
>>
>> For example, it would mitigation this bug:
>>
>> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>>
>> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
>> state will result in a BUG_ON.
>>
>> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
>> added so each architecture can optimize this change.
>
>> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
>> +static inline bool has_user_ds(void) {
>> + bool ret = segment_eq(get_fs(), USER_DS);
>> + // Prevent re-ordering the call
>> + barrier();
>
> What ordering are we trying to ensure, that isn't otherwise given?
>
> We expect get_fs() and set_fs() to be ordered w.r.t. each other and
> w.r.t. uaccess uses, or we'd need barriers all over the place.
>
> Given that, I can't see why we need a barrier here. So this needs a
> better comment, at least.
>
I was half sure of that so that's why I added the barrier. If it is
not needed then I can remove it. Thanks!
>> + return ret;
>> +}
>> +#else
>> +static inline bool has_user_ds(void) {
>> + return false;
>> +}
>> +#endif
>
> It would be simpler to wrap the call entirely, e.g. have:
>
> #ifdef CONFIG_WHATEVER
> static inline void verify_pre_usermode_state(void)
> {
> if (segment_eq(get_fs(), USER_DS))
> __verify_pre_usermode_state();
> }
> #else
> static inline void verify_pre_usermode_state(void) { }
> #endif
>
>> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
>> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
>> { \
>> + bool user_caller = has_user_ds(); \
>> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
>> + if (user_caller) \
>> + verify_pre_usermode_state(); \
>
> ... then we can unconditionally use verify_pre_usermode_state() here ...
Not sure I understood that point. The goal is to see if get_fs was
changed, that's why I check before the syscall and I want to ensure
the call is not shuffled after the syscall, therefore the original
barrier.
>
>> __MAP(x,__SC_TEST,__VA_ARGS__); \
>> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
>> return ret; \
>
> [...]
>
>> +/* Called before coming back to user-mode */
>> +asmlinkage void verify_pre_usermode_state(void)
>
> ... and we just prepend a couple of underscores here.
>
>> +{
>> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
>> + "incorrect get_fs() on user-mode return"))
>> + set_fs(USER_DS);
>> +}
>
> Thanks,
> Mark.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 1:24 ` [kernel-hardening] " Thomas Garnier
@ 2017-03-09 12:32 ` Christian Borntraeger
-1 siblings, 0 replies; 60+ messages in thread
From: Christian Borntraeger @ 2017-03-09 12:32 UTC (permalink / raw)
To: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening,
Heiko Carstens, Martin Schwidefsky, linux-s390
On 03/09/2017 02:24 AM, Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
>
> For example, it would mitigation this bug:
>
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>
> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> state will result in a BUG_ON.
>
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.
>
> Signed-off-by: Thomas Garnier <thgarnie@google.com>
> ---
> Based on next-20170308
> ---
> include/linux/syscalls.h | 19 +++++++++++++++++++
> init/Kconfig | 7 +++++++
> kernel/sys.c | 8 ++++++++
> 3 files changed, 34 insertions(+)
>
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 980c3c9b06f8..78a2268ecd6e 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
> +asmlinkage void verify_pre_usermode_state(void);
> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
> + barrier();
> + return ret;
> +}
Can you please disable that for s390? (e.g. by setting
CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE for s390)
We have a separate address space for kernel/user so the logic will
be slightly different and is already handled in
commit b5a882fcf146c87cb6b67c6df353e1c042b8773d
Author: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Fri Feb 17 08:13:28 2017 +0100
s390: restore address space when returning to user space
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
> +
> +
> #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
> #define __SYSCALL_DEFINEx(x, name, ...) \
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> { \
> + bool user_caller = has_user_ds(); \
> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> + if (user_caller) \
> + verify_pre_usermode_state(); \
> __MAP(x,__SC_TEST,__VA_ARGS__); \
> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
> return ret; \
> diff --git a/init/Kconfig b/init/Kconfig
> index c859c993c26f..c4efc3a95e4a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1929,6 +1929,13 @@ config PROFILING
> config TRACEPOINTS
> bool
>
> +#
> +# Set by each architecture that want to optimize how verify_pre_usermode_state
> +# is called.
> +#
> +config ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> + bool
> +
> source "arch/Kconfig"
>
> endmenu # General setup
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 196c7134bee6..411163ac9dc3 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2459,3 +2459,11 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
> return 0;
> }
> #endif /* CONFIG_COMPAT */
> +
> +/* Called before coming back to user-mode */
> +asmlinkage void verify_pre_usermode_state(void)
> +{
> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + "incorrect get_fs() on user-mode return"))
> + set_fs(USER_DS);
> +}
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 12:32 ` Christian Borntraeger
0 siblings, 0 replies; 60+ messages in thread
From: Christian Borntraeger @ 2017-03-09 12:32 UTC (permalink / raw)
To: Thomas Garnier, David Howells, Dave Hansen, Arnd Bergmann,
Al Viro, René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara
Cc: linux-api, linux-kernel, x86, linux-arm-kernel, kernel-hardening,
Heiko Carstens, Martin Schwidefsky, linux-s390
On 03/09/2017 02:24 AM, Thomas Garnier wrote:
> This patch ensures a syscall does not return to user-mode with a kernel
> address limit. If that happened, a process can corrupt kernel-mode
> memory and elevate privileges.
>
> For example, it would mitigation this bug:
>
> - https://bugs.chromium.org/p/project-zero/issues/detail?id=990
>
> If the CONFIG_BUG_ON_DATA_CORRUPTION option is enabled, an incorrect
> state will result in a BUG_ON.
>
> The CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE option is also
> added so each architecture can optimize this change.
>
> Signed-off-by: Thomas Garnier <thgarnie@google.com>
> ---
> Based on next-20170308
> ---
> include/linux/syscalls.h | 19 +++++++++++++++++++
> init/Kconfig | 7 +++++++
> kernel/sys.c | 8 ++++++++
> 3 files changed, 34 insertions(+)
>
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 980c3c9b06f8..78a2268ecd6e 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -191,6 +191,22 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> SYSCALL_METADATA(sname, x, __VA_ARGS__) \
> __SYSCALL_DEFINEx(x, sname, __VA_ARGS__)
>
> +asmlinkage void verify_pre_usermode_state(void);
> +
> +#ifndef CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> +static inline bool has_user_ds(void) {
> + bool ret = segment_eq(get_fs(), USER_DS);
> + // Prevent re-ordering the call
> + barrier();
> + return ret;
> +}
Can you please disable that for s390? (e.g. by setting
CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE for s390)
We have a separate address space for kernel/user so the logic will
be slightly different and is already handled in
commit b5a882fcf146c87cb6b67c6df353e1c042b8773d
Author: Heiko Carstens <heiko.carstens@de.ibm.com>
Date: Fri Feb 17 08:13:28 2017 +0100
s390: restore address space when returning to user space
> +#else
> +static inline bool has_user_ds(void) {
> + return false;
> +}
> +#endif
> +
> +
> #define __PROTECT(...) asmlinkage_protect(__VA_ARGS__)
> #define __SYSCALL_DEFINEx(x, name, ...) \
> asmlinkage long sys##name(__MAP(x,__SC_DECL,__VA_ARGS__)) \
> @@ -199,7 +215,10 @@ extern struct trace_event_functions exit_syscall_print_funcs;
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)); \
> asmlinkage long SyS##name(__MAP(x,__SC_LONG,__VA_ARGS__)) \
> { \
> + bool user_caller = has_user_ds(); \
> long ret = SYSC##name(__MAP(x,__SC_CAST,__VA_ARGS__)); \
> + if (user_caller) \
> + verify_pre_usermode_state(); \
> __MAP(x,__SC_TEST,__VA_ARGS__); \
> __PROTECT(x, ret,__MAP(x,__SC_ARGS,__VA_ARGS__)); \
> return ret; \
> diff --git a/init/Kconfig b/init/Kconfig
> index c859c993c26f..c4efc3a95e4a 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -1929,6 +1929,13 @@ config PROFILING
> config TRACEPOINTS
> bool
>
> +#
> +# Set by each architecture that want to optimize how verify_pre_usermode_state
> +# is called.
> +#
> +config ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE
> + bool
> +
> source "arch/Kconfig"
>
> endmenu # General setup
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 196c7134bee6..411163ac9dc3 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -2459,3 +2459,11 @@ COMPAT_SYSCALL_DEFINE1(sysinfo, struct compat_sysinfo __user *, info)
> return 0;
> }
> #endif /* CONFIG_COMPAT */
> +
> +/* Called before coming back to user-mode */
> +asmlinkage void verify_pre_usermode_state(void)
> +{
> + if (CHECK_DATA_CORRUPTION(!segment_eq(get_fs(), USER_DS),
> + "incorrect get_fs() on user-mode return"))
> + set_fs(USER_DS);
> +}
>
^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
2017-03-09 12:32 ` [kernel-hardening] " Christian Borntraeger
@ 2017-03-09 15:53 ` Thomas Garnier
-1 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:53 UTC (permalink / raw)
To: Christian Borntraeger
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov
> On Thu, Mar 9, 2017 at 4:32 AM, Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> Can you please disable that for s390? (e.g. by setting
> CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE for s390)
> We have a separate address space for kernel/user so the logic will
> be slightly different and is already handled in
>
> commit b5a882fcf146c87cb6b67c6df353e1c042b8773d
> Author: Heiko Carstens <heiko.carstens@de.ibm.com>
> Date: Fri Feb 17 08:13:28 2017 +0100
>
> s390: restore address space when returning to user space
>
No problem, I will add it on the next iteration of this change.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread
* [kernel-hardening] Re: [PATCH v2 1/4] syscalls: Restore address limit after a syscall
@ 2017-03-09 15:53 ` Thomas Garnier
0 siblings, 0 replies; 60+ messages in thread
From: Thomas Garnier @ 2017-03-09 15:53 UTC (permalink / raw)
To: Christian Borntraeger
Cc: David Howells, Dave Hansen, Arnd Bergmann, Al Viro,
René Nyffenegger, Andrew Morton, Kees Cook,
Paul E . McKenney, David S . Miller, Andy Lutomirski,
Ard Biesheuvel, Nicolas Pitre, Petr Mladek,
Sebastian Andrzej Siewior, Sergey Senozhatsky, Helge Deller,
Rik van Riel, Ingo Molnar, Oleg Nesterov, John Stultz,
Thomas Gleixner, Pavel Tikhomirov, Frederic Weisbecker,
Stephen Smalley, Stanislav Kinsburskiy, Ingo Molnar,
H . Peter Anvin, Paolo Bonzini, Borislav Petkov, Josh Poimboeuf,
Brian Gerst, Jan Beulich, Luis R . Rodriguez, He Chen,
Russell King, Will Deacon, Catalin Marinas, Mark Rutland,
James Morse, Pratyush Anand, Vladimir Murzin, Chris Metcalf,
Andre Przywara, Linux API, LKML, the arch/x86 maintainers,
linux-arm-kernel, Kernel Hardening, Heiko Carstens,
Martin Schwidefsky, linux-s390
> On Thu, Mar 9, 2017 at 4:32 AM, Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> Can you please disable that for s390? (e.g. by setting
> CONFIG_ARCH_NO_SYSCALL_VERIFY_PRE_USERMODE_STATE for s390)
> We have a separate address space for kernel/user so the logic will
> be slightly different and is already handled in
>
> commit b5a882fcf146c87cb6b67c6df353e1c042b8773d
> Author: Heiko Carstens <heiko.carstens@de.ibm.com>
> Date: Fri Feb 17 08:13:28 2017 +0100
>
> s390: restore address space when returning to user space
>
No problem, I will add it on the next iteration of this change.
--
Thomas
^ permalink raw reply [flat|nested] 60+ messages in thread