LKML Archive on lore.kernel.org
 help / Atom feed
* [PATCH v7 0/3] Harden spectrev2 userspace-userspace protection
@ 2018-09-25 12:37 Jiri Kosina
  2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Jiri Kosina @ 2018-09-25 12:37 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Andrea Arcangeli, Woodhouse, David, Andi Kleen, Tim Chen,
	Schaufler, Casey
  Cc: linux-kernel, x86

Currently, linux kernel is basically not preventing userspace-userspace 
spectrev2 attack, because:

- IBPB is basically unused (issued only for tasks that marked themselves 
  explicitly non-dumpable, which is absolutely negligible minority of all 
  software out there), therefore cross-process branch buffer posioning 
  using spectrev2 is possible

- STIBP is completely unused, therefore cross-process branch buffer 
  poisoning using spectrev2 between processess running on two HT siblings 
  thread s is possible

This patchset changes IBPB semantics, so that it's now applied whenever 
context-switching between processess that can't use ptrace() to achieve 
the same. This admittedly comes with extra overhad on a context switch; 
systems that don't care about could disable the mitigation using 
nospectre_v2 boot option.
The IBPB implementaion is heavily based on original patches by Tim Chen.

In addition to that, we unconditionally turn STIBP on so that HT siblings 
always have separate branch buffers.

We've been carrying IBPB implementation with the same semantics in our 
(SUSE) trees since january disclosure; STIBP was more or less ignored up 
to today.

There is more work going on by Tim Chen, that will go on top, and enable 
prctl()-based more fine-grained per-process tuning of this mitigation.

v1->v2:
        include IBPB changes
v2->v3: 
        fix IBPB 'who can trace who' semantics
        wire up STIBP flipping to SMT hotplug
v3->v4:
	dropped ___ptrace_may_access(), as it's not needed
	fixed deadlock with LSM/audit/selinux (Andrea Arcangeli)
	statically patch out the ptrace check if !IBPB

v4->v5:
	fix MSR writing logic (Thomas Gleixner, Josh Poimboeuf)

v5->v6:
	propagate X86_FEATURE_RSB_CTXSW setting to sysfs
	propagate STIBP setting to sysfs (Thomas Gleixner)
	simplify arch_smt_update() (Thomas Gleixner)

v6->v7:
	PTRACE_MODE_NOACCESS_CHK -> PTRACE_MODE_SCHED and PTRACE_MODE_IBPB
		(Thomas Gleixner)
	drop unnecessary x86_spec_ctrl_base mutex in cpu_show_common()

Jiri Kosina (3):
      x86/speculation: apply IBPB more strictly to avoid cross-process data leak
      x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
      x86/speculation: Propagate information about RSB filling mitigation to sysfs

 arch/x86/kernel/cpu/bugs.c | 60 ++++++++++++++++++++++++++++++++++++++++++++++++++++++------
 arch/x86/mm/tlb.c          | 31 ++++++++++++++++++++-----------
 include/linux/ptrace.h     |  4 ++++
 kernel/cpu.c               | 11 ++++++++++-
 kernel/ptrace.c            | 12 ++++++++----
 5 files changed, 96 insertions(+), 22 deletions(-)

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak
  2018-09-25 12:37 [PATCH v7 0/3] Harden spectrev2 userspace-userspace protection Jiri Kosina
@ 2018-09-25 12:38 ` Jiri Kosina
  2018-09-26 12:30   ` [tip:x86/pti] x86/speculation: Apply " tip-bot for Jiri Kosina
  2018-09-27 20:18   ` [PATCH v7 1/3] x86/speculation: apply " Stephen Smalley
  2018-09-25 12:38 ` [PATCH v7 2/3] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Jiri Kosina
  2018-09-25 12:39 ` [PATCH v7 3/3] x86/speculation: Propagate information about RSB filling mitigation to sysfs Jiri Kosina
  2 siblings, 2 replies; 9+ messages in thread
From: Jiri Kosina @ 2018-09-25 12:38 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Andrea Arcangeli, Woodhouse, David, Andi Kleen, Tim Chen,
	Schaufler, Casey
  Cc: linux-kernel, x86

From: Jiri Kosina <jkosina@suse.cz>

Currently, we are issuing IBPB only in cases when switching into a non-dumpable
process, the rationale being to protect such 'important and security sensitive'
processess (such as GPG) from data leak into a different userspace process via
spectre v2.

This is however completely insufficient to provide proper userspace-to-userpace
spectrev2 protection, as any process can poison branch buffers before being
scheduled out, and the newly scheduled process immediately becomes spectrev2
victim.

In order to minimize the performance impact (for usecases that do require
spectrev2 protection), issue the barrier only in cases when switching between
processess where the victim can't be ptraced by the potential attacker (as in
such cases, the attacker doesn't have to bother with branch buffers at all).

[ tglx@linutronix.de: split up PTRACE_MODE_NOACCESS_CHK into PTRACE_MODE_SCHED and
  PTRACE_MODE_IBPB to be able to do ptrace() context tracking reasonably
  fine-grained ]

Fixes: 18bf3c3ea8 ("x86/speculation: Use Indirect Branch Prediction Barrier in context switch")
Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 arch/x86/mm/tlb.c      | 31 ++++++++++++++++++++-----------
 include/linux/ptrace.h | 21 +++++++++++++++++++--
 kernel/ptrace.c        | 10 ++++++++++
 3 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e96b99eb800c..073b8df349a0 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -7,6 +7,7 @@
 #include <linux/export.h>
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
+#include <linux/ptrace.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
@@ -180,6 +181,19 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
 	}
 }
 
+static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
+{
+	/*
+	 * Check if the current (previous) task has access to the memory
+	 * of the @tsk (next) task. If access is denied, make sure to
+	 * issue a IBPB to stop user->user Spectre-v2 attacks.
+	 *
+	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
+	 */
+	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
+		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+}
+
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			struct task_struct *tsk)
 {
@@ -262,18 +276,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		 * one process from doing Spectre-v2 attacks on another.
 		 *
 		 * As an optimization, flush indirect branches only when
-		 * switching into processes that disable dumping. This
-		 * protects high value processes like gpg, without having
-		 * too high performance overhead. IBPB is *expensive*!
-		 *
-		 * This will not flush branches when switching into kernel
-		 * threads. It will also not flush if we switch to idle
-		 * thread and back to the same process. It will flush if we
-		 * switch to a different non-dumpable process.
+		 * switching into a processes that can't be ptrace by the
+		 * current one (as in such case, attacker has much more
+		 * convenient way how to tamper with the next process than
+		 * branch buffer poisoning).
 		 */
-		if (tsk && tsk->mm &&
-		    tsk->mm->context.ctx_id != last_ctx_id &&
-		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
+		if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
+				ibpb_needed(tsk, last_ctx_id))
 			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 4f36431c380b..e5e5ef513df3 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -62,14 +62,17 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
 #define PTRACE_MODE_READ	0x01
 #define PTRACE_MODE_ATTACH	0x02
 #define PTRACE_MODE_NOAUDIT	0x04
-#define PTRACE_MODE_FSCREDS 0x08
-#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_FSCREDS	0x08
+#define PTRACE_MODE_REALCREDS	0x10
+#define PTRACE_MODE_SCHED	0x20
+#define PTRACE_MODE_IBPB	0x40
 
 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
 
 /**
  * ptrace_may_access - check whether the caller is permitted to access
@@ -87,6 +90,20 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
  */
 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
 
+/**
+ * ptrace_may_access - check whether the caller is permitted to access
+ * a target task.
+ * @task: target task
+ * @mode: selects type of access and caller credentials
+ *
+ * Returns true on success, false on denial.
+ *
+ * Similar to ptrace_may_access(). Only to be called from context switch
+ * code. Does not call into audit and the regular LSM hooks due to locking
+ * constraints.
+ */
+extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
+
 static inline int ptrace_reparented(struct task_struct *child)
 {
 	return !same_thread_group(child->real_parent, child->parent);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 21fec73d45d4..99cfddde6a55 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -261,6 +261,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 
 static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 {
+	if (mode & PTRACE_MODE_SCHED)
+		return false;
+
 	if (mode & PTRACE_MODE_NOAUDIT)
 		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
 	else
@@ -328,9 +331,16 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
 	     !ptrace_has_cap(mm->user_ns, mode)))
 	    return -EPERM;
 
+	if (mode & PTRACE_MODE_SCHED)
+		return 0;
 	return security_ptrace_access_check(task, mode);
 }
 
+bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
+{
+	return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
+}
+
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;

-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 2/3] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
  2018-09-25 12:37 [PATCH v7 0/3] Harden spectrev2 userspace-userspace protection Jiri Kosina
  2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
@ 2018-09-25 12:38 ` Jiri Kosina
  2018-09-26 12:31   ` [tip:x86/pti] " tip-bot for Jiri Kosina
  2018-09-25 12:39 ` [PATCH v7 3/3] x86/speculation: Propagate information about RSB filling mitigation to sysfs Jiri Kosina
  2 siblings, 1 reply; 9+ messages in thread
From: Jiri Kosina @ 2018-09-25 12:38 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Andrea Arcangeli, Woodhouse, David, Andi Kleen, Tim Chen,
	Schaufler, Casey
  Cc: linux-kernel, x86

From: Jiri Kosina <jkosina@suse.cz>

STIBP is a feature provided by certain Intel ucodes / CPUs. This feature
(once enabled) prevents cross-hyperthread control of decisions made by
indirect branch predictors.

Enable this feature if

- the CPU is vulnerable to spectre v2
- the CPU supports SMT and has SMT siblings online
- spectre_v2 mitigation autoselection is enabled (default)

After some previous discussion, this patch leaves STIBP on all the time,
as wrmsr on crossing kernel boundary is a no-no. This could perhaps later
be a bit more optimized (like disabling it in NOHZ, experiment with
disabling it in idle, etc) if needed.

Note that the synchronization of the mask manipulation via newly added
spec_ctrl_mutex is currently not strictly needed, as the only updater is
already being serialized by cpu_add_remove_lock, but let's make this a
little bit more future-proof.

Cc: stable@vger.kernel.org
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 arch/x86/kernel/cpu/bugs.c | 57 +++++++++++++++++++++++++++++++++++++++++-----
 kernel/cpu.c               | 11 ++++++++-
 2 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 40bdaea97fe7..53eb14a65610 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -35,12 +35,10 @@ static void __init spectre_v2_select_mitigation(void);
 static void __init ssb_select_mitigation(void);
 static void __init l1tf_select_mitigation(void);
 
-/*
- * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
- * writes to SPEC_CTRL contain whatever reserved bits have been set.
- */
-u64 __ro_after_init x86_spec_ctrl_base;
+/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
+u64 x86_spec_ctrl_base;
 EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
+static DEFINE_MUTEX(spec_ctrl_mutex);
 
 /*
  * The vendor and possibly platform specific bits which can be modified in
@@ -325,6 +323,46 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 	return cmd;
 }
 
+static bool stibp_needed(void)
+{
+	if (spectre_v2_enabled == SPECTRE_V2_NONE)
+		return false;
+
+	if (!boot_cpu_has(X86_FEATURE_STIBP))
+		return false;
+
+	return true;
+}
+
+static void update_stibp_msr(void *info)
+{
+	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+void arch_smt_update(void)
+{
+	u64 mask;
+
+	if (!stibp_needed())
+		return;
+
+	mutex_lock(&spec_ctrl_mutex);
+	mask = x86_spec_ctrl_base;
+	if (cpu_smt_control == CPU_SMT_ENABLED)
+		mask |= SPEC_CTRL_STIBP;
+	else
+		mask &= ~SPEC_CTRL_STIBP;
+
+	if (mask != x86_spec_ctrl_base) {
+		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
+				cpu_smt_control == CPU_SMT_ENABLED ?
+				"Enabling" : "Disabling");
+		x86_spec_ctrl_base = mask;
+		on_each_cpu(update_stibp_msr, NULL, 1);
+	}
+	mutex_unlock(&spec_ctrl_mutex);
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -424,6 +462,9 @@ static void __init spectre_v2_select_mitigation(void)
 		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}
+
+	/* Enable STIBP if appropriate */
+	arch_smt_update();
 }
 
 #undef pr_fmt
@@ -814,6 +855,8 @@ static ssize_t l1tf_show_state(char *buf)
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
+	int ret;
+
 	if (!boot_cpu_has_bug(bug))
 		return sprintf(buf, "Not affected\n");
 
@@ -831,10 +874,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
 			       spectre_v2_module_string());
+		return ret;
 
 	case X86_BUG_SPEC_STORE_BYPASS:
 		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index aa7fe85ad62e..2fb49916ea56 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2025,6 +2025,12 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
 	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { };
+
 static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
 	int cpu, ret = 0;
@@ -2051,8 +2057,10 @@ static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 		 */
 		cpuhp_offline_cpu_device(cpu);
 	}
-	if (!ret)
+	if (!ret) {
 		cpu_smt_control = ctrlval;
+		arch_smt_update();
+	}
 	cpu_maps_update_done();
 	return ret;
 }
@@ -2063,6 +2071,7 @@ static int cpuhp_smt_enable(void)
 
 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
+	arch_smt_update();
 	for_each_present_cpu(cpu) {
 		/* Skip online CPUs and CPUs on offline nodes */
 		if (cpu_online(cpu) || !node_online(cpu_to_node(cpu)))


-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v7 3/3] x86/speculation: Propagate information about RSB filling mitigation to sysfs
  2018-09-25 12:37 [PATCH v7 0/3] Harden spectrev2 userspace-userspace protection Jiri Kosina
  2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
  2018-09-25 12:38 ` [PATCH v7 2/3] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Jiri Kosina
@ 2018-09-25 12:39 ` Jiri Kosina
  2018-09-26 12:31   ` [tip:x86/pti] " tip-bot for Jiri Kosina
  2 siblings, 1 reply; 9+ messages in thread
From: Jiri Kosina @ 2018-09-25 12:39 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Andrea Arcangeli, Woodhouse, David, Andi Kleen, Tim Chen,
	Schaufler, Casey
  Cc: linux-kernel, x86

From: Jiri Kosina <jkosina@suse.cz>

If spectrev2 mitigation has been enabled, we're filling RSB on context 
switch in order to protect from various classess of spectrev2 attacks.

If this mitigation is enabled, say so in sysfs for spectrev2.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
---
 arch/x86/kernel/cpu/bugs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 53eb14a65610..fe32103fcdc7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -874,10 +874,11 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
 		return ret;
 
-- 
2.12.3


-- 
Jiri Kosina
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip:x86/pti] x86/speculation: Apply IBPB more strictly to avoid cross-process data leak
  2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
@ 2018-09-26 12:30   ` " tip-bot for Jiri Kosina
  2018-09-27 20:18   ` [PATCH v7 1/3] x86/speculation: apply " Stephen Smalley
  1 sibling, 0 replies; 9+ messages in thread
From: tip-bot for Jiri Kosina @ 2018-09-26 12:30 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: jpoimboe, ak, linux-kernel, jkosina, dwmw, aarcange,
	casey.schaufler, peterz, mingo, tglx, tim.c.chen, hpa

Commit-ID:  dbfe2953f63c640463c630746cd5d9de8b2f63ae
Gitweb:     https://git.kernel.org/tip/dbfe2953f63c640463c630746cd5d9de8b2f63ae
Author:     Jiri Kosina <jkosina@suse.cz>
AuthorDate: Tue, 25 Sep 2018 14:38:18 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 26 Sep 2018 14:26:51 +0200

x86/speculation: Apply IBPB more strictly to avoid cross-process data leak

Currently, IBPB is only issued in cases when switching into a non-dumpable
process, the rationale being to protect such 'important and security
sensitive' processess (such as GPG) from data leaking into a different
userspace process via spectre v2.

This is however completely insufficient to provide proper userspace-to-userpace
spectrev2 protection, as any process can poison branch buffers before being
scheduled out, and the newly scheduled process immediately becomes spectrev2
victim.

In order to minimize the performance impact (for usecases that do require
spectrev2 protection), issue the barrier only in cases when switching between
processess where the victim can't be ptraced by the potential attacker (as in
such cases, the attacker doesn't have to bother with branch buffers at all).

[ tglx: Split up PTRACE_MODE_NOACCESS_CHK into PTRACE_MODE_SCHED and
  PTRACE_MODE_IBPB to be able to do ptrace() context tracking reasonably
  fine-grained ]

Fixes: 18bf3c3ea8 ("x86/speculation: Use Indirect Branch Prediction Barrier in context switch")
Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251437340.15880@cbobk.fhfr.pm

---
 arch/x86/mm/tlb.c      | 31 ++++++++++++++++++++-----------
 include/linux/ptrace.h | 21 +++++++++++++++++++--
 kernel/ptrace.c        | 10 ++++++++++
 3 files changed, 49 insertions(+), 13 deletions(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index e96b99eb800c..073b8df349a0 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -7,6 +7,7 @@
 #include <linux/export.h>
 #include <linux/cpu.h>
 #include <linux/debugfs.h>
+#include <linux/ptrace.h>
 
 #include <asm/tlbflush.h>
 #include <asm/mmu_context.h>
@@ -180,6 +181,19 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
 	}
 }
 
+static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
+{
+	/*
+	 * Check if the current (previous) task has access to the memory
+	 * of the @tsk (next) task. If access is denied, make sure to
+	 * issue a IBPB to stop user->user Spectre-v2 attacks.
+	 *
+	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
+	 */
+	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
+		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
+}
+
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			struct task_struct *tsk)
 {
@@ -262,18 +276,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 		 * one process from doing Spectre-v2 attacks on another.
 		 *
 		 * As an optimization, flush indirect branches only when
-		 * switching into processes that disable dumping. This
-		 * protects high value processes like gpg, without having
-		 * too high performance overhead. IBPB is *expensive*!
-		 *
-		 * This will not flush branches when switching into kernel
-		 * threads. It will also not flush if we switch to idle
-		 * thread and back to the same process. It will flush if we
-		 * switch to a different non-dumpable process.
+		 * switching into a processes that can't be ptrace by the
+		 * current one (as in such case, attacker has much more
+		 * convenient way how to tamper with the next process than
+		 * branch buffer poisoning).
 		 */
-		if (tsk && tsk->mm &&
-		    tsk->mm->context.ctx_id != last_ctx_id &&
-		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
+		if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
+				ibpb_needed(tsk, last_ctx_id))
 			indirect_branch_prediction_barrier();
 
 		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
index 4f36431c380b..e5e5ef513df3 100644
--- a/include/linux/ptrace.h
+++ b/include/linux/ptrace.h
@@ -62,14 +62,17 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
 #define PTRACE_MODE_READ	0x01
 #define PTRACE_MODE_ATTACH	0x02
 #define PTRACE_MODE_NOAUDIT	0x04
-#define PTRACE_MODE_FSCREDS 0x08
-#define PTRACE_MODE_REALCREDS 0x10
+#define PTRACE_MODE_FSCREDS	0x08
+#define PTRACE_MODE_REALCREDS	0x10
+#define PTRACE_MODE_SCHED	0x20
+#define PTRACE_MODE_IBPB	0x40
 
 /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
 #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
 #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
 #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
+#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
 
 /**
  * ptrace_may_access - check whether the caller is permitted to access
@@ -87,6 +90,20 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
  */
 extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
 
+/**
+ * ptrace_may_access - check whether the caller is permitted to access
+ * a target task.
+ * @task: target task
+ * @mode: selects type of access and caller credentials
+ *
+ * Returns true on success, false on denial.
+ *
+ * Similar to ptrace_may_access(). Only to be called from context switch
+ * code. Does not call into audit and the regular LSM hooks due to locking
+ * constraints.
+ */
+extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
+
 static inline int ptrace_reparented(struct task_struct *child)
 {
 	return !same_thread_group(child->real_parent, child->parent);
diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 21fec73d45d4..99cfddde6a55 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -261,6 +261,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
 
 static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
 {
+	if (mode & PTRACE_MODE_SCHED)
+		return false;
+
 	if (mode & PTRACE_MODE_NOAUDIT)
 		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
 	else
@@ -328,9 +331,16 @@ ok:
 	     !ptrace_has_cap(mm->user_ns, mode)))
 	    return -EPERM;
 
+	if (mode & PTRACE_MODE_SCHED)
+		return 0;
 	return security_ptrace_access_check(task, mode);
 }
 
+bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
+{
+	return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
+}
+
 bool ptrace_may_access(struct task_struct *task, unsigned int mode)
 {
 	int err;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip:x86/pti] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation
  2018-09-25 12:38 ` [PATCH v7 2/3] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Jiri Kosina
@ 2018-09-26 12:31   ` " tip-bot for Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Jiri Kosina @ 2018-09-26 12:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: aarcange, ak, peterz, jpoimboe, tim.c.chen, hpa, linux-kernel,
	tglx, mingo, dwmw, jkosina, casey.schaufler

Commit-ID:  53c613fe6349994f023245519265999eed75957f
Gitweb:     https://git.kernel.org/tip/53c613fe6349994f023245519265999eed75957f
Author:     Jiri Kosina <jkosina@suse.cz>
AuthorDate: Tue, 25 Sep 2018 14:38:55 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 26 Sep 2018 14:26:52 +0200

x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation

STIBP is a feature provided by certain Intel ucodes / CPUs. This feature
(once enabled) prevents cross-hyperthread control of decisions made by
indirect branch predictors.

Enable this feature if

- the CPU is vulnerable to spectre v2
- the CPU supports SMT and has SMT siblings online
- spectre_v2 mitigation autoselection is enabled (default)

After some previous discussion, this leaves STIBP on all the time, as wrmsr
on crossing kernel boundary is a no-no. This could perhaps later be a bit
more optimized (like disabling it in NOHZ, experiment with disabling it in
idle, etc) if needed.

Note that the synchronization of the mask manipulation via newly added
spec_ctrl_mutex is currently not strictly needed, as the only updater is
already being serialized by cpu_add_remove_lock, but let's make this a
little bit more future-proof.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438240.15880@cbobk.fhfr.pm

---
 arch/x86/kernel/cpu/bugs.c | 57 +++++++++++++++++++++++++++++++++++++++++-----
 kernel/cpu.c               | 11 ++++++++-
 2 files changed, 61 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 40bdaea97fe7..53eb14a65610 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -35,12 +35,10 @@ static void __init spectre_v2_select_mitigation(void);
 static void __init ssb_select_mitigation(void);
 static void __init l1tf_select_mitigation(void);
 
-/*
- * Our boot-time value of the SPEC_CTRL MSR. We read it once so that any
- * writes to SPEC_CTRL contain whatever reserved bits have been set.
- */
-u64 __ro_after_init x86_spec_ctrl_base;
+/* The base value of the SPEC_CTRL MSR that always has to be preserved. */
+u64 x86_spec_ctrl_base;
 EXPORT_SYMBOL_GPL(x86_spec_ctrl_base);
+static DEFINE_MUTEX(spec_ctrl_mutex);
 
 /*
  * The vendor and possibly platform specific bits which can be modified in
@@ -325,6 +323,46 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 	return cmd;
 }
 
+static bool stibp_needed(void)
+{
+	if (spectre_v2_enabled == SPECTRE_V2_NONE)
+		return false;
+
+	if (!boot_cpu_has(X86_FEATURE_STIBP))
+		return false;
+
+	return true;
+}
+
+static void update_stibp_msr(void *info)
+{
+	wrmsrl(MSR_IA32_SPEC_CTRL, x86_spec_ctrl_base);
+}
+
+void arch_smt_update(void)
+{
+	u64 mask;
+
+	if (!stibp_needed())
+		return;
+
+	mutex_lock(&spec_ctrl_mutex);
+	mask = x86_spec_ctrl_base;
+	if (cpu_smt_control == CPU_SMT_ENABLED)
+		mask |= SPEC_CTRL_STIBP;
+	else
+		mask &= ~SPEC_CTRL_STIBP;
+
+	if (mask != x86_spec_ctrl_base) {
+		pr_info("Spectre v2 cross-process SMT mitigation: %s STIBP\n",
+				cpu_smt_control == CPU_SMT_ENABLED ?
+				"Enabling" : "Disabling");
+		x86_spec_ctrl_base = mask;
+		on_each_cpu(update_stibp_msr, NULL, 1);
+	}
+	mutex_unlock(&spec_ctrl_mutex);
+}
+
 static void __init spectre_v2_select_mitigation(void)
 {
 	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
@@ -424,6 +462,9 @@ specv2_set_mode:
 		setup_force_cpu_cap(X86_FEATURE_USE_IBRS_FW);
 		pr_info("Enabling Restricted Speculation for firmware calls\n");
 	}
+
+	/* Enable STIBP if appropriate */
+	arch_smt_update();
 }
 
 #undef pr_fmt
@@ -814,6 +855,8 @@ static ssize_t l1tf_show_state(char *buf)
 static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr,
 			       char *buf, unsigned int bug)
 {
+	int ret;
+
 	if (!boot_cpu_has_bug(bug))
 		return sprintf(buf, "Not affected\n");
 
@@ -831,10 +874,12 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		return sprintf(buf, "%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
+			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
 			       spectre_v2_module_string());
+		return ret;
 
 	case X86_BUG_SPEC_STORE_BYPASS:
 		return sprintf(buf, "%s\n", ssb_strings[ssb_mode]);
diff --git a/kernel/cpu.c b/kernel/cpu.c
index aa7fe85ad62e..2fb49916ea56 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -2025,6 +2025,12 @@ static void cpuhp_online_cpu_device(unsigned int cpu)
 	kobject_uevent(&dev->kobj, KOBJ_ONLINE);
 }
 
+/*
+ * Architectures that need SMT-specific errata handling during SMT hotplug
+ * should override this.
+ */
+void __weak arch_smt_update(void) { };
+
 static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 {
 	int cpu, ret = 0;
@@ -2051,8 +2057,10 @@ static int cpuhp_smt_disable(enum cpuhp_smt_control ctrlval)
 		 */
 		cpuhp_offline_cpu_device(cpu);
 	}
-	if (!ret)
+	if (!ret) {
 		cpu_smt_control = ctrlval;
+		arch_smt_update();
+	}
 	cpu_maps_update_done();
 	return ret;
 }
@@ -2063,6 +2071,7 @@ static int cpuhp_smt_enable(void)
 
 	cpu_maps_update_begin();
 	cpu_smt_control = CPU_SMT_ENABLED;
+	arch_smt_update();
 	for_each_present_cpu(cpu) {
 		/* Skip online CPUs and CPUs on offline nodes */
 		if (cpu_online(cpu) || !node_online(cpu_to_node(cpu)))

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [tip:x86/pti] x86/speculation: Propagate information about RSB filling mitigation to sysfs
  2018-09-25 12:39 ` [PATCH v7 3/3] x86/speculation: Propagate information about RSB filling mitigation to sysfs Jiri Kosina
@ 2018-09-26 12:31   ` " tip-bot for Jiri Kosina
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Jiri Kosina @ 2018-09-26 12:31 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: dwmw, hpa, ak, aarcange, mingo, tglx, jkosina, peterz,
	tim.c.chen, linux-kernel, jpoimboe, casey.schaufler

Commit-ID:  bb4b3b7762735cdaba5a40fd94c9303d9ffa147a
Gitweb:     https://git.kernel.org/tip/bb4b3b7762735cdaba5a40fd94c9303d9ffa147a
Author:     Jiri Kosina <jkosina@suse.cz>
AuthorDate: Tue, 25 Sep 2018 14:39:28 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 26 Sep 2018 14:26:52 +0200

x86/speculation: Propagate information about RSB filling mitigation to sysfs

If spectrev2 mitigation has been enabled, RSB is filled on context switch
in order to protect from various classes of spectrev2 attacks.

If this mitigation is enabled, say so in sysfs for spectrev2.

Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc:  "WoodhouseDavid" <dwmw@amazon.co.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc:  "SchauflerCasey" <casey.schaufler@intel.com>
Link: https://lkml.kernel.org/r/nycvar.YFH.7.76.1809251438580.15880@cbobk.fhfr.pm

---
 arch/x86/kernel/cpu/bugs.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 53eb14a65610..fe32103fcdc7 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -874,10 +874,11 @@ static ssize_t cpu_show_common(struct device *dev, struct device_attribute *attr
 		return sprintf(buf, "Mitigation: __user pointer sanitization\n");
 
 	case X86_BUG_SPECTRE_V2:
-		ret = sprintf(buf, "%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		ret = sprintf(buf, "%s%s%s%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
 			       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
 			       boot_cpu_has(X86_FEATURE_USE_IBRS_FW) ? ", IBRS_FW" : "",
 			       (x86_spec_ctrl_base & SPEC_CTRL_STIBP) ? ", STIBP" : "",
+			       boot_cpu_has(X86_FEATURE_RSB_CTXSW) ? ", RSB filling" : "",
 			       spectre_v2_module_string());
 		return ret;
 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak
  2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
  2018-09-26 12:30   ` [tip:x86/pti] x86/speculation: Apply " tip-bot for Jiri Kosina
@ 2018-09-27 20:18   ` " Stephen Smalley
  2018-09-27 20:28     ` Thomas Gleixner
  1 sibling, 1 reply; 9+ messages in thread
From: Stephen Smalley @ 2018-09-27 20:18 UTC (permalink / raw)
  To: Jiri Kosina, Thomas Gleixner, Ingo Molnar, Peter Zijlstra,
	Josh Poimboeuf, Andrea Arcangeli, Woodhouse, David, Andi Kleen,
	Tim Chen, Schaufler, Casey
  Cc: linux-kernel, x86

On 09/25/2018 08:38 AM, Jiri Kosina wrote:
> From: Jiri Kosina <jkosina@suse.cz>
> 
> Currently, we are issuing IBPB only in cases when switching into a non-dumpable
> process, the rationale being to protect such 'important and security sensitive'
> processess (such as GPG) from data leak into a different userspace process via
> spectre v2.
> 
> This is however completely insufficient to provide proper userspace-to-userpace
> spectrev2 protection, as any process can poison branch buffers before being
> scheduled out, and the newly scheduled process immediately becomes spectrev2
> victim.
> 
> In order to minimize the performance impact (for usecases that do require
> spectrev2 protection), issue the barrier only in cases when switching between
> processess where the victim can't be ptraced by the potential attacker (as in
> such cases, the attacker doesn't have to bother with branch buffers at all).
> 
> [ tglx@linutronix.de: split up PTRACE_MODE_NOACCESS_CHK into PTRACE_MODE_SCHED and
>    PTRACE_MODE_IBPB to be able to do ptrace() context tracking reasonably
>    fine-grained ]
> 
> Fixes: 18bf3c3ea8 ("x86/speculation: Use Indirect Branch Prediction Barrier in context switch")
> Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
> Signed-off-by: Jiri Kosina <jkosina@suse.cz>
> ---
>   arch/x86/mm/tlb.c      | 31 ++++++++++++++++++++-----------
>   include/linux/ptrace.h | 21 +++++++++++++++++++--
>   kernel/ptrace.c        | 10 ++++++++++
>   3 files changed, 49 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index e96b99eb800c..073b8df349a0 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -7,6 +7,7 @@
>   #include <linux/export.h>
>   #include <linux/cpu.h>
>   #include <linux/debugfs.h>
> +#include <linux/ptrace.h>
>   
>   #include <asm/tlbflush.h>
>   #include <asm/mmu_context.h>
> @@ -180,6 +181,19 @@ static void sync_current_stack_to_mm(struct mm_struct *mm)
>   	}
>   }
>   
> +static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
> +{
> +	/*
> +	 * Check if the current (previous) task has access to the memory
> +	 * of the @tsk (next) task. If access is denied, make sure to
> +	 * issue a IBPB to stop user->user Spectre-v2 attacks.
> +	 *
> +	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
> +	 */
> +	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
> +		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));

Would there be any safe way to perform the ptrace check earlier at a 
point where the locking constraints are less severe, and just pass down 
the result to this code?  Possibly just defaulting to enabling IBPB for 
safety if something changed in the interim that would invalidate the 
earlier ptrace check?  Probably not possible, but I thought I'd ask as 
it would avoid the need to skip both the ptrace_has_cap check and the 
LSM hook, and would reduce the critical section.

> +}
> +
>   void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>   			struct task_struct *tsk)
>   {
> @@ -262,18 +276,13 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>   		 * one process from doing Spectre-v2 attacks on another.
>   		 *
>   		 * As an optimization, flush indirect branches only when
> -		 * switching into processes that disable dumping. This
> -		 * protects high value processes like gpg, without having
> -		 * too high performance overhead. IBPB is *expensive*!
> -		 *
> -		 * This will not flush branches when switching into kernel
> -		 * threads. It will also not flush if we switch to idle
> -		 * thread and back to the same process. It will flush if we
> -		 * switch to a different non-dumpable process.
> +		 * switching into a processes that can't be ptrace by the
> +		 * current one (as in such case, attacker has much more
> +		 * convenient way how to tamper with the next process than
> +		 * branch buffer poisoning).
>   		 */
> -		if (tsk && tsk->mm &&
> -		    tsk->mm->context.ctx_id != last_ctx_id &&
> -		    get_dumpable(tsk->mm) != SUID_DUMP_USER)
> +		if (static_cpu_has(X86_FEATURE_USE_IBPB) &&
> +				ibpb_needed(tsk, last_ctx_id))
>   			indirect_branch_prediction_barrier();
>   
>   		if (IS_ENABLED(CONFIG_VMAP_STACK)) {
> diff --git a/include/linux/ptrace.h b/include/linux/ptrace.h
> index 4f36431c380b..e5e5ef513df3 100644
> --- a/include/linux/ptrace.h
> +++ b/include/linux/ptrace.h
> @@ -62,14 +62,17 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
>   #define PTRACE_MODE_READ	0x01
>   #define PTRACE_MODE_ATTACH	0x02
>   #define PTRACE_MODE_NOAUDIT	0x04
> -#define PTRACE_MODE_FSCREDS 0x08
> -#define PTRACE_MODE_REALCREDS 0x10
> +#define PTRACE_MODE_FSCREDS	0x08
> +#define PTRACE_MODE_REALCREDS	0x10
> +#define PTRACE_MODE_SCHED	0x20
> +#define PTRACE_MODE_IBPB	0x40
>   
>   /* shorthands for READ/ATTACH and FSCREDS/REALCREDS combinations */
>   #define PTRACE_MODE_READ_FSCREDS (PTRACE_MODE_READ | PTRACE_MODE_FSCREDS)
>   #define PTRACE_MODE_READ_REALCREDS (PTRACE_MODE_READ | PTRACE_MODE_REALCREDS)
>   #define PTRACE_MODE_ATTACH_FSCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS)
>   #define PTRACE_MODE_ATTACH_REALCREDS (PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS)
> +#define PTRACE_MODE_SPEC_IBPB (PTRACE_MODE_ATTACH_REALCREDS | PTRACE_MODE_IBPB)
>   
>   /**
>    * ptrace_may_access - check whether the caller is permitted to access
> @@ -87,6 +90,20 @@ extern void exit_ptrace(struct task_struct *tracer, struct list_head *dead);
>    */
>   extern bool ptrace_may_access(struct task_struct *task, unsigned int mode);
>   
> +/**
> + * ptrace_may_access - check whether the caller is permitted to access

s/ptrace_may_access/ptrace_may_access_sched/

> + * a target task.
> + * @task: target task
> + * @mode: selects type of access and caller credentials
> + *
> + * Returns true on success, false on denial.
> + *
> + * Similar to ptrace_may_access(). Only to be called from context switch
> + * code. Does not call into audit and the regular LSM hooks due to locking
> + * constraints.

Pardon my ignorance, but can you clarify exactly what are the locking 
constraints for any code that might be called now or in the future from 
ptrace_may_access_sched().  What's permissible?  rcu_read_lock()?

> + */
> +extern bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode);
> +
>   static inline int ptrace_reparented(struct task_struct *child)
>   {
>   	return !same_thread_group(child->real_parent, child->parent);
> diff --git a/kernel/ptrace.c b/kernel/ptrace.c
> index 21fec73d45d4..99cfddde6a55 100644
> --- a/kernel/ptrace.c
> +++ b/kernel/ptrace.c
> @@ -261,6 +261,9 @@ static int ptrace_check_attach(struct task_struct *child, bool ignore_state)
>   
>   static int ptrace_has_cap(struct user_namespace *ns, unsigned int mode)
>   {
> +	if (mode & PTRACE_MODE_SCHED)
> +		return false;
> +
>   	if (mode & PTRACE_MODE_NOAUDIT)
>   		return has_ns_capability_noaudit(current, ns, CAP_SYS_PTRACE);
>   	else
> @@ -328,9 +331,16 @@ static int __ptrace_may_access(struct task_struct *task, unsigned int mode)
>   	     !ptrace_has_cap(mm->user_ns, mode)))
>   	    return -EPERM;
>   
> +	if (mode & PTRACE_MODE_SCHED)
> +		return 0;
>   	return security_ptrace_access_check(task, mode);
>   }
>   
> +bool ptrace_may_access_sched(struct task_struct *task, unsigned int mode)
> +{
> +	return __ptrace_may_access(task, mode | PTRACE_MODE_SCHED);
> +}
> +
>   bool ptrace_may_access(struct task_struct *task, unsigned int mode)
>   {
>   	int err;
> 


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak
  2018-09-27 20:18   ` [PATCH v7 1/3] x86/speculation: apply " Stephen Smalley
@ 2018-09-27 20:28     ` Thomas Gleixner
  0 siblings, 0 replies; 9+ messages in thread
From: Thomas Gleixner @ 2018-09-27 20:28 UTC (permalink / raw)
  To: Stephen Smalley
  Cc: Jiri Kosina, Ingo Molnar, Peter Zijlstra, Josh Poimboeuf,
	Andrea Arcangeli, Woodhouse, David, Andi Kleen, Tim Chen,
	Schaufler, Casey, linux-kernel, x86

On Thu, 27 Sep 2018, Stephen Smalley wrote:
> On 09/25/2018 08:38 AM, Jiri Kosina wrote:
> >   +static bool ibpb_needed(struct task_struct *tsk, u64 last_ctx_id)
> > +{
> > +	/*
> > +	 * Check if the current (previous) task has access to the memory
> > +	 * of the @tsk (next) task. If access is denied, make sure to
> > +	 * issue a IBPB to stop user->user Spectre-v2 attacks.
> > +	 *
> > +	 * Note: __ptrace_may_access() returns 0 or -ERRNO.
> > +	 */
> > +	return (tsk && tsk->mm && tsk->mm->context.ctx_id != last_ctx_id &&
> > +		ptrace_may_access_sched(tsk, PTRACE_MODE_SPEC_IBPB));
> 
> Would there be any safe way to perform the ptrace check earlier at a point
> where the locking constraints are less severe, and just pass down the result
> to this code?  Possibly just defaulting to enabling IBPB for safety if
> something changed in the interim that would invalidate the earlier ptrace
> check?  Probably not possible, but I thought I'd ask as it would avoid the
> need to skip both the ptrace_has_cap check and the LSM hook, and would reduce
> the critical section.

It's not possible unfortunately as this happens under the scheduler run
queue lock and this needs to be taken to figure out which is the next
task. We can't drop it before context switch and revisit the decision
afterwards to verify it, that would be a massive performance issue and open
an even more horrible can of worms.

Any check which needs to be done in that context should be as minimalistic
as possible. So having a special mode which then might invoke special hooks
makes a lot of sense.

> > + * Returns true on success, false on denial.
> > + *
> > + * Similar to ptrace_may_access(). Only to be called from context switch
> > + * code. Does not call into audit and the regular LSM hooks due to locking
> > + * constraints.
> 
> Pardon my ignorance, but can you clarify exactly what are the locking
> constraints for any code that might be called now or in the future from
> ptrace_may_access_sched().  What's permissible?  rcu_read_lock()?

rcu_read_lock() is fine. Locks might be fine, but the probability that you
run into a lock inversion is extremly high. Also please keep in mind that
this wants to be a raw_spinlock as otherwise preempt-RT will have issues
and the lock sections need to be really short. switch_to() is a hot path.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, back to index

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-09-25 12:37 [PATCH v7 0/3] Harden spectrev2 userspace-userspace protection Jiri Kosina
2018-09-25 12:38 ` [PATCH v7 1/3] x86/speculation: apply IBPB more strictly to avoid cross-process data leak Jiri Kosina
2018-09-26 12:30   ` [tip:x86/pti] x86/speculation: Apply " tip-bot for Jiri Kosina
2018-09-27 20:18   ` [PATCH v7 1/3] x86/speculation: apply " Stephen Smalley
2018-09-27 20:28     ` Thomas Gleixner
2018-09-25 12:38 ` [PATCH v7 2/3] x86/speculation: Enable cross-hyperthread spectre v2 STIBP mitigation Jiri Kosina
2018-09-26 12:31   ` [tip:x86/pti] " tip-bot for Jiri Kosina
2018-09-25 12:39 ` [PATCH v7 3/3] x86/speculation: Propagate information about RSB filling mitigation to sysfs Jiri Kosina
2018-09-26 12:31   ` [tip:x86/pti] " tip-bot for Jiri Kosina

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org linux-kernel@archiver.kernel.org
	public-inbox-index lkml


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox