On 18/01/18 13:48, Peter Zijlstra wrote:
From: Thomas Gleixner <tglx@linutronix.de>

Indirect Branch Speculation (IBS) is controlled per physical core. If one
thread disables it then it's disabled for the core. If a thread enters idle
it makes sense to reenable IBS so the sibling thread can run with full
speculation enabled in user space.

This only makes sense in mwait_idle_with_hints() because mwait_idle() can
serve an interrupt immediately, before speculation can be stopped again.
SKL, which requires IBRS, should use mwait_idle_with_hints(), so this is a
non-issue and in the worst case a missed optimization.

[peterz: fixed stop_indirect_branch_speculation placement]

Originally-by: Tim Chen <tim.c.chen@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/include/asm/mwait.h |   14 ++++++++++++++
 arch/x86/kernel/process.c    |   14 ++++++++++++++
 2 files changed, 28 insertions(+)

--- a/arch/x86/include/asm/mwait.h
+++ b/arch/x86/include/asm/mwait.h
@@ -6,6 +6,7 @@
 #include <linux/sched/idle.h>
 
 #include <asm/cpufeature.h>
+#include <asm/nospec-branch.h>
 
 #define MWAIT_SUBSTATE_MASK		0xf
 #define MWAIT_CSTATE_MASK		0xf
@@ -106,9 +107,22 @@ static inline void mwait_idle_with_hints
 			mb();
 		}
 
+		/*
+		 * Indirect Branch Speculation (IBS) is controlled per
+		 * physical core. If one thread disables it, then it's
+		 * disabled on all threads of the core. The kernel disables
+		 * it on entry from user space. Reenable it on the thread
+		 * which goes idle so the other thread has a chance to run
+		 * with full speculation enabled in userspace.
+		 */
+		restart_indirect_branch_speculation();
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
 		if (!need_resched())
 			__mwait(eax, ecx);
+		/*
+		 * Stop IBS again to protect kernel execution.
+		 */
+		stop_indirect_branch_speculation();

Be very careful.  The safety of this on Skylake+ depends on there not being a ret instruction which can be speculatively reached between the two WRMSRs used to play with MSR_SPEC_CTRL.

You need to guarantee that the net call tree of these five functions is forced always-inline.  A plain "static inline" function (and there are several here) is not good enough, because the compiler may still choose to out-of-line it, and a real function call is definitely a problem.
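
To illustrate the difference, here is a minimal userspace sketch, not kernel code.  Every name in it (ALWAYS_INLINE, fake_spec_ctrl, fake_wrmsr_spec_ctrl(), the sketch_*() helpers) is hypothetical, and a volatile variable stands in for MSR_SPEC_CTRL because WRMSR cannot be executed from userspace.  It assumes GCC or Clang, where the always_inline attribute forces the body into the caller so that no call/ret pair is emitted for the helper.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a forced-inline annotation (assumes GCC/Clang).
 * Unlike plain "static inline", the compiler is not free to emit an
 * out-of-line copy and call it.
 */
#define ALWAYS_INLINE static inline __attribute__((always_inline))

/* Assumption: bit 0 models the IBRS bit of the simulated register. */
#define FAKE_SPEC_CTRL_IBRS 0x1ULL

/* Simulated MSR_SPEC_CTRL; real code would issue WRMSR instead. */
static volatile uint64_t fake_spec_ctrl = FAKE_SPEC_CTRL_IBRS;

/* Innermost helper: must be forced inline too, or it becomes a real call. */
ALWAYS_INLINE void fake_wrmsr_spec_ctrl(uint64_t val)
{
	fake_spec_ctrl = val;
}

/* Models re-enabling speculation before going idle. */
ALWAYS_INLINE void sketch_restart_speculation(void)
{
	fake_wrmsr_spec_ctrl(0);
}

/* Models stopping speculation again after waking up. */
ALWAYS_INLINE void sketch_stop_speculation(void)
{
	fake_wrmsr_spec_ctrl(FAKE_SPEC_CTRL_IBRS);
}

/*
 * Models the idle path.  Everything between the two writes is the
 * sensitive window: because every helper above is forced inline, this
 * function contains no call or ret inside that window.
 */
ALWAYS_INLINE uint64_t sketch_idle(void)
{
	uint64_t seen_while_idle;

	sketch_restart_speculation();
	seen_while_idle = fake_spec_ctrl;	/* stands in for monitor/mwait */
	sketch_stop_speculation();

	return seen_while_idle;
}
```

If any one of these helpers were declared plain "static inline", the compiler would be within its rights to emit it as a separate function, which puts a ret inside the window.  Checking the disassembly of the real idle path is the only way to be sure.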

I accidentally introduced a vulnerability into Xen in my first attempt at this.  (I'm still trying to decide whether reimplementing it in asm would be better long term.)

~Andrew