From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.linutronix.de (146.0.238.70:993) by crypto-ml.lab.linutronix.de with IMAP4-SSL for ; 11 Dec 2018 00:03:07 -0000 Received: from mga12.intel.com ([192.55.52.136]) by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256) (Exim 4.80) (envelope-from ) id 1gWVVy-0007nb-4t for speck@linutronix.de; Tue, 11 Dec 2018 01:03:06 +0100 Date: Mon, 10 Dec 2018 16:03:03 -0800 From: Andi Kleen Subject: [MODERATED] Re: [PATCH v2 3/8] MDSv2 5 Message-ID: <20181211000303.GB16024@tassilo.jf.intel.com> References: <0d6a3fbe4c511152a0f5350e62e9e09ec545f709.1544464266.git.ak@linux.intel.com> MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit To: speck@linutronix.de List-ID: Hi Linus, > So you have a non-inline function that > > - has a test to return early > > - does a *single* instruction > > - the instruction gets nop'ed out if it's not valid > > all of which just *screams* "that's wrong" to me. Why isn't it inline > and a static branch? I tried inline originally, but it caused bad include loops with alternative.h and mwait.h. It could be done, but would need some restructuring in the includes. I don't think it's worth it because going into idle isn't really that performance sensitive. Also if the side effect is there its overhead really swamps any call overhead. alternative already handles the noping, so I don't think we would need a static branch, unless you want to toggle it at runtime? > > The actual asm sequence also seems bad to me. > > Why isn't that just using > > int val = __KERNEL_DS; > > with the asm just doing > > "verw %[kernelds]" .. [kernelds] "m" (val) > > instead, letting gcc do the stack setup part. It has to be the memory form, only that has the necessary side effect in the microcode. > > Maybe none of this matters simply because it's in the idle case, but > the exact same issues happen for the kernel exit case. kernel exit doesn't use the C code, it's custom assembler. The only case where it may matter is KVM entry, but that is only used with the software sequence, and that is quite expensive anyways. > > And no, we're not doing that insane kernel exit software case by > default. Not without a whole lot of examples of horrid badness. > > The data that system calls touch is basically already user data. Sure, > there may be kernel pointers etc there, but we've already accepted > those leaking locally for all the usual reasons that we have no > control over. It's encryption keys etc. too. But yes. Another case is interrupt data -- for example the TCP stack sometimes copies packet data. Ok you could argue they should encrypt on the wire, but it still seems bogus to leak it. Then if there are drivers which copy data in interrupts they may also leak, but presumably that's rare these days. > > So the whole "let's add crazy long sequences to every kernel exit" is > not going to happen. Not without a lot more explanations. So you would prefer to only flush on context switch, and perhaps for kernel crypto code and perhaps TCP stack if it copied anything? One issue here is that to be really sure there no such other cases it would need some way to audit all the code. I'm not sure there's a practical way to do such an audit. FWIW from our tests so far the performance loss from the kernel exit overhead doesn't seem to be that bad. -Andi