From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linutronix.de (193.142.43.55:993)
 by crypto-ml.lab.linutronix.de with IMAP4-SSL
 for ; 25 Oct 2019 09:08:54 -0000
Received: from esa3.hc3370-68.iphmx.com ([216.71.145.155])
 by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
 (Exim 4.80)
 (envelope-from )
 id 1iNvaW-0003YA-WE
 for speck@linutronix.de; Fri, 25 Oct 2019 11:08:53 +0200
Subject: [MODERATED] Re: [PATCH 3/9] TAA 3
References: <580e02757c3e639bff00fcea830aa46eba46a92f.1571905227.git.bp@suse.de>
 <6f1ab744-622c-179b-276b-5506b2fd9ae1@citrix.com>
 <20191024194503.GH14115@zn.tnic>
 <38430127-3ece-dc06-2264-6b3bc347b523@citrix.com>
 <20191024201748.GL14115@zn.tnic>
 <832cb284-9852-5cfe-b71c-c3a23b85adc5@citrix.com>
 <20191025071746.GA22381@zn.tnic>
From: Andrew Cooper
Message-ID:
Date: Fri, 25 Oct 2019 10:08:41 +0100
MIME-Version: 1.0
In-Reply-To: <20191025071746.GA22381@zn.tnic>
Content-Type: multipart/mixed; boundary="oek0JYEW6g5ZKYv9LAAwl3WdldPrH8osk";
 protected-headers="v1"
To: speck@linutronix.de
List-ID:

--oek0JYEW6g5ZKYv9LAAwl3WdldPrH8osk
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable
Content-Language: en-GB

On 25/10/2019 08:17, speck for Borislav Petkov wrote:
> On Thu, Oct 24, 2019 at 11:38:21PM +0100, speck for Andrew Cooper wrote:
>> I don't necessarily disagree, but the customers (who ultimately pay my
>> salary) want late microcode loading and livepatching, so we've delivered.
> Yeah, you guys promised too much. How do you deal with userspace using
> a feature and you wanna upgrade microcode which disables it? TSX might
> not be a good example here

It's the perfect example here.  The answer is by requesting that Intel
change bit 0's behaviour from causing #UD's to causing aborts.
The first version of this microcode was definitely not safe to late load.

> because feature bits disappearing is still ok,

Some userspace apparently gets confused when CPUID changes behind its
back, which is why the CPUID control in bit 1 was split out from an
otherwise monolithic bit 0.

At late load, choose (or not) to use bit 0 only.  At boot, choose (or
not) both bits 0 and 1 in unison.

> it doesn't fault but it would simply start aborting transactions
> unconditionally but what if it is a CPU feature which userspace is
> actively using and it disappears underneath its feet all of a sudden?

No one has guaranteed that all microcode ever in the future is going to
be safe to use on a running system.  If it really can't be made to be
safe, then customers are really going to have to reboot.

However, there is a lot of effort going into trying to make sure that
fixes such as this one are made safe for late loading.

To give a concrete example, we have customers whose elapsed time for a
reboot, conforming to SLAs, is in excess of 9 months, and new microcode
with critical fixes is coming out faster than that.  I bet that I'm not
the only person on this list with this type of customer.

> Just upgrade the microcode and forget about it is not enough. I'm pretty
> sure you'll have to "dance". But hey, you can buy almost everything with
> money nowadays so... :-)

Yeah, you have to dance, but the constituent pieces are already around,
so it's not too bad.

>
>> Skylake CPUs aren't getting TSX_CTRL, but force setting/clearing bits at
>> boot will affect later logic.  (Unless I'm being blind while reading the
>> patches, which is a distinct possibility).
> Yes, that's why I'm saying we should not blindly force set and clear
> bits but mirror what CPUID is telling us. At least wrt TSX.
>

Ah - in which case I agree.  Sorry for the noise.

~Andrew

--oek0JYEW6g5ZKYv9LAAwl3WdldPrH8osk--