From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail.linutronix.de (146.0.238.70:993)
	by crypto-ml.lab.linutronix.de with IMAP4-SSL
	for ; 13 Dec 2018 16:53:14 -0000
Received: from mx2.suse.de ([195.135.220.15] helo=mx1.suse.de)
	by Galois.linutronix.de with esmtps (TLS1.2:DHE_RSA_AES_256_CBC_SHA256:256)
	(Exim 4.80)
	(envelope-from )
	id 1gXUEa-0003zU-MB
	for speck@linutronix.de; Thu, 13 Dec 2018 17:53:13 +0100
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx1.suse.de (Postfix) with ESMTP id B7E59AE4B
	for ; Thu, 13 Dec 2018 16:53:05 +0000 (UTC)
Date: Thu, 13 Dec 2018 17:52:56 +0100
From: Borislav Petkov
Subject: [MODERATED] Re: [PATCH v2 2/8] MDSv2 1
Message-ID: <20181213165256.GB26795@zn.tnic>
References: <87784B01-00DE-4E4E-AF13-7CEFC9903019@oracle.com>
 <20181211100335.GB25994@zn.tnic>
 <20181212213147.GR9077@char.us.oracle.com>
 <20181212221731.GF6696@zn.tnic>
 <20181212224046.GF7946@char.us.oracle.com>
 <20181212224553.GG6696@zn.tnic>
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
To: speck@linutronix.de
List-ID:

On Thu, Dec 13, 2018 at 07:15:29AM -0800, speck for Andrew Cooper wrote:
> No - not if they can possibly avoid it.

Well, from the engineers of the cloud company I know - probably the
biggest one - I hear a completely different story and they *do* have to
reboot. And in the cases where they reboot, they can update microcode
too.

I'm willing to bet a lot of beers that microcode updates come a lot more
seldom than kernel updates. And if it weren't for the year of
mitigations 2018, late loading wouldn't've even been a topic at all.
Because microcode updates are *really* that seldom. Patch memory being
really limited and already carrying some other patches comes to mind.

So the normal flow on the boxes out there is, at some point you get the
updated microcode package which gets installed onto the system.
The next time you update the kernel or you have to reboot the box, you
regenerate the initrd and you have your microcode update ready too.
Done. Problem solved and then some.

> Within the constraints of "quiesce the system with stop_machine()", yes.

Except that is not enough in every case:

static bool is_blacklisted(unsigned int cpu)
{
	struct cpuinfo_x86 *c = &cpu_data(cpu);

	/*
	 * Late loading on model 79 with microcode revision less than 0x0b000021
	 * and LLC size per core bigger than 2.5MB may result in a system hang.
	 * This behavior is documented in item BDF90, #334165 (Intel Xeon
	 * Processor E7-8800/4800 v4 Product Family).
	 */
	if (c->x86 == 6 &&
	    c->x86_model == INTEL_FAM6_BROADWELL_X &&
	    c->x86_stepping == 0x01 &&
	    llc_size_per_core > 2621440 &&
	    c->microcode < 0x0b000021) {
		pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
		pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");

Yap, that's an erratum for late loading. Those customers can't do late
loading because shit freezes the box. Good luck!

And why are we doing such a complex conditional, you ask?

Well, we're trying to salvage the situation because, yeah, late loading
is just a *bad* *bad* idea.

And this is the first erratum I'm aware of. I betcha more beers there
are other problems with late loading. The Debian people did blacklist a
couple of microcode revisions for that reason too, for example.

> I've got some customers to have said in no uncertain terms that they
> want to reboot to get mitigations for these forthcoming issues.  Said
> customers are large enough that Intel is currently engaged, and have
> tentatively said that it is fine to load ucode in parallel on every
> core during stop machine, rather than serially.

Who said that, Intel or the customers?

If it is Intel, see above.

If it is the customers, they are more than free to support their own
late loading.

> I'm expecting Intel to propose this change in Linux as well as Xen.

I have seen the "proposal".

> I personally think everyone should reboot and call it done,

Exactly.

> but at the end of the day, I'm beholden to my customers, and they
> really really do want late load microcode to work.

Huh? They want to reboot but they still want late microcode loading?

I have no idea what your argument here is.

-- 
Regards/Gruss,
    Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
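
For reference, the BDF90 condition quoted in the message can be
exercised outside the kernel. The following is only a minimal userspace
sketch, not the kernel code: struct cpu_sig, late_load_blacklisted() and
the three macros are hypothetical names introduced here for
illustration, and the value 0x4f for the model is an assumption taken
from the "model 79" wording in the quoted comment. The thresholds are
the ones from the excerpt.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct cpuinfo_x86 fields the check reads. */
struct cpu_sig {
	unsigned int family;
	unsigned int model;
	unsigned int stepping;
	uint32_t microcode;             /* currently loaded microcode revision */
	unsigned int llc_size_per_core; /* last-level cache per core, in bytes */
};

/* Illustrative constants; values taken from the quoted excerpt. */
#define MODEL_BROADWELL_X 0x4f        /* assumption: "model 79" == 0x4f */
#define BDF90_FIXED_REV   0x0b000021u /* first revision without the hang */
#define BDF90_LLC_LIMIT   2621440u    /* 2.5 MB = 2.5 * 1024 * 1024 bytes */

/*
 * Returns true when late loading should be refused: every part of the
 * erratum condition has to match, so any single mismatch allows loading.
 */
static bool late_load_blacklisted(const struct cpu_sig *c)
{
	return c->family == 6 &&
	       c->model == MODEL_BROADWELL_X &&
	       c->stepping == 0x01 &&
	       c->llc_size_per_core > BDF90_LLC_LIMIT &&
	       c->microcode < BDF90_FIXED_REV;
}
```

Note that the comparison on the revision is strict, so a part already
running 0x0b000021 is not blacklisted, and the cache comparison is
strict too, so a part with exactly 2.5MB of LLC per core is not affected
either.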