linux-kernel.vger.kernel.org archive mirror
* x86, microcode: BUG: microcode update that changes x86_capability
@ 2014-09-18 13:52 Henrique de Moraes Holschuh
  2014-09-18 19:14 ` Andy Lutomirski
  2014-09-22  0:37 ` Andi Kleen
  0 siblings, 2 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-18 13:52 UTC (permalink / raw)
  To: linux-kernel; +Cc: Borislav Petkov, H Peter Anvin

The new Haswell microcode update[1] removes the "hle" (hardware lock
elision) processor capability.  And it is not cosmetic, either: Intel TSX
opcodes will cause an illegal opcode trap after the microcode update[2].

This means cpu_info()->x86_capability becomes stale after the microcode
update.

We could add logic to compute the new x86_capability after a microcode
update run, and OOPS the kernel if something too important (i.e. anything
the kernel uses) went away.  Otherwise, refresh cpu_info()->x86_capability.

Is that doable?
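
A minimal sketch of the refresh-and-compare idea above, written as plain C
rather than actual kernel code.  The struct, the function name and the
in_use_mask parameter are made up for illustration; only the bit positions
(HLE is CPUID.(EAX=7,ECX=0):EBX bit 4, RTM is bit 11) are real.

```c
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX feature bits for Intel TSX. */
#define TSX_HLE_BIT (1u << 4)
#define TSX_RTM_BIT (1u << 11)

/* Outcome of comparing a cached capability word with a fresh read. */
struct cap_refresh_result {
	uint32_t lost;   /* bits set before the update, clear after it */
	uint32_t gained; /* bits clear before the update, set after it */
	int fatal;       /* non-zero if a lost bit is one the kernel uses */
};

/*
 * Hypothetical helper: compare one cached capability word against the
 * value re-read after a microcode update.  in_use_mask is an assumed
 * mask of features the kernel already depends on.  The cache is only
 * refreshed when nothing in that mask went away.
 */
static struct cap_refresh_result
refresh_capability_word(uint32_t *cached, uint32_t fresh, uint32_t in_use_mask)
{
	struct cap_refresh_result r;

	r.lost = *cached & ~fresh;
	r.gained = fresh & ~*cached;
	r.fatal = (r.lost & in_use_mask) != 0;

	if (!r.fatal)
		*cached = fresh; /* safe to adopt the new view */

	return r;
}
```

A caller would treat r.fatal as the "too important to lose" case and
otherwise carry on with the refreshed word.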


[1] sig 0x000306f2, pf mask 0x6f, 2014-09-03, rev 0x0029, size 28672
    sig 0x000306c3, pf mask 0x32, 2014-07-03, rev 0x001c, size 21504
    sig 0x00040651, pf mask 0x72, 2014-07-03, rev 0x001c, size 20480
    sig 0x00040661, pf mask 0x32, 2014-07-03, rev 0x0012, size 23552

[2] instantly segfaulting every running process using libpthread-2.19,
    as well as any other users of Intel TSX.
    https://bugs.launchpad.net/intel/+bug/1370352

    And yes, this means we will kill support for microcode updates
    outside of the initramfs/early-initramfs, at least in Debian,
    and likely in Ubuntu.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 13:52 x86, microcode: BUG: microcode update that changes x86_capability Henrique de Moraes Holschuh
@ 2014-09-18 19:14 ` Andy Lutomirski
  2014-09-18 19:53   ` Chuck Ebbert
  2014-09-19 16:11   ` Henrique de Moraes Holschuh
  2014-09-22  0:37 ` Andi Kleen
  1 sibling, 2 replies; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-18 19:14 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh, linux-kernel; +Cc: Borislav Petkov, H Peter Anvin

On 09/18/2014 06:52 AM, Henrique de Moraes Holschuh wrote:
> The new Haswell microcode update[1] removes the "hle" (hardware lock
> elision) processor capability.  And it is not cosmetic, either: Intel TSX
> opcodes will cause an illegal opcode trap after the microcode update[2].
> 
> This means cpu_info()->x86_capability becomes stale after the microcode
> update.
> 
> We could add logic to compute the new x86_capability after a microcode
> update run, and OOPS the kernel if something too important (i.e. anything
> the kernel uses) went away.  Otherwise, refresh cpu_info()->x86_capability.
> 
> Is that doable?
> 
> 
> [1] sig 0x000306f2, pf mask 0x6f, 2014-09-03, rev 0x0029, size 28672
>     sig 0x000306c3, pf mask 0x32, 2014-07-03, rev 0x001c, size 21504
>     sig 0x00040651, pf mask 0x72, 2014-07-03, rev 0x001c, size 20480
>     sig 0x00040661, pf mask 0x32, 2014-07-03, rev 0x0012, size 23552

This is HSD136, right?  Do you have a link to where that ucode comes
from?  Does it have release notes?

> 
> [2] instantly segfaulting every running process using libpthread-2.19,
>     as well as any other users of Intel TSX.
>     https://bugs.launchpad.net/intel/+bug/1370352
> 
>     And yes, this means we will kill support for microcode updates
>     outside of the initramfs/early-initramfs, at least in Debian,
>     and likely in Ubuntu.
> 

Given that there is exactly one microcode update like this (at least of
the sort that blows up userspace), I think that we should seriously
consider blacklisting just this particular microcode update once
userspace is running.

--Andy


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 19:14 ` Andy Lutomirski
@ 2014-09-18 19:53   ` Chuck Ebbert
  2014-09-18 19:55     ` H. Peter Anvin
  2014-09-19  9:56     ` Henrique de Moraes Holschuh
  2014-09-19 16:11   ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 47+ messages in thread
From: Chuck Ebbert @ 2014-09-18 19:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Henrique de Moraes Holschuh, linux-kernel, Borislav Petkov,
	H Peter Anvin

On Thu, 18 Sep 2014 12:14:59 -0700
Andy Lutomirski <luto@amacapital.net> wrote:

> On 09/18/2014 06:52 AM, Henrique de Moraes Holschuh wrote:
> > The new Haswell microcode update[1] removes the "hle" (hardware lock
> > elision) processor capability.  And it is not cosmetic, either: Intel TSX
> > opcodes will cause an illegal opcode trap after the microcode update[2].
> > 
> > This means cpu_info()->x86_capability becomes stale after the microcode
> > update.
> > 
> > We could add logic to compute the new x86_capability after a microcode
> > update run, and OOPS the kernel if something too important (i.e. anything
> > the kernel uses) went away.  Otherwise, refresh cpu_info()->x86_capability.
> > 
> > Is that doable?
> > 
> > 
> > [1] sig 0x000306f2, pf mask 0x6f, 2014-09-03, rev 0x0029, size 28672
> >     sig 0x000306c3, pf mask 0x32, 2014-07-03, rev 0x001c, size 21504
> >     sig 0x00040651, pf mask 0x72, 2014-07-03, rev 0x001c, size 20480
> >     sig 0x00040661, pf mask 0x32, 2014-07-03, rev 0x0012, size 23552
> 
> This is HSD136, right?  Do you have a link to where that ucode comes
> from?  Does it have release notes?
> 

https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=24290&lang=eng

I can't find any release notes.

Haswell-EP is also affected, it appears:

http://techreport.com/news/26911/errata-prompts-intel-to-disable-tsx-in-haswell-early-broadwell-cpus

> > 
> > [2] instantly segfaulting every running process using libpthread-2.19,
> >     as well as any other users of Intel TSX.
> >     https://bugs.launchpad.net/intel/+bug/1370352
> > 
> >     And yes, this means we will kill support for microcode updates
> >     outside of the initramfs/early-initramfs, at least in Debian,
> >     and likely in Ubuntu.
> > 
> 
> Given that there is exactly one microcode update like this (at least of
> the sort that blows up userspace), I think that we should seriously
> consider blacklisting just this particular microcode update once
> userspace is running.
> 

All future updates for these CPUs will have this problem.


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 19:53   ` Chuck Ebbert
@ 2014-09-18 19:55     ` H. Peter Anvin
  2014-09-18 20:06       ` Henrique de Moraes Holschuh
  2014-09-19  9:56     ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2014-09-18 19:55 UTC (permalink / raw)
  To: Chuck Ebbert, Andy Lutomirski
  Cc: Henrique de Moraes Holschuh, linux-kernel, Borislav Petkov

We should, but this is also part of why we want the early ucode capability.

On September 18, 2014 12:53:28 PM PDT, Chuck Ebbert <cebbert.lkml@gmail.com> wrote:
>On Thu, 18 Sep 2014 12:14:59 -0700
>Andy Lutomirski <luto@amacapital.net> wrote:
>
>> On 09/18/2014 06:52 AM, Henrique de Moraes Holschuh wrote:
>> > The new Haswell microcode update[1] removes the "hle" (hardware lock
>> > elision) processor capability.  And it is not cosmetic, either: Intel TSX
>> > opcodes will cause an illegal opcode trap after the microcode update[2].
>> > 
>> > This means cpu_info()->x86_capability becomes stale after the microcode
>> > update.
>> > 
>> > We could add logic to compute the new x86_capability after a microcode
>> > update run, and OOPS the kernel if something too important (i.e. anything
>> > the kernel uses) went away.  Otherwise, refresh cpu_info()->x86_capability.
>> > 
>> > Is that doable?
>> > 
>> > 
>> > [1] sig 0x000306f2, pf mask 0x6f, 2014-09-03, rev 0x0029, size 28672
>> >     sig 0x000306c3, pf mask 0x32, 2014-07-03, rev 0x001c, size 21504
>> >     sig 0x00040651, pf mask 0x72, 2014-07-03, rev 0x001c, size 20480
>> >     sig 0x00040661, pf mask 0x32, 2014-07-03, rev 0x0012, size 23552
>> 
>> This is HSD136, right?  Do you have a link to where that ucode comes
>> from?  Does it have release notes?
>> 
>
>https://downloadcenter.intel.com/Detail_Desc.aspx?DwnldID=24290&lang=eng
>
>I can't find any release notes.
>
>Haswell-EP is also affected, it appears:
>
>http://techreport.com/news/26911/errata-prompts-intel-to-disable-tsx-in-haswell-early-broadwell-cpus
>
>> > 
>> > [2] instantly segfaulting every running process using libpthread-2.19,
>> >     as well as any other users of Intel TSX.
>> >     https://bugs.launchpad.net/intel/+bug/1370352
>> > 
>> >     And yes, this means we will kill support for microcode updates
>> >     outside of the initramfs/early-initramfs, at least in Debian,
>> >     and likely in Ubuntu.
>> > 
>> 
>> Given that there is exactly one microcode update like this (at least of
>> the sort that blows up userspace), I think that we should seriously
>> consider blacklisting just this particular microcode update once
>> userspace is running.
>> 
>
>All future updates for these CPUs will have this problem.

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 19:55     ` H. Peter Anvin
@ 2014-09-18 20:06       ` Henrique de Moraes Holschuh
  2014-09-19  0:13         ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-18 20:06 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Chuck Ebbert, Andy Lutomirski, linux-kernel, Borislav Petkov

On Thu, 18 Sep 2014, H. Peter Anvin wrote:
> We should, but this is also part of why we want the early ucode capability.

Well, yes.  But that won't help the several stable and LTS distros with
kernels without early ucode update support.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 20:06       ` Henrique de Moraes Holschuh
@ 2014-09-19  0:13         ` Henrique de Moraes Holschuh
  2014-09-19  0:23           ` Andy Lutomirski
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19  0:13 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Chuck Ebbert, Andy Lutomirski, linux-kernel, Borislav Petkov

On Thu, 18 Sep 2014, Henrique de Moraes Holschuh wrote:
> On Thu, 18 Sep 2014, H. Peter Anvin wrote:
> > We should, but this is also part of why we want the early ucode capability.
> 
> Well, yes.  But that won't help the several stable and LTS distros with
> kernels without early ucode update support.

Here's a plan that might work, pending actually checking the libpthread TSX
code to make sure it keys on /proc/cpuinfo flags:

Add a cpu quirk, triggered by the Haswell cpuids, to force-disable hle on
the affected processors.

This will work around the x86_capability issue (which should
still be fixed, anyway), and it should also get userspace to stay away from
TSX, therefore also working around the worst issue (processes getting
SIGILL).

This will disable the "user may ask the BIOS to keep TSX enabled"
anti-feature, though.  This drawback can be avoided, but only if a future
microcode update won't re-disable hle when the BIOS enabled it.  For now, I
suggest that we decree that "hle is toast" for the current Haswells and add
back ways to enable it for testing when we know more about it.
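
A rough sketch of what such a quirk could look like, in plain C rather than
real kernel code.  The signature table simply reuses the four signatures
from the microcode list earlier in this thread, and the function names are
hypothetical.  Note that this only masks the kernel's own cached copy of the
feature word (and so what /proc/cpuinfo would show); it does not change what
the CPUID instruction itself returns.

```c
#include <stddef.h>
#include <stdint.h>

/* CPUID.(EAX=7,ECX=0):EBX bit 4 reports HLE. */
#define TSX_HLE_BIT (1u << 4)

/* CPU signatures copied from the microcode list in this thread. */
static const uint32_t hle_quirk_sigs[] = {
	0x000306f2, 0x000306c3, 0x00040651, 0x00040661,
};

/* Returns 1 if the given CPU signature is in the quirk table. */
static int sig_needs_hle_quirk(uint32_t sig)
{
	size_t i;

	for (i = 0; i < sizeof(hle_quirk_sigs) / sizeof(hle_quirk_sigs[0]); i++)
		if (hle_quirk_sigs[i] == sig)
			return 1;
	return 0;
}

/*
 * Hypothetical quirk: given the raw leaf 7 EBX word, return the value
 * the kernel should cache, with HLE forced off on affected signatures.
 */
static uint32_t apply_hle_quirk(uint32_t sig, uint32_t leaf7_ebx)
{
	if (sig_needs_hle_quirk(sig))
		leaf7_ebx &= ~TSX_HLE_BIT;
	return leaf7_ebx;
}
```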

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19  0:13         ` Henrique de Moraes Holschuh
@ 2014-09-19  0:23           ` Andy Lutomirski
  2014-09-19  0:28             ` H. Peter Anvin
  2014-09-19 11:00             ` Henrique de Moraes Holschuh
  0 siblings, 2 replies; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-19  0:23 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Sep 18, 2014 5:13 PM, "Henrique de Moraes Holschuh" <hmh@hmh.eng.br> wrote:
>
> On Thu, 18 Sep 2014, Henrique de Moraes Holschuh wrote:
> > On Thu, 18 Sep 2014, H. Peter Anvin wrote:
> > > We should, but this is also part of why we want the early ucode capability.
> >
> > Well, yes.  But that won't help the several stable and LTS distros with
> > kernels without early ucode update support.
>
> Here's a plan that might work, pending actually checking the libpthread TSX
> code to make sure it keys on /proc/cpuinfo flags:

Surely it checks cpuid directly, though.

Can we twiddle the cpuid bit?  I never noticed any way in the docs to
do it, but if BIOS has such an ability, maybe we do, too.  I wonder if
there's anything semi-documented in biosbits, or if we could just
reverse-engineer it.

--Andy

>
> Add a cpu quirk, triggered by the Haswell cpuids, to force-disable hle on
> the affected processors.
>
> This will work around the x86_capability capability issue (which should
> still be fixed, anyway), and it should also get userspace to stay away from
> TSX, therefore also working around the worst issue (processes getting
> SIGILL).
>
> This will disable the "user may ask the BIOS to keep TSX enabled"
> anti-feature, though.  This drawback can be avoided, but only if a future
> microcode update won't re-disable hle when the BIOS enabled it.  For now, I
> suggest that we decree that "hle is toast" for the current Haswells and add
> back ways to enable it for testing when we know more about it.
>
> --
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19  0:23           ` Andy Lutomirski
@ 2014-09-19  0:28             ` H. Peter Anvin
  2014-09-19  1:00               ` Andy Lutomirski
  2014-09-19 11:00             ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2014-09-19  0:28 UTC (permalink / raw)
  To: Andy Lutomirski, Henrique de Moraes Holschuh
  Cc: Borislav Petkov, Chuck Ebbert, linux-kernel

The cpuid bit gets twiddled...

On September 18, 2014 5:23:40 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
>On Sep 18, 2014 5:13 PM, "Henrique de Moraes Holschuh" <hmh@hmh.eng.br> wrote:
>>
>> On Thu, 18 Sep 2014, Henrique de Moraes Holschuh wrote:
>> > On Thu, 18 Sep 2014, H. Peter Anvin wrote:
>> > > We should, but this is also part of why we want the early ucode capability.
>> >
>> > Well, yes.  But that won't help the several stable and LTS distros with
>> > kernels without early ucode update support.
>>
>> Here's a plan that might work, pending actually checking the libpthread TSX
>> code to make sure it keys on /proc/cpuinfo flags:
>
>Surely it checks cpuid directly, though.
>
>Can we twiddle the cpuid bit?  I never noticed any way in the docs to
>do it, but if BIOS has such an ability, maybe we do, too.  I wonder if
>there's anything semi-documented in biosbits, or if we could just
>reverse-engineer it.
>
>--Andy
>
>>
>> Add a cpu quirk, triggered by the Haswell cpuids, to force-disable hle on
>> the affected processors.
>>
>> This will work around the x86_capability capability issue (which should
>> still be fixed, anyway), and it should also get userspace to stay away from
>> TSX, therefore also working around the worst issue (processes getting
>> SIGILL).
>>
>> This will disable the "user may ask the BIOS to keep TSX enabled"
>> anti-feature, though.  This drawback can be avoided, but only if a future
>> microcode update won't re-disable hle when the BIOS enabled it.  For now, I
>> suggest that we decree that "hle is toast" for the current Haswells and add
>> back ways to enable it for testing when we know more about it.
>>
>> --
>>   "One disk to rule them all, One disk to find them. One disk to bring
>>   them all and in the darkness grind them. In the Land of Redmond
>>   where the shadows lie." -- The Silicon Valley Tarot
>>   Henrique Holschuh

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19  0:28             ` H. Peter Anvin
@ 2014-09-19  1:00               ` Andy Lutomirski
  2014-09-19  8:03                 ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-19  1:00 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: linux-kernel, Chuck Ebbert, Borislav Petkov, Henrique de Moraes Holschuh

On Sep 18, 2014 5:28 PM, "H. Peter Anvin" <hpa@zytor.com> wrote:
>
> The cpuid bit gets twiddled...

Yes, but how?  I assume that BIOS isn't switching between two
different ucode blobs, and I don't know about any wrcpuid instruction.
So there must be *some* way, at least on new ucode (and maybe on old
ucode) to change that bit.  If we could do that in the kernel, we
might be able to come up with a more intelligent way to handle this
ucode gotcha (such as, ideally, clearing the bit ourselves on old
ucode).

>
> On September 18, 2014 5:23:40 PM PDT, Andy Lutomirski <luto@amacapital.net> wrote:
> >On Sep 18, 2014 5:13 PM, "Henrique de Moraes Holschuh" <hmh@hmh.eng.br> wrote:
> >>
> >> On Thu, 18 Sep 2014, Henrique de Moraes Holschuh wrote:
> >> > On Thu, 18 Sep 2014, H. Peter Anvin wrote:
> >> > > We should, but this is also part of why we want the early ucode capability.
> >> >
> >> > Well, yes.  But that won't help the several stable and LTS distros with
> >> > kernels without early ucode update support.
> >>
> >> Here's a plan that might work, pending actually checking the libpthread TSX
> >> code to make sure it keys on /proc/cpuinfo flags:
> >
> >Surely it checks cpuid directly, though.
> >
> >Can we twiddle the cpuid bit?  I never noticed any way in the docs to
> >do it, but if BIOS has such an ability, maybe we do, too.  I wonder if
> >there's anything semi-documented in biosbits, or if we could just
> >reverse-engineer it.
> >
> >--Andy
> >
> >>
> >> Add a cpu quirk, triggered by the Haswell cpuids, to force-disable hle on
> >> the affected processors.
> >>
> >> This will work around the x86_capability capability issue (which should
> >> still be fixed, anyway), and it should also get userspace to stay away from
> >> TSX, therefore also working around the worst issue (processes getting
> >> SIGILL).
> >>
> >> This will disable the "user may ask the BIOS to keep TSX enabled"
> >> anti-feature, though.  This drawback can be avoided, but only if a future
> >> microcode update won't re-disable hle when the BIOS enabled it.  For now, I
> >> suggest that we decree that "hle is toast" for the current Haswells and add
> >> back ways to enable it for testing when we know more about it.
> >>
> >> --
> >>   "One disk to rule them all, One disk to find them. One disk to bring
> >>   them all and in the darkness grind them. In the Land of Redmond
> >>   where the shadows lie." -- The Silicon Valley Tarot
> >>   Henrique Holschuh
>
> --
> Sent from my mobile phone.  Please pardon brevity and lack of formatting.


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19  1:00               ` Andy Lutomirski
@ 2014-09-19  8:03                 ` Borislav Petkov
  0 siblings, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2014-09-19  8:03 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: H. Peter Anvin, linux-kernel, Chuck Ebbert, Henrique de Moraes Holschuh

On Thu, Sep 18, 2014 at 06:00:12PM -0700, Andy Lutomirski wrote:
> Yes, but how?  I assume that BIOS isn't switching between two
> different ucode blobs, and I don't know about any wrcpuid instruction.
> So there must be *some* way, at least on new ucode (and maybe on old
> ucode) to change that bit.

I'd venture a guess that WRMSR to some reg should give you that, if they
expose it.

-- 
Regards/Gruss,
    Boris.
--


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 19:53   ` Chuck Ebbert
  2014-09-18 19:55     ` H. Peter Anvin
@ 2014-09-19  9:56     ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19  9:56 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Andy Lutomirski, linux-kernel, Borislav Petkov, H Peter Anvin

On Thu, 18 Sep 2014, Chuck Ebbert wrote:
> > > [1] sig 0x000306f2, pf mask 0x6f, 2014-09-03, rev 0x0029, size 28672
> > >     sig 0x000306c3, pf mask 0x32, 2014-07-03, rev 0x001c, size 21504
> > >     sig 0x00040651, pf mask 0x72, 2014-07-03, rev 0x001c, size 20480
> > >     sig 0x00040661, pf mask 0x32, 2014-07-03, rev 0x0012, size 23552

...

> > Given that there is exactly one microcode update like this (at least of
> > the sort that blows up userspace), I think that we should seriously
> > consider blacklisting just this particular microcode update once
> > userspace is running.
> > 
> 
> All future updates for these CPUs will have this problem.

Sort of.  Any update _from_ microcodes earlier than the above, will.

Otherwise, it depends on whether Intel TSX "testing mode" is enabled in
BIOS, and whether it "sticks" across a microcode update or gets reset to the
default of "disabled".  Updating from "disabled" to "disabled" should be
safe.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19  0:23           ` Andy Lutomirski
  2014-09-19  0:28             ` H. Peter Anvin
@ 2014-09-19 11:00             ` Henrique de Moraes Holschuh
  2014-09-19 11:29               ` Borislav Petkov
  2014-09-19 22:35               ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 11:00 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Thu, 18 Sep 2014, Andy Lutomirski wrote:
> On Sep 18, 2014 5:13 PM, "Henrique de Moraes Holschuh" <hmh@hmh.eng.br> wrote:
> > Here's a plan that might work, pending actually checking the libpthread TSX
> > code to make sure it keys on /proc/cpuinfo flags:
> 
> Surely it checks cpuid directly, though.

Unfortunately, you're correct.  It uses cpuid() directly.  So, my plan is
not going to work.
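
For illustration, a direct CPUID check looks roughly like the sketch below.
This is not glibc's code: it assumes the <cpuid.h> header shipped with GCC
and Clang on x86, and the helper names are made up.  HLE is reported in
CPUID.(EAX=7,ECX=0):EBX bit 4 and RTM in bit 11, so a kernel-side change to
the /proc/cpuinfo flags never reaches a check like this.

```c
#include <stddef.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* CPUID.(EAX=7,ECX=0):EBX feature bits for Intel TSX. */
#define TSX_HLE_BIT (1u << 4)
#define TSX_RTM_BIT (1u << 11)

/* Returns CPUID.(EAX=7,ECX=0):EBX, or 0 if leaf 7 is not available. */
static unsigned int read_leaf7_ebx(void)
{
#if defined(__x86_64__) || defined(__i386__)
	unsigned int eax, ebx, ecx, edx;

	/* Highest supported basic leaf; 0 if CPUID itself is unsupported. */
	if (__get_cpuid_max(0, NULL) < 7)
		return 0;

	__cpuid_count(7, 0, eax, ebx, ecx, edx);
	(void)eax; (void)ecx; (void)edx; /* only EBX is needed here */
	return ebx;
#else
	return 0; /* not an x86 build: no TSX to report */
#endif
}

/* Hypothetical helpers: 1 if the CPU currently advertises the feature. */
static int cpu_reports_hle(void)
{
	return (read_leaf7_ebx() & TSX_HLE_BIT) != 0;
}

static int cpu_reports_rtm(void)
{
	return (read_leaf7_ebx() & TSX_RTM_BIT) != 0;
}
```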

> Can we twiddle the cpuid bit?  I never noticed any way in the docs to

No, we can't, at least not by manipulating cpuid itself.

And if the old microcode had such an Intel TSX on/off switch (which would
also reset the bit), the "fix" that disables Intel TSX should have been a
BIOS update, not a microcode update... so I consider that very unlikely.

I'm filing a bug on Debian glibc, asking them to blacklist HLE until
further notice.

So that's two nice features that might go the way of the Dodo over this:
We're also killing microcode update support outside of the initramfs in
Debian.  It has become obvious that anything other than the early initramfs
method of microcode updates should be considered a developer thing.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 11:00             ` Henrique de Moraes Holschuh
@ 2014-09-19 11:29               ` Borislav Petkov
  2014-09-19 12:54                 ` Chuck Ebbert
  2014-09-19 13:51                 ` Henrique de Moraes Holschuh
  2014-09-19 22:35               ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 47+ messages in thread
From: Borislav Petkov @ 2014-09-19 11:29 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Andy Lutomirski, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, Sep 19, 2014 at 08:00:15AM -0300, Henrique de Moraes Holschuh wrote:
> We're also killing microcode update support outside of the initramfs in
> Debian.  It has become obvious that anything other than the early initramfs
> method of microcode updates should be considered a developer thing.

That's simply not true: long-running systems which you can't reboot for
whatever reason will need the late microcode update.

-- 
Regards/Gruss,
    Boris.
--


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 11:29               ` Borislav Petkov
@ 2014-09-19 12:54                 ` Chuck Ebbert
  2014-09-19 13:14                   ` Josh Boyer
  2014-09-19 15:00                   ` Borislav Petkov
  2014-09-19 13:51                 ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 47+ messages in thread
From: Chuck Ebbert @ 2014-09-19 12:54 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Henrique de Moraes Holschuh, Andy Lutomirski, H. Peter Anvin,
	linux-kernel

On Fri, 19 Sep 2014 13:29:53 +0200
Borislav Petkov <bp@alien8.de> wrote:

> On Fri, Sep 19, 2014 at 08:00:15AM -0300, Henrique de Moraes Holschuh wrote:
> > We're also killing microcode update support outside of the initramfs in
> > Debian.  It has become obvious that anything other than the early initramfs
> > method of microcode updates should be considered a developer thing.
> 
> That's simply not true: long-running systems which you can't reboot for
> whatever reason will need the late microcode update.
> 

Assuming we can identify all the affected models and steppings, maybe
something like this would work:

1) Refuse to finish booting if a microcode update that disables TSX
isn't applied before userspace starts running on those CPUs.

and/or

2) Don't allow a late update if TSX is still enabled on those
processors.

(1) could be overridden by a command line option for people who want to
develop TSX code.
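
The two rules above can be sketched as a pair of predicates.  This is only
an illustration in plain C: the function and parameter names are invented,
"affected" stands for a model/stepping match that still has to be worked
out, and "tsx_enabled" stands for whatever the CPU currently reports.

```c
/*
 * Rule (1): refuse to finish booting on an affected CPU that still has
 * TSX enabled when userspace is about to start, unless the (assumed)
 * command line override was given.
 * Returns 1 if boot may continue, 0 if it should be refused.
 */
static int boot_may_continue(int affected, int tsx_enabled, int override)
{
	if (!affected || !tsx_enabled)
		return 1;
	return override != 0;
}

/*
 * Rule (2): refuse a late microcode update on an affected CPU while
 * TSX is still enabled, since the update would remove it from under
 * running processes.
 * Returns 1 if the late update is allowed, 0 otherwise.
 */
static int late_update_allowed(int affected, int tsx_enabled)
{
	return !(affected && tsx_enabled);
}
```

With these, an update from "disabled" to "disabled" passes both checks, and
only the enabled-on-affected-hardware case is blocked.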


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 12:54                 ` Chuck Ebbert
@ 2014-09-19 13:14                   ` Josh Boyer
  2014-09-19 13:37                     ` Chuck Ebbert
  2014-09-19 15:00                   ` Borislav Petkov
  1 sibling, 1 reply; 47+ messages in thread
From: Josh Boyer @ 2014-09-19 13:14 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Borislav Petkov, Henrique de Moraes Holschuh, Andy Lutomirski,
	H. Peter Anvin, linux-kernel

On Fri, Sep 19, 2014 at 8:54 AM, Chuck Ebbert <cebbert.lkml@gmail.com> wrote:
> On Fri, 19 Sep 2014 13:29:53 +0200
> Borislav Petkov <bp@alien8.de> wrote:
>
>> On Fri, Sep 19, 2014 at 08:00:15AM -0300, Henrique de Moraes Holschuh wrote:
>> > We're also killing microcode update support outside of the initramfs in
>> > Debian.  It has become obvious that anything other than the early initramfs
>> > method of microcode updates should be considered a developer thing.
>>
>> That's simply not true: long-running systems which you can't reboot for
>> whatever reason will need the late microcode update.
>>
>
> Assuming we can identify all the affected models and steppings, maybe
> something like this would work:
>
> 1) Refuse to finish booting if a microcode update that disables TSX
> isn't applied before userspace starts running on those CPUs.

How would you accomplish that when applying a microcode update
requires userspace?  Or did you mean "before we transition out of the
initramfs"?

josh


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 13:14                   ` Josh Boyer
@ 2014-09-19 13:37                     ` Chuck Ebbert
  0 siblings, 0 replies; 47+ messages in thread
From: Chuck Ebbert @ 2014-09-19 13:37 UTC (permalink / raw)
  To: Josh Boyer
  Cc: Borislav Petkov, Henrique de Moraes Holschuh, Andy Lutomirski,
	H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014 09:14:50 -0400
Josh Boyer <jwboyer@fedoraproject.org> wrote:

> On Fri, Sep 19, 2014 at 8:54 AM, Chuck Ebbert <cebbert.lkml@gmail.com> wrote:
> > On Fri, 19 Sep 2014 13:29:53 +0200
> > Borislav Petkov <bp@alien8.de> wrote:
> >
> >> On Fri, Sep 19, 2014 at 08:00:15AM -0300, Henrique de Moraes Holschuh wrote:
> >> > We're also killing microcode update support outside of the initramfs in
> >> > Debian.  It has become obvious that anything other than the early initramfs
> >> > method of microcode updates should be considered a developer thing.
> >>
> >> That's simply not true: long-running systems which you can't reboot for
> >> whatever reason will need the late microcode update.
> >>
> >
> > Assuming we can identify all the affected models and steppings, maybe
> > something like this would work:
> >
> > 1) Refuse to finish booting if a microcode update that disables TSX
> > isn't applied before userspace starts running on those CPUs.
> 
> How would you accomplish that when applying a microcode update
> requires userspace?  Or did you mean "before we transition out of the
> initramfs"?
> 

I guess I meant requiring the update be done with the early microcode
method.


* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 11:29               ` Borislav Petkov
  2014-09-19 12:54                 ` Chuck Ebbert
@ 2014-09-19 13:51                 ` Henrique de Moraes Holschuh
  2014-09-19 14:49                   ` Borislav Petkov
  1 sibling, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 13:51 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Borislav Petkov wrote:
> On Fri, Sep 19, 2014 at 08:00:15AM -0300, Henrique de Moraes Holschuh wrote:
> > We're also killing microcode update support outside of the initramfs in
> > Debian.  It has become obvious that anything other than the early initramfs
> > method of microcode updates should be considered a developer thing.
> 
> That's simply not true: long-running systems which you can't reboot for
> whatever reason will need the late microcode update.

I have no plans to ask the Debian kernel team to disable the "microcode"
module.  The local system administrator is welcome to trigger the microcode
update manually: I will make sure to document how, although with a suitable
warning.

And I will also continue to pester you with patches to the microcode driver
:-)

But I will not trigger a microcode update when the intel-microcode package
gets updated/installed anymore.  The user will be warned of the need for
either a reboot or a manually triggered microcode update.

Anyway, those are my current plans.  I am delaying the microcode update in
non-free (Debian ancillary distro), so that I can reconsider the above as
the situation unfolds over the next week or two.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 13:51                 ` Henrique de Moraes Holschuh
@ 2014-09-19 14:49                   ` Borislav Petkov
  2014-09-19 17:22                     ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-19 14:49 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Andy Lutomirski, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, Sep 19, 2014 at 10:51:08AM -0300, Henrique de Moraes Holschuh wrote:
> And I will also continue to pester you with patches to the microcode driver
> :-)

That's fine - I'm currently busy and not looking at them but I haven't
forgotten them.

> But I will not trigger a microcode update when the intel-microcode package
> gets updated/installed anymore.  The user will be warned of the need for
> either a reboot or a manually triggered microcode update.

That's also wrong - you need to warn only on the affected machines.

-- 
Regards/Gruss,
    Boris.

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 12:54                 ` Chuck Ebbert
  2014-09-19 13:14                   ` Josh Boyer
@ 2014-09-19 15:00                   ` Borislav Petkov
  2014-09-19 16:13                     ` Andy Lutomirski
  2014-09-19 16:42                     ` Henrique de Moraes Holschuh
  1 sibling, 2 replies; 47+ messages in thread
From: Borislav Petkov @ 2014-09-19 15:00 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Henrique de Moraes Holschuh, Andy Lutomirski, H. Peter Anvin,
	linux-kernel

On Fri, Sep 19, 2014 at 07:54:14AM -0500, Chuck Ebbert wrote:
> Assuming we can identify all the affected models and steppings, maybe
> something like this would work:
> 
> 1) Refuse to finish booting if a microcode update that disables TSX
> isn't applied before userspace starts running on those CPUs.

Well, I think when we're booting, we would have already applied the
early microcode, no? Because then it is a non-issue.

> 2) Don't allow a late update if TSX is still enabled on those
> processors.

Yeah, so the use case I have in mind is when a long-running machine
wants to apply microcode and this microcode disables CPUID bits and
instructions. And the machine cannot be rebooted.

I guess in that case we would have to issue a warning only on the
affected processors that a reboot is mandatory and fail the update...
Maybe something like that.

> (1) could be overridden by a command line option for people who want
> to develop TSX code.

The way I understand it, those people shouldn't apply the microcode
patch at all.

-- 
Regards/Gruss,
    Boris.

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 19:14 ` Andy Lutomirski
  2014-09-18 19:53   ` Chuck Ebbert
@ 2014-09-19 16:11   ` Henrique de Moraes Holschuh
  1 sibling, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 16:11 UTC (permalink / raw)
  To: Andy Lutomirski; +Cc: linux-kernel, Borislav Petkov, H Peter Anvin

On Thu, 18 Sep 2014, Andy Lutomirski wrote:
> > [2] instantly segfaulting every running process using libpthread-2.19,
> >     as well as any other users of Intel TSX.
> >     https://bugs.launchpad.net/intel/+bug/1370352
> > 
> >     And yes, this means we will kill support for microcode updates
> >     outside of the initramfs/early-initramfs, at least in Debian,
> >     and likely in Ubuntu.
> 
> Given that there is exactly one microcode update like this (at least of
> the sort that blows up userspace), I think that we should seriously
> consider blacklisting just this particular microcode update once
> userspace is running.

This was just the first one.  It is likely that there will be others.

Anyway, IMHO kernel blacklists are of limited value for this kind of issue.
Sure, we can add them to the microcode driver (*not* the early microcode
driver) to protect the unwary after the fact, but the useful fast-reaction
blacklisting will be done by the distros in userspace.

In fact, a big thank-you to Canonical's QA process and to Felix Geyer, for
raising the early alarm (https://bugs.launchpad.net/intel/+bug/1370352)...

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 15:00                   ` Borislav Petkov
@ 2014-09-19 16:13                     ` Andy Lutomirski
  2014-09-19 16:54                       ` Henrique de Moraes Holschuh
  2014-09-19 16:42                     ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-19 16:13 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Henrique de Moraes Holschuh, H. Peter Anvin, linux-kernel

On Fri, Sep 19, 2014 at 8:00 AM, Borislav Petkov <bp@alien8.de> wrote:
> On Fri, Sep 19, 2014 at 07:54:14AM -0500, Chuck Ebbert wrote:
>> Assuming we can identify all the affected models and steppings, maybe
>> something like this would work:
>>
>> 1) Refuse to finish booting if a microcode update that disables TSX
>> isn't applied before userspace starts running on those CPUs.
>
> Well, I think when we're booting, we would have already applied the
> early microcode, no? Because then it is a non-issue.
>
>> 2) Don't allow a late update if TSX is still enabled on those
>> processors.
>
> Yeah, so the use case I have in mind is when a long-running machine
> wants to apply microcode and this microcode disables CPUID bits and
> instructions. And the machine cannot be rebooted.
>
> I guess in that case we would have to issue a warning only on the
> affected processors that a reboot is mandatory and fail the update...
> Maybe something like that.
>
>> (1) could be overridden by a command line option for people who want
>> to develop TSX code.
>
> The way I understand it, those people shouldn't apply the microcode
> patch at all.
>

One way or another, anyone who has a kernel without some kind of
workaround, an old BIOS, and a new ucode file in /lib/firmware is
going to have problems unless they're set up for early ucode updates.

Can we change the ucode blob format for these firmwares so that old
kernels won't apply them?  I have no other good ideas.  The trouble is
that distros *should* push out the new ucode, but only if there's some
guarantee that they'll only be applied early, never late.

--Andy

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 15:00                   ` Borislav Petkov
  2014-09-19 16:13                     ` Andy Lutomirski
@ 2014-09-19 16:42                     ` Henrique de Moraes Holschuh
  2014-09-23 20:00                       ` Borislav Petkov
  1 sibling, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 16:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Borislav Petkov wrote:
> On Fri, Sep 19, 2014 at 07:54:14AM -0500, Chuck Ebbert wrote:
> > 2) Don't allow a late update if TSX is still enabled on those
> > processors.
> 
> Yeah, so the use case I have in mind is when a long-running machine
> wants to apply microcode and this microcode disables CPUID bits and
> instructions. And the machine cannot be rebooted.
> 
> I guess in that case we would have to issue a warning only on the
> affected processors that a reboot is mandatory and fail the update...
> Maybe something like that.

Well, in this case we'd have to (on Intel, but AMD is likely the same):

1. offline a "guinea pig" group of "cpus", i.e. an entire "microcode update
unit" that doesn't include the BSP.  This is going to be a pain, as what
composes a "microcode update unit" is not set in stone, and could change in
a future microarch.

2. apply the update to one of the "guinea pig" "cpus" (which will update all
"cpus" in the same "microcode update unit").

3. sanity check.

4a. abort the update run if something nasty happened, leaving the "guinea
pig" "cpus" locked offline until the next reboot.  Warn the user.

4b. online the "guinea pig" "cpus" if the update looks good, and proceed to
update the rest of the "cpus" in the system.

We need this dance because we cannot roll-back a microcode update in the
general case.

To me, it looks way too complicated to be worth the effort.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 16:13                     ` Andy Lutomirski
@ 2014-09-19 16:54                       ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 16:54 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Andy Lutomirski wrote:
> Can we change the ucode blob format for these firmwares so that old
> kernels won't apply them?  I have no other good ideas.  The trouble is
> that distros *should* push out the new ucode, but only if there's some
> guarantee that they'll only be applied early, never late.

All it takes is to rm the /lib/firmware/intel-ucode/<haswell f-m-s> files,
while not touching the microcode you will add to the early initramfs.

Anyway, if you want to fix this through the stable kernel updates:

The intel microcode driver does a "sanity check" pass over all cpus before
the update, attempting to locate the appropriate microcode.  It is trivial
to add the following logic to it:

if cpu_has(hle) && cpuid_in_table(haswell_cpuids) { ignore the microcode }.

You don't even have to key on the current microcode revision, as we don't
know for sure it will be safe to update these processors again when the BIOS
force-enabled HLE anyway.

We can enhance the blacklist to only key on the old microcode that always
enables Intel TSX later, if this proves to be safe.
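
To make the check above concrete, here is a minimal sketch in plain userspace
C.  It is not the driver's code: sig_in_table() and skip_late_update() are
hypothetical names, the cpu_has_hle argument stands in for the kernel's HLE
feature test, and the table simply reuses the signatures listed in footnote
[1] of the first message, so it is illustrative rather than a vetted list of
affected parts.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * CPUID signatures copied from footnote [1] of the original report.
 * Illustrative only: not a verified list of affected processors.
 */
static const uint32_t haswell_sigs[] = {
	0x000306f2, 0x000306c3, 0x00040651, 0x00040661,
};

/* Hypothetical helper: true if sig appears in the table above. */
static bool sig_in_table(uint32_t sig)
{
	size_t n = sizeof(haswell_sigs) / sizeof(haswell_sigs[0]);

	for (size_t i = 0; i < n; i++)
		if (haswell_sigs[i] == sig)
			return true;
	return false;
}

/*
 * Hypothetical policy check: a late update is ignored when the CPU
 * currently advertises HLE and its signature is in the table.  The
 * current microcode revision is deliberately not consulted.
 */
static bool skip_late_update(uint32_t sig, bool cpu_has_hle)
{
	return cpu_has_hle && sig_in_table(sig);
}
```

A listed processor that no longer advertises HLE (because the early update
already ran) passes the check and can still take later updates.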

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 14:49                   ` Borislav Petkov
@ 2014-09-19 17:22                     ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 17:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Andy Lutomirski, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Borislav Petkov wrote:
> On Fri, Sep 19, 2014 at 10:51:08AM -0300, Henrique de Moraes Holschuh wrote:
> > But I will not trigger a microcode update when the intel-microcode package
> > gets updated/installed anymore.  The user will be warned of the need for
> > either a reboot or a manually triggered microcode update.
> 
> That's also wrong - you need to warn only on the affected machines.

I could enhance iucode_tool so that it can tell you when there are pending
updates.  But it is a layering violation I don't like much, and it is a lot
less future-proof than the stuff iucode-tool already does.  I will think
about it... but I'm inclined to just tell the user that "microcode updates,
if any, will be applied the next time this system reboots".

As for immediately updating the microcode on all processors other than Intel
Haswell: this time, Felix Geyer found the issue early enough to save both
Ubuntu and Debian users from harm, and I brought it to attention here as
soon as I had a good idea of what was happening.

But this was a close call as far as I'm concerned.  Next time, we might not
be so lucky.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 11:00             ` Henrique de Moraes Holschuh
  2014-09-19 11:29               ` Borislav Petkov
@ 2014-09-19 22:35               ` Henrique de Moraes Holschuh
  2014-09-29 11:51                 ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-19 22:35 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Henrique de Moraes Holschuh wrote:
> I'm filing a bug on Debian glibc, asking them to blacklist HLE until
> further notice.

FWIW, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762195

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-18 13:52 x86, microcode: BUG: microcode update that changes x86_capability Henrique de Moraes Holschuh
  2014-09-18 19:14 ` Andy Lutomirski
@ 2014-09-22  0:37 ` Andi Kleen
  2014-09-22  0:51   ` H. Peter Anvin
  1 sibling, 1 reply; 47+ messages in thread
From: Andi Kleen @ 2014-09-22  0:37 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh; +Cc: linux-kernel, Borislav Petkov, H Peter Anvin

Henrique de Moraes Holschuh <hmh@hmh.eng.br> writes:


>     And yes, this means we will kill support for microcode updates
>     outside of the initramfs/early-initramfs, at least in Debian,
>     and likely in Ubuntu.

You got it totally backwards. initramfs updating should handle this
microcode update just fine, as it happens before the kernel 
scans the cpuids. Just don't update it later.

Nothing else is needed.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-22  0:37 ` Andi Kleen
@ 2014-09-22  0:51   ` H. Peter Anvin
  2014-09-22  0:58     ` Andi Kleen
  0 siblings, 1 reply; 47+ messages in thread
From: H. Peter Anvin @ 2014-09-22  0:51 UTC (permalink / raw)
  To: Andi Kleen, Henrique de Moraes Holschuh; +Cc: linux-kernel, Borislav Petkov

He said *outside* of the early update mechanism.

On September 21, 2014 5:37:24 PM PDT, Andi Kleen <andi@firstfloor.org> wrote:
>Henrique de Moraes Holschuh <hmh@hmh.eng.br> writes:
>
>
>>     And yes, this means we will kill support for microcode updates
>>     outside of the initramfs/early-initramfs, at least in Debian,
>>     and likely in Ubuntu.
>
>You got it totally backwards. initramfs updating should handle this
>microcode update just fine, as it happens before the kernel 
>scans the cpuids. Just don't update it later.
>
>Nothing else is needed.
>
>-Andi

-- 
Sent from my mobile phone.  Please pardon brevity and lack of formatting.

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-22  0:51   ` H. Peter Anvin
@ 2014-09-22  0:58     ` Andi Kleen
  0 siblings, 0 replies; 47+ messages in thread
From: Andi Kleen @ 2014-09-22  0:58 UTC (permalink / raw)
  To: H. Peter Anvin
  Cc: Andi Kleen, Henrique de Moraes Holschuh, linux-kernel, Borislav Petkov

On Sun, Sep 21, 2014 at 05:51:12PM -0700, H. Peter Anvin wrote:
> He said *outside* of the early update mechanism.

True. Sorry yes I misread it.

Yes, that's the way to go.

-Andi

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 16:42                     ` Henrique de Moraes Holschuh
@ 2014-09-23 20:00                       ` Borislav Petkov
  2014-09-24 14:56                         ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-23 20:00 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Fri, Sep 19, 2014 at 01:42:17PM -0300, Henrique de Moraes Holschuh wrote:
> 1. offline a "guinea pig" group of "cpus", i.e. an entire "microcode update
> unit" that doesn't include the BSP.  This is going to be a pain, as what
> composes a "microcode update unit" is not set in stone, and could change in
> a future microarch.

I'm pretty sure it is very dangerous to run with different microcode
revisions on different cores. Your plan won't fly and I have a hard time
understanding why one would do such a thing even if it did work.

If we're going to have to hide stuff which software might be using, I
don't see a way around rebooting.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-23 20:00                       ` Borislav Petkov
@ 2014-09-24 14:56                         ` Henrique de Moraes Holschuh
  2014-09-24 15:00                           ` Andy Lutomirski
  2014-09-25  8:51                           ` Borislav Petkov
  0 siblings, 2 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-24 14:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Tue, 23 Sep 2014, Borislav Petkov wrote:
> On Fri, Sep 19, 2014 at 01:42:17PM -0300, Henrique de Moraes Holschuh wrote:
> > 1. offline a "guinea pig" group of "cpus", i.e. an entire "microcode update
> > unit" that doesn't include the BSP.  This is going to be a pain, as what
> > composes a "microcode update unit" is not set in stone, and could change in
> > a future microarch.
> 
> I'm pretty sure it is very dangerous to run with different microcode
> revisions on different cores. Your plan won't fly and I have a hard time
> understanding why one would do such a thing even if it did work.

I don't want that plan to fly, it is too complex and I wrote as much at
the end of that email.  I won't bother with the situations where it would
be helpful, they're not very interesting.


On the topic of microcode revision skew in a multi-processor system: 

For a long time we had an Extremely Bad userspace interface that required
userspace to trigger the microcode update once per cpu, and it fetched the
microcode from userspace once per cpu.

This made for an absurdly large time window during which we'd have
microcode revision skew across cpus, and yet nothing blew up sky-high.  If
microcode revision skew was not generally safe, we'd have had a lot of
trouble already.

In fact, we still run the system with microcode revision skew while the
microcode update is taking place through the regular microcode driver, as
it is serialized one cpu at a time, and the other cpus are active and
running.

I don't know about AMD, but on Intel, the time it takes to update the
microcode on a core is anything but negligible[1], so the microcode
version skew window still exists, and it is not small.  It is much smaller
than it once was, but it is still there.

The only way to really minimize the risk of microcode version skew is to
limit oneself to firmware and early initramfs microcode updates.

> If we're going to have to hide stuff which software might be using, I
> don't see a way around rebooting.

Nor do I.

But IMHO we still need to detect and do something smart when
x86_capability changes due to a microcode update.

And I'd really prefer it to be "update x86_capability, warn the user and
carry on" for anything that is not going to crash the kernel.  Several
distros will really want this backported to -stable, as the older kernels
cannot do early microcode updates.
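
Roughly what I have in mind, as a userspace C sketch and not a patch: compare
the cached capability words against a fresh CPUID read taken after the update.
If a bit the kernel itself depends on went away, report it and leave the cache
alone; otherwise refresh the cache and carry on.  The names (refresh_caps, the
kernel_uses mask, the verdict enum) are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum cap_verdict {
	CAP_UNCHANGED,      /* nothing changed */
	CAP_REFRESHED,      /* cache updated; caller should warn the user */
	CAP_CRITICAL_LOST,  /* a bit the kernel relies on disappeared */
};

/*
 * cached:      capability words recorded at boot
 * fresh:       capability words re-read after the microcode update
 * kernel_uses: hypothetical mask of bits the kernel itself depends on
 */
static enum cap_verdict refresh_caps(uint32_t *cached, const uint32_t *fresh,
				     const uint32_t *kernel_uses,
				     size_t nwords)
{
	enum cap_verdict v = CAP_UNCHANGED;

	for (size_t i = 0; i < nwords; i++) {
		uint32_t removed = cached[i] & ~fresh[i];

		/* Bail out before touching the cache. */
		if (removed & kernel_uses[i])
			return CAP_CRITICAL_LOST;
		if (cached[i] != fresh[i])
			v = CAP_REFRESHED;
	}

	memcpy(cached, fresh, nwords * sizeof(*cached));
	return v;
}
```

For the Haswell case, HLE and RTM are bits 4 and 11 of CPUID.(EAX=7,ECX=0):EBX,
so the update would show up as those two bits being removed from that word.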


[1] Intel processors take from 200 thousand cycles to several million
    cycles per core to successfully apply a microcode update.  Verified
    using get_cycles() right before and right after the WRMSR 0x79.
    Variance was really high, about 10%.  My limited testing matched what
    has been previously reported by Ben Hawkes.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 14:56                         ` Henrique de Moraes Holschuh
@ 2014-09-24 15:00                           ` Andy Lutomirski
  2014-09-24 17:45                             ` Henrique de Moraes Holschuh
  2014-09-25  8:51                           ` Borislav Petkov
  1 sibling, 1 reply; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-24 15:00 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, Sep 24, 2014 at 7:56 AM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Tue, 23 Sep 2014, Borislav Petkov wrote:
>> On Fri, Sep 19, 2014 at 01:42:17PM -0300, Henrique de Moraes Holschuh wrote:
>> > 1. offline a "guinea pig" group of "cpus", i.e. an entire "microcode update
>> > unit" that doesn't include the BSP.  This is going to be a pain, as what
>> > composes a "microcode update unit" is not set in stone, and could change in
>> > a future microarch.
>>
>> I'm pretty sure it is very dangerous to run with different microcode
>> revisions on different cores. Your plan won't fly and I have a hard time
>> understanding why one would do such a thing even if it did work.
>
> I don't want that plan to fly, it is too complex and I wrote as much at
> the end of that email.  I won't bother with the situations where it would
> be helpful, they're not very interesting.
>
>
> On the topic of microcode revision skew in a multi-processor system:
>
> For a long time we had an Extremely Bad userspace interface that required
> userspace to trigger the microcode update once per cpu, and it fetched the
> microcode from userspace once per cpu.
>
> This made for an absurdly large time window during which we'd have
> microcode revision skew across cpus, and yet nothing blew up sky-high.  If
> microcode revision skew was not generally safe, we'd have had a lot of
> trouble already.
>
> In fact, we still run the system with microcode revision skew while the
> microcode update is taking place through the regular microcode driver, as
> it is serialized one cpu at a time, and the other cpus are active and
> running.
>
> I don't know about AMD, but on Intel, the time it takes to update the
> microcode on a core is anything but negligible[1], so the microcode
> version skew window still exists, and it is not small.  It is much smaller
> than it once was, but it is still there.
>
> The only way to really minimize the risk of microcode version skew is to
> limit oneself to firmware and early initramfs microcode updates.
>
>> If we're going to have to hide stuff which software might be using, I
>> don't see a way around rebooting.
>
> Nor do I.
>
> But IMHO we still need to detect and do something smart when
> x86_capability changes due to a microcode update.
>
> And I'd really prefer it to be "update x86_capability, warn the user and
> carry on" for anything that is not going to crash the kernel.  Several
> distros will really want this backported to -stable, as the older kernels
> cannot do early microcode updates.
>

I'm trying to see if Intel is willing to document any additional
controls for the TSX bits in this ucode.  No word yet, but I might
hear something soon.

--Andy

>
> [1] Intel processors take from 200 thousand cycles to several million
>     cycles per core to successfully apply a microcode update.  Verified
>     using get_cycles() right before and right after the WRMSR 0x79.
>     Variance was really high, about 10%.  My limited testing matched what
>     has been previously reported by Ben Hawkes.
>
> --
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh



-- 
Andy Lutomirski
AMA Capital Management, LLC

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 15:00                           ` Andy Lutomirski
@ 2014-09-24 17:45                             ` Henrique de Moraes Holschuh
  2014-09-24 17:48                               ` Andy Lutomirski
  2014-09-25  8:57                               ` Borislav Petkov
  0 siblings, 2 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-24 17:45 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, 24 Sep 2014, Andy Lutomirski wrote:
> On Wed, Sep 24, 2014 at 7:56 AM, Henrique de Moraes Holschuh
> <hmh@hmh.eng.br> wrote:
> > And I'd really prefer it to be "update x86_capability, warn the user and
> > carry on" for anything that is not going to crash the kernel.  Several
> > distros will really want this backported to -stable, as the older kernels
> > cannot do early microcode updates.
> >
> 
> I'm trying to see if Intel is willing to document any additional
> controls for the TSX bits in this ucode.  No word yet, but I might
> hear something soon.

If they do document it, please make sure to ask what will happen in the
following situation:

   Assume there is a newer release of Intel microcode for these
   processors, i.e. newer than the microcodes in the 2014-09-13 release.
   IOW assume there are at least two public microcode updates in which the
   Intel TSX feature has been disabled by default, but can be enabled by
   the BIOS/UEFI.

   1. BIOS/UEFI has recent microcode (which has the Intel TSX on/off
      switch), but it is not the latest microcode, and installed this
      update on the processor.

   2. BIOS/UEFI has *enabled* Intel TSX on user request.

   3. Microcode is updated to the latest microcode by the operating
      system, newer than the one in BIOS/UEFI.

   After step 3, will Intel TSX be enabled, or disabled ?

Or, to be more explicit: will future microcode updates preserve Intel TSX
enabled/disabled state, or will they always reset it to disabled?

This is really important, for obvious reasons.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 17:45                             ` Henrique de Moraes Holschuh
@ 2014-09-24 17:48                               ` Andy Lutomirski
  2014-09-24 18:59                                 ` Henrique de Moraes Holschuh
  2014-09-25  8:57                               ` Borislav Petkov
  1 sibling, 1 reply; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-24 17:48 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, Sep 24, 2014 at 10:45 AM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Wed, 24 Sep 2014, Andy Lutomirski wrote:
>> On Wed, Sep 24, 2014 at 7:56 AM, Henrique de Moraes Holschuh
>> <hmh@hmh.eng.br> wrote:
>> > And I'd really prefer it to be "update x86_capability, warn the user and
>> > carry on" for anything that is not going to crash the kernel.  Several
>> > distros will really want this backported to -stable, as the older kernels
>> > cannot do early microcode updates.
>> >
>>
>> I'm trying to see if Intel is willing to document any additional
>> controls for the TSX bits in this ucode.  No word yet, but I might
>> hear something soon.
>
> If they do document it, please make sure to ask what will happen in the
> following situation:
>
>    Assume there is a newer release of Intel microcode for these
>    processors, i.e. newer than the microcodes in the 2014-09-13 release.
>    IOW assume there are at least two public microcode updates in which the
>    Intel TSX feature has been disabled by default, but can be enabled by
>    the BIOS/UEFI.
>
>    1. BIOS/UEFI has recent microcode (which has the Intel TSX on/off
>       switch), but it is not the latest microcode, and installed this
>       update on the processor.
>
>    2. BIOS/UEFI has *enabled* Intel TSX on user request.
>
>    3. Microcode is updated to the latest microcode by the operating
>       system, newer than the one in BIOS/UEFI.
>
>    After step 3, will Intel TSX be enabled, or disabled ?
>
> Or, to be more explicit: will future microcode updates preserve Intel TSX
> enabled/disabled state, or will they always reset it to disabled?
>
> This is really important, for obvious reasons.
>

Indeed.

We can sort of fudge it if whatever control BIOS uses is available to
us, too, and we can reprogram it to "enabled" after a microcode update
disables TSX.

If I had one of the affected chips, I'd try scanning the MSR space to
see if a new MSR appears after applying the update.  But I don't, so I
can't do this.
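
For anyone who does have the hardware, the scan could look something like the
sketch below.  It assumes the msr driver is loaded and the caller (as root)
has opened /dev/cpu/0/msr, where a pread() of 8 bytes at offset N reads MSR N
and fails for MSRs that do not exist.  The helper names are made up, and the
range to scan is left to the caller.  Take one snapshot before the update and
one after, then diff them.

```c
#define _XOPEN_SOURCE 700
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* True if the MSR can be read through an open /dev/cpu/N/msr fd. */
static bool msr_readable(int fd, uint32_t msr)
{
	uint64_t val;

	/* Assumes a 64-bit off_t so that high MSR indices fit. */
	return pread(fd, &val, sizeof(val), (off_t)msr) == (ssize_t)sizeof(val);
}

/* Record which MSRs in [first, first + count) are readable. */
static void msr_snapshot(int fd, uint32_t first, size_t count, bool *present)
{
	for (size_t i = 0; i < count; i++)
		present[i] = msr_readable(fd, first + (uint32_t)i);
}

/*
 * Collect the offsets (relative to the start of the scanned range) of
 * MSRs that were absent in "before" and are present in "after".
 * Returns how many were found; out must have room for count entries.
 */
static size_t msr_newly_present(const bool *before, const bool *after,
				size_t count, size_t *out)
{
	size_t n = 0;

	for (size_t i = 0; i < count; i++)
		if (!before[i] && after[i])
			out[n++] = i;
	return n;
}
```

A new MSR showing up this way only says that something appeared.  It does not
say what the register controls, so that would still need documentation from
Intel.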

--Andy

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 17:48                               ` Andy Lutomirski
@ 2014-09-24 18:59                                 ` Henrique de Moraes Holschuh
  2014-09-24 19:34                                   ` Andy Lutomirski
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-24 18:59 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, 24 Sep 2014, Andy Lutomirski wrote:
> We can sort of fudge it if whatever control BIOS uses is available to
> us, too, and we can reprogram it to "enabled" after a microcode update
> disables TSX.

Only for the early initramfs microcode update driver, and that's going to be
useful only as a way to honor the "keep Intel TSX enabled even if it is
badly broken" switch that was added by Intel for developer usage.

For the runtime microcode update (regular microcode driver), an
"enabled->disabled->enabled" transition would still disrupt the system:
triggering a microcode update in a cpu can update other cpus, which might be
running Intel TSX instructions.  Boom! processes running on these other cpus
can crash with SIGILL, and we have data loss.

The microcode update has to preserve the entire [visible] processor state,
otherwise we cannot safely apply it "late".  Intel TSX included.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 18:59                                 ` Henrique de Moraes Holschuh
@ 2014-09-24 19:34                                   ` Andy Lutomirski
  0 siblings, 0 replies; 47+ messages in thread
From: Andy Lutomirski @ 2014-09-24 19:34 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, Sep 24, 2014 at 11:59 AM, Henrique de Moraes Holschuh
<hmh@hmh.eng.br> wrote:
> On Wed, 24 Sep 2014, Andy Lutomirski wrote:
>> We can sort of fudge it if whatever control BIOS uses is available to
>> us, too, and we can reprogram it to "enabled" after a microcode update
>> disables TSX.
>
> Only for the early initramfs microcode update driver, and that's going to be
> useful only as a way to honor the "keep Intel TSX enabled even if it is
> badly broken" switch that was added by Intel for developer usage.
>
> For the runtime microcode update (regular microcode driver), an
> "enabled->disabled->enabled" transition would still disrupt the system:
> triggering a microcode update in a cpu can update other cpus, which might be
> running Intel TSX instructions.  Boom! processes running on these other cpus
> can crash with SIGILL, and we have data loss.
>
> The microcode update has to preserve the entire [visible] processor state,
> otherwise we cannot safely apply it "late".  Intel TSX included.

Ugh, right.

If we knew the set of CPUs that would be affected by a given update,
we could freeze those CPUs first, though.  But yes, this sucks.

--Andy

>
> --
>   "One disk to rule them all, One disk to find them. One disk to bring
>   them all and in the darkness grind them. In the Land of Redmond
>   where the shadows lie." -- The Silicon Valley Tarot
>   Henrique Holschuh



-- 
Andy Lutomirski
AMA Capital Management, LLC

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 14:56                         ` Henrique de Moraes Holschuh
  2014-09-24 15:00                           ` Andy Lutomirski
@ 2014-09-25  8:51                           ` Borislav Petkov
  2014-09-25 11:36                             ` Henrique de Moraes Holschuh
  1 sibling, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25  8:51 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Wed, Sep 24, 2014 at 11:56:58AM -0300, Henrique de Moraes Holschuh wrote:
> I don't know about AMD, but on Intel, the time it takes to update the
> microcode on a core is anything but negligible[1], so the microcode
> version skew window still exists, and it is not small.  It is much smaller
> than it once was, but it is still there.

I think that window is unsafe but yeah, we probably should take your
empirical observation as good enough for now.

> But IMHO we still need to detect and do something smart when
> x86_capability changes due to a microcode update.
> 
> And I'd really prefer it to be "update x86_capability, warn the user and
> carry on" for anything that is not going to crash the kernel.

The problem is with hiding CPUID bits and userspace using HLE after
having detected it previously. I think we'll be on the safe side if we
reboot, hence the suggestion to tell the user that rebooting should be
done ASAP.
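
As a rough illustration of how userspace could re-check the bits against
the hardware instead of a cached view, here is a sketch that reads CPUID
leaf 7 through the cpuid driver.  It assumes /dev/cpu/N/cpuid is available
(modprobe cpuid) and readable; decode_tsx and read_leaf7_ebx are
hypothetical helper names.  HLE and RTM are reported in
CPUID.(EAX=7,ECX=0):EBX bits 4 and 11.

```python
import os
import struct

HLE_BIT = 1 << 4   # CPUID.(EAX=7,ECX=0):EBX bit 4
RTM_BIT = 1 << 11  # CPUID.(EAX=7,ECX=0):EBX bit 11


def decode_tsx(ebx):
    """Decode the HLE and RTM feature bits from leaf 7 EBX."""
    return {"hle": bool(ebx & HLE_BIT), "rtm": bool(ebx & RTM_BIT)}


def read_leaf7_ebx(cpu=0):
    """Read EBX of CPUID leaf 7, subleaf 0, via the cpuid device."""
    fd = os.open(f"/dev/cpu/{cpu}/cpuid", os.O_RDONLY)
    try:
        # File offset: low 32 bits select EAX (leaf 7),
        # high 32 bits select ECX (subleaf 0).
        data = os.pread(fd, 16, 7)
    finally:
        os.close(fd)
    _eax, ebx, _ecx, _edx = struct.unpack("<4I", data)
    return ebx
```

A process that cached the result at startup would still have to call
something like decode_tsx(read_leaf7_ebx()) again after the update to
notice the change, which is exactly what existing binaries do not do.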

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-24 17:45                             ` Henrique de Moraes Holschuh
  2014-09-24 17:48                               ` Andy Lutomirski
@ 2014-09-25  8:57                               ` Borislav Petkov
  1 sibling, 0 replies; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25  8:57 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Andy Lutomirski, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Wed, Sep 24, 2014 at 02:45:57PM -0300, Henrique de Moraes Holschuh wrote:
> On Wed, 24 Sep 2014, Andy Lutomirski wrote:
> > On Wed, Sep 24, 2014 at 7:56 AM, Henrique de Moraes Holschuh
> > <hmh@hmh.eng.br> wrote:
> > > And I'd really prefer it to be "update x86_capability, warn the user and
> > > carry on" for anything that is not going to crash the kernel.  Several
> > > distros will really want this backported to -stable, as the older kernels
> > > cannot do early microcode updates.
> > >
> > 
> > I'm trying to see if Intel is willing to document any additional
> > controls for the TSX bits in this ucode.  No word yet, but I might
> > hear something soon.
> 
> If they do document it, please make sure to ask what will happen in the
> following situation:
> 
>    Assume there is a newer release of Intel microcode for these
>    processors, i.e. newer than the microcodes in the 2014-09-13 release.
>    IOW assume there are at least two public microcode updates in which the
>    Intel TSX feature has been disabled by default, but can be enabled by
>    the BIOS/UEFI.
> 
>    1. BIOS/UEFI has recent microcode (which has the Intel TSX on/off
>       switch), but it is not the latest microcode, and installed this
>       update on the processor.
> 
>    2. BIOS/UEFI has *enabled* Intel TSX on user request.
> 
>    3. Microcode is updated to the latest microcode by the operating
>       system, newer than the one in BIOS/UEFI.
> 
>    After step 3, will Intel TSX be enabled, or disabled ?
> 
> Or, to be more explicit: will future microcode updates preserve Intel TSX
> enabled/disabled state, or will they always reset it to disabled?

Well, you boot with the microcode in the BIOS so you will be able to
enable/disable TSX initially. When you apply the microcode patch to
disable TSX, this will remain the case until next reboot, where you
start with the same BIOS, which has the older microcode version, and you will
need to apply the microcode again.

Unless you update your BIOS which will also hide the TSX enable/disable
switch too, presumably.

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25  8:51                           ` Borislav Petkov
@ 2014-09-25 11:36                             ` Henrique de Moraes Holschuh
  2014-09-25 12:10                               ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-25 11:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, 25 Sep 2014, Borislav Petkov wrote:
> > But IMHO we still need to detect and do something smart when
> > x86_capability changes due to a microcode update.
> > 
> > And I'd really prefer it to be "update x86_capability, warn the user and
> > carry on" for anything that is not going to crash the kernel.
> 
> The problem is with hiding CPUID bits and userspace using HLE after
> having detected it previously. I think we'll be on the safe side if we

It is safe to apply this particular batch of problematic microcode updates
inside the regular initramfs, as long as you do it as one of the very first
tasks.

This isn't a useless fix, it will allow systems without early initramfs
support to operate correctly after a microcode update.  And kernels 3.0, 3.2
and 3.4 _cannot_ apply early initramfs microcode updates at all, so they
need it.

Besides, we need to detect and scream bloody murder when microcode updates
do something like this anyway, now that Pandora's box has been opened.  If
we're going to detect it, might as well fix it when it is not something the
kernel uses.
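
The policy described above can be modelled in a few lines.  This is only
a sketch of the decision, not kernel code: feature sets are plain sets of
flag names, classify_capability_change is a hypothetical name, and the
action labels are invented for the example.

```python
def classify_capability_change(before, after, kernel_used):
    """Compare feature sets from before and after a microcode update.

    before, after: sets of feature names reported by the CPU.
    kernel_used:   features the kernel itself depends on.
    """
    lost = before - after
    gained = after - before
    critical = lost & kernel_used

    if critical:
        action = "fatal"             # something the kernel uses went away
    elif lost or gained:
        action = "refresh-and-warn"  # update the cached bits, warn the user
    else:
        action = "none"

    return {
        "lost": sorted(lost),
        "gained": sorted(gained),
        "critical": sorted(critical),
        "action": action,
    }
```

With before = {"fpu", "avx2", "hle", "rtm"}, after = {"fpu", "avx2"} and
a kernel that uses only fpu and avx2, the result is "refresh-and-warn"
with hle and rtm listed as lost.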

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 11:36                             ` Henrique de Moraes Holschuh
@ 2014-09-25 12:10                               ` Borislav Petkov
  2014-09-25 14:40                                 ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25 12:10 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, Sep 25, 2014 at 08:36:45AM -0300, Henrique de Moraes Holschuh wrote:
> This isn't a useless fix, it will allow systems without early initramfs
> support to operate correctly after a microcode update.

So what do we do if we update the microcode late and some userspace task
is using HLE and all of a sudden it segfaults and gets killed due to
#UD. I'll forward all those complaint emails to you then, no?

:-)

What I'm saying is, a reboot in this case is maybe the lesser of two evils.

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 12:10                               ` Borislav Petkov
@ 2014-09-25 14:40                                 ` Henrique de Moraes Holschuh
  2014-09-25 14:56                                   ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-25 14:40 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, 25 Sep 2014, Borislav Petkov wrote:
> On Thu, Sep 25, 2014 at 08:36:45AM -0300, Henrique de Moraes Holschuh wrote:
> > This isn't a useless fix, it will allow systems without early initramfs
> > support to operate correctly after a microcode update.
> 
> So what do we do if we update the microcode late and some userspace task
> is using HLE and all of a sudden it segfaults and gets killed due to
> #UD. I'll forward all those complaint emails to you then, no?
> 
> :-)
> 
> What I'm saying is, a reboot in this case is maybe the lesser of two evils.

In that case we should blacklist the update and refuse to apply it, and reboot
only if the blacklist wasn't good enough and we detect that something really
important in the cpu feature cpuid bits changed.

However, a reboot is even worse than everything linked to libpthread
segfaulting, as it will also cause data loss for the stuff that didn't get
SIGILL'd to death.  Meh.

Backporting early initramfs support to 3.0/3.2/3.4 doesn't seem doable, or
wise.

At this point, what alternatives are left?

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 14:40                                 ` Henrique de Moraes Holschuh
@ 2014-09-25 14:56                                   ` Borislav Petkov
  2014-09-25 15:30                                     ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25 14:56 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, Sep 25, 2014 at 11:40:25AM -0300, Henrique de Moraes Holschuh wrote:
> At this point, what alternatives are left?

Here's what we could do:

* Install microcode to /lib/firmware/...

* Refuse to update the microcode and tell the user that she needs to reboot.

* Reboot and load the microcode

For that to work though, we'd need to detect that we're freshly
booting and only then load the microcode (if we're coming in later,
we should refuse because something linking to libpthread might've run
already).

Now, we need to think about how to detect that reliably, if at all
possible.

The other thing we could do is backport early ucode loading...

Hmm, I'm not crazy about either possibility though, TBH.

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 14:56                                   ` Borislav Petkov
@ 2014-09-25 15:30                                     ` Henrique de Moraes Holschuh
  2014-09-25 15:50                                       ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-25 15:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, 25 Sep 2014, Borislav Petkov wrote:
> On Thu, Sep 25, 2014 at 11:40:25AM -0300, Henrique de Moraes Holschuh wrote:
> > At this point, what alternatives are left?
> 
> Here's what we could do:
> 
> * Install microcode to /lib/firmware/...
> 
> * Refuse to update the microcode and tell the user that she needs to reboot.
> 
> * Reboot and load the microcode
> 
> For that to work though, we'd need to detect that we're freshly
> booting and only then load the microcode (if we're coming in later,
> we should refuse because something linking to libpthread might've run
> already).

Userspace can install the microcode only inside the initramfs, if it wants
to avoid it being loaded later.  It is not even too difficult to do so.

But the kernel currently doesn't have a way to know that happened.

We could just add the blacklist _and_ support for x86_capability changes
that don't touch something the kernel uses, and bypass the blacklist the
first time userspace writes 2 to the sysfs update trigger.

So, it would be a one-use trapdoor that the initramfs can use to update the
Haswells at boot.  If someone really wants to use it again, he can rmmod +
modprobe (no way to do it if the microcode driver is built-in, though).
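
To make the trapdoor semantics concrete, here is a small model of it.
This is a sketch under assumptions, not a proposed patch: UpdateGate is a
hypothetical name, the blacklist is keyed on the processor signature only
(a real one would probably also consider the microcode revision), and the
very first write of 2 consumes the bypass whether or not it was needed.

```python
class UpdateGate:
    """Model of a blacklist with a one-shot bypass for late updates."""

    def __init__(self, blacklist):
        self.blacklist = set(blacklist)
        self.bypass_used = False

    def allow(self, signature, trigger_value):
        """Decide whether an update request may proceed.

        trigger_value models what userspace wrote to the sysfs update
        trigger; 2 requests the one-time blacklist bypass.
        """
        bypass = False
        if trigger_value == 2 and not self.bypass_used:
            self.bypass_used = True  # the trapdoor closes after one use
            bypass = True
        return bypass or signature not in self.blacklist
```

Reloading the module would correspond to constructing a fresh UpdateGate,
which is why the bypass cannot be re-armed when the driver is built-in.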

> The other thing we could do is backport early ucode loading...
> 
> Hmm, I'm not crazy about either possibility though, TBH.

Indeed.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 15:30                                     ` Henrique de Moraes Holschuh
@ 2014-09-25 15:50                                       ` Borislav Petkov
  2014-09-25 16:41                                         ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25 15:50 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, Sep 25, 2014 at 12:30:06PM -0300, Henrique de Moraes Holschuh wrote:
> Userspace can install the microcode only inside the initramfs, if it wants
> to avoid it being loaded later.  It is not even too difficult to do so.

Hmm, so in thinking about this more, what we need to do on all kernels
should be something along those lines (if I'm not missing something,
that is):


	if (early microcode loading support) {
		install microcode into initramfs;
	}

	install microcode into /lib/firmware/...;
	tell the user to reboot;

On the next reboot, everything gets loaded automatically.

Of course, user needs to make sure that the microcode loader module gets
loaded during boot. If it's built-in, we're fine.

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 15:50                                       ` Borislav Petkov
@ 2014-09-25 16:41                                         ` Henrique de Moraes Holschuh
  2014-09-25 16:57                                           ` Borislav Petkov
  0 siblings, 1 reply; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-25 16:41 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, 25 Sep 2014, Borislav Petkov wrote:
> On Thu, Sep 25, 2014 at 12:30:06PM -0300, Henrique de Moraes Holschuh wrote:
> > Userspace can install the microcode only inside the initramfs, if it wants
> > to avoid it being loaded later.  It is not even too difficult to do so.
> 
> Hmm, so in thinking about this more, what we need to do on all kernels
> should be something along those lines (if I'm not missing something,
> that is):
> 
> 
> 	if (early microcode loading support) {
> 		install microcode into initramfs;
> 	}
> 
> 	install microcode into /lib/firmware/...;
> 	tell the user to reboot;
> 
> On the next reboot, everything gets loaded automatically.
> 
> Of course, user needs to make sure that the microcode loader module gets
> loaded during boot. If it's built-in, we're fine.

We still need to update x86_capability after a microcode update for the
above to really work in the "install microcode into initramfs" case.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 16:41                                         ` Henrique de Moraes Holschuh
@ 2014-09-25 16:57                                           ` Borislav Petkov
  2014-09-25 17:09                                             ` Henrique de Moraes Holschuh
  0 siblings, 1 reply; 47+ messages in thread
From: Borislav Petkov @ 2014-09-25 16:57 UTC (permalink / raw)
  To: Henrique de Moraes Holschuh
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, Sep 25, 2014 at 01:41:21PM -0300, Henrique de Moraes Holschuh wrote:
> We still need to update x86_capability after a microcode update for the
> above to really work in the "install microcode into initramfs" case.

I think you mean for the kernels without early microcode loading...

Because those which do early loading will have loaded the microcode much
earlier than the x86_capability bits are detected. AFAICS, of course. It
is late already :-\
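
The ordering argument can be shown with a toy model.  This is purely
illustrative: boot is a hypothetical function, the "cached" set stands in
for x86_capability, and "hardware" stands in for what CPUID would report
at that moment.

```python
def boot(hardware_features, update_removes, early_loading):
    """Return (cached, hardware) feature sets after boot plus update.

    early_loading=True applies the microcode before feature detection;
    otherwise the update lands after the cache was filled and nothing
    refreshes it.
    """
    hardware = set(hardware_features)
    if early_loading:
        hardware -= set(update_removes)  # update applied before detection
    cached = set(hardware)               # feature detection happens here
    if not early_loading:
        hardware -= set(update_removes)  # late update, cache is now stale
    return cached, hardware
```

With early loading both sets agree; with late loading the cache still
lists whatever the update removed.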

-- 
Regards/Gruss,
    Boris.
--

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-25 16:57                                           ` Borislav Petkov
@ 2014-09-25 17:09                                             ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-25 17:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Chuck Ebbert, Andy Lutomirski, H. Peter Anvin, linux-kernel

On Thu, 25 Sep 2014, Borislav Petkov wrote:
> On Thu, Sep 25, 2014 at 01:41:21PM -0300, Henrique de Moraes Holschuh wrote:
> > We still need to update x86_capability after a microcode update for the
> > above to really work in the "install microcode into initramfs" case.
> 
> I think you mean for the kernels without early microcode loading...

exactly.

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

* Re: x86, microcode: BUG: microcode update that changes x86_capability
  2014-09-19 22:35               ` Henrique de Moraes Holschuh
@ 2014-09-29 11:51                 ` Henrique de Moraes Holschuh
  0 siblings, 0 replies; 47+ messages in thread
From: Henrique de Moraes Holschuh @ 2014-09-29 11:51 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Borislav Petkov, Chuck Ebbert, H. Peter Anvin, linux-kernel

On Fri, 19 Sep 2014, Henrique de Moraes Holschuh wrote:
> On Fri, 19 Sep 2014, Henrique de Moraes Holschuh wrote:
> > I'm filing a bug on Debian glibc, asking them to blacklist HLE until
> > further notice.
> 
> FWIW, https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=762195

Also, glibc lock elision removed from Fedora 20, 21, rawhide due to this
issue:

http://journal.siddhesh.in/posts/buggy-hle-microcode-updates-and-sigills.html

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh

end of thread, other threads:[~2014-09-29 11:51 UTC | newest]

Thread overview: 47+ messages (download: mbox.gz / follow: Atom feed)
2014-09-18 13:52 x86, microcode: BUG: microcode update that changes x86_capability Henrique de Moraes Holschuh
2014-09-18 19:14 ` Andy Lutomirski
2014-09-18 19:53   ` Chuck Ebbert
2014-09-18 19:55     ` H. Peter Anvin
2014-09-18 20:06       ` Henrique de Moraes Holschuh
2014-09-19  0:13         ` Henrique de Moraes Holschuh
2014-09-19  0:23           ` Andy Lutomirski
2014-09-19  0:28             ` H. Peter Anvin
2014-09-19  1:00               ` Andy Lutomirski
2014-09-19  8:03                 ` Borislav Petkov
2014-09-19 11:00             ` Henrique de Moraes Holschuh
2014-09-19 11:29               ` Borislav Petkov
2014-09-19 12:54                 ` Chuck Ebbert
2014-09-19 13:14                   ` Josh Boyer
2014-09-19 13:37                     ` Chuck Ebbert
2014-09-19 15:00                   ` Borislav Petkov
2014-09-19 16:13                     ` Andy Lutomirski
2014-09-19 16:54                       ` Henrique de Moraes Holschuh
2014-09-19 16:42                     ` Henrique de Moraes Holschuh
2014-09-23 20:00                       ` Borislav Petkov
2014-09-24 14:56                         ` Henrique de Moraes Holschuh
2014-09-24 15:00                           ` Andy Lutomirski
2014-09-24 17:45                             ` Henrique de Moraes Holschuh
2014-09-24 17:48                               ` Andy Lutomirski
2014-09-24 18:59                                 ` Henrique de Moraes Holschuh
2014-09-24 19:34                                   ` Andy Lutomirski
2014-09-25  8:57                               ` Borislav Petkov
2014-09-25  8:51                           ` Borislav Petkov
2014-09-25 11:36                             ` Henrique de Moraes Holschuh
2014-09-25 12:10                               ` Borislav Petkov
2014-09-25 14:40                                 ` Henrique de Moraes Holschuh
2014-09-25 14:56                                   ` Borislav Petkov
2014-09-25 15:30                                     ` Henrique de Moraes Holschuh
2014-09-25 15:50                                       ` Borislav Petkov
2014-09-25 16:41                                         ` Henrique de Moraes Holschuh
2014-09-25 16:57                                           ` Borislav Petkov
2014-09-25 17:09                                             ` Henrique de Moraes Holschuh
2014-09-19 13:51                 ` Henrique de Moraes Holschuh
2014-09-19 14:49                   ` Borislav Petkov
2014-09-19 17:22                     ` Henrique de Moraes Holschuh
2014-09-19 22:35               ` Henrique de Moraes Holschuh
2014-09-29 11:51                 ` Henrique de Moraes Holschuh
2014-09-19  9:56     ` Henrique de Moraes Holschuh
2014-09-19 16:11   ` Henrique de Moraes Holschuh
2014-09-22  0:37 ` Andi Kleen
2014-09-22  0:51   ` H. Peter Anvin
2014-09-22  0:58     ` Andi Kleen
