All of lore.kernel.org
 help / color / mirror / Atom feed
* x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
@ 2018-02-06 14:09 Rolf Neugebauer
  2018-02-06 14:24 ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-06 14:09 UTC (permalink / raw)
  To: Borislav Petkov, Jia Zhang, Thomas Gleixner, Tony Luck,
	Ingo Molnar, H. Peter Anvin, x86, LKML, Greg KH, Rolf Neugebauer

The backport of 7e702d17ed1 ("x86/microcode/intel: Extend BDW
late-loading further with LLC size check") to 4.9.79 and 4.4.14 causes
a division by zero panic on some single vCPU machine types on Google
Cloud (e.g. g1-small and n1-standard-1):

[    1.591435] divide error: 0000 [#1] SMP
[    1.591961] Modules linked in:
[    1.592546] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.9.79-linuxkit #1
[    1.593461] Hardware name: Google Google Compute Engine/Google
Compute Engine, BIOS Google 01/01/2011
[    1.595135] task: ffffa01b2b63c040 task.stack: ffffb24f40010000
[    1.596683] RIP: 0010:[<ffffffffb7d88ce3>]  [<ffffffffb7d88ce3>]
init_intel_microcode+0x49/0x5a
[    1.598018] RSP: 0000:ffffb24f40013e48  EFLAGS: 00010206
[    1.599016] RAX: 0000000001400000 RBX: 00000000ffffffea RCX: 0000000000000000
[    1.600093] RDX: 0000000000000000 RSI: ffffa01b2fffa9a8 RDI: ffffffffb7d8888b
[    1.601222] RBP: 00000000ffffffff R08: ffffa01b2fffa9a1 R09: 000000000000005f
[    1.602330] R10: 0000000000000067 R11: ffffa01b22164177 R12: 0000000000000000
[    1.603291] R13: ffffffffb7d787b6 R14: 0000000000000000 R15: 0000000000000000
[    1.604406] FS:  0000000000000000(0000) GS:ffffa01b2fc00000(0000)
knlGS:0000000000000000
[    1.606154] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.607493] CR2: 0000000000000000 CR3: 000000002ac0a000 CR4: 0000000000040630
[    1.609310] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    1.611122] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    1.613008] Stack:
[    1.613445]  ffffffffb7d888c1 0000000000000000 ffffffffb7d1ba40
0000000000000000
[    1.615423]  0000000000000000 ffffffffb7d787b6 0000000000000000
0000000000000000
[    1.617049]  ffffffffb74fed89 00000000fffffff4 00000000ffffffff
4eae7bc133547c47
[    1.617904] Call Trace:
[    1.618198]  [<ffffffffb7d888c1>] ? microcode_init+0x36/0x1de
[    1.619264]  [<ffffffffb7d787b6>] ? set_debug_rodata+0xc/0xc
[    1.620143]  [<ffffffffb74fed89>] ? driver_register+0xaf/0xb4
[    1.621841]  [<ffffffffb7d8888b>] ? save_microcode_in_initrd+0x3c/0x3c
[    1.623424]  [<ffffffffb70021b7>] ? do_one_initcall+0x98/0x130
[    1.624659]  [<ffffffffb7d787b6>] ? set_debug_rodata+0xc/0xc
[    1.626303]  [<ffffffffb7d79091>] ? kernel_init_freeable+0x166/0x1e7
[    1.627674]  [<ffffffffb77ce3d7>] ? rest_init+0x6e/0x6e
[    1.628942]  [<ffffffffb77ce3e1>] ? kernel_init+0xa/0xe6
[    1.630243]  [<ffffffffb77d8587>] ? ret_from_fork+0x57/0x70
[    1.631592] Code: 16 0f b6 35 a0 ac fb ff 48 c7 c7 c4 ea 99 b7 e8
82 3e 41 ff 31 c0 c3 8b 05 3b ad fb ff 0f b7 0d 54 ad fb ff 31 d2 c1
e0 0a 48 98 <48> f7 f1 89 05 f4 6a 14 00 48 c7 c0 00 d8 c2 b7 c3 4c 8d
54 24
[    1.638875] RIP  [<ffffffffb7d88ce3>] init_intel_microcode+0x49/0x5a
[    1.640382]  RSP <ffffb24f40013e48>
[    1.641320] ---[ end trace e88ef332e19594b9 ]---
[    1.642418] Kernel panic - not syncing: Fatal exception
[    1.644581] Kernel Offset: 0x36000000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[    1.647312] ---[ end Kernel panic - not syncing: Fatal exception

On these machines x86_max_cores is zero:

[    0.227977] ftrace: allocating 35563 entries in 139 pages
[    0.273941] smpboot: x86_max_cores == zero !?!?
[    0.274645] smpboot: Max logical packages: 1


4.9.78 (and earlier) as well as 4.4.13 (and earlier) worked fine so
this seems like a regression in stable.

4.14.16 and 4.15 (to which the above patch was backported as well)
work fine, but I noticed significant code changes for these kernels
around where x86_max_cores is set.

The obvious quick fix/hack is to check for x86_max_cores in
calc_llc_size_per_core() and return 0. I'm happy to submit a patch for
this, but a more correct fix might be back porting the relevant
changes around where  x86_max_cores is set.

Thanks
Rolf

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-06 14:09 x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114 Rolf Neugebauer
@ 2018-02-06 14:24 ` Borislav Petkov
  2018-02-07 14:34   ` Greg KH
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2018-02-06 14:24 UTC (permalink / raw)
  To: Rolf Neugebauer, Greg KH
  Cc: Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML, Greg KH

On Tue, Feb 06, 2018 at 02:09:35PM +0000, Rolf Neugebauer wrote:
> The backport of 7e702d17ed1 ("x86/microcode/intel: Extend BDW
> late-loading further with LLC size check") to 4.9.79 and 4.4.14 causes
> a division by zero panic on some single vCPU machine types on Google
> Cloud (e.g. g1-small and n1-standard-1):

The microcode loader is not supposed to load in a guest.

Greg, I guess you need something like this:

a15a753539ec ("x86/microcode/AMD: Do not load when running on a hypervisor")

I haven't tried whether it applies cleanly though...

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-06 14:24 ` Borislav Petkov
@ 2018-02-07 14:34   ` Greg KH
  2018-02-07 16:31     ` Rolf Neugebauer
  0 siblings, 1 reply; 14+ messages in thread
From: Greg KH @ 2018-02-07 14:34 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Rolf Neugebauer, Jia Zhang, Thomas Gleixner, Tony Luck,
	Ingo Molnar, H. Peter Anvin, x86, LKML

On Tue, Feb 06, 2018 at 03:24:44PM +0100, Borislav Petkov wrote:
> On Tue, Feb 06, 2018 at 02:09:35PM +0000, Rolf Neugebauer wrote:
> > The backport of 7e702d17ed1 ("x86/microcode/intel: Extend BDW
> > late-loading further with LLC size check") to 4.9.79 and 4.4.14 causes
> > a division by zero panic on some single vCPU machine types on Google
> > Cloud (e.g. g1-small and n1-standard-1):
> 
> The microcode loader is not supposed to load in a guest.
> 
> Greg, I guess you need something like this:
> 
> a15a753539ec ("x86/microcode/AMD: Do not load when running on a hypervisor")
> 
> I haven't tried whether it applies cleanly though...

It applied cleanly to 4.9 (with some fuzz, but looks clean enough...)

But for 4.4, not at all.  Rolf, any chance you could backport the patch
there for this issue?  If not, I can add it to my queue, but it might
take a few days...

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-07 14:34   ` Greg KH
@ 2018-02-07 16:31     ` Rolf Neugebauer
  2018-02-07 18:12       ` Borislav Petkov
  2018-02-08 16:39       ` Greg KH
  0 siblings, 2 replies; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-07 16:31 UTC (permalink / raw)
  To: Greg KH
  Cc: Borislav Petkov, Jia Zhang, Thomas Gleixner, Tony Luck,
	Ingo Molnar, H. Peter Anvin, x86, LKML

On Wed, Feb 7, 2018 at 2:34 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Tue, Feb 06, 2018 at 03:24:44PM +0100, Borislav Petkov wrote:
>> On Tue, Feb 06, 2018 at 02:09:35PM +0000, Rolf Neugebauer wrote:
>> > The backport of 7e702d17ed1 ("x86/microcode/intel: Extend BDW
>> > late-loading further with LLC size check") to 4.9.79 and 4.4.14 causes
>> > a division by zero panic on some single vCPU machine types on Google
>> > Cloud (e.g. g1-small and n1-standard-1):
>>
>> The microcode loader is not supposed to load in a guest.
>>
>> Greg, I guess you need something like this:
>>
>> a15a753539ec ("x86/microcode/AMD: Do not load when running on a hypervisor")
>>
>> I haven't tried whether it applies cleanly though...
>
> It applied cleanly to 4.9 (with some fuzz, but looks clean enough...)
>
> But for 4.4, not at all.  Rolf, any chance you could backport the patch
> there for this issue?  If not, I can add it to my queue, but it might
> take a few days...

The backport was straight forward. I've added it below. If my mailer
(gmail) screws this up let me where I should send it with 'git
send-email'

Rolf

>From 2cca500deea6a2f47dc126bfe3713933d7192c33 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp@suse.de>
Date: Sun, 18 Dec 2016 17:44:13 +0100
Subject: [PATCH 1/1] x86/microcode/AMD: Do not load when running on a
 hypervisor

commit a15a753539eca8ba243d576f02e7ca9c4b7d7042 upstream with minor
adjustments.

Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.

By having it turned off by default, the APs won't run into the loader
either.

Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".

Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Juergen Gross <jgross@suse.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
---
 arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c
b/arch/x86/kernel/cpu/microcode/core.c
index b3e94ef461fd..1b3e0aa4c511 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -44,7 +44,7 @@

 static struct microcode_ops    *microcode_ops;

-static bool dis_ucode_ldr;
+static bool dis_ucode_ldr = true;

 static int __init disable_loader(char *str)
 {
@@ -81,6 +81,7 @@ struct cpu_info_ctx {

 static bool __init check_loader_disabled_bsp(void)
 {
+       u32 a, b, c, d;
 #ifdef CONFIG_X86_32
        const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
        const char *opt     = "dis_ucode_ldr";
@@ -93,8 +94,23 @@ static bool __init check_loader_disabled_bsp(void)
        bool *res = &dis_ucode_ldr;
 #endif

-       if (cmdline_find_option_bool(cmdline, option))
-               *res = true;
+       if (!have_cpuid_p())
+               return *res;
+
+       a = 1;
+       c = 0;
+       native_cpuid(&a, &b, &c, &d);
+
+       /*
+        * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
+        * completely accurate as xen pv guests don't see that CPUID bit set but
+        * that's good enough as they don't land on the BSP path anyway.
+        */
+       if (c & BIT(31))
+               return *res;
+
+       if (cmdline_find_option_bool(cmdline, option) <= 0)
+               *res = false;

        return *res;
 }
@@ -126,9 +142,6 @@ void __init load_ucode_bsp(void)
        if (check_loader_disabled_bsp())
                return;

-       if (!have_cpuid_p())
-               return;
-
        vendor = x86_vendor();
        family = x86_family();

@@ -162,9 +175,6 @@ void load_ucode_ap(void)
        if (check_loader_disabled_ap())
                return;

-       if (!have_cpuid_p())
-               return;
-
        vendor = x86_vendor();
        family = x86_family();

--
2.16.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-07 16:31     ` Rolf Neugebauer
@ 2018-02-07 18:12       ` Borislav Petkov
  2018-02-08 13:55         ` Rolf Neugebauer
  2018-02-08 16:39       ` Greg KH
  1 sibling, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2018-02-07 18:12 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: Greg KH, Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML

On Wed, Feb 07, 2018 at 04:31:59PM +0000, Rolf Neugebauer wrote:
>  arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
>  1 file changed, 19 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/core.c
> b/arch/x86/kernel/cpu/microcode/core.c
> index b3e94ef461fd..1b3e0aa4c511 100644
> --- a/arch/x86/kernel/cpu/microcode/core.c
> +++ b/arch/x86/kernel/cpu/microcode/core.c
> @@ -44,7 +44,7 @@
> 
>  static struct microcode_ops    *microcode_ops;
> 
> -static bool dis_ucode_ldr;
> +static bool dis_ucode_ldr = true;
> 
>  static int __init disable_loader(char *str)
>  {
> @@ -81,6 +81,7 @@ struct cpu_info_ctx {
> 
>  static bool __init check_loader_disabled_bsp(void)
>  {
> +       u32 a, b, c, d;
>  #ifdef CONFIG_X86_32
>         const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
>         const char *opt     = "dis_ucode_ldr";
> @@ -93,8 +94,23 @@ static bool __init check_loader_disabled_bsp(void)
>         bool *res = &dis_ucode_ldr;
>  #endif
> 
> -       if (cmdline_find_option_bool(cmdline, option))
> -               *res = true;
> +       if (!have_cpuid_p())
> +               return *res;

That might cause an issue, see 1f161f67a272c. Might wanna backport that
commit too, just in case, for those old CPUID-less geodes.

The rest looks ok.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-07 18:12       ` Borislav Petkov
@ 2018-02-08 13:55         ` Rolf Neugebauer
  2018-02-08 14:53           ` Borislav Petkov
  0 siblings, 1 reply; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-08 13:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Greg KH, Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML

On Wed, Feb 7, 2018 at 6:12 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Wed, Feb 07, 2018 at 04:31:59PM +0000, Rolf Neugebauer wrote:
>>  arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
>>  1 file changed, 19 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/microcode/core.c
>> b/arch/x86/kernel/cpu/microcode/core.c
>> index b3e94ef461fd..1b3e0aa4c511 100644
>> --- a/arch/x86/kernel/cpu/microcode/core.c
>> +++ b/arch/x86/kernel/cpu/microcode/core.c
>> @@ -44,7 +44,7 @@
>>
>>  static struct microcode_ops    *microcode_ops;
>>
>> -static bool dis_ucode_ldr;
>> +static bool dis_ucode_ldr = true;
>>
>>  static int __init disable_loader(char *str)
>>  {
>> @@ -81,6 +81,7 @@ struct cpu_info_ctx {
>>
>>  static bool __init check_loader_disabled_bsp(void)
>>  {
>> +       u32 a, b, c, d;
>>  #ifdef CONFIG_X86_32
>>         const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
>>         const char *opt     = "dis_ucode_ldr";
>> @@ -93,8 +94,23 @@ static bool __init check_loader_disabled_bsp(void)
>>         bool *res = &dis_ucode_ldr;
>>  #endif
>>
>> -       if (cmdline_find_option_bool(cmdline, option))
>> -               *res = true;
>> +       if (!have_cpuid_p())
>> +               return *res;
>
> That might cause an issue, see 1f161f67a272c. Might wanna backport that
> commit too, just in case, for those old CPUID-less geodes.

On the 4.4 kernel, 1f161f67a272c ("x86/microcode: Do the family check
first") does not apply cleanly. Looks like it relies on 309aac77768c0
("x86/microcode: Decrease CPUID use") and 7a93a40be23e5
("x86/microcode: Remove local vendor variable") as well. These don't
apply cleanly either.

Should I just manually backport the functionality of 1f161f67a272c?

Rolf

>
> The rest looks ok.
>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-08 13:55         ` Rolf Neugebauer
@ 2018-02-08 14:53           ` Borislav Petkov
  2018-02-08 16:25             ` Rolf Neugebauer
  0 siblings, 1 reply; 14+ messages in thread
From: Borislav Petkov @ 2018-02-08 14:53 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: Greg KH, Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML

On Thu, Feb 08, 2018 at 01:55:48PM +0000, Rolf Neugebauer wrote:
> On the 4.4 kernel, 1f161f67a272c ("x86/microcode: Do the family check
> first") does not apply cleanly. Looks like it relies on 309aac77768c0
> ("x86/microcode: Decrease CPUID use") and 7a93a40be23e5
> ("x86/microcode: Remove local vendor variable") as well. These don't
> apply cleanly either.

It figures. The backporting game is hm, well, a fun one. :-)

> Should I just manually backport the functionality of 1f161f67a272c?

Yeah, try that first.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-08 14:53           ` Borislav Petkov
@ 2018-02-08 16:25             ` Rolf Neugebauer
  2018-02-08 16:39               ` Greg KH
  2018-02-08 18:26               ` x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114 Borislav Petkov
  0 siblings, 2 replies; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-08 16:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Greg KH, Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML

On Thu, Feb 8, 2018 at 2:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Thu, Feb 08, 2018 at 01:55:48PM +0000, Rolf Neugebauer wrote:
>> On the 4.4 kernel, 1f161f67a272c ("x86/microcode: Do the family check
>> first") does not apply cleanly. Looks like it relies on 309aac77768c0
>> ("x86/microcode: Decrease CPUID use") and 7a93a40be23e5
>> ("x86/microcode: Remove local vendor variable") as well. These don't
>> apply cleanly either.
>
> It figures. The backporting game is hm, well, a fun one. :-)
>
>> Should I just manually backport the functionality of 1f161f67a272c?
>
> Yeah, try that first.

Here it is below (same disclaimer about crappy mail client). If Boris
is fine with the changes, Greg, I can send it properly with "git
send-email" if needed.

Rolf

>From 397a76b8403f0b4a73222a031e52d72da94af2f6 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp@suse.de>
Date: Thu, 12 Oct 2017 13:23:16 +0200
Subject: [PATCH 2/2] x86/microcode: Do the family check first

commit 1f161f67a272cc4f29f27934dd3f74cb657eb5c4 upstream with adjustments.

On CPUs like AMD's Geode, for example, we shouldn't even try to load
microcode because they do not support the modern microcode loading
interface.

However, we do the family check *after* the other checks whether the
loader has been disabled on the command line or whether we're running in
a guest.

So move the family checks first in order to exit early if we're being
loaded on an unsupported family.

Reported-and-tested-by: Sven Glodowski <glodi1@arcor.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.11..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396
Link: http://lkml.kernel.org/r/20171012112316.977-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
---
 arch/x86/kernel/cpu/microcode/core.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c
b/arch/x86/kernel/cpu/microcode/core.c
index 1b3e0aa4c511..ce5f8a2e7ae6 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -94,9 +94,6 @@ static bool __init check_loader_disabled_bsp(void)
        bool *res = &dis_ucode_ldr;
 #endif

-       if (!have_cpuid_p())
-               return *res;
-
        a = 1;
        c = 0;
        native_cpuid(&a, &b, &c, &d);
@@ -138,8 +135,9 @@ void __init load_ucode_bsp(void)
 {
        int vendor;
        unsigned int family;
+       bool intel = true;

-       if (check_loader_disabled_bsp())
+       if (!have_cpuid_p())
                return;

        vendor = x86_vendor();
@@ -147,16 +145,27 @@ void __init load_ucode_bsp(void)

        switch (vendor) {
        case X86_VENDOR_INTEL:
-               if (family >= 6)
-                       load_ucode_intel_bsp();
+               if (family < 6)
+                       return;
                break;
+
        case X86_VENDOR_AMD:
-               if (family >= 0x10)
-                       load_ucode_amd_bsp(family);
+               if (family < 0x10)
+                       return;
+               intel = false;
                break;
+
        default:
-               break;
+               return;
        }
+
+       if (check_loader_disabled_bsp())
+               return;
+
+       if (intel)
+               load_ucode_intel_bsp();
+       else
+               load_ucode_amd_bsp(family);
 }

 static bool check_loader_disabled_ap(void)
--
2.16.0

>
> --
> Regards/Gruss,
>     Boris.
>
> Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-07 16:31     ` Rolf Neugebauer
  2018-02-07 18:12       ` Borislav Petkov
@ 2018-02-08 16:39       ` Greg KH
  1 sibling, 0 replies; 14+ messages in thread
From: Greg KH @ 2018-02-08 16:39 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: Borislav Petkov, Jia Zhang, Thomas Gleixner, Tony Luck,
	Ingo Molnar, H. Peter Anvin, x86, LKML

On Wed, Feb 07, 2018 at 04:31:59PM +0000, Rolf Neugebauer wrote:
> On Wed, Feb 7, 2018 at 2:34 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> > On Tue, Feb 06, 2018 at 03:24:44PM +0100, Borislav Petkov wrote:
> >> On Tue, Feb 06, 2018 at 02:09:35PM +0000, Rolf Neugebauer wrote:
> >> > The backport of 7e702d17ed1 ("x86/microcode/intel: Extend BDW
> >> > late-loading further with LLC size check") to 4.9.79 and 4.4.14 causes
> >> > a division by zero panic on some single vCPU machine types on Google
> >> > Cloud (e.g. g1-small and n1-standard-1):
> >>
> >> The microcode loader is not supposed to load in a guest.
> >>
> >> Greg, I guess you need something like this:
> >>
> >> a15a753539ec ("x86/microcode/AMD: Do not load when running on a hypervisor")
> >>
> >> I haven't tried whether it applies cleanly though...
> >
> > It applied cleanly to 4.9 (with some fuzz, but looks clean enough...)
> >
> > But for 4.4, not at all.  Rolf, any chance you could backport the patch
> > there for this issue?  If not, I can add it to my queue, but it might
> > take a few days...
> 
> The backport was straight forward. I've added it below. If my mailer
> (gmail) screws this up let me where I should send it with 'git
> send-email'

Yeah, it's messed up, can you resend it please?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-08 16:25             ` Rolf Neugebauer
@ 2018-02-08 16:39               ` Greg KH
  2018-02-08 17:12                 ` [PATCH 1/2] x86/microcode/AMD: Do not load when running on a hypervisor Rolf Neugebauer
  2018-02-08 18:26               ` x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114 Borislav Petkov
  1 sibling, 1 reply; 14+ messages in thread
From: Greg KH @ 2018-02-08 16:39 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: Borislav Petkov, Jia Zhang, Thomas Gleixner, Tony Luck,
	Ingo Molnar, H. Peter Anvin, x86, LKML

On Thu, Feb 08, 2018 at 04:25:58PM +0000, Rolf Neugebauer wrote:
> On Thu, Feb 8, 2018 at 2:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> > On Thu, Feb 08, 2018 at 01:55:48PM +0000, Rolf Neugebauer wrote:
> >> On the 4.4 kernel, 1f161f67a272c ("x86/microcode: Do the family check
> >> first") does not apply cleanly. Looks like it relies on 309aac77768c0
> >> ("x86/microcode: Decrease CPUID use") and 7a93a40be23e5
> >> ("x86/microcode: Remove local vendor variable") as well. These don't
> >> apply cleanly either.
> >
> > It figures. The backporting game is hm, well, a fun one. :-)
> >
> >> Should I just manually backport the functionality of 1f161f67a272c?
> >
> > Yeah, try that first.
> 
> Here it is below (same disclaimer about crappy mail client). If Boris
> is fine with the changes, Greg, I can send it properly with "git
> send-email" if needed.

Can you also send this for 4.9?  It should need it as well.  And at
first glance, it looks good to me.

thanks for doing this,

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] x86/microcode/AMD: Do not load when running on a hypervisor
  2018-02-08 16:39               ` Greg KH
@ 2018-02-08 17:12                 ` Rolf Neugebauer
  2018-02-08 17:12                   ` [PATCH 2/2] x86/microcode: Do the family check first Rolf Neugebauer
  0 siblings, 1 reply; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-08 17:12 UTC (permalink / raw)
  To: gregkh
  Cc: bp, zhang.jia, tglx, tony.luck, mingo, hpa, x86, linux-kernel,
	Borislav Petkov, stable, Rolf Neugebauer

From: Borislav Petkov <bp@suse.de>

commit a15a753539eca8ba243d576f02e7ca9c4b7d7042 upstream with minor
adjustments.

Doing so is completely void of sense for multiple reasons so prevent
it. Set dis_ucode_ldr to true and thus disable the microcode loader by
default to address xen pv guests which execute the AP path but not the
BSP path.

By having it turned off by default, the APs won't run into the loader
either.

Also, check CPUID(1).ECX[31] which hypervisors set. Well almost, not the
xen pv one. That one gets the aforementioned "fix".

Also, improve the detection method by caching the final decision whether
to continue loading in dis_ucode_ldr and do it once on the BSP. The APs
then simply test that value.

Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Juergen Gross <jgross@suse.com>
Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Acked-by: Juergen Gross <jgross@suse.com>
Link: http://lkml.kernel.org/r/20161218164414.9649-4-bp@alien8.de
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
---
 arch/x86/kernel/cpu/microcode/core.c | 28 +++++++++++++++++++---------
 1 file changed, 19 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index b3e94ef461fd..1b3e0aa4c511 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -44,7 +44,7 @@
 
 static struct microcode_ops	*microcode_ops;
 
-static bool dis_ucode_ldr;
+static bool dis_ucode_ldr = true;
 
 static int __init disable_loader(char *str)
 {
@@ -81,6 +81,7 @@ struct cpu_info_ctx {
 
 static bool __init check_loader_disabled_bsp(void)
 {
+	u32 a, b, c, d;
 #ifdef CONFIG_X86_32
 	const char *cmdline = (const char *)__pa_nodebug(boot_command_line);
 	const char *opt	    = "dis_ucode_ldr";
@@ -93,8 +94,23 @@ static bool __init check_loader_disabled_bsp(void)
 	bool *res = &dis_ucode_ldr;
 #endif
 
-	if (cmdline_find_option_bool(cmdline, option))
-		*res = true;
+	if (!have_cpuid_p())
+		return *res;
+
+	a = 1;
+	c = 0;
+	native_cpuid(&a, &b, &c, &d);
+
+	/*
+	 * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
+	 * completely accurate as xen pv guests don't see that CPUID bit set but
+	 * that's good enough as they don't land on the BSP path anyway.
+	 */
+	if (c & BIT(31))
+		return *res;
+
+	if (cmdline_find_option_bool(cmdline, option) <= 0)
+		*res = false;
 
 	return *res;
 }
@@ -126,9 +142,6 @@ void __init load_ucode_bsp(void)
 	if (check_loader_disabled_bsp())
 		return;
 
-	if (!have_cpuid_p())
-		return;
-
 	vendor = x86_vendor();
 	family = x86_family();
 
@@ -162,9 +175,6 @@ void load_ucode_ap(void)
 	if (check_loader_disabled_ap())
 		return;
 
-	if (!have_cpuid_p())
-		return;
-
 	vendor = x86_vendor();
 	family = x86_family();
 
-- 
2.16.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] x86/microcode: Do the family check first
  2018-02-08 17:12                 ` [PATCH 1/2] x86/microcode/AMD: Do not load when running on a hypervisor Rolf Neugebauer
@ 2018-02-08 17:12                   ` Rolf Neugebauer
  2018-02-08 17:25                     ` Greg KH
  0 siblings, 1 reply; 14+ messages in thread
From: Rolf Neugebauer @ 2018-02-08 17:12 UTC (permalink / raw)
  To: gregkh
  Cc: bp, zhang.jia, tglx, tony.luck, mingo, hpa, x86, linux-kernel,
	Borislav Petkov, stable, Linus Torvalds, Peter Zijlstra,
	Ingo Molnar, Rolf Neugebauer

From: Borislav Petkov <bp@suse.de>

commit 1f161f67a272cc4f29f27934dd3f74cb657eb5c4 upstream with adjustments.

On CPUs like AMD's Geode, for example, we shouldn't even try to load
microcode because they do not support the modern microcode loading
interface.

However, we do the family check *after* the other checks whether the
loader has been disabled on the command line or whether we're running in
a guest.

So move the family checks first in order to exit early if we're being
loaded on an unsupported family.

Reported-and-tested-by: Sven Glodowski <glodi1@arcor.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: <stable@vger.kernel.org> # 4.11..
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://bugzilla.suse.com/show_bug.cgi?id=1061396
Link: http://lkml.kernel.org/r/20171012112316.977-1-bp@alien8.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: <stable@vger.kernel.org> # 4.4.x
Signed-off-by: Rolf Neugebauer <rolf.neugebauer@docker.com>
---
 arch/x86/kernel/cpu/microcode/core.c | 27 ++++++++++++++++++---------
 1 file changed, 18 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index 1b3e0aa4c511..ce5f8a2e7ae6 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -94,9 +94,6 @@ static bool __init check_loader_disabled_bsp(void)
 	bool *res = &dis_ucode_ldr;
 #endif
 
-	if (!have_cpuid_p())
-		return *res;
-
 	a = 1;
 	c = 0;
 	native_cpuid(&a, &b, &c, &d);
@@ -138,8 +135,9 @@ void __init load_ucode_bsp(void)
 {
 	int vendor;
 	unsigned int family;
+	bool intel = true;
 
-	if (check_loader_disabled_bsp())
+	if (!have_cpuid_p())
 		return;
 
 	vendor = x86_vendor();
@@ -147,16 +145,27 @@ void __init load_ucode_bsp(void)
 
 	switch (vendor) {
 	case X86_VENDOR_INTEL:
-		if (family >= 6)
-			load_ucode_intel_bsp();
+		if (family < 6)
+			return;
 		break;
+
 	case X86_VENDOR_AMD:
-		if (family >= 0x10)
-			load_ucode_amd_bsp(family);
+		if (family < 0x10)
+			return;
+		intel = false;
 		break;
+
 	default:
-		break;
+		return;
 	}
+
+	if (check_loader_disabled_bsp())
+		return;
+
+	if (intel)
+		load_ucode_intel_bsp();
+	else
+		load_ucode_amd_bsp(family);
 }
 
 static bool check_loader_disabled_ap(void)
-- 
2.16.0

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] x86/microcode: Do the family check first
  2018-02-08 17:12                   ` [PATCH 2/2] x86/microcode: Do the family check first Rolf Neugebauer
@ 2018-02-08 17:25                     ` Greg KH
  0 siblings, 0 replies; 14+ messages in thread
From: Greg KH @ 2018-02-08 17:25 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: bp, zhang.jia, tglx, tony.luck, mingo, hpa, x86, linux-kernel,
	Borislav Petkov, stable, Linus Torvalds, Peter Zijlstra,
	Ingo Molnar

On Thu, Feb 08, 2018 at 05:12:04PM +0000, Rolf Neugebauer wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> commit 1f161f67a272cc4f29f27934dd3f74cb657eb5c4 upstream with adjustments.

This also seems to work for 4.9, thanks!  Both now queued up.

greg k-h

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114
  2018-02-08 16:25             ` Rolf Neugebauer
  2018-02-08 16:39               ` Greg KH
@ 2018-02-08 18:26               ` Borislav Petkov
  1 sibling, 0 replies; 14+ messages in thread
From: Borislav Petkov @ 2018-02-08 18:26 UTC (permalink / raw)
  To: Rolf Neugebauer
  Cc: Greg KH, Jia Zhang, Thomas Gleixner, Tony Luck, Ingo Molnar,
	H. Peter Anvin, x86, LKML

On Thu, Feb 08, 2018 at 04:25:58PM +0000, Rolf Neugebauer wrote:
> Here it is below (same disclaimer about crappy mail client). If Boris
> is fine with the changes, Greg, I can send it properly with "git
> send-email" if needed.

Yap, looks ok to me.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2018-02-08 18:26 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-06 14:09 x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114 Rolf Neugebauer
2018-02-06 14:24 ` Borislav Petkov
2018-02-07 14:34   ` Greg KH
2018-02-07 16:31     ` Rolf Neugebauer
2018-02-07 18:12       ` Borislav Petkov
2018-02-08 13:55         ` Rolf Neugebauer
2018-02-08 14:53           ` Borislav Petkov
2018-02-08 16:25             ` Rolf Neugebauer
2018-02-08 16:39               ` Greg KH
2018-02-08 17:12                 ` [PATCH 1/2] x86/microcode/AMD: Do not load when running on a hypervisor Rolf Neugebauer
2018-02-08 17:12                   ` [PATCH 2/2] x86/microcode: Do the family check first Rolf Neugebauer
2018-02-08 17:25                     ` Greg KH
2018-02-08 18:26               ` x86/microcode/intel: Division by zero panic in 4.9.79 and 4.4.114 Borislav Petkov
2018-02-08 16:39       ` Greg KH

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.