linux-kernel.vger.kernel.org archive mirror
* Problem with late AMD microcode reload/feedback
@ 2018-12-15 23:46 Rafał Miłecki
  2018-12-16  0:05 ` Borislav Petkov
  2018-12-16  0:08 ` Borislav Petkov
  0 siblings, 2 replies; 9+ messages in thread
From: Rafał Miłecki @ 2018-12-15 23:46 UTC
  To: Borislav Petkov, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	x86, Linux Kernel Mailing List

Hi,

I'm trying to reload the AMD Ryzen Mobile (fam17h) microcode by running:
echo 1 > /sys/devices/system/cpu/microcode/reload

The problem is that I don't get any feedback: no error from the "echo"
command and not a single new line in dmesg. I have no idea whether the
microcode has been reloaded or not.
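
One way to get feedback without touching the kernel is to compare the
revision the CPUs report before and after the reload. The sketch below
assumes an x86 /proc/cpuinfo that carries a "microcode : 0x..." line per
CPU; the helper names and the sample text are made up for illustration.

```python
import re

# Matches the per-CPU "microcode : 0x..." field of an x86 /proc/cpuinfo.
MICROCODE_RE = re.compile(r"^microcode\s*:\s*(0x[0-9a-fA-F]+)", re.MULTILINE)


def microcode_revisions(cpuinfo_text):
    """Return the set of microcode revisions found in cpuinfo-style text."""
    return {int(rev, 16) for rev in MICROCODE_RE.findall(cpuinfo_text)}


def reload_changed(before_text, after_text):
    """True if the set of reported revisions differs between two snapshots."""
    return microcode_revisions(before_text) != microcode_revisions(after_text)


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read a snapshot; call once before and once after the reload."""
    with open(path) as f:
        return f.read()


# Hypothetical two-CPU snapshot, used only to exercise the parser.
SAMPLE = (
    "processor\t: 0\nmicrocode\t: 0x810100b\n"
    "processor\t: 1\nmicrocode\t: 0x810100b\n"
)
```

Taking read_cpuinfo() before and after the "echo 1" and passing both
snapshots to reload_changed() shows whether any CPU picked up a new patch.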

I did a quick pr_info based debugging and I noticed that:
1) load_microcode_amd() calls __load_microcode_amd() and gets UCODE_OK
2) load_microcode_amd() calls find_patch(0) and gets a NULL

Because of that NULL, load_microcode_amd() doesn't return UCODE_NEW.

Seeing the above, I decided to debug find_patch(). It seems to call
__find_equiv_id(0), which returns 0.

The last step was debugging __find_equiv_id() and find_equiv_id(). It
seems that find_equiv_id() gets sig 8458000, which doesn't exist in the
equiv_cpu_table:
[19.736770] microcode: [find_equiv_id] sig:8458000
[19.736772] microcode: [find_equiv_id] equiv_table->installed_cpu:8392466
[19.736775] microcode: [find_equiv_id] equiv_table->installed_cpu:8392578

Has my microcode been updated? Is there a way to improve that
microcode loading code? Is find_patch(0) returning NULL expected, or
is it a bug?

-- 
Rafał


* Re: Problem with late AMD microcode reload/feedback
  2018-12-15 23:46 Problem with late AMD microcode reload/feedback Rafał Miłecki
@ 2018-12-16  0:05 ` Borislav Petkov
  2018-12-16  8:08   ` Rafał Miłecki
  2018-12-16  0:08 ` Borislav Petkov
  1 sibling, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2018-12-16  0:05 UTC
  To: Rafał Miłecki
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, Dec 16, 2018 at 12:46:05AM +0100, Rafał Miłecki wrote:
> [19.736770] microcode: [find_equiv_id] sig:8458000

That's your CPU's family/model/stepping: 0x0810f10

> [19.736772] microcode: [find_equiv_id] equiv_table->installed_cpu:8392466
> [19.736775] microcode: [find_equiv_id] equiv_table->installed_cpu:8392578

and those are the entries present in the equivalence table on your
system. Best to look at them in hex, btw:

0x0800f12
0x0800f82

Which means there's no microcode for your CPU, so nothing gets updated.
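
The conversion is easy to reproduce. The sketch below turns the decimal
values from the debug output into hex and splits each one into
family/model/stepping using the standard CPUID leaf 1 EAX layout. The
function name is made up for illustration.

```python
def decode_cpuid_signature(sig):
    """Split a CPUID(1).EAX signature into (family, model, stepping)."""
    stepping = sig & 0xF
    base_model = (sig >> 4) & 0xF
    base_family = (sig >> 8) & 0xF
    ext_model = (sig >> 16) & 0xF
    ext_family = (sig >> 20) & 0xFF
    if base_family == 0xF:
        # Extended fields only apply when the base family is 0xF.
        family = base_family + ext_family
        model = (ext_model << 4) | base_model
    else:
        family, model = base_family, base_model
    return family, model, stepping


cpu_sig = 8458000                 # "sig" from the debug output
table_sigs = [8392466, 8392578]   # "installed_cpu" values from the debug output

for value in [cpu_sig] + table_sigs:
    family, model, stepping = decode_cpuid_signature(value)
    print(f"{value} = {value:#x}: family {family:#x}, "
          f"model {model:#x}, stepping {stepping}")

# The CPU's own signature is not among the table entries.
print(cpu_sig in table_sigs)
```

All three decode to family 0x17, but the models differ (0x11 for the
running CPU versus 0x01 and 0x08 in the table), which is why the lookup
comes back empty.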

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: Problem with late AMD microcode reload/feedback
  2018-12-15 23:46 Problem with late AMD microcode reload/feedback Rafał Miłecki
  2018-12-16  0:05 ` Borislav Petkov
@ 2018-12-16  0:08 ` Borislav Petkov
  1 sibling, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-12-16  0:08 UTC
  To: Rafał Miłecki
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, Dec 16, 2018 at 12:46:05AM +0100, Rafał Miłecki wrote:
> I'm trying to reload the AMD Ryzen Mobile (fam17h) microcode by running:
> echo 1 > /sys/devices/system/cpu/microcode/reload

Also, I'd advise against using the late loading method; put the
microcode in the initrd instead (which your distro should probably be
doing already):

Documentation/x86/microcode.txt

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16  0:05 ` Borislav Petkov
@ 2018-12-16  8:08   ` Rafał Miłecki
  2018-12-16 10:06     ` Borislav Petkov
  0 siblings, 1 reply; 9+ messages in thread
From: Rafał Miłecki @ 2018-12-16  8:08 UTC
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Linux Kernel Mailing List

On 16.12.2018 01:05, Borislav Petkov wrote:
> On Sun, Dec 16, 2018 at 12:46:05AM +0100, Rafał Miłecki wrote:
>> [19.736770] microcode: [find_equiv_id] sig:8458000
> 
> That's your CPU's family/model/stepping: 0x0810f10
> 
>> [19.736772] microcode: [find_equiv_id] equiv_table->installed_cpu:8392466
>> [19.736775] microcode: [find_equiv_id] equiv_table->installed_cpu:8392578
> 
> and those are present on the system. Best to look at them in hex, btw:
> 
> 0x0800f12
> 0x0800f82
> 
> Which means, there's no microcode for your CPU so nothing gets updated.

Thanks! I had no idea microcode_amd_fam17h.bin is a container with a few
microcode patches. I thought there was a single microcode for a whole
family (e.g. 17h).

Using hex also makes more sense indeed!
[   44.127941] microcode: verify_and_add_patch: Added patch_id: 0x08001227, proc_id: 0x8012
[   44.127948] microcode: verify_and_add_patch: Added patch_id: 0x0800820b, proc_id: 0x8082
[   44.127952] microcode: [find_equiv_id] sig:0x810f10
[   44.127955] microcode: [find_equiv_id] equiv_table->installed_cpu:0x800f12
[   44.127958] microcode: [find_equiv_id] equiv_table->installed_cpu:0x800f82
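
Put together, the log describes a two-step lookup: CPU signature to
equivalence ID, then equivalence ID to patch. The sketch below is only
loosely modeled on the kernel's find_equiv_id()/find_patch(), which take
different arguments. The pairing of each installed_cpu with a proc_id is
an assumption read off the matching digits, not something the log states.

```python
# installed_cpu -> equivalence ID (pairing is an assumption, see above)
EQUIV_TABLE = {
    0x800f12: 0x8012,
    0x800f82: 0x8082,
}

# equivalence ID (proc_id) -> patch_id, as logged by verify_and_add_patch
PATCHES = {
    0x8012: 0x08001227,
    0x8082: 0x0800820b,
}


def find_equiv_id(sig):
    """Return the equivalence ID for a CPU signature, or 0 if none."""
    return EQUIV_TABLE.get(sig, 0)


def find_patch(sig):
    """Return the patch_id applicable to a CPU signature, or None."""
    equiv_id = find_equiv_id(sig)
    if equiv_id == 0:
        # No table entry for this CPU, so no patch can apply.
        return None
    return PATCHES.get(equiv_id)


print(find_patch(0x810f10))  # signature of this CPU
print(find_patch(0x800f12))  # a signature that is in the table
```

With sig 0x810f10 the first step already yields 0, so the second step
never finds a patch.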

So for now I'm stuck with the default/BIOS-uploaded microcode:
[    2.604680] microcode: CPU0: patch_level=0x0810100b
[    2.605617] microcode: CPU1: patch_level=0x0810100b
[    2.606583] microcode: CPU2: patch_level=0x0810100b
[    2.607528] microcode: CPU3: patch_level=0x0810100b
[    2.608408] microcode: CPU4: patch_level=0x0810100b
[    2.609285] microcode: CPU5: patch_level=0x0810100b
[    2.610270] microcode: CPU6: patch_level=0x0810100b
[    2.611135] microcode: CPU7: patch_level=0x0810100b

I have one more hacking idea. My notebook has a Ryzen 5 PRO 2500U CPU,
but I also have access to another one with a Ryzen 5 2500U running:
[    2.780949] microcode: CPU0: patch_level=0x08101007
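
The two machines' revisions can be compared straight from such log lines.
This is a minimal sketch; the helper name and the variable names are made
up, and it assumes that a numerically higher patch_level means a newer
revision.

```python
import re

PATCH_LEVEL_RE = re.compile(
    r"microcode: CPU(\d+): patch_level=(0x[0-9a-fA-F]+)"
)


def parse_patch_levels(log_text):
    """Map CPU number -> patch level from 'microcode: CPUn: patch_level=' lines."""
    return {int(cpu): int(level, 16)
            for cpu, level in PATCH_LEVEL_RE.findall(log_text)}


# Excerpts of the dmesg lines quoted above.
PRO_2500U_LOG = """\
[    2.604680] microcode: CPU0: patch_level=0x0810100b
[    2.605617] microcode: CPU1: patch_level=0x0810100b
"""
PLAIN_2500U_LOG = "[    2.780949] microcode: CPU0: patch_level=0x08101007"

pro = parse_patch_levels(PRO_2500U_LOG)
plain = parse_patch_levels(PLAIN_2500U_LOG)

# Every CPU in one machine should report the same level.
print(len(set(pro.values())) == 1)
# Compare the two machines (assumption: higher value means newer).
print(pro[0] > plain[0])
```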

For my hack tests I'd like to replace my 0x0810100b with a 0x08101007.
Is it possible to extract/dump the current microcode from the CPU and
package it as microcode_amd_fam17h.bin?

Are there any ready-made tools for that?


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16  8:08   ` Rafał Miłecki
@ 2018-12-16 10:06     ` Borislav Petkov
  2018-12-16 10:26       ` Rafał Miłecki
  0 siblings, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2018-12-16 10:06 UTC
  To: Rafał Miłecki
  Cc: Thomas Gleixner, Ingo Molnar, H. Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, Dec 16, 2018 at 09:08:00AM +0100, Rafał Miłecki wrote:
> Thanks! I had no idea microcode_amd_fam17h.bin is a container with a few
> microcode patches. I thought there was a single microcode for a whole
> family (e.g. 17h).

It is a container for all F17h - you're simply making the wrong
assumption that it would have microcode for *all* F17h CPUs out there.
Which is not the case, because, well, there's no microcode for all of
them ... yet.

:-)

IOW, there are only two patches released:

file offset: 60 (0x3c)
Patch 00: type 1, size: 3200 (0xc80)
   ID: 0x08001227, CPU rev ID: 0x00008012

file offset: 3268 (0xcc4)
Patch 01: type 1, size: 3200 (0xc80)
   ID: 0x0800820b, CPU rev ID: 0x00008082
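
The file offsets in that dump can be cross-checked with a little
arithmetic. The sketch below assumes the container layout as I understand
it from the kernel's AMD loader: a 12-byte container header, 16-byte
equivalence table entries and an 8-byte header in front of each patch.
The count of three table entries (two CPUs plus a terminating zero entry)
is also an assumption.

```python
CONTAINER_HDR_SIZE = 12   # magic, table type, table size (assumed layout)
EQUIV_ENTRY_SIZE = 16     # one equivalence table entry (assumed layout)
SECTION_HDR_SIZE = 8      # patch type + patch size (assumed layout)


def patch_section_offsets(num_equiv_entries, patch_sizes):
    """Return the file offset at which each patch section starts."""
    offset = CONTAINER_HDR_SIZE + num_equiv_entries * EQUIV_ENTRY_SIZE
    offsets = []
    for size in patch_sizes:
        offsets.append(offset)
        # Skip this section's header and its payload to reach the next one.
        offset += SECTION_HDR_SIZE + size
    return offsets


# Two patches of 3200 (0xc80) bytes each, as listed in the dump.
for off in patch_section_offsets(3, [3200, 3200]):
    print(f"file offset: {off} ({off:#x})")
```

Under those assumptions the computed offsets come out as 60 (0x3c) and
3268 (0xcc4), the same values as in the dump.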

> Using hex also makes more sense indeed!
> [   44.127941] microcode: verify_and_add_patch: Added patch_id: 0x08001227, proc_id: 0x8012
> [   44.127948] microcode: verify_and_add_patch: Added patch_id: 0x0800820b, proc_id: 0x8082

Yap.

> For my hack tests I'd like to replace my 0x0810100b with a 0x08101007.

Why would you even want to downgrade the microcode?!

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16 10:06     ` Borislav Petkov
@ 2018-12-16 10:26       ` Rafał Miłecki
  2018-12-16 10:44         ` Borislav Petkov
  0 siblings, 1 reply; 9+ messages in thread
From: Rafał Miłecki @ 2018-12-16 10:26 UTC
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, 16 Dec 2018 at 11:06, Borislav Petkov <bp@alien8.de> wrote:
> > For my hack tests I'd like to replace my 0x0810100b with a 0x08101007.
>
> Why would you even want to downgrade the microcode?!

Debugging CPU errors. I have two notebooks:

1) HP EliteBook 745 G5 with Ryzen 5 PRO 2500U
It runs the 1.03.01 BIOS with 0x0810100b microcode and suffers from
MCE-logged CPU errors.

2) MateBook D with Ryzen 5 2500U
It runs the 1.12 BIOS with 0x08101007 microcode and MCE doesn't report
any CPU errors.

I wanted to downgrade microcode on HP EliteBook and upgrade microcode
on MateBook to see if that makes a difference for them. For that I
need to:
1) Dump old microcode from MateBook & run it on EliteBook
2) Dump new microcode from EliteBook & run it on MateBook

-- 
Rafał


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16 10:26       ` Rafał Miłecki
@ 2018-12-16 10:44         ` Borislav Petkov
  2018-12-16 11:02           ` Rafał Miłecki
  0 siblings, 1 reply; 9+ messages in thread
From: Borislav Petkov @ 2018-12-16 10:44 UTC
  To: Rafał Miłecki
  Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, Dec 16, 2018 at 11:26:29AM +0100, Rafał Miłecki wrote:
> Debugging CPU errors.

I told you that this issue is being worked on and there will be a fix
of sorts at some point. Don't try any funky business of downgrading the
microcode and maybe break your boxes in the process. Just ignore the MCE
- it is harmless! - until there's a fix.

Ok?

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16 10:44         ` Borislav Petkov
@ 2018-12-16 11:02           ` Rafał Miłecki
  2018-12-16 11:12             ` Borislav Petkov
  0 siblings, 1 reply; 9+ messages in thread
From: Rafał Miłecki @ 2018-12-16 11:02 UTC
  To: Borislav Petkov
  Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, 16 Dec 2018 at 11:44, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Dec 16, 2018 at 11:26:29AM +0100, Rafał Miłecki wrote:
> > Debugging CPU errors.
>
> I told you that this issue is being worked on and there will be a fix
> of sorts at some point. Don't try any funky business of downgrading the
> microcode and maybe break your boxes in the process. Just ignore the MCE
> - it is harmless! - until there's a fix.
>
> Ok?

OK, if you say so, I'll try not to panic seeing those errors repeating
over and over.

I know such issues may take months or years to get fixed, so I was
trying to do some hacking on my own. I'll try some patience :)

-- 
Rafał


* Re: Problem with late AMD microcode reload/feedback
  2018-12-16 11:02           ` Rafał Miłecki
@ 2018-12-16 11:12             ` Borislav Petkov
  0 siblings, 0 replies; 9+ messages in thread
From: Borislav Petkov @ 2018-12-16 11:12 UTC
  To: Rafał Miłecki
  Cc: Thomas Gleixner, Ingo Molnar, H . Peter Anvin, x86,
	Linux Kernel Mailing List

On Sun, Dec 16, 2018 at 12:02:49PM +0100, Rafał Miłecki wrote:
> OK, if you say so, I'll try not to panic seeing those errors repeating
> over and over.

Yes, patience is the key :-)

> I know such issues may take months or years to get fixed, so I was
> trying to do some hacking on my own. I'll try some patience :)

Well, if you wanna hack on stuff, there's a lot more you can do which is
100% safe.

Like getting rid of -Wmissing-prototypes warnings, for example, so that
we can enable this option by default. If you're interested, see here:

https://marc.info/?l=kernel-janitors&m=154178546220848&w=2

It is not hardcore stuff but if you're looking for doing some work on
the kernel, that would be one thing to do.

And of course there's the ever-so-helpful testing of linux-next kernels
on old/spare hardware and trying to fix any issues there. Or reviewing
patches on lkml.

It all depends on what kind of "hacking" you wanna do.

:-)

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


