linux-edac.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-07-31 11:27 Prarit Bhargava
  0 siblings, 0 replies; 6+ messages in thread
From: Prarit Bhargava @ 2018-07-31 11:27 UTC (permalink / raw)
  To: bp; +Cc: linux-edac, linux-kernel, sironi, tony.luck, Prarit Bhargava, stable

I tested this on AMD Ryzen & Intel Broadwell system and dumped the
boot_cpu_data before and after a microcode update.  On the Intel
system I also did a fatal MCE using mce-inject to confirm the output
from the mce handling code.

P.

---8<---

On systems where a runtime microcode update has occurred the microcode
version output in a MCE log record is wrong because
boot_cpu_data.microcode is not updated during runtime.

Update boot_cpu_data.microcode when the BSP's microcode is updated.

Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
Suggested-by: Borislav Petkov <bp@alien8.com>
Signed-off-by: Prarit Bhargava <prarit@redhat.com>
Cc: stable@vger.kernel.org
Cc: sironi@amazon.de
Cc: tony.luck@intel.com
---
Changes in v2: Use mc_amd->hdr.patch_id on AMD

 arch/x86/kernel/cpu/microcode/amd.c   | 4 ++++
 arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
index 0624957aa068..63b072377ba4 100644
--- a/arch/x86/kernel/cpu/microcode/amd.c
+++ b/arch/x86/kernel/cpu/microcode/amd.c
@@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
 	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
 	c->microcode = mc_amd->hdr.patch_id;
 
+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode =  mc_amd->hdr.patch_id;
+
 	return UCODE_UPDATED;
 }
 
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..256d336cbc04 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
 	uci->cpu_sig.rev = rev;
 	c->microcode = rev;
 
+	/* Update boot_cpu_data's revision too, if we're on the BSP: */
+	if (c->cpu_index == boot_cpu_data.cpu_index)
+		boot_cpu_data.microcode = rev;
+
 	return UCODE_UPDATED;
 }
 

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-08-01 15:55 Prarit Bhargava
  0 siblings, 0 replies; 6+ messages in thread
From: Prarit Bhargava @ 2018-08-01 15:55 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, sironi, tony.luck, Oleksandr Natalenko

On 08/01/2018 11:43 AM, Borislav Petkov wrote:
> (dropping stable@)
> 
> On Tue, Jul 31, 2018 at 07:27:39AM -0400, Prarit Bhargava wrote:
>> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
>> boot_cpu_data before and after a microcode update.  On the Intel
>> system I also did a fatal MCE using mce-inject to confirm the output
>> from the mce handling code.
> 
> Ok, sending the two patches I've massaged into submission as a reply to
> this message - please run them on your boxes. You can also inject MCEs
> on the AMD box to verify - simply enable CONFIG_X86_MCE_INJECT, go to
> debugfs and do, for example:
> 
> cd /sys/kernel/debug/mce-inject/
> echo 10 > /sys/devices/system/machinecheck/machinecheck0/check_interval
> echo 0x9c7d410092080813 > status; echo 0x000000006d3d483b > addr; echo 2 > cpu; echo 4 > bank; echo hw > flags
> echo 0xbc00200000010015 > status; echo 0x00000000942e4d98 > addr; echo 6 > cpu; echo 2 > bank; echo hw > flags
> echo 0xfe000010000b0c0f > status; echo 0xe1101add1e550012 > addr; echo 0xcafebeef > misc; echo 2 > cpu; echo hw > flags; echo 4 > bank
> 

Borislav, I'm not sure if you saw his comments but Oleksandr, cc'd, has an
objection to overwriting the stored microcode version in boot_cpu_data.  He has
found that information useful in debugging microcode issues.

P.

> Thanks!
>
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-08-01 15:43 Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2018-08-01 15:43 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: linux-edac, linux-kernel, sironi, tony.luck

(dropping stable@)

On Tue, Jul 31, 2018 at 07:27:39AM -0400, Prarit Bhargava wrote:
> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
> boot_cpu_data before and after a microcode update.  On the Intel
> system I also did a fatal MCE using mce-inject to confirm the output
> from the mce handling code.

Ok, sending the two patches I've massaged into submission as a reply to
this message - please run them on your boxes. You can also inject MCEs
on the AMD box to verify - simply enable CONFIG_X86_MCE_INJECT, go to
debugfs and do, for example:

cd /sys/kernel/debug/mce-inject/
echo 10 > /sys/devices/system/machinecheck/machinecheck0/check_interval
echo 0x9c7d410092080813 > status; echo 0x000000006d3d483b > addr; echo 2 > cpu; echo 4 > bank; echo hw > flags
echo 0xbc00200000010015 > status; echo 0x00000000942e4d98 > addr; echo 6 > cpu; echo 2 > bank; echo hw > flags
echo 0xfe000010000b0c0f > status; echo 0xe1101add1e550012 > addr; echo 0xcafebeef > misc; echo 2 > cpu; echo hw > flags; echo 4 > bank

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-07-31 15:31 Filippo Sironi
  0 siblings, 0 replies; 6+ messages in thread
From: Filippo Sironi @ 2018-07-31 15:31 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Prarit Bhargava, linux-edac, linux-kernel, tony.luck, stable

> On 31. Jul 2018, at 15:00, Borislav Petkov <bp@alien8.de> wrote:
> 
> On Tue, Jul 31, 2018 at 11:46:09AM +0000, Sironi, Filippo wrote:
>> There may be a chance of skipping this code, I think.
>> 
>> If the microcode is loaded on the hyperthread sibling of the boot cpu
>> before being loaded on the boot cpu, the boot cpu will exit earlier
>> from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.
>> 
>> (This seems to be possible in apply_microcode_amd() as well.)
>> 
>> In my tree with the aforementioned change - Intel only - I also had
>> the following patch:
> 
> Yap, good catch.
> 
> -- 
> Regards/Gruss,
>    Boris.
> 
> ECO tip #101: Trim your mails when you reply.
> --
> 

Boris, Prarit,

I've sent the patch the I had in my tree and extended it to cover
apply_microcode_amd(). I don't have an AMD host available for testing,
can one of you give that a spin?

Thanks,
Filippo


Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-07-31 13:00 Borislav Petkov
  0 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2018-07-31 13:00 UTC (permalink / raw)
  To: Sironi, Filippo
  Cc: Prarit Bhargava, linux-edac, linux-kernel, tony.luck, stable

On Tue, Jul 31, 2018 at 11:46:09AM +0000, Sironi, Filippo wrote:
> There may be a chance of skipping this code, I think.
> 
> If the microcode is loaded on the hyperthread sibling of the boot cpu
> before being loaded on the boot cpu, the boot cpu will exit earlier
> from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.
> 
> (This seems to be possible in apply_microcode_amd() as well.)
> 
> In my tree with the aforementioned change - Intel only - I also had
> the following patch:

Yap, good catch.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [v2] arch/x86: Fix boot_cpu_data.microcode version output
@ 2018-07-31 11:46 Filippo Sironi
  0 siblings, 0 replies; 6+ messages in thread
From: Filippo Sironi @ 2018-07-31 11:46 UTC (permalink / raw)
  To: Prarit Bhargava; +Cc: bp, linux-edac, linux-kernel, tony.luck, stable

> On 31. Jul 2018, at 13:27, Prarit Bhargava <prarit@redhat.com> wrote:
> 
> I tested this on AMD Ryzen & Intel Broadwell system and dumped the
> boot_cpu_data before and after a microcode update.  On the Intel
> system I also did a fatal MCE using mce-inject to confirm the output
> from the mce handling code.
> 
> P.
> 
> ---8<---
> 
> On systems where a runtime microcode update has occurred the microcode
> version output in a MCE log record is wrong because
> boot_cpu_data.microcode is not updated during runtime.
> 
> Update boot_cpu_data.microcode when the BSP's microcode is updated.
> 
> Fixes: fa94d0c6e0f3 ("x86/MCE: Save microcode revision in machine check records")
> Suggested-by: Borislav Petkov <bp@alien8.com>
> Signed-off-by: Prarit Bhargava <prarit@redhat.com>
> Cc: stable@vger.kernel.org
> Cc: sironi@amazon.de
> Cc: tony.luck@intel.com
> ---
> Changes in v2: Use mc_amd->hdr.patch_id on AMD
> 
> arch/x86/kernel/cpu/microcode/amd.c   | 4 ++++
> arch/x86/kernel/cpu/microcode/intel.c | 4 ++++
> 2 files changed, 8 insertions(+)
> 
> diff --git a/arch/x86/kernel/cpu/microcode/amd.c b/arch/x86/kernel/cpu/microcode/amd.c
> index 0624957aa068..63b072377ba4 100644
> --- a/arch/x86/kernel/cpu/microcode/amd.c
> +++ b/arch/x86/kernel/cpu/microcode/amd.c
> @@ -537,6 +537,10 @@ static enum ucode_state apply_microcode_amd(int cpu)
> 	uci->cpu_sig.rev = mc_amd->hdr.patch_id;
> 	c->microcode = mc_amd->hdr.patch_id;
> 
> +	/* Update boot_cpu_data's revision too, if we're on the BSP: */
> +	if (c->cpu_index == boot_cpu_data.cpu_index)
> +		boot_cpu_data.microcode =  mc_amd->hdr.patch_id;
> +
> 	return UCODE_UPDATED;
> }
> 
> diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
> index 97ccf4c3b45b..256d336cbc04 100644
> --- a/arch/x86/kernel/cpu/microcode/intel.c
> +++ b/arch/x86/kernel/cpu/microcode/intel.c
> @@ -851,6 +851,10 @@ static enum ucode_state apply_microcode_intel(int cpu)
> 	uci->cpu_sig.rev = rev;
> 	c->microcode = rev;
> 
> +	/* Update boot_cpu_data's revision too, if we're on the BSP: */
> +	if (c->cpu_index == boot_cpu_data.cpu_index)
> +		boot_cpu_data.microcode = rev;
> +
> 	return UCODE_UPDATED;
> }

There may be a chance of skipping this code, I think.

If the microcode is loaded on the hyperthread sibling of the boot cpu
before being loaded on the boot cpu, the boot cpu will exit earlier
from apply_microcode_intel() - in if (rev >= mc->hdr.rev) { ... }.

(This seems to be possible in apply_microcode_amd() as well.)

In my tree with the aforementioned change - Intel only - I also had
the following patch:


which prevents the issue.

> -- 
> 2.17.0
> 
> 

Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
---
To unsubscribe from this list: send the line "unsubscribe linux-edac" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 97ccf4c3b45b..4bc869e829eb 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -797,6 +797,7 @@ static enum ucode_state apply_microcode_intel(int cpu)
        struct microcode_intel *mc;
        static int prev_rev;
        u32 rev;
+       enum ucode_state ret;
 
        /* We should bind the task to the CPU */
        if (WARN_ON(raw_smp_processor_id() != cpu))
@@ -817,9 +818,8 @@ static enum ucode_state apply_microcode_intel(int cpu)
         */
        rev = intel_get_microcode_revision();
        if (rev >= mc->hdr.rev) {
-               uci->cpu_sig.rev = rev;
-               c->microcode = rev;
-               return UCODE_OK;
+               ret = UCODE_OK;
+               goto out;
        }
 
        /*
@@ -848,10 +848,12 @@ static enum ucode_state apply_microcode_intel(int cpu)
                prev_rev = rev;
        }
 
+       ret = UCODE_UPDATED;
+out:
        uci->cpu_sig.rev = rev;
        c->microcode = rev;
 
-       return UCODE_UPDATED;
+       return ret;
 }
 
 static enum ucode_state generic_load_microcode(int cpu, void *data, size_t size,

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2018-08-01 15:55 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-31 11:27 [v2] arch/x86: Fix boot_cpu_data.microcode version output Prarit Bhargava
2018-07-31 11:46 Filippo Sironi
2018-07-31 13:00 Borislav Petkov
2018-07-31 15:31 Filippo Sironi
2018-08-01 15:43 Borislav Petkov
2018-08-01 15:55 Prarit Bhargava

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).