[PATCH v4 0/2] x86/ucode/AMD: load ucode on every logical thread

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH v4 0/2] x86/ucode/AMD: load ucode on every logical thread
@ 2023-02-15 15:38 Sergey Dyasli
  2023-02-15 15:38 ` [PATCH v4 1/2] x86/ucode/AMD: apply the patch early " Sergey Dyasli
  2023-02-15 15:38 ` [PATCH v4 2/2] x86/ucode/AMD: late load the patch " Sergey Dyasli
  0 siblings, 2 replies; 7+ messages in thread
From: Sergey Dyasli @ 2023-02-15 15:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli

I've added a second patch to cover late loading as that should also
happen on every cpu, according to AMD.

Sergey Dyasli (2):
  x86/ucode/AMD: apply the patch early on every logical thread
  x86/ucode/AMD: late load the patch on every logical thread

 xen/arch/x86/cpu/microcode/amd.c     | 11 +++--
 xen/arch/x86/cpu/microcode/core.c    | 61 +++++++++++++++++++---------
 xen/arch/x86/cpu/microcode/intel.c   | 10 +++--
 xen/arch/x86/cpu/microcode/private.h |  3 +-
 4 files changed, 59 insertions(+), 26 deletions(-)

-- 
2.31.1



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v4 1/2] x86/ucode/AMD: apply the patch early on every logical thread
  2023-02-15 15:38 [PATCH v4 0/2] x86/ucode/AMD: load ucode on every logical thread Sergey Dyasli
@ 2023-02-15 15:38 ` Sergey Dyasli
  2023-02-21 13:57   ` Jan Beulich
  2023-02-15 15:38 ` [PATCH v4 2/2] x86/ucode/AMD: late load the patch " Sergey Dyasli
  1 sibling, 1 reply; 7+ messages in thread
From: Sergey Dyasli @ 2023-02-15 15:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli

The original issue has been reported on AMD Bulldozer-based CPUs where
ucode loading loses the LWP feature bit in order to gain the IBPB bit.
LWP disabling is per-SMT/CMT core modification and needs to happen on
each sibling thread despite the shared microcode engine. Otherwise,
logical CPUs will end up with different cpuid capabilities.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211

Guests running under Xen happen to be not affected because of levelling
logic for the feature masking/override MSRs which causes the LWP bit to
fall out and hides the issue. The latest recommendation from AMD, after
discussing this bug, is to load ucode on every logical CPU.

In Linux kernel this issue has been addressed by e7ad18d1169c
("x86/microcode/AMD: Apply the patch early on every logical thread").
Follow the same approach in Xen.

Introduce SAME_UCODE match result and use it for early AMD ucode
loading. Take this opportunity and move opt_ucode_allow_same out of
compare_revisions() to the relevant callers and also modify the warning
message based on it. Intel's side of things is modified for consistency
but provides no functional change.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
---
v3 --> v4:
- Coding style fixes
- Removed goto
- Removed the paragraph about late loading in the commit message

v2 --> v3:
- Moved opt_ucode_allow_same out of compare_revisions() and updated
  the commit message
- Adjusted the warning message

v1 --> v2:
- Expanded the commit message with the levelling section
- Adjusted comment for OLD_UCODE
---
 xen/arch/x86/cpu/microcode/amd.c     | 11 ++++++++---
 xen/arch/x86/cpu/microcode/core.c    | 26 +++++++++++++++++---------
 xen/arch/x86/cpu/microcode/intel.c   | 10 +++++++---
 xen/arch/x86/cpu/microcode/private.h |  3 ++-
 4 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/cpu/microcode/amd.c b/xen/arch/x86/cpu/microcode/amd.c
index 4b097187a0..a9a5557835 100644
--- a/xen/arch/x86/cpu/microcode/amd.c
+++ b/xen/arch/x86/cpu/microcode/amd.c
@@ -176,8 +176,8 @@ static enum microcode_match_result compare_revisions(
     if ( new_rev > old_rev )
         return NEW_UCODE;
 
-    if ( opt_ucode_allow_same && new_rev == old_rev )
-        return NEW_UCODE;
+    if ( new_rev == old_rev )
+        return SAME_UCODE;
 
     return OLD_UCODE;
 }
@@ -220,8 +220,13 @@ static int cf_check apply_microcode(const struct microcode_patch *patch)
     unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     uint32_t rev, old_rev = sig->rev;
+    enum microcode_match_result result = microcode_fits(patch);
 
-    if ( microcode_fits(patch) != NEW_UCODE )
+    /*
+     * Allow application of the same revision to pick up SMT-specific changes
+     * even if the revision of the other SMT thread is already up-to-date.
+     */
+    if ( result != NEW_UCODE && result != SAME_UCODE )
         return -EINVAL;
 
     if ( check_final_patch_levels(sig) )
diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c
index d14754e222..ba6e7b42c6 100644
--- a/xen/arch/x86/cpu/microcode/core.c
+++ b/xen/arch/x86/cpu/microcode/core.c
@@ -612,17 +612,25 @@ static long cf_check microcode_update_helper(void *data)
      * that ucode revision.
      */
     spin_lock(&microcode_mutex);
-    if ( microcode_cache &&
-         alternative_call(ucode_ops.compare_patch,
-                          patch, microcode_cache) != NEW_UCODE )
+    if ( microcode_cache )
     {
-        spin_unlock(&microcode_mutex);
-        printk(XENLOG_WARNING "microcode: couldn't find any newer revision "
-                              "in the provided blob!\n");
-        microcode_free_patch(patch);
-        ret = -ENOENT;
+        enum microcode_match_result result;
 
-        goto put;
+        result = alternative_call(ucode_ops.compare_patch, patch,
+                                  microcode_cache);
+
+        if ( result != NEW_UCODE &&
+             !(opt_ucode_allow_same && result == SAME_UCODE) )
+        {
+            spin_unlock(&microcode_mutex);
+            printk(XENLOG_WARNING
+                   "microcode: couldn't find any newer%s revision in the provided blob!\n",
+                   opt_ucode_allow_same ? " (or the same)" : "");
+            microcode_free_patch(patch);
+            ret = -ENOENT;
+
+            goto put;
+        }
     }
     spin_unlock(&microcode_mutex);
 
diff --git a/xen/arch/x86/cpu/microcode/intel.c b/xen/arch/x86/cpu/microcode/intel.c
index f7fec4b4ed..8d4d6574aa 100644
--- a/xen/arch/x86/cpu/microcode/intel.c
+++ b/xen/arch/x86/cpu/microcode/intel.c
@@ -232,8 +232,8 @@ static enum microcode_match_result compare_revisions(
     if ( new_rev > old_rev )
         return NEW_UCODE;
 
-    if ( opt_ucode_allow_same && new_rev == old_rev )
-        return NEW_UCODE;
+    if ( new_rev == old_rev )
+        return SAME_UCODE;
 
     /*
      * Treat pre-production as always applicable - anyone using pre-production
@@ -290,8 +290,12 @@ static int cf_check apply_microcode(const struct microcode_patch *patch)
     unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &this_cpu(cpu_sig);
     uint32_t rev, old_rev = sig->rev;
+    enum microcode_match_result result;
+
+    result = microcode_update_match(patch);
 
-    if ( microcode_update_match(patch) != NEW_UCODE )
+    if ( result != NEW_UCODE &&
+         !(opt_ucode_allow_same && result == SAME_UCODE) )
         return -EINVAL;
 
     wbinvd();
diff --git a/xen/arch/x86/cpu/microcode/private.h b/xen/arch/x86/cpu/microcode/private.h
index 73b095d5bf..626aeb4d08 100644
--- a/xen/arch/x86/cpu/microcode/private.h
+++ b/xen/arch/x86/cpu/microcode/private.h
@@ -6,7 +6,8 @@
 extern bool opt_ucode_allow_same;
 
 enum microcode_match_result {
-    OLD_UCODE, /* signature matched, but revision id is older or equal */
+    OLD_UCODE, /* signature matched, but revision id is older */
+    SAME_UCODE, /* signature matched, but revision id is the same */
     NEW_UCODE, /* signature matched, but revision id is newer */
     MIS_UCODE, /* signature mismatched */
 };
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v4 2/2] x86/ucode/AMD: late load the patch on every logical thread
  2023-02-15 15:38 [PATCH v4 0/2] x86/ucode/AMD: load ucode on every logical thread Sergey Dyasli
  2023-02-15 15:38 ` [PATCH v4 1/2] x86/ucode/AMD: apply the patch early " Sergey Dyasli
@ 2023-02-15 15:38 ` Sergey Dyasli
  2023-02-21 14:03   ` Jan Beulich
  1 sibling, 1 reply; 7+ messages in thread
From: Sergey Dyasli @ 2023-02-15 15:38 UTC (permalink / raw)
  To: xen-devel
  Cc: Jan Beulich, Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli

Currently late ucode loading is performed only on the first core of CPU
siblings.  But according to the latest recommendation from AMD, late
ucode loading should happen on every logical thread/core.

To achieve that, consider every logical cpu as "primary" when running on
AMD cpus, i.e. skip cpu_sibling_mask checks.

Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
---
v4:
- new patch
---
 xen/arch/x86/cpu/microcode/core.c | 35 ++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/cpu/microcode/core.c b/xen/arch/x86/cpu/microcode/core.c
index ba6e7b42c6..f720030761 100644
--- a/xen/arch/x86/cpu/microcode/core.c
+++ b/xen/arch/x86/cpu/microcode/core.c
@@ -398,10 +398,16 @@ static int cf_check microcode_nmi_callback(
          (!ucode_in_nmi && cpu == primary) )
         return 0;
 
-    if ( cpu == primary )
+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+        /* load ucode on every logical thread/core */
         ret = primary_thread_work(nmi_patch);
     else
-        ret = secondary_nmi_work();
+    {
+        if ( cpu == primary )
+            ret = primary_thread_work(nmi_patch);
+        else
+            ret = secondary_nmi_work();
+    }
     this_cpu(loading_err) = ret;
 
     return 0;
@@ -540,7 +546,6 @@ static int control_thread_fn(const struct microcode_patch *patch)
 static int cf_check do_microcode_update(void *patch)
 {
     unsigned int cpu = smp_processor_id();
-    int ret;
 
     /*
      * The control thread set state to coordinate ucode loading. Primary
@@ -548,13 +553,18 @@ static int cf_check do_microcode_update(void *patch)
      * the completion of the ucode loading process.
      */
     if ( cpu == cpumask_first(&cpu_online_map) )
-        ret = control_thread_fn(patch);
-    else if ( cpu == cpumask_first(this_cpu(cpu_sibling_mask)) )
-        ret = primary_thread_fn(patch);
-    else
-        ret = secondary_thread_fn();
+        return control_thread_fn(patch);
 
-    return ret;
+    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+        /* load ucode on every logical thread/core */
+        return primary_thread_fn(patch);
+    else
+    {
+        if ( cpu == cpumask_first(this_cpu(cpu_sibling_mask)) )
+            return primary_thread_fn(patch);
+        else
+            return secondary_thread_fn();
+    }
 }
 
 struct ucode_buf {
@@ -642,8 +652,13 @@ static long cf_check microcode_update_helper(void *data)
     /* Calculate the number of online CPU core */
     nr_cores = 0;
     for_each_online_cpu(cpu)
-        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+    {
+        if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
+            /* load ucode on every logical thread/core */
+            nr_cores++;
+        else if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
             nr_cores++;
+    }
 
     printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
 
-- 
2.31.1



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 1/2] x86/ucode/AMD: apply the patch early on every logical thread
  2023-02-15 15:38 ` [PATCH v4 1/2] x86/ucode/AMD: apply the patch early " Sergey Dyasli
@ 2023-02-21 13:57   ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2023-02-21 13:57 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli, xen-devel

On 15.02.2023 16:38, Sergey Dyasli wrote:
> The original issue has been reported on AMD Bulldozer-based CPUs where
> ucode loading loses the LWP feature bit in order to gain the IBPB bit.
> LWP disabling is per-SMT/CMT core modification and needs to happen on
> each sibling thread despite the shared microcode engine. Otherwise,
> logical CPUs will end up with different cpuid capabilities.
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=216211
> 
> Guests running under Xen happen to be not affected because of levelling
> logic for the feature masking/override MSRs which causes the LWP bit to
> fall out and hides the issue. The latest recommendation from AMD, after
> discussing this bug, is to load ucode on every logical CPU.
> 
> In Linux kernel this issue has been addressed by e7ad18d1169c
> ("x86/microcode/AMD: Apply the patch early on every logical thread").
> Follow the same approach in Xen.
> 
> Introduce SAME_UCODE match result and use it for early AMD ucode
> loading. Take this opportunity and move opt_ucode_allow_same out of
> compare_revisions() to the relevant callers and also modify the warning
> message based on it. Intel's side of things is modified for consistency
> but provides no functional change.
> 
> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/2] x86/ucode/AMD: late load the patch on every logical thread
  2023-02-15 15:38 ` [PATCH v4 2/2] x86/ucode/AMD: late load the patch " Sergey Dyasli
@ 2023-02-21 14:03   ` Jan Beulich
  2023-02-21 21:26     ` Sergey Dyasli
  0 siblings, 1 reply; 7+ messages in thread
From: Jan Beulich @ 2023-02-21 14:03 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli, xen-devel

On 15.02.2023 16:38, Sergey Dyasli wrote:
> --- a/xen/arch/x86/cpu/microcode/core.c
> +++ b/xen/arch/x86/cpu/microcode/core.c
> @@ -398,10 +398,16 @@ static int cf_check microcode_nmi_callback(
>           (!ucode_in_nmi && cpu == primary) )
>          return 0;
>  
> -    if ( cpu == primary )
> +    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )

Given their origin, I'm pretty certain Hygon wants treating the same here
and below.

> +        /* load ucode on every logical thread/core */
>          ret = primary_thread_work(nmi_patch);
>      else
> -        ret = secondary_nmi_work();
> +    {
> +        if ( cpu == primary )
> +            ret = primary_thread_work(nmi_patch);
> +        else
> +            ret = secondary_nmi_work();
> +    }

May I ask to get away with less code churn and less overall indentation?
Already

    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
        /* load ucode on every logical thread/core */
        ret = primary_thread_work(nmi_patch);
    else if ( cpu == primary )
        ret = primary_thread_work(nmi_patch);
    else
        ret = secondary_nmi_work();

would be shorter, but I don't see anything wrong with simply going with

    if ( cpu == primary ||
        /* Load ucode on every logical thread/core: */
        (boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
        ret = primary_thread_work(nmi_patch);
    else
        ret = secondary_nmi_work();

Same again further down, I believe.

Jan


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/2] x86/ucode/AMD: late load the patch on every logical thread
  2023-02-21 14:03   ` Jan Beulich
@ 2023-02-21 21:26     ` Sergey Dyasli
  2023-02-22 10:41       ` Jan Beulich
  0 siblings, 1 reply; 7+ messages in thread
From: Sergey Dyasli @ 2023-02-21 21:26 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli, xen-devel

On Tue, Feb 21, 2023 at 2:03 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.02.2023 16:38, Sergey Dyasli wrote:
> > --- a/xen/arch/x86/cpu/microcode/core.c
> > +++ b/xen/arch/x86/cpu/microcode/core.c
> > @@ -398,10 +398,16 @@ static int cf_check microcode_nmi_callback(
> >           (!ucode_in_nmi && cpu == primary) )
> >          return 0;
> >
> > -    if ( cpu == primary )
> > +    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
>
> Given their origin, I'm pretty certain Hygon wants treating the same here
> and below.

Hygon? ucode_ops is currently initialised only for Amd and Intel.
Speaking of which, I'm thinking about adding a new function
is_cpu_primary() there. This would make the core code much cleaner.
I'll see if I can make it work.

Thanks,
Sergey


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v4 2/2] x86/ucode/AMD: late load the patch on every logical thread
  2023-02-21 21:26     ` Sergey Dyasli
@ 2023-02-22 10:41       ` Jan Beulich
  0 siblings, 0 replies; 7+ messages in thread
From: Jan Beulich @ 2023-02-22 10:41 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Sergey Dyasli, xen-devel

On 21.02.2023 22:26, Sergey Dyasli wrote:
> On Tue, Feb 21, 2023 at 2:03 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 15.02.2023 16:38, Sergey Dyasli wrote:
>>> --- a/xen/arch/x86/cpu/microcode/core.c
>>> +++ b/xen/arch/x86/cpu/microcode/core.c
>>> @@ -398,10 +398,16 @@ static int cf_check microcode_nmi_callback(
>>>           (!ucode_in_nmi && cpu == primary) )
>>>          return 0;
>>>
>>> -    if ( cpu == primary )
>>> +    if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
>>
>> Given their origin, I'm pretty certain Hygon wants treating the same here
>> and below.
> 
> Hygon? ucode_ops is currently initialised only for Amd and Intel.

Hmm, you're right, we still haven't sorted that aspect. I'm inclined
to say though that adding Hygon in your changes right away reduces the
burden later on. And it'll do no harm as long as early_microcode_init()
isn't properly dealing with Hygon.

> Speaking of which, I'm thinking about adding a new function
> is_cpu_primary() there. This would make the core code much cleaner.
> I'll see if I can make it work.

Thanks - I was actually meaning to suggest something like that,
realizing the potential improvement only after sending the earlier
reply. Even just a static helper (without new hook) may already
improve things.

Jan


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2023-02-22 10:41 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-15 15:38 [PATCH v4 0/2] x86/ucode/AMD: load ucode on every logical thread Sergey Dyasli
2023-02-15 15:38 ` [PATCH v4 1/2] x86/ucode/AMD: apply the patch early " Sergey Dyasli
2023-02-21 13:57   ` Jan Beulich
2023-02-15 15:38 ` [PATCH v4 2/2] x86/ucode/AMD: late load the patch " Sergey Dyasli
2023-02-21 14:03   ` Jan Beulich
2023-02-21 21:26     ` Sergey Dyasli
2023-02-22 10:41       ` Jan Beulich

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.