* [Xen-devel] [PATCH v9 00/15] improve late microcode loading
@ 2019-08-19  1:25 Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
                   ` (15 more replies)
  0 siblings, 16 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Boris Ostrovsky, Chao Gao, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

Changes in version 9:
 - add Jan's Reviewed-by
 - rendezvous threads in the NMI handler so that NMI handling is blocked
 during loading. Note that NMIs can still be serviced as usual on the
 threads chosen to initiate ucode loading on each core.
 - avoid unnecessary memory allocation or copy when creating a microcode
 patch (patch 12)
 - rework patch 1 to avoid microcode_update_match() being used to
 compare two arbitrary updates.
 - call .end_update in early loading path.

Sergey, could you help to test this series on an AMD machine?
Regarding the changes to the AMD side, I didn't do any testing due to
lack of hardware. At least two basic tests are needed:
* do a microcode update after system bootup
* don't bring all pCPUs up at bootup (by specifying the maxcpus option
  on the xen command line), then do a microcode update and online all
  offlined CPUs via 'xen-hptool' (a rough example follows).
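
A rough example of the second test (the xen-ucode tool name is taken
from the v6 changelog below; command names, paths and CPU ids are
assumptions to be adjusted for the actual machine):

    # boot Xen with e.g. maxcpus=4, then from dom0:
    xen-ucode /path/to/amd-ucode-blob.bin   # late microcode update
    xen-hptool cpu-online 4                 # online one of the parked CPUs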

The intention of this series is to make late microcode loading more
reliable by rendezvousing all cpus in stop_machine context.
This idea comes from Ashok. I am porting his Linux patch to Xen
(see patch 13 for more details).

This series includes below changes:
 1. Patch 1-12: introduce a global microcode cache and some cleanup
 2. Patch 13: synchronize late microcode loading
 3. Patch 14: support parallel microcodes update on different cores
 4. Patch 15: block #NMI handling during microcode loading

Currently, late microcode loading does a lot of things, including
parsing the microcode blob, checking the signature/revision and
performing the update. Putting all of them into stop_machine context is
a bad idea because of the complexity (one issue I observed is that a
memory allocation triggered an assertion in stop_machine context). To
simplify the load process, parsing the microcode is moved out of it;
the remaining parts of the load process run in stop_machine context.
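
As a rough sketch of the intended split (illustrative only: parse_blob()
and do_microcode_update() are placeholder names, not necessarily those
used in patch 13):

    /* Normal context: may allocate memory, parse and sanity-check the blob. */
    patch = parse_blob(buf, len);
    if ( IS_ERR(patch) )
        return PTR_ERR(patch);

    /*
     * stop_machine context: all CPUs are rendezvoused; one thread per core
     * performs the actual write while the others wait.  No allocation here.
     */
    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);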

Previous change log:
Changes in version 8:
 - block #NMI handling during microcode loading (Patch 16)
 - Don't assume that all CPUs in the system have loaded the same ucode.
 So when parsing a blob, we attempt to save a patch as long as it
 matches the current cpu signature, regardless of the patch's revision.
 Likewise for loading, we only require that the patch to be loaded isn't
 older than the cached one.
 - store an update after the first successful loading on a CPU
 - remove the patch that calls wbinvd() unconditionally before microcode
 loading. It is under internal discussion.
 - divide two big patches into several patches to improve readability.

Changes in version 7:
 - cache one microcode update rather than a list of them. Assuming that all CPUs
 (including those plugged in later) in the system have the same signature, an
 update that matches one CPU should match the others. Thus, one update is
 enough for microcode updating during CPU hot-plug and resuming.
 - To handle load failure, a microcode update is cached after it has been
 applied, to avoid a broken update overriding a validated one. Unvalidated
 microcode updates are passed via arguments rather than another global variable,
 which is where this series slightly differs from Roger's suggestion in:
 https://lists.xen.org/archives/html/xen-devel/2019-03/msg00776.html
 - incorporate Sergey's patch (patch 10) to fix a bug: we maintain a variable
 to reflect the current microcode revision, but in some cases this variable
 isn't initialized during system boot, which results in falsely reporting that
 the processor is susceptible to some known vulnerabilities.
 - fix issues reported by Sergey:
 https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00901.html
 - Responses to Sergey/Roger/Wei/Ashok's other comments.

Major changes in version 6:
 - run wbinvd before updating microcode (patch 10)
 - add a userspace tool for late microcode update (patch 1)
 - scale the time to wait by the number of remaining CPUs to respond
 - remove 'cpu' parameters from some related callbacks and functions
 - save a ucode patch only if its supported CPU is allowed to mix with
   the current cpu.

Changes in version 5:
 - support parallel microcode updates for all cores (see patch 8)
 - Address Roger's comments on the last version.


Chao Gao (15):
  microcode/intel: extend microcode_update_match()
  microcode/amd: fix memory leak
  microcode/amd: distinguish old and mismatched ucode in
    microcode_fits()
  microcode: introduce a global cache of ucode patch
  microcode: clean up microcode_resume_cpu
  microcode: remove struct ucode_cpu_info
  microcode: remove pointless 'cpu' parameter
  microcode/amd: call svm_host_osvw_init() in common code
  microcode: pass a patch pointer to apply_microcode()
  microcode: split out apply_microcode() from cpu_request_microcode()
  microcode: unify loading update during CPU resuming and AP wakeup
  microcode: reduce memory allocation and copy when creating a patch
  x86/microcode: Synchronize late microcode loading
  microcode: remove microcode_update_lock
  microcode: block #NMI handling when loading an ucode

 xen/arch/x86/acpi/power.c       |   2 +-
 xen/arch/x86/apic.c             |   2 +-
 xen/arch/x86/microcode.c        | 518 ++++++++++++++++++++++++++++++----------
 xen/arch/x86/microcode_amd.c    | 267 +++++++++------------
 xen/arch/x86/microcode_intel.c  | 202 ++++++++--------
 xen/arch/x86/smpboot.c          |   5 +-
 xen/arch/x86/spec_ctrl.c        |   2 +-
 xen/include/asm-x86/microcode.h |  43 ++--
 xen/include/asm-x86/processor.h |   4 +-
 9 files changed, 642 insertions(+), 403 deletions(-)

-- 
1.8.3.1



* [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-28 15:12   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 02/15] microcode/amd: fix memory leak Chao Gao
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

to a more generic function so that it can be used on its own to check
an update against the CPU signature and the current update revision.

Note that since enum microcode_match_result will be used in common code
(i.e. microcode.c), it has been placed in the common header.
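
With this change a caller can, for instance, do the check on its own
(illustrative sketch only, not code from this patch):

    /* Does the blob at 'mc' strictly supersede what this CPU is running? */
    if ( microcode_update_match(mc, smp_processor_id()) == NEW_UCODE )
        /* ... worth caching/applying ... */;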

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - microcode_update_match() doesn't accept (sig, pf, rev) any longer.
 Hence, it won't be used to compare two arbitrary updates.
 - rewrite patch description

Changes in v8:
 - make sure there is enough room for an extended header and signature array

Changes in v6:
 - eliminate unnecessary type casting in microcode_update_match
 - check if a patch has an extend header

Changes in v5:
 - constify the extended_signature
 - use named enum type for the return value of microcode_update_match
---
 xen/arch/x86/microcode_intel.c  | 60 ++++++++++++++++++++++-------------------
 xen/include/asm-x86/microcode.h |  6 +++++
 2 files changed, 39 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 22fdeca..c185b5c 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
     return 0;
 }
 
-static inline int microcode_update_match(
-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
-    int sig, int pf)
+/* Check an update against the CPU signature and current update revision */
+static enum microcode_match_result microcode_update_match(
+    const struct microcode_header_intel *mc_header, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
-
-    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
-            (mc_header->rev > uci->cpu_sig.rev));
+    const struct extended_sigtable *ext_header;
+    const struct extended_signature *ext_sig;
+    unsigned int i;
+    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    unsigned int sig = uci->cpu_sig.sig;
+    unsigned int pf = uci->cpu_sig.pf;
+    unsigned int rev = uci->cpu_sig.rev;
+    unsigned long data_size = get_datasize(mc_header);
+    const void *end = (const void *)mc_header + get_totalsize(mc_header);
+
+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    ext_header = (const void *)(mc_header + 1) + data_size;
+    ext_sig = (const void *)(ext_header + 1);
+
+    /*
+     * Make sure there is enough space to hold an extended header and enough
+     * array elements.
+     */
+    if ( (end < (const void *)ext_sig) ||
+         (end < (const void *)(ext_sig + ext_header->count)) )
+        return MIS_UCODE;
+
+    for ( i = 0; i < ext_header->count; i++ )
+        if ( sigmatch(sig, ext_sig[i].sig, pf, ext_sig[i].pf) )
+            return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
 }
 
 static int microcode_sanity_check(void *mc)
@@ -243,31 +268,12 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
-    const struct extended_sigtable *ext_header;
     unsigned long total_size = get_totalsize(mc_header);
-    int ext_sigcount, i;
-    struct extended_signature *ext_sig;
     void *new_mc;
 
-    if ( microcode_update_match(cpu, mc_header,
-                                mc_header->sig, mc_header->pf) )
-        goto find;
-
-    if ( total_size <= (get_datasize(mc_header) + MC_HEADER_SIZE) )
+    if ( microcode_update_match(mc, cpu) != NEW_UCODE )
         return 0;
 
-    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
-    ext_sigcount = ext_header->count;
-    ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
-    for ( i = 0; i < ext_sigcount; i++ )
-    {
-        if ( microcode_update_match(cpu, mc_header,
-                                    ext_sig->sig, ext_sig->pf) )
-            goto find;
-        ext_sig++;
-    }
-    return 0;
- find:
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
              cpu, mc_header->rev, uci->cpu_sig.rev);
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 23ea954..882f560 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -3,6 +3,12 @@
 
 #include <xen/percpu.h>
 
+enum microcode_match_result {
+    OLD_UCODE, /* signature matched, but revision id is older or equal */
+    NEW_UCODE, /* signature matched, but revision id is newer */
+    MIS_UCODE, /* signature mismatched */
+};
+
 struct cpu_signature;
 struct ucode_cpu_info;
 
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 02/15] microcode/amd: fix memory leak
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 03/15] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Two buffers, '->equiv_cpu_table' and '->mpb', inside 'mc_amd' might be
allocated, but in the error-handling path they are not freed properly.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - use xzalloc() to get rid of explicitly initializing some fields
 to NULL/0.

changes in v8:
 - new
 - found by code inspection; no test has been done.
---
 xen/arch/x86/microcode_amd.c | 13 +++++++------
 1 file changed, 7 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 7a854c0..3069784 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -425,7 +425,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    mc_amd = xmalloc(struct microcode_amd);
+    mc_amd = xzalloc(struct microcode_amd);
     if ( !mc_amd )
     {
         printk(KERN_ERR "microcode: Cannot allocate memory for microcode patch\n");
@@ -479,6 +479,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
 
     if ( error )
     {
+        xfree(mc_amd->equiv_cpu_table);
         xfree(mc_amd);
         goto out;
     }
@@ -491,8 +492,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
      * It's possible the data file has multiple matching ucode,
      * lets keep searching till the latest version
      */
-    mc_amd->mpb = NULL;
-    mc_amd->mpb_size = 0;
     last_offset = offset;
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
@@ -549,11 +548,13 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
 
     if ( save_error )
     {
-        xfree(mc_amd);
         uci->mc.mc_amd = mc_old;
+        mc_old = mc_amd;
     }
-    else
-        xfree(mc_old);
+
+    xfree(mc_old->mpb);
+    xfree(mc_old->equiv_cpu_table);
+    xfree(mc_old);
 
   out:
 #if CONFIG_HVM
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 03/15] microcode/amd: distinguish old and mismatched ucode in microcode_fits()
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 02/15] microcode/amd: fix memory leak Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Sometimes, a ucode with a level lower than or equal to the current CPU's
patch level is useful. For example, to work around a broken BIOS which
only loads ucode for the BSP: when the BSP parses a ucode blob during
bootup, it is better to also save a ucode with a lower or equal level for
the APs.

No functional change is made in this patch. But a following patch will
handle "old ucode" and "mismatched ucode" separately.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v8:
 - new
---
 xen/arch/x86/microcode_amd.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 3069784..3db3555 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -152,8 +152,8 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
     return 0;
 }
 
-static bool_t microcode_fits(const struct microcode_amd *mc_amd,
-                             unsigned int cpu)
+static enum microcode_match_result microcode_fits(
+    const struct microcode_amd *mc_amd, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
@@ -167,27 +167,27 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( !find_equiv_cpu_id(equiv_cpu_table, current_cpu_id, &equiv_cpu_id) )
-        return 0;
+        return MIS_UCODE;
 
     if ( (mc_header->processor_rev_id) != equiv_cpu_id )
-        return 0;
+        return MIS_UCODE;
 
     if ( !verify_patch_size(mc_amd->mpb_size) )
     {
         pr_debug("microcode: patch size mismatch\n");
-        return 0;
+        return MIS_UCODE;
     }
 
     if ( mc_header->patch_id <= uci->cpu_sig.rev )
     {
         pr_debug("microcode: patch is already at required level or greater.\n");
-        return 0;
+        return OLD_UCODE;
     }
 
     pr_debug("microcode: CPU%d found a matching microcode update with version %#x (current=%#x)\n",
              cpu, mc_header->patch_id, uci->cpu_sig.rev);
 
-    return 1;
+    return NEW_UCODE;
 }
 
 static int apply_microcode(unsigned int cpu)
@@ -496,7 +496,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) )
+        if ( microcode_fits(mc_amd, cpu) == NEW_UCODE )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -576,7 +576,7 @@ static int microcode_resume_match(unsigned int cpu, const void *mc)
     struct microcode_amd *mc_amd = uci->mc.mc_amd;
     const struct microcode_amd *src = mc;
 
-    if ( !microcode_fits(src, cpu) )
+    if ( microcode_fits(src, cpu) != NEW_UCODE )
         return 0;
 
     if ( src != mc_amd )
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (2 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 03/15] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-22 11:11   ` Roger Pau Monné
                     ` (2 more replies)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 05/15] microcode: clean up microcode_resume_cpu Chao Gao
                   ` (11 subsequent siblings)
  15 siblings, 3 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

to replace the current per-cpu cache 'uci->mc'.

With the assumption that all CPUs in the system have the same signature
(family, model, stepping and 'pf'), a microcode update that matches one
CPU should match the others. Having multiple microcode revisions on
different CPUs would make the system unstable and should be avoided.
Hence, caching only one microcode update is good enough for all cases.

Introduce a global variable, microcode_cache, to store the newest
matching microcode update. Whenever we get a new valid microcode update,
its revision id is compared against that of the cached one to determine
whether "microcode_cache" needs to be replaced. This global cache is
then loaded onto a CPU in apply_microcode().

All operations on the cache are protected by 'microcode_mutex'.

Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
as I am going to remove it completely in the following patches. We copy
everything when creating the new cache blob to avoid reusing buffers
previously allocated for the old per-cpu cache. It is not very efficient,
but it is corrected by a later patch in this series.
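
For reference, the intended caller-side pattern (simplified from the AMD
hunk below; the caller already holds microcode_mutex, error handling
omitted):

    if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
        microcode_update_cache(new_patch);   /* keeps the newer of the two */
    else
        microcode_free_patch(new_patch);

    if ( match_cpu(microcode_get_cache()) )
        error = apply_microcode(cpu);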

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - on Intel side, ->compare_patch just checks the patch revision number.
 - explain why all buffers are copied in alloc_microcode_patch() in
 patch description.

Changes in v8:
 - Free generic wrapper struct in general code
 - Try to update the cache as long as a patch covers the current cpu.
 Previously, the cache was updated only if the patch was newer than the
 current update revision in the CPU. This small difference can work around
 a broken BIOS which only applies the microcode update to the BSP, leaving
 software to apply the same update to the other CPUs.

Changes in v7:
 - reworked to cache only one microcode patch rather than a list of
 microcode patches.
---
 xen/arch/x86/microcode.c        | 39 ++++++++++++++++++
 xen/arch/x86/microcode_amd.c    | 90 +++++++++++++++++++++++++++++++++++++----
 xen/arch/x86/microcode_intel.c  | 73 ++++++++++++++++++++++++++-------
 xen/include/asm-x86/microcode.h | 17 ++++++++
 4 files changed, 197 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 421d57e..0ecd2fd 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -61,6 +61,9 @@ static struct ucode_mod_blob __initdata ucode_blob;
  */
 static bool_t __initdata ucode_scan;
 
+/* Protected by microcode_mutex */
+static struct microcode_patch *microcode_cache;
+
 void __init microcode_set_module(unsigned int idx)
 {
     ucode_mod_idx = idx;
@@ -262,6 +265,42 @@ int microcode_resume_cpu(unsigned int cpu)
     return err;
 }
 
+void microcode_free_patch(struct microcode_patch *microcode_patch)
+{
+    microcode_ops->free_patch(microcode_patch->mc);
+    xfree(microcode_patch);
+}
+
+const struct microcode_patch *microcode_get_cache(void)
+{
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    return microcode_cache;
+}
+
+/* Return true if cache gets updated. Otherwise, return false */
+bool microcode_update_cache(struct microcode_patch *patch)
+{
+
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    if ( !microcode_cache )
+        microcode_cache = patch;
+    else if ( microcode_ops->compare_patch(patch,
+                                           microcode_cache) == NEW_UCODE )
+    {
+        microcode_free_patch(microcode_cache);
+        microcode_cache = patch;
+    }
+    else
+    {
+        microcode_free_patch(patch);
+        return false;
+    }
+
+    return true;
+}
+
 static int microcode_update_cpu(const void *buf, size_t size)
 {
     int err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 3db3555..30129ca 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -190,24 +190,83 @@ static enum microcode_match_result microcode_fits(
     return NEW_UCODE;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    if ( !patch )
+        return false;
+    return microcode_fits(patch->mc_amd, smp_processor_id()) == NEW_UCODE;
+}
+
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_amd *mc_amd)
+{
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+    struct microcode_amd *cache = xmalloc(struct microcode_amd);
+    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
+    struct equiv_cpu_entry *equiv_cpu_table =
+                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
+
+    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
+    {
+        xfree(microcode_patch);
+        xfree(cache);
+        xfree(mpb);
+        xfree(equiv_cpu_table);
+        return ERR_PTR(-ENOMEM);
+    }
+
+    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
+    cache->mpb = mpb;
+    cache->mpb_size = mc_amd->mpb_size;
+    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
+           mc_amd->equiv_cpu_table_size);
+    cache->equiv_cpu_table = equiv_cpu_table;
+    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
+    microcode_patch->mc_amd = cache;
+
+    return microcode_patch;
+}
+
+static void free_patch(void *mc)
+{
+    struct microcode_amd *mc_amd = mc;
+
+    xfree(mc_amd->equiv_cpu_table);
+    xfree(mc_amd->mpb);
+    xfree(mc_amd);
+}
+
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    const struct microcode_amd *new_mc = new->mc_amd;
+    const struct microcode_header_amd *new_header = new_mc->mpb;
+    const struct microcode_amd *old_mc = old->mc_amd;
+    const struct microcode_header_amd *old_header = old_mc->mpb;
+
+    if ( new_header->processor_rev_id == old_header->processor_rev_id )
+        return (new_header->patch_id > old_header->patch_id) ?
+                NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
+}
+
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    struct microcode_header_amd *hdr;
     int hw_err;
+    const struct microcode_header_amd *hdr;
+    const struct microcode_patch *patch = microcode_get_cache();
 
     /* We should bind the task to the CPU */
     BUG_ON(raw_smp_processor_id() != cpu);
 
-    if ( mc_amd == NULL )
+    if ( !match_cpu(patch) )
         return -EINVAL;
 
-    hdr = mc_amd->mpb;
-    if ( hdr == NULL )
-        return -EINVAL;
+    hdr = patch->mc_amd->mpb;
 
     spin_lock_irqsave(&microcode_update_lock, flags);
 
@@ -496,7 +555,21 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) == NEW_UCODE )
+        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
+
+        if ( IS_ERR(new_patch) )
+        {
+            error = PTR_ERR(new_patch);
+            break;
+        }
+
+        /* Update cache if this patch covers current CPU */
+        if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
+            microcode_update_cache(new_patch);
+        else
+            microcode_free_patch(new_patch);
+
+        if ( match_cpu(microcode_get_cache()) )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -640,6 +713,9 @@ static const struct microcode_ops microcode_amd_ops = {
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
     .start_update                     = start_update,
+    .free_patch                       = free_patch,
+    .compare_patch                    = compare_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_amd(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index c185b5c..14485dc 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -259,6 +259,31 @@ static int microcode_sanity_check(void *mc)
     return 0;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    if ( !patch )
+        return false;
+
+    return microcode_update_match(&patch->mc_intel->hdr,
+                                  smp_processor_id()) == NEW_UCODE;
+}
+
+static void free_patch(void *mc)
+{
+    xfree(mc);
+}
+
+/*
+ * Both patches to compare are supposed to be applicable to local CPU.
+ * Just compare the revision number.
+ */
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ?  NEW_UCODE :
+                                                                OLD_UCODE;
+}
+
 /*
  * return 0 - no update found
  * return 1 - found update
@@ -269,10 +294,26 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc;
+    void *new_mc = xmalloc_bytes(total_size);
+    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
 
-    if ( microcode_update_match(mc, cpu) != NEW_UCODE )
+    if ( !new_patch || !new_mc )
+    {
+        xfree(new_patch);
+        xfree(new_mc);
+        return -ENOMEM;
+    }
+    memcpy(new_mc, mc, total_size);
+    new_patch->mc_intel = new_mc;
+
+    /* Make sure that this patch covers current CPU */
+    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
+    {
+        microcode_free_patch(new_patch);
         return 0;
+    }
+
+    microcode_update_cache(new_patch);
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
@@ -297,18 +338,22 @@ static int apply_microcode(unsigned int cpu)
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    const struct microcode_intel *mc_intel;
+    const struct microcode_patch *patch = microcode_get_cache();
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu_num != cpu);
 
-    if ( uci->mc.mc_intel == NULL )
+    if ( !match_cpu(patch) )
         return -EINVAL;
 
+    mc_intel = patch->mc_intel;
+
     /* serialize access to the physical write to MSR 0x79 */
     spin_lock_irqsave(&microcode_update_lock, flags);
 
     /* write microcode via MSR 0x79 */
-    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
+    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
 
     /* As documented in the SDM: Do a CPUID 1 here */
@@ -319,19 +364,17 @@ static int apply_microcode(unsigned int cpu)
     val[1] = (uint32_t)(msr_content >> 32);
 
     spin_unlock_irqrestore(&microcode_update_lock, flags);
-    if ( val[1] != uci->mc.mc_intel->hdr.rev )
+    if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
+               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
-           cpu_num, uci->cpu_sig.rev, val[1],
-           uci->mc.mc_intel->hdr.year,
-           uci->mc.mc_intel->hdr.month,
-           uci->mc.mc_intel->hdr.day);
+           cpu_num, uci->cpu_sig.rev, val[1], mc_intel->hdr.year,
+           mc_intel->hdr.month, mc_intel->hdr.day);
     uci->cpu_sig.rev = val[1];
 
     return 0;
@@ -371,7 +414,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     long offset = 0;
     int error = 0;
     void *mc;
-    unsigned int matching_count = 0;
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -389,10 +431,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
          * lets keep searching till the latest version
          */
         if ( error == 1 )
-        {
-            matching_count++;
             error = 0;
-        }
+
         xfree(mc);
     }
     if ( offset > 0 )
@@ -400,7 +440,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && matching_count )
+    if ( !error && match_cpu(microcode_get_cache()) )
         error = apply_microcode(cpu);
 
     return error;
@@ -416,6 +456,9 @@ static const struct microcode_ops microcode_intel_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+    .free_patch                       = free_patch,
+    .compare_patch                    = compare_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 882f560..42949b1 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -12,6 +12,14 @@ enum microcode_match_result {
 struct cpu_signature;
 struct ucode_cpu_info;
 
+struct microcode_patch {
+    union {
+        struct microcode_intel *mc_intel;
+        struct microcode_amd *mc_amd;
+        void *mc;
+    };
+};
+
 struct microcode_ops {
     int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
@@ -19,6 +27,11 @@ struct microcode_ops {
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
     int (*apply_microcode)(unsigned int cpu);
     int (*start_update)(void);
+    void (*free_patch)(void *mc);
+    bool (*match_cpu)(const struct microcode_patch *patch);
+    enum microcode_match_result (*compare_patch)(
+            const struct microcode_patch *new,
+            const struct microcode_patch *old);
 };
 
 struct cpu_signature {
@@ -39,4 +52,8 @@ struct ucode_cpu_info {
 DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 extern const struct microcode_ops *microcode_ops;
 
+const struct microcode_patch *microcode_get_cache(void);
+bool microcode_update_cache(struct microcode_patch *patch);
+void microcode_free_patch(struct microcode_patch *patch);
+
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 05/15] microcode: clean up microcode_resume_cpu
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (3 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 06/15] microcode: remove struct ucode_cpu_info Chao Gao
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Previously, a per-cpu ucode cache was maintained. Each CPU had its own
update cache and there might have been multiple versions of microcode.
Thus microcode_resume_cpu tried its best to update the microcode by
loading every cached update until one load succeeded.

But now the cache structure is simplified a lot and only a single ucode
is cached. A single invocation of ->apply_microcode() loads the cache
and brings the microcode up to date.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
changes in v8:
 - new
 - separated from the following patch
---
 xen/arch/x86/microcode.c        | 40 ++---------------------------------
 xen/arch/x86/microcode_amd.c    | 47 -----------------------------------------
 xen/arch/x86/microcode_intel.c  |  6 ------
 xen/include/asm-x86/microcode.h |  1 -
 4 files changed, 2 insertions(+), 92 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 0ecd2fd..ca5ee37 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -215,8 +215,6 @@ int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct cpu_signature nsig;
-    unsigned int cpu2;
 
     if ( !microcode_ops )
         return 0;
@@ -224,42 +222,8 @@ int microcode_resume_cpu(unsigned int cpu)
     spin_lock(&microcode_mutex);
 
     err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
-    if ( err )
-    {
-        __microcode_fini_cpu(cpu);
-        spin_unlock(&microcode_mutex);
-        return err;
-    }
-
-    if ( uci->mc.mc_valid )
-    {
-        err = microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid);
-        if ( err >= 0 )
-        {
-            if ( err )
-                err = microcode_ops->apply_microcode(cpu);
-            spin_unlock(&microcode_mutex);
-            return err;
-        }
-    }
-
-    nsig = uci->cpu_sig;
-    __microcode_fini_cpu(cpu);
-    uci->cpu_sig = nsig;
-
-    err = -EIO;
-    for_each_online_cpu ( cpu2 )
-    {
-        uci = &per_cpu(ucode_cpu_info, cpu2);
-        if ( uci->mc.mc_valid &&
-             microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid) > 0 )
-        {
-            err = microcode_ops->apply_microcode(cpu);
-            break;
-        }
-    }
-
-    __microcode_fini_cpu(cpu);
+    if ( likely(!err) )
+        err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
 
     return err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 30129ca..b351894 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -643,52 +643,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    const struct microcode_amd *src = mc;
-
-    if ( microcode_fits(src, cpu) != NEW_UCODE )
-        return 0;
-
-    if ( src != mc_amd )
-    {
-        if ( mc_amd )
-        {
-            xfree(mc_amd->equiv_cpu_table);
-            xfree(mc_amd->mpb);
-            xfree(mc_amd);
-        }
-
-        mc_amd = xmalloc(struct microcode_amd);
-        uci->mc.mc_amd = mc_amd;
-        if ( !mc_amd )
-            return -ENOMEM;
-        mc_amd->equiv_cpu_table = xmalloc_bytes(src->equiv_cpu_table_size);
-        if ( !mc_amd->equiv_cpu_table )
-            goto err1;
-        mc_amd->mpb = xmalloc_bytes(src->mpb_size);
-        if ( !mc_amd->mpb )
-            goto err2;
-
-        mc_amd->equiv_cpu_table_size = src->equiv_cpu_table_size;
-        mc_amd->mpb_size = src->mpb_size;
-        memcpy(mc_amd->mpb, src->mpb, src->mpb_size);
-        memcpy(mc_amd->equiv_cpu_table, src->equiv_cpu_table,
-               src->equiv_cpu_table_size);
-    }
-
-    return 1;
-
-err2:
-    xfree(mc_amd->equiv_cpu_table);
-err1:
-    xfree(mc_amd);
-    uci->mc.mc_amd = NULL;
-    return -ENOMEM;
-}
-
 static int start_update(void)
 {
 #if CONFIG_HVM
@@ -708,7 +662,6 @@ static int start_update(void)
 }
 
 static const struct microcode_ops microcode_amd_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 14485dc..58eb186 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -446,13 +446,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    return get_matching_microcode(mc, cpu);
-}
-
 static const struct microcode_ops microcode_intel_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 42949b1..3238743 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -21,7 +21,6 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
                                  size_t size);
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 06/15] microcode: remove struct ucode_cpu_info
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (4 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 05/15] microcode: clean up microcode_resume_cpu Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 07/15] microcode: remove pointless 'cpu' parameter Chao Gao
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Remove the per-cpu cache field in struct ucode_cpu_info since it has
been replaced by a global cache. That leaves only one field remaining in
ucode_cpu_info, so the struct is removed and the remaining field (the
cpu signature) is stored in the per-cpu area.

The cpu status notifier is also removed. It was only used to free the
"mc" field in order to avoid a memory leak.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - rebase and fix conflict

Changes in v8:
 - split microcode_resume_cpu() cleanup to a separate patch.

Changes in v6:
 - remove the whole struct ucode_cpu_info instead of the per-cpu cache
  in it.
---
 xen/arch/x86/apic.c             |  2 +-
 xen/arch/x86/microcode.c        | 57 +++++++---------------------------------
 xen/arch/x86/microcode_amd.c    | 58 +++++++++++------------------------------
 xen/arch/x86/microcode_intel.c  | 28 +++++++-------------
 xen/arch/x86/spec_ctrl.c        |  2 +-
 xen/include/asm-x86/microcode.h | 12 +--------
 6 files changed, 36 insertions(+), 123 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 9c3c998..ae1f1e9 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1193,7 +1193,7 @@ static void __init check_deadline_errata(void)
     else
         rev = (unsigned long)m->driver_data;
 
-    if ( this_cpu(ucode_cpu_info).cpu_sig.rev >= rev )
+    if ( this_cpu(cpu_sig).rev >= rev )
         return;
 
     setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE);
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index ca5ee37..552e7fe 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -187,7 +187,7 @@ const struct microcode_ops *microcode_ops;
 
 static DEFINE_SPINLOCK(microcode_mutex);
 
-DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 struct microcode_info {
     unsigned int cpu;
@@ -196,32 +196,17 @@ struct microcode_info {
     char buffer[1];
 };
 
-static void __microcode_fini_cpu(unsigned int cpu)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-
-    xfree(uci->mc.mc_valid);
-    memset(uci, 0, sizeof(*uci));
-}
-
-static void microcode_fini_cpu(unsigned int cpu)
-{
-    spin_lock(&microcode_mutex);
-    __microcode_fini_cpu(cpu);
-    spin_unlock(&microcode_mutex);
-}
-
 int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
     if ( likely(!err) )
         err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
@@ -269,16 +254,13 @@ static int microcode_update_cpu(const void *buf, size_t size)
 {
     int err;
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
     if ( likely(!err) )
         err = microcode_ops->cpu_request_microcode(cpu, buf, size);
-    else
-        __microcode_fini_cpu(cpu);
-
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -365,29 +347,10 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-static int microcode_percpu_callback(
-    struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
-    unsigned int cpu = (unsigned long)hcpu;
-
-    switch ( action )
-    {
-    case CPU_DEAD:
-        microcode_fini_cpu(cpu);
-        break;
-    }
-
-    return NOTIFY_DONE;
-}
-
-static struct notifier_block microcode_percpu_nfb = {
-    .notifier_call = microcode_percpu_callback,
-};
-
 int __init early_microcode_update_cpu(bool start_update)
 {
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc = 0;
     void *data = NULL;
     size_t len;
@@ -406,7 +369,7 @@ int __init early_microcode_update_cpu(bool start_update)
         data = bootstrap_map(&ucode_mod);
     }
 
-    microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    microcode_ops->collect_cpu_info(cpu, sig);
 
     if ( data )
     {
@@ -425,7 +388,7 @@ int __init early_microcode_update_cpu(bool start_update)
 int __init early_microcode_init(void)
 {
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc;
 
     rc = microcode_init_intel();
@@ -438,12 +401,10 @@ int __init early_microcode_init(void)
 
     if ( microcode_ops )
     {
-        microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+        microcode_ops->collect_cpu_info(cpu, sig);
 
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
-
-        register_cpu_notifier(&microcode_percpu_nfb);
     }
 
     return rc;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index b351894..9e4ec73 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -155,7 +155,7 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
 static enum microcode_match_result microcode_fits(
     const struct microcode_amd *mc_amd, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
@@ -178,14 +178,14 @@ static enum microcode_match_result microcode_fits(
         return MIS_UCODE;
     }
 
-    if ( mc_header->patch_id <= uci->cpu_sig.rev )
+    if ( mc_header->patch_id <= sig->rev )
     {
         pr_debug("microcode: patch is already at required level or greater.\n");
         return OLD_UCODE;
     }
 
     pr_debug("microcode: CPU%d found a matching microcode update with version %#x (current=%#x)\n",
-             cpu, mc_header->patch_id, uci->cpu_sig.rev);
+             cpu, mc_header->patch_id, sig->rev);
 
     return NEW_UCODE;
 }
@@ -254,9 +254,9 @@ static enum microcode_match_result compare_patch(
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
     int hw_err;
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch = microcode_get_cache();
 
@@ -292,9 +292,9 @@ static int apply_microcode(unsigned int cpu)
     }
 
     printk(KERN_WARNING "microcode: CPU%d updated from revision %#x to %#x\n",
-           cpu, uci->cpu_sig.rev, hdr->patch_id);
+           cpu, sig->rev, hdr->patch_id);
 
-    uci->cpu_sig.rev = rev;
+    sig->rev = rev;
 
     return 0;
 }
@@ -440,14 +440,14 @@ static bool_t check_final_patch_levels(unsigned int cpu)
      * any of the 'final_levels', then we should not update the microcode
      * patch on the cpu as system will hang otherwise.
      */
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     unsigned int i;
 
     if ( boot_cpu_data.x86 != 0x10 )
         return 0;
 
     for ( i = 0; i < ARRAY_SIZE(final_levels); i++ )
-        if ( uci->cpu_sig.rev == final_levels[i] )
+        if ( sig->rev == final_levels[i] )
             return 1;
 
     return 0;
@@ -456,13 +456,12 @@ static bool_t check_final_patch_levels(unsigned int cpu)
 static int cpu_request_microcode(unsigned int cpu, const void *buf,
                                  size_t bufsize)
 {
-    struct microcode_amd *mc_amd, *mc_old;
+    struct microcode_amd *mc_amd;
     size_t offset = 0;
-    size_t last_offset, applied_offset = 0;
-    int error = 0, save_error = 1;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -531,7 +530,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         {
             printk(KERN_ERR "microcode: CPU%d incorrect or corrupt container file\n"
                    "microcode: Failed to update patch level. "
-                   "Current lvl:%#x\n", cpu, uci->cpu_sig.rev);
+                   "Current lvl:%#x\n", cpu, sig->rev);
             break;
         }
     }
@@ -543,15 +542,10 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    mc_old = uci->mc.mc_amd;
-    /* implicitely validates uci->mc.mc_valid */
-    uci->mc.mc_amd = mc_amd;
-
     /*
      * It's possible the data file has multiple matching ucode,
      * lets keep searching till the latest version
      */
-    last_offset = offset;
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
@@ -574,11 +568,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             error = apply_microcode(cpu);
             if ( error )
                 break;
-            applied_offset = last_offset;
         }
 
-        last_offset = offset;
-
         if ( offset >= bufsize )
             break;
 
@@ -606,28 +597,9 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
              *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
             break;
     }
-
-    /* On success keep the microcode patch for
-     * re-apply on resume.
-     */
-    if ( applied_offset )
-    {
-        save_error = get_ucode_from_buffer_amd(
-            mc_amd, buf, bufsize, &applied_offset);
-
-        if ( save_error )
-            error = save_error;
-    }
-
-    if ( save_error )
-    {
-        uci->mc.mc_amd = mc_old;
-        mc_old = mc_amd;
-    }
-
-    xfree(mc_old->mpb);
-    xfree(mc_old->equiv_cpu_table);
-    xfree(mc_old);
+    xfree(mc_amd->mpb);
+    xfree(mc_amd->equiv_cpu_table);
+    xfree(mc_amd);
 
   out:
 #if CONFIG_HVM
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 58eb186..fafaa79 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -141,10 +141,10 @@ static enum microcode_match_result microcode_update_match(
     const struct extended_sigtable *ext_header;
     const struct extended_signature *ext_sig;
     unsigned int i;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    unsigned int sig = uci->cpu_sig.sig;
-    unsigned int pf = uci->cpu_sig.pf;
-    unsigned int rev = uci->cpu_sig.rev;
+    struct cpu_signature *cpu_sig = &per_cpu(cpu_sig, cpu);
+    unsigned int sig = cpu_sig->sig;
+    unsigned int pf = cpu_sig->pf;
+    unsigned int rev = cpu_sig->rev;
     unsigned long data_size = get_datasize(mc_header);
     const void *end = (const void *)mc_header + get_totalsize(mc_header);
 
@@ -291,7 +291,6 @@ static enum microcode_match_result compare_patch(
  */
 static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
@@ -317,17 +316,8 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, uci->cpu_sig.rev);
-    new_mc = xmalloc_bytes(total_size);
-    if ( new_mc == NULL )
-    {
-        printk(KERN_ERR "microcode: error! Can not allocate memory\n");
-        return -ENOMEM;
-    }
+             cpu, mc_header->rev, per_cpu(cpu_sig, cpu).rev);
 
-    memcpy(new_mc, mc, total_size);
-    xfree(uci->mc.mc_intel);
-    uci->mc.mc_intel = new_mc;
     return 1;
 }
 
@@ -337,7 +327,7 @@ static int apply_microcode(unsigned int cpu)
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch = microcode_get_cache();
 
@@ -368,14 +358,14 @@ static int apply_microcode(unsigned int cpu)
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
+               sig->rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
-           cpu_num, uci->cpu_sig.rev, val[1], mc_intel->hdr.year,
+           cpu_num, sig->rev, val[1], mc_intel->hdr.year,
            mc_intel->hdr.month, mc_intel->hdr.day);
-    uci->cpu_sig.rev = val[1];
+    sig->rev = val[1];
 
     return 0;
 }
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 468a847..4761be8 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -438,7 +438,7 @@ static bool __init check_smt_enabled(void)
 /* Calculate whether Retpoline is known-safe on this CPU. */
 static bool __init retpoline_safe(uint64_t caps)
 {
-    unsigned int ucode_rev = this_cpu(ucode_cpu_info).cpu_sig.rev;
+    unsigned int ucode_rev = this_cpu(cpu_sig).rev;
 
     if ( boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
         return true;
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 3238743..5b8289f 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -10,7 +10,6 @@ enum microcode_match_result {
 };
 
 struct cpu_signature;
-struct ucode_cpu_info;
 
 struct microcode_patch {
     union {
@@ -39,16 +38,7 @@ struct cpu_signature {
     unsigned int rev;
 };
 
-struct ucode_cpu_info {
-    struct cpu_signature cpu_sig;
-    union {
-        struct microcode_intel *mc_intel;
-        struct microcode_amd *mc_amd;
-        void *mc_valid;
-    } mc;
-};
-
-DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
 extern const struct microcode_ops *microcode_ops;
 
 const struct microcode_patch *microcode_get_cache(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 07/15] microcode: remove pointless 'cpu' parameter
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (5 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 06/15] microcode: remove struct ucode_cpu_info Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Some callbacks in microcode_ops and related functions take a cpu
id parameter, but at all current call sites that parameter is
always the id of the current cpu; some of them even assert this.
Remove the redundant 'cpu' parameter.
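
As a rough, self-contained illustration of the resulting shape (not
part of the patch; the stub types and the fake_smp_processor_id()
helper are made up for the example), a callback now derives the cpu
id itself:

#include <stdio.h>

struct cpu_signature { unsigned int sig, pf, rev; };

/* Stand-in for smp_processor_id(); always "CPU0" in this sketch. */
static unsigned int fake_smp_processor_id(void) { return 0; }

/* Before: int collect_cpu_info(unsigned int cpu, struct cpu_signature *csig); */
static int collect_cpu_info(struct cpu_signature *csig)
{
    unsigned int cpu = fake_smp_processor_id();

    csig->sig = 0x000306c3;   /* placeholder signature */
    csig->rev = 0x28;         /* placeholder revision */
    printf("CPU%u: sig %#x rev %#x\n", cpu, csig->sig, csig->rev);
    return 0;
}

int main(void)
{
    struct cpu_signature sig = { 0 };

    return collect_cpu_info(&sig);
}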

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - use a convenience variable 'cpu' in collect_cpu_info() on AMD side
 - rebase and fix conflicts

Changes in v8:
 - Use current_cpu_data in collect_cpu_info()
 - keep the cpu parameter of check_final_patch_levels()
 - use smp_processor_id() in get_matching_microcode() rather than
 define a local variable and label it "__maybe_unused"
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 20 ++++++++------------
 xen/arch/x86/microcode_amd.c    | 30 +++++++++++-------------------
 xen/arch/x86/microcode_intel.c  | 35 +++++++++++++----------------------
 xen/arch/x86/smpboot.c          |  2 +-
 xen/include/asm-x86/microcode.h |  7 +++----
 xen/include/asm-x86/processor.h |  2 +-
 7 files changed, 38 insertions(+), 60 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index aecc754..4f21903 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -253,7 +253,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu(0);
+    microcode_resume_cpu();
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 552e7fe..3b87c72 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -196,19 +196,19 @@ struct microcode_info {
     char buffer[1];
 };
 
-int microcode_resume_cpu(unsigned int cpu)
+int microcode_resume_cpu(void)
 {
     int err;
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->apply_microcode(cpu);
+        err = microcode_ops->apply_microcode();
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -258,9 +258,9 @@ static int microcode_update_cpu(const void *buf, size_t size)
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
+        err = microcode_ops->cpu_request_microcode(buf, size);
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -349,8 +349,6 @@ __initcall(microcode_init);
 
 int __init early_microcode_update_cpu(bool start_update)
 {
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc = 0;
     void *data = NULL;
     size_t len;
@@ -369,7 +367,7 @@ int __init early_microcode_update_cpu(bool start_update)
         data = bootstrap_map(&ucode_mod);
     }
 
-    microcode_ops->collect_cpu_info(cpu, sig);
+    microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
     if ( data )
     {
@@ -387,8 +385,6 @@ int __init early_microcode_update_cpu(bool start_update)
 
 int __init early_microcode_init(void)
 {
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc;
 
     rc = microcode_init_intel();
@@ -401,7 +397,7 @@ int __init early_microcode_init(void)
 
     if ( microcode_ops )
     {
-        microcode_ops->collect_cpu_info(cpu, sig);
+        microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 9e4ec73..dd3821c 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -78,8 +78,9 @@ struct mpbhdr {
 static DEFINE_SPINLOCK(microcode_update_lock);
 
 /* See comment in start_update() for cases when this routine fails */
-static int collect_cpu_info(unsigned int cpu, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu];
 
     memset(csig, 0, sizeof(*csig));
@@ -153,17 +154,15 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
 }
 
 static enum microcode_match_result microcode_fits(
-    const struct microcode_amd *mc_amd, unsigned int cpu)
+    const struct microcode_amd *mc_amd)
 {
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( !find_equiv_cpu_id(equiv_cpu_table, current_cpu_id, &equiv_cpu_id) )
@@ -192,9 +191,7 @@ static enum microcode_match_result microcode_fits(
 
 static bool match_cpu(const struct microcode_patch *patch)
 {
-    if ( !patch )
-        return false;
-    return microcode_fits(patch->mc_amd, smp_processor_id()) == NEW_UCODE;
+    return patch && (microcode_fits(patch->mc_amd) == NEW_UCODE);
 }
 
 static struct microcode_patch *alloc_microcode_patch(
@@ -251,18 +248,16 @@ static enum microcode_match_result compare_patch(
     return MIS_UCODE;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint32_t rev;
     int hw_err;
+    unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch = microcode_get_cache();
 
-    /* We should bind the task to the CPU */
-    BUG_ON(raw_smp_processor_id() != cpu);
-
     if ( !match_cpu(patch) )
         return -EINVAL;
 
@@ -453,19 +448,16 @@ static bool_t check_final_patch_levels(unsigned int cpu)
     return 0;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t bufsize)
+static int cpu_request_microcode(const void *buf, size_t bufsize)
 {
     struct microcode_amd *mc_amd;
     size_t offset = 0;
     int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( *(const uint32_t *)buf != UCODE_MAGIC )
@@ -558,14 +550,14 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         }
 
         /* Update cache if this patch covers current CPU */
-        if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
+        if ( microcode_fits(new_patch->mc_amd) != MIS_UCODE )
             microcode_update_cache(new_patch);
         else
             microcode_free_patch(new_patch);
 
         if ( match_cpu(microcode_get_cache()) )
         {
-            error = apply_microcode(cpu);
+            error = apply_microcode();
             if ( error )
                 break;
         }
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index fafaa79..a5452d4 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -96,13 +96,12 @@ struct extended_sigtable {
 /* serialize access to the physical write to MSR 0x79 */
 static DEFINE_SPINLOCK(microcode_update_lock);
 
-static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu_num = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu_num];
     uint64_t msr_content;
 
-    BUG_ON(cpu_num != smp_processor_id());
-
     memset(csig, 0, sizeof(*csig));
 
     if ( (c->x86_vendor != X86_VENDOR_INTEL) || (c->x86 < 6) )
@@ -136,12 +135,12 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
 
 /* Check an update against the CPU signature and current update revision */
 static enum microcode_match_result microcode_update_match(
-    const struct microcode_header_intel *mc_header, unsigned int cpu)
+    const struct microcode_header_intel *mc_header)
 {
     const struct extended_sigtable *ext_header;
     const struct extended_signature *ext_sig;
     unsigned int i;
-    struct cpu_signature *cpu_sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *cpu_sig = &this_cpu(cpu_sig);
     unsigned int sig = cpu_sig->sig;
     unsigned int pf = cpu_sig->pf;
     unsigned int rev = cpu_sig->rev;
@@ -264,8 +263,7 @@ static bool match_cpu(const struct microcode_patch *patch)
     if ( !patch )
         return false;
 
-    return microcode_update_match(&patch->mc_intel->hdr,
-                                  smp_processor_id()) == NEW_UCODE;
+    return microcode_update_match(&patch->mc_intel->hdr) == NEW_UCODE;
 }
 
 static void free_patch(void *mc)
@@ -289,7 +287,7 @@ static enum microcode_match_result compare_patch(
  * return 1 - found update
  * return < 0 - error
  */
-static int get_matching_microcode(const void *mc, unsigned int cpu)
+static int get_matching_microcode(const void *mc)
 {
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
@@ -306,7 +304,7 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     new_patch->mc_intel = new_mc;
 
     /* Make sure that this patch covers current CPU */
-    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
+    if ( microcode_update_match(mc) == MIS_UCODE )
     {
         microcode_free_patch(new_patch);
         return 0;
@@ -316,24 +314,21 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, per_cpu(cpu_sig, cpu).rev);
+             smp_processor_id(), mc_header->rev, this_cpu(cpu_sig).rev);
 
     return 1;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch = microcode_get_cache();
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu_num != cpu);
-
     if ( !match_cpu(patch) )
         return -EINVAL;
 
@@ -398,22 +393,18 @@ static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
     return offset + total_size;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t size)
+static int cpu_request_microcode(const void *buf, size_t size)
 {
     long offset = 0;
     int error = 0;
     void *mc;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
         error = microcode_sanity_check(mc);
         if ( error )
             break;
-        error = get_matching_microcode(mc, cpu);
+        error = get_matching_microcode(mc);
         if ( error < 0 )
             break;
         /*
@@ -431,7 +422,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         error = offset;
 
     if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode(cpu);
+        error = apply_microcode();
 
     return error;
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 65e9cee..c818cfc 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -364,7 +364,7 @@ void start_secondary(void *unused)
     if ( system_state <= SYS_STATE_smp_boot )
         early_microcode_update_cpu(false);
     else
-        microcode_resume_cpu(cpu);
+        microcode_resume_cpu();
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 5b8289f..35223eb 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -20,10 +20,9 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
-                                 size_t size);
-    int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
-    int (*apply_microcode)(unsigned int cpu);
+    int (*cpu_request_microcode)(const void *buf, size_t size);
+    int (*collect_cpu_info)(struct cpu_signature *csig);
+    int (*apply_microcode)(void);
     int (*start_update)(void);
     void (*free_patch)(void *mc);
     bool (*match_cpu)(const struct microcode_patch *patch);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 2862321..104faa9 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -568,7 +568,7 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(unsigned int cpu);
+int microcode_resume_cpu(void);
 int early_microcode_update_cpu(bool start_update);
 int early_microcode_init(void);
 int microcode_init_intel(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (6 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 07/15] microcode: remove pointless 'cpu' parameter Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-22 13:08   ` Roger Pau Monné
  2019-08-28 15:26   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 09/15] microcode: pass a patch pointer to apply_microcode() Chao Gao
                   ` (7 subsequent siblings)
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Introduce a vendor hook, .end_update, for svm_host_osvw_init().
The hook function is called on each cpu after loading an update.
It is a preparation for splitting out apply_microcode() from
cpu_request_microcode().

Note that svm_host_osvw_init() should be called regardless of the
result of loading an update.
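
A minimal standalone sketch of the hook pattern (illustrative only;
the names below are stand-ins, not the actual Xen symbols): common
code invokes an optional end_update() hook after every load attempt,
regardless of the apply result.

#include <stdio.h>

struct ucode_ops {
    int  (*apply)(void);
    int  (*start_update)(void);
    void (*end_update)(void);
};

static int amd_apply(void)  { return 0; }
static int amd_start(void)  { puts("reset OSVW state");   return 0; }
static void amd_end(void)   { puts("re-init OSVW state"); }

static const struct ucode_ops amd_ops = {
    .apply        = amd_apply,
    .start_update = amd_start,
    .end_update   = amd_end,
};

/* Common code: end_update runs whether or not apply() succeeded. */
static int do_update(const struct ucode_ops *ops)
{
    int rc = ops->apply();

    if ( ops->end_update )
        ops->end_update();
    return rc;
}

int main(void)
{
    return do_update(&amd_ops);
}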

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - call .end_update in early loading path
 - on AMD side, initialize .{start,end}_update only if "CONFIG_HVM"
 is true.
---
 xen/arch/x86/microcode.c        | 10 +++++++++-
 xen/arch/x86/microcode_amd.c    | 23 ++++++++++-------------
 xen/include/asm-x86/microcode.h |  1 +
 3 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 3b87c72..c9401a7 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -277,6 +277,9 @@ static long do_microcode_update(void *_info)
     if ( error )
         info->error = error;
 
+    if ( microcode_ops->end_update )
+        microcode_ops->end_update();
+
     info->cpu = cpumask_next(info->cpu, &cpu_online_map);
     if ( info->cpu < nr_cpu_ids )
         return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
@@ -377,7 +380,12 @@ int __init early_microcode_update_cpu(bool start_update)
         if ( rc )
             return rc;
 
-        return microcode_update_cpu(data, len);
+        rc = microcode_update_cpu(data, len);
+
+        if ( microcode_ops->end_update )
+            microcode_ops->end_update();
+
+        return rc;
     }
     else
         return -ENOMEM;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index dd3821c..b85fb04 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -594,10 +594,6 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     xfree(mc_amd);
 
   out:
-#if CONFIG_HVM
-    svm_host_osvw_init();
-#endif
-
     /*
      * In some cases we may return an error even if processor's microcode has
      * been updated. For example, the first patch in a container file is loaded
@@ -609,27 +605,28 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
 
 static int start_update(void)
 {
-#if CONFIG_HVM
     /*
-     * We assume here that svm_host_osvw_init() will be called on each cpu (from
-     * cpu_request_microcode()).
-     *
-     * Note that if collect_cpu_info() returns an error then
-     * cpu_request_microcode() will not invoked thus leaving OSVW bits not
-     * updated. Currently though collect_cpu_info() will not fail on processors
-     * supporting OSVW so we will not deal with this possibility.
+     * svm_host_osvw_init() will be called on each cpu by calling '.end_update'
+     * in common code.
      */
     svm_host_osvw_reset();
-#endif
 
     return 0;
 }
 
+static void end_update(void)
+{
+    svm_host_osvw_init();
+}
+
 static const struct microcode_ops microcode_amd_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+#if CONFIG_HVM
     .start_update                     = start_update,
+    .end_update                       = end_update,
+#endif
     .free_patch                       = free_patch,
     .compare_patch                    = compare_patch,
     .match_cpu                        = match_cpu,
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 35223eb..c8d2c4f 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -24,6 +24,7 @@ struct microcode_ops {
     int (*collect_cpu_info)(struct cpu_signature *csig);
     int (*apply_microcode)(void);
     int (*start_update)(void);
+    void (*end_update)(void);
     void (*free_patch)(void *mc);
     bool (*match_cpu)(const struct microcode_patch *patch);
     enum microcode_match_result (*compare_patch)(
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 09/15] microcode: pass a patch pointer to apply_microcode()
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (7 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

apply_microcode() always loads the cached ucode patch, which forces
a patch to be stored before it can be loaded. Make apply_microcode()
accept a patch pointer to remove this limitation, so that a patch
can be stored after a successful load.
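
A minimal standalone sketch of the new calling convention
(illustrative only; the types and the 'cached' variable are
stand-ins): the caller applies a candidate patch first and caches it
only on success.

#include <stddef.h>
#include <stdio.h>

struct microcode_patch { unsigned int rev; };

static const struct microcode_patch *cached;   /* stand-in for the cache */

static int apply_microcode(const struct microcode_patch *patch)
{
    if ( !patch )
        return -1;
    printf("loading revision %#x\n", patch->rev);
    return 0;
}

int main(void)
{
    struct microcode_patch candidate = { .rev = 0x2a };

    /* Apply first; store into the cache only after a successful load. */
    if ( apply_microcode(&candidate) == 0 )
        cached = &candidate;

    return cached ? 0 : 1;
}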

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/microcode.c        | 2 +-
 xen/arch/x86/microcode_amd.c    | 5 ++---
 xen/arch/x86/microcode_intel.c  | 5 ++---
 xen/include/asm-x86/microcode.h | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index c9401a7..0e9322a 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -208,7 +208,7 @@ int microcode_resume_cpu(void)
 
     err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->apply_microcode();
+        err = microcode_ops->apply_microcode(microcode_cache);
     spin_unlock(&microcode_mutex);
 
     return err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index b85fb04..21cdfe0 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -248,7 +248,7 @@ static enum microcode_match_result compare_patch(
     return MIS_UCODE;
 }
 
-static int apply_microcode(void)
+static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
     uint32_t rev;
@@ -256,7 +256,6 @@ static int apply_microcode(void)
     unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
-    const struct microcode_patch *patch = microcode_get_cache();
 
     if ( !match_cpu(patch) )
         return -EINVAL;
@@ -557,7 +556,7 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
 
         if ( match_cpu(microcode_get_cache()) )
         {
-            error = apply_microcode();
+            error = apply_microcode(microcode_get_cache());
             if ( error )
                 break;
         }
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index a5452d4..8c0008c 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -319,7 +319,7 @@ static int get_matching_microcode(const void *mc)
     return 1;
 }
 
-static int apply_microcode(void)
+static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
     uint64_t msr_content;
@@ -327,7 +327,6 @@ static int apply_microcode(void)
     unsigned int cpu_num = raw_smp_processor_id();
     struct cpu_signature *sig = &this_cpu(cpu_sig);
     const struct microcode_intel *mc_intel;
-    const struct microcode_patch *patch = microcode_get_cache();
 
     if ( !match_cpu(patch) )
         return -EINVAL;
@@ -422,7 +421,7 @@ static int cpu_request_microcode(const void *buf, size_t size)
         error = offset;
 
     if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode();
+        error = apply_microcode(microcode_get_cache());
 
     return error;
 }
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index c8d2c4f..8c7de9d 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -22,7 +22,7 @@ struct microcode_patch {
 struct microcode_ops {
     int (*cpu_request_microcode)(const void *buf, size_t size);
     int (*collect_cpu_info)(struct cpu_signature *csig);
-    int (*apply_microcode)(void);
+    int (*apply_microcode)(const struct microcode_patch *patch);
     int (*start_update)(void);
     void (*end_update)(void);
     void (*free_patch)(void *mc);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (8 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 09/15] microcode: pass a patch pointer to apply_microcode() Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-22 13:59   ` Roger Pau Monné
  2019-08-29 10:19   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
                   ` (5 subsequent siblings)
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

During late microcode loading, apply_microcode() is invoked in
cpu_request_microcode(). To make late microcode update more reliable,
we want to put apply_microcode() into stop_machine context. So split
it out from cpu_request_microcode(). In general, for both early
loading on the BSP and late loading, cpu_request_microcode() is
called first to get the matching microcode update contained in the
blob, and then apply_microcode() is invoked explicitly on each cpu
by common code.

Given that all CPUs are supposed to have the same signature, parsing
the microcode blob only needs to be done once. So
cpu_request_microcode() is also moved out of microcode_update_cpu().

In some cases (e.g. a broken BIOS), the system may have multiple
revisions of a microcode update. So we try to load a microcode update
as long as it covers the current cpu, and if a cpu loads a patch
successfully, that patch is stored in the patch cache.
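
A rough standalone sketch of the resulting flow (illustrative only;
parse_blob(), apply_on_cpu() and the fake revision array are made up
for the example): parse once, keep the newest matching revision, then
apply that single patch on every cpu.

#include <stdio.h>
#include <stdlib.h>

struct patch { unsigned int rev; };

/* Walk the blob once and keep only the newest matching revision. */
static struct patch *parse_blob(const unsigned int *revs, size_t n)
{
    struct patch *best = NULL;
    size_t i;

    for ( i = 0; i < n; i++ )
    {
        if ( !best )
        {
            best = malloc(sizeof(*best));
            if ( !best )
                return NULL;
            best->rev = revs[i];
        }
        else if ( revs[i] > best->rev )
            best->rev = revs[i];
    }

    return best;
}

static void apply_on_cpu(unsigned int cpu, const struct patch *p)
{
    printf("CPU%u: applying rev %#x\n", cpu, p->rev);
}

int main(void)
{
    const unsigned int blob[] = { 0x25, 0x2a, 0x27 };  /* fake revisions */
    struct patch *p = parse_blob(blob, 3);
    unsigned int cpu;

    if ( !p )
        return 1;

    for ( cpu = 0; cpu < 4; cpu++ )       /* pretend 4 online CPUs */
        apply_on_cpu(cpu, p);

    free(p);
    return 0;
}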

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - remove the calling of ->compare_patch in microcode_update_cpu().
 - drop "microcode_" prefix for static function - microcode_parse_blob().
 - rebase and fix conflict

Changes in v8:
 - divide the original patch into three patches to improve readability
 - load an update on each cpu as long as the update covers current cpu
 - store an update after the first successful loading on a CPU
 - Make sure the current CPU (especially pf value) is covered
 by updates.

Changes in v7:
 - to handle load failure, unvalidated patches won't be cached. They
 are passed as function arguments. So if an update fails, no cleanup
 of the microcode cache is needed.
---
 xen/arch/x86/microcode.c        | 177 ++++++++++++++++++++++++++--------------
 xen/arch/x86/microcode_amd.c    |  38 +++++----
 xen/arch/x86/microcode_intel.c  |  66 +++++++--------
 xen/include/asm-x86/microcode.h |   5 +-
 4 files changed, 172 insertions(+), 114 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 0e9322a..a2febc7 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -189,12 +189,19 @@ static DEFINE_SPINLOCK(microcode_mutex);
 
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
-struct microcode_info {
-    unsigned int cpu;
-    uint32_t buffer_size;
-    int error;
-    char buffer[1];
-};
+/*
+ * Return a patch that covers current CPU. If there are multiple patches,
+ * return the one with the highest revision number. Return error If no
+ * patch is found and an error occurs during the parsing process. Otherwise
+ * return NULL.
+ */
+static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
+{
+    if ( likely(!microcode_ops->collect_cpu_info(&this_cpu(cpu_sig))) )
+        return microcode_ops->cpu_request_microcode(buf, len);
+
+    return NULL;
+}
 
 int microcode_resume_cpu(void)
 {
@@ -220,13 +227,6 @@ void microcode_free_patch(struct microcode_patch *microcode_patch)
     xfree(microcode_patch);
 }
 
-const struct microcode_patch *microcode_get_cache(void)
-{
-    ASSERT(spin_is_locked(&microcode_mutex));
-
-    return microcode_cache;
-}
-
 /* Return true if cache gets updated. Otherwise, return false */
 bool microcode_update_cache(struct microcode_patch *patch)
 {
@@ -250,49 +250,71 @@ bool microcode_update_cache(struct microcode_patch *patch)
     return true;
 }
 
-static int microcode_update_cpu(const void *buf, size_t size)
+/*
+ * Load a microcode update to current CPU.
+ *
+ * If no patch is provided, the cached patch will be loaded. Microcode update
+ * during APs bringup and CPU resuming falls into this case.
+ */
+static int microcode_update_cpu(const struct microcode_patch *patch)
 {
-    int err;
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    int err = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
-    spin_lock(&microcode_mutex);
+    if ( unlikely(err) )
+        return err;
 
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(buf, size);
-    spin_unlock(&microcode_mutex);
+    if ( patch )
+        err = microcode_ops->apply_microcode(patch);
+    else if ( microcode_cache )
+    {
+        spin_lock(&microcode_mutex);
+        err = microcode_ops->apply_microcode(microcode_cache);
+        if ( err == -EIO )
+        {
+            microcode_free_patch(microcode_cache);
+            microcode_cache = NULL;
+        }
+        spin_unlock(&microcode_mutex);
+    }
+    else
+        /* No patch to update */
+        err = -ENOENT;
 
     return err;
 }
 
-static long do_microcode_update(void *_info)
+static long do_microcode_update(void *patch)
 {
-    struct microcode_info *info = _info;
-    int error;
-
-    BUG_ON(info->cpu != smp_processor_id());
+    unsigned int cpu;
 
-    error = microcode_update_cpu(info->buffer, info->buffer_size);
-    if ( error )
-        info->error = error;
+    /* Store the patch after a successful loading */
+    if ( !microcode_update_cpu(patch) && patch )
+    {
+        spin_lock(&microcode_mutex);
+        microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
+        patch = NULL;
+    }
 
     if ( microcode_ops->end_update )
         microcode_ops->end_update();
 
-    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
-    if ( info->cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
+    if ( cpu < nr_cpu_ids )
+        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
 
-    error = info->error;
-    xfree(info);
-    return error;
+    /* Free the patch if no CPU has loaded it successfully. */
+    if ( patch )
+        microcode_free_patch(patch);
+
+    return 0;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
-    struct microcode_info *info;
+    void *buffer;
+    struct microcode_patch *patch;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -300,32 +322,44 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
-    info = xmalloc_bytes(sizeof(*info) + len);
-    if ( info == NULL )
+    buffer = xmalloc_bytes(len);
+    if ( !buffer )
         return -ENOMEM;
 
-    ret = copy_from_guest(info->buffer, buf, len);
-    if ( ret != 0 )
+    if ( copy_from_guest(buffer, buf, len) )
     {
-        xfree(info);
-        return ret;
+        ret = -EFAULT;
+        goto free;
     }
 
-    info->buffer_size = len;
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
-
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-        {
-            xfree(info);
-            return ret;
-        }
+            goto free;
     }
 
-    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    patch = parse_blob(buffer, len);
+    if ( IS_ERR(patch) )
+    {
+        ret = PTR_ERR(patch);
+        printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
+        goto free;
+    }
+
+    if ( !patch )
+    {
+        printk(XENLOG_INFO "No ucode found. Update aborted!\n");
+        ret = -EINVAL;
+        goto free;
+    }
+
+    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
+                                    do_microcode_update, patch);
+
+ free:
+    xfree(buffer);
+    return ret;
 }
 
 static int __init microcode_init(void)
@@ -372,23 +406,46 @@ int __init early_microcode_update_cpu(bool start_update)
 
     microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
-    if ( data )
+    if ( !data )
+        return -ENOMEM;
+
+    if ( start_update )
     {
-        if ( start_update && microcode_ops->start_update )
+        struct microcode_patch *patch;
+
+        if ( microcode_ops->start_update )
             rc = microcode_ops->start_update();
 
         if ( rc )
             return rc;
 
-        rc = microcode_update_cpu(data, len);
+        patch = parse_blob(data, len);
+        if ( IS_ERR(patch) )
+        {
+            printk(XENLOG_INFO "Parsing microcode blob error %ld\n",
+                   PTR_ERR(patch));
+            return PTR_ERR(patch);
+        }
+
+        if ( !patch )
+        {
+            printk(XENLOG_INFO "No ucode found. Update aborted!\n");
+            return -EINVAL;
+        }
 
-        if ( microcode_ops->end_update )
-            microcode_ops->end_update();
+        spin_lock(&microcode_mutex);
+        rc = microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
 
-        return rc;
+        ASSERT(rc);
     }
-    else
-        return -ENOMEM;
+
+    rc = microcode_update_cpu(NULL);
+
+    if ( microcode_ops->end_update )
+        microcode_ops->end_update();
+
+    return rc;
 }
 
 int __init early_microcode_init(void)
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 21cdfe0..6353323 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -447,9 +447,11 @@ static bool_t check_final_patch_levels(unsigned int cpu)
     return 0;
 }
 
-static int cpu_request_microcode(const void *buf, size_t bufsize)
+static struct microcode_patch *cpu_request_microcode(const void *buf,
+                                                     size_t bufsize)
 {
     struct microcode_amd *mc_amd;
+    struct microcode_patch *patch = NULL;
     size_t offset = 0;
     int error = 0;
     unsigned int current_cpu_id;
@@ -548,19 +550,22 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
             break;
         }
 
-        /* Update cache if this patch covers current CPU */
-        if ( microcode_fits(new_patch->mc_amd) != MIS_UCODE )
-            microcode_update_cache(new_patch);
-        else
-            microcode_free_patch(new_patch);
-
-        if ( match_cpu(microcode_get_cache()) )
+        /*
+         * If the new patch covers current CPU, compare patches and store the
+         * one with higher revision.
+         */
+        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
+             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
         {
-            error = apply_microcode(microcode_get_cache());
-            if ( error )
-                break;
+            struct microcode_patch *tmp = patch;
+
+            patch = new_patch;
+            new_patch = tmp;
         }
 
+        if ( new_patch )
+            microcode_free_patch(new_patch);
+
         if ( offset >= bufsize )
             break;
 
@@ -593,13 +598,10 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     xfree(mc_amd);
 
   out:
-    /*
-     * In some cases we may return an error even if processor's microcode has
-     * been updated. For example, the first patch in a container file is loaded
-     * successfully but subsequent container file processing encounters a
-     * failure.
-     */
-    return error;
+    if ( error && !patch )
+        patch = ERR_PTR(error);
+
+    return patch;
 }
 
 static int start_update(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 8c0008c..96b38f8 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -282,14 +282,9 @@ static enum microcode_match_result compare_patch(
                                                                 OLD_UCODE;
 }
 
-/*
- * return 0 - no update found
- * return 1 - found update
- * return < 0 - error
- */
-static int get_matching_microcode(const void *mc)
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_header_intel *mc_header)
 {
-    const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
     struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
@@ -298,25 +293,12 @@ static int get_matching_microcode(const void *mc)
     {
         xfree(new_patch);
         xfree(new_mc);
-        return -ENOMEM;
+        return ERR_PTR(-ENOMEM);
     }
-    memcpy(new_mc, mc, total_size);
+    memcpy(new_mc, mc_header, total_size);
     new_patch->mc_intel = new_mc;
 
-    /* Make sure that this patch covers current CPU */
-    if ( microcode_update_match(mc) == MIS_UCODE )
-    {
-        microcode_free_patch(new_patch);
-        return 0;
-    }
-
-    microcode_update_cache(new_patch);
-
-    pr_debug("microcode: CPU%d found a matching microcode update with"
-             " version %#x (current=%#x)\n",
-             smp_processor_id(), mc_header->rev, this_cpu(cpu_sig).rev);
-
-    return 1;
+    return new_patch;
 }
 
 static int apply_microcode(const struct microcode_patch *patch)
@@ -392,26 +374,44 @@ static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
     return offset + total_size;
 }
 
-static int cpu_request_microcode(const void *buf, size_t size)
+static struct microcode_patch *cpu_request_microcode(const void *buf,
+                                                     size_t size)
 {
     long offset = 0;
     int error = 0;
     void *mc;
+    struct microcode_patch *patch = NULL;
 
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
+        struct microcode_patch *new_patch;
+
         error = microcode_sanity_check(mc);
         if ( error )
             break;
-        error = get_matching_microcode(mc);
-        if ( error < 0 )
+
+        new_patch = alloc_microcode_patch(mc);
+        if ( IS_ERR(new_patch) )
+        {
+            error = PTR_ERR(new_patch);
             break;
+        }
+
         /*
-         * It's possible the data file has multiple matching ucode,
-         * lets keep searching till the latest version
+         * If the new patch covers current CPU, compare patches and store the
+         * one with higher revision.
          */
-        if ( error == 1 )
-            error = 0;
+        if ( (microcode_update_match(&new_patch->mc_intel->hdr) != MIS_UCODE) &&
+             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+        {
+            struct microcode_patch *tmp = patch;
+
+            patch = new_patch;
+            new_patch = tmp;
+        }
+
+        if ( new_patch )
+            microcode_free_patch(new_patch);
 
         xfree(mc);
     }
@@ -420,10 +420,10 @@ static int cpu_request_microcode(const void *buf, size_t size)
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode(microcode_get_cache());
+    if ( error && !patch )
+        patch = ERR_PTR(error);
 
-    return error;
+    return patch;
 }
 
 static const struct microcode_ops microcode_intel_ops = {
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 8c7de9d..8e71615 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -20,7 +20,8 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*cpu_request_microcode)(const void *buf, size_t size);
+    struct microcode_patch *(*cpu_request_microcode)(const void *buf,
+                                                     size_t size);
     int (*collect_cpu_info)(struct cpu_signature *csig);
     int (*apply_microcode)(const struct microcode_patch *patch);
     int (*start_update)(void);
@@ -41,8 +42,6 @@ struct cpu_signature {
 DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
 extern const struct microcode_ops *microcode_ops;
 
-const struct microcode_patch *microcode_get_cache(void);
-bool microcode_update_cache(struct microcode_patch *patch);
 void microcode_free_patch(struct microcode_patch *patch);
 
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (9 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-22 14:10   ` Roger Pau Monné
  2019-08-29 10:29   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
                   ` (4 subsequent siblings)
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Both paths load the cached patch. Since APs call the unified function,
microcode_update_one(), during wakeup, the 'start_update' parameter,
which was originally used to distinguish the BSP from APs, is
redundant. So remove this parameter.
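
A small standalone sketch of the unified entry point (illustrative
only; update_one() and the fake cache below are stand-ins): both the
resume path and AP bring-up call the same helper, which simply loads
whatever is cached.

#include <stdio.h>

struct patch { unsigned int rev; };

static struct patch cache = { .rev = 0x2a };
static int have_cache = 1;

static int update_one(void)
{
    if ( !have_cache )
        return -1;                 /* nothing cached */
    printf("loading cached rev %#x\n", cache.rev);
    return 0;
}

int main(void)
{
    /* Resume path and AP wakeup both funnel through the same call. */
    int rc_resume = update_one();
    int rc_ap     = update_one();

    return (rc_resume || rc_ap) ? 1 : 0;
}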

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Note that there is a functional change: resuming a CPU now calls
->end_update(), whereas previously it didn't. Not quite sure
whether that is correct.

Changes in v9:
 - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
   microcode_update_one()
 - rebase and fix conflicts.

Changes in v8:
 - split out from the previous patch
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
 xen/arch/x86/smpboot.c          |  5 +--
 xen/include/asm-x86/processor.h |  4 +-
 4 files changed, 44 insertions(+), 57 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 4f21903..24798d5 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -253,7 +253,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu();
+    microcode_update_one();
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index a2febc7..bdd9c9f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
     return NULL;
 }
 
-int microcode_resume_cpu(void)
-{
-    int err;
-    struct cpu_signature *sig = &this_cpu(cpu_sig);
-
-    if ( !microcode_ops )
-        return 0;
-
-    spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->apply_microcode(microcode_cache);
-    spin_unlock(&microcode_mutex);
-
-    return err;
-}
-
 void microcode_free_patch(struct microcode_patch *microcode_patch)
 {
     microcode_ops->free_patch(microcode_patch->mc);
@@ -384,11 +366,29 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-int __init early_microcode_update_cpu(bool start_update)
+/* Load a cached update to current cpu */
+int microcode_update_one(void)
+{
+    int rc;
+
+    if ( !microcode_ops )
+        return -EOPNOTSUPP;
+
+    rc = microcode_update_cpu(NULL);
+
+    if ( microcode_ops->end_update )
+        microcode_ops->end_update();
+
+    return rc;
+}
+
+/* BSP calls this function to parse ucode blob and then apply an update. */
+int __init early_microcode_update_cpu(void)
 {
     int rc = 0;
     void *data = NULL;
     size_t len;
+    struct microcode_patch *patch;
 
     if ( !microcode_ops )
         return -ENOSYS;
@@ -409,43 +409,33 @@ int __init early_microcode_update_cpu(bool start_update)
     if ( !data )
         return -ENOMEM;
 
-    if ( start_update )
-    {
-        struct microcode_patch *patch;
-
-        if ( microcode_ops->start_update )
-            rc = microcode_ops->start_update();
-
-        if ( rc )
-            return rc;
-
-        patch = parse_blob(data, len);
-        if ( IS_ERR(patch) )
-        {
-            printk(XENLOG_INFO "Parsing microcode blob error %ld\n",
-                   PTR_ERR(patch));
-            return PTR_ERR(patch);
-        }
+    if ( microcode_ops->start_update )
+        rc = microcode_ops->start_update();
 
-        if ( !patch )
-        {
-            printk(XENLOG_INFO "No ucode found. Update aborted!\n");
-            return -EINVAL;
-        }
+    if ( rc )
+        return rc;
 
-        spin_lock(&microcode_mutex);
-        rc = microcode_update_cache(patch);
-        spin_unlock(&microcode_mutex);
+    patch = parse_blob(data, len);
+    if ( IS_ERR(patch) )
+    {
+        printk(XENLOG_INFO "Parsing microcode blob error %ld\n",
+               PTR_ERR(patch));
+        return PTR_ERR(patch);
+    }
 
-        ASSERT(rc);
+    if ( !patch )
+    {
+        printk(XENLOG_INFO "No ucode found. Update aborted!\n");
+        return -EINVAL;
     }
 
-    rc = microcode_update_cpu(NULL);
+    spin_lock(&microcode_mutex);
+    rc = microcode_update_cache(patch);
+    spin_unlock(&microcode_mutex);
 
-    if ( microcode_ops->end_update )
-        microcode_ops->end_update();
+    ASSERT(rc);
 
-    return rc;
+    return microcode_update_one();
 }
 
 int __init early_microcode_init(void)
@@ -465,7 +455,7 @@ int __init early_microcode_init(void)
         microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
         if ( ucode_mod.mod_end || ucode_blob.size )
-            rc = early_microcode_update_cpu(true);
+            rc = early_microcode_update_cpu();
     }
 
     return rc;
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index c818cfc..e62a1ca 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -361,10 +361,7 @@ void start_secondary(void *unused)
 
     initialize_cpu_data(cpu);
 
-    if ( system_state <= SYS_STATE_smp_boot )
-        early_microcode_update_cpu(false);
-    else
-        microcode_resume_cpu();
+    microcode_update_one();
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 104faa9..2a76d90 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -568,9 +568,9 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(void);
-int early_microcode_update_cpu(bool start_update);
+int early_microcode_update_cpu(void);
 int early_microcode_init(void);
+int microcode_update_one(void);
 int microcode_init_intel(void);
 int microcode_init_amd(void);
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 57+ messages in thread

* [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (10 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-23  8:11   ` Roger Pau Monné
  2019-08-29 10:47   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
                   ` (3 subsequent siblings)
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

To create a microcode patch from a vendor-specific update,
alloc_microcode_patch() copied everything from the update, which
is not efficient. Essentially, we just need to go through the
ucodes in the blob, find the one with the newest revision and
install it into the microcode_patch. In the process, buffers
like mc_amd and equiv_cpu_table (on the AMD side) and mc (on the
Intel side) can be reused. The microcode_patch is now allocated
only once it is known that there is a matching ucode.
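
A standalone sketch of the buffer-reuse idea (illustrative only; the
types and next_candidate() helper are made up): keep a pointer to the
best buffer seen so far and free the losers, instead of copying every
candidate into a freshly allocated patch.

#include <stdio.h>
#include <stdlib.h>

struct ucode { unsigned int rev; };

static struct ucode *next_candidate(unsigned int rev)
{
    struct ucode *u = malloc(sizeof(*u));

    if ( u )
        u->rev = rev;
    return u;
}

int main(void)
{
    const unsigned int revs[] = { 0x25, 0x2a, 0x27 };
    struct ucode *saved = NULL;
    unsigned int i;

    for ( i = 0; i < 3; i++ )
    {
        struct ucode *u = next_candidate(revs[i]);

        if ( !u )
            break;

        if ( !saved || u->rev > saved->rev )
        {
            free(saved);          /* drop the older buffer, keep the newer */
            saved = u;
        }
        else
            free(u);
    }

    if ( saved )
        printf("kept rev %#x without extra copies\n", saved->rev);

    free(saved);
    return 0;
}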

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - new
---
 xen/arch/x86/microcode_amd.c   | 99 +++++++++++++++---------------------------
 xen/arch/x86/microcode_intel.c | 65 ++++++++++-----------------
 2 files changed, 58 insertions(+), 106 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 6353323..ec1c2eb 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -194,36 +194,6 @@ static bool match_cpu(const struct microcode_patch *patch)
     return patch && (microcode_fits(patch->mc_amd) == NEW_UCODE);
 }
 
-static struct microcode_patch *alloc_microcode_patch(
-    const struct microcode_amd *mc_amd)
-{
-    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
-    struct microcode_amd *cache = xmalloc(struct microcode_amd);
-    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
-    struct equiv_cpu_entry *equiv_cpu_table =
-                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
-
-    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
-    {
-        xfree(microcode_patch);
-        xfree(cache);
-        xfree(mpb);
-        xfree(equiv_cpu_table);
-        return ERR_PTR(-ENOMEM);
-    }
-
-    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
-    cache->mpb = mpb;
-    cache->mpb_size = mc_amd->mpb_size;
-    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
-           mc_amd->equiv_cpu_table_size);
-    cache->equiv_cpu_table = equiv_cpu_table;
-    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
-    microcode_patch->mc_amd = cache;
-
-    return microcode_patch;
-}
-
 static void free_patch(void *mc)
 {
     struct microcode_amd *mc_amd = mc;
@@ -320,18 +290,10 @@ static int get_ucode_from_buffer_amd(
         return -EINVAL;
     }
 
-    if ( mc_amd->mpb_size < mpbuf->len )
-    {
-        if ( mc_amd->mpb )
-        {
-            xfree(mc_amd->mpb);
-            mc_amd->mpb_size = 0;
-        }
-        mc_amd->mpb = xmalloc_bytes(mpbuf->len);
-        if ( mc_amd->mpb == NULL )
-            return -ENOMEM;
-        mc_amd->mpb_size = mpbuf->len;
-    }
+    mc_amd->mpb = xmalloc_bytes(mpbuf->len);
+    if ( mc_amd->mpb == NULL )
+        return -ENOMEM;
+    mc_amd->mpb_size = mpbuf->len;
     memcpy(mc_amd->mpb, mpbuf->data, mpbuf->len);
 
     pr_debug("microcode: CPU%d size %zu, block size %u offset %zu equivID %#x rev %#x\n",
@@ -451,8 +413,9 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
                                                      size_t bufsize)
 {
     struct microcode_amd *mc_amd;
+    struct microcode_header_amd *saved = NULL;
     struct microcode_patch *patch = NULL;
-    size_t offset = 0;
+    size_t offset = 0, saved_size = 0;
     int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
@@ -542,29 +505,21 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
-
-        if ( IS_ERR(new_patch) )
-        {
-            error = PTR_ERR(new_patch);
-            break;
-        }
-
         /*
-         * If the new patch covers current CPU, compare patches and store the
+         * If the new ucode covers current CPU, compare ucodes and store the
          * one with higher revision.
          */
-        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
-             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+#define REV_ID(mpb) (((struct microcode_header_amd *)(mpb))->processor_rev_id)
+        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
+             (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
+#undef REV_ID
         {
-            struct microcode_patch *tmp = patch;
-
-            patch = new_patch;
-            new_patch = tmp;
+            xfree(saved);
+            saved = mc_amd->mpb;
+            saved_size = mc_amd->mpb_size;
         }
-
-        if ( new_patch )
-            microcode_free_patch(new_patch);
+        else
+            xfree(mc_amd->mpb);
 
         if ( offset >= bufsize )
             break;
@@ -593,9 +548,25 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
              *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
             break;
     }
-    xfree(mc_amd->mpb);
-    xfree(mc_amd->equiv_cpu_table);
-    xfree(mc_amd);
+
+    if ( saved )
+    {
+        mc_amd->mpb = saved;
+        mc_amd->mpb_size = saved_size;
+        patch = xmalloc(struct microcode_patch);
+        if ( patch )
+            patch->mc_amd = mc_amd;
+        else
+        {
+            free_patch(mc_amd);
+            error = -ENOMEM;
+        }
+    }
+    else
+    {
+        mc_amd->mpb = NULL;
+        free_patch(mc_amd);
+    }
 
   out:
     if ( error && !patch )
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 96b38f8..ae5759f 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -282,25 +282,6 @@ static enum microcode_match_result compare_patch(
                                                                 OLD_UCODE;
 }
 
-static struct microcode_patch *alloc_microcode_patch(
-    const struct microcode_header_intel *mc_header)
-{
-    unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc = xmalloc_bytes(total_size);
-    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
-
-    if ( !new_patch || !new_mc )
-    {
-        xfree(new_patch);
-        xfree(new_mc);
-        return ERR_PTR(-ENOMEM);
-    }
-    memcpy(new_mc, mc_header, total_size);
-    new_patch->mc_intel = new_mc;
-
-    return new_patch;
-}
-
 static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
@@ -379,47 +360,47 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
 {
     long offset = 0;
     int error = 0;
-    void *mc;
+    struct microcode_intel *mc, *saved = NULL;
     struct microcode_patch *patch = NULL;
 
-    while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
+    while ( (offset = get_next_ucode_from_buffer((void **)&mc, buf,
+                                                 size, offset)) > 0 )
     {
-        struct microcode_patch *new_patch;
-
         error = microcode_sanity_check(mc);
         if ( error )
-            break;
-
-        new_patch = alloc_microcode_patch(mc);
-        if ( IS_ERR(new_patch) )
         {
-            error = PTR_ERR(new_patch);
+            xfree(mc);
             break;
         }
 
         /*
-         * If the new patch covers current CPU, compare patches and store the
+         * If the new update covers current CPU, compare updates and store the
          * one with higher revision.
          */
-        if ( (microcode_update_match(&new_patch->mc_intel->hdr) != MIS_UCODE) &&
-             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+        if ( (microcode_update_match(&mc->hdr) != MIS_UCODE) &&
+             (!saved || (mc->hdr.rev > saved->hdr.rev)) )
         {
-            struct microcode_patch *tmp = patch;
-
-            patch = new_patch;
-            new_patch = tmp;
+            xfree(saved);
+            saved = mc;
         }
-
-        if ( new_patch )
-            microcode_free_patch(new_patch);
-
-        xfree(mc);
+        else
+            xfree(mc);
     }
-    if ( offset > 0 )
-        xfree(mc);
     if ( offset < 0 )
         error = offset;
 
+    if ( saved )
+    {
+        patch = xmalloc(struct microcode_patch);
+        if ( patch )
+            patch->mc_intel = saved;
+        else
+        {
+            xfree(saved);
+            error = -ENOMEM;
+        }
+    }
+
     if ( error && !patch )
         patch = ERR_PTR(error);
 
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (11 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19 10:27   ` Sergey Dyasli
  2019-08-29 12:06   ` Jan Beulich
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 14/15] microcode: remove microcode_update_lock Chao Gao
                   ` (2 subsequent siblings)
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Borislav Petkov, Ashok Raj, Wei Liu, Jun Nakajima,
	Andrew Cooper, Jan Beulich, Thomas Gleixner, Chao Gao,
	Roger Pau Monné

This patch ports microcode improvement patches from linux kernel.

Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.

Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - log __builtin_return_address(0) on timeout
 - divide CPUs into three logical sets and they will call different
 functions during ucode loading. The 'control thread' is chosen to
 coordinate ucode loading on all CPUs. Since only control thread would
 set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
 - s/rep_nop/cpu_relax
 - each thread updates its revision number itself
 - add XENLOG_ERR prefix for each line of multi-line log messages

Changes in v8:
 - to support blocking #NMI handling during loading ucode
   * introduce a flag, 'loading_state', to mark the start or end of
     ucode loading.
   * use a bitmap for cpu callin since a cpu may stay in #NMI handling and
     there are then two places for a cpu to call in; a bit set twice won't
     be counted twice.
   * don't wait for all CPUs to call out, just wait for the CPUs that perform
     the update. We have to do this because some threads may be stuck in NMI
     handling (and thus cannot reach the rendezvous).
 - emit a warning if the system stays in stop_machine context for more
 than 1s
 - comment that rdtsc is fine while loading an update
 - use cmpxchg() to avoid panic being called on multiple CPUs
 - Propagate revision number to other threads
 - refine comments and prompt messages

Changes in v7:
 - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
 - reword the comment above microcode_update_cpu() to clearly state that
 one thread per core should do the update.
---
 xen/arch/x86/microcode.c | 289 +++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 267 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index bdd9c9f..91f9e81 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -30,18 +30,52 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/stop_machine.h>
 #include <xen/tasklet.h>
 #include <xen/guest_access.h>
 #include <xen/earlycpio.h>
+#include <xen/watchdog.h>
 
+#include <asm/delay.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
 
+/*
+ * Before performing a late microcode update on any thread, we
+ * rendezvous all cpus in stop_machine context. The timeout for
+ * waiting for cpu rendezvous is 30ms. It is the timeout used by
+ * live patching
+ */
+#define MICROCODE_CALLIN_TIMEOUT_US 30000
+
+/*
+ * Timeout for each thread to complete update is set to 1s. It is a
+ * conservative choice considering all possible interference.
+ */
+#define MICROCODE_UPDATE_TIMEOUT_US 1000000
+
 static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
+static unsigned int nr_cores;
+
+/*
+ * These states help to coordinate CPUs during loading an update.
+ *
+ * The semantics of each state is as follow:
+ *  - LOADING_PREPARE: initial state of 'loading_state'.
+ *  - LOADING_CALLIN: CPUs are allowed to callin.
+ *  - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
+ *  - LOADING_EXIT: ucode loading is done or aborted.
+ */
+static enum {
+    LOADING_PREPARE,
+    LOADING_CALLIN,
+    LOADING_ENTER,
+    LOADING_EXIT,
+} loading_state;
 
 /*
  * If we scan the initramfs.cpio for the early microcode code
@@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 /*
+ * Count the CPUs that have entered, exited the rendezvous and succeeded in
+ * microcode update during late microcode update respectively.
+ *
+ * Note that a bitmap is used for callin to allow a cpu to set a bit multiple
+ * times. It is required because a cpu may busy-loop in #NMI handling.
+ */
+static cpumask_t cpu_callin_map;
+static atomic_t cpu_out, cpu_updated;
+
+/*
  * Return a patch that covers current CPU. If there are multiple patches,
  * return the one with the highest revision number. Return error If no
  * patch is found and an error occurs during the parsing process. Otherwise
@@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch *patch)
     return true;
 }
 
+/* Wait for a condition to be met with a timeout (us). */
+static int wait_for_condition(int (*func)(void *data), void *data,
+                         unsigned int timeout)
+{
+    while ( !func(data) )
+    {
+        if ( !timeout-- )
+        {
+            printk("CPU%u: Timeout in %pS\n",
+                   smp_processor_id(), __builtin_return_address(0));
+            return -EBUSY;
+        }
+        udelay(1);
+    }
+
+    return 0;
+}
+
+static int wait_cpu_callin(void *nr)
+{
+    return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
+}
+
+static int wait_cpu_callout(void *nr)
+{
+    return atomic_read(&cpu_out) >= (unsigned long)nr;
+}
+
 /*
  * Load a microcode update to current CPU.
  *
@@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
     return err;
 }
 
-static long do_microcode_update(void *patch)
+static int slave_thread_fn(void)
+{
+    unsigned int cpu = smp_processor_id();
+    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
+
+    while ( loading_state != LOADING_CALLIN )
+        cpu_relax();
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+
+    /* Copy update revision from the "master" thread. */
+    this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev;
+
+    return 0;
+}
+
+static int master_thread_fn(const struct microcode_patch *patch)
+{
+    unsigned int cpu = smp_processor_id();
+    int ret = 0;
+
+    while ( loading_state != LOADING_CALLIN )
+        cpu_relax();
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    while ( loading_state != LOADING_ENTER )
+        cpu_relax();
+
+    /*
+     * If an error happened, control thread would set 'loading_state'
+     * to LOADING_EXIT. Don't perform ucode loading for this case
+     */
+    if ( loading_state == LOADING_EXIT )
+        return ret;
+
+    ret = microcode_ops->apply_microcode(patch);
+    if ( !ret )
+        atomic_inc(&cpu_updated);
+    atomic_inc(&cpu_out);
+
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+
+    return ret;
+}
+
+static int control_thread_fn(const struct microcode_patch *patch)
 {
-    unsigned int cpu;
+    unsigned int cpu = smp_processor_id(), done;
+    unsigned long tick;
+    int ret;
 
-    /* Store the patch after a successful loading */
-    if ( !microcode_update_cpu(patch) && patch )
+    /* Allow threads to call in */
+    loading_state = LOADING_CALLIN;
+    smp_mb();
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    /* Waiting for all threads calling in */
+    ret = wait_for_condition(wait_cpu_callin,
+                             (void *)(unsigned long)num_online_cpus(),
+                             MICROCODE_CALLIN_TIMEOUT_US);
+    if ( ret ) {
+        loading_state = LOADING_EXIT;
+        return ret;
+    }
+
+    /* Let master threads load the given ucode update */
+    loading_state = LOADING_ENTER;
+    smp_mb();
+
+    ret = microcode_ops->apply_microcode(patch);
+    if ( !ret )
+        atomic_inc(&cpu_updated);
+    atomic_inc(&cpu_out);
+
+    tick = rdtsc_ordered();
+    /* Waiting for master threads finishing update */
+    done = atomic_read(&cpu_out);
+    while ( done != nr_cores )
     {
-        spin_lock(&microcode_mutex);
-        microcode_update_cache(patch);
-        spin_unlock(&microcode_mutex);
-        patch = NULL;
+        /*
+         * During each timeout interval, at least a CPU is expected to
+         * finish its update. Otherwise, something goes wrong.
+         *
+         * Note that RDTSC (in wait_for_condition()) is safe for threads to
+         * execute while waiting for completion of loading an update.
+         */
+        if ( wait_for_condition(wait_cpu_callout,
+                                (void *)(unsigned long)(done + 1),
+                                MICROCODE_UPDATE_TIMEOUT_US) )
+            panic("Timeout when finished updating microcode (finished %u/%u)",
+                  done, nr_cores);
+
+        /* Print warning message once if long time is spent here */
+        if ( tick && rdtsc_ordered() - tick >= cpu_khz * 1000 )
+        {
+            printk(XENLOG_WARNING
+                   "WARNING: UPDATING MICROCODE HAS CONSUMED MORE THAN 1 SECOND!\n");
+            tick = 0;
+        }
+        done = atomic_read(&cpu_out);
     }
 
-    if ( microcode_ops->end_update )
-        microcode_ops->end_update();
+    /* Mark loading is done to unblock other threads */
+    loading_state = LOADING_EXIT;
+    smp_mb();
 
-    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
-    if ( cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
+    return ret;
+}
 
-    /* Free the patch if no CPU has loaded it successfully. */
-    if ( patch )
-        microcode_free_patch(patch);
+static int do_microcode_update(void *patch)
+{
+    unsigned int cpu = smp_processor_id();
+    /*
+     * "master" thread is the one with the lowest thread id among all sibling
+     * threads in a core or a compute unit. It is chosen to load a microcode
+     * update.
+     */
+    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
+    int ret;
 
-    return 0;
+    /*
+     * The control thread sets the state to coordinate ucode loading. Master threads
+     * load the given ucode patch. Slave threads just wait for the completion
+     * of the ucode loading process.
+     */
+    if ( cpu == cpumask_first(&cpu_online_map) )
+        ret = control_thread_fn(patch);
+    else if ( cpu == master )
+        ret = master_thread_fn(patch);
+    else
+        ret = slave_thread_fn();
+
+    if ( microcode_ops->end_update )
+        microcode_ops->end_update();
+
+    return ret;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
+    unsigned int cpu, updated;
     struct microcode_patch *patch;
 
     if ( len != (uint32_t)len )
@@ -314,11 +504,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         goto free;
     }
 
+    /* cpu_online_map must not change during update */
+    if ( !get_cpu_maps() )
+    {
+        ret = -EBUSY;
+        goto free;
+    }
+
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-            goto free;
+            goto put;
     }
 
     patch = parse_blob(buffer, len);
@@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     {
         ret = PTR_ERR(patch);
         printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
-        goto free;
+        goto put;
     }
 
     if ( !patch )
     {
         printk(XENLOG_INFO "No ucode found. Update aborted!\n");
         ret = -EINVAL;
-        goto free;
+        goto put;
+    }
+
+    cpumask_clear(&cpu_callin_map);
+    atomic_set(&cpu_out, 0);
+    atomic_set(&cpu_updated, 0);
+    loading_state = LOADING_PREPARE;
+
+    /* Calculate the number of online CPU cores */
+    nr_cores = 0;
+    for_each_online_cpu(cpu)
+        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+            nr_cores++;
+
+    printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
+
+    /*
+     * We intend to disable interrupt for long time, which may lead to
+     * watchdog timeout.
+     */
+    watchdog_disable();
+    /*
+     * Late loading dance. Why the heavy-handed stop_machine effort?
+     *
+     * - HT siblings must be idle and not execute other code while the other
+     *   sibling is loading microcode in order to avoid any negative
+     *   interactions caused by the loading.
+     *
+     * - In addition, microcode update on the cores must be serialized until
+     *   this requirement can be relaxed in the future. Right now, this is
+     *   conservative and good.
+     */
+    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    watchdog_enable();
+
+    updated = atomic_read(&cpu_updated);
+    if ( updated > 0 )
+    {
+        spin_lock(&microcode_mutex);
+        microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
     }
+    else
+        microcode_free_patch(patch);
 
-    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
-                                    do_microcode_update, patch);
+    if ( updated && updated != nr_cores )
+        printk(XENLOG_ERR "ERROR: Updating microcode succeeded on %u cores and failed\n"
+               XENLOG_ERR "on other %u cores. A system with differing microcode\n"
+               XENLOG_ERR "revisions is considered unstable. Please reboot and do not\n"
+               XENLOG_ERR "load the microcode that triggers this warning!\n",
+               updated, nr_cores - updated);
 
+ put:
+    put_cpu_maps();
  free:
     xfree(buffer);
     return ret;
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 14/15] microcode: remove microcode_update_lock
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (12 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
  2019-08-22  7:51 ` [Xen-devel] [PATCH v9 00/15] improve late microcode loading Sergey Dyasli
  15 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

microcode_update_lock is to prevent logic threads of a same core from
updating microcode at the same time. But due to using a global lock, it
also prevented parallel microcode updating on different cores.

Remove this lock in order to update microcode in parallel. It is safe
because we have already ensured serialization of sibling threads at the
caller side.
1. For late microcode update, do_microcode_update() ensures that only one
   sibling thread of a core can update microcode.
2. For microcode update during system startup or CPU-hotplug,
   microcode_mutex guarantees update serialization of logical threads.
3. get/put_cpu_maps() prevents concurrency between CPU-hotplug and
   late microcode update.

Note that the printk calls in apply_microcode() and svm_host_osvw_init()
(for AMD only) are still processed sequentially.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v7:
 - reworked. Remove complex lock logics introduced in v5 and v6. The microcode
 patch to be applied is passed as an argument without any global variable. Thus
 no lock is added to serialize potential readers/writers. Callers of
 apply_microcode() will guarantee the correctness: the patch pointed to by the
 arguments won't be changed by others.

Changes in v6:
 - introduce early_ucode_update_lock to serialize early ucode update.

Changes in v5:
 - newly add
---
 xen/arch/x86/microcode_amd.c   | 8 +-------
 xen/arch/x86/microcode_intel.c | 8 +-------
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index ec1c2eb..8685b3e 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -74,9 +74,6 @@ struct mpbhdr {
     uint8_t data[];
 };
 
-/* serialize access to the physical write */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 /* See comment in start_update() for cases when this routine fails */
 static int collect_cpu_info(struct cpu_signature *csig)
 {
@@ -220,7 +217,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-    unsigned long flags;
     uint32_t rev;
     int hw_err;
     unsigned int cpu = smp_processor_id();
@@ -232,15 +228,13 @@ static int apply_microcode(const struct microcode_patch *patch)
 
     hdr = patch->mc_amd->mpb;
 
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
 
     /* get patch id after patching */
     rdmsrl(MSR_AMD_PATCHLEVEL, rev);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
-
     /*
      * Some processors leave the ucode blob mapping as UC after the update.
      * Flush the mapping to regain normal cacheability.
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index ae5759f..6186461 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -93,9 +93,6 @@ struct extended_sigtable {
 
 #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
 
-/* serialize access to the physical write to MSR 0x79 */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 static int collect_cpu_info(struct cpu_signature *csig)
 {
     unsigned int cpu_num = smp_processor_id();
@@ -284,7 +281,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-    unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
@@ -296,8 +292,7 @@ static int apply_microcode(const struct microcode_patch *patch)
 
     mc_intel = patch->mc_intel;
 
-    /* serialize access to the physical write to MSR 0x79 */
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     /* write microcode via MSR 0x79 */
     wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
@@ -310,7 +305,6 @@ static int apply_microcode(const struct microcode_patch *patch)
     rdmsrl(MSR_IA32_UCODE_REV, msr_content);
     val[1] = (uint32_t)(msr_content >> 32);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
     if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
-- 
1.8.3.1



* [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (13 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 14/15] microcode: remove microcode_update_lock Chao Gao
@ 2019-08-19  1:25 ` Chao Gao
  2019-08-23  8:46   ` Sergey Dyasli
  2019-08-29 12:22   ` Jan Beulich
  2019-08-22  7:51 ` [Xen-devel] [PATCH v9 00/15] improve late microcode loading Sergey Dyasli
  15 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19  1:25 UTC (permalink / raw)
  To: xen-devel
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Chao Gao,
	Roger Pau Monné

Register an NMI callback. This callback busy-loops on threads which are
waiting for loading completion. The control thread sends an NMI to slave
threads so that they park in the NMI handler and cannot accept further
NMIs during ucode loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v9:
 - the control thread sends an NMI to all other threads. Slave threads will
 stay in the NMI handler to prevent NMI acceptance during ucode
 loading. Note that a self-NMI is invalid according to the SDM.
 - s/rep_nop/cpu_relax
 - remove debug message in microcode_nmi_callback(). Printing a debug
 message would take a long time and the control thread may time out.
 - rebase and fix conflicts

Changes in v8:
 - new
---
 xen/arch/x86/microcode.c | 28 ++++++++++++++++++++++------
 1 file changed, 22 insertions(+), 6 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 91f9e81..d943835 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -38,6 +38,7 @@
 
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -339,14 +340,8 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
 
 static int slave_thread_fn(void)
 {
-    unsigned int cpu = smp_processor_id();
     unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
 
-    while ( loading_state != LOADING_CALLIN )
-        cpu_relax();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
-
     while ( loading_state != LOADING_EXIT )
         cpu_relax();
 
@@ -399,6 +394,8 @@ static int control_thread_fn(const struct microcode_patch *patch)
 
     cpumask_set_cpu(cpu, &cpu_callin_map);
 
+    smp_send_nmi_allbutself();
+
     /* Waiting for all threads calling in */
     ret = wait_for_condition(wait_cpu_callin,
                              (void *)(unsigned long)num_online_cpus(),
@@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
     return ret;
 }
 
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    /* The first thread of a core is to load an update. Don't block it. */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
+         loading_state != LOADING_CALLIN )
+        return 0;
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+
+    return 0;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
+    nmi_callback_t *saved_nmi_callback;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -551,6 +564,8 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * watchdog timeout.
      */
     watchdog_disable();
+
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
      *
@@ -563,6 +578,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      *   conservative and good.
      */
     ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();
 
     updated = atomic_read(&cpu_updated);
-- 
1.8.3.1



* Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-08-19 10:27   ` Sergey Dyasli
  2019-08-19 14:49     ` Chao Gao
  2019-08-29 12:06   ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Sergey Dyasli @ 2019-08-19 10:27 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Kevin Tian,
	Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, Jun Nakajima,
	Thomas Gleixner, Borislav Petkov, Roger Pau Monné

On 19/08/2019 02:25, Chao Gao wrote:
> This patch ports microcode improvement patches from linux kernel.
> 
> Before you read any further: the early loading method is still the
> preferred one and you should always do that. The following patch is
> improving the late loading mechanism for long running jobs and cloud use
> cases.
> 
> Gather all cores and serialize the microcode update on them by doing it
> one-by-one to make the late update process as reliable as possible and
> avoid potential issues caused by the microcode update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Chao Gao <chao.gao@intel.com>
> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v9:
>  - log __builtin_return_address(0) on timeout
>  - divide CPUs into three logical sets and they will call different
>  functions during ucode loading. The 'control thread' is chosen to
>  coordinate ucode loading on all CPUs. Since only control thread would
>  set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
>  - s/rep_nop/cpu_relax
>  - each thread updates its revision number itself
>  - add XENLOG_ERR prefix for each line of multi-line log messages
> 
> Changes in v8:
>  - to support blocking #NMI handling during loading ucode
>    * introduce a flag, 'loading_state', to mark the start or end of
>      ucode loading.
>    * use a bitmap for cpu callin since a cpu may stay in #NMI handling and
>      there are then two places for a cpu to call in; a bit set twice won't
>      be counted twice.
>    * don't wait for all CPUs to call out, just wait for the CPUs that perform
>      the update. We have to do this because some threads may be stuck in NMI
>      handling (and thus cannot reach the rendezvous).
>  - emit a warning if the system stays in stop_machine context for more
>  than 1s
>  - comment that rdtsc is fine while loading an update
>  - use cmpxchg() to avoid panic being called on multiple CPUs
>  - Propagate revision number to other threads
>  - refine comments and prompt messages
> 
> Changes in v7:
>  - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
>  - reword the comment above microcode_update_cpu() to clearly state that
>  one thread per core should do the update.
> ---
>  xen/arch/x86/microcode.c | 289 +++++++++++++++++++++++++++++++++++++++++++----
>  1 file changed, 267 insertions(+), 22 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index bdd9c9f..91f9e81 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -30,18 +30,52 @@
>  #include <xen/smp.h>
>  #include <xen/softirq.h>
>  #include <xen/spinlock.h>
> +#include <xen/stop_machine.h>
>  #include <xen/tasklet.h>
>  #include <xen/guest_access.h>
>  #include <xen/earlycpio.h>
> +#include <xen/watchdog.h>
>  
> +#include <asm/delay.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
>  #include <asm/setup.h>
>  #include <asm/microcode.h>
>  
> +/*
> + * Before performing a late microcode update on any thread, we
> + * rendezvous all cpus in stop_machine context. The timeout for
> + * waiting for cpu rendezvous is 30ms. It is the timeout used by
> + * live patching
> + */
> +#define MICROCODE_CALLIN_TIMEOUT_US 30000
> +
> +/*
> + * Timeout for each thread to complete update is set to 1s. It is a
> + * conservative choice considering all possible interference.
> + */
> +#define MICROCODE_UPDATE_TIMEOUT_US 1000000
> +
>  static module_t __initdata ucode_mod;
>  static signed int __initdata ucode_mod_idx;
>  static bool_t __initdata ucode_mod_forced;
> +static unsigned int nr_cores;
> +
> +/*
> + * These states help to coordinate CPUs during loading an update.
> + *
> + * The semantics of each state is as follow:
> + *  - LOADING_PREPARE: initial state of 'loading_state'.
> + *  - LOADING_CALLIN: CPUs are allowed to callin.
> + *  - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
> + *  - LOADING_EXIT: ucode loading is done or aborted.
> + */
> +static enum {
> +    LOADING_PREPARE,
> +    LOADING_CALLIN,
> +    LOADING_ENTER,
> +    LOADING_EXIT,
> +} loading_state;
>  
>  /*
>   * If we scan the initramfs.cpio for the early microcode code
> @@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
>  DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>  
>  /*
> + * Count the CPUs that have entered, exited the rendezvous and succeeded in
> + * microcode update during late microcode update respectively.
> + *
> + * Note that a bitmap is used for callin to allow a cpu to set a bit multiple
> + * times. It is required because a cpu may busy-loop in #NMI handling.
> + */
> +static cpumask_t cpu_callin_map;
> +static atomic_t cpu_out, cpu_updated;
> +
> +/*
>   * Return a patch that covers current CPU. If there are multiple patches,
>   * return the one with the highest revision number. Return error If no
>   * patch is found and an error occurs during the parsing process. Otherwise
> @@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch *patch)
>      return true;
>  }
>  
> +/* Wait for a condition to be met with a timeout (us). */
> +static int wait_for_condition(int (*func)(void *data), void *data,
> +                         unsigned int timeout)
> +{
> +    while ( !func(data) )
> +    {
> +        if ( !timeout-- )
> +        {
> +            printk("CPU%u: Timeout in %pS\n",
> +                   smp_processor_id(), __builtin_return_address(0));
> +            return -EBUSY;
> +        }
> +        udelay(1);
> +    }
> +
> +    return 0;
> +}
> +
> +static int wait_cpu_callin(void *nr)
> +{
> +    return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
> +}
> +
> +static int wait_cpu_callout(void *nr)
> +{
> +    return atomic_read(&cpu_out) >= (unsigned long)nr;
> +}
> +
>  /*
>   * Load a microcode update to current CPU.
>   *
> @@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>      return err;
>  }
>  
> -static long do_microcode_update(void *patch)
> +static int slave_thread_fn(void)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> +
> +    while ( loading_state != LOADING_CALLIN )
> +        cpu_relax();
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    /* Copy update revision from the "master" thread. */
> +    this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev;
> +
> +    return 0;
> +}
> +
> +static int master_thread_fn(const struct microcode_patch *patch)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    int ret = 0;
> +
> +    while ( loading_state != LOADING_CALLIN )
> +        cpu_relax();
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_ENTER )
> +        cpu_relax();

If I'm reading it right, this will wait forever in case when...

> +
> +    /*
> +     * If an error happened, control thread would set 'loading_state'
> +     * to LOADING_EXIT. Don't perform ucode loading for this case
> +     */
> +    if ( loading_state == LOADING_EXIT )
> +        return ret;
> +
> +    ret = microcode_ops->apply_microcode(patch);
> +    if ( !ret )
> +        atomic_inc(&cpu_updated);
> +    atomic_inc(&cpu_out);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    return ret;
> +}
> +
> +static int control_thread_fn(const struct microcode_patch *patch)
>  {
> -    unsigned int cpu;
> +    unsigned int cpu = smp_processor_id(), done;
> +    unsigned long tick;
> +    int ret;
>  
> -    /* Store the patch after a successful loading */
> -    if ( !microcode_update_cpu(patch) && patch )
> +    /* Allow threads to call in */
> +    loading_state = LOADING_CALLIN;
> +    smp_mb();
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    /* Waiting for all threads calling in */
> +    ret = wait_for_condition(wait_cpu_callin,
> +                             (void *)(unsigned long)num_online_cpus(),
> +                             MICROCODE_CALLIN_TIMEOUT_US);
> +    if ( ret ) {
> +        loading_state = LOADING_EXIT;
> +        return ret;
> +    }

...this condition holds. Have you actually tested this case?

> +
> +    /* Let master threads load the given ucode update */
> +    loading_state = LOADING_ENTER;
> +    smp_mb();
> +
> +    ret = microcode_ops->apply_microcode(patch);
> +    if ( !ret )
> +        atomic_inc(&cpu_updated);
> +    atomic_inc(&cpu_out);
> +
> +    tick = rdtsc_ordered();
> +    /* Waiting for master threads finishing update */
> +    done = atomic_read(&cpu_out);
> +    while ( done != nr_cores )
>      {
> -        spin_lock(&microcode_mutex);
> -        microcode_update_cache(patch);
> -        spin_unlock(&microcode_mutex);
> -        patch = NULL;
> +        /*
> +         * During each timeout interval, at least a CPU is expected to
> +         * finish its update. Otherwise, something goes wrong.
> +         *
> +         * Note that RDTSC (in wait_for_condition()) is safe for threads to
> +         * execute while waiting for completion of loading an update.
> +         */
> +        if ( wait_for_condition(wait_cpu_callout,
> +                                (void *)(unsigned long)(done + 1),
> +                                MICROCODE_UPDATE_TIMEOUT_US) )
> +            panic("Timeout when finished updating microcode (finished %u/%u)",
> +                  done, nr_cores);
> +
> +        /* Print warning message once if long time is spent here */
> +        if ( tick && rdtsc_ordered() - tick >= cpu_khz * 1000 )
> +        {
> +            printk(XENLOG_WARNING
> +                   "WARNING: UPDATING MICROCODE HAS CONSUMED MORE THAN 1 SECOND!\n");
> +            tick = 0;
> +        }
> +        done = atomic_read(&cpu_out);
>      }
>  
> -    if ( microcode_ops->end_update )
> -        microcode_ops->end_update();
> +    /* Mark loading is done to unblock other threads */
> +    loading_state = LOADING_EXIT;
> +    smp_mb();
>  
> -    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
> -    if ( cpu < nr_cpu_ids )
> -        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
> +    return ret;
> +}
>  
> -    /* Free the patch if no CPU has loaded it successfully. */
> -    if ( patch )
> -        microcode_free_patch(patch);
> +static int do_microcode_update(void *patch)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    /*
> +     * "master" thread is the one with the lowest thread id among all sibling
> +     * threads in a core or a compute unit. It is chosen to load a microcode
> +     * update.
> +     */
> +    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> +    int ret;
>  
> -    return 0;
> +    /*
> +     * The control thread sets the state to coordinate ucode loading. Master threads
> +     * load the given ucode patch. Slave threads just wait for the completion
> +     * of the ucode loading process.
> +     */
> +    if ( cpu == cpumask_first(&cpu_online_map) )
> +        ret = control_thread_fn(patch);
> +    else if ( cpu == master )
> +        ret = master_thread_fn(patch);
> +    else
> +        ret = slave_thread_fn();
> +
> +    if ( microcode_ops->end_update )
> +        microcode_ops->end_update();
> +
> +    return ret;
>  }
>  
>  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>  {
>      int ret;
>      void *buffer;
> +    unsigned int cpu, updated;
>      struct microcode_patch *patch;
>  
>      if ( len != (uint32_t)len )
> @@ -314,11 +504,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          goto free;
>      }
>  
> +    /* cpu_online_map must not change during update */
> +    if ( !get_cpu_maps() )
> +    {
> +        ret = -EBUSY;
> +        goto free;
> +    }
> +
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
>          if ( ret != 0 )
> -            goto free;
> +            goto put;
>      }
>  
>      patch = parse_blob(buffer, len);
> @@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      {
>          ret = PTR_ERR(patch);
>          printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
> -        goto free;
> +        goto put;
>      }
>  
>      if ( !patch )
>      {
>          printk(XENLOG_INFO "No ucode found. Update aborted!\n");
>          ret = -EINVAL;
> -        goto free;
> +        goto put;
> +    }
> +
> +    cpumask_clear(&cpu_callin_map);
> +    atomic_set(&cpu_out, 0);
> +    atomic_set(&cpu_updated, 0);
> +    loading_state = LOADING_PREPARE;
> +
> +    /* Calculate the number of online CPU cores */
> +    nr_cores = 0;
> +    for_each_online_cpu(cpu)
> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +            nr_cores++;
> +
> +    printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
> +
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();
> +    /*
> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> +     *
> +     * - HT siblings must be idle and not execute other code while the other
> +     *   sibling is loading microcode in order to avoid any negative
> +     *   interactions caused by the loading.
> +     *
> +     * - In addition, microcode update on the cores must be serialized until
> +     *   this requirement can be relaxed in the future. Right now, this is
> +     *   conservative and good.
> +     */
> +    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
> +    watchdog_enable();
> +
> +    updated = atomic_read(&cpu_updated);
> +    if ( updated > 0 )
> +    {
> +        spin_lock(&microcode_mutex);
> +        microcode_update_cache(patch);
> +        spin_unlock(&microcode_mutex);
>      }
> +    else
> +        microcode_free_patch(patch);
>  
> -    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
> -                                    do_microcode_update, patch);
> +    if ( updated && updated != nr_cores )
> +        printk(XENLOG_ERR "ERROR: Updating microcode succeeded on %u cores and failed\n"
> +               XENLOG_ERR "on other %u cores. A system with differing microcode\n"
> +               XENLOG_ERR "revisions is considered unstable. Please reboot and do not\n"
> +               XENLOG_ERR "load the microcode that triggers this warning!\n",
> +               updated, nr_cores - updated);
>  
> + put:
> +    put_cpu_maps();
>   free:
>      xfree(buffer);
>      return ret;
> 


* Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
  2019-08-19 10:27   ` Sergey Dyasli
@ 2019-08-19 14:49     ` Chao Gao
  0 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-19 14:49 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Jun Nakajima, xen-devel, Thomas Gleixner, Borislav Petkov,
	Roger Pau Monné

On Mon, Aug 19, 2019 at 11:27:36AM +0100, Sergey Dyasli wrote:
>> +static int master_thread_fn(const struct microcode_patch *patch)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    int ret = 0;
>> +
>> +    while ( loading_state != LOADING_CALLIN )
>> +        cpu_relax();
>> +
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +
>> +    while ( loading_state != LOADING_ENTER )
>> +        cpu_relax();
>
>If I'm reading it right, this will wait forever in case when...
>
>> +
>> +    /*
>> +     * If an error happened, control thread would set 'loading_state'
>> +     * to LOADING_EXIT. Don't perform ucode loading for this case
>> +     */
>> +    if ( loading_state == LOADING_EXIT )
>> +        return ret;

I tried to check whether there was an error here. But as you said, we
cannot reach here if 'control thread' set loading_state from LOADING_CALLIN
to LOADING_EXIT. I will do this check in the while-loop right above.
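
Something like the following is what I have in mind (just a sketch of the
change promised above, not the final patch):

    while ( loading_state != LOADING_ENTER )
    {
        /*
         * The control thread switches to LOADING_EXIT on error; in that
         * case skip the ucode loading below.
         */
        if ( loading_state == LOADING_EXIT )
            return ret;
        cpu_relax();
    }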

>> +
>> +    ret = microcode_ops->apply_microcode(patch);
>> +    if ( !ret )
>> +        atomic_inc(&cpu_updated);
>> +    atomic_inc(&cpu_out);
>> +
>> +    while ( loading_state != LOADING_EXIT )
>> +        cpu_relax();
>> +
>> +    return ret;
>> +}
>> +
>> +static int control_thread_fn(const struct microcode_patch *patch)
>>  {
>> -    unsigned int cpu;
>> +    unsigned int cpu = smp_processor_id(), done;
>> +    unsigned long tick;
>> +    int ret;
>>  
>> -    /* Store the patch after a successful loading */
>> -    if ( !microcode_update_cpu(patch) && patch )
>> +    /* Allow threads to call in */
>> +    loading_state = LOADING_CALLIN;
>> +    smp_mb();
>> +
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +
>> +    /* Waiting for all threads calling in */
>> +    ret = wait_for_condition(wait_cpu_callin,
>> +                             (void *)(unsigned long)num_online_cpus(),
>> +                             MICROCODE_CALLIN_TIMEOUT_US);
>> +    if ( ret ) {
>> +        loading_state = LOADING_EXIT;
>> +        return ret;
>> +    }
>
>...this condition holds. Have you actually tested this case?

I didn't craft a case to verify the error-handling path. And I believe
that you are right. 

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 00/15] improve late microcode loading
  2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
                   ` (14 preceding siblings ...)
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
@ 2019-08-22  7:51 ` Sergey Dyasli
  2019-08-22 15:39   ` Chao Gao
  15 siblings, 1 reply; 57+ messages in thread
From: Sergey Dyasli @ 2019-08-22  7:51 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Ashok Raj,
	Wei Liu, Andrew Cooper, Jan Beulich, Boris Ostrovsky,
	Brian Woods, Suravee Suthikulpanit, Roger Pau Monné

Hi Chao,

On 19/08/2019 02:25, Chao Gao wrote:
> Previous change log:
> Changes in version 8:
>  - block #NMI handling during microcode loading (Patch 16)
>  - Don't assume that all CPUs in the system have loaded a same ucode.
>  So when parsing a blob, we attempt to save a patch as long as it matches
>  with current cpu signature regardless of the revision of the patch.
>  And also for loading, we only require the patch to be loaded isn't old
>  than the cached one.
>  - store an update after the first successful loading on a CPU
>  - remove the patch that calls wbinvd() unconditionally before microcode
>  loading. It is under internal discussion.

I noticed that you removed the patch which adds wbinvd() back in v8.
What was the reasoning behind that and is there any outcome from the
internal discussion that you mention here?

>  - divide two big patches into several patches to improve readability.

Thanks,
Sergey


* Re: [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-08-22 11:11   ` Roger Pau Monné
  2019-08-28 15:21   ` Jan Beulich
  2019-08-29 10:18   ` Jan Beulich
  2 siblings, 0 replies; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-22 11:11 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 19, 2019 at 09:25:17AM +0800, Chao Gao wrote:
> to replace the current per-cpu cache 'uci->mc'.
> 
> With the assumption that all CPUs in the system have the same signature
> (family, model, stepping and 'pf'), one microcode update matches with
> one cpu should match with others. Having multiple microcode revisions
> on different cpus would cause system unstable and should be avoided.
> Hence, caching only one microcode update is good enough for all cases.
> 
> Introduce a global variable, microcode_cache, to store the newest
> matching microcode update. Whenever we get a new valid microcode update,
> its revision id is compared against that of the microcode update to
> determine whether the "microcode_cache" needs to be replaced. And
> this global cache is loaded to cpu in apply_microcode().
> 
> All operations on the cache are protected by 'microcode_mutex'.
> 
> Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
> as I am going to remove it completely in the following patches. We copy
> everything to create the new cache blob to avoid reusing some buffers
> previously allocated for the old per-cpu cache. It is not so efficient,
> but it is already corrected by a patch later in this series.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

I have some nits below, but I don't think they affect functionality in
anyway, hence my RB. It would be nice to get those fixed as follow-ups
if others think the current version is suitable for committing.

> ---
> Changes in v9:
>  - on Intel side, ->compare_patch just checks the patch revision number.
>  - explain why all buffers are copied in alloc_microcode_patch() in
>  patch description.
> 
> Changes in v8:
>  - Free generic wrapper struct in general code
>  - Try to update cache as long as a patch covers current cpu. Previously,
>  cache is updated only if the patch is newer than current update revision in
>  the CPU. The small difference can work around a broken bios which only
>  applies microcode update to BSP and software has to apply the same
>  update to other CPUs.
> 
> Changes in v7:
>  - reworked to cache only one microcode patch rather than a list of
>  microcode patches.
> ---
>  xen/arch/x86/microcode.c        | 39 ++++++++++++++++++
>  xen/arch/x86/microcode_amd.c    | 90 +++++++++++++++++++++++++++++++++++++----
>  xen/arch/x86/microcode_intel.c  | 73 ++++++++++++++++++++++++++-------
>  xen/include/asm-x86/microcode.h | 17 ++++++++
>  4 files changed, 197 insertions(+), 22 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 421d57e..0ecd2fd 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -61,6 +61,9 @@ static struct ucode_mod_blob __initdata ucode_blob;
>   */
>  static bool_t __initdata ucode_scan;
>  
> +/* Protected by microcode_mutex */
> +static struct microcode_patch *microcode_cache;
> +
>  void __init microcode_set_module(unsigned int idx)
>  {
>      ucode_mod_idx = idx;
> @@ -262,6 +265,42 @@ int microcode_resume_cpu(unsigned int cpu)
>      return err;
>  }
>  
> +void microcode_free_patch(struct microcode_patch *microcode_patch)
> +{
> +    microcode_ops->free_patch(microcode_patch->mc);
> +    xfree(microcode_patch);
> +}
> +
> +const struct microcode_patch *microcode_get_cache(void)
> +{
> +    ASSERT(spin_is_locked(&microcode_mutex));
> +
> +    return microcode_cache;
> +}
> +
> +/* Return true if cache gets updated. Otherwise, return false */
> +bool microcode_update_cache(struct microcode_patch *patch)
> +{
> +

Nit: extra newline above.

> +    ASSERT(spin_is_locked(&microcode_mutex));
> +
> +    if ( !microcode_cache )
> +        microcode_cache = patch;
> +    else if ( microcode_ops->compare_patch(patch,
> +                                           microcode_cache) == NEW_UCODE )
> +    {
> +        microcode_free_patch(microcode_cache);
> +        microcode_cache = patch;
> +    }
> +    else
> +    {
> +        microcode_free_patch(patch);
> +        return false;
> +    }
> +
> +    return true;
> +}
> +
>  static int microcode_update_cpu(const void *buf, size_t size)
>  {
>      int err;
> diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
> index 3db3555..30129ca 100644
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -190,24 +190,83 @@ static enum microcode_match_result microcode_fits(
>      return NEW_UCODE;
>  }
>  
> +static bool match_cpu(const struct microcode_patch *patch)
> +{
> +    if ( !patch )
> +        return false;
> +    return microcode_fits(patch->mc_amd, smp_processor_id()) == NEW_UCODE;
> +}
> +
> +static struct microcode_patch *alloc_microcode_patch(
> +    const struct microcode_amd *mc_amd)
> +{
> +    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
> +    struct microcode_amd *cache = xmalloc(struct microcode_amd);
> +    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
> +    struct equiv_cpu_entry *equiv_cpu_table =
> +                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
> +
> +    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
> +    {
> +        xfree(microcode_patch);
> +        xfree(cache);
> +        xfree(mpb);
> +        xfree(equiv_cpu_table);
> +        return ERR_PTR(-ENOMEM);
> +    }
> +
> +    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
> +    cache->mpb = mpb;
> +    cache->mpb_size = mc_amd->mpb_size;
> +    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
> +           mc_amd->equiv_cpu_table_size);
> +    cache->equiv_cpu_table = equiv_cpu_table;
> +    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
> +    microcode_patch->mc_amd = cache;
> +
> +    return microcode_patch;
> +}
> +
> +static void free_patch(void *mc)
> +{
> +    struct microcode_amd *mc_amd = mc;
> +
> +    xfree(mc_amd->equiv_cpu_table);
> +    xfree(mc_amd->mpb);
> +    xfree(mc_amd);
> +}

It's asymmetric that alloc_microcode_patch allocates microcode_patch,
but free_patch doesn't free it. Not a big deal, but it would be good
to make this symmetric IMO.
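
For instance, a vendor-level helper mirroring alloc_microcode_patch()
(only a sketch to illustrate the symmetry; the name is made up):

static void free_microcode_patch_amd(struct microcode_patch *patch)
{
    struct microcode_amd *mc_amd = patch->mc_amd;

    /* Free everything alloc_microcode_patch() allocated, wrapper included. */
    xfree(mc_amd->equiv_cpu_table);
    xfree(mc_amd->mpb);
    xfree(mc_amd);
    xfree(patch);
}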

> +
> +static enum microcode_match_result compare_patch(
> +    const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +    const struct microcode_amd *new_mc = new->mc_amd;
> +    const struct microcode_header_amd *new_header = new_mc->mpb;
> +    const struct microcode_amd *old_mc = old->mc_amd;
> +    const struct microcode_header_amd *old_header = old_mc->mpb;

The local variables new_mc/old_mc are used just once, and hence are
not really helpful IMO; you could just do:

const struct microcode_header_amd *new_header = new->mc_amd->mpb;
const struct microcode_header_amd *old_header = old->mc_amd->mpb;

Again, just a nit.

> +
> +    if ( new_header->processor_rev_id == old_header->processor_rev_id )
> +        return (new_header->patch_id > old_header->patch_id) ?
> +                NEW_UCODE : OLD_UCODE;
> +
> +    return MIS_UCODE;
> +}
> +
>  static int apply_microcode(unsigned int cpu)
>  {
>      unsigned long flags;
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>      uint32_t rev;
> -    struct microcode_amd *mc_amd = uci->mc.mc_amd;
> -    struct microcode_header_amd *hdr;
>      int hw_err;
> +    const struct microcode_header_amd *hdr;
> +    const struct microcode_patch *patch = microcode_get_cache();
>  
>      /* We should bind the task to the CPU */
>      BUG_ON(raw_smp_processor_id() != cpu);
>  
> -    if ( mc_amd == NULL )
> +    if ( !match_cpu(patch) )
>          return -EINVAL;

Another nit, but !patch should get ENOENT rather than EINVAL IMO.
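
I.e. something like (a rough sketch of what I mean):

if ( !patch )
    return -ENOENT;

if ( !match_cpu(patch) )
    return -EINVAL;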

>  
> -    hdr = mc_amd->mpb;
> -    if ( hdr == NULL )
> -        return -EINVAL;
> +    hdr = patch->mc_amd->mpb;
>  
>      spin_lock_irqsave(&microcode_update_lock, flags);
>  
> @@ -496,7 +555,21 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
>      while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
>                                                 &offset)) == 0 )
>      {
> -        if ( microcode_fits(mc_amd, cpu) == NEW_UCODE )
> +        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
> +
> +        if ( IS_ERR(new_patch) )
> +        {
> +            error = PTR_ERR(new_patch);
> +            break;
> +        }
> +
> +        /* Update cache if this patch covers current CPU */
> +        if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
> +            microcode_update_cache(new_patch);
> +        else
> +            microcode_free_patch(new_patch);
> +
> +        if ( match_cpu(microcode_get_cache()) )
>          {
>              error = apply_microcode(cpu);
>              if ( error )
> @@ -640,6 +713,9 @@ static const struct microcode_ops microcode_amd_ops = {
>      .collect_cpu_info                 = collect_cpu_info,
>      .apply_microcode                  = apply_microcode,
>      .start_update                     = start_update,
> +    .free_patch                       = free_patch,
> +    .compare_patch                    = compare_patch,
> +    .match_cpu                        = match_cpu,
>  };
>  
>  int __init microcode_init_amd(void)
> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
> index c185b5c..14485dc 100644
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -259,6 +259,31 @@ static int microcode_sanity_check(void *mc)
>      return 0;
>  }
>  
> +static bool match_cpu(const struct microcode_patch *patch)
> +{
> +    if ( !patch )
> +        return false;
> +
> +    return microcode_update_match(&patch->mc_intel->hdr,
> +                                  smp_processor_id()) == NEW_UCODE;
> +}
> +
> +static void free_patch(void *mc)
> +{
> +    xfree(mc);
> +}
> +
> +/*
> + * Both patches to compare are supposed to be applicable to local CPU.
> + * Just compare the revision number.
> + */
> +static enum microcode_match_result compare_patch(
> +    const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +    return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ?  NEW_UCODE :
> +                                                                OLD_UCODE;

Nit, the usual format in Xen would be:

return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ? NEW_UCODE
                                                         : OLD_UCODE;

> +}
> +
>  /*
>   * return 0 - no update found
>   * return 1 - found update
> @@ -269,10 +294,26 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>      const struct microcode_header_intel *mc_header = mc;
>      unsigned long total_size = get_totalsize(mc_header);
> -    void *new_mc;
> +    void *new_mc = xmalloc_bytes(total_size);
> +    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
>  
> -    if ( microcode_update_match(mc, cpu) != NEW_UCODE )
> +    if ( !new_patch || !new_mc )
> +    {
> +        xfree(new_patch);
> +        xfree(new_mc);
> +        return -ENOMEM;
> +    }
> +    memcpy(new_mc, mc, total_size);
> +    new_patch->mc_intel = new_mc;
> +
> +    /* Make sure that this patch covers current CPU */
> +    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
> +    {
> +        microcode_free_patch(new_patch);
>          return 0;
> +    }

Nit: won't it be easier to do this check first and then allocate only
if needed?

AFAICT neither new_mc nor new_patch is required by the
microcode_update_match() call, and hence they could be allocated after
it. Maybe this ugliness will go away in following patches?
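Something along these lines is what I have in mind (just the quoted
hunk reshuffled, not tested):

    void *new_mc;
    struct microcode_patch *new_patch;

    /* Make sure that this patch covers current CPU */
    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
        return 0;

    new_mc = xmalloc_bytes(total_size);
    new_patch = xmalloc(struct microcode_patch);
    if ( !new_patch || !new_mc )
    {
        xfree(new_patch);
        xfree(new_mc);
        return -ENOMEM;
    }
    memcpy(new_mc, mc, total_size);
    new_patch->mc_intel = new_mc;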

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
@ 2019-08-22 13:08   ` Roger Pau Monné
  2019-08-28 15:26   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-22 13:08 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 19, 2019 at 09:25:21AM +0800, Chao Gao wrote:
> Introduce a vendor hook, .end_update, for svm_host_osvw_init().
> The hook function is called on each cpu after loading an update.
> It is a preparation for spliting out apply_microcode() from
> cpu_request_microcode().
> 
> Note that svm_host_osvm_init() should be called regardless of the
> result of loading an update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.


* Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-08-22 13:59   ` Roger Pau Monné
  2019-08-29 10:06     ` Jan Beulich
  2019-08-29 10:19   ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-22 13:59 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 19, 2019 at 09:25:23AM +0800, Chao Gao wrote:
> During late microcode loading, apply_microcode() is invoked in
> cpu_request_microcode(). To make late microcode update more reliable,
> we want to put the apply_microcode() into stop_machine context. So
> we split it out from cpu_request_microcode(). In general, for both
> early loading on BSP and late loading, cpu_request_microcode() is
> called first to get the matching microcode update contained by
> the blob and then apply_microcode() is invoked explicitly on each
> cpu in common code.
> 
> Given that all CPUs are supposed to have the same signature, parsing
> microcode only needs to be done once. So cpu_request_microcode() is
> also moved out of microcode_update_cpu().
> 
> In some cases (e.g. a broken bios), the system may have multiple
> revisions of microcode update. So we would try to load a microcode
> update as long as it covers current cpu. And if a cpu loads this patch
> successfully, the patch would be stored into the patch cache.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> Changes in v9:
>  - remove the calling of ->compare_patch in microcode_update_cpu().
>  - drop "microcode_" prefix for static function - microcode_parse_blob().
>  - rebase and fix conflict
> 
> Changes in v8:
>  - divide the original patch into three patches to improve readability
>  - load an update on each cpu as long as the update covers current cpu
>  - store an update after the first successful loading on a CPU
>  - Make sure the current CPU (especially pf value) is covered
>  by updates.
> 
> changes in v7:
>  - to handle load failure, unvalidated patches won't be cached. They
>  are passed as function arguments. So if update failed, we needn't
>  any cleanup to microcode cache.
> ---
>  xen/arch/x86/microcode.c        | 177 ++++++++++++++++++++++++++--------------
>  xen/arch/x86/microcode_amd.c    |  38 +++++----
>  xen/arch/x86/microcode_intel.c  |  66 +++++++--------
>  xen/include/asm-x86/microcode.h |   5 +-
>  4 files changed, 172 insertions(+), 114 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 0e9322a..a2febc7 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -189,12 +189,19 @@ static DEFINE_SPINLOCK(microcode_mutex);
>  
>  DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>  
> -struct microcode_info {
> -    unsigned int cpu;
> -    uint32_t buffer_size;
> -    int error;
> -    char buffer[1];
> -};
> +/*
> + * Return a patch that covers current CPU. If there are multiple patches,
> + * return the one with the highest revision number. Return error If no
> + * patch is found and an error occurs during the parsing process. Otherwise
> + * return NULL.
> + */
> +static struct microcode_patch *parse_blob(const char *buf, uint32_t len)

Nit: size_t would be more appropriate for len. AFAICT there's no need
for it to be 32 bits anyway.
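I.e. (sketch of the suggested signature only):

    static struct microcode_patch *parse_blob(const char *buf, size_t len);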

> +{
> +    if ( likely(!microcode_ops->collect_cpu_info(&this_cpu(cpu_sig))) )
> +        return microcode_ops->cpu_request_microcode(buf, len);
> +
> +    return NULL;
> +}
>  
>  int microcode_resume_cpu(void)
>  {
> @@ -220,13 +227,6 @@ void microcode_free_patch(struct microcode_patch *microcode_patch)
>      xfree(microcode_patch);
>  }
>  
> -const struct microcode_patch *microcode_get_cache(void)
> -{
> -    ASSERT(spin_is_locked(&microcode_mutex));
> -
> -    return microcode_cache;
> -}
> -
>  /* Return true if cache gets updated. Otherwise, return false */
>  bool microcode_update_cache(struct microcode_patch *patch)
>  {
> @@ -250,49 +250,71 @@ bool microcode_update_cache(struct microcode_patch *patch)
>      return true;
>  }
>  
> -static int microcode_update_cpu(const void *buf, size_t size)
> +/*
> + * Load a microcode update to current CPU.
> + *
> + * If no patch is provided, the cached patch will be loaded. Microcode update
> + * during APs bringup and CPU resuming falls into this case.
> + */
> +static int microcode_update_cpu(const struct microcode_patch *patch)
>  {
> -    int err;
> -    unsigned int cpu = smp_processor_id();
> -    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
> +    int err = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>  
> -    spin_lock(&microcode_mutex);
> +    if ( unlikely(err) )
> +        return err;
>  
> -    err = microcode_ops->collect_cpu_info(sig);
> -    if ( likely(!err) )
> -        err = microcode_ops->cpu_request_microcode(buf, size);
> -    spin_unlock(&microcode_mutex);
> +    if ( patch )
> +        err = microcode_ops->apply_microcode(patch);
> +    else if ( microcode_cache )
> +    {
> +        spin_lock(&microcode_mutex);
> +        err = microcode_ops->apply_microcode(microcode_cache);
> +        if ( err == -EIO )
> +        {
> +            microcode_free_patch(microcode_cache);
> +            microcode_cache = NULL;
> +        }
> +        spin_unlock(&microcode_mutex);
> +    }
> +    else
> +        /* No patch to update */
> +        err = -ENOENT;
>  
>      return err;
>  }
>  
> -static long do_microcode_update(void *_info)
> +static long do_microcode_update(void *patch)
>  {
> -    struct microcode_info *info = _info;
> -    int error;
> -
> -    BUG_ON(info->cpu != smp_processor_id());
> +    unsigned int cpu;
>  
> -    error = microcode_update_cpu(info->buffer, info->buffer_size);
> -    if ( error )
> -        info->error = error;
> +    /* Store the patch after a successful loading */
> +    if ( !microcode_update_cpu(patch) && patch )

Aren't you losing the error code returned by microcode_update_cpu
here?

Seeing how this works, I'm not sure what the best option is here. As
updating will be attempted on other CPUs, I'm not sure whether it's OK
to return an error if the update succeeded on some CPUs but failed on
others.
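Just to illustrate propagating the result (a sketch only; the chaining
to the next CPU is elided, and the caching call is my reading of what
the hunk's body does):

    static long do_microcode_update(void *patch)
    {
        int ret = microcode_update_cpu(patch);

        /* Store the patch after a successful loading */
        if ( !ret && patch )
        {
            spin_lock(&microcode_mutex);
            microcode_update_cache(patch);
            spin_unlock(&microcode_mutex);
        }

        return ret;   /* don't silently drop the per-CPU error */
    }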

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
@ 2019-08-22 14:10   ` Roger Pau Monné
  2019-08-22 16:44     ` Chao Gao
  2019-08-29 10:29   ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-22 14:10 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
> Both are loading the cached patch. Since APs call the unified function,
> microcode_update_one(), during wakeup, the 'start_update' parameter
> which originally used to distinguish BSP and APs is redundant. So remove
> this parameter.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Note that here is a functional change: resuming a CPU would call
> ->end_update() now while previously it wasn't. Not quite sure
> whether it is correct.

I guess that's required if it called start_update prior to calling
end_update?

> 
> Changes in v9:
>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>    microcode_update_one()
>  - rebase and fix conflicts.
> 
> Changes in v8:
>  - split out from the previous patch
> ---
>  xen/arch/x86/acpi/power.c       |  2 +-
>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
>  xen/arch/x86/smpboot.c          |  5 +--
>  xen/include/asm-x86/processor.h |  4 +-
>  4 files changed, 44 insertions(+), 57 deletions(-)
> 
> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> index 4f21903..24798d5 100644
> --- a/xen/arch/x86/acpi/power.c
> +++ b/xen/arch/x86/acpi/power.c
> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>  
>      console_end_sync();
>  
> -    microcode_resume_cpu();
> +    microcode_update_one();
>  
>      if ( !recheck_cpu_features(0) )
>          panic("Missing previously available feature(s)\n");
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index a2febc7..bdd9c9f 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
>      return NULL;
>  }
>  
> -int microcode_resume_cpu(void)
> -{
> -    int err;
> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
> -
> -    if ( !microcode_ops )
> -        return 0;
> -
> -    spin_lock(&microcode_mutex);
> -
> -    err = microcode_ops->collect_cpu_info(sig);
> -    if ( likely(!err) )
> -        err = microcode_ops->apply_microcode(microcode_cache);
> -    spin_unlock(&microcode_mutex);
> -
> -    return err;
> -}
> -
>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>  {
>      microcode_ops->free_patch(microcode_patch->mc);
> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>  }
>  __initcall(microcode_init);
>  
> -int __init early_microcode_update_cpu(bool start_update)
> +/* Load a cached update to current cpu */
> +int microcode_update_one(void)
> +{
> +    int rc;
> +
> +    if ( !microcode_ops )
> +        return -EOPNOTSUPP;
> +
> +    rc = microcode_update_cpu(NULL);
> +
> +    if ( microcode_ops->end_update )
> +        microcode_ops->end_update();

Don't you need to call start_update before calling
microcode_update_cpu?

It would be nice to have paired calls to start_update/end_update in
the same context (ie: function) or else this is very hard to follow,
and very easy to get out of sync.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 00/15] improve late microcode loading
  2019-08-22  7:51 ` [Xen-devel] [PATCH v9 00/15] improve late microcode loading Sergey Dyasli
@ 2019-08-22 15:39   ` Chao Gao
  0 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-22 15:39 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, xen-devel,
	Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

On Thu, Aug 22, 2019 at 08:51:43AM +0100, Sergey Dyasli wrote:
>Hi Chao,
>
>On 19/08/2019 02:25, Chao Gao wrote:
>> Previous change log:
>> Changes in version 8:
>>  - block #NMI handling during microcode loading (Patch 16)
>>  - Don't assume that all CPUs in the system have loaded a same ucode.
>>  So when parsing a blob, we attempt to save a patch as long as it matches
>>  with current cpu signature regardless of the revision of the patch.
>>  And also for loading, we only require the patch to be loaded isn't old
>>  than the cached one.
>>  - store an update after the first successful loading on a CPU
>>  - remove the patch that calls wbinvd() unconditionally before microcode
>>  loading. It is under internal discussion.
>
>I noticed that you removed the patch which adds wbinvd() back in v8.
>What was the reasoning behind that and is there any outcome from the
>internal discussion that you mention here?

Jan (and maybe someone else) was concerned about the impact of calling
wbinvd() unconditionally, especially with your work to make serial
ucode loading an option. To address this concern, I planned to call
wbinvd() conditionally. I need to confirm with the Intel microcode team
whether that is fine and what the condition should be, but I haven't
received an answer yet. I will talk with Ashok again and probably add
this patch back in v10.

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-22 14:10   ` Roger Pau Monné
@ 2019-08-22 16:44     ` Chao Gao
  2019-08-23  9:09       ` Roger Pau Monné
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-22 16:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
>On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
>> Both are loading the cached patch. Since APs call the unified function,
>> microcode_update_one(), during wakeup, the 'start_update' parameter
>> which originally used to distinguish BSP and APs is redundant. So remove
>> this parameter.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Note that here is a functional change: resuming a CPU would call
>> ->end_update() now while previously it wasn't. Not quite sure
>> whether it is correct.
>
>I guess that's required if it called start_update prior to calling
>end_update?
>
>> 
>> Changes in v9:
>>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>>    microcode_update_one()
>>  - rebase and fix conflicts.
>> 
>> Changes in v8:
>>  - split out from the previous patch
>> ---
>>  xen/arch/x86/acpi/power.c       |  2 +-
>>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
>>  xen/arch/x86/smpboot.c          |  5 +--
>>  xen/include/asm-x86/processor.h |  4 +-
>>  4 files changed, 44 insertions(+), 57 deletions(-)
>> 
>> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>> index 4f21903..24798d5 100644
>> --- a/xen/arch/x86/acpi/power.c
>> +++ b/xen/arch/x86/acpi/power.c
>> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>>  
>>      console_end_sync();
>>  
>> -    microcode_resume_cpu();
>> +    microcode_update_one();
>>  
>>      if ( !recheck_cpu_features(0) )
>>          panic("Missing previously available feature(s)\n");
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index a2febc7..bdd9c9f 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
>>      return NULL;
>>  }
>>  
>> -int microcode_resume_cpu(void)
>> -{
>> -    int err;
>> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
>> -
>> -    if ( !microcode_ops )
>> -        return 0;
>> -
>> -    spin_lock(&microcode_mutex);
>> -
>> -    err = microcode_ops->collect_cpu_info(sig);
>> -    if ( likely(!err) )
>> -        err = microcode_ops->apply_microcode(microcode_cache);
>> -    spin_unlock(&microcode_mutex);
>> -
>> -    return err;
>> -}
>> -
>>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>>  {
>>      microcode_ops->free_patch(microcode_patch->mc);
>> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>>  }
>>  __initcall(microcode_init);
>>  
>> -int __init early_microcode_update_cpu(bool start_update)
>> +/* Load a cached update to current cpu */
>> +int microcode_update_one(void)
>> +{
>> +    int rc;
>> +
>> +    if ( !microcode_ops )
>> +        return -EOPNOTSUPP;
>> +
>> +    rc = microcode_update_cpu(NULL);
>> +
>> +    if ( microcode_ops->end_update )
>> +        microcode_ops->end_update();
>
>Don't you need to call start_update before calling
>microcode_update_cpu?

No. On the AMD side, osvw_status records the hardware errata in the system.
As we don't assume all CPUs have the same errata, each CPU calls
end_update to update osvw_status after ucode loading.
start_update just resets osvw_status to 0. And it is called once prior
to ucode loading on any CPU so that osvw_status can be recomputed.
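Roughly, the AMD hooks then look like this (my paraphrase of the above;
the real bodies live in microcode_amd.c):

    static int start_update(void)
    {
        /* Called once, before loading starts on any CPU */
        svm_host_osvw_reset();

        return 0;
    }

    static void end_update(void)
    {
        /* Called on each CPU after its load, to re-record its errata */
        svm_host_osvw_init();
    }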

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
@ 2019-08-23  8:11   ` Roger Pau Monné
  2019-08-26  7:03     ` Chao Gao
  2019-08-29 10:47   ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-23  8:11 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 19, 2019 at 09:25:25AM +0800, Chao Gao wrote:
> To create a microcode patch from a vendor-specific update,
> allocate_microcode_patch() copied everything from the update.
> It is not efficient. Essentially, we just need to go through
> ucodes in the blob, find the one with the newest revision and
> install it into the microcode_patch. In the process, buffers
> like mc_amd, equiv_cpu_table (on AMD side), and mc (on Intel
> side) can be reused. microcode_patch now is allocated after
> it is sure that there is a matching ucode.

Oh, I think this answers my question on a previous patch.

For future series it would be nice to avoid so many rewrites within the
same series: alloc_microcode_patch() is modified in a previous patch,
just to be removed here. It also makes it harder to follow what's going
on.

> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v9:
>  - new
> ---
>  xen/arch/x86/microcode_amd.c   | 99 +++++++++++++++---------------------------
>  xen/arch/x86/microcode_intel.c | 65 ++++++++++-----------------
>  2 files changed, 58 insertions(+), 106 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
> index 6353323..ec1c2eb 100644
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -194,36 +194,6 @@ static bool match_cpu(const struct microcode_patch *patch)
>      return patch && (microcode_fits(patch->mc_amd) == NEW_UCODE);
>  }
>  
> -static struct microcode_patch *alloc_microcode_patch(
> -    const struct microcode_amd *mc_amd)
> -{
> -    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
> -    struct microcode_amd *cache = xmalloc(struct microcode_amd);
> -    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
> -    struct equiv_cpu_entry *equiv_cpu_table =
> -                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
> -
> -    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
> -    {
> -        xfree(microcode_patch);
> -        xfree(cache);
> -        xfree(mpb);
> -        xfree(equiv_cpu_table);
> -        return ERR_PTR(-ENOMEM);
> -    }
> -
> -    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
> -    cache->mpb = mpb;
> -    cache->mpb_size = mc_amd->mpb_size;
> -    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
> -           mc_amd->equiv_cpu_table_size);
> -    cache->equiv_cpu_table = equiv_cpu_table;
> -    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
> -    microcode_patch->mc_amd = cache;
> -
> -    return microcode_patch;
> -}
> -
>  static void free_patch(void *mc)
>  {
>      struct microcode_amd *mc_amd = mc;
> @@ -320,18 +290,10 @@ static int get_ucode_from_buffer_amd(
>          return -EINVAL;
>      }
>  
> -    if ( mc_amd->mpb_size < mpbuf->len )
> -    {
> -        if ( mc_amd->mpb )
> -        {
> -            xfree(mc_amd->mpb);
> -            mc_amd->mpb_size = 0;
> -        }
> -        mc_amd->mpb = xmalloc_bytes(mpbuf->len);
> -        if ( mc_amd->mpb == NULL )
> -            return -ENOMEM;
> -        mc_amd->mpb_size = mpbuf->len;
> -    }
> +    mc_amd->mpb = xmalloc_bytes(mpbuf->len);
> +    if ( mc_amd->mpb == NULL )

Nit:

if ( !mc_amd->mpb )

is the usual idiom used in Xen.

> +        return -ENOMEM;
> +    mc_amd->mpb_size = mpbuf->len;
>      memcpy(mc_amd->mpb, mpbuf->data, mpbuf->len);
>  
>      pr_debug("microcode: CPU%d size %zu, block size %u offset %zu equivID %#x rev %#x\n",
> @@ -451,8 +413,9 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>                                                       size_t bufsize)
>  {
>      struct microcode_amd *mc_amd;
> +    struct microcode_header_amd *saved = NULL;
>      struct microcode_patch *patch = NULL;
> -    size_t offset = 0;
> +    size_t offset = 0, saved_size = 0;
>      int error = 0;
>      unsigned int current_cpu_id;
>      unsigned int equiv_cpu_id;
> @@ -542,29 +505,21 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>      while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
>                                                 &offset)) == 0 )
>      {
> -        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
> -
> -        if ( IS_ERR(new_patch) )
> -        {
> -            error = PTR_ERR(new_patch);
> -            break;
> -        }
> -
>          /*
> -         * If the new patch covers current CPU, compare patches and store the
> +         * If the new ucode covers current CPU, compare ucodes and store the
>           * one with higher revision.
>           */
> -        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
> -             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
> +#define REV_ID(mpb) (((struct microcode_header_amd *)(mpb))->processor_rev_id)
> +        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
> +             (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
> +#undef REV_ID
>          {
> -            struct microcode_patch *tmp = patch;
> -
> -            patch = new_patch;
> -            new_patch = tmp;
> +            xfree(saved);
> +            saved = mc_amd->mpb;
> +            saved_size = mc_amd->mpb_size;
>          }
> -
> -        if ( new_patch )
> -            microcode_free_patch(new_patch);
> +        else
> +            xfree(mc_amd->mpb);
>  
>          if ( offset >= bufsize )
>              break;
> @@ -593,9 +548,25 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>               *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
>              break;
>      }
> -    xfree(mc_amd->mpb);
> -    xfree(mc_amd->equiv_cpu_table);
> -    xfree(mc_amd);
> +
> +    if ( saved )
> +    {
> +        mc_amd->mpb = saved;
> +        mc_amd->mpb_size = saved_size;
> +        patch = xmalloc(struct microcode_patch);
> +        if ( patch )
> +            patch->mc_amd = mc_amd;
> +        else
> +        {
> +            free_patch(mc_amd);
> +            error = -ENOMEM;
> +        }
> +    }
> +    else
> +    {
> +        mc_amd->mpb = NULL;

What's the point in setting mpb to NULL if you are just going to free
mc_amd below?

Also, I'm not sure I understand why you need to free mc_amd, isn't
this buffer memory that should be freed by the caller?

ie: in the Intel counterpart below you don't seem to free the mc
cursor used for the get_next_ucode_from_buffer loop.

> +        free_patch(mc_amd);
> +    }
>  
>    out:
>      if ( error && !patch )
> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
> index 96b38f8..ae5759f 100644
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -282,25 +282,6 @@ static enum microcode_match_result compare_patch(
>                                                                  OLD_UCODE;
>  }
>  
> -static struct microcode_patch *alloc_microcode_patch(
> -    const struct microcode_header_intel *mc_header)
> -{
> -    unsigned long total_size = get_totalsize(mc_header);
> -    void *new_mc = xmalloc_bytes(total_size);
> -    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
> -
> -    if ( !new_patch || !new_mc )
> -    {
> -        xfree(new_patch);
> -        xfree(new_mc);
> -        return ERR_PTR(-ENOMEM);
> -    }
> -    memcpy(new_mc, mc_header, total_size);
> -    new_patch->mc_intel = new_mc;
> -
> -    return new_patch;
> -}
> -
>  static int apply_microcode(const struct microcode_patch *patch)
>  {
>      unsigned long flags;
> @@ -379,47 +360,47 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>  {
>      long offset = 0;
>      int error = 0;
> -    void *mc;
> +    struct microcode_intel *mc, *saved = NULL;
>      struct microcode_patch *patch = NULL;
>  
> -    while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
> +    while ( (offset = get_next_ucode_from_buffer((void **)&mc, buf,
> +                                                 size, offset)) > 0 )
>      {
> -        struct microcode_patch *new_patch;
> -
>          error = microcode_sanity_check(mc);
>          if ( error )
> -            break;
> -
> -        new_patch = alloc_microcode_patch(mc);
> -        if ( IS_ERR(new_patch) )
>          {
> -            error = PTR_ERR(new_patch);
> +            xfree(mc);
>              break;
>          }
>  
>          /*
> -         * If the new patch covers current CPU, compare patches and store the
> +         * If the new update covers current CPU, compare updates and store the
>           * one with higher revision.
>           */
> -        if ( (microcode_update_match(&new_patch->mc_intel->hdr) != MIS_UCODE) &&
> -             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
> +        if ( (microcode_update_match(&mc->hdr) != MIS_UCODE) &&
> +             (!saved || (mc->hdr.rev > saved->hdr.rev)) )
>          {
> -            struct microcode_patch *tmp = patch;
> -
> -            patch = new_patch;
> -            new_patch = tmp;
> +            xfree(saved);
> +            saved = mc;
>          }
> -
> -        if ( new_patch )
> -            microcode_free_patch(new_patch);
> -
> -        xfree(mc);
> +        else
> +            xfree(mc);
>      }
> -    if ( offset > 0 )
> -        xfree(mc);
>      if ( offset < 0 )
>          error = offset;
>  
> +    if ( saved )
> +    {
> +        patch = xmalloc(struct microcode_patch);
> +        if ( patch )
> +            patch->mc_intel = saved;
> +        else
> +        {
> +            xfree(saved);
> +            error = -ENOMEM;
> +        }
> +    }
> +

The above code looks suspiciously similar to the AMD
cpu_request_microcode counterpart, which makes me think it could be
further abstracted in order to reduce this duplication. In any case,
this can be done in a follow-up patch.
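E.g. a small common helper could wrap the vendor blob (name and shape
made up for illustration):

    /* Wrap an already-validated vendor blob into a struct microcode_patch */
    static struct microcode_patch *wrap_ucode_patch(void *mc)
    {
        struct microcode_patch *patch = xmalloc(struct microcode_patch);

        if ( patch )
            patch->mc = mc;

        return patch;
    }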

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
@ 2019-08-23  8:46   ` Sergey Dyasli
  2019-08-26  8:07     ` Chao Gao
  2019-08-29 12:22   ` Jan Beulich
  1 sibling, 1 reply; 57+ messages in thread
From: Sergey Dyasli @ 2019-08-23  8:46 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Ashok Raj,
	Wei Liu, Andrew Cooper, Jan Beulich, Roger Pau Monné

On 19/08/2019 02:25, Chao Gao wrote:
> register an nmi callback. And this callback does busy-loop on threads
> which are waiting for loading completion. Control threads send NMI to
> slave threads to prevent NMI acceptance during ucode loading.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v9:
>  - control threads send NMI to all other threads. Slave threads will
>  stay in the NMI handling to prevent NMI acceptance during ucode
>  loading. Note that self-nmi is invalid according to SDM.

To me this looks like a half-measure: why keep only slave threads in
the NMI handler, when master threads can update the microcode from
inside the NMI handler as well?

You mention that self-nmi is invalid, but Xen has self_nmi() which is
used for apply_alternatives() during boot, so can be trusted to work.

I experimented a bit with the following approach: after loading_state
becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
the NMI handler. And it seems to work.

Separate question is about the safety of this approach: can we be sure
that a ucode update would not reset the status of the NMI latch? I.e.
can it cause another NMI to be delivered while Xen already handles one?

>  - s/rep_nop/cpu_relax
>  - remove debug message in microcode_nmi_callback(). Printing debug
>  message would take long times and control thread may timeout.
>  - rebase and fix conflicts

Thanks,
Sergey


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-22 16:44     ` Chao Gao
@ 2019-08-23  9:09       ` Roger Pau Monné
  2019-08-29  7:37         ` Chao Gao
  0 siblings, 1 reply; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-23  9:09 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
> >> Both are loading the cached patch. Since APs call the unified function,
> >> microcode_update_one(), during wakeup, the 'start_update' parameter
> >> which originally used to distinguish BSP and APs is redundant. So remove
> >> this parameter.
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> ---
> >> Note that here is a functional change: resuming a CPU would call
> >> ->end_update() now while previously it wasn't. Not quite sure
> >> whether it is correct.
> >
> >I guess that's required if it called start_update prior to calling
> >end_update?
> >
> >> 
> >> Changes in v9:
> >>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
> >>    microcode_update_one()
> >>  - rebase and fix conflicts.
> >> 
> >> Changes in v8:
> >>  - split out from the previous patch
> >> ---
> >>  xen/arch/x86/acpi/power.c       |  2 +-
> >>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
> >>  xen/arch/x86/smpboot.c          |  5 +--
> >>  xen/include/asm-x86/processor.h |  4 +-
> >>  4 files changed, 44 insertions(+), 57 deletions(-)
> >> 
> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> >> index 4f21903..24798d5 100644
> >> --- a/xen/arch/x86/acpi/power.c
> >> +++ b/xen/arch/x86/acpi/power.c
> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
> >>  
> >>      console_end_sync();
> >>  
> >> -    microcode_resume_cpu();
> >> +    microcode_update_one();
> >>  
> >>      if ( !recheck_cpu_features(0) )
> >>          panic("Missing previously available feature(s)\n");
> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> >> index a2febc7..bdd9c9f 100644
> >> --- a/xen/arch/x86/microcode.c
> >> +++ b/xen/arch/x86/microcode.c
> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
> >>      return NULL;
> >>  }
> >>  
> >> -int microcode_resume_cpu(void)
> >> -{
> >> -    int err;
> >> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
> >> -
> >> -    if ( !microcode_ops )
> >> -        return 0;
> >> -
> >> -    spin_lock(&microcode_mutex);
> >> -
> >> -    err = microcode_ops->collect_cpu_info(sig);
> >> -    if ( likely(!err) )
> >> -        err = microcode_ops->apply_microcode(microcode_cache);
> >> -    spin_unlock(&microcode_mutex);
> >> -
> >> -    return err;
> >> -}
> >> -
> >>  void microcode_free_patch(struct microcode_patch *microcode_patch)
> >>  {
> >>      microcode_ops->free_patch(microcode_patch->mc);
> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
> >>  }
> >>  __initcall(microcode_init);
> >>  
> >> -int __init early_microcode_update_cpu(bool start_update)
> >> +/* Load a cached update to current cpu */
> >> +int microcode_update_one(void)
> >> +{
> >> +    int rc;
> >> +
> >> +    if ( !microcode_ops )
> >> +        return -EOPNOTSUPP;
> >> +
> >> +    rc = microcode_update_cpu(NULL);
> >> +
> >> +    if ( microcode_ops->end_update )
> >> +        microcode_ops->end_update();
> >
> >Don't you need to call start_update before calling
> >microcode_update_cpu?
> 
> No. On the AMD side, osvw_status records the hardware errata in the system.
> As we don't assume all CPUs have the same errata, each CPU calls
> end_update to update osvw_status after ucode loading.
> start_update just resets osvw_status to 0. And it is called once prior
> to ucode loading on any CPU so that osvw_status can be recomputed.

Oh, I think I understand it. start_update must only be called once
_before_ the sequence to update the microcode on all CPUs is
performed, while end_update needs to be called on _each_ CPU after the
update has been completed in order to account for any errata.
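In code terms, my understanding of the intended pattern (illustrative
only, not the literal call sites in the series):

    /* Once, before any CPU loads the update */
    if ( microcode_ops->start_update )
        ret = microcode_ops->start_update();

    /* ... every CPU performs its load ... */

    /* Then, on each CPU that attempted a load */
    if ( microcode_ops->end_update )
        microcode_ops->end_update();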

The names for those hooks should be improved; I guess renaming
end_update to end_update_each or end_update_percpu would make it
clearer that start_update is global, while end_update is per-CPU.
Anyway, I don't want to delay this series for a naming nit.

I'm still unsure where start_update is called for the resume-from-
suspension case; I don't see any call to start_update in either
enter_state or microcode_update_one, hence I think this is missing?

I would expect you need to clean osvw_status also on resume from
suspension, in case microcode loading fails? Or else you will be
carrying a stale osvw_status.

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
  2019-08-23  8:11   ` Roger Pau Monné
@ 2019-08-26  7:03     ` Chao Gao
  2019-08-26  8:11       ` Roger Pau Monné
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-26  7:03 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Fri, Aug 23, 2019 at 10:11:21AM +0200, Roger Pau Monné wrote:
>On Mon, Aug 19, 2019 at 09:25:25AM +0800, Chao Gao wrote:
>> To create a microcode patch from a vendor-specific update,
>> allocate_microcode_patch() copied everything from the update.
>> It is not efficient. Essentially, we just need to go through
>> ucodes in the blob, find the one with the newest revision and
>> install it into the microcode_patch. In the process, buffers
>> like mc_amd, equiv_cpu_table (on AMD side), and mc (on Intel
>> side) can be reused. microcode_patch now is allocated after
>> it is sure that there is a matching ucode.
>
>Oh, I think this answers my question on a previous patch.
>
>For future series it would be nice to avoid so many rewrites in the
>same series, alloc_microcode_patch is already modified in a previous
>patch, just to be removed here. It also makes it harder to follow
>what's going on.

Got it. This patch was added in this new version, and some trivial
patches have already got a Reviewed-by, so I didn't merge it with them.

>>      while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
>>                                                 &offset)) == 0 )
>>      {
>> -        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
>> -
>> -        if ( IS_ERR(new_patch) )
>> -        {
>> -            error = PTR_ERR(new_patch);
>> -            break;
>> -        }
>> -
>>          /*
>> -         * If the new patch covers current CPU, compare patches and store the
>> +         * If the new ucode covers current CPU, compare ucodes and store the
>>           * one with higher revision.
>>           */
>> -        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
>> -             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
>> +#define REV_ID(mpb) (((struct microcode_header_amd *)(mpb))->processor_rev_id)
>> +        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
>> +             (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
>> +#undef REV_ID
>>          {
>> -            struct microcode_patch *tmp = patch;
>> -
>> -            patch = new_patch;
>> -            new_patch = tmp;
>> +            xfree(saved);
>> +            saved = mc_amd->mpb;
>> +            saved_size = mc_amd->mpb_size;
>>          }
>> -
>> -        if ( new_patch )
>> -            microcode_free_patch(new_patch);
>> +        else
>> +            xfree(mc_amd->mpb);

It might be better to move 'mc_amd->mpb = NULL' here.
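I.e.:

        else
        {
            xfree(mc_amd->mpb);
            mc_amd->mpb = NULL;   /* nothing left to free later */
        }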

>>  
>>          if ( offset >= bufsize )
>>              break;
>> @@ -593,9 +548,25 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>>               *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
>>              break;
>>      }
>> -    xfree(mc_amd->mpb);
>> -    xfree(mc_amd->equiv_cpu_table);
>> -    xfree(mc_amd);
>> +
>> +    if ( saved )
>> +    {
>> +        mc_amd->mpb = saved;
>> +        mc_amd->mpb_size = saved_size;
>> +        patch = xmalloc(struct microcode_patch);
>> +        if ( patch )
>> +            patch->mc_amd = mc_amd;
>> +        else
>> +        {
>> +            free_patch(mc_amd);
>> +            error = -ENOMEM;
>> +        }
>> +    }
>> +    else
>> +    {
>> +        mc_amd->mpb = NULL;
>
>What's the point in setting mpb to NULL if you are just going to free
>mc_amd below?

To avoid a double free here. mc_amd->mpb is always freed or saved.
And here we want to free mc_amd itself and mc_amd->equiv_cpu_table.
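For reference, free_patch() on the AMD side presumably frees the
embedded buffers as well, which is why mpb must not be left pointing at
the saved data (body reconstructed from the removed xfree() calls in
the quoted hunk):

    static void free_patch(void *mc)
    {
        struct microcode_amd *mc_amd = mc;

        xfree(mc_amd->equiv_cpu_table);
        xfree(mc_amd->mpb);
        xfree(mc_amd);
    }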

>
>Also, I'm not sure I understand why you need to free mc_amd, isn't
>this buffer memory that should be freed by the caller?

But mc_amd is allocated in this function.

>
>ie: in the Intel counterpart below you don't seem to free the mc
>cursor used for the get_next_ucode_from_buffer loop.

'mc' is saved if it is newer than current patch stored in 'saved'.
Otherwise 'mc' is freed immediately. So we don't need to free it
again after the while loop.

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-23  8:46   ` Sergey Dyasli
@ 2019-08-26  8:07     ` Chao Gao
  2019-08-27  4:52       ` Chao Gao
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-26  8:07 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, xen-devel,
	Roger Pau Monné

On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>On 19/08/2019 02:25, Chao Gao wrote:
>> register an nmi callback. And this callback does busy-loop on threads
>> which are waiting for loading completion. Control threads send NMI to
>> slave threads to prevent NMI acceptance during ucode loading.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v9:
>>  - control threads send NMI to all other threads. Slave threads will
>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>  loading. Note that self-nmi is invalid according to SDM.
>
>To me this looks like a half-measure: why keep only slave threads in
>the NMI handler, when master threads can update the microcode from
>inside the NMI handler as well?

No special reason. Because the issue we want to address is that slave
threads might go to handle NMI and access MSRs when master thread is
loading ucode. So we only keep slave threads in the NMI handler.

>
>You mention that self-nmi is invalid, but Xen has self_nmi() which is
>used for apply_alternatives() during boot, so can be trusted to work.

Sorry, I meant using self shorthand to send self-nmi. I tried to use
self shorthand but got APIC error. And I agree that it is better to
make slave thread call self_nmi() itself.

>
>I experimented a bit with the following approach: after loading_state
>becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>the NMI handler. And it seems to work.
>
>Separate question is about the safety of this approach: can we be sure
>that a ucode update would not reset the status of the NMI latch? I.e.
>can it cause another NMI to be delivered while Xen already handles one?

Ashok, what's your opinion on Sergey's approach and his concern?

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
  2019-08-26  7:03     ` Chao Gao
@ 2019-08-26  8:11       ` Roger Pau Monné
  0 siblings, 0 replies; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-26  8:11 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Mon, Aug 26, 2019 at 03:03:22PM +0800, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 10:11:21AM +0200, Roger Pau Monné wrote:
> >On Mon, Aug 19, 2019 at 09:25:25AM +0800, Chao Gao wrote:
> >> To create a microcode patch from a vendor-specific update,
> >> allocate_microcode_patch() copied everything from the update.
> >> It is not efficient. Essentially, we just need to go through
> >> ucodes in the blob, find the one with the newest revision and
> >> install it into the microcode_patch. In the process, buffers
> >> like mc_amd, equiv_cpu_table (on AMD side), and mc (on Intel
> >> side) can be reused. microcode_patch now is allocated after
> >> it is sure that there is a matching ucode.
> >
> >Oh, I think this answers my question on a previous patch.
> >
> >For future series it would be nice to avoid so many rewrites in the
> >same series, alloc_microcode_patch is already modified in a previous
> >patch, just to be removed here. It also makes it harder to follow
> >what's going on.
> 
> Got it. This patch was added in this new version, and some trivial
> patches have already got a Reviewed-by, so I didn't merge it with them.
> 
> >>      while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
> >>                                                 &offset)) == 0 )
> >>      {
> >> -        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
> >> -
> >> -        if ( IS_ERR(new_patch) )
> >> -        {
> >> -            error = PTR_ERR(new_patch);
> >> -            break;
> >> -        }
> >> -
> >>          /*
> >> -         * If the new patch covers current CPU, compare patches and store the
> >> +         * If the new ucode covers current CPU, compare ucodes and store the
> >>           * one with higher revision.
> >>           */
> >> -        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
> >> -             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
> >> +#define REV_ID(mpb) (((struct microcode_header_amd *)(mpb))->processor_rev_id)
> >> +        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
> >> +             (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
> >> +#undef REV_ID
> >>          {
> >> -            struct microcode_patch *tmp = patch;
> >> -
> >> -            patch = new_patch;
> >> -            new_patch = tmp;
> >> +            xfree(saved);
> >> +            saved = mc_amd->mpb;
> >> +            saved_size = mc_amd->mpb_size;
> >>          }
> >> -
> >> -        if ( new_patch )
> >> -            microcode_free_patch(new_patch);
> >> +        else
> >> +            xfree(mc_amd->mpb);
> 
> It might be better to move 'mc_amd->mpb = NULL' here.
> 
> >>  
> >>          if ( offset >= bufsize )
> >>              break;
> >> @@ -593,9 +548,25 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
> >>               *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
> >>              break;
> >>      }
> >> -    xfree(mc_amd->mpb);
> >> -    xfree(mc_amd->equiv_cpu_table);
> >> -    xfree(mc_amd);
> >> +
> >> +    if ( saved )
> >> +    {
> >> +        mc_amd->mpb = saved;
> >> +        mc_amd->mpb_size = saved_size;
> >> +        patch = xmalloc(struct microcode_patch);
> >> +        if ( patch )
> >> +            patch->mc_amd = mc_amd;
> >> +        else
> >> +        {
> >> +            free_patch(mc_amd);
> >> +            error = -ENOMEM;
> >> +        }
> >> +    }
> >> +    else
> >> +    {
> >> +        mc_amd->mpb = NULL;
> >
> >What's the point in setting mpb to NULL if you are just going to free
> >mc_amd below?
> 
> To avoid a double free here. mc_amd->mpb is always freed or saved.
> And here we want to free mc_amd itself and mc_amd->equiv_cpu_table.

But there's no chance of a double free here, since you are freeing
mc_amd in the line below after setting mpb = NULL.

I think it would make sense to set mpb = NULL after freeing it inside
the loop.

With that you can add my:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> >
> >Also, I'm not sure I understand why you need to free mc_amd, isn't
> >this buffer memory that should be freed by the caller?
> 
> But mc_amd is allocated in this function.
> 
> >
> >ie: in the Intel counterpart below you don't seem to free the mc
> >cursor used for the get_next_ucode_from_buffer loop.
> 
> 'mc' is saved if it is newer than current patch stored in 'saved'.
> Otherwise 'mc' is freed immediately. So we don't need to free it
> again after the while loop.

Ack, thanks!


* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-26  8:07     ` Chao Gao
@ 2019-08-27  4:52       ` Chao Gao
  2019-08-28  8:52         ` Sergey Dyasli
  2019-08-29 12:11         ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-27  4:52 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich, xen-devel,
	Roger Pau Monné

On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>On 19/08/2019 02:25, Chao Gao wrote:
>>> register an nmi callback. And this callback does busy-loop on threads
>>> which are waiting for loading completion. Control threads send NMI to
>>> slave threads to prevent NMI acceptance during ucode loading.
>>> 
>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>> ---
>>> Changes in v9:
>>>  - control threads send NMI to all other threads. Slave threads will
>>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>>  loading. Note that self-nmi is invalid according to SDM.
>>
>>To me this looks like a half-measure: why keep only slave threads in
>>the NMI handler, when master threads can update the microcode from
>>inside the NMI handler as well?
>
>No special reason. Because the issue we want to address is that slave
>threads might go to handle NMI and access MSRs when master thread is
>loading ucode. So we only keep slave threads in the NMI handler.
>
>>
>>You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>used for apply_alternatives() during boot, so can be trusted to work.
>
>Sorry, I meant using self shorthand to send self-nmi. I tried to use
>self shorthand but got APIC error. And I agree that it is better to
>make slave thread call self_nmi() itself.
>
>>
>>I experimented a bit with the following approach: after loading_state
>>becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>the NMI handler. And it seems to work.
>>
>>Separate question is about the safety of this approach: can we be sure
>>that a ucode update would not reset the status of the NMI latch? I.e.
>>can it cause another NMI to be delivered while Xen already handles one?
>
>Ashok, what's your opinion on Sergey's approach and his concern?

Hi Sergey,

I talked with Ashok. We think your approach is better. I will follow
your approach in v10. It would be very helpful if you could post your
patch so that I can just rebase it onto the other patches.

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-27  4:52       ` Chao Gao
@ 2019-08-28  8:52         ` Sergey Dyasli
  2019-08-29 12:11         ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Sergey Dyasli @ 2019-08-28  8:52 UTC (permalink / raw)
  To: Chao Gao
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Ashok Raj,
	Wei Liu, Andrew Cooper, Jan Beulich, xen-devel,
	Roger Pau Monné

On 27/08/2019 05:52, Chao Gao wrote:
> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>> register an nmi callback. And this callback does busy-loop on threads
>>>> which are waiting for loading completion. Control threads send NMI to
>>>> slave threads to prevent NMI acceptance during ucode loading.
>>>>
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> ---
>>>> Changes in v9:
>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>>>  loading. Note that self-nmi is invalid according to SDM.
>>>
>>> To me this looks like a half-measure: why keep only slave threads in
>>> the NMI handler, when master threads can update the microcode from
>>> inside the NMI handler as well?
>>
>> No special reason. Because the issue we want to address is that slave
>> threads might go to handle NMI and access MSRs when master thread is
>> loading ucode. So we only keep slave threads in the NMI handler.
>>
>>>
>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>> used for apply_alternatives() during boot, so can be trusted to work.
>>
>> Sorry, I meant using self shorthand to send self-nmi. I tried to use
>> self shorthand but got APIC error. And I agree that it is better to
>> make slave thread call self_nmi() itself.
>>
>>>
>>> I experimented a bit with the following approach: after loading_state
>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>> the NMI handler. And it seems to work.
>>>
>>> Separate question is about the safety of this approach: can we be sure
>>> that a ucode update would not reset the status of the NMI latch? I.e.
>>> can it cause another NMI to be delivered while Xen already handles one?
>>
>> Ashok, what's your opinion on Sergey's approach and his concern?
> 
> Hi Sergey,
> 
> I talked with Ashok. We think your approach is better. I will follow
> your approach in v10. It would be very helpful if you could post your
> patch so that I can just rebase it onto the other patches.

Sure thing. The below code is my first attempt at improving the original
patch. It can benefit from some further refactoring.

---
 xen/arch/x86/microcode.c | 108 ++++++++++++++++++++++++++++-----------
 1 file changed, 79 insertions(+), 29 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 91f9e811f8..ba2363406f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -36,8 +36,10 @@
 #include <xen/earlycpio.h>
 #include <xen/watchdog.h>

+#include <asm/apic.h>
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -232,6 +234,7 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
  */
 static cpumask_t cpu_callin_map;
 static atomic_t cpu_out, cpu_updated;
+struct microcode_patch *nmi_patch;

 /*
  * Return a patch that covers current CPU. If there are multiple patches,
@@ -337,15 +340,25 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
     return err;
 }

+static void slave_thread_work(void)
+{
+    /* Do nothing, just wait */
+    while ( loading_state != LOADING_EXIT )
+        cpu_relax();
+}
+
 static int slave_thread_fn(void)
 {
-    unsigned int cpu = smp_processor_id();
     unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));

     while ( loading_state != LOADING_CALLIN )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return 0;
         cpu_relax();
+    }

-    cpumask_set_cpu(cpu, &cpu_callin_map);
+    self_nmi();

     while ( loading_state != LOADING_EXIT )
         cpu_relax();
@@ -356,30 +369,35 @@ static int slave_thread_fn(void)
     return 0;
 }

-static int master_thread_fn(const struct microcode_patch *patch)
+static void master_thread_work(void)
 {
-    unsigned int cpu = smp_processor_id();
-    int ret = 0;
-
-    while ( loading_state != LOADING_CALLIN )
-        cpu_relax();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
+    int ret;

     while ( loading_state != LOADING_ENTER )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return;
         cpu_relax();
+    }

-    /*
-     * If an error happened, control thread would set 'loading_state'
-     * to LOADING_EXIT. Don't perform ucode loading for this case
-     */
-    if ( loading_state == LOADING_EXIT )
-        return ret;
-
-    ret = microcode_ops->apply_microcode(patch);
+    ret = microcode_ops->apply_microcode(nmi_patch);
     if ( !ret )
         atomic_inc(&cpu_updated);
     atomic_inc(&cpu_out);
+}
+
+static int master_thread_fn(const struct microcode_patch *patch)
+{
+    int ret = 0;
+
+    while ( loading_state != LOADING_CALLIN )
+    {
+        if ( loading_state == LOADING_EXIT )
+            return ret;
+        cpu_relax();
+    }
+
+    self_nmi();

     while ( loading_state != LOADING_EXIT )
         cpu_relax();
@@ -387,35 +405,40 @@ static int master_thread_fn(const struct microcode_patch *patch)
     return ret;
 }

-static int control_thread_fn(const struct microcode_patch *patch)
+static void control_thread_work(void)
 {
-    unsigned int cpu = smp_processor_id(), done;
-    unsigned long tick;
     int ret;

-    /* Allow threads to call in */
-    loading_state = LOADING_CALLIN;
-    smp_mb();
-
-    cpumask_set_cpu(cpu, &cpu_callin_map);
-
     /* Waiting for all threads calling in */
     ret = wait_for_condition(wait_cpu_callin,
                              (void *)(unsigned long)num_online_cpus(),
                              MICROCODE_CALLIN_TIMEOUT_US);
     if ( ret ) {
         loading_state = LOADING_EXIT;
-        return ret;
+        return;
     }

     /* Let master threads load the given ucode update */
     loading_state = LOADING_ENTER;
     smp_mb();

-    ret = microcode_ops->apply_microcode(patch);
+    ret = microcode_ops->apply_microcode(nmi_patch);
     if ( !ret )
         atomic_inc(&cpu_updated);
     atomic_inc(&cpu_out);
+}
+
+static int control_thread_fn(const struct microcode_patch *patch)
+{
+    unsigned int done;
+    unsigned long tick;
+    int ret;
+
+    /* Allow threads to call in */
+    loading_state = LOADING_CALLIN;
+    smp_mb();
+
+    self_nmi();

     tick = rdtsc_ordered();
     /* Waiting for master threads finishing update */
@@ -481,12 +504,35 @@ static int do_microcode_update(void *patch)
     return ret;
 }

+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
+    unsigned int controller = cpumask_first(&cpu_online_map);
+
+    /* System-generated NMI, will be ignored */
+    if ( loading_state == LOADING_PREPARE )
+        return 0;
+
+    ASSERT(loading_state == LOADING_CALLIN);
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    if ( cpu == controller )
+        control_thread_work();
+    else if ( cpu == master )
+        master_thread_work();
+    else
+        slave_thread_work();
+
+    return 0;
+}
+
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
     unsigned int cpu, updated;
     struct microcode_patch *patch;
+    nmi_callback_t *saved_nmi_callback;

     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -551,6 +597,9 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      * watchdog timeout.
      */
     watchdog_disable();
+
+    nmi_patch = patch;
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
      *
@@ -563,6 +612,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
      *   conservative and good.
      */
     ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();

     updated = atomic_read(&cpu_updated);
-- 
2.17.1



* Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-08-28 15:12   ` Jan Beulich
  2019-08-29  7:15     ` Chao Gao
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2019-08-28 15:12 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> to a more generic function. So that it can be used alone to check
> an update against the CPU signature and current update revision.
> 
> Note that enum microcode_match_result will be used in common code
> (aka microcode.c), it has been placed in the common header.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> Reviewed-by: Jan Beulich <jbeulich@suse.com>

I don't think these can be legitimately retained with ...

> Changes in v9:
>  - microcode_update_match() doesn't accept (sig, pf, rev) any longer.
>  Hence, it won't be used to compare two arbitrary updates.

... this kind of a change.

> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>      return 0;
>  }
>  
> -static inline int microcode_update_match(
> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
> -    int sig, int pf)
> +/* Check an update against the CPU signature and current update revision */
> +static enum microcode_match_result microcode_update_match(
> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
>  {
> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
> -
> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
> -            (mc_header->rev > uci->cpu_sig.rev));
> +    const struct extended_sigtable *ext_header;
> +    const struct extended_signature *ext_sig;
> +    unsigned int i;
> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
> +    unsigned int sig = uci->cpu_sig.sig;
> +    unsigned int pf = uci->cpu_sig.pf;
> +    unsigned int rev = uci->cpu_sig.rev;
> +    unsigned long data_size = get_datasize(mc_header);
> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
> +
> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;

Didn't you lose a range check against "end" ahead of this if()?
get_totalsize() and get_datasize() aiui also would need to live
after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based
one. This would also affect the caller as it seems.
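
For illustration, a rough sketch of the kind of check meant here
(untested; it assumes the caller passes the remaining buffer size in,
and reuses the existing MC_HEADER_SIZE / get_totalsize() /
get_datasize() helpers):

    if ( size < MC_HEADER_SIZE ||
         size < get_totalsize(mc_header) )
        return MIS_UCODE; /* or a dedicated error indication */

    if ( get_datasize(mc_header) + MC_HEADER_SIZE >
         get_totalsize(mc_header) )
        return MIS_UCODE;

i.e. nothing beyond the fixed-size header would be looked at before
both checks have passed.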

Jan


* Re: [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
  2019-08-22 11:11   ` Roger Pau Monné
@ 2019-08-28 15:21   ` Jan Beulich
  2019-08-29 10:18   ` Jan Beulich
  2 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-28 15:21 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> +/* Return true if cache gets updated. Otherwise, return false */
> +bool microcode_update_cache(struct microcode_patch *patch)
> +{
> +
> +    ASSERT(spin_is_locked(&microcode_mutex));

Stray blank line ahead of this one.

> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -259,6 +259,31 @@ static int microcode_sanity_check(void *mc)
>      return 0;
>  }
>  
> +static bool match_cpu(const struct microcode_patch *patch)
> +{
> +    if ( !patch )
> +        return false;
> +
> +    return microcode_update_match(&patch->mc_intel->hdr,
> +                                  smp_processor_id()) == NEW_UCODE;
> +}
> +
> +static void free_patch(void *mc)
> +{
> +    xfree(mc);
> +}
> +
> +/*
> + * Both patches to compare are supposed to be applicable to local CPU.
> + * Just compare the revision number.
> + */
> +static enum microcode_match_result compare_patch(
> +    const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +    return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ?  NEW_UCODE :
> +                                                                OLD_UCODE;
> +}

The comment ahead of the function is nice, but please move it
inside the function and accompany it by ASSERT()s checking what
the comment says.
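
For illustration, roughly what such a version could look like
(untested; it reuses microcode_update_match() for the applicability
check and assumes MIS_UCODE is the "doesn't fit this CPU" result):

static enum microcode_match_result compare_patch(
    const struct microcode_patch *new, const struct microcode_patch *old)
{
    /*
     * Both patches to compare are supposed to be applicable to local CPU.
     * Just compare the revision number.
     */
    ASSERT(microcode_update_match(&new->mc_intel->hdr,
                                  smp_processor_id()) != MIS_UCODE);
    ASSERT(microcode_update_match(&old->mc_intel->hdr,
                                  smp_processor_id()) != MIS_UCODE);

    return new->mc_intel->hdr.rev > old->mc_intel->hdr.rev ? NEW_UCODE
                                                           : OLD_UCODE;
}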

> @@ -19,6 +27,11 @@ struct microcode_ops {
>      int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
>      int (*apply_microcode)(unsigned int cpu);
>      int (*start_update)(void);
> +    void (*free_patch)(void *mc);
> +    bool (*match_cpu)(const struct microcode_patch *patch);
> +    enum microcode_match_result (*compare_patch)(
> +            const struct microcode_patch *new,
> +            const struct microcode_patch *old);

I'm afraid indentation here doesn't really match our (sadly still
unwritten) conventions - it should be four more spaces than what
the previous (incomplete) line had.

Jan


* Re: [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
  2019-08-22 13:08   ` Roger Pau Monné
@ 2019-08-28 15:26   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-28 15:26 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -594,10 +594,6 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
>      xfree(mc_amd);
>  
>    out:
> -#if CONFIG_HVM
> -    svm_host_osvw_init();
> -#endif
> -
>      /*
>       * In some cases we may return an error even if processor's microcode has
>       * been updated. For example, the first patch in a container file is loaded
> @@ -609,27 +605,28 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
>  
>  static int start_update(void)
>  {
> -#if CONFIG_HVM
>      /*
> -     * We assume here that svm_host_osvw_init() will be called on each cpu (from
> -     * cpu_request_microcode()).
> -     *
> -     * Note that if collect_cpu_info() returns an error then
> -     * cpu_request_microcode() will not invoked thus leaving OSVW bits not
> -     * updated. Currently though collect_cpu_info() will not fail on processors
> -     * supporting OSVW so we will not deal with this possibility.
> +     * svm_host_osvw_init() will be called on each cpu by calling '.end_update'
> +     * in common code.
>       */
>      svm_host_osvw_reset();
> -#endif
>  
>      return 0;
>  }
>  
> +static void end_update(void)
> +{
> +    svm_host_osvw_init();
> +}
> +
>  static const struct microcode_ops microcode_amd_ops = {
>      .cpu_request_microcode            = cpu_request_microcode,
>      .collect_cpu_info                 = collect_cpu_info,
>      .apply_microcode                  = apply_microcode,
> +#if CONFIG_HVM

I realize it was wrong in the old code as well, but please use
#ifdef instead of #if.

>      .start_update                     = start_update,
> +    .end_update                       = end_update,
> +#endif

With this there'll be two warnings about unused functions when
!HVM - you need to frame the implementations with an #ifdef as
well.
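
For illustration (sketch only, re-using the bodies from this patch):

#ifdef CONFIG_HVM
static int start_update(void)
{
    /*
     * svm_host_osvw_init() will be called on each cpu by calling
     * '.end_update' in common code.
     */
    svm_host_osvw_reset();

    return 0;
}

static void end_update(void)
{
    svm_host_osvw_init();
}
#endif

together with #ifdef (not #if) around the two initializers below.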

Jan


* Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
  2019-08-29  7:15     ` Chao Gao
@ 2019-08-29  7:14       ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29  7:14 UTC (permalink / raw)
  To: Chao Gao
  Cc: Andrew Cooper, xen-devel, Ashok Raj, Wei Liu, Roger Pau Monné

On 29.08.2019 09:15, Chao Gao wrote:
> On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote:
>> On 19.08.2019 03:25, Chao Gao wrote:
>>> --- a/xen/arch/x86/microcode_intel.c
>>> +++ b/xen/arch/x86/microcode_intel.c
>>> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>>>      return 0;
>>>  }
>>>  
>>> -static inline int microcode_update_match(
>>> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>>> -    int sig, int pf)
>>> +/* Check an update against the CPU signature and current update revision */
>>> +static enum microcode_match_result microcode_update_match(
>>> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
>>>  {
>>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>>> -
>>> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>>> -            (mc_header->rev > uci->cpu_sig.rev));
>>> +    const struct extended_sigtable *ext_header;
>>> +    const struct extended_signature *ext_sig;
>>> +    unsigned int i;
>>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>>> +    unsigned int sig = uci->cpu_sig.sig;
>>> +    unsigned int pf = uci->cpu_sig.pf;
>>> +    unsigned int rev = uci->cpu_sig.rev;
>>> +    unsigned long data_size = get_datasize(mc_header);
>>> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
>>> +
>>> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>>> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>>
>> Didn't you lose a range check against "end" ahead of this if()?
>> get_totalsize() and get_datasize() aiui also would need to live
>> after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based
>> one. This would also affect the caller as it seems.
> 
> I think microcode_sanity_check() is for this purpose. We can do the
> sanity check before the if(). Perhaps we can just add an assertion
> that the sanity check won't fail: whenever the sanity check fails
> while parsing a ucode blob, we just drop the ucode, so we won't pass
> a broken ucode to this function.

Well - that's the main question. The purpose of this patch, after all,
is (aiui) to allow calling the function in more cases. If all callers
are indeed supposed to check the basic properties, then yes, an ASSERT()
would be fine.

Jan


* Re: [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match()
  2019-08-28 15:12   ` Jan Beulich
@ 2019-08-29  7:15     ` Chao Gao
  2019-08-29  7:14       ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-29  7:15 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On Wed, Aug 28, 2019 at 05:12:34PM +0200, Jan Beulich wrote:
>On 19.08.2019 03:25, Chao Gao wrote:
>> to a more generic function. So that it can be used alone to check
>> an update against the CPU signature and current update revision.
>> 
>> Note that enum microcode_match_result will be used in common code
>> (aka microcode.c), it has been placed in the common header.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>
>I don't think these can be legitimately retained with ...
>
>> Changes in v9:
>>  - microcode_update_match() doesn't accept (sig, pf, rev) any longer.
>>  Hence, it won't be used to compare two arbitrary updates.
>
>... this kind of a change.

Will drop RBs.

>
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -134,14 +134,39 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>>      return 0;
>>  }
>>  
>> -static inline int microcode_update_match(
>> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>> -    int sig, int pf)
>> +/* Check an update against the CPU signature and current update revision */
>> +static enum microcode_match_result microcode_update_match(
>> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
>>  {
>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>> -
>> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>> -            (mc_header->rev > uci->cpu_sig.rev));
>> +    const struct extended_sigtable *ext_header;
>> +    const struct extended_signature *ext_sig;
>> +    unsigned int i;
>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> +    unsigned int sig = uci->cpu_sig.sig;
>> +    unsigned int pf = uci->cpu_sig.pf;
>> +    unsigned int rev = uci->cpu_sig.rev;
>> +    unsigned long data_size = get_datasize(mc_header);
>> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
>> +
>> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>
>Didn't you lose a range check against "end" ahead of this if()?
>get_totalsize() and get_datasize() aiui also would need to live
>after a range check, just a sizeof() (i.e. MC_HEADER_SIZE) based
>one. This would also affect the caller as it seems.

I think microcode_sanity_check() is for this purpose. We can do the
sanity check before the if(). Perhaps we can just add an assertion
that the sanity check won't fail: whenever the sanity check fails
while parsing a ucode blob, we just drop the ucode, so we won't pass
a broken ucode to this function.
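
For illustration, a rough sketch of such an assertion (assuming
microcode_sanity_check() keeps its current 0-on-success signature; the
cast is only because that helper takes a non-const pointer):

    /* Callers are expected to have sanity-checked the blob already. */
    ASSERT(!microcode_sanity_check((void *)mc_header));

    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;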

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-23  9:09       ` Roger Pau Monné
@ 2019-08-29  7:37         ` Chao Gao
  2019-08-29  8:16           ` Roger Pau Monné
  2019-08-29 10:26           ` Jan Beulich
  0 siblings, 2 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-29  7:37 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
>On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
>> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
>> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
>> >> Both are loading the cached patch. Since APs call the unified function,
>> >> microcode_update_one(), during wakeup, the 'start_update' parameter
>> >> which originally used to distinguish BSP and APs is redundant. So remove
>> >> this parameter.
>> >> 
>> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> >> ---
>> >> Note that here is a functional change: resuming a CPU would call
>> >> ->end_update() now while previously it wasn't. Not quite sure
>> >> whether it is correct.
>> >
>> >I guess that's required if it called start_update prior to calling
>> >end_update?
>> >
>> >> 
>> >> Changes in v9:
>> >>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>> >>    microcode_update_one()
>> >>  - rebase and fix conflicts.
>> >> 
>> >> Changes in v8:
>> >>  - split out from the previous patch
>> >> ---
>> >>  xen/arch/x86/acpi/power.c       |  2 +-
>> >>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
>> >>  xen/arch/x86/smpboot.c          |  5 +--
>> >>  xen/include/asm-x86/processor.h |  4 +-
>> >>  4 files changed, 44 insertions(+), 57 deletions(-)
>> >> 
>> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>> >> index 4f21903..24798d5 100644
>> >> --- a/xen/arch/x86/acpi/power.c
>> >> +++ b/xen/arch/x86/acpi/power.c
>> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>> >>  
>> >>      console_end_sync();
>> >>  
>> >> -    microcode_resume_cpu();
>> >> +    microcode_update_one();
>> >>  
>> >>      if ( !recheck_cpu_features(0) )
>> >>          panic("Missing previously available feature(s)\n");
>> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> >> index a2febc7..bdd9c9f 100644
>> >> --- a/xen/arch/x86/microcode.c
>> >> +++ b/xen/arch/x86/microcode.c
>> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
>> >>      return NULL;
>> >>  }
>> >>  
>> >> -int microcode_resume_cpu(void)
>> >> -{
>> >> -    int err;
>> >> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
>> >> -
>> >> -    if ( !microcode_ops )
>> >> -        return 0;
>> >> -
>> >> -    spin_lock(&microcode_mutex);
>> >> -
>> >> -    err = microcode_ops->collect_cpu_info(sig);
>> >> -    if ( likely(!err) )
>> >> -        err = microcode_ops->apply_microcode(microcode_cache);
>> >> -    spin_unlock(&microcode_mutex);
>> >> -
>> >> -    return err;
>> >> -}
>> >> -
>> >>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>> >>  {
>> >>      microcode_ops->free_patch(microcode_patch->mc);
>> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>> >>  }
>> >>  __initcall(microcode_init);
>> >>  
>> >> -int __init early_microcode_update_cpu(bool start_update)
>> >> +/* Load a cached update to current cpu */
>> >> +int microcode_update_one(void)
>> >> +{
>> >> +    int rc;
>> >> +
>> >> +    if ( !microcode_ops )
>> >> +        return -EOPNOTSUPP;
>> >> +
>> >> +    rc = microcode_update_cpu(NULL);
>> >> +
>> >> +    if ( microcode_ops->end_update )
>> >> +        microcode_ops->end_update();
>> >
>> >Don't you need to call start_update before calling
>> >microcode_update_cpu?
>> 
>> No. On AMD side, osvw_status records the hardware erratum in the system.
>> As we don't assume all CPUs have the same erratum, each cpu calls
>> end_update to update osvw_status after ucode loading.
>> start_update just resets osvw_status to 0. And it is called once prior
>> to ucode loading on any CPU so that osvw_status can be recomputed.
>
>Oh, I think I understand it. start_update must only be called once
>_before_ the sequence to update the microcode on all CPUs is
>performed, while end_update needs to be called on _each_ CPU after the
>update has been completed in order to account for any erratas.
>
>The name for those hooks should be improved, I guess renaming
>end_update to end_update_each or end_update_percpu would be clearer in
>order to make it clear that start_update is global, while end_update
>is percpu. Anyway, I don't want to delay this series for a naming nit.
>
>I'm still unsure where start_update is called for the resume from
>suspension case, I don't seem to see any call to start_update neither
>in enter_state or microcode_update_one, hence I think this is missing?

No. Actually, start_update isn't called at all in the resume case.

>
>I would expect you need to clean osvw_status also on resume from
>suspension, in case microcode loading fails? Or else you will be
>carrying a stale osvw_status.

Then we need to send an IPI to all other CPUs to recompute osvw_status.
But I think it is not necessary. If the ucode cache isn't changed during
the CPU's suspension period, there is no stale osvw bit (assuming OSVW
on the resuming CPU won't change). If the ucode cache is updated (there
must have been a late ucode loading), osvw_status should have been
cleaned before that late ucode loading.

Thanks
Chao


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-29  7:37         ` Chao Gao
@ 2019-08-29  8:16           ` Roger Pau Monné
  2019-08-29 10:26           ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Roger Pau Monné @ 2019-08-29  8:16 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Jan Beulich, Ashok Raj, Wei Liu, Andrew Cooper

On Thu, Aug 29, 2019 at 03:37:47PM +0800, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
> >On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
> >> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
> >> >On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
> >> >> Both are loading the cached patch. Since APs call the unified function,
> >> >> microcode_update_one(), during wakeup, the 'start_update' parameter
> >> >> which originally used to distinguish BSP and APs is redundant. So remove
> >> >> this parameter.
> >> >> 
> >> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> >> ---
> >> >> Note that here is a functional change: resuming a CPU would call
> >> >> ->end_update() now while previously it wasn't. Not quite sure
> >> >> whether it is correct.
> >> >
> >> >I guess that's required if it called start_update prior to calling
> >> >end_update?
> >> >
> >> >> 
> >> >> Changes in v9:
> >> >>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
> >> >>    microcode_update_one()
> >> >>  - rebase and fix conflicts.
> >> >> 
> >> >> Changes in v8:
> >> >>  - split out from the previous patch
> >> >> ---
> >> >>  xen/arch/x86/acpi/power.c       |  2 +-
> >> >>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
> >> >>  xen/arch/x86/smpboot.c          |  5 +--
> >> >>  xen/include/asm-x86/processor.h |  4 +-
> >> >>  4 files changed, 44 insertions(+), 57 deletions(-)
> >> >> 
> >> >> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
> >> >> index 4f21903..24798d5 100644
> >> >> --- a/xen/arch/x86/acpi/power.c
> >> >> +++ b/xen/arch/x86/acpi/power.c
> >> >> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
> >> >>  
> >> >>      console_end_sync();
> >> >>  
> >> >> -    microcode_resume_cpu();
> >> >> +    microcode_update_one();
> >> >>  
> >> >>      if ( !recheck_cpu_features(0) )
> >> >>          panic("Missing previously available feature(s)\n");
> >> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> >> >> index a2febc7..bdd9c9f 100644
> >> >> --- a/xen/arch/x86/microcode.c
> >> >> +++ b/xen/arch/x86/microcode.c
> >> >> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
> >> >>      return NULL;
> >> >>  }
> >> >>  
> >> >> -int microcode_resume_cpu(void)
> >> >> -{
> >> >> -    int err;
> >> >> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
> >> >> -
> >> >> -    if ( !microcode_ops )
> >> >> -        return 0;
> >> >> -
> >> >> -    spin_lock(&microcode_mutex);
> >> >> -
> >> >> -    err = microcode_ops->collect_cpu_info(sig);
> >> >> -    if ( likely(!err) )
> >> >> -        err = microcode_ops->apply_microcode(microcode_cache);
> >> >> -    spin_unlock(&microcode_mutex);
> >> >> -
> >> >> -    return err;
> >> >> -}
> >> >> -
> >> >>  void microcode_free_patch(struct microcode_patch *microcode_patch)
> >> >>  {
> >> >>      microcode_ops->free_patch(microcode_patch->mc);
> >> >> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
> >> >>  }
> >> >>  __initcall(microcode_init);
> >> >>  
> >> >> -int __init early_microcode_update_cpu(bool start_update)
> >> >> +/* Load a cached update to current cpu */
> >> >> +int microcode_update_one(void)
> >> >> +{
> >> >> +    int rc;
> >> >> +
> >> >> +    if ( !microcode_ops )
> >> >> +        return -EOPNOTSUPP;
> >> >> +
> >> >> +    rc = microcode_update_cpu(NULL);
> >> >> +
> >> >> +    if ( microcode_ops->end_update )
> >> >> +        microcode_ops->end_update();
> >> >
> >> >Don't you need to call start_update before calling
> >> >microcode_update_cpu?
> >> 
> >> No. On AMD side, osvw_status records the hardware erratum in the system.
> >> As we don't assume all CPUs have the same erratum, each cpu calls
> >> end_update to update osvw_status after ucode loading.
> >> start_update just resets osvw_status to 0. And it is called once prior
> >> to ucode loading on any CPU so that osvw_status can be recomputed.
> >
> >Oh, I think I understand it. start_update must only be called once
> >_before_ the sequence to update the microcode on all CPUs is
> >performed, while end_update needs to be called on _each_ CPU after the
> >update has been completed in order to account for any erratas.
> >
> >The name for those hooks should be improved, I guess renaming
> >end_update to end_update_each or end_update_percpu would be clearer in
> >order to make it clear that start_update is global, while end_update
> >is percpu. Anyway, I don't want to delay this series for a naming nit.
> >
> >I'm still unsure where start_update is called for the resume from
> >suspension case, I don't seem to see any call to start_update neither
> >in enter_state or microcode_update_one, hence I think this is missing?
> 
> No. Actually, start_update isn't called at all in the resume case.
> 
> >
> >I would expect you need to clean osvw_status also on resume from
> >suspension, in case microcode loading fails? Or else you will be
> >carrying a stale osvw_status.
> 
> Then we need to send an IPI to all other CPUs to recompute osvw_status.

Why would you need to send an IPI? Aren't other CPUs going to update
the microcode, and hence call end_update?

AFAICT you only need to call start_update after returning from
suspension and before any CPU updates its microcode. Then osvw_status
will be updated by each CPU as the microcode gets loaded?

> But
> I think it is not necessary. If the ucode cache isn't changed during
> the CPU's suspension period, there is no stale osvw bit (assuming OSVW
> on the resuming CPU won't change). If the ucode cache is updated (there
> must have been a late ucode loading), osvw_status should have been
> cleaned before that late ucode loading.

It could be possible that a ucode which previously loaded fine throws
an error, but I agree that's quite unlikely. Anyway the fix seemed
trivial to me, but maybe I'm missing something.
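
For illustration, the kind of (assumed) trivial fix being talked about
here - a sketch only, with the exact place in the resume path left
open:

    /* Somewhere in enter_state(), ahead of microcode_update_one(): */
    if ( microcode_ops && microcode_ops->start_update )
        microcode_ops->start_update();

so that osvw_status gets recomputed by each CPU's ->end_update() as the
cached ucode is loaded again.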

Thanks, Roger.


* Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-22 13:59   ` Roger Pau Monné
@ 2019-08-29 10:06     ` Jan Beulich
  2019-08-30  3:22       ` Chao Gao
  0 siblings, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:06 UTC (permalink / raw)
  To: Roger Pau Monné, Chao Gao
  Cc: Andrew Cooper, Ashok Raj, Wei Liu, xen-devel

On 22.08.2019 15:59, Roger Pau Monné  wrote:
> Seeing how this works I'm not sure what's the best option here. As
> updating will be attempted on other CPUs, I'm not sure if it's OK to
> return an error if the update succeeded on some CPUs but failed on
> others.

The overall result of a partially successful update should be an
error - mismatched ucode may, after all, be more of a problem
than outdated ucode.

Jan


* Re: [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
  2019-08-22 11:11   ` Roger Pau Monné
  2019-08-28 15:21   ` Jan Beulich
@ 2019-08-29 10:18   ` Jan Beulich
  2 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:18 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> +static enum microcode_match_result compare_patch(
> +    const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +    return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ?  NEW_UCODE :
> +                                                                OLD_UCODE;

There's one too many blank after the ? here. Also we commonly align
the : under the ? in cases like this one.

Jan


* Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
  2019-08-22 13:59   ` Roger Pau Monné
@ 2019-08-29 10:19   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:19 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -300,32 +322,44 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      if ( microcode_ops == NULL )
>          return -EINVAL;
>  
> -    info = xmalloc_bytes(sizeof(*info) + len);
> -    if ( info == NULL )
> +    buffer = xmalloc_bytes(len);
> +    if ( !buffer )
>          return -ENOMEM;
>  
> -    ret = copy_from_guest(info->buffer, buf, len);
> -    if ( ret != 0 )
> +    if ( copy_from_guest(buffer, buf, len) )
>      {
> -        xfree(info);
> -        return ret;
> +        ret = -EFAULT;
> +        goto free;
>      }
>  
> -    info->buffer_size = len;
> -    info->error = 0;
> -    info->cpu = cpumask_first(&cpu_online_map);
> -
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
>          if ( ret != 0 )
> -        {
> -            xfree(info);
> -            return ret;
> -        }
> +            goto free;
>      }
>  
> -    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +    patch = parse_blob(buffer, len);
> +    if ( IS_ERR(patch) )
> +    {
> +        ret = PTR_ERR(patch);
> +        printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);

I think this wants to be at least XENLOG_WARNING.

> @@ -372,23 +406,46 @@ int __init early_microcode_update_cpu(bool start_update)
>  
>      microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>  
> -    if ( data )
> +    if ( !data )
> +        return -ENOMEM;
> +
> +    if ( start_update )
>      {
> -        if ( start_update && microcode_ops->start_update )
> +        struct microcode_patch *patch;
> +
> +        if ( microcode_ops->start_update )
>              rc = microcode_ops->start_update();
>  
>          if ( rc )
>              return rc;
>  
> -        rc = microcode_update_cpu(data, len);
> +        patch = parse_blob(data, len);
> +        if ( IS_ERR(patch) )
> +        {
> +            printk(XENLOG_INFO "Parsing microcode blob error %ld\n",

Same here.

> +                   PTR_ERR(patch));
> +            return PTR_ERR(patch);
> +        }
> +
> +        if ( !patch )
> +        {
> +            printk(XENLOG_INFO "No ucode found. Update aborted!\n");

Here I'm not sure the message is worthwhile to have.

> @@ -41,8 +42,6 @@ struct cpu_signature {
>  DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
>  extern const struct microcode_ops *microcode_ops;
>  
> -const struct microcode_patch *microcode_get_cache(void);
> -bool microcode_update_cache(struct microcode_patch *patch);

If you remove the declaration but not the definition, then the
latter should become static.
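
I.e. simply (sketch):

    static bool microcode_update_cache(struct microcode_patch *patch)

in microcode.c, with the prototype dropped from the header as done here.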

Jan


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-29  7:37         ` Chao Gao
  2019-08-29  8:16           ` Roger Pau Monné
@ 2019-08-29 10:26           ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:26 UTC (permalink / raw)
  To: Chao Gao
  Cc: Andrew Cooper, xen-devel, Ashok Raj, Wei Liu, Roger Pau Monné

On 29.08.2019 09:37, Chao Gao wrote:
> On Fri, Aug 23, 2019 at 11:09:07AM +0200, Roger Pau Monné wrote:
>> On Fri, Aug 23, 2019 at 12:44:34AM +0800, Chao Gao wrote:
>>> On Thu, Aug 22, 2019 at 04:10:46PM +0200, Roger Pau Monné wrote:
>>>> On Mon, Aug 19, 2019 at 09:25:24AM +0800, Chao Gao wrote:
>>>>> Both are loading the cached patch. Since APs call the unified function,
>>>>> microcode_update_one(), during wakeup, the 'start_update' parameter
>>>>> which originally used to distinguish BSP and APs is redundant. So remove
>>>>> this parameter.
>>>>>
>>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>>> ---
>>>>> Note that here is a functional change: resuming a CPU would call
>>>>> ->end_update() now while previously it wasn't. Not quite sure
>>>>> whether it is correct.
>>>>
>>>> I guess that's required if it called start_update prior to calling
>>>> end_update?
>>>>
>>>>>
>>>>> Changes in v9:
>>>>>  - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
>>>>>    microcode_update_one()
>>>>>  - rebase and fix conflicts.
>>>>>
>>>>> Changes in v8:
>>>>>  - split out from the previous patch
>>>>> ---
>>>>>  xen/arch/x86/acpi/power.c       |  2 +-
>>>>>  xen/arch/x86/microcode.c        | 90 ++++++++++++++++++-----------------------
>>>>>  xen/arch/x86/smpboot.c          |  5 +--
>>>>>  xen/include/asm-x86/processor.h |  4 +-
>>>>>  4 files changed, 44 insertions(+), 57 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
>>>>> index 4f21903..24798d5 100644
>>>>> --- a/xen/arch/x86/acpi/power.c
>>>>> +++ b/xen/arch/x86/acpi/power.c
>>>>> @@ -253,7 +253,7 @@ static int enter_state(u32 state)
>>>>>  
>>>>>      console_end_sync();
>>>>>  
>>>>> -    microcode_resume_cpu();
>>>>> +    microcode_update_one();
>>>>>  
>>>>>      if ( !recheck_cpu_features(0) )
>>>>>          panic("Missing previously available feature(s)\n");
>>>>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>>>>> index a2febc7..bdd9c9f 100644
>>>>> --- a/xen/arch/x86/microcode.c
>>>>> +++ b/xen/arch/x86/microcode.c
>>>>> @@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, uint32_t len)
>>>>>      return NULL;
>>>>>  }
>>>>>  
>>>>> -int microcode_resume_cpu(void)
>>>>> -{
>>>>> -    int err;
>>>>> -    struct cpu_signature *sig = &this_cpu(cpu_sig);
>>>>> -
>>>>> -    if ( !microcode_ops )
>>>>> -        return 0;
>>>>> -
>>>>> -    spin_lock(&microcode_mutex);
>>>>> -
>>>>> -    err = microcode_ops->collect_cpu_info(sig);
>>>>> -    if ( likely(!err) )
>>>>> -        err = microcode_ops->apply_microcode(microcode_cache);
>>>>> -    spin_unlock(&microcode_mutex);
>>>>> -
>>>>> -    return err;
>>>>> -}
>>>>> -
>>>>>  void microcode_free_patch(struct microcode_patch *microcode_patch)
>>>>>  {
>>>>>      microcode_ops->free_patch(microcode_patch->mc);
>>>>> @@ -384,11 +366,29 @@ static int __init microcode_init(void)
>>>>>  }
>>>>>  __initcall(microcode_init);
>>>>>  
>>>>> -int __init early_microcode_update_cpu(bool start_update)
>>>>> +/* Load a cached update to current cpu */
>>>>> +int microcode_update_one(void)
>>>>> +{
>>>>> +    int rc;
>>>>> +
>>>>> +    if ( !microcode_ops )
>>>>> +        return -EOPNOTSUPP;
>>>>> +
>>>>> +    rc = microcode_update_cpu(NULL);
>>>>> +
>>>>> +    if ( microcode_ops->end_update )
>>>>> +        microcode_ops->end_update();
>>>>
>>>> Don't you need to call start_update before calling
>>>> microcode_update_cpu?
>>>
>>> No. On AMD side, osvw_status records the hardware erratum in the system.
>>> As we don't assume all CPUs have the same erratum, each cpu calls
>>> end_update to update osvw_status after ucode loading.
>>> start_update just resets osvw_status to 0. And it is called once prior
>>> to ucode loading on any CPU so that osvw_status can be recomputed.
>>
>> Oh, I think I understand it. start_update must only be called once
>> _before_ the sequence to update the microcode on all CPUs is
>> performed, while end_update needs to be called on _each_ CPU after the
>> update has been completed in order to account for any erratas.
>>
>> The name for those hooks should be improved, I guess renaming
>> end_update to end_update_each or end_update_percpu would be clearer in
>> order to make it clear that start_update is global, while end_update
>> is percpu. Anyway, I don't want to delay this series for a naming nit.
>>
>> I'm still unsure where start_update is called for the resume from
>> suspension case, I don't seem to see any call to start_update neither
>> in enter_state or microcode_update_one, hence I think this is missing?
> 
> No. Actually, start_update isn't called at all in the resume case.
> 
>>
>> I would expect you need to clean osvw_status also on resume from
>> suspension, in case microcode loading fails? Or else you will be
>> carrying a stale osvw_status.
> 
> Then we need to send an IPI to all other CPUs to recompute osvw_status.
> But I think it is not necessary. If the ucode cache isn't changed during
> the CPU's suspension period, there is no stale osvw bit (assuming OSVW
> on the resuming CPU won't change). If the ucode cache is updated (there
> must have been a late ucode loading), osvw_status should have been
> cleaned before that late ucode loading.

I'd actually expect firmware to load whatever ucode it has available,
in which case the OSVW state can very well change across resume. I
agree though that after a successful load of the ucode Xen has cached,
that state should be the pre-suspend one again. Yet I guess it would be
more consistent if a proper start-update, ucode-load, end-update cycle
was done even in this case.

Jan


* Re: [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
  2019-08-22 14:10   ` Roger Pau Monné
@ 2019-08-29 10:29   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:29 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> Both are loading the cached patch. Since APs call the unified function,
> microcode_update_one(), during wakeup, the 'start_update' parameter
> which originally used to distinguish BSP and APs is redundant. So remove
> this parameter.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Note that here is a functional change: resuming a CPU would call
> ->end_update() now while previously it wasn't. Not quite sure
> whether it is correct.

I think it is, as implied by the other response I've sent. But it
should then (as also said) include calling ->start_update() as well.

Jan


* Re: [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
  2019-08-23  8:11   ` Roger Pau Monné
@ 2019-08-29 10:47   ` Jan Beulich
  1 sibling, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 10:47 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -542,29 +505,21 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>      while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
>                                                 &offset)) == 0 )
>      {
> -        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
> -
> -        if ( IS_ERR(new_patch) )
> -        {
> -            error = PTR_ERR(new_patch);
> -            break;
> -        }
> -
>          /*
> -         * If the new patch covers current CPU, compare patches and store the
> +         * If the new ucode covers current CPU, compare ucodes and store the
>           * one with higher revision.
>           */
> -        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
> -             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
> +#define REV_ID(mpb) (((struct microcode_header_amd *)(mpb))->processor_rev_id)
> +        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
> +             (!saved || (REV_ID(mc_amd->mpb) > REV_ID(saved))) )
> +#undef REV_ID

I'm not happy with this helper #define, the more that "saved" already is
of the correct type. compare_patch() in reality only acts on the header,
so I'd suggest having that function forward to a new compare_header()
(or some other suitable name) and use that new function here as well.
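
For illustration, a rough sketch (untested; 'processor_rev_id' used
just as the REV_ID() #define above does):

static enum microcode_match_result compare_header(
    const struct microcode_header_amd *new,
    const struct microcode_header_amd *old)
{
    return new->processor_rev_id > old->processor_rev_id ? NEW_UCODE
                                                         : OLD_UCODE;
}

with compare_patch() forwarding its two patches' headers to it, and the
loop above calling compare_header() (on suitably typed pointers)
instead of open-coding the comparison.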

> @@ -379,47 +360,47 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
>  {
>      long offset = 0;
>      int error = 0;
> -    void *mc;
> +    struct microcode_intel *mc, *saved = NULL;
>      struct microcode_patch *patch = NULL;
>  
> -    while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
> +    while ( (offset = get_next_ucode_from_buffer((void **)&mc, buf,

Casts like this make me rather nervous. Please see about getting rid of
it (by using a union or a 2nd local variable).
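
For illustration, the second-local-variable variant (sketch only):

    void *data;
    struct microcode_intel *saved = NULL;

    while ( (offset = get_next_ucode_from_buffer(&data, buf, size,
                                                 offset)) > 0 )
    {
        struct microcode_intel *mc = data;

        /* ... rest of the loop body as in the patch ... */
    }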

Jan


* Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
  2019-08-19 10:27   ` Sergey Dyasli
@ 2019-08-29 12:06   ` Jan Beulich
  2019-08-30  3:30     ` Chao Gao
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 12:06 UTC (permalink / raw)
  To: Chao Gao
  Cc: Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel, Thomas Gleixner, Borislav Petkov, Roger Pau Monné

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch *patch)
>      return true;
>  }
>  
> +/* Wait for a condition to be met with a timeout (us). */
> +static int wait_for_condition(int (*func)(void *data), void *data,
> +                         unsigned int timeout)
> +{
> +    while ( !func(data) )
> +    {
> +        if ( !timeout-- )
> +        {
> +            printk("CPU%u: Timeout in %pS\n",
> +                   smp_processor_id(), __builtin_return_address(0));
> +            return -EBUSY;
> +        }
> +        udelay(1);
> +    }
> +
> +    return 0;
> +}
> +
> +static int wait_cpu_callin(void *nr)
> +{
> +    return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
> +}
> +
> +static int wait_cpu_callout(void *nr)
> +{
> +    return atomic_read(&cpu_out) >= (unsigned long)nr;
> +}

Since wait_for_condition() is used with only these two functions
as callbacks, they should imo return bool and take const void *.
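
For illustration (sketch):

static bool wait_cpu_callin(const void *nr)
{
    return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
}

static bool wait_cpu_callout(const void *nr)
{
    return atomic_read(&cpu_out) >= (unsigned long)nr;
}

with wait_for_condition() then taking bool (*func)(const void *data)
and a const void *data argument.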

> @@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>      return err;
>  }
>  
> -static long do_microcode_update(void *patch)
> +static int slave_thread_fn(void)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> +
> +    while ( loading_state != LOADING_CALLIN )
> +        cpu_relax();
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    /* Copy update revision from the "master" thread. */
> +    this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev;
> +
> +    return 0;
> +}
> +
> +static int master_thread_fn(const struct microcode_patch *patch)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    int ret = 0;
> +
> +    while ( loading_state != LOADING_CALLIN )
> +        cpu_relax();
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_ENTER )
> +        cpu_relax();
> +
> +    /*
> +     * If an error happened, control thread would set 'loading_state'
> +     * to LOADING_EXIT. Don't perform ucode loading for this case
> +     */
> +    if ( loading_state == LOADING_EXIT )
> +        return ret;

Even if the producer transitions this through ENTER to EXIT, the
observer here may never get to see the ENTER state, and hence
never exit the loop above. You want either < ENTER or == CALLIN.
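
For illustration, i.e. something like (sketch, using the "== CALLIN"
variant):

    while ( loading_state == LOADING_CALLIN )
        cpu_relax();

    /*
     * On an error the control thread moves straight to LOADING_EXIT,
     * so don't perform the ucode load unless ENTER was really reached.
     */
    if ( loading_state != LOADING_ENTER )
        return ret;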

> +    ret = microcode_ops->apply_microcode(patch);
> +    if ( !ret )
> +        atomic_inc(&cpu_updated);
> +    atomic_inc(&cpu_out);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    return ret;
> +}

As a cosmetic remark, I don't think "master" and "slave" are
suitable terms here. "primary" and "secondary" would imo come
closer to what the threads' relationship is.

> +static int control_thread_fn(const struct microcode_patch *patch)
>  {
> -    unsigned int cpu;
> +    unsigned int cpu = smp_processor_id(), done;
> +    unsigned long tick;
> +    int ret;
>  
> -    /* Store the patch after a successful loading */
> -    if ( !microcode_update_cpu(patch) && patch )
> +    /* Allow threads to call in */
> +    loading_state = LOADING_CALLIN;
> +    smp_mb();

Why not just smp_wmb()? (Same further down then.)

> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    /* Waiting for all threads calling in */
> +    ret = wait_for_condition(wait_cpu_callin,
> +                             (void *)(unsigned long)num_online_cpus(),
> +                             MICROCODE_CALLIN_TIMEOUT_US);
> +    if ( ret ) {

Misplaced brace.

> +static int do_microcode_update(void *patch)

const?

> @@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      {
>          ret = PTR_ERR(patch);
>          printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
> -        goto free;
> +        goto put;
>      }
>  
>      if ( !patch )
>      {
>          printk(XENLOG_INFO "No ucode found. Update aborted!\n");
>          ret = -EINVAL;
> -        goto free;
> +        goto put;
> +    }
> +
> +    cpumask_clear(&cpu_callin_map);
> +    atomic_set(&cpu_out, 0);
> +    atomic_set(&cpu_updated, 0);
> +    loading_state = LOADING_PREPARE;
> +
> +    /* Calculate the number of online CPU core */
> +    nr_cores = 0;
> +    for_each_online_cpu(cpu)
> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +            nr_cores++;
> +
> +    printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
> +
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();
> +    /*
> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> +     *
> +     * - HT siblings must be idle and not execute other code while the other
> +     *   sibling is loading microcode in order to avoid any negative
> +     *   interactions cause by the loading.
> +     *
> +     * - In addition, microcode update on the cores must be serialized until
> +     *   this requirement can be relaxed in the future. Right now, this is
> +     *   conservative and good.
> +     */
> +    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
> +    watchdog_enable();

Considering that stop_machine_run() doesn't itself disable the watchdog,
did you consider having the control thread disable/enable the watchdog,
thus shortening the period where it's not active?
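
For illustration (sketch only; assuming control_thread_fn() is indeed
the natural place):

    /* First action of the control thread: */
    watchdog_disable();

    /* ... call-in rendezvous, apply_microcode(), wait for cpu_out ... */

    /* Last action before letting the other threads exit: */
    watchdog_enable();

with the watchdog_disable()/watchdog_enable() pair then dropped from
microcode_update() itself.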

> +    updated = atomic_read(&cpu_updated);
> +    if ( updated > 0 )
> +    {
> +        spin_lock(&microcode_mutex);
> +        microcode_update_cache(patch);
> +        spin_unlock(&microcode_mutex);
>      }
> +    else
> +        microcode_free_patch(patch);
>  
> -    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
> -                                    do_microcode_update, patch);
> +    if ( updated && updated != nr_cores )
> +        printk(XENLOG_ERR "ERROR: Updating microcode succeeded on %u cores and failed\n"
> +               XENLOG_ERR "on other %u cores. A system with differing microcode \n"

Stray blank before newline.

> +               XENLOG_ERR "revisions is considered unstable. Please reboot and do not\n"
> +               XENLOG_ERR "load the microcode that triggersthis warning!\n",

Missing blank before "this".

Jan


* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-27  4:52       ` Chao Gao
  2019-08-28  8:52         ` Sergey Dyasli
@ 2019-08-29 12:11         ` Jan Beulich
  2019-08-30  6:35           ` Chao Gao
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 12:11 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 27.08.2019 06:52, Chao Gao wrote:
> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>> register an nmi callback. And this callback does busy-loop on threads
>>>> which are waiting for loading completion. Control threads send NMI to
>>>> slave threads to prevent NMI acceptance during ucode loading.
>>>>
>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>> ---
>>>> Changes in v9:
>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>>>  loading. Note that self-nmi is invalid according to SDM.
>>>
>>> To me this looks like a half-measure: why keep only slave threads in
>>> the NMI handler, when master threads can update the microcode from
>>> inside the NMI handler as well?
>>
>> No special reason. Because the issue we want to address is that slave
>> threads might go to handle NMI and access MSRs when master thread is
>> loading ucode. So we only keep slave threads in the NMI handler.
>>
>>>
>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>> used for apply_alternatives() during boot, so can be trusted to work.
>>
>> Sorry, I meant using self shorthand to send self-nmi. I tried to use
>> self shorthand but got APIC error. And I agree that it is better to
>> make slave thread call self_nmi() itself.
>>
>>>
>>> I experimented a bit with the following approach: after loading_state
>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>> the NMI handler. And it seems to work.
>>>
>>> Separate question is about the safety of this approach: can we be sure
>>> that a ucode update would not reset the status of the NMI latch? I.e.
>>> can it cause another NMI to be delivered while Xen already handles one?
>>
>> Ashok, what's your opinion on Sergey's approach and his concern?
> 
> I talked with Ashok. We think your approach is better. I will follow
> your approach in v10. It would be much helpful if you post your patch
> so that I can just rebase it onto other patches.

Doing the actual ucode update inside an NMI handler seems rather risky
to me. Even if Ashok confirmed it would not be an issue on past and
current Intel CPUs - what about future ones, or ones from other vendors?

Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-19  1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
  2019-08-23  8:46   ` Sergey Dyasli
@ 2019-08-29 12:22   ` Jan Beulich
  2019-08-30  6:33     ` Chao Gao
  1 sibling, 1 reply; 57+ messages in thread
From: Jan Beulich @ 2019-08-29 12:22 UTC (permalink / raw)
  To: Chao Gao
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On 19.08.2019 03:25, Chao Gao wrote:
> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>      return ret;
>  }
>  
> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +    /* The first thread of a core is to load an update. Don't block it. */
> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
> +         loading_state != LOADING_CALLIN )
> +        return 0;
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    while ( loading_state != LOADING_EXIT )
> +        cpu_relax();
> +
> +    return 0;
> +}

By returning 0 you tell do_nmi() to continue processing the NMI.
Since you can't tell whether a non-IPI NMI has surfaced at about
the same time this is generally the right thing imo, but how do
you prevent unknown_nmi_error() from getting entered when do_nmi()
ends up setting handle_unknown to true? (The question is mostly
rhetorical, but there's a disconnect between do_nmi() checking
"cpu == 0" and the control thread running on
cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
dependency on CPU 0 never going offline. IOW my request is to at
least make this less well hidden, such that it can be noticed if
and when someone endeavors to remove said limitation.)

Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-29 10:06     ` Jan Beulich
@ 2019-08-30  3:22       ` Chao Gao
  2019-08-30  7:25         ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-30  3:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, xen-devel, Ashok Raj, Wei Liu, Roger Pau Monné

On Thu, Aug 29, 2019 at 12:06:28PM +0200, Jan Beulich wrote:
>On 22.08.2019 15:59, Roger Pau Monné  wrote:
>> Seeing how this works I'm not sure what's the best option here. As
>> updating will be attempted on other CPUs, I'm not sure if it's OK to
>> return an error if the update succeed on some CPUs but failed on
>> others.
>
>The overall result of a partially successful update should be an
>error - mismatched ucode may, after all, be more of a problem
>than outdated ucode.

I will only treat the -EIO case as an error. If a system already has
differing ucode revisions on its cores, a partial update is expected when
we try to correct it with an ucode equal to the newest revision already
present on some cores.
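In code it would be roughly the following on the per-core path (just a
sketch, not the final patch; "loading_err" is a placeholder name):

    int rc = microcode_ops->apply_microcode(patch);

    /*
     * Only a hard failure of the update operation itself is fatal.  A core
     * that is already at (or beyond) the patch revision is not an error.
     */
    if ( rc == -EIO )
    {
        loading_err = true;
        ret = rc;
    }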

Thanks
Chao

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
  2019-08-29 12:06   ` Jan Beulich
@ 2019-08-30  3:30     ` Chao Gao
  0 siblings, 0 replies; 57+ messages in thread
From: Chao Gao @ 2019-08-30  3:30 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel, Thomas Gleixner, Borislav Petkov, Roger Pau Monné

On Thu, Aug 29, 2019 at 02:06:39PM +0200, Jan Beulich wrote:
>On 19.08.2019 03:25, Chao Gao wrote:
>> +
>> +static int master_thread_fn(const struct microcode_patch *patch)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    int ret = 0;
>> +
>> +    while ( loading_state != LOADING_CALLIN )
>> +        cpu_relax();
>> +
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +
>> +    while ( loading_state != LOADING_ENTER )
>> +        cpu_relax();
>> +
>> +    /*
>> +     * If an error happened, control thread would set 'loading_state'
>> +     * to LOADING_EXIT. Don't perform ucode loading for this case
>> +     */
>> +    if ( loading_state == LOADING_EXIT )
>> +        return ret;
>
>Even if the producer transitions this through ENTER to EXIT, the
>observer here may never get to see the ENTER state, and hence
>never exit the loop above. You want either < ENTER or == CALLIN.

Yes. stopmachine_action() is a good example of how to implement such a
state machine; I will follow it.
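Concretely, the wait in master_thread_fn() would become something like this
(a sketch, assuming the states are ordered CALLIN < ENTER < EXIT):

    while ( loading_state == LOADING_CALLIN )
        cpu_relax();

    /* The control thread may have gone straight to LOADING_EXIT on error. */
    if ( loading_state != LOADING_ENTER )
        return ret;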

>
>> +    ret = microcode_ops->apply_microcode(patch);
>> +    if ( !ret )
>> +        atomic_inc(&cpu_updated);
>> +    atomic_inc(&cpu_out);
>> +
>> +    while ( loading_state != LOADING_EXIT )
>> +        cpu_relax();
>> +
>> +    return ret;
>> +}
>
>As a cosmetic remark, I don't think "master" and "slave" are
>suitable terms here. "primary" and "secondary" would imo come
>closer to what the threads' relationship is.

Will do.

>> +
>> +    /*
>> +     * We intend to disable interrupt for long time, which may lead to
>> +     * watchdog timeout.
>> +     */
>> +    watchdog_disable();
>> +    /*
>> +     * Late loading dance. Why the heavy-handed stop_machine effort?
>> +     *
>> +     * - HT siblings must be idle and not execute other code while the other
>> +     *   sibling is loading microcode in order to avoid any negative
>> +     *   interactions cause by the loading.
>> +     *
>> +     * - In addition, microcode update on the cores must be serialized until
>> +     *   this requirement can be relaxed in the future. Right now, this is
>> +     *   conservative and good.
>> +     */
>> +    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
>> +    watchdog_enable();
>
>Considering that stop_machine_run() doesn't itself disable the watchdog,
>did you consider having the control thread disable/enable the watchdog,
>thus shortening the period where it's not active?

Good idea. It helps to keep the code here clean. Perhaps
microcode_nmi_callback() can be registered by the control thread as well.
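Something along these lines (a sketch only; do_rendezvous() stands in for
driving the CALLIN/ENTER/EXIT sequence):

    static int control_thread_fn(const struct microcode_patch *patch)
    {
        int ret;

        /*
         * Keep the watchdog disabled and the NMI callback installed only
         * while the rendezvous is actually in progress.
         */
        watchdog_disable();
        set_nmi_callback(microcode_nmi_callback);

        ret = do_rendezvous(patch);

        unset_nmi_callback();
        watchdog_enable();

        return ret;
    }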

Thanks
Chao

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-29 12:22   ` Jan Beulich
@ 2019-08-30  6:33     ` Chao Gao
  2019-08-30  7:30       ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-30  6:33 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Roger Pau Monné, Ashok Raj, Wei Liu, Andrew Cooper

On Thu, Aug 29, 2019 at 02:22:47PM +0200, Jan Beulich wrote:
>On 19.08.2019 03:25, Chao Gao wrote:
>> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>>      return ret;
>>  }
>>  
>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>> +{
>> +    /* The first thread of a core is to load an update. Don't block it. */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
>> +         loading_state != LOADING_CALLIN )
>> +        return 0;
>> +
>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>> +
>> +    while ( loading_state != LOADING_EXIT )
>> +        cpu_relax();
>> +
>> +    return 0;
>> +}
>
>By returning 0 you tell do_nmi() to continue processing the NMI.
>Since you can't tell whether a non-IPI NMI has surfaced at about
>the same time this is generally the right thing imo, but how do
>you prevent unknown_nmi_error() from getting entered when do_nmi()
>ends up setting handle_unknown to true? (The question is mostly
>rhetorical, but there's a disconnect between do_nmi() checking
>"cpu == 0" and the control thread running on
>cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
>dependency on CPU 0 never going offline. IOW my request is to at
>least make this less well hidden, such that it can be noticed if
>and when someone endeavors to remove said limitation.)

It seems the issue is that we cannot send an IPI NMI to the BSP, otherwise
unknown_nmi_error() would be triggered. But loading ucode after
rendezvousing all CPUs in the NMI handler expects every CPU to receive the
IPI NMI, so this approach will always have that issue.

Considering that self_nmi() is already called elsewhere, could we provide a
way to temporarily suppress or (forcibly) ignore the unknown NMI error?
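Purely as an illustration of what I mean (the flag name is made up):

    static bool ucode_loading_in_progress;  /* set around the rendezvous */

    /* e.g. at the top of unknown_nmi_error(): */
    if ( ucode_loading_in_progress )
        return;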

Thanks
Chao

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-29 12:11         ` Jan Beulich
@ 2019-08-30  6:35           ` Chao Gao
  2019-09-09  5:52             ` Chao Gao
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-08-30  6:35 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On Thu, Aug 29, 2019 at 02:11:10PM +0200, Jan Beulich wrote:
>On 27.08.2019 06:52, Chao Gao wrote:
>> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>>> register an nmi callback. And this callback does busy-loop on threads
>>>>> which are waiting for loading completion. Control threads send NMI to
>>>>> slave threads to prevent NMI acceptance during ucode loading.
>>>>>
>>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>>> ---
>>>>> Changes in v9:
>>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>>>>  loading. Note that self-nmi is invalid according to SDM.
>>>>
>>>> To me this looks like a half-measure: why keep only slave threads in
>>>> the NMI handler, when master threads can update the microcode from
>>>> inside the NMI handler as well?
>>>
>>> No special reason. Because the issue we want to address is that slave
>>> threads might go to handle NMI and access MSRs when master thread is
>>> loading ucode. So we only keep slave threads in the NMI handler.
>>>
>>>>
>>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>>> used for apply_alternatives() during boot, so can be trusted to work.
>>>
>>> Sorry, I meant using self shorthand to send self-nmi. I tried to use
>>> self shorthand but got APIC error. And I agree that it is better to
>>> make slave thread call self_nmi() itself.
>>>
>>>>
>>>> I experimented a bit with the following approach: after loading_state
>>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>>> the NMI handler. And it seems to work.
>>>>
>>>> Separate question is about the safety of this approach: can we be sure
>>>> that a ucode update would not reset the status of the NMI latch? I.e.
>>>> can it cause another NMI to be delivered while Xen already handles one?
>>>
>>> Ashok, what's your opinion on Sergey's approach and his concern?
>> 
>> I talked with Ashok. We think your approach is better. I will follow
>> your approach in v10. It would be much helpful if you post your patch
>> so that I can just rebase it onto other patches.
>
>Doing the actual ucode update inside an NMI handler seems rather risky
>to me. Even if Ashok confirmed it would not be an issue on past and
>current Intel CPUs - what about future ones, or ones from other vendors?

Will confirm these with Ashok.

Thanks
Chao

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-08-30  3:22       ` Chao Gao
@ 2019-08-30  7:25         ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-30  7:25 UTC (permalink / raw)
  To: Chao Gao
  Cc: Andrew Cooper, xen-devel, Ashok Raj, Wei Liu, Roger Pau Monné

On 30.08.2019 05:22, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 12:06:28PM +0200, Jan Beulich wrote:
>> On 22.08.2019 15:59, Roger Pau Monné  wrote:
>>> Seeing how this works I'm not sure what's the best option here. As
>>> updating will be attempted on other CPUs, I'm not sure if it's OK to
>>> return an error if the update succeed on some CPUs but failed on
>>> others.
>>
>> The overall result of a partially successful update should be an
>> error - mismatched ucode may, after all, be more of a problem
>> than outdated ucode.
> 
> I will only treat the -EIO case as an error. If a system already has
> differing ucode revisions on its cores, a partial update is expected when
> we try to correct it with an ucode equal to the newest revision already
> present on some cores.

But an update attempt with what's already loaded in the CPU should
yield "success", hence a "partial" update like what you describe
should not be considered "partial" in the first place. Iirc an
update attempt when same (or newer?) ucode is already loaded on
all cores yields "success" too, doesn't it?
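I.e. apply_microcode() behaving along the lines of (sketch; the revision
variable names are made up):

    if ( new_rev <= current_rev )
        return 0;  /* already up to date - success, not an error */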

Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-30  6:33     ` Chao Gao
@ 2019-08-30  7:30       ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-08-30  7:30 UTC (permalink / raw)
  To: Chao Gao
  Cc: Andrew Cooper, xen-devel, Ashok Raj, Wei Liu, Roger Pau Monné

On 30.08.2019 08:33, Chao Gao wrote:
> On Thu, Aug 29, 2019 at 02:22:47PM +0200, Jan Beulich wrote:
>> On 19.08.2019 03:25, Chao Gao wrote:
>>> @@ -481,12 +478,28 @@ static int do_microcode_update(void *patch)
>>>      return ret;
>>>  }
>>>  
>>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    /* The first thread of a core is to load an update. Don't block it. */
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) ||
>>> +         loading_state != LOADING_CALLIN )
>>> +        return 0;
>>> +
>>> +    cpumask_set_cpu(cpu, &cpu_callin_map);
>>> +
>>> +    while ( loading_state != LOADING_EXIT )
>>> +        cpu_relax();
>>> +
>>> +    return 0;
>>> +}
>>
>> By returning 0 you tell do_nmi() to continue processing the NMI.
>> Since you can't tell whether a non-IPI NMI has surfaced at about
>> the same time this is generally the right thing imo, but how do
>> you prevent unknown_nmi_error() from getting entered when do_nmi()
>> ends up setting handle_unknown to true? (The question is mostly
>> rhetorical, but there's a disconnect between do_nmi() checking
>> "cpu == 0" and the control thread running on
>> cpumask_first(&cpu_online_map), i.e. you introduce a well hidden
>> dependency on CPU 0 never going offline. IOW my request is to at
>> least make this less well hidden, such that it can be noticed if
>> and when someone endeavors to remove said limitation.)
> 
> It seems the issue is that we cannot send an IPI NMI to the BSP, otherwise
> unknown_nmi_error() would be triggered. But loading ucode after
> rendezvousing all CPUs in the NMI handler expects every CPU to receive the
> IPI NMI, so this approach will always have that issue.

Not really, I don't think: If both sides agreed (explicitly!) on which
CPU leads this effort, then it would be clear that the one CPU
handling NMIs coming from the platform should not be sent an NMI, and
hence it should be this one to lead the effort. FAOD - my remark really
was because of the new hidden(!) dependency you introduce on CPU 0
always being this "special" CPU. I don't expect you to change the code,
but I'd like you to make the currently hidden dependency explicit.
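Even just something like the following in the control thread would do (a
sketch):

    /*
     * do_nmi() treats CPU 0 as the one fielding platform NMIs, so the
     * control thread must run on CPU 0 for the rendezvous to be safe.
     * Make that assumption explicit so it trips if CPU 0 ever goes offline.
     */
    ASSERT(cpumask_first(&cpu_online_map) == 0);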

> Considering that self_nmi() is already called elsewhere, could we provide a
> way to temporarily suppress or (forcibly) ignore the unknown NMI error?

I'm afraid any attempt at doing so will leave room for missing an
actual (platform) NMI.

Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-08-30  6:35           ` Chao Gao
@ 2019-09-09  5:52             ` Chao Gao
  2019-09-09  6:16               ` Jan Beulich
  0 siblings, 1 reply; 57+ messages in thread
From: Chao Gao @ 2019-09-09  5:52 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On Fri, Aug 30, 2019 at 02:35:06PM +0800, Chao Gao wrote:
>On Thu, Aug 29, 2019 at 02:11:10PM +0200, Jan Beulich wrote:
>>On 27.08.2019 06:52, Chao Gao wrote:
>>> On Mon, Aug 26, 2019 at 04:07:59PM +0800, Chao Gao wrote:
>>>> On Fri, Aug 23, 2019 at 09:46:37AM +0100, Sergey Dyasli wrote:
>>>>> On 19/08/2019 02:25, Chao Gao wrote:
>>>>>> register an nmi callback. And this callback does busy-loop on threads
>>>>>> which are waiting for loading completion. Control threads send NMI to
>>>>>> slave threads to prevent NMI acceptance during ucode loading.
>>>>>>
>>>>>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>>>>>> ---
>>>>>> Changes in v9:
>>>>>>  - control threads send NMI to all other threads. Slave threads will
>>>>>>  stay in the NMI handling to prevent NMI acceptance during ucode
>>>>>>  loading. Note that self-nmi is invalid according to SDM.
>>>>>
>>>>> To me this looks like a half-measure: why keep only slave threads in
>>>>> the NMI handler, when master threads can update the microcode from
>>>>> inside the NMI handler as well?
>>>>
>>>> No special reason. Because the issue we want to address is that slave
>>>> threads might go to handle NMI and access MSRs when master thread is
>>>> loading ucode. So we only keep slave threads in the NMI handler.
>>>>
>>>>>
>>>>> You mention that self-nmi is invalid, but Xen has self_nmi() which is
>>>>> used for apply_alternatives() during boot, so can be trusted to work.
>>>>
>>>> Sorry, I meant using self shorthand to send self-nmi. I tried to use
>>>> self shorthand but got APIC error. And I agree that it is better to
>>>> make slave thread call self_nmi() itself.
>>>>
>>>>>
>>>>> I experimented a bit with the following approach: after loading_state
>>>>> becomes LOADING_CALLIN, each cpu issues a self_nmi() and rendezvous
>>>>> via cpu_callin_map into LOADING_ENTER to do a ucode update directly in
>>>>> the NMI handler. And it seems to work.
>>>>>
>>>>> Separate question is about the safety of this approach: can we be sure
>>>>> that a ucode update would not reset the status of the NMI latch? I.e.
>>>>> can it cause another NMI to be delivered while Xen already handles one?
>>>>
>>>> Ashok, what's your opinion on Sergey's approach and his concern?
>>> 
>>> I talked with Ashok. We think your approach is better. I will follow
>>> your approach in v10. It would be much helpful if you post your patch
>>> so that I can just rebase it onto other patches.
>>
>>Doing the actual ucode update inside an NMI handler seems rather risky
>>to me. Even if Ashok confirmed it would not be an issue on past and
>>current Intel CPUs - what about future ones, or ones from other vendors?
>

The Intel SDM doesn't say that loading ucode inside an NMI handler is
disallowed, so it is implicitly allowed. If future CPUs cannot load ucode
in an NMI handler, the SDM should document that, and at that point we can
move ucode loading out of the NMI handler for those new CPUs. As to AMD,
if someone objects to this approach, let's use it for Intel only.
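Gating it on the vendor would be a one-liner anyway (sketch; the variable
name is made up):

    bool ucode_in_nmi = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;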

Thanks
Chao

^ permalink raw reply	[flat|nested] 57+ messages in thread

* Re: [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode
  2019-09-09  5:52             ` Chao Gao
@ 2019-09-09  6:16               ` Jan Beulich
  0 siblings, 0 replies; 57+ messages in thread
From: Jan Beulich @ 2019-09-09  6:16 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 09.09.2019 07:52, Chao Gao wrote:
> On Fri, Aug 30, 2019 at 02:35:06PM +0800, Chao Gao wrote:
>>> Doing the actual ucode update inside an NMI handler seems rather risky
>>> to me. Even if Ashok confirmed it would not be an issue on past and
>>> current Intel CPUs - what about future ones, or ones from other vendors?
>>
> 
> The Intel SDM doesn't say that loading ucode inside an NMI handler is
> disallowed, so it is implicitly allowed.

Well, if the SDM was complete and correct everywhere else, I'd agree
to such an interpretation / implication.

> If future CPUs cannot load ucode
> in an NMI handler, the SDM should document that, and at that point we can
> move ucode loading out of the NMI handler for those new CPUs. As to AMD,
> if someone objects to this approach, let's use it for Intel only.

Getting a definitive statement may turn out difficult. But I guess if
you support both approaches anyway, having a command line option to
override the default behavior wouldn't be much more than a 1 line
addition?
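For instance (a sketch, with a made-up option name):

    static bool __read_mostly opt_ucode_in_nmi = true;
    boolean_param("ucode-in-nmi", opt_ucode_in_nmi);

plus a check of opt_ucode_in_nmi to choose between the two code paths.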

Jan

^ permalink raw reply	[flat|nested] 57+ messages in thread

end of thread, other threads:[~2019-09-09  6:16 UTC | newest]

Thread overview: 57+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-19  1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
2019-08-28 15:12   ` Jan Beulich
2019-08-29  7:15     ` Chao Gao
2019-08-29  7:14       ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 02/15] microcode/amd: fix memory leak Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 03/15] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
2019-08-22 11:11   ` Roger Pau Monné
2019-08-28 15:21   ` Jan Beulich
2019-08-29 10:18   ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 05/15] microcode: clean up microcode_resume_cpu Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 06/15] microcode: remove struct ucode_cpu_info Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 07/15] microcode: remove pointless 'cpu' parameter Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
2019-08-22 13:08   ` Roger Pau Monné
2019-08-28 15:26   ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 09/15] microcode: pass a patch pointer to apply_microcode() Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-08-22 13:59   ` Roger Pau Monné
2019-08-29 10:06     ` Jan Beulich
2019-08-30  3:22       ` Chao Gao
2019-08-30  7:25         ` Jan Beulich
2019-08-29 10:19   ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
2019-08-22 14:10   ` Roger Pau Monné
2019-08-22 16:44     ` Chao Gao
2019-08-23  9:09       ` Roger Pau Monné
2019-08-29  7:37         ` Chao Gao
2019-08-29  8:16           ` Roger Pau Monné
2019-08-29 10:26           ` Jan Beulich
2019-08-29 10:29   ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
2019-08-23  8:11   ` Roger Pau Monné
2019-08-26  7:03     ` Chao Gao
2019-08-26  8:11       ` Roger Pau Monné
2019-08-29 10:47   ` Jan Beulich
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
2019-08-19 10:27   ` Sergey Dyasli
2019-08-19 14:49     ` Chao Gao
2019-08-29 12:06   ` Jan Beulich
2019-08-30  3:30     ` Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 14/15] microcode: remove microcode_update_lock Chao Gao
2019-08-19  1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
2019-08-23  8:46   ` Sergey Dyasli
2019-08-26  8:07     ` Chao Gao
2019-08-27  4:52       ` Chao Gao
2019-08-28  8:52         ` Sergey Dyasli
2019-08-29 12:11         ` Jan Beulich
2019-08-30  6:35           ` Chao Gao
2019-09-09  5:52             ` Chao Gao
2019-09-09  6:16               ` Jan Beulich
2019-08-29 12:22   ` Jan Beulich
2019-08-30  6:33     ` Chao Gao
2019-08-30  7:30       ` Jan Beulich
2019-08-22  7:51 ` [Xen-devel] [PATCH v9 00/15] improve late microcode loading Sergey Dyasli
2019-08-22 15:39   ` Chao Gao

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).