Xen-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [Xen-devel] [PATCH v10 00/16] improve late microcode loading
@ 2019-09-12  7:22 Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match() Chao Gao
                   ` (16 more replies)
  0 siblings, 17 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Boris Ostrovsky, Chao Gao, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

Major changes in version 10:
 - add back the patch to call wbinvd() conditionally
 - add a patch to disable late loading due to BDF90
 - rendezvous CPUs in NMI handler and load ucode. But provide an option
 to disable this behavior.
 - avoid the call of self_nmi() on the control thread because it may
 trigger the unknown_nmi_error() in do_nmi().
 - ensure ->start_update is called during system resuming from
 suspension

Sergey, could you help to test this series on an AMD machine?
Regarding changes to AMD side, I didn't do any test for them due to
lack of hardware. At least, two basic tests are needed:
* do a microcode update after system bootup
* don't bring all pCPUs up at bootup by specifying maxcpus option in xen
  command line and then do a microcode update and online all offlined
  CPUs via 'xen-hptool'.

The intention of this series is to make the late microcode loading
more reliable by rendezvousing all cpus in stop_machine context.
This idea comes from Ashok. I am porting his linux patch to Xen
(see patch 12 more details).

This series includes below changes:
 1. Patch 1-11: introduce a global microcode cache and some cleanup
 2. Patch 12: synchronize late microcode loading
 3. Patch 13: support parallel microcodes update on different cores
 4. Patch 14: block #NMI handling during microcode loading
 5. Patch 15: disable late ucode loading due to BDF90
 6. Patch 16: call wbinvd() conditionally

Currently, late microcode loading does a lot of things including
parsing microcode blob, checking the signature/revision and performing
update. Putting all of them into stop_machine context is a bad idea
because of complexity (one issue I observed is memory allocation
triggered one assertion in stop_machine context). To simplify the
load process, parsing microcode is moved out of the load process.
Remaining parts of load process is put to stop_machine context.

Previous change log:
Changes in version 9:
 - add Jan's Reviewed-by
 - rendevzous threads in NMI handler to disable NMI. Note that NMI can
 be served as usual on threads that are chosen to initiate ucode loading
 on each core.
 - avoid unnecessary memory allocation or copy when creating a microcode
 patch (patch 12)
 - rework patch 1 to avoid microcode_update_match() being used to
 compare two arbitrary updates.
 - call .end_update in early loading path.

Changes in version 8:
 - block #NMI handling during microcode loading (Patch 16)
 - Don't assume that all CPUs in the system have loaded a same ucode.
 So when parsing a blob, we attempt to save a patch as long as it matches
 with current cpu signature regardless of the revision of the patch.
 And also for loading, we only require the patch to be loaded isn't old
 than the cached one.
 - store an update after the first successful loading on a CPU
 - remove the patch that calls wbinvd() unconditionally before microcode
 loading. It is under internal discussion.
 - divide two big patches into several patches to improve readability.

Changes in version 7:
 - cache one microcode update rather than a list of it. Assuming that all CPUs
 (including those will be plugged in later) in the system have the same
 signature, one update matches with one CPU should match with others. Thus, one
 update is enough for microcode updating during CPU hot-plug and resuming.
 - To handle load failure, microcode update is cached after it is applied to
 avoid a broken update overriding a validated one. Unvalidated microcode updates
 are passed by arguments rather than another global variable, where this series
 slightly differs from Roger's suggestion in:
 https://lists.xen.org/archives/html/xen-devel/2019-03/msg00776.html
 - incorporate Sergey's patch (patch 10) to fix a bug: we maintain a variable
 to reflect current microcode revision. But in some cases, this variable isn't
 initialized during system boot time, which results in falsely reporting that
 processor is susceptible to some known vulnerabilities.
 - fix issues reported by Sergey:
 https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00901.html
 - Responses to Sergey/Roger/Wei/Ashok's other comments.

Major changes in version 6:
 - run wbinvd before updating microcode (patch 10)
 - add an userspace tool for late microcode update (patch 1)
 - scale time to wait by the number of remaining CPUs to respond 
 - remove 'cpu' parameters from some related callbacks and functins
 - save an ucode patch only if its supported CPU is allowed to mix with
   current cpu.

Changes in version 5:
 - support parallel microcode updates for all cores (see patch 8)
 - Address Roger's comments on the last version.

Chao Gao (16):
  microcode/intel: extend microcode_update_match()
  microcode/amd: distinguish old and mismatched ucode in
    microcode_fits()
  microcode: introduce a global cache of ucode patch
  microcode: clean up microcode_resume_cpu
  microcode: remove struct ucode_cpu_info
  microcode: remove pointless 'cpu' parameter
  microcode/amd: call svm_host_osvw_init() in common code
  microcode: pass a patch pointer to apply_microcode()
  microcode: split out apply_microcode() from cpu_request_microcode()
  microcode: unify ucode loading during system bootup and resuming
  microcode: reduce memory allocation and copy when creating a patch
  x86/microcode: Synchronize late microcode loading
  microcode: remove microcode_update_lock
  microcode: rendezvous CPUs in NMI handler and load ucode
  microcode: disable late loading if CPUs are affected by BDF90
  microcode/intel: writeback and invalidate cache conditionally

 docs/misc/xen-command-line.pandoc |  10 +
 xen/arch/x86/acpi/power.c         |   2 +-
 xen/arch/x86/apic.c               |   2 +-
 xen/arch/x86/microcode.c          | 569 ++++++++++++++++++++++++++++++--------
 xen/arch/x86/microcode_amd.c      | 282 +++++++++----------
 xen/arch/x86/microcode_intel.c    | 269 +++++++++++-------
 xen/arch/x86/smpboot.c            |   5 +-
 xen/arch/x86/spec_ctrl.c          |   2 +-
 xen/arch/x86/traps.c              |   6 +-
 xen/include/asm-x86/microcode.h   |  43 +--
 xen/include/asm-x86/nmi.h         |   3 +
 xen/include/asm-x86/processor.h   |   4 +-
 12 files changed, 788 insertions(+), 409 deletions(-)

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match()
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 10:24   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 02/16] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
                   ` (15 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

to a more generic function. So that it can be used alone to check
an update against the CPU signature and current update revision.

Note that enum microcode_match_result will be used in common code
(aka microcode.c), it has been placed in the common header. And
constifying the parameter of microcode_sanity_check() such that it
can be called by microcode_update_match().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v10:
 - Drop RBs
 - assert that microcode passed to microcode_update_match() would pass
 sanity check. Constify the parameter of microcode_sanity_check()

Changes in v9:
 - microcode_update_match() doesn't accept (sig, pf, rev) any longer.
 Hence, it won't be used to compare two arbitrary updates.
 - rewrite patch description

Changes in v8:
 - make sure enough room for an extended header and signature array

Changes in v6:
 - eliminate unnecessary type casting in microcode_update_match
 - check if a patch has an extend header

Changes in v5:
 - constify the extended_signature
 - use named enum type for the return value of microcode_update_match
---
 xen/arch/x86/microcode_intel.c  | 75 ++++++++++++++++++++++-------------------
 xen/include/asm-x86/microcode.h |  6 ++++
 2 files changed, 47 insertions(+), 34 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 22fdeca..1a3ffa5 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -134,21 +134,11 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
     return 0;
 }
 
-static inline int microcode_update_match(
-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
-    int sig, int pf)
+static int microcode_sanity_check(const void *mc)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
-
-    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
-            (mc_header->rev > uci->cpu_sig.rev));
-}
-
-static int microcode_sanity_check(void *mc)
-{
-    struct microcode_header_intel *mc_header = mc;
-    struct extended_sigtable *ext_header = NULL;
-    struct extended_signature *ext_sig;
+    const struct microcode_header_intel *mc_header = mc;
+    const struct extended_sigtable *ext_header = NULL;
+    const struct extended_signature *ext_sig;
     unsigned long total_size, data_size, ext_table_size;
     unsigned int ext_sigcount = 0, i;
     uint32_t sum, orig_sum;
@@ -234,6 +224,42 @@ static int microcode_sanity_check(void *mc)
     return 0;
 }
 
+/* Check an update against the CPU signature and current update revision */
+static enum microcode_match_result microcode_update_match(
+    const struct microcode_header_intel *mc_header, unsigned int cpu)
+{
+    const struct extended_sigtable *ext_header;
+    const struct extended_signature *ext_sig;
+    unsigned int i;
+    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    unsigned int sig = uci->cpu_sig.sig;
+    unsigned int pf = uci->cpu_sig.pf;
+    unsigned int rev = uci->cpu_sig.rev;
+    unsigned long data_size = get_datasize(mc_header);
+    const void *end = (const void *)mc_header + get_totalsize(mc_header);
+
+    ASSERT(!microcode_sanity_check(mc_header));
+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    ext_header = (const void *)(mc_header + 1) + data_size;
+    ext_sig = (const void *)(ext_header + 1);
+
+    /*
+     * Make sure there is enough space to hold an extended header and enough
+     * array elements.
+     */
+    if ( (end < (const void *)ext_sig) ||
+         (end < (const void *)(ext_sig + ext_header->count)) )
+        return MIS_UCODE;
+
+    for ( i = 0; i < ext_header->count; i++ )
+        if ( sigmatch(sig, ext_sig[i].sig, pf, ext_sig[i].pf) )
+            return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
+}
+
 /*
  * return 0 - no update found
  * return 1 - found update
@@ -243,31 +269,12 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
-    const struct extended_sigtable *ext_header;
     unsigned long total_size = get_totalsize(mc_header);
-    int ext_sigcount, i;
-    struct extended_signature *ext_sig;
     void *new_mc;
 
-    if ( microcode_update_match(cpu, mc_header,
-                                mc_header->sig, mc_header->pf) )
-        goto find;
-
-    if ( total_size <= (get_datasize(mc_header) + MC_HEADER_SIZE) )
+    if ( microcode_update_match(mc, cpu) != NEW_UCODE )
         return 0;
 
-    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
-    ext_sigcount = ext_header->count;
-    ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
-    for ( i = 0; i < ext_sigcount; i++ )
-    {
-        if ( microcode_update_match(cpu, mc_header,
-                                    ext_sig->sig, ext_sig->pf) )
-            goto find;
-        ext_sig++;
-    }
-    return 0;
- find:
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
              cpu, mc_header->rev, uci->cpu_sig.rev);
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 23ea954..882f560 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -3,6 +3,12 @@
 
 #include <xen/percpu.h>
 
+enum microcode_match_result {
+    OLD_UCODE, /* signature matched, but revision id is older or equal */
+    NEW_UCODE, /* signature matched, but revision id is newer */
+    MIS_UCODE, /* signature mismatched */
+};
+
 struct cpu_signature;
 struct ucode_cpu_info;
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 02/16] microcode/amd: distinguish old and mismatched ucode in microcode_fits()
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch Chao Gao
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Sometimes, an ucode with a level lower than or equal to current CPU's
patch level is useful. For example, to work around a broken bios which
only loads ucode for BSP, when BSP parses an ucode blob during bootup,
it is better to save an ucode with lower or equal level for APs

No functional change is made in this patch. But following patch would
handle "old ucode" and "mismatched ucode" separately.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v8:
 - new
---
 xen/arch/x86/microcode_amd.c | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 9b74330..7fa700b 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -152,8 +152,8 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
     return 0;
 }
 
-static bool_t microcode_fits(const struct microcode_amd *mc_amd,
-                             unsigned int cpu)
+static enum microcode_match_result microcode_fits(
+    const struct microcode_amd *mc_amd, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
@@ -167,27 +167,27 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( !find_equiv_cpu_id(equiv_cpu_table, current_cpu_id, &equiv_cpu_id) )
-        return 0;
+        return MIS_UCODE;
 
     if ( (mc_header->processor_rev_id) != equiv_cpu_id )
-        return 0;
+        return MIS_UCODE;
 
     if ( !verify_patch_size(mc_amd->mpb_size) )
     {
         pr_debug("microcode: patch size mismatch\n");
-        return 0;
+        return MIS_UCODE;
     }
 
     if ( mc_header->patch_id <= uci->cpu_sig.rev )
     {
         pr_debug("microcode: patch is already at required level or greater.\n");
-        return 0;
+        return OLD_UCODE;
     }
 
     pr_debug("microcode: CPU%d found a matching microcode update with version %#x (current=%#x)\n",
              cpu, mc_header->patch_id, uci->cpu_sig.rev);
 
-    return 1;
+    return NEW_UCODE;
 }
 
 static int apply_microcode(unsigned int cpu)
@@ -496,7 +496,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) )
+        if ( microcode_fits(mc_amd, cpu) == NEW_UCODE )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -579,7 +579,7 @@ static int microcode_resume_match(unsigned int cpu, const void *mc)
     struct microcode_amd *mc_amd = uci->mc.mc_amd;
     const struct microcode_amd *src = mc;
 
-    if ( !microcode_fits(src, cpu) )
+    if ( microcode_fits(src, cpu) != NEW_UCODE )
         return 0;
 
     if ( src != mc_amd )
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match() Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 02/16] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 10:29   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 04/16] microcode: clean up microcode_resume_cpu Chao Gao
                   ` (13 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

to replace the current per-cpu cache 'uci->mc'.

With the assumption that all CPUs in the system have the same signature
(family, model, stepping and 'pf'), one microcode update matches with
one cpu should match with others. Having differing microcode revisions
on cpus would cause system unstable and should be avoided. Hence, caching
one microcode update is good enough for all cases.

Introduce a global variable, microcode_cache, to store the newest
matching microcode update. Whenever we get a new valid microcode update,
its revision id is compared against that of the microcode update to
determine whether the "microcode_cache" needs to be replaced. And
this global cache is loaded to cpu in apply_microcode().

All operations on the cache is protected by 'microcode_mutex'.

Note that I deliberately avoid touching the old per-cpu cache ('uci->mc')
as I am going to remove it completely in the following patches. We copy
everything to create the new cache blob to avoid reusing some buffers
previously allocated for the old per-cpu cache. It is not so efficient,
but it is already corrected by a patch later in this series.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes in v10:
 - assert mismatched ucode won't be passed to ->compare_patch.
 - return -ENOENT if patch is NULL in .apply_microcode().
 - check against NULL pointer dereference in free_patch() on AMD side
 - cosmetic changes suggested by Roger and Jan.

Changes in v9:
 - on Intel side, ->compare_patch just checks the patch revision number.
 - explain why all buffers are copied in alloc_microcode_patch() in
 patch description.

Changes in v8:
 - Free generic wrapper struct in general code
 - Try to update cache as long as a patch covers current cpu. Previsouly,
 cache is updated only if the patch is newer than current update revision in
 the CPU. The small difference can work around a broken bios which only
 applies microcode update to BSP and software has to apply the same
 update to other CPUs.

Changes in v7:
 - reworked to cache only one microcode patch rather than a list of
 microcode patches.
---
 xen/arch/x86/microcode.c        | 38 ++++++++++++++++
 xen/arch/x86/microcode_amd.c    | 98 ++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/microcode_intel.c  | 81 +++++++++++++++++++++++++++-------
 xen/include/asm-x86/microcode.h | 16 +++++++
 4 files changed, 211 insertions(+), 22 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 421d57e..e218a9d 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -61,6 +61,9 @@ static struct ucode_mod_blob __initdata ucode_blob;
  */
 static bool_t __initdata ucode_scan;
 
+/* Protected by microcode_mutex */
+static struct microcode_patch *microcode_cache;
+
 void __init microcode_set_module(unsigned int idx)
 {
     ucode_mod_idx = idx;
@@ -262,6 +265,41 @@ int microcode_resume_cpu(unsigned int cpu)
     return err;
 }
 
+void microcode_free_patch(struct microcode_patch *microcode_patch)
+{
+    microcode_ops->free_patch(microcode_patch->mc);
+    xfree(microcode_patch);
+}
+
+const struct microcode_patch *microcode_get_cache(void)
+{
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    return microcode_cache;
+}
+
+/* Return true if cache gets updated. Otherwise, return false */
+bool microcode_update_cache(struct microcode_patch *patch)
+{
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    if ( !microcode_cache )
+        microcode_cache = patch;
+    else if ( microcode_ops->compare_patch(patch,
+                                           microcode_cache) == NEW_UCODE )
+    {
+        microcode_free_patch(microcode_cache);
+        microcode_cache = patch;
+    }
+    else
+    {
+        microcode_free_patch(patch);
+        return false;
+    }
+
+    return true;
+}
+
 static int microcode_update_cpu(const void *buf, size_t size)
 {
     int err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 7fa700b..2dca1df 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -190,25 +190,92 @@ static enum microcode_match_result microcode_fits(
     return NEW_UCODE;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    if ( !patch )
+        return false;
+    return microcode_fits(patch->mc_amd, smp_processor_id()) == NEW_UCODE;
+}
+
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_amd *mc_amd)
+{
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+    struct microcode_amd *cache = xmalloc(struct microcode_amd);
+    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
+    struct equiv_cpu_entry *equiv_cpu_table =
+                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
+
+    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
+    {
+        xfree(microcode_patch);
+        xfree(cache);
+        xfree(mpb);
+        xfree(equiv_cpu_table);
+        return ERR_PTR(-ENOMEM);
+    }
+
+    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
+    cache->mpb = mpb;
+    cache->mpb_size = mc_amd->mpb_size;
+    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
+           mc_amd->equiv_cpu_table_size);
+    cache->equiv_cpu_table = equiv_cpu_table;
+    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
+    microcode_patch->mc_amd = cache;
+
+    return microcode_patch;
+}
+
+static void free_patch(void *mc)
+{
+    struct microcode_amd *mc_amd = mc;
+
+    if ( mc_amd )
+    {
+        xfree(mc_amd->equiv_cpu_table);
+        xfree(mc_amd->mpb);
+        xfree(mc_amd);
+    }
+}
+
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    const struct microcode_header_amd *new_header = new->mc_amd->mpb;
+    const struct microcode_header_amd *old_header = old->mc_amd->mpb;
+
+    /* Both patches to compare are supposed to be applicable to local CPU. */
+    ASSERT(microcode_fits(new->mc_amd, smp_processor_id()) != MIS_UCODE);
+    ASSERT(microcode_fits(new->mc_amd, smp_processor_id()) != MIS_UCODE);
+
+    if ( new_header->processor_rev_id == old_header->processor_rev_id )
+        return (new_header->patch_id > old_header->patch_id) ?
+                NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
+}
+
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    struct microcode_header_amd *hdr;
     int hw_err;
+    const struct microcode_header_amd *hdr;
+    const struct microcode_patch *patch = microcode_get_cache();
 
     /* We should bind the task to the CPU */
     BUG_ON(raw_smp_processor_id() != cpu);
 
-    if ( mc_amd == NULL )
-        return -EINVAL;
+    if ( !patch )
+        return -ENOENT;
 
-    hdr = mc_amd->mpb;
-    if ( hdr == NULL )
+    if ( !match_cpu(patch) )
         return -EINVAL;
 
+    hdr = patch->mc_amd->mpb;
+
     spin_lock_irqsave(&microcode_update_lock, flags);
 
     hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
@@ -496,7 +563,21 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) == NEW_UCODE )
+        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
+
+        if ( IS_ERR(new_patch) )
+        {
+            error = PTR_ERR(new_patch);
+            break;
+        }
+
+        /* Update cache if this patch covers current CPU */
+        if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
+            microcode_update_cache(new_patch);
+        else
+            microcode_free_patch(new_patch);
+
+        if ( match_cpu(microcode_get_cache()) )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -643,6 +724,9 @@ static const struct microcode_ops microcode_amd_ops = {
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
     .start_update                     = start_update,
+    .free_patch                       = free_patch,
+    .compare_patch                    = compare_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_amd(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 1a3ffa5..eefc2d2 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -260,6 +260,36 @@ static enum microcode_match_result microcode_update_match(
     return MIS_UCODE;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    if ( !patch )
+        return false;
+
+    return microcode_update_match(&patch->mc_intel->hdr,
+                                  smp_processor_id()) == NEW_UCODE;
+}
+
+static void free_patch(void *mc)
+{
+    xfree(mc);
+}
+
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    /*
+     * Both patches to compare are supposed to be applicable to local CPU.
+     * Just compare the revision number.
+     */
+    ASSERT(microcode_update_match(&old->mc_intel->hdr, smp_processor_id()) !=
+               MIS_UCODE);
+    ASSERT(microcode_update_match(&new->mc_intel->hdr, smp_processor_id()) !=
+               MIS_UCODE);
+
+    return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ? NEW_UCODE
+                                                             : OLD_UCODE;
+}
+
 /*
  * return 0 - no update found
  * return 1 - found update
@@ -270,10 +300,26 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc;
+    void *new_mc = xmalloc_bytes(total_size);
+    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
 
-    if ( microcode_update_match(mc, cpu) != NEW_UCODE )
+    if ( !new_patch || !new_mc )
+    {
+        xfree(new_patch);
+        xfree(new_mc);
+        return -ENOMEM;
+    }
+    memcpy(new_mc, mc, total_size);
+    new_patch->mc_intel = new_mc;
+
+    /* Make sure that this patch covers current CPU */
+    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
+    {
+        microcode_free_patch(new_patch);
         return 0;
+    }
+
+    microcode_update_cache(new_patch);
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
@@ -298,18 +344,25 @@ static int apply_microcode(unsigned int cpu)
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    const struct microcode_intel *mc_intel;
+    const struct microcode_patch *patch = microcode_get_cache();
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu_num != cpu);
 
-    if ( uci->mc.mc_intel == NULL )
+    if ( !patch )
+        return -ENOENT;
+
+    if ( !match_cpu(patch) )
         return -EINVAL;
 
+    mc_intel = patch->mc_intel;
+
     /* serialize access to the physical write to MSR 0x79 */
     spin_lock_irqsave(&microcode_update_lock, flags);
 
     /* write microcode via MSR 0x79 */
-    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
+    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
 
     /* As documented in the SDM: Do a CPUID 1 here */
@@ -320,19 +373,17 @@ static int apply_microcode(unsigned int cpu)
     val[1] = (uint32_t)(msr_content >> 32);
 
     spin_unlock_irqrestore(&microcode_update_lock, flags);
-    if ( val[1] != uci->mc.mc_intel->hdr.rev )
+    if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
+               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
-           cpu_num, uci->cpu_sig.rev, val[1],
-           uci->mc.mc_intel->hdr.year,
-           uci->mc.mc_intel->hdr.month,
-           uci->mc.mc_intel->hdr.day);
+           cpu_num, uci->cpu_sig.rev, val[1], mc_intel->hdr.year,
+           mc_intel->hdr.month, mc_intel->hdr.day);
     uci->cpu_sig.rev = val[1];
 
     return 0;
@@ -372,7 +423,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     long offset = 0;
     int error = 0;
     void *mc;
-    unsigned int matching_count = 0;
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -390,10 +440,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
          * lets keep searching till the latest version
          */
         if ( error == 1 )
-        {
-            matching_count++;
             error = 0;
-        }
+
         xfree(mc);
     }
     if ( offset > 0 )
@@ -401,7 +449,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && matching_count )
+    if ( !error && match_cpu(microcode_get_cache()) )
         error = apply_microcode(cpu);
 
     return error;
@@ -417,6 +465,9 @@ static const struct microcode_ops microcode_intel_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+    .free_patch                       = free_patch,
+    .compare_patch                    = compare_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 882f560..4d45401 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -12,6 +12,14 @@ enum microcode_match_result {
 struct cpu_signature;
 struct ucode_cpu_info;
 
+struct microcode_patch {
+    union {
+        struct microcode_intel *mc_intel;
+        struct microcode_amd *mc_amd;
+        void *mc;
+    };
+};
+
 struct microcode_ops {
     int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
@@ -19,6 +27,10 @@ struct microcode_ops {
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
     int (*apply_microcode)(unsigned int cpu);
     int (*start_update)(void);
+    void (*free_patch)(void *mc);
+    bool (*match_cpu)(const struct microcode_patch *patch);
+    enum microcode_match_result (*compare_patch)(
+        const struct microcode_patch *new, const struct microcode_patch *old);
 };
 
 struct cpu_signature {
@@ -39,4 +51,8 @@ struct ucode_cpu_info {
 DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 extern const struct microcode_ops *microcode_ops;
 
+const struct microcode_patch *microcode_get_cache(void);
+bool microcode_update_cache(struct microcode_patch *patch);
+void microcode_free_patch(struct microcode_patch *patch);
+
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 04/16] microcode: clean up microcode_resume_cpu
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (2 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 05/16] microcode: remove struct ucode_cpu_info Chao Gao
                   ` (12 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Previously, a per-cpu ucode cache is maintained. Then each CPU had one
per-cpu update cache and there might be multiple versions of microcode.
Thus microcode_resume_cpu tried best to update microcode by loading
every update cache until a successful load.

But now the cache struct is simplified a lot and only a single ucode is
cached. a single invocation of ->apply_microcode() would load the cache
and make microcode updated.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
changes in v8:
 - new
 - separated from the following patch
---
 xen/arch/x86/microcode.c        | 40 ++---------------------------------
 xen/arch/x86/microcode_amd.c    | 47 -----------------------------------------
 xen/arch/x86/microcode_intel.c  |  6 ------
 xen/include/asm-x86/microcode.h |  1 -
 4 files changed, 2 insertions(+), 92 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index e218a9d..922b94f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -215,8 +215,6 @@ int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct cpu_signature nsig;
-    unsigned int cpu2;
 
     if ( !microcode_ops )
         return 0;
@@ -224,42 +222,8 @@ int microcode_resume_cpu(unsigned int cpu)
     spin_lock(&microcode_mutex);
 
     err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
-    if ( err )
-    {
-        __microcode_fini_cpu(cpu);
-        spin_unlock(&microcode_mutex);
-        return err;
-    }
-
-    if ( uci->mc.mc_valid )
-    {
-        err = microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid);
-        if ( err >= 0 )
-        {
-            if ( err )
-                err = microcode_ops->apply_microcode(cpu);
-            spin_unlock(&microcode_mutex);
-            return err;
-        }
-    }
-
-    nsig = uci->cpu_sig;
-    __microcode_fini_cpu(cpu);
-    uci->cpu_sig = nsig;
-
-    err = -EIO;
-    for_each_online_cpu ( cpu2 )
-    {
-        uci = &per_cpu(ucode_cpu_info, cpu2);
-        if ( uci->mc.mc_valid &&
-             microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid) > 0 )
-        {
-            err = microcode_ops->apply_microcode(cpu);
-            break;
-        }
-    }
-
-    __microcode_fini_cpu(cpu);
+    if ( likely(!err) )
+        err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
 
     return err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 2dca1df..04b00aa 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -654,52 +654,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    const struct microcode_amd *src = mc;
-
-    if ( microcode_fits(src, cpu) != NEW_UCODE )
-        return 0;
-
-    if ( src != mc_amd )
-    {
-        if ( mc_amd )
-        {
-            xfree(mc_amd->equiv_cpu_table);
-            xfree(mc_amd->mpb);
-            xfree(mc_amd);
-        }
-
-        mc_amd = xmalloc(struct microcode_amd);
-        uci->mc.mc_amd = mc_amd;
-        if ( !mc_amd )
-            return -ENOMEM;
-        mc_amd->equiv_cpu_table = xmalloc_bytes(src->equiv_cpu_table_size);
-        if ( !mc_amd->equiv_cpu_table )
-            goto err1;
-        mc_amd->mpb = xmalloc_bytes(src->mpb_size);
-        if ( !mc_amd->mpb )
-            goto err2;
-
-        mc_amd->equiv_cpu_table_size = src->equiv_cpu_table_size;
-        mc_amd->mpb_size = src->mpb_size;
-        memcpy(mc_amd->mpb, src->mpb, src->mpb_size);
-        memcpy(mc_amd->equiv_cpu_table, src->equiv_cpu_table,
-               src->equiv_cpu_table_size);
-    }
-
-    return 1;
-
-err2:
-    xfree(mc_amd->equiv_cpu_table);
-err1:
-    xfree(mc_amd);
-    uci->mc.mc_amd = NULL;
-    return -ENOMEM;
-}
-
 static int start_update(void)
 {
 #if CONFIG_HVM
@@ -719,7 +673,6 @@ static int start_update(void)
 }
 
 static const struct microcode_ops microcode_amd_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index eefc2d2..97f759e 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -455,13 +455,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    return get_matching_microcode(mc, cpu);
-}
-
 static const struct microcode_ops microcode_intel_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 4d45401..da0b156 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -21,7 +21,6 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
                                  size_t size);
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 05/16] microcode: remove struct ucode_cpu_info
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (3 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 04/16] microcode: clean up microcode_resume_cpu Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 06/16] microcode: remove pointless 'cpu' parameter Chao Gao
                   ` (11 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Remove the per-cpu cache field in struct ucode_cpu_info since it has
been replaced by a global cache. It would leads to only one field
remaining in ucode_cpu_info. Then, this struct is removed and the
remaining field (cpu signature) is stored in per-cpu area.

The cpu status notifier is also removed. It was used to free the "mc"
field to avoid memory leak.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - rebase and fix conflict

Changes in v8:
 - split microcode_resume_cpu() cleanup to a separate patch.

Changes in v6:
 - remove the whole struct ucode_cpu_info instead of the per-cpu cache
  in it.
---
 xen/arch/x86/apic.c             |  2 +-
 xen/arch/x86/microcode.c        | 57 +++++++--------------------------------
 xen/arch/x86/microcode_amd.c    | 59 +++++++++--------------------------------
 xen/arch/x86/microcode_intel.c  | 28 +++++++------------
 xen/arch/x86/spec_ctrl.c        |  2 +-
 xen/include/asm-x86/microcode.h | 12 +--------
 6 files changed, 34 insertions(+), 126 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index ea0d561..6cdb50c 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1190,7 +1190,7 @@ static void __init check_deadline_errata(void)
     else
         rev = (unsigned long)m->driver_data;
 
-    if ( this_cpu(ucode_cpu_info).cpu_sig.rev >= rev )
+    if ( this_cpu(cpu_sig).rev >= rev )
         return;
 
     setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE);
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 922b94f..d17dbec 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -187,7 +187,7 @@ const struct microcode_ops *microcode_ops;
 
 static DEFINE_SPINLOCK(microcode_mutex);
 
-DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 struct microcode_info {
     unsigned int cpu;
@@ -196,32 +196,17 @@ struct microcode_info {
     char buffer[1];
 };
 
-static void __microcode_fini_cpu(unsigned int cpu)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-
-    xfree(uci->mc.mc_valid);
-    memset(uci, 0, sizeof(*uci));
-}
-
-static void microcode_fini_cpu(unsigned int cpu)
-{
-    spin_lock(&microcode_mutex);
-    __microcode_fini_cpu(cpu);
-    spin_unlock(&microcode_mutex);
-}
-
 int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
     if ( likely(!err) )
         err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
@@ -268,16 +253,13 @@ static int microcode_update_cpu(const void *buf, size_t size)
 {
     int err;
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
     if ( likely(!err) )
         err = microcode_ops->cpu_request_microcode(cpu, buf, size);
-    else
-        __microcode_fini_cpu(cpu);
-
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -364,29 +346,10 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-static int microcode_percpu_callback(
-    struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
-    unsigned int cpu = (unsigned long)hcpu;
-
-    switch ( action )
-    {
-    case CPU_DEAD:
-        microcode_fini_cpu(cpu);
-        break;
-    }
-
-    return NOTIFY_DONE;
-}
-
-static struct notifier_block microcode_percpu_nfb = {
-    .notifier_call = microcode_percpu_callback,
-};
-
 int __init early_microcode_update_cpu(bool start_update)
 {
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc = 0;
     void *data = NULL;
     size_t len;
@@ -405,7 +368,7 @@ int __init early_microcode_update_cpu(bool start_update)
         data = bootstrap_map(&ucode_mod);
     }
 
-    microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    microcode_ops->collect_cpu_info(cpu, sig);
 
     if ( data )
     {
@@ -424,7 +387,7 @@ int __init early_microcode_update_cpu(bool start_update)
 int __init early_microcode_init(void)
 {
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc;
 
     rc = microcode_init_intel();
@@ -437,12 +400,10 @@ int __init early_microcode_init(void)
 
     if ( microcode_ops )
     {
-        microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+        microcode_ops->collect_cpu_info(cpu, sig);
 
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
-
-        register_cpu_notifier(&microcode_percpu_nfb);
     }
 
     return rc;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 04b00aa..69c9cfe 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -155,7 +155,7 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
 static enum microcode_match_result microcode_fits(
     const struct microcode_amd *mc_amd, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
@@ -178,14 +178,14 @@ static enum microcode_match_result microcode_fits(
         return MIS_UCODE;
     }
 
-    if ( mc_header->patch_id <= uci->cpu_sig.rev )
+    if ( mc_header->patch_id <= sig->rev )
     {
         pr_debug("microcode: patch is already at required level or greater.\n");
         return OLD_UCODE;
     }
 
     pr_debug("microcode: CPU%d found a matching microcode update with version %#x (current=%#x)\n",
-             cpu, mc_header->patch_id, uci->cpu_sig.rev);
+             cpu, mc_header->patch_id, sig->rev);
 
     return NEW_UCODE;
 }
@@ -259,9 +259,9 @@ static enum microcode_match_result compare_patch(
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
     int hw_err;
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch = microcode_get_cache();
 
@@ -300,9 +300,9 @@ static int apply_microcode(unsigned int cpu)
     }
 
     printk(KERN_WARNING "microcode: CPU%d updated from revision %#x to %#x\n",
-           cpu, uci->cpu_sig.rev, hdr->patch_id);
+           cpu, sig->rev, hdr->patch_id);
 
-    uci->cpu_sig.rev = rev;
+    sig->rev = rev;
 
     return 0;
 }
@@ -448,14 +448,14 @@ static bool_t check_final_patch_levels(unsigned int cpu)
      * any of the 'final_levels', then we should not update the microcode
      * patch on the cpu as system will hang otherwise.
      */
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     unsigned int i;
 
     if ( boot_cpu_data.x86 != 0x10 )
         return 0;
 
     for ( i = 0; i < ARRAY_SIZE(final_levels); i++ )
-        if ( uci->cpu_sig.rev == final_levels[i] )
+        if ( sig->rev == final_levels[i] )
             return 1;
 
     return 0;
@@ -464,13 +464,12 @@ static bool_t check_final_patch_levels(unsigned int cpu)
 static int cpu_request_microcode(unsigned int cpu, const void *buf,
                                  size_t bufsize)
 {
-    struct microcode_amd *mc_amd, *mc_old;
+    struct microcode_amd *mc_amd;
     size_t offset = 0;
-    size_t last_offset, applied_offset = 0;
-    int error = 0, save_error = 1;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -539,7 +538,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         {
             printk(KERN_ERR "microcode: CPU%d incorrect or corrupt container file\n"
                    "microcode: Failed to update patch level. "
-                   "Current lvl:%#x\n", cpu, uci->cpu_sig.rev);
+                   "Current lvl:%#x\n", cpu, sig->rev);
             break;
         }
     }
@@ -551,15 +550,10 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    mc_old = uci->mc.mc_amd;
-    /* implicitely validates uci->mc.mc_valid */
-    uci->mc.mc_amd = mc_amd;
-
     /*
      * It's possible the data file has multiple matching ucode,
      * lets keep searching till the latest version
      */
-    last_offset = offset;
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
@@ -582,11 +576,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             error = apply_microcode(cpu);
             if ( error )
                 break;
-            applied_offset = last_offset;
         }
 
-        last_offset = offset;
-
         if ( offset >= bufsize )
             break;
 
@@ -614,31 +605,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
              *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
             break;
     }
-
-    /* On success keep the microcode patch for
-     * re-apply on resume.
-     */
-    if ( applied_offset )
-    {
-        save_error = get_ucode_from_buffer_amd(
-            mc_amd, buf, bufsize, &applied_offset);
-
-        if ( save_error )
-            error = save_error;
-    }
-
-    if ( save_error )
-    {
-        uci->mc.mc_amd = mc_old;
-        mc_old = mc_amd;
-    }
-
-    if ( mc_old )
-    {
-        xfree(mc_old->mpb);
-        xfree(mc_old->equiv_cpu_table);
-        xfree(mc_old);
-    }
+    free_patch(mc_amd);
 
   out:
 #if CONFIG_HVM
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 97f759e..f63e4bd 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -231,10 +231,10 @@ static enum microcode_match_result microcode_update_match(
     const struct extended_sigtable *ext_header;
     const struct extended_signature *ext_sig;
     unsigned int i;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    unsigned int sig = uci->cpu_sig.sig;
-    unsigned int pf = uci->cpu_sig.pf;
-    unsigned int rev = uci->cpu_sig.rev;
+    struct cpu_signature *cpu_sig = &per_cpu(cpu_sig, cpu);
+    unsigned int sig = cpu_sig->sig;
+    unsigned int pf = cpu_sig->pf;
+    unsigned int rev = cpu_sig->rev;
     unsigned long data_size = get_datasize(mc_header);
     const void *end = (const void *)mc_header + get_totalsize(mc_header);
 
@@ -297,7 +297,6 @@ static enum microcode_match_result compare_patch(
  */
 static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
@@ -323,17 +322,8 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, uci->cpu_sig.rev);
-    new_mc = xmalloc_bytes(total_size);
-    if ( new_mc == NULL )
-    {
-        printk(KERN_ERR "microcode: error! Can not allocate memory\n");
-        return -ENOMEM;
-    }
+             cpu, mc_header->rev, per_cpu(cpu_sig, cpu).rev);
 
-    memcpy(new_mc, mc, total_size);
-    xfree(uci->mc.mc_intel);
-    uci->mc.mc_intel = new_mc;
     return 1;
 }
 
@@ -343,7 +333,7 @@ static int apply_microcode(unsigned int cpu)
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch = microcode_get_cache();
 
@@ -377,14 +367,14 @@ static int apply_microcode(unsigned int cpu)
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
+               sig->rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
-           cpu_num, uci->cpu_sig.rev, val[1], mc_intel->hdr.year,
+           cpu_num, sig->rev, val[1], mc_intel->hdr.year,
            mc_intel->hdr.month, mc_intel->hdr.day);
-    uci->cpu_sig.rev = val[1];
+    sig->rev = val[1];
 
     return 0;
 }
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index 468a847..4761be8 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -438,7 +438,7 @@ static bool __init check_smt_enabled(void)
 /* Calculate whether Retpoline is known-safe on this CPU. */
 static bool __init retpoline_safe(uint64_t caps)
 {
-    unsigned int ucode_rev = this_cpu(ucode_cpu_info).cpu_sig.rev;
+    unsigned int ucode_rev = this_cpu(cpu_sig).rev;
 
     if ( boot_cpu_data.x86_vendor & (X86_VENDOR_AMD | X86_VENDOR_HYGON) )
         return true;
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index da0b156..3f4c4be 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -10,7 +10,6 @@ enum microcode_match_result {
 };
 
 struct cpu_signature;
-struct ucode_cpu_info;
 
 struct microcode_patch {
     union {
@@ -38,16 +37,7 @@ struct cpu_signature {
     unsigned int rev;
 };
 
-struct ucode_cpu_info {
-    struct cpu_signature cpu_sig;
-    union {
-        struct microcode_intel *mc_intel;
-        struct microcode_amd *mc_amd;
-        void *mc_valid;
-    } mc;
-};
-
-DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
 extern const struct microcode_ops *microcode_ops;
 
 const struct microcode_patch *microcode_get_cache(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 06/16] microcode: remove pointless 'cpu' parameter
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (4 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 05/16] microcode: remove struct ucode_cpu_info Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Some callbacks in microcode_ops or related functions take a cpu
id parameter. But at current call sites, the cpu id parameter is
always equal to current cpu id. Some of them even use an assertion
to guarantee this. Remove this redundent 'cpu' parameter.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v9:
 - use a convenience variable 'cpu' in collect_cpu_info() on AMD side
 - rebase and fix conflicts

Changes in v8:
 - Use current_cpu_data in collect_cpu_info()
 - keep the cpu parameter of check_final_patch_levels()
 - use smp_processor_id() in get_matching_microcode() rather than
 define a local variable and label it "__maybe_unused"
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 20 ++++++++------------
 xen/arch/x86/microcode_amd.c    | 34 +++++++++++++---------------------
 xen/arch/x86/microcode_intel.c  | 41 +++++++++++++++--------------------------
 xen/arch/x86/smpboot.c          |  2 +-
 xen/include/asm-x86/microcode.h |  7 +++----
 xen/include/asm-x86/processor.h |  2 +-
 7 files changed, 42 insertions(+), 66 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index e3954ee..269b140 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -278,7 +278,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu(0);
+    microcode_resume_cpu();
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index d17dbec..89a8d2b 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -196,19 +196,19 @@ struct microcode_info {
     char buffer[1];
 };
 
-int microcode_resume_cpu(unsigned int cpu)
+int microcode_resume_cpu(void)
 {
     int err;
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->apply_microcode(cpu);
+        err = microcode_ops->apply_microcode();
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -257,9 +257,9 @@ static int microcode_update_cpu(const void *buf, size_t size)
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
+        err = microcode_ops->cpu_request_microcode(buf, size);
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -348,8 +348,6 @@ __initcall(microcode_init);
 
 int __init early_microcode_update_cpu(bool start_update)
 {
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc = 0;
     void *data = NULL;
     size_t len;
@@ -368,7 +366,7 @@ int __init early_microcode_update_cpu(bool start_update)
         data = bootstrap_map(&ucode_mod);
     }
 
-    microcode_ops->collect_cpu_info(cpu, sig);
+    microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
     if ( data )
     {
@@ -386,8 +384,6 @@ int __init early_microcode_update_cpu(bool start_update)
 
 int __init early_microcode_init(void)
 {
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     int rc;
 
     rc = microcode_init_intel();
@@ -400,7 +396,7 @@ int __init early_microcode_init(void)
 
     if ( microcode_ops )
     {
-        microcode_ops->collect_cpu_info(cpu, sig);
+        microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 69c9cfe..1d27c71 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -78,8 +78,9 @@ struct mpbhdr {
 static DEFINE_SPINLOCK(microcode_update_lock);
 
 /* See comment in start_update() for cases when this routine fails */
-static int collect_cpu_info(unsigned int cpu, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu];
 
     memset(csig, 0, sizeof(*csig));
@@ -153,17 +154,15 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
 }
 
 static enum microcode_match_result microcode_fits(
-    const struct microcode_amd *mc_amd, unsigned int cpu)
+    const struct microcode_amd *mc_amd)
 {
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( !find_equiv_cpu_id(equiv_cpu_table, current_cpu_id, &equiv_cpu_id) )
@@ -192,9 +191,7 @@ static enum microcode_match_result microcode_fits(
 
 static bool match_cpu(const struct microcode_patch *patch)
 {
-    if ( !patch )
-        return false;
-    return microcode_fits(patch->mc_amd, smp_processor_id()) == NEW_UCODE;
+    return patch && (microcode_fits(patch->mc_amd) == NEW_UCODE);
 }
 
 static struct microcode_patch *alloc_microcode_patch(
@@ -246,8 +243,8 @@ static enum microcode_match_result compare_patch(
     const struct microcode_header_amd *old_header = old->mc_amd->mpb;
 
     /* Both patches to compare are supposed to be applicable to local CPU. */
-    ASSERT(microcode_fits(new->mc_amd, smp_processor_id()) != MIS_UCODE);
-    ASSERT(microcode_fits(new->mc_amd, smp_processor_id()) != MIS_UCODE);
+    ASSERT(microcode_fits(new->mc_amd) != MIS_UCODE);
+    ASSERT(microcode_fits(new->mc_amd) != MIS_UCODE);
 
     if ( new_header->processor_rev_id == old_header->processor_rev_id )
         return (new_header->patch_id > old_header->patch_id) ?
@@ -256,18 +253,16 @@ static enum microcode_match_result compare_patch(
     return MIS_UCODE;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint32_t rev;
     int hw_err;
+    unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch = microcode_get_cache();
 
-    /* We should bind the task to the CPU */
-    BUG_ON(raw_smp_processor_id() != cpu);
-
     if ( !patch )
         return -ENOENT;
 
@@ -461,19 +456,16 @@ static bool_t check_final_patch_levels(unsigned int cpu)
     return 0;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t bufsize)
+static int cpu_request_microcode(const void *buf, size_t bufsize)
 {
     struct microcode_amd *mc_amd;
     size_t offset = 0;
     int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( *(const uint32_t *)buf != UCODE_MAGIC )
@@ -566,14 +558,14 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         }
 
         /* Update cache if this patch covers current CPU */
-        if ( microcode_fits(new_patch->mc_amd, cpu) != MIS_UCODE )
+        if ( microcode_fits(new_patch->mc_amd) != MIS_UCODE )
             microcode_update_cache(new_patch);
         else
             microcode_free_patch(new_patch);
 
         if ( match_cpu(microcode_get_cache()) )
         {
-            error = apply_microcode(cpu);
+            error = apply_microcode();
             if ( error )
                 break;
         }
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index f63e4bd..5f1ae2f 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -96,13 +96,12 @@ struct extended_sigtable {
 /* serialize access to the physical write to MSR 0x79 */
 static DEFINE_SPINLOCK(microcode_update_lock);
 
-static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu_num = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu_num];
     uint64_t msr_content;
 
-    BUG_ON(cpu_num != smp_processor_id());
-
     memset(csig, 0, sizeof(*csig));
 
     if ( (c->x86_vendor != X86_VENDOR_INTEL) || (c->x86 < 6) )
@@ -226,12 +225,12 @@ static int microcode_sanity_check(const void *mc)
 
 /* Check an update against the CPU signature and current update revision */
 static enum microcode_match_result microcode_update_match(
-    const struct microcode_header_intel *mc_header, unsigned int cpu)
+    const struct microcode_header_intel *mc_header)
 {
     const struct extended_sigtable *ext_header;
     const struct extended_signature *ext_sig;
     unsigned int i;
-    struct cpu_signature *cpu_sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *cpu_sig = &this_cpu(cpu_sig);
     unsigned int sig = cpu_sig->sig;
     unsigned int pf = cpu_sig->pf;
     unsigned int rev = cpu_sig->rev;
@@ -265,8 +264,7 @@ static bool match_cpu(const struct microcode_patch *patch)
     if ( !patch )
         return false;
 
-    return microcode_update_match(&patch->mc_intel->hdr,
-                                  smp_processor_id()) == NEW_UCODE;
+    return microcode_update_match(&patch->mc_intel->hdr) == NEW_UCODE;
 }
 
 static void free_patch(void *mc)
@@ -281,10 +279,8 @@ static enum microcode_match_result compare_patch(
      * Both patches to compare are supposed to be applicable to local CPU.
      * Just compare the revision number.
      */
-    ASSERT(microcode_update_match(&old->mc_intel->hdr, smp_processor_id()) !=
-               MIS_UCODE);
-    ASSERT(microcode_update_match(&new->mc_intel->hdr, smp_processor_id()) !=
-               MIS_UCODE);
+    ASSERT(microcode_update_match(&old->mc_intel->hdr) != MIS_UCODE);
+    ASSERT(microcode_update_match(&new->mc_intel->hdr) != MIS_UCODE);
 
     return (new->mc_intel->hdr.rev > old->mc_intel->hdr.rev) ? NEW_UCODE
                                                              : OLD_UCODE;
@@ -295,7 +291,7 @@ static enum microcode_match_result compare_patch(
  * return 1 - found update
  * return < 0 - error
  */
-static int get_matching_microcode(const void *mc, unsigned int cpu)
+static int get_matching_microcode(const void *mc)
 {
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
@@ -312,7 +308,7 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     new_patch->mc_intel = new_mc;
 
     /* Make sure that this patch covers current CPU */
-    if ( microcode_update_match(mc, cpu) == MIS_UCODE )
+    if ( microcode_update_match(mc) == MIS_UCODE )
     {
         microcode_free_patch(new_patch);
         return 0;
@@ -322,24 +318,21 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, per_cpu(cpu_sig, cpu).rev);
+             smp_processor_id(), mc_header->rev, this_cpu(cpu_sig).rev);
 
     return 1;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch = microcode_get_cache();
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu_num != cpu);
-
     if ( !patch )
         return -ENOENT;
 
@@ -407,22 +400,18 @@ static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
     return offset + total_size;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t size)
+static int cpu_request_microcode(const void *buf, size_t size)
 {
     long offset = 0;
     int error = 0;
     void *mc;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
         error = microcode_sanity_check(mc);
         if ( error )
             break;
-        error = get_matching_microcode(mc, cpu);
+        error = get_matching_microcode(mc);
         if ( error < 0 )
             break;
         /*
@@ -440,7 +429,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         error = offset;
 
     if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode(cpu);
+        error = apply_microcode();
 
     return error;
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 911416c..73a1afc 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -363,7 +363,7 @@ void start_secondary(void *unused)
     if ( system_state <= SYS_STATE_smp_boot )
         early_microcode_update_cpu(false);
     else
-        microcode_resume_cpu(cpu);
+        microcode_resume_cpu();
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 3f4c4be..f2a5ea4 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -20,10 +20,9 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
-                                 size_t size);
-    int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
-    int (*apply_microcode)(unsigned int cpu);
+    int (*cpu_request_microcode)(const void *buf, size_t size);
+    int (*collect_cpu_info)(struct cpu_signature *csig);
+    int (*apply_microcode)(void);
     int (*start_update)(void);
     void (*free_patch)(void *mc);
     bool (*match_cpu)(const struct microcode_patch *patch);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 3660238..a673372 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -569,7 +569,7 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(unsigned int cpu);
+int microcode_resume_cpu(void);
 int early_microcode_update_cpu(bool start_update);
 int early_microcode_init(void);
 int microcode_init_intel(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (5 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 06/16] microcode: remove pointless 'cpu' parameter Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 12:34   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 08/16] microcode: pass a patch pointer to apply_microcode() Chao Gao
                   ` (9 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Introduce a vendor hook, .end_update_percpu, for svm_host_osvw_init().
The hook function is called on each cpu after loading an update.
It is a preparation for spliting out apply_microcode() from
cpu_request_microcode().

Note that svm_host_osvm_init() should be called regardless of the
result of loading an update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes in v10:
 - rename end_update to end_update_percpu.
 - use #ifdef rather than #if and frame the implementation with

Changes in v9:
 - call .end_update in early loading path
 - on AMD side, initialize .{start,end}_update only if "CONFIG_HVM"
 is true.
---
 xen/arch/x86/microcode.c        | 10 +++++++++-
 xen/arch/x86/microcode_amd.c    | 25 ++++++++++++-------------
 xen/include/asm-x86/microcode.h |  1 +
 3 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 89a8d2b..5c82a2d 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -276,6 +276,9 @@ static long do_microcode_update(void *_info)
     if ( error )
         info->error = error;
 
+    if ( microcode_ops->end_update_percpu )
+        microcode_ops->end_update_percpu();
+
     info->cpu = cpumask_next(info->cpu, &cpu_online_map);
     if ( info->cpu < nr_cpu_ids )
         return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
@@ -376,7 +379,12 @@ int __init early_microcode_update_cpu(bool start_update)
         if ( rc )
             return rc;
 
-        return microcode_update_cpu(data, len);
+        rc = microcode_update_cpu(data, len);
+
+        if ( microcode_ops->end_update_percpu )
+            microcode_ops->end_update_percpu();
+
+        return rc;
     }
     else
         return -ENOMEM;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 1d27c71..c96a3b3 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -600,10 +600,6 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     free_patch(mc_amd);
 
   out:
-#if CONFIG_HVM
-    svm_host_osvw_init();
-#endif
-
     /*
      * In some cases we may return an error even if processor's microcode has
      * been updated. For example, the first patch in a container file is loaded
@@ -613,29 +609,32 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     return error;
 }
 
+#ifdef CONFIG_HVM
 static int start_update(void)
 {
-#if CONFIG_HVM
     /*
-     * We assume here that svm_host_osvw_init() will be called on each cpu (from
-     * cpu_request_microcode()).
-     *
-     * Note that if collect_cpu_info() returns an error then
-     * cpu_request_microcode() will not invoked thus leaving OSVW bits not
-     * updated. Currently though collect_cpu_info() will not fail on processors
-     * supporting OSVW so we will not deal with this possibility.
+     * svm_host_osvw_init() will be called on each cpu by calling '.end_update'
+     * in common code.
      */
     svm_host_osvw_reset();
-#endif
 
     return 0;
 }
 
+static void end_update_percpu(void)
+{
+    svm_host_osvw_init();
+}
+#endif
+
 static const struct microcode_ops microcode_amd_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+#ifdef CONFIG_HVM
     .start_update                     = start_update,
+    .end_update_percpu                = end_update_percpu,
+#endif
     .free_patch                       = free_patch,
     .compare_patch                    = compare_patch,
     .match_cpu                        = match_cpu,
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index f2a5ea4..b0eee0e 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -24,6 +24,7 @@ struct microcode_ops {
     int (*collect_cpu_info)(struct cpu_signature *csig);
     int (*apply_microcode)(void);
     int (*start_update)(void);
+    void (*end_update_percpu)(void);
     void (*free_patch)(void *mc);
     bool (*match_cpu)(const struct microcode_patch *patch);
     enum microcode_match_result (*compare_patch)(
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 08/16] microcode: pass a patch pointer to apply_microcode()
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (6 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

apply_microcode()'s always loading the cached ucode patch forces
a patch to be stored before being loaded. Make apply_microcode()
accept a patch pointer to remove the limitation so that a patch
can be stored after a successful loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/microcode.c        | 2 +-
 xen/arch/x86/microcode_amd.c    | 5 ++---
 xen/arch/x86/microcode_intel.c  | 5 ++---
 xen/include/asm-x86/microcode.h | 2 +-
 4 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 5c82a2d..b44e4d7 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -208,7 +208,7 @@ int microcode_resume_cpu(void)
 
     err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->apply_microcode();
+        err = microcode_ops->apply_microcode(microcode_cache);
     spin_unlock(&microcode_mutex);
 
     return err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index c96a3b3..c6d2ea3 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -253,7 +253,7 @@ static enum microcode_match_result compare_patch(
     return MIS_UCODE;
 }
 
-static int apply_microcode(void)
+static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
     uint32_t rev;
@@ -261,7 +261,6 @@ static int apply_microcode(void)
     unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *hdr;
-    const struct microcode_patch *patch = microcode_get_cache();
 
     if ( !patch )
         return -ENOENT;
@@ -565,7 +564,7 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
 
         if ( match_cpu(microcode_get_cache()) )
         {
-            error = apply_microcode();
+            error = apply_microcode(microcode_get_cache());
             if ( error )
                 break;
         }
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 5f1ae2f..b1ec81d 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -323,7 +323,7 @@ static int get_matching_microcode(const void *mc)
     return 1;
 }
 
-static int apply_microcode(void)
+static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
     uint64_t msr_content;
@@ -331,7 +331,6 @@ static int apply_microcode(void)
     unsigned int cpu_num = raw_smp_processor_id();
     struct cpu_signature *sig = &this_cpu(cpu_sig);
     const struct microcode_intel *mc_intel;
-    const struct microcode_patch *patch = microcode_get_cache();
 
     if ( !patch )
         return -ENOENT;
@@ -429,7 +428,7 @@ static int cpu_request_microcode(const void *buf, size_t size)
         error = offset;
 
     if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode();
+        error = apply_microcode(microcode_get_cache());
 
     return error;
 }
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index b0eee0e..02feb09 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -22,7 +22,7 @@ struct microcode_patch {
 struct microcode_ops {
     int (*cpu_request_microcode)(const void *buf, size_t size);
     int (*collect_cpu_info)(struct cpu_signature *csig);
-    int (*apply_microcode)(void);
+    int (*apply_microcode)(const struct microcode_patch *patch);
     int (*start_update)(void);
     void (*end_update_percpu)(void);
     void (*free_patch)(void *mc);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (7 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 08/16] microcode: pass a patch pointer to apply_microcode() Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 14:07   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming Chao Gao
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

During late microcode loading, apply_microcode() is invoked in
cpu_request_microcode(). To make late microcode update more reliable,
we want to put the apply_microcode() into stop_machine context. So
we split out it from cpu_request_microcode(). In general, for both
early loading on BSP and late loading, cpu_request_microcode() is
called first to get the matching microcode update contained by
the blob and then apply_microcode() is invoked explicitly on each
cpu in common code.

Given that all CPUs are supposed to have the same signature, parsing
microcode only needs to be done once. So cpu_request_microcode() is
also moved out of microcode_update_cpu().

In some cases (e.g. a broken bios), the system may have multiple
revisions of microcode update. So we would try to load a microcode
update as long as it covers current cpu. And if a cpu loads this patch
successfully, the patch would be stored into the patch cache.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes in v10:
 - make microcode_update_cache static
 - raise an error if loading ucode failed with -EIO
 - ensure end_update_percpu() is called following a successful call of
 start_update()

Changes in v9:
 - remove the calling of ->compare_patch in microcode_update_cpu().
 - drop "microcode_" prefix for static function - microcode_parse_blob().
 - rebase and fix conflict

Changes in v8:
 - divide the original patch into three patches to improve readability
 - load an update on each cpu as long as the update covers current cpu
 - store an update after the first successful loading on a CPU
 - Make sure the current CPU (especially pf value) is covered
 by updates.

changes in v7:
 - to handle load failure, unvalidated patches won't be cached. They
 are passed as function arguments. So if update failed, we needn't
 any cleanup to microcode cache.
---
 xen/arch/x86/microcode.c        | 182 +++++++++++++++++++++++++++-------------
 xen/arch/x86/microcode_amd.c    |  38 +++++----
 xen/arch/x86/microcode_intel.c  |  66 +++++++--------
 xen/include/asm-x86/microcode.h |   5 +-
 4 files changed, 178 insertions(+), 113 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index b44e4d7..d4738f6 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -189,12 +189,19 @@ static DEFINE_SPINLOCK(microcode_mutex);
 
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
-struct microcode_info {
-    unsigned int cpu;
-    uint32_t buffer_size;
-    int error;
-    char buffer[1];
-};
+/*
+ * Return a patch that covers current CPU. If there are multiple patches,
+ * return the one with the highest revision number. Return error If no
+ * patch is found and an error occurs during the parsing process. Otherwise
+ * return NULL.
+ */
+static struct microcode_patch *parse_blob(const char *buf, size_t len)
+{
+    if ( likely(!microcode_ops->collect_cpu_info(&this_cpu(cpu_sig))) )
+        return microcode_ops->cpu_request_microcode(buf, len);
+
+    return NULL;
+}
 
 int microcode_resume_cpu(void)
 {
@@ -220,15 +227,8 @@ void microcode_free_patch(struct microcode_patch *microcode_patch)
     xfree(microcode_patch);
 }
 
-const struct microcode_patch *microcode_get_cache(void)
-{
-    ASSERT(spin_is_locked(&microcode_mutex));
-
-    return microcode_cache;
-}
-
 /* Return true if cache gets updated. Otherwise, return false */
-bool microcode_update_cache(struct microcode_patch *patch)
+static bool microcode_update_cache(struct microcode_patch *patch)
 {
     ASSERT(spin_is_locked(&microcode_mutex));
 
@@ -249,49 +249,80 @@ bool microcode_update_cache(struct microcode_patch *patch)
     return true;
 }
 
-static int microcode_update_cpu(const void *buf, size_t size)
+/*
+ * Load a microcode update to current CPU.
+ *
+ * If no patch is provided, the cached patch will be loaded. Microcode update
+ * during APs bringup and CPU resuming falls into this case.
+ */
+static int microcode_update_cpu(const struct microcode_patch *patch)
 {
-    int err;
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    int err = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
-    spin_lock(&microcode_mutex);
+    if ( unlikely(err) )
+        return err;
 
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(buf, size);
-    spin_unlock(&microcode_mutex);
+    if ( patch )
+        err = microcode_ops->apply_microcode(patch);
+    else if ( microcode_cache )
+    {
+        spin_lock(&microcode_mutex);
+        err = microcode_ops->apply_microcode(microcode_cache);
+        if ( err == -EIO )
+        {
+            microcode_free_patch(microcode_cache);
+            microcode_cache = NULL;
+        }
+        spin_unlock(&microcode_mutex);
+    }
+    else
+        /* No patch to update */
+        err = -ENOENT;
 
     return err;
 }
 
-static long do_microcode_update(void *_info)
+static long do_microcode_update(void *patch)
 {
-    struct microcode_info *info = _info;
-    int error;
-
-    BUG_ON(info->cpu != smp_processor_id());
+    unsigned int cpu;
+    int ret = microcode_update_cpu(patch);
 
-    error = microcode_update_cpu(info->buffer, info->buffer_size);
-    if ( error )
-        info->error = error;
+    /* Store the patch after a successful loading */
+    if ( !ret && patch )
+    {
+        spin_lock(&microcode_mutex);
+        microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
+        patch = NULL;
+    }
 
     if ( microcode_ops->end_update_percpu )
         microcode_ops->end_update_percpu();
 
-    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
-    if ( info->cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    /*
+     * Each thread tries to load ucode and only the first thread of a core
+     * would succeed. Ignore error other than -EIO.
+     */
+    if ( ret != -EIO )
+        ret = 0;
+
+    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
+    if ( cpu < nr_cpu_ids )
+        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch) ?
+                                                                          : ret;
 
-    error = info->error;
-    xfree(info);
-    return error;
+    /* Free the patch if no CPU has loaded it successfully. */
+    if ( patch )
+        microcode_free_patch(patch);
+
+    return ret;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
-    struct microcode_info *info;
+    void *buffer;
+    struct microcode_patch *patch;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -299,32 +330,46 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
-    info = xmalloc_bytes(sizeof(*info) + len);
-    if ( info == NULL )
+    buffer = xmalloc_bytes(len);
+    if ( !buffer )
         return -ENOMEM;
 
-    ret = copy_from_guest(info->buffer, buf, len);
-    if ( ret != 0 )
+    if ( copy_from_guest(buffer, buf, len) )
+    {
+        ret = -EFAULT;
+        goto free;
+    }
+
+    patch = parse_blob(buffer, len);
+    if ( IS_ERR(patch) )
     {
-        xfree(info);
-        return ret;
+        ret = PTR_ERR(patch);
+        printk(XENLOG_WARNING "Parsing microcode blob error %d\n", ret);
+        goto free;
     }
 
-    info->buffer_size = len;
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
+    if ( !patch )
+    {
+        ret = -ENOENT;
+        goto free;
+    }
 
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
         {
-            xfree(info);
-            return ret;
+            microcode_free_patch(patch);
+            goto free;
         }
     }
 
-    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
+                                    do_microcode_update, patch);
+
+ free:
+    xfree(buffer);
+    return ret;
 }
 
 static int __init microcode_init(void)
@@ -371,23 +416,42 @@ int __init early_microcode_update_cpu(bool start_update)
 
     microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
-    if ( data )
+    if ( !data )
+        return -ENOMEM;
+
+    if ( start_update )
     {
-        if ( start_update && microcode_ops->start_update )
+        struct microcode_patch *patch;
+
+        patch = parse_blob(data, len);
+        if ( IS_ERR(patch) )
+        {
+            printk(XENLOG_WARNING "Parsing microcode blob error %ld\n",
+                   PTR_ERR(patch));
+            return PTR_ERR(patch);
+        }
+
+        if ( !patch )
+            return -ENOENT;
+
+        spin_lock(&microcode_mutex);
+        rc = microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
+        ASSERT(rc);
+
+        if ( microcode_ops->start_update )
             rc = microcode_ops->start_update();
 
         if ( rc )
             return rc;
+    }
 
-        rc = microcode_update_cpu(data, len);
+    rc = microcode_update_cpu(NULL);
 
-        if ( microcode_ops->end_update_percpu )
-            microcode_ops->end_update_percpu();
+    if ( microcode_ops->end_update_percpu )
+        microcode_ops->end_update_percpu();
 
-        return rc;
-    }
-    else
-        return -ENOMEM;
+    return rc;
 }
 
 int __init early_microcode_init(void)
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index c6d2ea3..1d1bea4 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -455,9 +455,11 @@ static bool_t check_final_patch_levels(unsigned int cpu)
     return 0;
 }
 
-static int cpu_request_microcode(const void *buf, size_t bufsize)
+static struct microcode_patch *cpu_request_microcode(const void *buf,
+                                                     size_t bufsize)
 {
     struct microcode_amd *mc_amd;
+    struct microcode_patch *patch = NULL;
     size_t offset = 0;
     int error = 0;
     unsigned int current_cpu_id;
@@ -556,19 +558,22 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
             break;
         }
 
-        /* Update cache if this patch covers current CPU */
-        if ( microcode_fits(new_patch->mc_amd) != MIS_UCODE )
-            microcode_update_cache(new_patch);
-        else
-            microcode_free_patch(new_patch);
-
-        if ( match_cpu(microcode_get_cache()) )
+        /*
+         * If the new patch covers current CPU, compare patches and store the
+         * one with higher revision.
+         */
+        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
+             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
         {
-            error = apply_microcode(microcode_get_cache());
-            if ( error )
-                break;
+            struct microcode_patch *tmp = patch;
+
+            patch = new_patch;
+            new_patch = tmp;
         }
 
+        if ( new_patch )
+            microcode_free_patch(new_patch);
+
         if ( offset >= bufsize )
             break;
 
@@ -599,13 +604,10 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     free_patch(mc_amd);
 
   out:
-    /*
-     * In some cases we may return an error even if processor's microcode has
-     * been updated. For example, the first patch in a container file is loaded
-     * successfully but subsequent container file processing encounters a
-     * failure.
-     */
-    return error;
+    if ( error && !patch )
+        patch = ERR_PTR(error);
+
+    return patch;
 }
 
 #ifdef CONFIG_HVM
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index b1ec81d..c3083d7 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -286,14 +286,9 @@ static enum microcode_match_result compare_patch(
                                                              : OLD_UCODE;
 }
 
-/*
- * return 0 - no update found
- * return 1 - found update
- * return < 0 - error
- */
-static int get_matching_microcode(const void *mc)
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_header_intel *mc_header)
 {
-    const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
     struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
@@ -302,25 +297,12 @@ static int get_matching_microcode(const void *mc)
     {
         xfree(new_patch);
         xfree(new_mc);
-        return -ENOMEM;
+        return ERR_PTR(-ENOMEM);
     }
-    memcpy(new_mc, mc, total_size);
+    memcpy(new_mc, mc_header, total_size);
     new_patch->mc_intel = new_mc;
 
-    /* Make sure that this patch covers current CPU */
-    if ( microcode_update_match(mc) == MIS_UCODE )
-    {
-        microcode_free_patch(new_patch);
-        return 0;
-    }
-
-    microcode_update_cache(new_patch);
-
-    pr_debug("microcode: CPU%d found a matching microcode update with"
-             " version %#x (current=%#x)\n",
-             smp_processor_id(), mc_header->rev, this_cpu(cpu_sig).rev);
-
-    return 1;
+    return new_patch;
 }
 
 static int apply_microcode(const struct microcode_patch *patch)
@@ -399,26 +381,44 @@ static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
     return offset + total_size;
 }
 
-static int cpu_request_microcode(const void *buf, size_t size)
+static struct microcode_patch *cpu_request_microcode(const void *buf,
+                                                     size_t size)
 {
     long offset = 0;
     int error = 0;
     void *mc;
+    struct microcode_patch *patch = NULL;
 
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
+        struct microcode_patch *new_patch;
+
         error = microcode_sanity_check(mc);
         if ( error )
             break;
-        error = get_matching_microcode(mc);
-        if ( error < 0 )
+
+        new_patch = alloc_microcode_patch(mc);
+        if ( IS_ERR(new_patch) )
+        {
+            error = PTR_ERR(new_patch);
             break;
+        }
+
         /*
-         * It's possible the data file has multiple matching ucode,
-         * lets keep searching till the latest version
+         * If the new patch covers current CPU, compare patches and store the
+         * one with higher revision.
          */
-        if ( error == 1 )
-            error = 0;
+        if ( (microcode_update_match(&new_patch->mc_intel->hdr) != MIS_UCODE) &&
+             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+        {
+            struct microcode_patch *tmp = patch;
+
+            patch = new_patch;
+            new_patch = tmp;
+        }
+
+        if ( new_patch )
+            microcode_free_patch(new_patch);
 
         xfree(mc);
     }
@@ -427,10 +427,10 @@ static int cpu_request_microcode(const void *buf, size_t size)
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && match_cpu(microcode_get_cache()) )
-        error = apply_microcode(microcode_get_cache());
+    if ( error && !patch )
+        patch = ERR_PTR(error);
 
-    return error;
+    return patch;
 }
 
 static const struct microcode_ops microcode_intel_ops = {
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 02feb09..7d5a1f8 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -20,7 +20,8 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*cpu_request_microcode)(const void *buf, size_t size);
+    struct microcode_patch *(*cpu_request_microcode)(const void *buf,
+                                                     size_t size);
     int (*collect_cpu_info)(struct cpu_signature *csig);
     int (*apply_microcode)(const struct microcode_patch *patch);
     int (*start_update)(void);
@@ -40,8 +41,6 @@ struct cpu_signature {
 DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
 extern const struct microcode_ops *microcode_ops;
 
-const struct microcode_patch *microcode_get_cache(void);
-bool microcode_update_cache(struct microcode_patch *patch);
 void microcode_free_patch(struct microcode_patch *patch);
 
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (8 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 14:59   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch Chao Gao
                   ` (6 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

During system bootup and resuming, CPUs just load the cached ucode.
So one unified function microcode_update_one() is introduced. It
takes a boolean to indicate whether ->start_update should be called.
Since early_microcode_update_cpu() is only called on BSP (APs call
the unified function), start_update is always true and so remove
this parameter.

There is a functional change: ->start_update is called on BSP and
->end_update_percpu is called during system resuming. They are not
invoked by previous microcode_resume_cpu().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v10:
 - call ->start_update for system resume from suspension

Changes in v9:
 - return -EOPNOTSUPP rather than 0 if microcode_ops is NULL in
   microcode_update_one()
 - rebase and fix conflicts.

Changes in v8:
 - split out from the previous patch
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 91 +++++++++++++++++++----------------------
 xen/arch/x86/smpboot.c          |  5 +--
 xen/include/asm-x86/processor.h |  4 +-
 4 files changed, 45 insertions(+), 57 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 269b140..01e6aec 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -278,7 +278,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu();
+    microcode_update_one(true);
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index d4738f6..c2ea20f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -203,24 +203,6 @@ static struct microcode_patch *parse_blob(const char *buf, size_t len)
     return NULL;
 }
 
-int microcode_resume_cpu(void)
-{
-    int err;
-    struct cpu_signature *sig = &this_cpu(cpu_sig);
-
-    if ( !microcode_ops )
-        return 0;
-
-    spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->apply_microcode(microcode_cache);
-    spin_unlock(&microcode_mutex);
-
-    return err;
-}
-
 void microcode_free_patch(struct microcode_patch *microcode_patch)
 {
     microcode_ops->free_patch(microcode_patch->mc);
@@ -394,11 +376,38 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-int __init early_microcode_update_cpu(bool start_update)
+/* Load a cached update to current cpu */
+int microcode_update_one(bool start_update)
+{
+    int err;
+
+    if ( !microcode_ops )
+        return -EOPNOTSUPP;
+
+    microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
+
+    if ( start_update && microcode_ops->start_update )
+    {
+        err = microcode_ops->start_update();
+        if ( err )
+            return err;
+    }
+
+    err = microcode_update_cpu(NULL);
+
+    if ( microcode_ops->end_update_percpu )
+        microcode_ops->end_update_percpu();
+
+    return err;
+}
+
+/* BSP calls this function to parse ucode blob and then apply an update. */
+int __init early_microcode_update_cpu(void)
 {
     int rc = 0;
     void *data = NULL;
     size_t len;
+    struct microcode_patch *patch;
 
     if ( !microcode_ops )
         return -ENOSYS;
@@ -414,44 +423,26 @@ int __init early_microcode_update_cpu(bool start_update)
         data = bootstrap_map(&ucode_mod);
     }
 
-    microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
-
     if ( !data )
         return -ENOMEM;
 
-    if ( start_update )
+    patch = parse_blob(data, len);
+    if ( IS_ERR(patch) )
     {
-        struct microcode_patch *patch;
-
-        patch = parse_blob(data, len);
-        if ( IS_ERR(patch) )
-        {
-            printk(XENLOG_WARNING "Parsing microcode blob error %ld\n",
-                   PTR_ERR(patch));
-            return PTR_ERR(patch);
-        }
-
-        if ( !patch )
-            return -ENOENT;
-
-        spin_lock(&microcode_mutex);
-        rc = microcode_update_cache(patch);
-        spin_unlock(&microcode_mutex);
-        ASSERT(rc);
-
-        if ( microcode_ops->start_update )
-            rc = microcode_ops->start_update();
-
-        if ( rc )
-            return rc;
+        printk(XENLOG_WARNING "Parsing microcode blob error %ld\n",
+               PTR_ERR(patch));
+        return PTR_ERR(patch);
     }
 
-    rc = microcode_update_cpu(NULL);
+    if ( !patch )
+        return -ENOENT;
 
-    if ( microcode_ops->end_update_percpu )
-        microcode_ops->end_update_percpu();
+    spin_lock(&microcode_mutex);
+    rc = microcode_update_cache(patch);
+    spin_unlock(&microcode_mutex);
+    ASSERT(rc);
 
-    return rc;
+    return microcode_update_one(true);
 }
 
 int __init early_microcode_init(void)
@@ -471,7 +462,7 @@ int __init early_microcode_init(void)
         microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
 
         if ( ucode_mod.mod_end || ucode_blob.size )
-            rc = early_microcode_update_cpu(true);
+            rc = early_microcode_update_cpu();
     }
 
     return rc;
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 73a1afc..179f6b6 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -360,10 +360,7 @@ void start_secondary(void *unused)
 
     initialize_cpu_data(cpu);
 
-    if ( system_state <= SYS_STATE_smp_boot )
-        early_microcode_update_cpu(false);
-    else
-        microcode_resume_cpu();
+    microcode_update_one(false);
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index a673372..c92956f 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -569,9 +569,9 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(void);
-int early_microcode_update_cpu(bool start_update);
+int early_microcode_update_cpu(void);
 int early_microcode_init(void);
+int microcode_update_one(bool start_update);
 int microcode_init_intel(void);
 int microcode_init_amd(void);
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (9 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 15:04   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading Chao Gao
                   ` (5 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

To create a microcode patch from a vendor-specific update,
allocate_microcode_patch() copied everything from the update.
It is not efficient. Essentially, we just need to go through
ucodes in the blob, find the one with the newest revision and
install it into the microcode_patch. In the process, buffers
like mc_amd, equiv_cpu_table (on AMD side), and mc (on Intel
side) can be reused. microcode_patch now is allocated after
it is sure that there is a matching ucode.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes in v10:
 - avoid unnecessary type casting
   * introduce compare_header on AMD side
   * specify the type of the first parameter of get_next_ucode_from_buffer()
     on Intel side

Changes in v9:
 - new
---
 xen/arch/x86/microcode_amd.c   | 112 +++++++++++++++++------------------------
 xen/arch/x86/microcode_intel.c |  67 +++++++++---------------
 2 files changed, 69 insertions(+), 110 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 1d1bea4..f05db72 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -194,36 +194,6 @@ static bool match_cpu(const struct microcode_patch *patch)
     return patch && (microcode_fits(patch->mc_amd) == NEW_UCODE);
 }
 
-static struct microcode_patch *alloc_microcode_patch(
-    const struct microcode_amd *mc_amd)
-{
-    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
-    struct microcode_amd *cache = xmalloc(struct microcode_amd);
-    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
-    struct equiv_cpu_entry *equiv_cpu_table =
-                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
-
-    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
-    {
-        xfree(microcode_patch);
-        xfree(cache);
-        xfree(mpb);
-        xfree(equiv_cpu_table);
-        return ERR_PTR(-ENOMEM);
-    }
-
-    memcpy(mpb, mc_amd->mpb, mc_amd->mpb_size);
-    cache->mpb = mpb;
-    cache->mpb_size = mc_amd->mpb_size;
-    memcpy(equiv_cpu_table, mc_amd->equiv_cpu_table,
-           mc_amd->equiv_cpu_table_size);
-    cache->equiv_cpu_table = equiv_cpu_table;
-    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
-    microcode_patch->mc_amd = cache;
-
-    return microcode_patch;
-}
-
 static void free_patch(void *mc)
 {
     struct microcode_amd *mc_amd = mc;
@@ -236,6 +206,17 @@ static void free_patch(void *mc)
     }
 }
 
+static enum microcode_match_result compare_header(
+    const struct microcode_header_amd *new_header,
+    const struct microcode_header_amd *old_header)
+{
+    if ( new_header->processor_rev_id == old_header->processor_rev_id )
+        return (new_header->patch_id > old_header->patch_id) ? NEW_UCODE
+                                                             : OLD_UCODE;
+
+    return MIS_UCODE;
+}
+
 static enum microcode_match_result compare_patch(
     const struct microcode_patch *new, const struct microcode_patch *old)
 {
@@ -246,11 +227,7 @@ static enum microcode_match_result compare_patch(
     ASSERT(microcode_fits(new->mc_amd) != MIS_UCODE);
     ASSERT(microcode_fits(new->mc_amd) != MIS_UCODE);
 
-    if ( new_header->processor_rev_id == old_header->processor_rev_id )
-        return (new_header->patch_id > old_header->patch_id) ?
-                NEW_UCODE : OLD_UCODE;
-
-    return MIS_UCODE;
+    return compare_header(new_header, old_header);
 }
 
 static int apply_microcode(const struct microcode_patch *patch)
@@ -328,18 +305,10 @@ static int get_ucode_from_buffer_amd(
         return -EINVAL;
     }
 
-    if ( mc_amd->mpb_size < mpbuf->len )
-    {
-        if ( mc_amd->mpb )
-        {
-            xfree(mc_amd->mpb);
-            mc_amd->mpb_size = 0;
-        }
-        mc_amd->mpb = xmalloc_bytes(mpbuf->len);
-        if ( mc_amd->mpb == NULL )
-            return -ENOMEM;
-        mc_amd->mpb_size = mpbuf->len;
-    }
+    mc_amd->mpb = xmalloc_bytes(mpbuf->len);
+    if ( !mc_amd->mpb )
+        return -ENOMEM;
+    mc_amd->mpb_size = mpbuf->len;
     memcpy(mc_amd->mpb, mpbuf->data, mpbuf->len);
 
     pr_debug("microcode: CPU%d size %zu, block size %u offset %zu equivID %#x rev %#x\n",
@@ -459,8 +428,9 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
                                                      size_t bufsize)
 {
     struct microcode_amd *mc_amd;
+    struct microcode_header_amd *saved = NULL;
     struct microcode_patch *patch = NULL;
-    size_t offset = 0;
+    size_t offset = 0, saved_size = 0;
     int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
@@ -550,29 +520,22 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        struct microcode_patch *new_patch = alloc_microcode_patch(mc_amd);
-
-        if ( IS_ERR(new_patch) )
-        {
-            error = PTR_ERR(new_patch);
-            break;
-        }
-
         /*
-         * If the new patch covers current CPU, compare patches and store the
+         * If the new ucode covers current CPU, compare ucodes and store the
          * one with higher revision.
          */
-        if ( (microcode_fits(new_patch->mc_amd) != MIS_UCODE) &&
-             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+        if ( (microcode_fits(mc_amd) != MIS_UCODE) &&
+             (!saved || (compare_header(mc_amd->mpb, saved) == NEW_UCODE)) )
         {
-            struct microcode_patch *tmp = patch;
-
-            patch = new_patch;
-            new_patch = tmp;
+            xfree(saved);
+            saved = mc_amd->mpb;
+            saved_size = mc_amd->mpb_size;
+        }
+        else
+        {
+            xfree(mc_amd->mpb);
+            mc_amd->mpb = NULL;
         }
-
-        if ( new_patch )
-            microcode_free_patch(new_patch);
 
         if ( offset >= bufsize )
             break;
@@ -601,7 +564,22 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
              *(const uint32_t *)(buf + offset) == UCODE_MAGIC )
             break;
     }
-    free_patch(mc_amd);
+
+    if ( saved )
+    {
+        mc_amd->mpb = saved;
+        mc_amd->mpb_size = saved_size;
+        patch = xmalloc(struct microcode_patch);
+        if ( patch )
+            patch->mc_amd = mc_amd;
+        else
+        {
+            free_patch(mc_amd);
+            error = -ENOMEM;
+        }
+    }
+    else
+        free_patch(mc_amd);
 
   out:
     if ( error && !patch )
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index c3083d7..4e811b7 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -286,25 +286,6 @@ static enum microcode_match_result compare_patch(
                                                              : OLD_UCODE;
 }
 
-static struct microcode_patch *alloc_microcode_patch(
-    const struct microcode_header_intel *mc_header)
-{
-    unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc = xmalloc_bytes(total_size);
-    struct microcode_patch *new_patch = xmalloc(struct microcode_patch);
-
-    if ( !new_patch || !new_mc )
-    {
-        xfree(new_patch);
-        xfree(new_mc);
-        return ERR_PTR(-ENOMEM);
-    }
-    memcpy(new_mc, mc_header, total_size);
-    new_patch->mc_intel = new_mc;
-
-    return new_patch;
-}
-
 static int apply_microcode(const struct microcode_patch *patch)
 {
     unsigned long flags;
@@ -353,8 +334,9 @@ static int apply_microcode(const struct microcode_patch *patch)
     return 0;
 }
 
-static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
-                                       unsigned long size, long offset)
+static long get_next_ucode_from_buffer(struct microcode_intel **mc,
+                                       const u8 *buf, unsigned long size,
+                                       long offset)
 {
     struct microcode_header_intel *mc_header;
     unsigned long total_size;
@@ -386,47 +368,46 @@ static struct microcode_patch *cpu_request_microcode(const void *buf,
 {
     long offset = 0;
     int error = 0;
-    void *mc;
+    struct microcode_intel *mc, *saved = NULL;
     struct microcode_patch *patch = NULL;
 
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
-        struct microcode_patch *new_patch;
-
         error = microcode_sanity_check(mc);
         if ( error )
-            break;
-
-        new_patch = alloc_microcode_patch(mc);
-        if ( IS_ERR(new_patch) )
         {
-            error = PTR_ERR(new_patch);
+            xfree(mc);
             break;
         }
 
         /*
-         * If the new patch covers current CPU, compare patches and store the
+         * If the new update covers current CPU, compare updates and store the
          * one with higher revision.
          */
-        if ( (microcode_update_match(&new_patch->mc_intel->hdr) != MIS_UCODE) &&
-             (!patch || (compare_patch(new_patch, patch) == NEW_UCODE)) )
+        if ( (microcode_update_match(&mc->hdr) != MIS_UCODE) &&
+             (!saved || (mc->hdr.rev > saved->hdr.rev)) )
         {
-            struct microcode_patch *tmp = patch;
-
-            patch = new_patch;
-            new_patch = tmp;
+            xfree(saved);
+            saved = mc;
         }
-
-        if ( new_patch )
-            microcode_free_patch(new_patch);
-
-        xfree(mc);
+        else
+            xfree(mc);
     }
-    if ( offset > 0 )
-        xfree(mc);
     if ( offset < 0 )
         error = offset;
 
+    if ( saved )
+    {
+        patch = xmalloc(struct microcode_patch);
+        if ( patch )
+            patch->mc_intel = saved;
+        else
+        {
+            xfree(saved);
+            error = -ENOMEM;
+        }
+    }
+
     if ( error && !patch )
         patch = ERR_PTR(error);
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (10 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12 15:32   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 13/16] microcode: remove microcode_update_lock Chao Gao
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Ashok Raj, Wei Liu,
	Jun Nakajima, Andrew Cooper, Jan Beulich, Thomas Gleixner,
	Chao Gao, Roger Pau Monné

This patch ports microcode improvement patches from linux kernel.

Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.

Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
Changes in v10:
 - introduce wait_for_state() and set_state() helper functions
 - make wait_for_condition() return bool and take const void *
 - disable/enable watchdog in control thread
 - rename "master" and "slave" thread to "primary" and "secondary"

Changes in v9:
 - log __buildin_return_address(0) when timeout
 - divide CPUs into three logical sets and they will call different
 functions during ucode loading. The 'control thread' is chosen to
 coordinate ucode loading on all CPUs. Since only control thread would
 set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
 - s/rep_nop/cpu_relax
 - each thread updates its revision number itself
 - add XENLOG_ERR prefix for each line of multi-line log messages

Changes in v8:
 - to support blocking #NMI handling during loading ucode
   * introduce a flag, 'loading_state', to mark the start or end of
     ucode loading.
   * use a bitmap for cpu callin since if cpu may stay in #NMI handling,
     there are two places for a cpu to call in. bitmap won't be counted
     twice.
   * don't wait for all CPUs callout, just wait for CPUs that perform the
     update. We have to do this because some threads may be stuck in NMI
     handling (where cannot reach the rendezvous).
 - emit a warning if the system stays in stop_machine context for more
 than 1s
 - comment that rdtsc is fine while loading an update
 - use cmpxchg() to avoid panic being called on multiple CPUs
 - Propagate revision number to other threads
 - refine comments and prompt messages

Changes in v7:
 - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
 - reword the comment above microcode_update_cpu() to clearly state that
 one thread per core should do the update.
---
 xen/arch/x86/microcode.c | 296 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 269 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index c2ea20f..049eda6 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -30,18 +30,52 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/stop_machine.h>
 #include <xen/tasklet.h>
 #include <xen/guest_access.h>
 #include <xen/earlycpio.h>
+#include <xen/watchdog.h>
 
+#include <asm/delay.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
 
+/*
+ * Before performing a late microcode update on any thread, we
+ * rendezvous all cpus in stop_machine context. The timeout for
+ * waiting for cpu rendezvous is 30ms. It is the timeout used by
+ * live patching
+ */
+#define MICROCODE_CALLIN_TIMEOUT_US 30000
+
+/*
+ * Timeout for each thread to complete update is set to 1s. It is a
+ * conservative choice considering all possible interference.
+ */
+#define MICROCODE_UPDATE_TIMEOUT_US 1000000
+
 static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
+static unsigned int nr_cores;
+
+/*
+ * These states help to coordinate CPUs during loading an update.
+ *
+ * The semantics of each state is as follow:
+ *  - LOADING_PREPARE: initial state of 'loading_state'.
+ *  - LOADING_CALLIN: CPUs are allowed to callin.
+ *  - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
+ *  - LOADING_EXIT: ucode loading is done or aborted.
+ */
+static enum {
+    LOADING_PREPARE,
+    LOADING_CALLIN,
+    LOADING_ENTER,
+    LOADING_EXIT,
+} loading_state;
 
 /*
  * If we scan the initramfs.cpio for the early microcode code
@@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 /*
+ * Count the CPUs that have entered, exited the rendezvous and succeeded in
+ * microcode update during late microcode update respectively.
+ *
+ * Note that a bitmap is used for callin to allow cpu to set a bit multiple
+ * times. It is required to do busy-loop in #NMI handling.
+ */
+static cpumask_t cpu_callin_map;
+static atomic_t cpu_out, cpu_updated;
+
+/*
  * Return a patch that covers current CPU. If there are multiple patches,
  * return the one with the highest revision number. Return error If no
  * patch is found and an error occurs during the parsing process. Otherwise
@@ -231,6 +275,34 @@ static bool microcode_update_cache(struct microcode_patch *patch)
     return true;
 }
 
+/* Wait for a condition to be met with a timeout (us). */
+static int wait_for_condition(bool (*func)(const void *data), void *data,
+                              unsigned int timeout)
+{
+    while ( !func(data) )
+    {
+        if ( !timeout-- )
+        {
+            printk("CPU%u: Timeout in %pS\n",
+                   smp_processor_id(), __builtin_return_address(0));
+            return -EBUSY;
+        }
+        udelay(1);
+    }
+
+    return 0;
+}
+
+static bool wait_cpu_callin(const void *nr)
+{
+    return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
+}
+
+static bool wait_cpu_callout(const void *nr)
+{
+    return atomic_read(&cpu_out) >= (unsigned long)nr;
+}
+
 /*
  * Load a microcode update to current CPU.
  *
@@ -264,38 +336,158 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
     return err;
 }
 
-static long do_microcode_update(void *patch)
+static bool wait_for_state(unsigned int state)
+{
+    while ( loading_state != state )
+    {
+        if ( state != LOADING_EXIT && loading_state == LOADING_EXIT )
+            return false;
+        cpu_relax();
+    }
+
+    return true;
+}
+
+static void set_state(unsigned int state)
+{
+    loading_state = state;
+    smp_wmb();
+}
+
+static int secondary_thread_fn(void)
+{
+    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
+
+    if ( !wait_for_state(LOADING_CALLIN) )
+        return -EBUSY;
+
+    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
+
+    if ( !wait_for_state(LOADING_EXIT) )
+        return -EBUSY;
+
+    /* Copy update revision from the primary thread. */
+    this_cpu(cpu_sig).rev = per_cpu(cpu_sig, primary).rev;
+
+    return 0;
+}
+
+static int primary_thread_fn(const struct microcode_patch *patch)
+{
+    int ret = 0;
+
+    if ( !wait_for_state(LOADING_CALLIN) )
+        return -EBUSY;
+
+    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
+
+    if ( !wait_for_state(LOADING_ENTER) )
+        return -EBUSY;
+
+    ret = microcode_ops->apply_microcode(patch);
+    if ( !ret )
+        atomic_inc(&cpu_updated);
+    atomic_inc(&cpu_out);
+
+    return ret;
+}
+
+static int control_thread_fn(const struct microcode_patch *patch)
 {
-    unsigned int cpu;
-    int ret = microcode_update_cpu(patch);
+    unsigned int cpu = smp_processor_id(), done;
+    unsigned long tick;
+    int ret;
+
+    /*
+     * We intend to disable interrupt for long time, which may lead to
+     * watchdog timeout.
+     */
+    watchdog_disable();
 
-    /* Store the patch after a successful loading */
-    if ( !ret && patch )
+    /* Allow threads to call in */
+    set_state(LOADING_CALLIN);
+
+    cpumask_set_cpu(cpu, &cpu_callin_map);
+
+    /* Waiting for all threads calling in */
+    ret = wait_for_condition(wait_cpu_callin,
+                             (void *)(unsigned long)num_online_cpus(),
+                             MICROCODE_CALLIN_TIMEOUT_US);
+    if ( ret )
     {
-        spin_lock(&microcode_mutex);
-        microcode_update_cache(patch);
-        spin_unlock(&microcode_mutex);
-        patch = NULL;
+        set_state(LOADING_EXIT);
+        return ret;
     }
 
-    if ( microcode_ops->end_update_percpu )
-        microcode_ops->end_update_percpu();
+    /* Let primary threads load the given ucode update */
+    set_state(LOADING_ENTER);
 
+    ret = microcode_ops->apply_microcode(patch);
+    if ( !ret )
+        atomic_inc(&cpu_updated);
+    atomic_inc(&cpu_out);
+
+    tick = rdtsc_ordered();
+    /* Wait for primary threads finishing update */
+    done = atomic_read(&cpu_out);
+    while ( done != nr_cores )
+    {
+        /*
+         * During each timeout interval, at least a CPU is expected to
+         * finish its update. Otherwise, something goes wrong.
+         *
+         * Note that RDTSC (in wait_for_condition()) is safe for threads to
+         * execute while waiting for completion of loading an update.
+         */
+        if ( wait_for_condition(wait_cpu_callout,
+                                (void *)(unsigned long)(done + 1),
+                                MICROCODE_UPDATE_TIMEOUT_US) )
+            panic("Timeout when finished updating microcode (finished %u/%u)",
+                  done, nr_cores);
+
+        /* Print warning message once if long time is spent here */
+        if ( tick && rdtsc_ordered() - tick >= cpu_khz * 1000 )
+        {
+            printk(XENLOG_WARNING
+                   "WARNING: UPDATING MICROCODE HAS CONSUMED MORE THAN 1 SECOND!\n");
+            tick = 0;
+        }
+        done = atomic_read(&cpu_out);
+    }
+
+    /* Mark loading is done to unblock other threads */
+    set_state(LOADING_EXIT);
+
+    watchdog_enable();
+
+    return ret;
+}
+
+static int do_microcode_update(void *patch)
+{
+    unsigned int cpu = smp_processor_id();
     /*
-     * Each thread tries to load ucode and only the first thread of a core
-     * would succeed. Ignore error other than -EIO.
+     * primary thread is the one with the lowest thread id among all siblings
+     * thread in a core or a compute unit. It is chosen to load a microcode
+     * update.
      */
-    if ( ret != -EIO )
-        ret = 0;
+    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
+    int ret;
 
-    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
-    if ( cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch) ?
-                                                                          : ret;
+    /*
+     * The control thread set state to coordinate ucode loading. Primary
+     * threads load the given ucode patch. Secondary threads just wait for
+     * the completion of the ucode loading process.
+     */
+    if ( cpu == cpumask_first(&cpu_online_map) )
+        ret = control_thread_fn(patch);
+    else if ( cpu == primary )
+        ret = primary_thread_fn(patch);
+    else
+        ret = secondary_thread_fn();
 
-    /* Free the patch if no CPU has loaded it successfully. */
-    if ( patch )
-        microcode_free_patch(patch);
+    if ( microcode_ops->end_update_percpu )
+        microcode_ops->end_update_percpu();
 
     return ret;
 }
@@ -304,6 +496,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
+    unsigned int cpu, updated;
     struct microcode_patch *patch;
 
     if ( len != (uint32_t)len )
@@ -322,18 +515,25 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         goto free;
     }
 
+    /* cpu_online_map must not change during update */
+    if ( !get_cpu_maps() )
+    {
+        ret = -EBUSY;
+        goto free;
+    }
+
     patch = parse_blob(buffer, len);
     if ( IS_ERR(patch) )
     {
         ret = PTR_ERR(patch);
         printk(XENLOG_WARNING "Parsing microcode blob error %d\n", ret);
-        goto free;
+        goto put;
     }
 
     if ( !patch )
     {
         ret = -ENOENT;
-        goto free;
+        goto put;
     }
 
     if ( microcode_ops->start_update )
@@ -342,13 +542,55 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         if ( ret != 0 )
         {
             microcode_free_patch(patch);
-            goto free;
+            goto put;
         }
     }
 
-    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
-                                    do_microcode_update, patch);
+    cpumask_clear(&cpu_callin_map);
+    atomic_set(&cpu_out, 0);
+    atomic_set(&cpu_updated, 0);
+    loading_state = LOADING_PREPARE;
+
+    /* Calculate the number of online CPU core */
+    nr_cores = 0;
+    for_each_online_cpu(cpu)
+        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+            nr_cores++;
+
+    printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
+
+    /*
+     * Late loading dance. Why the heavy-handed stop_machine effort?
+     *
+     * - HT siblings must be idle and not execute other code while the other
+     *   sibling is loading microcode in order to avoid any negative
+     *   interactions cause by the loading.
+     *
+     * - In addition, microcode update on the cores must be serialized until
+     *   this requirement can be relaxed in the future. Right now, this is
+     *   conservative and good.
+     */
+    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
+
+    updated = atomic_read(&cpu_updated);
+    if ( updated > 0 )
+    {
+        spin_lock(&microcode_mutex);
+        microcode_update_cache(patch);
+        spin_unlock(&microcode_mutex);
+    }
+    else
+        microcode_free_patch(patch);
+
+    if ( updated && updated != nr_cores )
+        printk(XENLOG_ERR "ERROR: Updating microcode succeeded on %u cores and failed\n"
+               XENLOG_ERR "on other %u cores. A system with differing microcode\n"
+               XENLOG_ERR "revisions is considered unstable. Please reboot and do not\n"
+               XENLOG_ERR "load the microcode that triggers this warning!\n",
+               updated, nr_cores - updated);
 
+ put:
+    put_cpu_maps();
  free:
     xfree(buffer);
     return ret;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 13/16] microcode: remove microcode_update_lock
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (11 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode Chao Gao
                   ` (3 subsequent siblings)
  16 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

microcode_update_lock is to prevent logic threads of a same core from
updating microcode at the same time. But due to using a global lock, it
also prevented parallel microcode updating on different cores.

Remove this lock in order to update microcode in parallel. It is safe
because we have already ensured serialization of sibling threads at the
caller side.
1.For late microcode update, do_microcode_update() ensures that only one
  sibiling thread of a core can update microcode.
2.For microcode update during system startup or CPU-hotplug,
  microcode_mutex() guarantees update serialization of logical threads.
3.get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
  late microcode update.

Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
only) are still processed sequentially.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
---
Changes in v7:
 - reworked. Remove complex lock logics introduced in v5 and v6. The microcode
 patch to be applied is passed as an argument without any global variable. Thus
 no lock is added to serialize potential readers/writers. Callers of
 apply_microcode() will guarantee the correctness: the patch poninted by the
 arguments won't be changed by others.

Changes in v6:
 - introduce early_ucode_update_lock to serialize early ucode update.

Changes in v5:
 - newly add
---
 xen/arch/x86/microcode_amd.c   | 8 +-------
 xen/arch/x86/microcode_intel.c | 8 +-------
 2 files changed, 2 insertions(+), 14 deletions(-)

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index f05db72..856caea 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -74,9 +74,6 @@ struct mpbhdr {
     uint8_t data[];
 };
 
-/* serialize access to the physical write */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 /* See comment in start_update() for cases when this routine fails */
 static int collect_cpu_info(struct cpu_signature *csig)
 {
@@ -232,7 +229,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-    unsigned long flags;
     uint32_t rev;
     int hw_err;
     unsigned int cpu = smp_processor_id();
@@ -247,15 +243,13 @@ static int apply_microcode(const struct microcode_patch *patch)
 
     hdr = patch->mc_amd->mpb;
 
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
 
     /* get patch id after patching */
     rdmsrl(MSR_AMD_PATCHLEVEL, rev);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
-
     /*
      * Some processors leave the ucode blob mapping as UC after the update.
      * Flush the mapping to regain normal cacheability.
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 4e811b7..19f1ba0 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -93,9 +93,6 @@ struct extended_sigtable {
 
 #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
 
-/* serialize access to the physical write to MSR 0x79 */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 static int collect_cpu_info(struct cpu_signature *csig)
 {
     unsigned int cpu_num = smp_processor_id();
@@ -288,7 +285,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(const struct microcode_patch *patch)
 {
-    unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
@@ -303,8 +299,7 @@ static int apply_microcode(const struct microcode_patch *patch)
 
     mc_intel = patch->mc_intel;
 
-    /* serialize access to the physical write to MSR 0x79 */
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     /* write microcode via MSR 0x79 */
     wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
@@ -317,7 +312,6 @@ static int apply_microcode(const struct microcode_patch *patch)
     rdmsrl(MSR_IA32_UCODE_REV, msr_content);
     val[1] = (uint32_t)(msr_content >> 32);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
     if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (12 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 13/16] microcode: remove microcode_update_lock Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-13  9:14   ` Jan Beulich
  2019-09-13  9:18   ` Jan Beulich
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90 Chao Gao
                   ` (2 subsequent siblings)
  16 siblings, 2 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Stefano Stabellini, Ashok Raj, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Ian Jackson, Tim Deegan,
	Julien Grall, Jan Beulich, Andrew Cooper, Chao Gao,
	Roger Pau Monné

When one core is loading ucode, handling NMI on sibling threads or
on other cores in the system might be problematic. By rendezvousing
all CPUs in NMI handler, it prevents NMI acceptance during ucode
loading.

Basically, some work previously done in stop_machine context is
moved to NMI handler. Primary threads call in and load ucode in
NMI handler. Secondary threads wait for the completion of ucode
loading on all CPU cores. An option is introduced to disable this
behavior.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
---
Changes in v10:
 - rewrite based on Sergey's idea and patch
 - add Sergey's SOB.
 - add an option to disable ucode loading in NMI handler
 - don't send IPI NMI to the control thread to avoid unknown_nmi_error()
 in do_nmi().
 - add an assertion to make sure the cpu chosen to handle platform NMI
 won't send self NMI. Otherwise, there is a risk that we encounter
 unknown_nmi_error() and system crashes.

Changes in v9:
 - control threads send NMI to all other threads. Slave threads will
 stay in the NMI handling to prevent NMI acceptance during ucode
 loading. Note that self-nmi is invalid according to SDM.
 - s/rep_nop/cpu_relax
 - remove debug message in microcode_nmi_callback(). Printing debug
 message would take long times and control thread may timeout.
 - rebase and fix conflicts

Changes in v8:
 - new
---
 docs/misc/xen-command-line.pandoc | 10 +++++
 xen/arch/x86/microcode.c          | 95 ++++++++++++++++++++++++++++++++-------
 xen/arch/x86/traps.c              |  6 ++-
 xen/include/asm-x86/nmi.h         |  3 ++
 4 files changed, 96 insertions(+), 18 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 7c72e31..3017073 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2056,6 +2056,16 @@ microcode in the cpio name space must be:
   - on Intel: kernel/x86/microcode/GenuineIntel.bin
   - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
 
+### ucode_loading_in_nmi (x86)
+> `= <boolean>`
+
+> Default: `true`
+
+When one CPU is loading ucode, handling NMIs on sibling threads or threads on
+other cores might cause problems. By default, all CPUs rendezvous in NMI handler
+and load ucode. This option provides a way to disable it in case of some CPUs
+don't allow ucode loading in NMI handler.
+
 ### unrestricted_guest (Intel)
 > `= <boolean>`
 
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 049eda6..64a4321 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -36,8 +36,10 @@
 #include <xen/earlycpio.h>
 #include <xen/watchdog.h>
 
+#include <asm/apic.h>
 #include <asm/delay.h>
 #include <asm/msr.h>
+#include <asm/nmi.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
@@ -125,6 +127,9 @@ static int __init parse_ucode(const char *s)
 }
 custom_param("ucode", parse_ucode);
 
+static bool __read_mostly opt_ucode_loading_in_nmi = true;
+boolean_runtime_param("ucode_loading_in_nmi", opt_ucode_loading_in_nmi);
+
 /*
  * 8MB ought to be enough.
  */
@@ -232,6 +237,7 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
  */
 static cpumask_t cpu_callin_map;
 static atomic_t cpu_out, cpu_updated;
+const struct microcode_patch *nmi_patch;
 
 /*
  * Return a patch that covers current CPU. If there are multiple patches,
@@ -354,6 +360,50 @@ static void set_state(unsigned int state)
     smp_wmb();
 }
 
+static int secondary_thread_work(void)
+{
+    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
+
+    return wait_for_state(LOADING_EXIT) ? 0 : -EBUSY;
+}
+
+static int primary_thread_work(const struct microcode_patch *patch)
+{
+    int ret;
+
+    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
+
+    if ( !wait_for_state(LOADING_ENTER) )
+        return -EBUSY;
+
+    ret = microcode_ops->apply_microcode(patch);
+    if ( !ret )
+        atomic_inc(&cpu_updated);
+    atomic_inc(&cpu_out);
+
+    return ret;
+}
+
+static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
+{
+    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
+    unsigned int controller = cpumask_first(&cpu_online_map);
+
+    /* System-generated NMI, will be ignored */
+    if ( loading_state != LOADING_CALLIN )
+        return 0;
+
+    if ( cpu == controller || (!opt_ucode_loading_in_nmi && cpu == primary) )
+        return 0;
+
+    if ( cpu == primary )
+        primary_thread_work(nmi_patch);
+    else
+        secondary_thread_work();
+
+    return 0;
+}
+
 static int secondary_thread_fn(void)
 {
     unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
@@ -361,10 +411,7 @@ static int secondary_thread_fn(void)
     if ( !wait_for_state(LOADING_CALLIN) )
         return -EBUSY;
 
-    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
-
-    if ( !wait_for_state(LOADING_EXIT) )
-        return -EBUSY;
+    self_nmi();
 
     /* Copy update revision from the primary thread. */
     this_cpu(cpu_sig).rev = per_cpu(cpu_sig, primary).rev;
@@ -379,15 +426,10 @@ static int primary_thread_fn(const struct microcode_patch *patch)
     if ( !wait_for_state(LOADING_CALLIN) )
         return -EBUSY;
 
-    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
-
-    if ( !wait_for_state(LOADING_ENTER) )
-        return -EBUSY;
-
-    ret = microcode_ops->apply_microcode(patch);
-    if ( !ret )
-        atomic_inc(&cpu_updated);
-    atomic_inc(&cpu_out);
+    if ( opt_ucode_loading_in_nmi )
+        self_nmi();
+    else
+        ret = primary_thread_work(patch);
 
     return ret;
 }
@@ -397,6 +439,7 @@ static int control_thread_fn(const struct microcode_patch *patch)
     unsigned int cpu = smp_processor_id(), done;
     unsigned long tick;
     int ret;
+    nmi_callback_t *saved_nmi_callback;
 
     /*
      * We intend to disable interrupt for long time, which may lead to
@@ -404,6 +447,9 @@ static int control_thread_fn(const struct microcode_patch *patch)
      */
     watchdog_disable();
 
+    nmi_patch = patch;
+    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);
+
     /* Allow threads to call in */
     set_state(LOADING_CALLIN);
 
@@ -419,14 +465,23 @@ static int control_thread_fn(const struct microcode_patch *patch)
         return ret;
     }
 
-    /* Let primary threads load the given ucode update */
-    set_state(LOADING_ENTER);
-
+    /* Control thread loads ucode first while others are in NMI handler. */
     ret = microcode_ops->apply_microcode(patch);
     if ( !ret )
         atomic_inc(&cpu_updated);
     atomic_inc(&cpu_out);
 
+    if ( ret == -EIO )
+    {
+        printk(XENLOG_ERR
+               "Late loading aborted: CPU%u failed to update ucode\n", cpu);
+        set_state(LOADING_EXIT);
+        return ret;
+    }
+
+    /* Let primary threads load the given ucode update */
+    set_state(LOADING_ENTER);
+
     tick = rdtsc_ordered();
     /* Wait for primary threads finishing update */
     done = atomic_read(&cpu_out);
@@ -458,6 +513,7 @@ static int control_thread_fn(const struct microcode_patch *patch)
     /* Mark loading is done to unblock other threads */
     set_state(LOADING_EXIT);
 
+    set_nmi_callback(saved_nmi_callback);
     watchdog_enable();
 
     return ret;
@@ -522,6 +578,13 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         goto free;
     }
 
+    /*
+     * CPUs except the first online CPU would send a fake (self) NMI to
+     * rendezvous in NMI handler. But a fake NMI to nmi_cpu may trigger
+     * unknown_nmi_error(). It ensures nmi_cpu won't receive a fake NMI.
+     */
+    ASSERT( !cpu_online(nmi_cpu) || nmi_cpu == cpumask_first(&cpu_online_map) );
+
     patch = parse_blob(buffer, len);
     if ( IS_ERR(patch) )
     {
diff --git a/xen/arch/x86/traps.c b/xen/arch/x86/traps.c
index 16c590d..503f5c8 100644
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -126,6 +126,8 @@ boolean_param("ler", opt_ler);
 /* LastExceptionFromIP on this hardware.  Zero if LER is not in use. */
 unsigned int __read_mostly ler_msr;
 
+unsigned int __read_mostly nmi_cpu;
+
 #define stack_words_per_line 4
 #define ESP_BEFORE_EXCEPTION(regs) ((unsigned long *)regs->rsp)
 
@@ -1679,7 +1681,7 @@ void do_nmi(const struct cpu_user_regs *regs)
      * this port before we re-arm the NMI watchdog, we reduce the chance
      * of having an NMI watchdog expire while in the SMI handler.
      */
-    if ( cpu == 0 )
+    if ( cpu == nmi_cpu )
         reason = inb(0x61);
 
     if ( (nmi_watchdog == NMI_NONE) ||
@@ -1687,7 +1689,7 @@ void do_nmi(const struct cpu_user_regs *regs)
         handle_unknown = true;
 
     /* Only the BSP gets external NMIs from the system. */
-    if ( cpu == 0 )
+    if ( cpu == nmi_cpu )
     {
         if ( reason & 0x80 )
             pci_serr_error(regs);
diff --git a/xen/include/asm-x86/nmi.h b/xen/include/asm-x86/nmi.h
index 99f6284..dbebffe 100644
--- a/xen/include/asm-x86/nmi.h
+++ b/xen/include/asm-x86/nmi.h
@@ -11,6 +11,9 @@ extern bool opt_watchdog;
 
 /* Watchdog force parameter from the command line */
 extern bool watchdog_force;
+
+/* CPU to handle platform NMI */
+extern unsigned int nmi_cpu;
  
 typedef int nmi_callback_t(const struct cpu_user_regs *regs, int cpu);
  
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (13 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-13  9:22   ` Jan Beulich
  2019-09-13 12:23   ` Andrew Cooper
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally Chao Gao
  2019-09-13  8:47 ` [Xen-devel] [PATCH v10 00/16] improve late microcode loading Jan Beulich
  16 siblings, 2 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

It ports the implementation of is_blacklisted() in linux kernel
to Xen.

Late loading may cause system hang if CPUs are affected by BDF90.
Check against BDF90 before performing a late loading.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode.c        |  6 ++++++
 xen/arch/x86/microcode_intel.c  | 23 +++++++++++++++++++++++
 xen/include/asm-x86/microcode.h |  1 +
 3 files changed, 30 insertions(+)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 64a4321..dbd2730 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -561,6 +561,12 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
+    if ( microcode_ops->is_blacklisted && microcode_ops->is_blacklisted() )
+    {
+        printk(XENLOG_WARNING "Late ucode loading is disabled!\n");
+        return -EPERM;
+    }
+
     buffer = xmalloc_bytes(len);
     if ( !buffer )
         return -ENOMEM;
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 19f1ba0..bcef668 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -28,6 +28,7 @@
 #include <xen/smp.h>
 #include <xen/spinlock.h>
 
+#include <asm/div64.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/microcode.h>
@@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
                                                              : OLD_UCODE;
 }
 
+static bool is_blacklisted(void)
+{
+    struct cpuinfo_x86 *c = &current_cpu_data;
+    uint64_t llc_size = c->x86_cache_size * 1024ULL;
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
+
+    do_div(llc_size, c->x86_max_cores);
+
+    /*
+     * Late loading on model 79 with microcode revision less than 0x0b000021
+     * and LLC size per core bigger than 2.5MB may result in a system hang.
+     * This behavior is documented in item BDF90, #334165 (Intel Xeon
+     * Processor E7-8800/4800 v4 Product Family).
+     */
+    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
+         llc_size > 2621440 && sig->rev < 0x0b000021 )
+        return true;
+
+    return false;
+}
+
 static int apply_microcode(const struct microcode_patch *patch)
 {
     uint64_t msr_content;
@@ -415,6 +437,7 @@ static const struct microcode_ops microcode_intel_ops = {
     .free_patch                       = free_patch,
     .compare_patch                    = compare_patch,
     .match_cpu                        = match_cpu,
+    .is_blacklisted                   = is_blacklisted,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 7d5a1f8..9ffd9d2 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -30,6 +30,7 @@ struct microcode_ops {
     bool (*match_cpu)(const struct microcode_patch *patch);
     enum microcode_match_result (*compare_patch)(
         const struct microcode_patch *new, const struct microcode_patch *old);
+    bool (*is_blacklisted)(void);
 };
 
 struct cpu_signature {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (14 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90 Chao Gao
@ 2019-09-12  7:22 ` Chao Gao
  2019-09-13  9:32   ` Jan Beulich
  2019-09-13  8:47 ` [Xen-devel] [PATCH v10 00/16] improve late microcode loading Jan Beulich
  16 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-12  7:22 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

It is needed to mitigate some issues on this specific Broadwell CPU.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode_intel.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index bcef668..4e5e7f9 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -305,6 +305,31 @@ static bool is_blacklisted(void)
     return false;
 }
 
+static void microcode_quirk(void)
+{
+    struct cpuinfo_x86 *c;
+    uint64_t llc_size;
+
+    /*
+     * Don't refer to current_cpu_data, which isn't fully initialized
+     * before this stage.
+     */
+    if ( system_state < SYS_STATE_smp_boot )
+        return;
+
+    c = &current_cpu_data;
+    llc_size = c->x86_cache_size * 1024ULL;
+    do_div(llc_size, c->x86_max_cores);
+
+    /*
+     * To mitigate some issues on this specific Broadwell CPU, writeback and
+     * invalidate cache regardless of ucode revision.
+     */
+    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
+         llc_size > 2621440 )
+        wbinvd();
+}
+
 static int apply_microcode(const struct microcode_patch *patch)
 {
     uint64_t msr_content;
@@ -323,6 +348,8 @@ static int apply_microcode(const struct microcode_patch *patch)
 
     BUG_ON(local_irq_is_enabled());
 
+    microcode_quirk();
+
     /* write microcode via MSR 0x79 */
     wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match()
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-09-12 10:24   ` Jan Beulich
  2019-09-13  6:50     ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 10:24 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -134,21 +134,11 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>      return 0;
>  }
>  
> -static inline int microcode_update_match(
> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
> -    int sig, int pf)
> +static int microcode_sanity_check(const void *mc)
>  {
> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
> -
> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
> -            (mc_header->rev > uci->cpu_sig.rev));
> -}
> -
> -static int microcode_sanity_check(void *mc)
> -{
> -    struct microcode_header_intel *mc_header = mc;
> -    struct extended_sigtable *ext_header = NULL;
> -    struct extended_signature *ext_sig;
> +    const struct microcode_header_intel *mc_header = mc;
> +    const struct extended_sigtable *ext_header = NULL;
> +    const struct extended_signature *ext_sig;
>      unsigned long total_size, data_size, ext_table_size;
>      unsigned int ext_sigcount = 0, i;
>      uint32_t sum, orig_sum;
> @@ -234,6 +224,42 @@ static int microcode_sanity_check(void *mc)
>      return 0;
>  }
>  
> +/* Check an update against the CPU signature and current update revision */
> +static enum microcode_match_result microcode_update_match(
> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
> +{
> +    const struct extended_sigtable *ext_header;
> +    const struct extended_signature *ext_sig;
> +    unsigned int i;
> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
> +    unsigned int sig = uci->cpu_sig.sig;
> +    unsigned int pf = uci->cpu_sig.pf;
> +    unsigned int rev = uci->cpu_sig.rev;
> +    unsigned long data_size = get_datasize(mc_header);
> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
> +
> +    ASSERT(!microcode_sanity_check(mc_header));
> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
> +
> +    ext_header = (const void *)(mc_header + 1) + data_size;
> +    ext_sig = (const void *)(ext_header + 1);
> +
> +    /*
> +     * Make sure there is enough space to hold an extended header and enough
> +     * array elements.
> +     */
> +    if ( (end < (const void *)ext_sig) ||
> +         (end < (const void *)(ext_sig + ext_header->count)) )
> +        return MIS_UCODE;

With you now assuming that the blob has previously passed
microcode_sanity_check(), this only needs to be

    if ( (end <= (const void *)ext_sig) )
        return MIS_UCODE;

now afaict.

Reviewed-by: Jan Beulich <jbeulich@suse.com>
preferably with this adjustment (assuming you agree).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-09-12 10:29   ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 10:29 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -260,6 +260,36 @@ static enum microcode_match_result microcode_update_match(
>      return MIS_UCODE;
>  }
>  
> +static bool match_cpu(const struct microcode_patch *patch)
> +{
> +    if ( !patch )
> +        return false;
> +
> +    return microcode_update_match(&patch->mc_intel->hdr,
> +                                  smp_processor_id()) == NEW_UCODE;
> +}
> +
> +static void free_patch(void *mc)
> +{
> +    xfree(mc);
> +}
> +
> +static enum microcode_match_result compare_patch(
> +    const struct microcode_patch *new, const struct microcode_patch *old)
> +{
> +    /*
> +     * Both patches to compare are supposed to be applicable to local CPU.
> +     * Just compare the revision number.
> +     */
> +    ASSERT(microcode_update_match(&old->mc_intel->hdr, smp_processor_id()) !=
> +               MIS_UCODE);
> +    ASSERT(microcode_update_match(&new->mc_intel->hdr, smp_processor_id()) !=
> +               MIS_UCODE);

With the indentation here fixed
Acked-by: Jan Beulich <jbeulich@suse.com>

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
@ 2019-09-12 12:34   ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 12:34 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> Introduce a vendor hook, .end_update_percpu, for svm_host_osvw_init().
> The hook function is called on each cpu after loading an update.
> It is a preparation for spliting out apply_microcode() from
> cpu_request_microcode().
> 
> Note that svm_host_osvm_init() should be called regardless of the
> result of loading an update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
albeit with one further remark:

> @@ -613,29 +609,32 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
>      return error;
>  }
>  
> +#ifdef CONFIG_HVM
>  static int start_update(void)
>  {
> -#if CONFIG_HVM
>      /*
> -     * We assume here that svm_host_osvw_init() will be called on each cpu (from
> -     * cpu_request_microcode()).
> -     *
> -     * Note that if collect_cpu_info() returns an error then
> -     * cpu_request_microcode() will not invoked thus leaving OSVW bits not
> -     * updated. Currently though collect_cpu_info() will not fail on processors
> -     * supporting OSVW so we will not deal with this possibility.
> +     * svm_host_osvw_init() will be called on each cpu by calling '.end_update'
> +     * in common code.
>       */
>      svm_host_osvw_reset();
> -#endif
>  
>      return 0;
>  }
>  
> +static void end_update_percpu(void)
> +{
> +    svm_host_osvw_init();
> +}

I don't see the need for this wrapper, you could ...

> +#endif
> +
>  static const struct microcode_ops microcode_amd_ops = {
>      .cpu_request_microcode            = cpu_request_microcode,
>      .collect_cpu_info                 = collect_cpu_info,
>      .apply_microcode                  = apply_microcode,
> +#ifdef CONFIG_HVM
>      .start_update                     = start_update,
> +    .end_update_percpu                = end_update_percpu,

... use svm_host_osvw_init here directly.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-09-12 14:07   ` Jan Beulich
  2019-09-13  6:47     ` Chao Gao
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 14:07 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> @@ -249,49 +249,80 @@ bool microcode_update_cache(struct microcode_patch *patch)
>      return true;
>  }
>  
> -static int microcode_update_cpu(const void *buf, size_t size)
> +/*
> + * Load a microcode update to current CPU.
> + *
> + * If no patch is provided, the cached patch will be loaded. Microcode update
> + * during APs bringup and CPU resuming falls into this case.
> + */
> +static int microcode_update_cpu(const struct microcode_patch *patch)
>  {
> -    int err;
> -    unsigned int cpu = smp_processor_id();
> -    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
> +    int err = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>  
> -    spin_lock(&microcode_mutex);
> +    if ( unlikely(err) )
> +        return err;
>  
> -    err = microcode_ops->collect_cpu_info(sig);
> -    if ( likely(!err) )
> -        err = microcode_ops->cpu_request_microcode(buf, size);
> -    spin_unlock(&microcode_mutex);
> +    if ( patch )
> +        err = microcode_ops->apply_microcode(patch);
> +    else if ( microcode_cache )
> +    {
> +        spin_lock(&microcode_mutex);
> +        err = microcode_ops->apply_microcode(microcode_cache);
> +        if ( err == -EIO )
> +        {
> +            microcode_free_patch(microcode_cache);
> +            microcode_cache = NULL;
> +        }
> +        spin_unlock(&microcode_mutex);
> +    }

I'm having trouble understanding the locking discipline here: Why
do you call ->apply_microcode() once with the lock held and once
without? If this is to guard against microcode_cache changing,
then (a) the check of it being non-NULL would need to be done with
the lock held as well and (b) you'd need to explain why the non-
locked call to ->apply_microcode() is okay.

It certainly wasn't this way in v8, yet the v9 revision log also
doesn't mention such a (not insignificant) change (which is part
of the reason why I didn't spot it in v9).

> +    else
> +        /* No patch to update */
> +        err = -ENOENT;
>  
>      return err;
>  }
>  
> -static long do_microcode_update(void *_info)
> +static long do_microcode_update(void *patch)
>  {
> -    struct microcode_info *info = _info;
> -    int error;
> -
> -    BUG_ON(info->cpu != smp_processor_id());
> +    unsigned int cpu;
> +    int ret = microcode_update_cpu(patch);
>  
> -    error = microcode_update_cpu(info->buffer, info->buffer_size);
> -    if ( error )
> -        info->error = error;
> +    /* Store the patch after a successful loading */
> +    if ( !ret && patch )
> +    {
> +        spin_lock(&microcode_mutex);
> +        microcode_update_cache(patch);
> +        spin_unlock(&microcode_mutex);
> +        patch = NULL;
> +    }
>  
>      if ( microcode_ops->end_update_percpu )
>          microcode_ops->end_update_percpu();
>  
> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
> -    if ( info->cpu < nr_cpu_ids )
> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +    /*
> +     * Each thread tries to load ucode and only the first thread of a core
> +     * would succeed. Ignore error other than -EIO.
> +     */
> +    if ( ret != -EIO )
> +        ret = 0;

I don't think this is a good idea. Ignoring a _specific_ error
code (e.g. indicating "already loaded" or "newer patch already
loaded") is fine, but here you also ignore things like -ENOMEM
or -EINVAL.

> +    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
> +    if ( cpu < nr_cpu_ids )
> +        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch) ?
> +                                                                          : ret;

When there's no middle operand I don't think ? and : should be on
separate lines.

> @@ -299,32 +330,46 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      if ( microcode_ops == NULL )
>          return -EINVAL;
>  
> -    info = xmalloc_bytes(sizeof(*info) + len);
> -    if ( info == NULL )
> +    buffer = xmalloc_bytes(len);
> +    if ( !buffer )
>          return -ENOMEM;
>  
> -    ret = copy_from_guest(info->buffer, buf, len);
> -    if ( ret != 0 )
> +    if ( copy_from_guest(buffer, buf, len) )
> +    {
> +        ret = -EFAULT;
> +        goto free;
> +    }
> +
> +    patch = parse_blob(buffer, len);

You don't look to be using buffer anymore below this point. Why don't
you free it right here, avoiding the need for the "free" label below
and also further reducing the overall code churn as it seems.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming Chao Gao
@ 2019-09-12 14:59   ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 14:59 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> During system bootup and resuming, CPUs just load the cached ucode.
> So one unified function microcode_update_one() is introduced. It
> takes a boolean to indicate whether ->start_update should be called.
> Since early_microcode_update_cpu() is only called on BSP (APs call
> the unified function), start_update is always true and so remove
> this parameter.
> 
> There is a functional change: ->start_update is called on BSP and
> ->end_update_percpu is called during system resuming. They are not
> invoked by previous microcode_resume_cpu().
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch Chao Gao
@ 2019-09-12 15:04   ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 15:04 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> To create a microcode patch from a vendor-specific update,
> allocate_microcode_patch() copied everything from the update.
> It is not efficient. Essentially, we just need to go through
> ucodes in the blob, find the one with the newest revision and
> install it into the microcode_patch. In the process, buffers
> like mc_amd, equiv_cpu_table (on AMD side), and mc (on Intel
> side) can be reused. microcode_patch now is allocated after
> it is sure that there is a matching ucode.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>

> @@ -353,8 +334,9 @@ static int apply_microcode(const struct microcode_patch *patch)
>      return 0;
>  }
>  
> -static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
> -                                       unsigned long size, long offset)
> +static long get_next_ucode_from_buffer(struct microcode_intel **mc,
> +                                       const u8 *buf, unsigned long size,
> +                                       long offset)

As you touch this anyway, it would have been nice if you had
taken the time to correct the one or two style issues here
(u8 -> uint8_t and likely long -> unsigned long).

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-09-12 15:32   ` Jan Beulich
  2019-09-13  7:01     ` Chao Gao
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-12 15:32 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper,
	Jun Nakajima, xen-devel, Thomas Gleixner, Borislav Petkov,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> @@ -264,38 +336,158 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>      return err;
>  }
>  
> -static long do_microcode_update(void *patch)
> +static bool wait_for_state(unsigned int state)
> +{
> +    while ( loading_state != state )
> +    {
> +        if ( state != LOADING_EXIT && loading_state == LOADING_EXIT )
> +            return false;

This is at least somewhat confusing: There's no indication here
that "loading_state" may change behind the function's back. So
in general one could be (and I initially was) tempted to suggest
dropping the apparently redundant left side of the &&. But that
would end up wrong if the compiler translates the above to two
separate reads of "loading_state". Therefore I'd like to suggest

static bool wait_for_state(typeof(loading_state) state)
{
    typeof(loading_state) cur_state;

    while ( (cur_state = ACCESS_ONCE(loading_state)) != state )
    {
        if ( cur_state == LOADING_EXIT )
            return false;
        cpu_relax();
    }

    return true;
}

or something substantially similar (if, e.g., you dislike the
use of typeof() here).

> +static int secondary_thread_fn(void)
> +{
> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
> +
> +    if ( !wait_for_state(LOADING_CALLIN) )
> +        return -EBUSY;
> +
> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
> +
> +    if ( !wait_for_state(LOADING_EXIT) )
> +        return -EBUSY;

This return looks to be unreachable, doesn't it?

> +static int control_thread_fn(const struct microcode_patch *patch)
>  {
> -    unsigned int cpu;
> -    int ret = microcode_update_cpu(patch);
> +    unsigned int cpu = smp_processor_id(), done;
> +    unsigned long tick;
> +    int ret;
> +
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();

With the move here, the comment should start "We intend to keep
interrupts disabled for a long time, ..." - they are disabled
already at this point.

> -    /* Store the patch after a successful loading */
> -    if ( !ret && patch )
> +    /* Allow threads to call in */
> +    set_state(LOADING_CALLIN);
> +
> +    cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> +    /* Waiting for all threads calling in */
> +    ret = wait_for_condition(wait_cpu_callin,
> +                             (void *)(unsigned long)num_online_cpus(),

I'm puzzled by my own suggestion in reply to v9, and I'm equally
puzzled that you didn't take it that little step further: All of
this casting would be unnecessary if the two wait_cpu_*()
functions took "unsigned int" as their parameters. (The
observation with v9 was that both are similar enough to allow
for a common signature, even if that signature would not be a
typical one for a callback.)

> +                             MICROCODE_CALLIN_TIMEOUT_US);
> +    if ( ret )
>      {
> -        spin_lock(&microcode_mutex);
> -        microcode_update_cache(patch);
> -        spin_unlock(&microcode_mutex);
> -        patch = NULL;
> +        set_state(LOADING_EXIT);
> +        return ret;
>      }
>  
> -    if ( microcode_ops->end_update_percpu )
> -        microcode_ops->end_update_percpu();
> +    /* Let primary threads load the given ucode update */
> +    set_state(LOADING_ENTER);
>  
> +    ret = microcode_ops->apply_microcode(patch);
> +    if ( !ret )
> +        atomic_inc(&cpu_updated);
> +    atomic_inc(&cpu_out);
> +
> +    tick = rdtsc_ordered();
> +    /* Wait for primary threads finishing update */
> +    done = atomic_read(&cpu_out);

Would you mind eliminating the duplication of this and ...

> +    while ( done != nr_cores )
> +    {
> +        /*
> +         * During each timeout interval, at least a CPU is expected to
> +         * finish its update. Otherwise, something goes wrong.
> +         *
> +         * Note that RDTSC (in wait_for_condition()) is safe for threads to
> +         * execute while waiting for completion of loading an update.
> +         */
> +        if ( wait_for_condition(wait_cpu_callout,
> +                                (void *)(unsigned long)(done + 1),
> +                                MICROCODE_UPDATE_TIMEOUT_US) )
> +            panic("Timeout when finished updating microcode (finished %u/%u)",
> +                  done, nr_cores);
> +
> +        /* Print warning message once if long time is spent here */
> +        if ( tick && rdtsc_ordered() - tick >= cpu_khz * 1000 )
> +        {
> +            printk(XENLOG_WARNING
> +                   "WARNING: UPDATING MICROCODE HAS CONSUMED MORE THAN 1 SECOND!\n");
> +            tick = 0;
> +        }
> +        done = atomic_read(&cpu_out);

... this, by doing the assignment inside the while()?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-09-12 14:07   ` Jan Beulich
@ 2019-09-13  6:47     ` Chao Gao
  0 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-13  6:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On Thu, Sep 12, 2019 at 04:07:16PM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> @@ -249,49 +249,80 @@ bool microcode_update_cache(struct microcode_patch *patch)
>>      return true;
>>  }
>>  
>> -static int microcode_update_cpu(const void *buf, size_t size)
>> +/*
>> + * Load a microcode update to current CPU.
>> + *
>> + * If no patch is provided, the cached patch will be loaded. Microcode update
>> + * during APs bringup and CPU resuming falls into this case.
>> + */
>> +static int microcode_update_cpu(const struct microcode_patch *patch)
>>  {
>> -    int err;
>> -    unsigned int cpu = smp_processor_id();
>> -    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
>> +    int err = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
>>  
>> -    spin_lock(&microcode_mutex);
>> +    if ( unlikely(err) )
>> +        return err;
>>  
>> -    err = microcode_ops->collect_cpu_info(sig);
>> -    if ( likely(!err) )
>> -        err = microcode_ops->cpu_request_microcode(buf, size);
>> -    spin_unlock(&microcode_mutex);
>> +    if ( patch )
>> +        err = microcode_ops->apply_microcode(patch);
>> +    else if ( microcode_cache )
>> +    {
>> +        spin_lock(&microcode_mutex);
>> +        err = microcode_ops->apply_microcode(microcode_cache);
>> +        if ( err == -EIO )
>> +        {
>> +            microcode_free_patch(microcode_cache);
>> +            microcode_cache = NULL;
>> +        }
>> +        spin_unlock(&microcode_mutex);
>> +    }
>
>I'm having trouble understanding the locking discipline here: Why
>do you call ->apply_microcode() once with the lock held and once
>without? If this is to guard against microcode_cache changing,

Yes. microcode_cache is protected by microcode_mutex;

>then (a) the check of it being non-NULL would need to be done with
>the lock held as well and

Will do.

>(b) you'd need to explain why the non-
>locked call to ->apply_microcode() is okay.

->apply_microcode() was always called with this lock held was because
it always read the old per-cpu cache which was protected by the lock.
It gave us an impression that ->apply_microcode() was protected by the
lock.

The patch before this one makes ->apply_microcode() accept a patch
pointer. With this change, if the patch being passed should be accessed
with some lock held (like the secondary call site above), we acquire
the lock. Otherwise, no lock is taken and the caller of
microcode_update_cpu() is supposed to guarantee the patch won't be
changed by others.

>
>It certainly wasn't this way in v8, yet the v9 revision log also
>doesn't mention such a (not insignificant) change (which is part
>of the reason why I didn't spot it in v9).

It is my bad.

>
>> +    else
>> +        /* No patch to update */
>> +        err = -ENOENT;
>>  
>>      return err;
>>  }
>>  
>> -static long do_microcode_update(void *_info)
>> +static long do_microcode_update(void *patch)
>>  {
>> -    struct microcode_info *info = _info;
>> -    int error;
>> -
>> -    BUG_ON(info->cpu != smp_processor_id());
>> +    unsigned int cpu;
>> +    int ret = microcode_update_cpu(patch);
>>  
>> -    error = microcode_update_cpu(info->buffer, info->buffer_size);
>> -    if ( error )
>> -        info->error = error;
>> +    /* Store the patch after a successful loading */
>> +    if ( !ret && patch )
>> +    {
>> +        spin_lock(&microcode_mutex);
>> +        microcode_update_cache(patch);
>> +        spin_unlock(&microcode_mutex);
>> +        patch = NULL;
>> +    }
>>  
>>      if ( microcode_ops->end_update_percpu )
>>          microcode_ops->end_update_percpu();
>>  
>> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
>> -    if ( info->cpu < nr_cpu_ids )
>> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>> +    /*
>> +     * Each thread tries to load ucode and only the first thread of a core
>> +     * would succeed. Ignore error other than -EIO.
>> +     */
>> +    if ( ret != -EIO )
>> +        ret = 0;
>
>I don't think this is a good idea. Ignoring a _specific_ error
>code (e.g. indicating "already loaded" or "newer patch already
>loaded") is fine, but here you also ignore things like -ENOMEM
>or -EINVAL.

will do.

>
>> +    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>> +    if ( cpu < nr_cpu_ids )
>> +        return continue_hypercall_on_cpu(cpu, do_microcode_update, patch) ?
>> +                                                                          : ret;
>
>When there's no middle operand I don't think ? and : should be on
>separate lines.
>
>> @@ -299,32 +330,46 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>      if ( microcode_ops == NULL )
>>          return -EINVAL;
>>  
>> -    info = xmalloc_bytes(sizeof(*info) + len);
>> -    if ( info == NULL )
>> +    buffer = xmalloc_bytes(len);
>> +    if ( !buffer )
>>          return -ENOMEM;
>>  
>> -    ret = copy_from_guest(info->buffer, buf, len);
>> -    if ( ret != 0 )
>> +    if ( copy_from_guest(buffer, buf, len) )
>> +    {
>> +        ret = -EFAULT;
>> +        goto free;
>> +    }
>> +
>> +    patch = parse_blob(buffer, len);
>
>You don't look to be using buffer anymore below this point. Why don't
>you free it right here, avoiding the need for the "free" label below
>and also further reducing the overall code churn as it seems.

Yes. Good idea.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match()
  2019-09-12 10:24   ` Jan Beulich
@ 2019-09-13  6:50     ` Jan Beulich
  2019-09-13  7:02       ` Chao Gao
  0 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  6:50 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 12:24, Jan Beulich wrote:
> On 12.09.2019 09:22, Chao Gao wrote:
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -134,21 +134,11 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>>      return 0;
>>  }
>>  
>> -static inline int microcode_update_match(
>> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>> -    int sig, int pf)
>> +static int microcode_sanity_check(const void *mc)
>>  {
>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>> -
>> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>> -            (mc_header->rev > uci->cpu_sig.rev));
>> -}
>> -
>> -static int microcode_sanity_check(void *mc)
>> -{
>> -    struct microcode_header_intel *mc_header = mc;
>> -    struct extended_sigtable *ext_header = NULL;
>> -    struct extended_signature *ext_sig;
>> +    const struct microcode_header_intel *mc_header = mc;
>> +    const struct extended_sigtable *ext_header = NULL;
>> +    const struct extended_signature *ext_sig;
>>      unsigned long total_size, data_size, ext_table_size;
>>      unsigned int ext_sigcount = 0, i;
>>      uint32_t sum, orig_sum;
>> @@ -234,6 +224,42 @@ static int microcode_sanity_check(void *mc)
>>      return 0;
>>  }
>>  
>> +/* Check an update against the CPU signature and current update revision */
>> +static enum microcode_match_result microcode_update_match(
>> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
>> +{
>> +    const struct extended_sigtable *ext_header;
>> +    const struct extended_signature *ext_sig;
>> +    unsigned int i;
>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> +    unsigned int sig = uci->cpu_sig.sig;
>> +    unsigned int pf = uci->cpu_sig.pf;
>> +    unsigned int rev = uci->cpu_sig.rev;
>> +    unsigned long data_size = get_datasize(mc_header);
>> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
>> +
>> +    ASSERT(!microcode_sanity_check(mc_header));
>> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>> +
>> +    ext_header = (const void *)(mc_header + 1) + data_size;
>> +    ext_sig = (const void *)(ext_header + 1);
>> +
>> +    /*
>> +     * Make sure there is enough space to hold an extended header and enough
>> +     * array elements.
>> +     */
>> +    if ( (end < (const void *)ext_sig) ||
>> +         (end < (const void *)(ext_sig + ext_header->count)) )
>> +        return MIS_UCODE;
> 
> With you now assuming that the blob has previously passed
> microcode_sanity_check(), this only needs to be
> 
>     if ( (end <= (const void *)ext_sig) )
>         return MIS_UCODE;
> 
> now afaict.
> 
> Reviewed-by: Jan Beulich <jbeulich@suse.com>
> preferably with this adjustment (assuming you agree).

FAOD: I'd be happy to make the adjustment while committing, but
I'd like to have your consent (or you proving me wrong). This
would, as it looks, allow everything up to patch 8 to go in.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading
  2019-09-12 15:32   ` Jan Beulich
@ 2019-09-13  7:01     ` Chao Gao
  2019-09-13  7:15       ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-13  7:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper,
	Jun Nakajima, xen-devel, Thomas Gleixner, Borislav Petkov,
	Roger Pau Monné

On Thu, Sep 12, 2019 at 05:32:22PM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> @@ -264,38 +336,158 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
>>      return err;
>>  }
>>  
>> -static long do_microcode_update(void *patch)
>> +static bool wait_for_state(unsigned int state)
>> +{
>> +    while ( loading_state != state )
>> +    {
>> +        if ( state != LOADING_EXIT && loading_state == LOADING_EXIT )
>> +            return false;
>
>This is at least somewhat confusing: There's no indication here
>that "loading_state" may change behind the function's back. So
>in general one could be (and I initially was) tempted to suggest
>dropping the apparently redundant left side of the &&. But that
>would end up wrong if the compiler translates the above to two
>separate reads of "loading_state". Therefore I'd like to suggest
>
>static bool wait_for_state(typeof(loading_state) state)
>{
>    typeof(loading_state) cur_state;
>
>    while ( (cur_state = ACCESS_ONCE(loading_state)) != state )
>    {
>        if ( cur_state == LOADING_EXIT )
>            return false;
>        cpu_relax();
>    }
>
>    return true;
>}
>
>or something substantially similar (if, e.g., you dislike the
>use of typeof() here).

The code snippet above is terrific. Will take it.

>
>> +static int secondary_thread_fn(void)
>> +{
>> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
>> +
>> +    if ( !wait_for_state(LOADING_CALLIN) )
>> +        return -EBUSY;
>> +
>> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
>> +
>> +    if ( !wait_for_state(LOADING_EXIT) )
>> +        return -EBUSY;
>
>This return looks to be unreachable, doesn't it?

Yes. I will use a variable to hold its return value and assert the
return value is always true.

Other comments are reasonable and I will follow your suggestion.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match()
  2019-09-13  6:50     ` Jan Beulich
@ 2019-09-13  7:02       ` Chao Gao
  0 siblings, 0 replies; 41+ messages in thread
From: Chao Gao @ 2019-09-13  7:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On Fri, Sep 13, 2019 at 08:50:59AM +0200, Jan Beulich wrote:
>On 12.09.2019 12:24, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> --- a/xen/arch/x86/microcode_intel.c
>>> +++ b/xen/arch/x86/microcode_intel.c
>>> @@ -134,21 +134,11 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>>>      return 0;
>>>  }
>>>  
>>> -static inline int microcode_update_match(
>>> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>>> -    int sig, int pf)
>>> +static int microcode_sanity_check(const void *mc)
>>>  {
>>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>>> -
>>> -    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
>>> -            (mc_header->rev > uci->cpu_sig.rev));
>>> -}
>>> -
>>> -static int microcode_sanity_check(void *mc)
>>> -{
>>> -    struct microcode_header_intel *mc_header = mc;
>>> -    struct extended_sigtable *ext_header = NULL;
>>> -    struct extended_signature *ext_sig;
>>> +    const struct microcode_header_intel *mc_header = mc;
>>> +    const struct extended_sigtable *ext_header = NULL;
>>> +    const struct extended_signature *ext_sig;
>>>      unsigned long total_size, data_size, ext_table_size;
>>>      unsigned int ext_sigcount = 0, i;
>>>      uint32_t sum, orig_sum;
>>> @@ -234,6 +224,42 @@ static int microcode_sanity_check(void *mc)
>>>      return 0;
>>>  }
>>>  
>>> +/* Check an update against the CPU signature and current update revision */
>>> +static enum microcode_match_result microcode_update_match(
>>> +    const struct microcode_header_intel *mc_header, unsigned int cpu)
>>> +{
>>> +    const struct extended_sigtable *ext_header;
>>> +    const struct extended_signature *ext_sig;
>>> +    unsigned int i;
>>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>>> +    unsigned int sig = uci->cpu_sig.sig;
>>> +    unsigned int pf = uci->cpu_sig.pf;
>>> +    unsigned int rev = uci->cpu_sig.rev;
>>> +    unsigned long data_size = get_datasize(mc_header);
>>> +    const void *end = (const void *)mc_header + get_totalsize(mc_header);
>>> +
>>> +    ASSERT(!microcode_sanity_check(mc_header));
>>> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>>> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
>>> +
>>> +    ext_header = (const void *)(mc_header + 1) + data_size;
>>> +    ext_sig = (const void *)(ext_header + 1);
>>> +
>>> +    /*
>>> +     * Make sure there is enough space to hold an extended header and enough
>>> +     * array elements.
>>> +     */
>>> +    if ( (end < (const void *)ext_sig) ||
>>> +         (end < (const void *)(ext_sig + ext_header->count)) )
>>> +        return MIS_UCODE;
>> 
>> With you now assuming that the blob has previously passed
>> microcode_sanity_check(), this only needs to be
>> 
>>     if ( (end <= (const void *)ext_sig) )
>>         return MIS_UCODE;
>> 
>> now afaict.
>> 
>> Reviewed-by: Jan Beulich <jbeulich@suse.com>
>> preferably with this adjustment (assuming you agree).
>
>FAOD: I'd be happy to make the adjustment while committing, but
>I'd like to have your consent (or you proving me wrong). This
>would, as it looks, allow everything up to patch 8 to go in.

Please go ahead. Thanks

Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading
  2019-09-13  7:01     ` Chao Gao
@ 2019-09-13  7:15       ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  7:15 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Ashok Raj, Wei Liu, Andrew Cooper,
	Jun Nakajima, xen-devel, Thomas Gleixner, Borislav Petkov,
	Roger Pau Monné

On 13.09.2019 09:01, Chao Gao wrote:
> On Thu, Sep 12, 2019 at 05:32:22PM +0200, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> +static int secondary_thread_fn(void)
>>> +{
>>> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
>>> +
>>> +    if ( !wait_for_state(LOADING_CALLIN) )
>>> +        return -EBUSY;
>>> +
>>> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
>>> +
>>> +    if ( !wait_for_state(LOADING_EXIT) )
>>> +        return -EBUSY;
>>
>> This return looks to be unreachable, doesn't it?
> 
> Yes. I will use a variable to hold its return value and assert the
> return value is always true.

Or simply add ASSERT_UNREACHABLE() to the if()'s body?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 00/16] improve late microcode loading
  2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
                   ` (15 preceding siblings ...)
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally Chao Gao
@ 2019-09-13  8:47 ` Jan Beulich
  2019-09-17  7:09   ` Chao Gao
  16 siblings, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  8:47 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper,
	Suravee Suthikulpanit, xen-devel, Boris Ostrovsky,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> This series includes below changes:
>  1. Patch 1-11: introduce a global microcode cache and some cleanup
>  2. Patch 12: synchronize late microcode loading
>  3. Patch 13: support parallel microcodes update on different cores
>  4. Patch 14: block #NMI handling during microcode loading
>  5. Patch 15: disable late ucode loading due to BDF90
>  6. Patch 16: call wbinvd() conditionally

I don't know why it didn't occur to me earlier, but what about
parked / offlined CPUs? They'll have their ucode updated when they
get brought back online, but until then their ucode will disagree
with that of the online CPUs. For truly offline CPUs this may be
fine, but parked ones should probably be updated, perhaps via the
same approach as used when C-state data becomes available (see
set_cx_pminfo())?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode Chao Gao
@ 2019-09-13  9:14   ` Jan Beulich
  2019-09-16  3:18     ` Chao Gao
  2019-09-13  9:18   ` Jan Beulich
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  9:14 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Stefano Stabellini, Ashok Raj, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, xen-devel, Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> When one core is loading ucode, handling NMI on sibling threads or
> on other cores in the system might be problematic. By rendezvousing
> all CPUs in NMI handler, it prevents NMI acceptance during ucode
> loading.
> 
> Basically, some work previously done in stop_machine context is
> moved to NMI handler. Primary threads call in and load ucode in
> NMI handler. Secondary threads wait for the completion of ucode
> loading on all CPU cores. An option is introduced to disable this
> behavior.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>



> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -2056,6 +2056,16 @@ microcode in the cpio name space must be:
>    - on Intel: kernel/x86/microcode/GenuineIntel.bin
>    - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
>  
> +### ucode_loading_in_nmi (x86)
> +> `= <boolean>`
> +
> +> Default: `true`
> +
> +When one CPU is loading ucode, handling NMIs on sibling threads or threads on
> +other cores might cause problems. By default, all CPUs rendezvous in NMI handler
> +and load ucode. This option provides a way to disable it in case of some CPUs
> +don't allow ucode loading in NMI handler.

We already have "ucode=", why don't you extend it to allow "ucode=nmi"
and "ucode=no-nmi"? (In any event, please no underscores in new
command line options - use hyphens if necessary.)

> @@ -232,6 +237,7 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>   */
>  static cpumask_t cpu_callin_map;
>  static atomic_t cpu_out, cpu_updated;
> +const struct microcode_patch *nmi_patch;

static

> @@ -354,6 +360,50 @@ static void set_state(unsigned int state)
>      smp_wmb();
>  }
>  
> +static int secondary_thread_work(void)
> +{
> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
> +
> +    return wait_for_state(LOADING_EXIT) ? 0 : -EBUSY;
> +}
> +
> +static int primary_thread_work(const struct microcode_patch *patch)

I think it would be nice if both functions carried "nmi" in their
names - how about {primary,secondary}_nmi_work()? Or wait - the
primary one gets used outside of NMI as well, so I'm fine with its
name. The secondary one, otoh, is NMI-specific and also its only
caller doesn't care about the return value, so I'd suggest making
it return void alongside adding some form of "nmi" to its name. Or,
perhaps even better, have secondary_thread_fn() call it, moving the
cpu_sig update here (and of course then there shouldn't be any
"nmi" added to its name).

> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
> +{
> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
> +    unsigned int controller = cpumask_first(&cpu_online_map);
> +
> +    /* System-generated NMI, will be ignored */
> +    if ( loading_state != LOADING_CALLIN )
> +        return 0;

I'm not happy at all to see NMIs being ignored. But by returning
zero, you do _not_ ignore it. Did you perhaps mean "will be ignored
here", in which case perhaps better "leave to main handler"? And
for the comment to extend to the other two conditions right below,
I think it would be better to combine them all into a single if().

Also, throughout the series, I think you want to consistently use
ACCESS_ONCE() for reads/writes from/to loading_state.

> +    if ( cpu == controller || (!opt_ucode_loading_in_nmi && cpu == primary) )
> +        return 0;

Why not

    if ( cpu == controller || !opt_ucode_loading_in_nmi )
        return 0;

? (And then, there being just a single use each in this function, I
don't think there's a need for the two local variables.)

> @@ -361,10 +411,7 @@ static int secondary_thread_fn(void)
>      if ( !wait_for_state(LOADING_CALLIN) )
>          return -EBUSY;
>  
> -    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
> -
> -    if ( !wait_for_state(LOADING_EXIT) )
> -        return -EBUSY;
> +    self_nmi();

Loosing the -EBUSY indication here isn't very nice. Perhaps this
should be conveyed via a per-CPU variable?

> @@ -379,15 +426,10 @@ static int primary_thread_fn(const struct microcode_patch *patch)
>      if ( !wait_for_state(LOADING_CALLIN) )
>          return -EBUSY;
>  
> -    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
> -
> -    if ( !wait_for_state(LOADING_ENTER) )
> -        return -EBUSY;
> -
> -    ret = microcode_ops->apply_microcode(patch);
> -    if ( !ret )
> -        atomic_inc(&cpu_updated);
> -    atomic_inc(&cpu_out);
> +    if ( opt_ucode_loading_in_nmi )
> +        self_nmi();

Same here.

> @@ -404,6 +447,9 @@ static int control_thread_fn(const struct microcode_patch *patch)
>       */
>      watchdog_disable();
>  
> +    nmi_patch = patch;
> +    saved_nmi_callback = set_nmi_callback(microcode_nmi_callback);

Shouldn't there be smb_wmb() between these two?

> @@ -458,6 +513,7 @@ static int control_thread_fn(const struct microcode_patch *patch)
>      /* Mark loading is done to unblock other threads */
>      set_state(LOADING_EXIT);
>  
> +    set_nmi_callback(saved_nmi_callback);

To be on the safe side, I think you also want to clear nmi_patch again.
Or maybe even better not clear it, but set it to a non-NULL value which,
when accessed, would trap (e.g. ZERO_BLOCK_PTR). This value should then
also be the variable's initializer.

> @@ -522,6 +578,13 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          goto free;
>      }
>  
> +    /*
> +     * CPUs except the first online CPU would send a fake (self) NMI to
> +     * rendezvous in NMI handler. But a fake NMI to nmi_cpu may trigger
> +     * unknown_nmi_error(). It ensures nmi_cpu won't receive a fake NMI.
> +     */
> +    ASSERT( !cpu_online(nmi_cpu) || nmi_cpu == cpumask_first(&cpu_online_map) );

Please drop the blanks immediately inside the parentheses.

As to the left side of the || - is this really needed? It surely would
be very wrong (but entirely unrelated to ucode loading) if the CPU to
receive platform NMIs was offline.

> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -126,6 +126,8 @@ boolean_param("ler", opt_ler);
>  /* LastExceptionFromIP on this hardware.  Zero if LER is not in use. */
>  unsigned int __read_mostly ler_msr;
>  
> +unsigned int __read_mostly nmi_cpu;

Since this variable (for now) is never written to it should gain a
comment saying why this is, and perhaps it would then also better be
const rather than __read_mostly.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode Chao Gao
  2019-09-13  9:14   ` Jan Beulich
@ 2019-09-13  9:18   ` Jan Beulich
  1 sibling, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  9:18 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Stefano Stabellini, Ashok Raj, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, xen-devel, Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> @@ -419,14 +465,23 @@ static int control_thread_fn(const struct microcode_patch *patch)
>          return ret;
>      }
>  
> -    /* Let primary threads load the given ucode update */
> -    set_state(LOADING_ENTER);
> -
> +    /* Control thread loads ucode first while others are in NMI handler. */
>      ret = microcode_ops->apply_microcode(patch);
>      if ( !ret )
>          atomic_inc(&cpu_updated);
>      atomic_inc(&cpu_out);
>  
> +    if ( ret == -EIO )
> +    {
> +        printk(XENLOG_ERR
> +               "Late loading aborted: CPU%u failed to update ucode\n", cpu);
> +        set_state(LOADING_EXIT);
> +        return ret;
> +    }
> +
> +    /* Let primary threads load the given ucode update */
> +    set_state(LOADING_ENTER);

One more question - why this deferral of setting state to ENTER? If
it's needed, it should be explained in the description.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90 Chao Gao
@ 2019-09-13  9:22   ` Jan Beulich
  2019-09-17  9:01     ` Chao Gao
  2019-09-13 12:23   ` Andrew Cooper
  1 sibling, 1 reply; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  9:22 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> @@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
>                                                               : OLD_UCODE;
>  }
>  
> +static bool is_blacklisted(void)
> +{
> +    struct cpuinfo_x86 *c = &current_cpu_data;
> +    uint64_t llc_size = c->x86_cache_size * 1024ULL;
> +    struct cpu_signature *sig = &this_cpu(cpu_sig);
> +
> +    do_div(llc_size, c->x86_max_cores);
> +
> +    /*
> +     * Late loading on model 79 with microcode revision less than 0x0b000021
> +     * and LLC size per core bigger than 2.5MB may result in a system hang.
> +     * This behavior is documented in item BDF90, #334165 (Intel Xeon
> +     * Processor E7-8800/4800 v4 Product Family).
> +     */
> +    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
> +         llc_size > 2621440 && sig->rev < 0x0b000021 )
> +        return true;
> +
> +    return false;
> +}

Isn't this misbehavior worked around by the wbinvd() you add in the next
patch?

> --- a/xen/include/asm-x86/microcode.h
> +++ b/xen/include/asm-x86/microcode.h
> @@ -30,6 +30,7 @@ struct microcode_ops {
>      bool (*match_cpu)(const struct microcode_patch *patch);
>      enum microcode_match_result (*compare_patch)(
>          const struct microcode_patch *new, const struct microcode_patch *old);
> +    bool (*is_blacklisted)(void);

Why a hook rather than a boolean flag, which could be set by
microcode_update_one() (as invoked during AP bringup)?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally Chao Gao
@ 2019-09-13  9:32   ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-13  9:32 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 12.09.2019 09:22, Chao Gao wrote:
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -305,6 +305,31 @@ static bool is_blacklisted(void)
>      return false;
>  }
>  
> +static void microcode_quirk(void)
> +{
> +    struct cpuinfo_x86 *c;

const

> +    uint64_t llc_size;
> +
> +    /*
> +     * Don't refer to current_cpu_data, which isn't fully initialized
> +     * before this stage.
> +     */
> +    if ( system_state < SYS_STATE_smp_boot )
> +        return;

If the workaround is needed, why would it not be needed for the BSP?

> +    c = &current_cpu_data;
> +    llc_size = c->x86_cache_size * 1024ULL;
> +    do_div(llc_size, c->x86_max_cores);

Instead of the local variable, ...

> +
> +    /*
> +     * To mitigate some issues on this specific Broadwell CPU, writeback and
> +     * invalidate cache regardless of ucode revision.
> +     */
> +    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
> +         llc_size > 2621440 )

... why don't you compare c->x86_cache_size / c->x86_max_cores
against 2560 here? Is there any risk of truncating relevant low
bits?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90
  2019-09-12  7:22 ` [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90 Chao Gao
  2019-09-13  9:22   ` Jan Beulich
@ 2019-09-13 12:23   ` Andrew Cooper
  1 sibling, 0 replies; 41+ messages in thread
From: Andrew Cooper @ 2019-09-13 12:23 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: Sergey Dyasli, Jan Beulich, Ashok Raj, Wei Liu, Roger Pau Monné

On 12/09/2019 08:22, Chao Gao wrote:
> It ports the implementation of is_blacklisted() in linux kernel
> to Xen.
>
> Late loading may cause system hang if CPUs are affected by BDF90.
> Check against BDF90 before performing a late loading.
>
> Signed-off-by: Chao Gao <chao.gao@intel.com>

There is an Intel-blessed workaround for this issue, which is to perform
parallel loading and issue a WBINVD inside the rendezvous, which is
being used in production by a number of vendors.

We should prohibit late loading cases which we know to be broken, but
parallel loading should be permitted.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode
  2019-09-13  9:14   ` Jan Beulich
@ 2019-09-16  3:18     ` Chao Gao
  2019-09-16  8:22       ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-16  3:18 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Stefano Stabellini, Ashok Raj, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, xen-devel, Roger Pau Monné

On Fri, Sep 13, 2019 at 11:14:59AM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> When one core is loading ucode, handling NMI on sibling threads or
>> on other cores in the system might be problematic. By rendezvousing
>> all CPUs in NMI handler, it prevents NMI acceptance during ucode
>> loading.
>> 
>> Basically, some work previously done in stop_machine context is
>> moved to NMI handler. Primary threads call in and load ucode in
>> NMI handler. Secondary threads wait for the completion of ucode
>> loading on all CPU cores. An option is introduced to disable this
>> behavior.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Signed-off-by: Sergey Dyasli <sergey.dyasli@citrix.com>
>
>
>
>> --- a/docs/misc/xen-command-line.pandoc
>> +++ b/docs/misc/xen-command-line.pandoc
>> @@ -2056,6 +2056,16 @@ microcode in the cpio name space must be:
>>    - on Intel: kernel/x86/microcode/GenuineIntel.bin
>>    - on AMD  : kernel/x86/microcode/AuthenticAMD.bin
>>  
>> +### ucode_loading_in_nmi (x86)
>> +> `= <boolean>`
>> +
>> +> Default: `true`
>> +
>> +When one CPU is loading ucode, handling NMIs on sibling threads or threads on
>> +other cores might cause problems. By default, all CPUs rendezvous in NMI handler
>> +and load ucode. This option provides a way to disable it in case of some CPUs
>> +don't allow ucode loading in NMI handler.
>
>We already have "ucode=", why don't you extend it to allow "ucode=nmi"
>and "ucode=no-nmi"? (In any event, please no underscores in new
>command line options - use hyphens if necessary.)

Ok. Will extend the "ucode" parameter.

>
>> @@ -232,6 +237,7 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>>   */
>>  static cpumask_t cpu_callin_map;
>>  static atomic_t cpu_out, cpu_updated;
>> +const struct microcode_patch *nmi_patch;
>
>static
>
>> @@ -354,6 +360,50 @@ static void set_state(unsigned int state)
>>      smp_wmb();
>>  }
>>  
>> +static int secondary_thread_work(void)
>> +{
>> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
>> +
>> +    return wait_for_state(LOADING_EXIT) ? 0 : -EBUSY;
>> +}
>> +
>> +static int primary_thread_work(const struct microcode_patch *patch)
>
>I think it would be nice if both functions carried "nmi" in their
>names - how about {primary,secondary}_nmi_work()? Or wait - the
>primary one gets used outside of NMI as well, so I'm fine with its
>name.
>The secondary one, otoh, is NMI-specific and also its only
>caller doesn't care about the return value, so I'd suggest making
>it return void alongside adding some form of "nmi" to its name. Or,

Will do.

>perhaps even better, have secondary_thread_fn() call it, moving the
>cpu_sig update here (and of course then there shouldn't be any
>"nmi" added to its name).

Even with "ucode=no-nmi", secondary threads have to do busy-loop in
NMI handling util primary threads completing the update. Otherwise,
it may access MSRs (like SPEC_CTRL) which is considered unsafe.

>
>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>> +{
>> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
>> +    unsigned int controller = cpumask_first(&cpu_online_map);
>> +
>> +    /* System-generated NMI, will be ignored */
>> +    if ( loading_state != LOADING_CALLIN )
>> +        return 0;
>
>I'm not happy at all to see NMIs being ignored. But by returning
>zero, you do _not_ ignore it. Did you perhaps mean "will be ignored
>here", in which case perhaps better "leave to main handler"? And
>for the comment to extend to the other two conditions right below,
>I think it would be better to combine them all into a single if().
>
>Also, throughout the series, I think you want to consistently use
>ACCESS_ONCE() for reads/writes from/to loading_state.
>
>> +    if ( cpu == controller || (!opt_ucode_loading_in_nmi && cpu == primary) )
>> +        return 0;
>
>Why not

As I said above, secondary threads are expected to stay in NMI handler
regardless the setting of opt_ucode_loading_in_nmi.

>> --- a/xen/arch/x86/traps.c
>> +++ b/xen/arch/x86/traps.c
>> @@ -126,6 +126,8 @@ boolean_param("ler", opt_ler);
>>  /* LastExceptionFromIP on this hardware.  Zero if LER is not in use. */
>>  unsigned int __read_mostly ler_msr;
>>  
>> +unsigned int __read_mostly nmi_cpu;
>
>Since this variable (for now) is never written to it should gain a
>comment saying why this is, and perhaps it would then also better be
>const rather than __read_mostly.

How about use the macro below:
#define NMI_CPU 0

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode
  2019-09-16  3:18     ` Chao Gao
@ 2019-09-16  8:22       ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-16  8:22 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Stefano Stabellini, Ashok Raj, Wei Liu,
	Konrad Rzeszutek Wilk, George Dunlap, Andrew Cooper, Ian Jackson,
	Tim Deegan, Julien Grall, xen-devel, Roger Pau Monné

On 16.09.2019 05:18, Chao Gao wrote:
> On Fri, Sep 13, 2019 at 11:14:59AM +0200, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> @@ -354,6 +360,50 @@ static void set_state(unsigned int state)
>>>      smp_wmb();
>>>  }
>>>  
>>> +static int secondary_thread_work(void)
>>> +{
>>> +    cpumask_set_cpu(smp_processor_id(), &cpu_callin_map);
>>> +
>>> +    return wait_for_state(LOADING_EXIT) ? 0 : -EBUSY;
>>> +}
>>> +
>>> +static int primary_thread_work(const struct microcode_patch *patch)
>>
>> I think it would be nice if both functions carried "nmi" in their
>> names - how about {primary,secondary}_nmi_work()? Or wait - the
>> primary one gets used outside of NMI as well, so I'm fine with its
>> name.
>> The secondary one, otoh, is NMI-specific and also its only
>> caller doesn't care about the return value, so I'd suggest making
>> it return void alongside adding some form of "nmi" to its name. Or,
> 
> Will do.
> 
>> perhaps even better, have secondary_thread_fn() call it, moving the
>> cpu_sig update here (and of course then there shouldn't be any
>> "nmi" added to its name).
> 
> Even with "ucode=no-nmi", secondary threads have to do busy-loop in
> NMI handling util primary threads completing the update. Otherwise,
> it may access MSRs (like SPEC_CTRL) which is considered unsafe.

Of course. Note that I said "call it"; I did not suggest to replace
secondary_thread_fn().

>>> +static int microcode_nmi_callback(const struct cpu_user_regs *regs, int cpu)
>>> +{
>>> +    unsigned int primary = cpumask_first(this_cpu(cpu_sibling_mask));
>>> +    unsigned int controller = cpumask_first(&cpu_online_map);
>>> +
>>> +    /* System-generated NMI, will be ignored */
>>> +    if ( loading_state != LOADING_CALLIN )
>>> +        return 0;
>>
>> I'm not happy at all to see NMIs being ignored. But by returning
>> zero, you do _not_ ignore it. Did you perhaps mean "will be ignored
>> here", in which case perhaps better "leave to main handler"? And
>> for the comment to extend to the other two conditions right below,
>> I think it would be better to combine them all into a single if().
>>
>> Also, throughout the series, I think you want to consistently use
>> ACCESS_ONCE() for reads/writes from/to loading_state.
>>
>>> +    if ( cpu == controller || (!opt_ucode_loading_in_nmi && cpu == primary) )
>>> +        return 0;
>>
>> Why not
> 
> As I said above, secondary threads are expected to stay in NMI handler
> regardless the setting of opt_ucode_loading_in_nmi.

Oh, here I see how your remark above matters. Please add code
comments then to make this clear to the reader.

>>> --- a/xen/arch/x86/traps.c
>>> +++ b/xen/arch/x86/traps.c
>>> @@ -126,6 +126,8 @@ boolean_param("ler", opt_ler);
>>>  /* LastExceptionFromIP on this hardware.  Zero if LER is not in use. */
>>>  unsigned int __read_mostly ler_msr;
>>>  
>>> +unsigned int __read_mostly nmi_cpu;
>>
>> Since this variable (for now) is never written to it should gain a
>> comment saying why this is, and perhaps it would then also better be
>> const rather than __read_mostly.
> 
> How about use the macro below:
> #define NMI_CPU 0

This is another option, yes. If there's any intention to ever allow
offlining CPU 0, then having the variable in place would seem better
to me. But I'll leave it to you at this point.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 00/16] improve late microcode loading
  2019-09-13  8:47 ` [Xen-devel] [PATCH v10 00/16] improve late microcode loading Jan Beulich
@ 2019-09-17  7:09   ` Chao Gao
  2019-09-17  7:11     ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-17  7:09 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper,
	Suravee Suthikulpanit, xen-devel, Boris Ostrovsky,
	Roger Pau Monné

On Fri, Sep 13, 2019 at 10:47:36AM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> This series includes below changes:
>>  1. Patch 1-11: introduce a global microcode cache and some cleanup
>>  2. Patch 12: synchronize late microcode loading
>>  3. Patch 13: support parallel microcodes update on different cores
>>  4. Patch 14: block #NMI handling during microcode loading
>>  5. Patch 15: disable late ucode loading due to BDF90
>>  6. Patch 16: call wbinvd() conditionally
>
>I don't know why it didn't occur to me earlier, but what about
>parked / offlined CPUs? They'll have their ucode updated when they
>get brought back online, but until then their ucode will disagree
>with that of the online CPUs. For truly offline CPUs this may be
>fine, but parked ones should probably be updated, perhaps via the
>same approach as used when C-state data becomes available (see
>set_cx_pminfo())?

Yes. It provides a means to wake up the parked CPU and a chance to run
some code (like loading ucode). But parked CPUs are cleared from
sibling info and cpu_online_map (see __cpu_disable()). If parallel
ucode loading is expected on parked CPUs, we should be able to
determine the primary threads and the number of cores no matter it is
online or parked. To this end, a new sibling map should be maintained
for each CPU and this map isn't changed when a CPU gets parked.

In Linux kernel, the approach is quite simple: late loading is
prohibited if any CPU is parked; admin should online all parked CPU
before loading ucode.

Do you have any preference?

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 00/16] improve late microcode loading
  2019-09-17  7:09   ` Chao Gao
@ 2019-09-17  7:11     ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-17  7:11 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper,
	Suravee Suthikulpanit, xen-devel, Boris Ostrovsky,
	Roger Pau Monné

On 17.09.2019 09:09, Chao Gao wrote:
> On Fri, Sep 13, 2019 at 10:47:36AM +0200, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> This series includes below changes:
>>>  1. Patch 1-11: introduce a global microcode cache and some cleanup
>>>  2. Patch 12: synchronize late microcode loading
>>>  3. Patch 13: support parallel microcodes update on different cores
>>>  4. Patch 14: block #NMI handling during microcode loading
>>>  5. Patch 15: disable late ucode loading due to BDF90
>>>  6. Patch 16: call wbinvd() conditionally
>>
>> I don't know why it didn't occur to me earlier, but what about
>> parked / offlined CPUs? They'll have their ucode updated when they
>> get brought back online, but until then their ucode will disagree
>> with that of the online CPUs. For truly offline CPUs this may be
>> fine, but parked ones should probably be updated, perhaps via the
>> same approach as used when C-state data becomes available (see
>> set_cx_pminfo())?
> 
> Yes. It provides a means to wake up the parked CPU and a chance to run
> some code (like loading ucode). But parked CPUs are cleared from
> sibling info and cpu_online_map (see __cpu_disable()). If parallel
> ucode loading is expected on parked CPUs, we should be able to
> determine the primary threads and the number of cores no matter it is
> online or parked. To this end, a new sibling map should be maintained
> for each CPU and this map isn't changed when a CPU gets parked.

Would this really be necessary? If any thread in a core is not
parked, bringing up the parked threads is unnecessary. If all
threads of a core are parked, simply nudge the thread with ID
zero?

> In Linux kernel, the approach is quite simple: late loading is
> prohibited if any CPU is parked; admin should online all parked CPU
> before loading ucode.

Well, this is certainly an option, but (as per above) I think
this is too rigid: Refusing the operation would be necessary
only if there's a core with all of its threads parked. (I'd
in particular like late loading to work in SMT-disabled mode.)

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90
  2019-09-13  9:22   ` Jan Beulich
@ 2019-09-17  9:01     ` Chao Gao
  2019-09-17 10:49       ` Jan Beulich
  0 siblings, 1 reply; 41+ messages in thread
From: Chao Gao @ 2019-09-17  9:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On Fri, Sep 13, 2019 at 11:22:59AM +0200, Jan Beulich wrote:
>On 12.09.2019 09:22, Chao Gao wrote:
>> @@ -283,6 +284,27 @@ static enum microcode_match_result compare_patch(
>>                                                               : OLD_UCODE;
>>  }
>>  
>> +static bool is_blacklisted(void)
>> +{
>> +    struct cpuinfo_x86 *c = &current_cpu_data;
>> +    uint64_t llc_size = c->x86_cache_size * 1024ULL;
>> +    struct cpu_signature *sig = &this_cpu(cpu_sig);
>> +
>> +    do_div(llc_size, c->x86_max_cores);
>> +
>> +    /*
>> +     * Late loading on model 79 with microcode revision less than 0x0b000021
>> +     * and LLC size per core bigger than 2.5MB may result in a system hang.
>> +     * This behavior is documented in item BDF90, #334165 (Intel Xeon
>> +     * Processor E7-8800/4800 v4 Product Family).
>> +     */
>> +    if ( c->x86 == 6 && c->x86_model == 0x4F && c->x86_mask == 0x1 &&
>> +         llc_size > 2621440 && sig->rev < 0x0b000021 )
>> +        return true;
>> +
>> +    return false;
>> +}
>
>Isn't this misbehavior worked around by the wbinvd() you add in the next
>patch?

Hi Jan and Andrew,

Perhaps I misunderstood what I was told. I am confirming with Ashok
whether this patch is necessary.

>
>> --- a/xen/include/asm-x86/microcode.h
>> +++ b/xen/include/asm-x86/microcode.h
>> @@ -30,6 +30,7 @@ struct microcode_ops {
>>      bool (*match_cpu)(const struct microcode_patch *patch);
>>      enum microcode_match_result (*compare_patch)(
>>          const struct microcode_patch *new, const struct microcode_patch *old);
>> +    bool (*is_blacklisted)(void);
>
>Why a hook rather than a boolean flag, which could be set by
>microcode_update_one() (as invoked during AP bringup)?

How about set the boolean flag in Intel_errata_workarounds?

One limitation of setting the flag in microcode_update_one() is:
BSP also calls microcode_update_one(). But calculating LLC size per
core on BSP would meet the same issue as the following patch
(i.e. patch 16/16): BSP's current_cpu_data isn't initialized
properly. We might need to revert commit f97838bbd980a01 in
some way and reenumerate features after ucode loading is done.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90
  2019-09-17  9:01     ` Chao Gao
@ 2019-09-17 10:49       ` Jan Beulich
  0 siblings, 0 replies; 41+ messages in thread
From: Jan Beulich @ 2019-09-17 10:49 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Ashok Raj, Wei Liu, Andrew Cooper, xen-devel,
	Roger Pau Monné

On 17.09.2019 11:01, Chao Gao wrote:
> On Fri, Sep 13, 2019 at 11:22:59AM +0200, Jan Beulich wrote:
>> On 12.09.2019 09:22, Chao Gao wrote:
>>> --- a/xen/include/asm-x86/microcode.h
>>> +++ b/xen/include/asm-x86/microcode.h
>>> @@ -30,6 +30,7 @@ struct microcode_ops {
>>>      bool (*match_cpu)(const struct microcode_patch *patch);
>>>      enum microcode_match_result (*compare_patch)(
>>>          const struct microcode_patch *new, const struct microcode_patch *old);
>>> +    bool (*is_blacklisted)(void);
>>
>> Why a hook rather than a boolean flag, which could be set by
>> microcode_update_one() (as invoked during AP bringup)?
> 
> How about set the boolean flag in Intel_errata_workarounds?

Wherever it ends up working best. My suggestion was just a wild
guess.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 41+ messages in thread

end of thread, back to index

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-12  7:22 [Xen-devel] [PATCH v10 00/16] improve late microcode loading Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 01/16] microcode/intel: extend microcode_update_match() Chao Gao
2019-09-12 10:24   ` Jan Beulich
2019-09-13  6:50     ` Jan Beulich
2019-09-13  7:02       ` Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 02/16] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 03/16] microcode: introduce a global cache of ucode patch Chao Gao
2019-09-12 10:29   ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 04/16] microcode: clean up microcode_resume_cpu Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 05/16] microcode: remove struct ucode_cpu_info Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 06/16] microcode: remove pointless 'cpu' parameter Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 07/16] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
2019-09-12 12:34   ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 08/16] microcode: pass a patch pointer to apply_microcode() Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 09/16] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-09-12 14:07   ` Jan Beulich
2019-09-13  6:47     ` Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 10/16] microcode: unify ucode loading during system bootup and resuming Chao Gao
2019-09-12 14:59   ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 11/16] microcode: reduce memory allocation and copy when creating a patch Chao Gao
2019-09-12 15:04   ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 12/16] x86/microcode: Synchronize late microcode loading Chao Gao
2019-09-12 15:32   ` Jan Beulich
2019-09-13  7:01     ` Chao Gao
2019-09-13  7:15       ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 13/16] microcode: remove microcode_update_lock Chao Gao
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 14/16] microcode: rendezvous CPUs in NMI handler and load ucode Chao Gao
2019-09-13  9:14   ` Jan Beulich
2019-09-16  3:18     ` Chao Gao
2019-09-16  8:22       ` Jan Beulich
2019-09-13  9:18   ` Jan Beulich
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 15/16] microcode: disable late loading if CPUs are affected by BDF90 Chao Gao
2019-09-13  9:22   ` Jan Beulich
2019-09-17  9:01     ` Chao Gao
2019-09-17 10:49       ` Jan Beulich
2019-09-13 12:23   ` Andrew Cooper
2019-09-12  7:22 ` [Xen-devel] [PATCH v10 16/16] microcode/intel: writeback and invalidate cache conditionally Chao Gao
2019-09-13  9:32   ` Jan Beulich
2019-09-13  8:47 ` [Xen-devel] [PATCH v10 00/16] improve late microcode loading Jan Beulich
2019-09-17  7:09   ` Chao Gao
2019-09-17  7:11     ` Jan Beulich

Xen-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/xen-devel/0 xen-devel/git/0.git
	git clone --mirror https://lore.kernel.org/xen-devel/1 xen-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 xen-devel xen-devel/ https://lore.kernel.org/xen-devel \
		xen-devel@lists.xenproject.org xen-devel@archiver.kernel.org
	public-inbox-index xen-devel


Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.xenproject.lists.xen-devel


AGPL code for this site: git clone https://public-inbox.org/ public-inbox