All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v5 0/8] improve late microcode loading
@ 2019-01-28  7:06 Chao Gao
  2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
                   ` (8 more replies)
  0 siblings, 9 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, Boris Ostrovsky,
	Chao Gao, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

Changes in this version:
 - support parallel microcode updates for all cores (see patch 8)
 - Address Roger's comments on the last version.

The intention of this series is to make the late microcode loading
more reliable by rendezvousing all cpus in stop_machine context.
This idea comes from Ashok. I am porting his linux patch to Xen
(see patch 7 for more details).

This series makes three changes:
 1. Patch 1-6: introduce a global microcode cache
 2. Patch 7: synchronize late microcode loading
 3. Patch 8: support parallel microcodes update on different cores

Currently, late microcode loading does a lot of things including
parsing microcode blob, checking the signature/revision and performing
update. Putting all of them into stop_machine context is a bad idea
because of complexity (One issue I observed is memory allocation
triggered one assertion in stop_machine context). In order to simplify
the load process, I move parsing microcode out of the load process.
The microcode blob is parsed and a global microcode cache is built on
a single CPU before rendezvousing all cpus to update microcode. Other
CPUs just get and load a suitable microcode from the global cache.
With this global cache, it is safe to put simplified load process to
stop_machine context.

Regarding changes to AMD side, I didn't do any test for them due to
lack of hardware. Could you help to test this series on an AMD machine?
At least, two basic tests are needed:
* do a microcode update after system bootup
* don't bring all pCPUs up at bootup by specifying maxcpus option in xen
  command line and then do a microcode update and online all offlined
  CPUs via 'xen-hptool'.

Chao Gao (8):
  microcode/intel: remove redundent check against ucode size
  microcode/intel: extend microcode_update_match()
  microcode: introduce the global microcode cache
  microcode: delete 'mc' field from struct ucode_cpu_info
  microcode: split out apply_microcode() from cpu_request_microcode()
  microcode: delete microcode pointer and size from microcode_info
  x86/microcode: Synchronize late microcode loading
  microcode: update microcode on cores in parallel

 xen/arch/x86/microcode.c        | 336 ++++++++++++++++++++++++++--------------
 xen/arch/x86/microcode_amd.c    | 202 ++++++++++++------------
 xen/arch/x86/microcode_intel.c  | 139 +++++++++--------
 xen/include/asm-x86/microcode.h |  25 ++-
 4 files changed, 416 insertions(+), 286 deletions(-)

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-28 16:40   ` Roger Pau Monné
  2019-01-29 10:26   ` Jan Beulich
  2019-01-28  7:06 ` [PATCH v5 2/8] microcode/intel: extend microcode_update_match() Chao Gao
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

This check has been done in microcode_sanity_check(). Needn't do it
again in get_matching_microcode().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode_intel.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 9657575..4f69f4a 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -246,9 +246,6 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
                                 mc_header->sig, mc_header->pf) )
         goto find;
 
-    if ( total_size <= (get_datasize(mc_header) + MC_HEADER_SIZE) )
-        return 0;
-
     ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
     ext_sigcount = ext_header->count;
     ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 2/8] microcode/intel: extend microcode_update_match()
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
  2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-28 16:55   ` Roger Pau Monné
  2019-01-29 10:41   ` Jan Beulich
  2019-01-28  7:06 ` [PATCH v5 3/8] microcode: introduce the global microcode cache Chao Gao
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

to a more generic function. Then, this function can compare two given
microcodes' signature/revision as well. Comparing two microcodes is
used to update the global microcode cache (introduced by the later
patches in this series) when a new microcode is given.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v5:
 - constify the extended_signature
 - use named enum type for the return value of microcode_update_match
---
 xen/arch/x86/microcode_intel.c  | 43 ++++++++++++++++++-----------------------
 xen/include/asm-x86/microcode.h |  6 ++++++
 2 files changed, 25 insertions(+), 24 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 4f69f4a..1ed573a 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -127,14 +127,24 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
     return 0;
 }
 
-static inline int microcode_update_match(
-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
-    int sig, int pf)
+static enum microcode_match_result microcode_update_match(
+    const void *mc, unsigned int sig, unsigned int pf, unsigned int rev)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    const struct microcode_header_intel *mc_header = mc;
+    const struct extended_sigtable *ext_header;
+    const struct extended_signature *ext_sig;
+    unsigned int i;
+
+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
 
-    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
-            (mc_header->rev > uci->cpu_sig.rev));
+    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
+    ext_sig = (const void *)ext_header + EXT_HEADER_SIZE;
+    for ( i = 0; i < ext_header->count; i++ )
+        if ( sigmatch(sig, ext_sig[i].sig, pf, ext_sig[i].pf) )
+            return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
 }
 
 static int microcode_sanity_check(void *mc)
@@ -236,28 +246,13 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
-    const struct extended_sigtable *ext_header;
     unsigned long total_size = get_totalsize(mc_header);
-    int ext_sigcount, i;
-    struct extended_signature *ext_sig;
     void *new_mc;
 
-    if ( microcode_update_match(cpu, mc_header,
-                                mc_header->sig, mc_header->pf) )
-        goto find;
+    if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
+                                uci->cpu_sig.rev) != NEW_UCODE )
+        return 0;
 
-    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
-    ext_sigcount = ext_header->count;
-    ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
-    for ( i = 0; i < ext_sigcount; i++ )
-    {
-        if ( microcode_update_match(cpu, mc_header,
-                                    ext_sig->sig, ext_sig->pf) )
-            goto find;
-        ext_sig++;
-    }
-    return 0;
- find:
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
              cpu, mc_header->rev, uci->cpu_sig.rev);
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 23ea954..73ebe9a 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -3,6 +3,12 @@
 
 #include <xen/percpu.h>
 
+enum microcode_match_result {
+    OLD_UCODE, /* signature matched, but revision id isn't newer */
+    NEW_UCODE, /* signature matched, but revision id is newer */
+    MIS_UCODE, /* signature mismatched */
+};
+
 struct cpu_signature;
 struct ucode_cpu_info;
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
  2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
  2019-01-28  7:06 ` [PATCH v5 2/8] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-28 17:39   ` Roger Pau Monné
  2019-02-08 11:41   ` Jan Beulich
  2019-01-28  7:06 ` [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info Chao Gao
                   ` (5 subsequent siblings)
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

to replace the current per-cpu cache 'uci->mc'.

Compared to the current per-cpu cache, the benefits of the global
microcode cache are:
1. It reduces the work that need to be done on each CPU. Parsing ucode
file can be done once on one CPU. Other CPUs needn't parse ucode file.
Instead, they can find out and load a patch with newer revision from
the global cache.
2. It reduces the memory consumption on a system with many CPU cores.

Two functions, save_patch() and find_patch() are introduced. The
former adds one given patch to the global cache. The latter gets
a newer and matched ucode patch from the global cache.

Note that I deliberately avoid touching 'uci->mc' as I am going to
remove it completely in the next patch.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v5:
 - reword the commit description
 - find_patch() and save_patch() are abstracted into common functions
   with some hooks for AMD and Intel
---
 xen/arch/x86/microcode.c        | 54 +++++++++++++++++++++++
 xen/arch/x86/microcode_amd.c    | 94 ++++++++++++++++++++++++++++++++++++++---
 xen/arch/x86/microcode_intel.c  | 71 ++++++++++++++++++++++++++++---
 xen/include/asm-x86/microcode.h | 13 ++++++
 4 files changed, 219 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 4163f50..7d5b769 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
  */
 static bool_t __initdata ucode_scan;
 
+static LIST_HEAD(microcode_cache);
+
 void __init microcode_set_module(unsigned int idx)
 {
     ucode_mod_idx = idx;
@@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
     spin_unlock(&microcode_mutex);
 }
 
+/* Save a ucode patch to the global cache list */
+bool save_patch(struct microcode_patch *new_patch)
+{
+    struct microcode_patch *microcode_patch;
+
+    list_for_each_entry(microcode_patch, &microcode_cache, list)
+    {
+        enum microcode_match_result result =
+            microcode_ops->replace_patch(new_patch, microcode_patch);
+
+        switch ( result )
+        {
+        case OLD_UCODE:
+            microcode_ops->free_patch(new_patch);
+            return false;
+
+        case NEW_UCODE:
+            microcode_ops->free_patch(microcode_patch);
+            return true;
+
+        case MIS_UCODE:
+            continue;
+
+        default:
+            ASSERT_UNREACHABLE();
+            return 0;
+        }
+    }
+    list_add_tail(&new_patch->list, &microcode_cache);
+    return true;
+}
+
+/* Find a ucode patch who has newer revision than the one in use */
+struct microcode_patch *find_patch(unsigned int cpu)
+{
+    struct microcode_patch *microcode_patch;
+    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+
+    if ( microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig) )
+    {
+        __microcode_fini_cpu(cpu);
+        return NULL;
+    }
+
+    list_for_each_entry(microcode_patch, &microcode_cache, list)
+    {
+        if ( microcode_ops->match_cpu(microcode_patch, cpu) )
+            return microcode_patch;
+    }
+    return NULL;
+}
+
 int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index fba44cc..301acb6 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -190,24 +190,90 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
     return 1;
 }
 
+static bool match_cpu(const struct microcode_patch *patch, unsigned int cpu)
+{
+    return microcode_fits(patch->data, cpu);
+}
+
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_amd *mc_amd)
+{
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+    struct microcode_amd *cache = xmalloc(struct microcode_amd);
+    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
+    struct equiv_cpu_entry *equiv_cpu_table =
+                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
+
+    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
+    {
+        xfree(microcode_patch);
+        xfree(cache);
+        xfree(mpb);
+        xfree(equiv_cpu_table);
+        printk(XENLOG_ERR "microcode: Can not allocate memory\n");
+        return ERR_PTR(-ENOMEM);
+    }
+
+    cache->equiv_cpu_table = equiv_cpu_table;
+    cache->mpb = mpb;
+    memcpy(cache->equiv_cpu_table, mc_amd->equiv_cpu_table,
+           mc_amd->equiv_cpu_table_size);
+    memcpy(cache->mpb, mc_amd->mpb, mc_amd->mpb_size);
+    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
+    cache->mpb_size = mc_amd->mpb_size;
+    microcode_patch->data = cache;
+
+    return microcode_patch;
+}
+
+static void free_patch(struct microcode_patch *microcode_patch)
+{
+    struct microcode_amd *mc_amd = microcode_patch->data;
+
+    xfree(mc_amd->equiv_cpu_table);
+    xfree(mc_amd->mpb);
+    xfree(mc_amd);
+    xfree(microcode_patch);
+}
+
+static enum microcode_match_result replace_patch(struct microcode_patch *new,
+                                                 struct microcode_patch *old)
+{
+    struct microcode_amd *new_mc = new->data;
+    struct microcode_header_amd *new_header = new_mc->mpb;
+    struct microcode_amd *old_mc = old->data;
+    struct microcode_header_amd *old_header = old_mc->mpb;
+
+    if ( new_header->processor_rev_id == old_header->processor_rev_id )
+    {
+        if ( new_header->patch_id <= old_header->patch_id )
+            return OLD_UCODE;
+
+        list_replace(&old->list, &new->list);
+        return NEW_UCODE;
+    }
+
+    return MIS_UCODE;
+}
+
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
     struct microcode_header_amd *hdr;
+    struct microcode_patch *patch;
     int hw_err;
 
     /* We should bind the task to the CPU */
     BUG_ON(raw_smp_processor_id() != cpu);
 
-    if ( mc_amd == NULL )
+    patch = find_patch(cpu);
+    if ( patch == NULL )
         return -EINVAL;
 
-    hdr = mc_amd->mpb;
-    if ( hdr == NULL )
-        return -EINVAL;
+    hdr = patch->data;
+    BUG_ON(!hdr);
 
     spin_lock_irqsave(&microcode_update_lock, flags);
 
@@ -491,7 +557,20 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) )
+        struct microcode_patch *microcode_patch = alloc_microcode_patch(mc_amd);
+
+        if ( IS_ERR(microcode_patch) )
+        {
+            error = PTR_ERR(microcode_patch);
+            break;
+        }
+
+        /*
+         * In order to support a system with mixed stepping CPUs, save
+         * this ucode patch before checking whether it matches with
+         * current CPU.
+         */
+        if ( save_patch(microcode_patch) && microcode_fits(mc_amd, cpu) )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -633,6 +712,9 @@ static const struct microcode_ops microcode_amd_ops = {
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
     .start_update                     = start_update,
+    .replace_patch                    = replace_patch,
+    .free_patch                       = free_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_amd(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 1ed573a..fc35c8d 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -147,6 +147,15 @@ static enum microcode_match_result microcode_update_match(
     return MIS_UCODE;
 }
 
+static bool match_cpu(const struct microcode_patch *patch, unsigned int cpu)
+{
+    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    int ret = microcode_update_match(patch->data, uci->cpu_sig.sig,
+                                     uci->cpu_sig.pf, uci->cpu_sig.rev);
+
+    return ret == NEW_UCODE;
+}
+
 static int microcode_sanity_check(void *mc)
 {
     struct microcode_header_intel *mc_header = mc;
@@ -237,6 +246,26 @@ static int microcode_sanity_check(void *mc)
     return 0;
 }
 
+static void free_patch(struct microcode_patch *patch)
+{
+    xfree(patch->data);
+    xfree(patch);
+}
+
+static enum microcode_match_result replace_patch(struct microcode_patch *new,
+                                                 struct microcode_patch *old)
+{
+    struct microcode_header_intel *old_header = old->data;
+    enum microcode_match_result ret =
+                microcode_update_match(new->data, old_header->sig,
+                                       old_header->pf, old_header->rev);
+
+    if ( ret == NEW_UCODE )
+        list_replace(&old->list, &new->list);
+
+    return ret;
+}
+
 /*
  * return 0 - no update found
  * return 1 - found update
@@ -248,6 +277,25 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc;
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+    void *new_mc2 = xmalloc_bytes(total_size);
+
+    if ( !microcode_patch || !new_mc2 )
+    {
+        xfree(microcode_patch);
+        xfree(new_mc2);
+        printk(XENLOG_ERR "microcode: Can not allocate memory\n");
+        return -ENOMEM;
+    }
+    memcpy(new_mc2, mc, total_size);
+    microcode_patch->data = new_mc2;
+
+    /*
+     * In order to support a system with mixed stepping CPUs, save this ucode
+     * patch before checking whether it matches with current CPU.
+     */
+    if ( !save_patch(microcode_patch) )
+        return 0;
 
     if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
                                 uci->cpu_sig.rev) != NEW_UCODE )
@@ -276,18 +324,24 @@ static int apply_microcode(unsigned int cpu)
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    struct microcode_intel *mc_intel;
+    struct microcode_patch *patch;
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu_num != cpu);
 
-    if ( uci->mc.mc_intel == NULL )
+    patch = find_patch(cpu);
+    if ( !patch )
         return -EINVAL;
 
+    mc_intel = patch->data;
+    BUG_ON(!mc_intel);
+
     /* serialize access to the physical write to MSR 0x79 */
     spin_lock_irqsave(&microcode_update_lock, flags);
 
     /* write microcode via MSR 0x79 */
-    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
+    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
 
     /* As documented in the SDM: Do a CPUID 1 here */
@@ -298,19 +352,19 @@ static int apply_microcode(unsigned int cpu)
     val[1] = (uint32_t)(msr_content >> 32);
 
     spin_unlock_irqrestore(&microcode_update_lock, flags);
-    if ( val[1] != uci->mc.mc_intel->hdr.rev )
+    if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
+               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
            cpu_num, uci->cpu_sig.rev, val[1],
-           uci->mc.mc_intel->hdr.date & 0xffff,
-           uci->mc.mc_intel->hdr.date >> 24,
-           (uci->mc.mc_intel->hdr.date >> 16) & 0xff);
+           mc_intel->hdr.date & 0xffff,
+           mc_intel->hdr.date >> 24,
+           (mc_intel->hdr.date >> 16) & 0xff);
     uci->cpu_sig.rev = val[1];
 
     return 0;
@@ -395,6 +449,9 @@ static const struct microcode_ops microcode_intel_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+    .replace_patch                    = replace_patch,
+    .free_patch                       = free_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 73ebe9a..fc98fed 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -1,6 +1,7 @@
 #ifndef ASM_X86__MICROCODE_H
 #define ASM_X86__MICROCODE_H
 
+#include <xen/list.h>
 #include <xen/percpu.h>
 
 enum microcode_match_result {
@@ -12,6 +13,11 @@ enum microcode_match_result {
 struct cpu_signature;
 struct ucode_cpu_info;
 
+struct microcode_patch {
+    struct list_head list;
+    void *data;
+};
+
 struct microcode_ops {
     int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
@@ -19,6 +25,10 @@ struct microcode_ops {
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
     int (*apply_microcode)(unsigned int cpu);
     int (*start_update)(void);
+    enum microcode_match_result (*replace_patch)(struct microcode_patch *new,
+                                                 struct microcode_patch *old);
+    void (*free_patch)(struct microcode_patch *patch);
+    bool (*match_cpu)(const struct microcode_patch *patch, unsigned int cpu);
 };
 
 struct cpu_signature {
@@ -39,4 +49,7 @@ struct ucode_cpu_info {
 DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 extern const struct microcode_ops *microcode_ops;
 
+bool save_patch(struct microcode_patch *new_patch);
+struct microcode_patch *find_patch(unsigned int cpu);
+
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (2 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 3/8] microcode: introduce the global microcode cache Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-29  9:25   ` Roger Pau Monné
  2019-01-28  7:06 ` [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

apply_microcode() now gets ucode patch from the global cache rather
than using the microcode stored in "mc" field of ucode_cpu_info.
Also remove 'microcode_resume_match' from microcode_ops because the
matching is done in find_patch(). The cpu status notifier is also
removed. It was used to free the "mc" field to avoid memory leak.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode.c        | 71 +++---------------------------------
 xen/arch/x86/microcode_amd.c    | 80 ++---------------------------------------
 xen/arch/x86/microcode_intel.c  | 27 +++-----------
 xen/include/asm-x86/microcode.h |  6 ----
 4 files changed, 12 insertions(+), 172 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 7d5b769..0c77e90 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -195,21 +195,13 @@ struct microcode_info {
     char buffer[1];
 };
 
-static void __microcode_fini_cpu(unsigned int cpu)
+static void microcode_fini_cpu(unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
 
-    xfree(uci->mc.mc_valid);
     memset(uci, 0, sizeof(*uci));
 }
 
-static void microcode_fini_cpu(unsigned int cpu)
-{
-    spin_lock(&microcode_mutex);
-    __microcode_fini_cpu(cpu);
-    spin_unlock(&microcode_mutex);
-}
-
 /* Save a ucode patch to the global cache list */
 bool save_patch(struct microcode_patch *new_patch)
 {
@@ -250,7 +242,7 @@ struct microcode_patch *find_patch(unsigned int cpu)
 
     if ( microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig) )
     {
-        __microcode_fini_cpu(cpu);
+        microcode_fini_cpu(cpu);
         return NULL;
     }
 
@@ -266,8 +258,6 @@ int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct cpu_signature nsig;
-    unsigned int cpu2;
 
     if ( !microcode_ops )
         return 0;
@@ -277,40 +267,12 @@ int microcode_resume_cpu(unsigned int cpu)
     err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
     if ( err )
     {
-        __microcode_fini_cpu(cpu);
+        microcode_fini_cpu(cpu);
         spin_unlock(&microcode_mutex);
         return err;
     }
 
-    if ( uci->mc.mc_valid )
-    {
-        err = microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid);
-        if ( err >= 0 )
-        {
-            if ( err )
-                err = microcode_ops->apply_microcode(cpu);
-            spin_unlock(&microcode_mutex);
-            return err;
-        }
-    }
-
-    nsig = uci->cpu_sig;
-    __microcode_fini_cpu(cpu);
-    uci->cpu_sig = nsig;
-
-    err = -EIO;
-    for_each_online_cpu ( cpu2 )
-    {
-        uci = &per_cpu(ucode_cpu_info, cpu2);
-        if ( uci->mc.mc_valid &&
-             microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid) > 0 )
-        {
-            err = microcode_ops->apply_microcode(cpu);
-            break;
-        }
-    }
-
-    __microcode_fini_cpu(cpu);
+    err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -328,7 +290,7 @@ static int microcode_update_cpu(const void *buf, size_t size)
     if ( likely(!err) )
         err = microcode_ops->cpu_request_microcode(cpu, buf, size);
     else
-        __microcode_fini_cpu(cpu);
+        microcode_fini_cpu(cpu);
 
     spin_unlock(&microcode_mutex);
 
@@ -416,25 +378,6 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-static int microcode_percpu_callback(
-    struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
-    unsigned int cpu = (unsigned long)hcpu;
-
-    switch ( action )
-    {
-    case CPU_DEAD:
-        microcode_fini_cpu(cpu);
-        break;
-    }
-
-    return NOTIFY_DONE;
-}
-
-static struct notifier_block microcode_percpu_nfb = {
-    .notifier_call = microcode_percpu_callback,
-};
-
 int __init early_microcode_update_cpu(bool start_update)
 {
     int rc = 0;
@@ -478,12 +421,8 @@ int __init early_microcode_init(void)
         return rc;
 
     if ( microcode_ops )
-    {
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
 
-        register_cpu_notifier(&microcode_percpu_nfb);
-    }
-
     return rc;
 }
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 301acb6..d86a596 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -457,10 +457,9 @@ static bool_t check_final_patch_levels(unsigned int cpu)
 static int cpu_request_microcode(unsigned int cpu, const void *buf,
                                  size_t bufsize)
 {
-    struct microcode_amd *mc_amd, *mc_old;
+    struct microcode_amd *mc_amd;
     size_t offset = 0;
-    size_t last_offset, applied_offset = 0;
-    int error = 0, save_error = 1;
+    int error = 0;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
@@ -543,17 +542,12 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    mc_old = uci->mc.mc_amd;
-    /* implicitely validates uci->mc.mc_valid */
-    uci->mc.mc_amd = mc_amd;
-
     /*
      * It's possible the data file has multiple matching ucode,
      * lets keep searching till the latest version
      */
     mc_amd->mpb = NULL;
     mc_amd->mpb_size = 0;
-    last_offset = offset;
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
@@ -575,11 +569,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             error = apply_microcode(cpu);
             if ( error )
                 break;
-            applied_offset = last_offset;
         }
 
-        last_offset = offset;
-
         if ( offset >= bufsize )
             break;
 
@@ -608,26 +599,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             break;
     }
 
-    /* On success keep the microcode patch for
-     * re-apply on resume.
-     */
-    if ( applied_offset )
-    {
-        save_error = get_ucode_from_buffer_amd(
-            mc_amd, buf, bufsize, &applied_offset);
-
-        if ( save_error )
-            error = save_error;
-    }
-
-    if ( save_error )
-    {
-        xfree(mc_amd);
-        uci->mc.mc_amd = mc_old;
-    }
-    else
-        xfree(mc_old);
-
   out:
 #if CONFIG_HVM
     svm_host_osvw_init();
@@ -642,52 +613,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    const struct microcode_amd *src = mc;
-
-    if ( !microcode_fits(src, cpu) )
-        return 0;
-
-    if ( src != mc_amd )
-    {
-        if ( mc_amd )
-        {
-            xfree(mc_amd->equiv_cpu_table);
-            xfree(mc_amd->mpb);
-            xfree(mc_amd);
-        }
-
-        mc_amd = xmalloc(struct microcode_amd);
-        uci->mc.mc_amd = mc_amd;
-        if ( !mc_amd )
-            return -ENOMEM;
-        mc_amd->equiv_cpu_table = xmalloc_bytes(src->equiv_cpu_table_size);
-        if ( !mc_amd->equiv_cpu_table )
-            goto err1;
-        mc_amd->mpb = xmalloc_bytes(src->mpb_size);
-        if ( !mc_amd->mpb )
-            goto err2;
-
-        mc_amd->equiv_cpu_table_size = src->equiv_cpu_table_size;
-        mc_amd->mpb_size = src->mpb_size;
-        memcpy(mc_amd->mpb, src->mpb, src->mpb_size);
-        memcpy(mc_amd->equiv_cpu_table, src->equiv_cpu_table,
-               src->equiv_cpu_table_size);
-    }
-
-    return 1;
-
-err2:
-    xfree(mc_amd->equiv_cpu_table);
-err1:
-    xfree(mc_amd);
-    uci->mc.mc_amd = NULL;
-    return -ENOMEM;
-}
-
 static int start_update(void)
 {
 #if CONFIG_HVM
@@ -707,7 +632,6 @@ static int start_update(void)
 }
 
 static const struct microcode_ops microcode_amd_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index fc35c8d..d227f8a 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -276,19 +276,18 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc;
     struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
-    void *new_mc2 = xmalloc_bytes(total_size);
+    void *new_mc = xmalloc_bytes(total_size);
 
-    if ( !microcode_patch || !new_mc2 )
+    if ( !microcode_patch || !new_mc )
     {
         xfree(microcode_patch);
-        xfree(new_mc2);
+        xfree(new_mc);
         printk(XENLOG_ERR "microcode: Can not allocate memory\n");
         return -ENOMEM;
     }
-    memcpy(new_mc2, mc, total_size);
-    microcode_patch->data = new_mc2;
+    memcpy(new_mc, mc, total_size);
+    microcode_patch->data = new_mc;
 
     /*
      * In order to support a system with mixed stepping CPUs, save this ucode
@@ -304,16 +303,6 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
              cpu, mc_header->rev, uci->cpu_sig.rev);
-    new_mc = xmalloc_bytes(total_size);
-    if ( new_mc == NULL )
-    {
-        printk(KERN_ERR "microcode: error! Can not allocate memory\n");
-        return -ENOMEM;
-    }
-
-    memcpy(new_mc, mc, total_size);
-    xfree(uci->mc.mc_intel);
-    uci->mc.mc_intel = new_mc;
     return 1;
 }
 
@@ -439,13 +428,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    return get_matching_microcode(mc, cpu);
-}
-
 static const struct microcode_ops microcode_intel_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index fc98fed..507da2e 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -19,7 +19,6 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
                                  size_t size);
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
@@ -39,11 +38,6 @@ struct cpu_signature {
 
 struct ucode_cpu_info {
     struct cpu_signature cpu_sig;
-    union {
-        struct microcode_intel *mc_intel;
-        struct microcode_amd *mc_amd;
-        void *mc_valid;
-    } mc;
 };
 
 DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (3 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-29  9:58   ` Roger Pau Monné
  2019-02-08 15:58   ` Jan Beulich
  2019-01-28  7:06 ` [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info Chao Gao
                   ` (3 subsequent siblings)
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

During late microcode update, apply_microcode() is invoked in
cpu_request_microcode(). To make late microcode update more reliable,
we want to put the apply_microcode() into stop_machine context. So
we split out it from cpu_request_microcode(). As a consequence,
apply_microcode() should be invoked explicitly in the common code.

Also with the global ucode cache, microcode parsing only needs
to be done once; cpu_request_microcode() is also moved out of
microcode_update_cpu().

On AMD side, svm_host_osvw_init() is supposed to be called after
microcode update. As apply_micrcode() won't be called by
cpu_request_microcode() now, svm_host_osvw_init() is also moved to the
end of apply_microcode().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode.c       | 80 +++++++++++++++++++++++++-----------------
 xen/arch/x86/microcode_amd.c   | 23 +++++++-----
 xen/arch/x86/microcode_intel.c | 20 ++---------
 3 files changed, 64 insertions(+), 59 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 0c77e90..936f0b8 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -254,47 +254,42 @@ struct microcode_patch *find_patch(unsigned int cpu)
     return NULL;
 }
 
-int microcode_resume_cpu(unsigned int cpu)
+/*
+ * Return the number of ucode patch inserted to the global cache.
+ * Return negtive value on error.
+ */
+static int parse_microcode_blob(const void *buffer, size_t len)
 {
-    int err;
+    unsigned int cpu = smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-
-    if ( !microcode_ops )
-        return 0;
+    int ret;
 
     spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
-    if ( err )
-    {
+    ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    if ( likely(!ret) )
+        ret = microcode_ops->cpu_request_microcode(cpu, buffer, len);
+    else
         microcode_fini_cpu(cpu);
-        spin_unlock(&microcode_mutex);
-        return err;
-    }
-
-    err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
 
-    return err;
+    return ret;
 }
 
-static int microcode_update_cpu(const void *buf, size_t size)
+static int microcode_update_cpu(void)
 {
-    int err;
-    unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    int ret;
 
     spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
-    if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
-    else
-        microcode_fini_cpu(cpu);
-
+    ret = microcode_ops->apply_microcode(smp_processor_id());
     spin_unlock(&microcode_mutex);
 
-    return err;
+    return ret;
+}
+
+int microcode_resume_cpu(unsigned int cpu)
+{
+    BUG_ON(cpu != smp_processor_id());
+    return microcode_ops ? microcode_update_cpu() : 0;
 }
 
 static long do_microcode_update(void *_info)
@@ -304,7 +299,7 @@ static long do_microcode_update(void *_info)
 
     BUG_ON(info->cpu != smp_processor_id());
 
-    error = microcode_update_cpu(info->buffer, info->buffer_size);
+    error = microcode_update_cpu();
     if ( error )
         info->error = error;
 
@@ -339,10 +334,6 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         return ret;
     }
 
-    info->buffer_size = len;
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
-
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
@@ -353,6 +344,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         }
     }
 
+    ret = parse_microcode_blob(info->buffer, len);
+    if ( ret <= 0 )
+    {
+        printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
+        xfree(info);
+        return -EINVAL;
+    }
+
+    info->buffer_size = len;
+    info->error = 0;
+    info->cpu = cpumask_first(&cpu_online_map);
+
     return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
 }
 
@@ -396,13 +399,24 @@ int __init early_microcode_update_cpu(bool start_update)
     }
     if ( data )
     {
+        static bool parsed = false;
+
         if ( start_update && microcode_ops->start_update )
             rc = microcode_ops->start_update();
 
         if ( rc )
             return rc;
 
-        return microcode_update_cpu(data, len);
+        if ( !parsed )
+        {
+            rc = parse_microcode_blob(data, len);
+            parsed = true;
+
+            if ( rc <= 0 )
+                return -EINVAL;
+        }
+
+        return microcode_update_cpu();
     }
     else
         return -ENOMEM;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index d86a596..80e274e 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -297,6 +297,10 @@ static int apply_microcode(unsigned int cpu)
 
     uci->cpu_sig.rev = rev;
 
+#if CONFIG_HVM
+    svm_host_osvw_init();
+#endif
+
     return 0;
 }
 
@@ -463,6 +467,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    unsigned int matched_cnt = 0;
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -564,11 +569,15 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
          * this ucode patch before checking whether it matches with
          * current CPU.
          */
-        if ( save_patch(microcode_patch) && microcode_fits(mc_amd, cpu) )
+        if ( save_patch(microcode_patch) )
         {
-            error = apply_microcode(cpu);
-            if ( error )
-                break;
+            matched_cnt++;
+            if ( microcode_fits(mc_amd, cpu) )
+            {
+                error = apply_microcode(cpu);
+                if ( error )
+                    break;
+            }
         }
 
         if ( offset >= bufsize )
@@ -600,17 +609,13 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     }
 
   out:
-#if CONFIG_HVM
-    svm_host_osvw_init();
-#endif
-
     /*
      * In some cases we may return an error even if processor's microcode has
      * been updated. For example, the first patch in a container file is loaded
      * successfully but subsequent container file processing encounters a
      * failure.
      */
-    return error;
+    return error ?: matched_cnt;
 }
 
 static int start_update(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index d227f8a..02fc5a0 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -273,7 +273,6 @@ static enum microcode_match_result replace_patch(struct microcode_patch *new,
  */
 static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
@@ -291,19 +290,9 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     /*
      * In order to support a system with mixed stepping CPUs, save this ucode
-     * patch before checking whether it matches with current CPU.
+     * patch without checking whether it matches with current CPU.
      */
-    if ( !save_patch(microcode_patch) )
-        return 0;
-
-    if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
-                                uci->cpu_sig.rev) != NEW_UCODE )
-        return 0;
-
-    pr_debug("microcode: CPU%d found a matching microcode update with"
-             " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, uci->cpu_sig.rev);
-    return 1;
+    return save_patch(microcode_patch);
 }
 
 static int apply_microcode(unsigned int cpu)
@@ -422,10 +411,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && matching_count )
-        error = apply_microcode(cpu);
-
-    return error;
+    return error ?: matching_count;
 }
 
 static const struct microcode_ops microcode_intel_ops = {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (4 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-29 10:10   ` Roger Pau Monné
  2019-01-28  7:06 ` [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading Chao Gao
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

microcode pointer and size were passed to other CPUs to parse
microcode locally. Now, parsing microcode is done on one CPU.
Other CPUs needn't parse the microcode blob; the pointer and
size can be removed.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode.c | 33 +++++++++++++++++----------------
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 936f0b8..3c2274f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -190,9 +190,7 @@ DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 
 struct microcode_info {
     unsigned int cpu;
-    uint32_t buffer_size;
     int error;
-    char buffer[1];
 };
 
 static void microcode_fini_cpu(unsigned int cpu)
@@ -316,6 +314,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     struct microcode_info *info;
+    void * buffer;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -323,28 +322,26 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
-    info = xmalloc_bytes(sizeof(*info) + len);
-    if ( info == NULL )
-        return -ENOMEM;
-
-    ret = copy_from_guest(info->buffer, buf, len);
-    if ( ret != 0 )
+    info = xmalloc(struct microcode_info);
+    buffer = xmalloc_bytes(len);
+    if ( !info || !buffer )
     {
-        xfree(info);
-        return ret;
+        ret = -ENOMEM;
+        goto free;
     }
 
+    ret = copy_from_guest(buffer, buf, len);
+    if ( ret != 0 )
+        goto free;
+
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-        {
-            xfree(info);
-            return ret;
-        }
+            goto free;
     }
 
-    ret = parse_microcode_blob(info->buffer, len);
+    ret = parse_microcode_blob(buffer, len);
     if ( ret <= 0 )
     {
         printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
@@ -352,11 +349,15 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         return -EINVAL;
     }
 
-    info->buffer_size = len;
     info->error = 0;
     info->cpu = cpumask_first(&cpu_online_map);
 
     return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+
+ free:
+    xfree(info);
+    xfree(buffer);
+    return ret;
 }
 
 static int __init microcode_init(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (5 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-29 10:37   ` Roger Pau Monné
  2019-02-08 16:29   ` Jan Beulich
  2019-01-28  7:06 ` [PATCH v5 8/8] microcode: update microcode on cores in parallel Chao Gao
  2019-01-29 11:31 ` [PATCH v5 0/8] improve late microcode loading Roger Pau Monné
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Borislav Petkov, Wei Liu, Jun Nakajima,
	Andrew Cooper, Jan Beulich, Ashok Raj, Chao Gao, Thomas Gleixner,
	Roger Pau Monné

This patch ports microcode improvement patches from linux kernel.

Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.

Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/microcode.c | 125 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 98 insertions(+), 27 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 3c2274f..b7b20cf 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -22,6 +22,7 @@
  */
 
 #include <xen/cpu.h>
+#include <xen/cpumask.h>
 #include <xen/lib.h>
 #include <xen/kernel.h>
 #include <xen/init.h>
@@ -30,18 +31,25 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/stop_machine.h>
 #include <xen/tasklet.h>
 #include <xen/guest_access.h>
 #include <xen/earlycpio.h>
+#include <xen/watchdog.h>
 
+#include <asm/delay.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
 
+/* By default, wait for 30000us */
+#define MICROCODE_DEFAULT_TIMEOUT_US 30000
+
 static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
+static unsigned int nr_cores;
 
 /*
  * If we scan the initramfs.cpio for the early microcode code
@@ -188,10 +196,11 @@ static DEFINE_SPINLOCK(microcode_mutex);
 
 DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 
-struct microcode_info {
-    unsigned int cpu;
-    int error;
-};
+/*
+ * Count the CPUs that have entered and exited the rendezvous
+ * during late microcode update.
+ */
+static atomic_t cpu_in, cpu_out;
 
 static void microcode_fini_cpu(unsigned int cpu)
 {
@@ -290,30 +299,60 @@ int microcode_resume_cpu(unsigned int cpu)
     return microcode_ops ? microcode_update_cpu() : 0;
 }
 
-static long do_microcode_update(void *_info)
+/* Wait for all CPUs to rendezvous with a timeout (us) */
+static int wait_for_cpus(atomic_t *cnt, unsigned int timeout)
 {
-    struct microcode_info *info = _info;
-    int error;
+    unsigned int cpus = num_online_cpus();
 
-    BUG_ON(info->cpu != smp_processor_id());
+    atomic_inc(cnt);
+
+    while ( atomic_read(cnt) != cpus )
+    {
+        if ( timeout <= 0 )
+        {
+            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
+                   smp_processor_id());
+            return -EBUSY;
+        }
+        udelay(1);
+        timeout--;
+    }
+
+    return 0;
+}
+
+static int do_microcode_update(void *unused)
+{
+    unsigned int cpu = smp_processor_id();
+    int ret;
 
-    error = microcode_update_cpu();
-    if ( error )
-        info->error = error;
+    ret = wait_for_cpus(&cpu_in, MICROCODE_DEFAULT_TIMEOUT_US);
+    if ( ret )
+        return ret;
 
-    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
-    if ( info->cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    /*
+     * Initiate an update on all processors which don't have an online sibling
+     * thread with a lower thread id. Other sibling threads just await the
+     * completion of microcode update.
+     */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+        ret = microcode_update_cpu();
+    /*
+     * Increase the wait timeout to a safe value here since we're serializing
+     * the microcode update and that could take a while on a large number of
+     * CPUs. And that is fine as the *actual* timeout will be determined by
+     * the last CPU finished updating and thus cut short
+     */
+    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
+        panic("Timeout when finishing updating microcode");
 
-    error = info->error;
-    xfree(info);
-    return error;
+    return ret;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
-    struct microcode_info *info;
+    unsigned int cpu;
     void * buffer;
 
     if ( len != (uint32_t)len )
@@ -322,9 +361,8 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
-    info = xmalloc(struct microcode_info);
     buffer = xmalloc_bytes(len);
-    if ( !info || !buffer )
+    if ( !buffer )
     {
         ret = -ENOMEM;
         goto free;
@@ -334,28 +372,61 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( ret != 0 )
         goto free;
 
+    /* cpu_online_map must not change during update */
+    if ( !get_cpu_maps() )
+    {
+        ret = -EBUSY;
+        goto free;
+    }
+
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-            goto free;
+            goto put;
     }
 
     ret = parse_microcode_blob(buffer, len);
     if ( ret <= 0 )
     {
         printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
-        xfree(info);
-        return -EINVAL;
+        ret = -EINVAL;
+        goto put;
     }
 
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
+    atomic_set(&cpu_in, 0);
+    atomic_set(&cpu_out, 0);
+
+    /* Calculate the number of online CPU core */
+    nr_cores = 0;
+    for_each_online_cpu(cpu)
+        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+            nr_cores++;
+
+    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
 
-    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    /*
+     * We intend to disable interrupt for long time, which may lead to
+     * watchdog timeout.
+     */
+    watchdog_disable();
+    /*
+     * Late loading dance. Why the heavy-handed stop_machine effort?
+     *
+     * - HT siblings must be idle and not execute other code while the other
+     *   sibling is loading microcode in order to avoid any negative
+     *   interactions cause by the loading.
+     *
+     * - In addition, microcode update on the cores must be serialized until
+     *   this requirement can be relaxed in the future. Right now, this is
+     *   conservative and good.
+     */
+    ret = stop_machine_run(do_microcode_update, NULL, NR_CPUS);
+    watchdog_enable();
 
+ put:
+    put_cpu_maps();
  free:
-    xfree(info);
     xfree(buffer);
     return ret;
 }
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (6 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-01-28  7:06 ` Chao Gao
  2019-01-29 11:27   ` Roger Pau Monné
  2019-02-12 12:51   ` Jan Beulich
  2019-01-29 11:31 ` [PATCH v5 0/8] improve late microcode loading Roger Pau Monné
  8 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-28  7:06 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, Roger Pau Monné, Wei Liu, Jan Beulich, Chao Gao

Currently, microcode_update_lock and microcode_mutex prevent cores
from updating microcode in parallel. Below changes are made to support
parallel microcode update on cores.

microcode_update_lock is removed. The purpose of this lock is to
prevent logic threads of a same core from updating microcode at the
same time. But due to using a global lock, it also prevents parallel
microcode updating on different cores. The original purpose of
microcode_update_lock is already enforced at the level of
apply_microcode()'s caller:
1. For late microcode update, only one sibiling thread of a core will
call the apply_microcode().
2. For microcode update during system startup or CPU-hotplug, each
logical thread is woken up one-by-one.
3. get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
late microcode update.

microcode_mutex is replaced by a rwlock. microcode_mutex was used to
prevent concurrent accesses to 'uci' and microcode_cache. Now the
per-cpu variable, 'uci', won't be accessed by remote cpus after most
fields in 'uci' have been removed; The only shared resource which
needs to be protected is the microcode_cache. A rwlock allows multiple
readers (one thread of each core) to access the global cache and
update microcode simultaneously. Because the rwlock may be held in
stop_machine context, where interrupt is disabled, irq{save, restore}
variants are used to get/release the rwlock.

Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
only) are still processed sequentially.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v5:
 - newly add
---
 xen/arch/x86/microcode.c       | 51 ++++++++++++++++++++++++++++++------------
 xen/arch/x86/microcode_amd.c   |  9 +-------
 xen/arch/x86/microcode_intel.c |  9 +-------
 3 files changed, 39 insertions(+), 30 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index b7b20cf..8de9f0e 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -192,7 +192,7 @@ scan:
 
 const struct microcode_ops *microcode_ops;
 
-static DEFINE_SPINLOCK(microcode_mutex);
+static DEFINE_RWLOCK(cache_rwlock);
 
 DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 
@@ -206,6 +206,7 @@ static void microcode_fini_cpu(unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
 
+    ASSERT(cpu == smp_processor_id());
     memset(uci, 0, sizeof(*uci));
 }
 
@@ -213,21 +214,25 @@ static void microcode_fini_cpu(unsigned int cpu)
 bool save_patch(struct microcode_patch *new_patch)
 {
     struct microcode_patch *microcode_patch;
+    enum microcode_match_result result = MIS_UCODE;
+    bool ret;
+    unsigned long flag;
+
+    write_lock_irqsave(&cache_rwlock, flag);
 
     list_for_each_entry(microcode_patch, &microcode_cache, list)
     {
-        enum microcode_match_result result =
-            microcode_ops->replace_patch(new_patch, microcode_patch);
+        result = microcode_ops->replace_patch(new_patch, microcode_patch);
 
         switch ( result )
         {
         case OLD_UCODE:
-            microcode_ops->free_patch(new_patch);
-            return false;
+            ret = false;
+            goto out;
 
         case NEW_UCODE:
-            microcode_ops->free_patch(microcode_patch);
-            return true;
+            ret = true;
+            goto out;
 
         case MIS_UCODE:
             continue;
@@ -238,7 +243,27 @@ bool save_patch(struct microcode_patch *new_patch)
         }
     }
     list_add_tail(&new_patch->list, &microcode_cache);
-    return true;
+    ret = true;
+
+ out:
+    write_unlock_irqrestore(&cache_rwlock, flag);
+
+    /* free useless patches after interrupt enabled */
+    switch ( result )
+    {
+    case OLD_UCODE:
+        microcode_ops->free_patch(new_patch);
+        break;
+
+    case NEW_UCODE:
+        microcode_ops->free_patch(microcode_patch);
+        break;
+
+    default:
+        break;
+    }
+
+    return ret;
 }
 
 /* Find a ucode patch who has newer revision than the one in use */
@@ -269,15 +294,12 @@ static int parse_microcode_blob(const void *buffer, size_t len)
 {
     unsigned int cpu = smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    int ret;
+    int ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
 
-    spin_lock(&microcode_mutex);
-    ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
     if ( likely(!ret) )
         ret = microcode_ops->cpu_request_microcode(cpu, buffer, len);
     else
         microcode_fini_cpu(cpu);
-    spin_unlock(&microcode_mutex);
 
     return ret;
 }
@@ -285,10 +307,11 @@ static int parse_microcode_blob(const void *buffer, size_t len)
 static int microcode_update_cpu(void)
 {
     int ret;
+    unsigned long flag;
 
-    spin_lock(&microcode_mutex);
+    read_lock_irqsave(&cache_rwlock, flag);
     ret = microcode_ops->apply_microcode(smp_processor_id());
-    spin_unlock(&microcode_mutex);
+    read_unlock_irqrestore(&cache_rwlock, flag);
 
     return ret;
 }
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 80e274e..73df708 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -74,9 +74,6 @@ struct mpbhdr {
     uint8_t data[];
 };
 
-/* serialize access to the physical write */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 /* See comment in start_update() for cases when this routine fails */
 static int collect_cpu_info(unsigned int cpu, struct cpu_signature *csig)
 {
@@ -258,7 +255,6 @@ static enum microcode_match_result replace_patch(struct microcode_patch *new,
 
 static int apply_microcode(unsigned int cpu)
 {
-    unsigned long flags;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
     struct microcode_header_amd *hdr;
@@ -274,16 +270,13 @@ static int apply_microcode(unsigned int cpu)
 
     hdr = patch->data;
     BUG_ON(!hdr);
-
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
 
     /* get patch id after patching */
     rdmsrl(MSR_AMD_PATCHLEVEL, rev);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
-
     /* check current patch id and patch's id for match */
     if ( hw_err || (rev != hdr->patch_id) )
     {
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 02fc5a0..11440d1 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -86,9 +86,6 @@ struct extended_sigtable {
 
 #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
 
-/* serialize access to the physical write to MSR 0x79 */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
 {
     struct cpuinfo_x86 *c = &cpu_data[cpu_num];
@@ -297,7 +294,6 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
 static int apply_microcode(unsigned int cpu)
 {
-    unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
@@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
 
     mc_intel = patch->data;
     BUG_ON(!mc_intel);
-
-    /* serialize access to the physical write to MSR 0x79 */
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     /* write microcode via MSR 0x79 */
     wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
@@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
     rdmsrl(MSR_IA32_UCODE_REV, msr_content);
     val[1] = (uint32_t)(msr_content >> 32);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
     if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size
  2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
@ 2019-01-28 16:40   ` Roger Pau Monné
  2019-01-29 10:26   ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-28 16:40 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:43PM +0800, Chao Gao wrote:
> This check has been done in microcode_sanity_check(). Needn't do it
> again in get_matching_microcode().
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 2/8] microcode/intel: extend microcode_update_match()
  2019-01-28  7:06 ` [PATCH v5 2/8] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-01-28 16:55   ` Roger Pau Monné
  2019-01-28 17:00     ` Jan Beulich
  2019-01-29 10:41   ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-28 16:55 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:44PM +0800, Chao Gao wrote:
> to a more generic function. Then, this function can compare two given
> microcodes' signature/revision as well. Comparing two microcodes is
> used to update the global microcode cache (introduced by the later
> patches in this series) when a new microcode is given.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v5:
>  - constify the extended_signature
>  - use named enum type for the return value of microcode_update_match
> ---
>  xen/arch/x86/microcode_intel.c  | 43 ++++++++++++++++++-----------------------
>  xen/include/asm-x86/microcode.h |  6 ++++++
>  2 files changed, 25 insertions(+), 24 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
> index 4f69f4a..1ed573a 100644
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -127,14 +127,24 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>      return 0;
>  }
>  
> -static inline int microcode_update_match(
> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
> -    int sig, int pf)
> +static enum microcode_match_result microcode_update_match(
> +    const void *mc, unsigned int sig, unsigned int pf, unsigned int rev)

Why are you passing this as a void pointer? The only caller is already
passing this as a mc_header pointer.

>  {
> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
> +    const struct microcode_header_intel *mc_header = mc;
> +    const struct extended_sigtable *ext_header;
> +    const struct extended_signature *ext_sig;
> +    unsigned int i;
> +
> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;

The code above implies that a microcode blob can only have a single
version of microcode for each model, I guess this is OK and guaranteed
by Intel?

With at least the first comment fixed:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 2/8] microcode/intel: extend microcode_update_match()
  2019-01-28 16:55   ` Roger Pau Monné
@ 2019-01-28 17:00     ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-01-28 17:00 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel, Chao Gao

>>> On 28.01.19 at 17:55, <roger.pau@citrix.com> wrote:
> On Mon, Jan 28, 2019 at 03:06:44PM +0800, Chao Gao wrote:
>> to a more generic function. Then, this function can compare two given
>> microcodes' signature/revision as well. Comparing two microcodes is
>> used to update the global microcode cache (introduced by the later
>> patches in this series) when a new microcode is given.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v5:
>>  - constify the extended_signature
>>  - use named enum type for the return value of microcode_update_match
>> ---
>>  xen/arch/x86/microcode_intel.c  | 43 ++++++++++++++++++-----------------------
>>  xen/include/asm-x86/microcode.h |  6 ++++++
>>  2 files changed, 25 insertions(+), 24 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
>> index 4f69f4a..1ed573a 100644
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -127,14 +127,24 @@ static int collect_cpu_info(unsigned int cpu_num, 
> struct cpu_signature *csig)
>>      return 0;
>>  }
>>  
>> -static inline int microcode_update_match(
>> -    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>> -    int sig, int pf)
>> +static enum microcode_match_result microcode_update_match(
>> +    const void *mc, unsigned int sig, unsigned int pf, unsigned int rev)
> 
> Why are you passing this as a void pointer? The only caller is already
> passing this as a mc_header pointer.
> 
>>  {
>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>> +    const struct microcode_header_intel *mc_header = mc;
>> +    const struct extended_sigtable *ext_header;
>> +    const struct extended_signature *ext_sig;
>> +    unsigned int i;
>> +
>> +    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>> +        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
> 
> The code above implies that a microcode blob can only have a single
> version of microcode for each model, I guess this is OK and guaranteed
> by Intel?

The containerization is a little different from AMD's. Multiple
different versions can be present in what commonly named
microcode.bin, but each individual blob holds exactly one
version for one family:model:stepping tuple (plus the further
restriction, named "pf" above).

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-01-28  7:06 ` [PATCH v5 3/8] microcode: introduce the global microcode cache Chao Gao
@ 2019-01-28 17:39   ` Roger Pau Monné
  2019-01-29  4:41     ` Chao Gao
  2019-02-08 11:41   ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-28 17:39 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:45PM +0800, Chao Gao wrote:
> to replace the current per-cpu cache 'uci->mc'.
> 
> Compared to the current per-cpu cache, the benefits of the global
> microcode cache are:
> 1. It reduces the work that need to be done on each CPU. Parsing ucode
> file can be done once on one CPU. Other CPUs needn't parse ucode file.
> Instead, they can find out and load a patch with newer revision from
> the global cache.
> 2. It reduces the memory consumption on a system with many CPU cores.
> 
> Two functions, save_patch() and find_patch() are introduced. The
> former adds one given patch to the global cache. The latter gets
> a newer and matched ucode patch from the global cache.
> 
> Note that I deliberately avoid touching 'uci->mc' as I am going to
> remove it completely in the next patch.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v5:
>  - reword the commit description
>  - find_patch() and save_patch() are abstracted into common functions
>    with some hooks for AMD and Intel
> ---
>  xen/arch/x86/microcode.c        | 54 +++++++++++++++++++++++
>  xen/arch/x86/microcode_amd.c    | 94 ++++++++++++++++++++++++++++++++++++++---
>  xen/arch/x86/microcode_intel.c  | 71 ++++++++++++++++++++++++++++---
>  xen/include/asm-x86/microcode.h | 13 ++++++
>  4 files changed, 219 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 4163f50..7d5b769 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
>   */
>  static bool_t __initdata ucode_scan;
>  
> +static LIST_HEAD(microcode_cache);
> +
>  void __init microcode_set_module(unsigned int idx)
>  {
>      ucode_mod_idx = idx;
> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
>      spin_unlock(&microcode_mutex);
>  }
>  
> +/* Save a ucode patch to the global cache list */
> +bool save_patch(struct microcode_patch *new_patch)

This being a global function likely requires some kind of prefix, I
would suggest microcode_save_patch, the same applies to the find_patch
function below.

> +{
> +    struct microcode_patch *microcode_patch;
> +
> +    list_for_each_entry(microcode_patch, &microcode_cache, list)

I think I'm missing something here, but given the conversation we had
in the previous version of the series [0] I assumed there was only a
single microcode patch that applies to the whole system, and that
there was no need to keep a list?

Because Xen doesn't support running on such mixed systems anyway?

[0] https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg00381.html

> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
> index 1ed573a..fc35c8d 100644
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -276,18 +324,24 @@ static int apply_microcode(unsigned int cpu)
>      unsigned int val[2];
>      unsigned int cpu_num = raw_smp_processor_id();
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
> +    struct microcode_intel *mc_intel;
> +    struct microcode_patch *patch;
>  
>      /* We should bind the task to the CPU */
>      BUG_ON(cpu_num != cpu);
>  
> -    if ( uci->mc.mc_intel == NULL )
> +    patch = find_patch(cpu);
> +    if ( !patch )
>          return -EINVAL;
>  
> +    mc_intel = patch->data;
> +    BUG_ON(!mc_intel);
> +
>      /* serialize access to the physical write to MSR 0x79 */
>      spin_lock_irqsave(&microcode_update_lock, flags);
>  
>      /* write microcode via MSR 0x79 */
> -    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
> +    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>      wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
>  
>      /* As documented in the SDM: Do a CPUID 1 here */
> @@ -298,19 +352,19 @@ static int apply_microcode(unsigned int cpu)
>      val[1] = (uint32_t)(msr_content >> 32);
>  
>      spin_unlock_irqrestore(&microcode_update_lock, flags);
> -    if ( val[1] != uci->mc.mc_intel->hdr.rev )
> +    if ( val[1] != mc_intel->hdr.rev )
>      {
>          printk(KERN_ERR "microcode: CPU%d update from revision "
>                 "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
> -               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
> +               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
>          return -EIO;
>      }
>      printk(KERN_INFO "microcode: CPU%d updated from revision "
>             "%#x to %#x, date = %04x-%02x-%02x \n",
>             cpu_num, uci->cpu_sig.rev, val[1],
> -           uci->mc.mc_intel->hdr.date & 0xffff,
> -           uci->mc.mc_intel->hdr.date >> 24,
> -           (uci->mc.mc_intel->hdr.date >> 16) & 0xff);
> +           mc_intel->hdr.date & 0xffff,
> +           mc_intel->hdr.date >> 24,
> +           (mc_intel->hdr.date >> 16) & 0xff);

Nit: while here could you make an union of the date field with it's
format, so that you can print it without having to perform this
shifting and masking?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-01-28 17:39   ` Roger Pau Monné
@ 2019-01-29  4:41     ` Chao Gao
  2019-01-29  8:56       ` Roger Pau Monné
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-01-29  4:41 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 06:39:43PM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:45PM +0800, Chao Gao wrote:
>> to replace the current per-cpu cache 'uci->mc'.
>> 
>> Compared to the current per-cpu cache, the benefits of the global
>> microcode cache are:
>> 1. It reduces the work that need to be done on each CPU. Parsing ucode
>> file can be done once on one CPU. Other CPUs needn't parse ucode file.
>> Instead, they can find out and load a patch with newer revision from
>> the global cache.
>> 2. It reduces the memory consumption on a system with many CPU cores.
>> 
>> Two functions, save_patch() and find_patch() are introduced. The
>> former adds one given patch to the global cache. The latter gets
>> a newer and matched ucode patch from the global cache.
>> 
>> Note that I deliberately avoid touching 'uci->mc' as I am going to
>> remove it completely in the next patch.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v5:
>>  - reword the commit description
>>  - find_patch() and save_patch() are abstracted into common functions
>>    with some hooks for AMD and Intel
>> ---
>>  xen/arch/x86/microcode.c        | 54 +++++++++++++++++++++++
>>  xen/arch/x86/microcode_amd.c    | 94 ++++++++++++++++++++++++++++++++++++++---
>>  xen/arch/x86/microcode_intel.c  | 71 ++++++++++++++++++++++++++++---
>>  xen/include/asm-x86/microcode.h | 13 ++++++
>>  4 files changed, 219 insertions(+), 13 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 4163f50..7d5b769 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
>>   */
>>  static bool_t __initdata ucode_scan;
>>  
>> +static LIST_HEAD(microcode_cache);
>> +
>>  void __init microcode_set_module(unsigned int idx)
>>  {
>>      ucode_mod_idx = idx;
>> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
>>      spin_unlock(&microcode_mutex);
>>  }
>>  
>> +/* Save a ucode patch to the global cache list */
>> +bool save_patch(struct microcode_patch *new_patch)
>
>This being a global function likely requires some kind of prefix, I
>would suggest microcode_save_patch, the same applies to the find_patch
>function below.

Will do.

>
>> +{
>> +    struct microcode_patch *microcode_patch;
>> +
>> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
>
>I think I'm missing something here, but given the conversation we had
>in the previous version of the series [0] I assumed there was only a
>single microcode patch that applies to the whole system, and that
>there was no need to keep a list?
>
>Because Xen doesn't support running on such mixed systems anyway?

No. As Jan pointed out in [1], we still want to support mixed systems
which are allowed by Intel and AMD.

[1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03479.html

>
>[0] https://lists.xenproject.org/archives/html/xen-devel/2018-12/msg00381.html
>
>> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
>> index 1ed573a..fc35c8d 100644
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -276,18 +324,24 @@ static int apply_microcode(unsigned int cpu)
>>      unsigned int val[2];
>>      unsigned int cpu_num = raw_smp_processor_id();
>>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>> +    struct microcode_intel *mc_intel;
>> +    struct microcode_patch *patch;
>>  
>>      /* We should bind the task to the CPU */
>>      BUG_ON(cpu_num != cpu);
>>  
>> -    if ( uci->mc.mc_intel == NULL )
>> +    patch = find_patch(cpu);
>> +    if ( !patch )
>>          return -EINVAL;
>>  
>> +    mc_intel = patch->data;
>> +    BUG_ON(!mc_intel);
>> +
>>      /* serialize access to the physical write to MSR 0x79 */
>>      spin_lock_irqsave(&microcode_update_lock, flags);
>>  
>>      /* write microcode via MSR 0x79 */
>> -    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
>> +    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>>      wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
>>  
>>      /* As documented in the SDM: Do a CPUID 1 here */
>> @@ -298,19 +352,19 @@ static int apply_microcode(unsigned int cpu)
>>      val[1] = (uint32_t)(msr_content >> 32);
>>  
>>      spin_unlock_irqrestore(&microcode_update_lock, flags);
>> -    if ( val[1] != uci->mc.mc_intel->hdr.rev )
>> +    if ( val[1] != mc_intel->hdr.rev )
>>      {
>>          printk(KERN_ERR "microcode: CPU%d update from revision "
>>                 "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
>> -               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
>> +               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
>>          return -EIO;
>>      }
>>      printk(KERN_INFO "microcode: CPU%d updated from revision "
>>             "%#x to %#x, date = %04x-%02x-%02x \n",
>>             cpu_num, uci->cpu_sig.rev, val[1],
>> -           uci->mc.mc_intel->hdr.date & 0xffff,
>> -           uci->mc.mc_intel->hdr.date >> 24,
>> -           (uci->mc.mc_intel->hdr.date >> 16) & 0xff);
>> +           mc_intel->hdr.date & 0xffff,
>> +           mc_intel->hdr.date >> 24,
>> +           (mc_intel->hdr.date >> 16) & 0xff);
>
>Nit: while here could you make an union of the date field with it's
>format, so that you can print it without having to perform this
>shifting and masking?

Will do.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-01-29  4:41     ` Chao Gao
@ 2019-01-29  8:56       ` Roger Pau Monné
  0 siblings, 0 replies; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29  8:56 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Jan 29, 2019 at 12:41:42PM +0800, Chao Gao wrote:
> On Mon, Jan 28, 2019 at 06:39:43PM +0100, Roger Pau Monné wrote:
> >On Mon, Jan 28, 2019 at 03:06:45PM +0800, Chao Gao wrote:
> >> to replace the current per-cpu cache 'uci->mc'.
> >> 
> >> Compared to the current per-cpu cache, the benefits of the global
> >> microcode cache are:
> >> 1. It reduces the work that need to be done on each CPU. Parsing ucode
> >> file can be done once on one CPU. Other CPUs needn't parse ucode file.
> >> Instead, they can find out and load a patch with newer revision from
> >> the global cache.
> >> 2. It reduces the memory consumption on a system with many CPU cores.
> >> 
> >> Two functions, save_patch() and find_patch() are introduced. The
> >> former adds one given patch to the global cache. The latter gets
> >> a newer and matched ucode patch from the global cache.
> >> 
> >> Note that I deliberately avoid touching 'uci->mc' as I am going to
> >> remove it completely in the next patch.
> >> 
> >> Signed-off-by: Chao Gao <chao.gao@intel.com>
> >> ---
> >> Changes in v5:
> >>  - reword the commit description
> >>  - find_patch() and save_patch() are abstracted into common functions
> >>    with some hooks for AMD and Intel
> >> ---
> >>  xen/arch/x86/microcode.c        | 54 +++++++++++++++++++++++
> >>  xen/arch/x86/microcode_amd.c    | 94 ++++++++++++++++++++++++++++++++++++++---
> >>  xen/arch/x86/microcode_intel.c  | 71 ++++++++++++++++++++++++++++---
> >>  xen/include/asm-x86/microcode.h | 13 ++++++
> >>  4 files changed, 219 insertions(+), 13 deletions(-)
> >> 
> >> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> >> index 4163f50..7d5b769 100644
> >> --- a/xen/arch/x86/microcode.c
> >> +++ b/xen/arch/x86/microcode.c
> >> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
> >>   */
> >>  static bool_t __initdata ucode_scan;
> >>  
> >> +static LIST_HEAD(microcode_cache);
> >> +
> >>  void __init microcode_set_module(unsigned int idx)
> >>  {
> >>      ucode_mod_idx = idx;
> >> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
> >>      spin_unlock(&microcode_mutex);
> >>  }
> >>  
> >> +/* Save a ucode patch to the global cache list */
> >> +bool save_patch(struct microcode_patch *new_patch)
> >
> >This being a global function likely requires some kind of prefix, I
> >would suggest microcode_save_patch, the same applies to the find_patch
> >function below.
> 
> Will do.
> 
> >
> >> +{
> >> +    struct microcode_patch *microcode_patch;
> >> +
> >> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
> >
> >I think I'm missing something here, but given the conversation we had
> >in the previous version of the series [0] I assumed there was only a
> >single microcode patch that applies to the whole system, and that
> >there was no need to keep a list?
> >
> >Because Xen doesn't support running on such mixed systems anyway?
> 
> No. As Jan pointed out in [1], we still want to support mixed systems
> which are allowed by Intel and AMD.
> 
> [1] https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03479.html

Oh thanks, I've missed that reply while reading the previous series.
Then with the comments fixed you can add:

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info
  2019-01-28  7:06 ` [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info Chao Gao
@ 2019-01-29  9:25   ` Roger Pau Monné
  2019-01-29 13:27     ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29  9:25 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

Thanks for the cleanup!

On Mon, Jan 28, 2019 at 03:06:46PM +0800, Chao Gao wrote:
> diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
> index fc98fed..507da2e 100644
> --- a/xen/include/asm-x86/microcode.h
> +++ b/xen/include/asm-x86/microcode.h
> @@ -19,7 +19,6 @@ struct microcode_patch {
>  };
>  
>  struct microcode_ops {
> -    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
>      int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
>                                   size_t size);
>      int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
> @@ -39,11 +38,6 @@ struct cpu_signature {
>  
>  struct ucode_cpu_info {
>      struct cpu_signature cpu_sig;
> -    union {
> -        struct microcode_intel *mc_intel;
> -        struct microcode_amd *mc_amd;
> -        void *mc_valid;
> -    } mc;
>  };

Is there really a need for such structure since it only has one field
now?

I'm trying to figure out whether this is expanded by further patches,
but it seems like it's not, if so please remove the struct altogether.

I'm also wondering whether it's needed to store the cpu signature in
the pcpu area, AFAICT you always call collect_cpu_info before
apply_microcode, at which point cpu_signature could be stored in the
stack and passed to apply_microcode as a parameter?

Or apply_microcode could call collect_cpu_info directly. Getting rid
of the pcpu field would also allow you to get rid of
microcode_fini_cpu, further cleaning the code.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-01-28  7:06 ` [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-01-29  9:58   ` Roger Pau Monné
  2019-01-29 12:47     ` Chao Gao
  2019-02-08 15:58   ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29  9:58 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:47PM +0800, Chao Gao wrote:
> During late microcode update, apply_microcode() is invoked in
> cpu_request_microcode(). To make late microcode update more reliable,
> we want to put the apply_microcode() into stop_machine context. So
> we split out it from cpu_request_microcode(). As a consequence,
> apply_microcode() should be invoked explicitly in the common code.
> 
> Also with the global ucode cache, microcode parsing only needs
> to be done once; cpu_request_microcode() is also moved out of
> microcode_update_cpu().
> 
> On AMD side, svm_host_osvw_init() is supposed to be called after
> microcode update. As apply_micrcode() won't be called by
> cpu_request_microcode() now, svm_host_osvw_init() is also moved to the
> end of apply_microcode().
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  xen/arch/x86/microcode.c       | 80 +++++++++++++++++++++++++-----------------
>  xen/arch/x86/microcode_amd.c   | 23 +++++++-----
>  xen/arch/x86/microcode_intel.c | 20 ++---------
>  3 files changed, 64 insertions(+), 59 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 0c77e90..936f0b8 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -254,47 +254,42 @@ struct microcode_patch *find_patch(unsigned int cpu)
>      return NULL;
>  }
>  
> -int microcode_resume_cpu(unsigned int cpu)
> +/*
> + * Return the number of ucode patch inserted to the global cache.
> + * Return negtive value on error.
> + */
> +static int parse_microcode_blob(const void *buffer, size_t len)
>  {
> -    int err;
> +    unsigned int cpu = smp_processor_id();
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
> -
> -    if ( !microcode_ops )
> -        return 0;
> +    int ret;
>  
>      spin_lock(&microcode_mutex);
> -
> -    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
> -    if ( err )
> -    {
> +    ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
> +    if ( likely(!ret) )
> +        ret = microcode_ops->cpu_request_microcode(cpu, buffer, len);
> +    else
>          microcode_fini_cpu(cpu);
> -        spin_unlock(&microcode_mutex);
> -        return err;
> -    }
> -
> -    err = microcode_ops->apply_microcode(cpu);
>      spin_unlock(&microcode_mutex);
>  
> -    return err;
> +    return ret;
>  }
>  
> -static int microcode_update_cpu(const void *buf, size_t size)
> +static int microcode_update_cpu(void)
>  {
> -    int err;
> -    unsigned int cpu = smp_processor_id();
> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
> +    int ret;
>  
>      spin_lock(&microcode_mutex);
> -
> -    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
> -    if ( likely(!err) )
> -        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
> -    else
> -        microcode_fini_cpu(cpu);
> -
> +    ret = microcode_ops->apply_microcode(smp_processor_id());

I've realized that most of this helpers take a cpu id parameter
(apply_microcode, collect_cpu_info and cpu_request_microcode), but
that at least on Intel they are required to be called on the current
CPU, at which point I wonder about their purpose. IMO they just make
the interface more messy, without adding any functionality.

>      spin_unlock(&microcode_mutex);
>  
> -    return err;
> +    return ret;
> +}
> +
> +int microcode_resume_cpu(unsigned int cpu)
> +{
> +    BUG_ON(cpu != smp_processor_id());

Same here, what's the point of passing a cpu id as parameter when
there's only one valid value?

> +    return microcode_ops ? microcode_update_cpu() : 0;
>  }
>  
>  static long do_microcode_update(void *_info)
> @@ -304,7 +299,7 @@ static long do_microcode_update(void *_info)
>  
>      BUG_ON(info->cpu != smp_processor_id());
>  
> -    error = microcode_update_cpu(info->buffer, info->buffer_size);
> +    error = microcode_update_cpu();
>      if ( error )
>          info->error = error;
>  
> @@ -339,10 +334,6 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          return ret;
>      }
>  
> -    info->buffer_size = len;
> -    info->error = 0;
> -    info->cpu = cpumask_first(&cpu_online_map);
> -
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
> @@ -353,6 +344,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          }
>      }
>  
> +    ret = parse_microcode_blob(info->buffer, len);
> +    if ( ret <= 0 )
> +    {
> +        printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
> +        xfree(info);
> +        return -EINVAL;
> +    }
> +
> +    info->buffer_size = len;
> +    info->error = 0;
> +    info->cpu = cpumask_first(&cpu_online_map);

It looks like you can also get rid of the cpu field in microcode_info,
since it's always smp_processor_id?

> +
>      return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>  }
>  
> @@ -396,13 +399,24 @@ int __init early_microcode_update_cpu(bool start_update)
>      }
>      if ( data )
>      {
> +        static bool parsed = false;
> +
>          if ( start_update && microcode_ops->start_update )
>              rc = microcode_ops->start_update();
>  
>          if ( rc )
>              return rc;
>  
> -        return microcode_update_cpu(data, len);
> +        if ( !parsed )
> +        {
> +            rc = parse_microcode_blob(data, len);
> +            parsed = true;
> +
> +            if ( rc <= 0 )
> +                return -EINVAL;
> +        }
> +
> +        return microcode_update_cpu();

Hm, the usage of parsed here is kind of dangerous. I assume this works
fine because early_microcode_update_cpu is called from the BSP in
early_microcode_init, and then concurrent calls done by the APs always
see parsed as true.

I would however recommend that you move the parsing to
early_microcode_init, and that early_microcode_update_cpu always
assume the blob has been parsed.

If that doesn't work for some reason, I would then recommend that you
gate the parsing based on cpu_id, so "smp_processor_id() == 0"

>      }
>      else
>          return -ENOMEM;
> diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
> index d86a596..80e274e 100644
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -297,6 +297,10 @@ static int apply_microcode(unsigned int cpu)
>  
>      uci->cpu_sig.rev = rev;
>  
> +#if CONFIG_HVM
> +    svm_host_osvw_init();
> +#endif
> +
>      return 0;
>  }
>  
> @@ -463,6 +467,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>      unsigned int current_cpu_id;
>      unsigned int equiv_cpu_id;
> +    unsigned int matched_cnt = 0;
>  
>      /* We should bind the task to the CPU */
>      BUG_ON(cpu != raw_smp_processor_id());
> @@ -564,11 +569,15 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
>           * this ucode patch before checking whether it matches with
>           * current CPU.
>           */
> -        if ( save_patch(microcode_patch) && microcode_fits(mc_amd, cpu) )
> +        if ( save_patch(microcode_patch) )
>          {
> -            error = apply_microcode(cpu);
> -            if ( error )
> -                break;
> +            matched_cnt++;
> +            if ( microcode_fits(mc_amd, cpu) )
> +            {
> +                error = apply_microcode(cpu);
> +                if ( error )
> +                    break;
> +            }

In the commit message you mention that apply_microcode won't be called
by cpu_request_microcode, yet it seems it's called?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info
  2019-01-28  7:06 ` [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info Chao Gao
@ 2019-01-29 10:10   ` Roger Pau Monné
  2019-01-29 14:11     ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29 10:10 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:48PM +0800, Chao Gao wrote:
> microcode pointer and size were passed to other CPUs to parse
> microcode locally. Now, parsing microcode is done on one CPU.
> Other CPUs needn't parse the microcode blob; the pointer and
> size can be removed.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  xen/arch/x86/microcode.c | 33 +++++++++++++++++----------------
>  1 file changed, 17 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 936f0b8..3c2274f 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -190,9 +190,7 @@ DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
>  
>  struct microcode_info {
>      unsigned int cpu;

I think cpu can also be removed as my previous reply to patch 5, at
which point you only need to store an error? In which case you should
also remove this struct then.

> -    uint32_t buffer_size;
>      int error;
> -    char buffer[1];
>  };
>  
>  static void microcode_fini_cpu(unsigned int cpu)
> @@ -316,6 +314,7 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>  {
>      int ret;
>      struct microcode_info *info;
> +    void * buffer;
>  
>      if ( len != (uint32_t)len )
>          return -E2BIG;
> @@ -323,28 +322,26 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      if ( microcode_ops == NULL )
>          return -EINVAL;
>  
> -    info = xmalloc_bytes(sizeof(*info) + len);
> -    if ( info == NULL )
> -        return -ENOMEM;
> -
> -    ret = copy_from_guest(info->buffer, buf, len);
> -    if ( ret != 0 )
> +    info = xmalloc(struct microcode_info);
> +    buffer = xmalloc_bytes(len);
> +    if ( !info || !buffer )
>      {
> -        xfree(info);
> -        return ret;
> +        ret = -ENOMEM;
> +        goto free;
>      }
>  
> +    ret = copy_from_guest(buffer, buf, len);
> +    if ( ret != 0 )
> +        goto free;

copy_from_guest doesn't return an errno value, you have to set ret to
EFAULT:

if ( copy_from_guest(buffer, buf, len) )
{
    ret = -EFAULT;
    goto free;
}

> +
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
>          if ( ret != 0 )
> -        {
> -            xfree(info);
> -            return ret;
> -        }
> +            goto free;
>      }
>  
> -    ret = parse_microcode_blob(info->buffer, len);
> +    ret = parse_microcode_blob(buffer, len);
>      if ( ret <= 0 )

Don't you need to free info and buffer here also?

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size
  2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
  2019-01-28 16:40   ` Roger Pau Monné
@ 2019-01-29 10:26   ` Jan Beulich
  2019-01-29 13:34     ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-01-29 10:26 UTC (permalink / raw)
  To: chao.gao; +Cc: andrew.cooper3, wei.liu2, xen-devel, roger.pau

>>> Chao Gao <chao.gao@intel.com> 01/28/19 8:06 AM >>>
>This check has been done in microcode_sanity_check(). Needn't do it
>again in get_matching_microcode().

But while there are two call sites of get_matching_microcode() only
one is preceded by a call to microcode_sanity_check(). There's also
no visible connection between cpu_request_microcode() (which frees
the blob it has validated) and microcode_resume_match() (and in
particular the blob passed into there). What am I missing?


Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-01-28  7:06 ` [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-01-29 10:37   ` Roger Pau Monné
  2019-01-29 10:45     ` Jan Beulich
  2019-01-30 13:44     ` Chao Gao
  2019-02-08 16:29   ` Jan Beulich
  1 sibling, 2 replies; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29 10:37 UTC (permalink / raw)
  To: Chao Gao
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Andrew Cooper, Jun Nakajima,
	xen-devel, Thomas Gleixner, Borislav Petkov, Ashok Raj

On Mon, Jan 28, 2019 at 03:06:49PM +0800, Chao Gao wrote:
> This patch ports microcode improvement patches from linux kernel.
> 
> Before you read any further: the early loading method is still the
> preferred one and you should always do that. The following patch is
> improving the late loading mechanism for long running jobs and cloud use
> cases.
> 
> Gather all cores and serialize the microcode update on them by doing it
> one-by-one to make the late update process as reliable as possible and
> avoid potential issues caused by the microcode update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Chao Gao <chao.gao@intel.com>
> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
>  xen/arch/x86/microcode.c | 125 +++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 98 insertions(+), 27 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 3c2274f..b7b20cf 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -22,6 +22,7 @@
>   */
>  
>  #include <xen/cpu.h>
> +#include <xen/cpumask.h>
>  #include <xen/lib.h>
>  #include <xen/kernel.h>
>  #include <xen/init.h>
> @@ -30,18 +31,25 @@
>  #include <xen/smp.h>
>  #include <xen/softirq.h>
>  #include <xen/spinlock.h>
> +#include <xen/stop_machine.h>
>  #include <xen/tasklet.h>
>  #include <xen/guest_access.h>
>  #include <xen/earlycpio.h>
> +#include <xen/watchdog.h>
>  
> +#include <asm/delay.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
>  #include <asm/setup.h>
>  #include <asm/microcode.h>
>  
> +/* By default, wait for 30000us */
> +#define MICROCODE_DEFAULT_TIMEOUT_US 30000
> +
>  static module_t __initdata ucode_mod;
>  static signed int __initdata ucode_mod_idx;
>  static bool_t __initdata ucode_mod_forced;
> +static unsigned int nr_cores;
>  
>  /*
>   * If we scan the initramfs.cpio for the early microcode code
> @@ -188,10 +196,11 @@ static DEFINE_SPINLOCK(microcode_mutex);
>  
>  DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
>  
> -struct microcode_info {
> -    unsigned int cpu;
> -    int error;
> -};
> +/*
> + * Count the CPUs that have entered and exited the rendezvous
> + * during late microcode update.
> + */
> +static atomic_t cpu_in, cpu_out;
>  
>  static void microcode_fini_cpu(unsigned int cpu)
>  {
> @@ -290,30 +299,60 @@ int microcode_resume_cpu(unsigned int cpu)
>      return microcode_ops ? microcode_update_cpu() : 0;
>  }
>  
> -static long do_microcode_update(void *_info)
> +/* Wait for all CPUs to rendezvous with a timeout (us) */
> +static int wait_for_cpus(atomic_t *cnt, unsigned int timeout)
>  {
> -    struct microcode_info *info = _info;
> -    int error;
> +    unsigned int cpus = num_online_cpus();
>  
> -    BUG_ON(info->cpu != smp_processor_id());
> +    atomic_inc(cnt);
> +
> +    while ( atomic_read(cnt) != cpus )
> +    {
> +        if ( timeout <= 0 )
> +        {
> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
> +                   smp_processor_id());
> +            return -EBUSY;
> +        }
> +        udelay(1);

udelay will call the rdtsc instruction, is it fine to use it on a
sibling thread while there's a microcode update in process on the same
core?

> +        timeout--;
> +    }
> +
> +    return 0;
> +}
> +
> +static int do_microcode_update(void *unused)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    int ret;
>  
> -    error = microcode_update_cpu();
> -    if ( error )
> -        info->error = error;
> +    ret = wait_for_cpus(&cpu_in, MICROCODE_DEFAULT_TIMEOUT_US);
> +    if ( ret )
> +        return ret;
>  
> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
> -    if ( info->cpu < nr_cpu_ids )
> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +    /*
> +     * Initiate an update on all processors which don't have an online sibling
> +     * thread with a lower thread id. Other sibling threads just await the
> +     * completion of microcode update.
> +     */
> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        ret = microcode_update_cpu();

The description says "Gather all cores and serialize the microcode
update on them by doing it one-by-one" but it looks like you are doing
the update in parallel actually?

> +    /*
> +     * Increase the wait timeout to a safe value here since we're serializing
> +     * the microcode update and that could take a while on a large number of
> +     * CPUs. And that is fine as the *actual* timeout will be determined by
> +     * the last CPU finished updating and thus cut short
> +     */
> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
> +        panic("Timeout when finishing updating microcode");
>  
> -    error = info->error;
> -    xfree(info);
> -    return error;
> +    return ret;
>  }
>  
>  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>  {
>      int ret;
> -    struct microcode_info *info;
> +    unsigned int cpu;
>      void * buffer;
>  
>      if ( len != (uint32_t)len )
> @@ -334,28 +372,61 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      if ( ret != 0 )
>          goto free;
>  
> +    /* cpu_online_map must not change during update */
> +    if ( !get_cpu_maps() )
> +    {
> +        ret = -EBUSY;
> +        goto free;
> +    }
> +
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
>          if ( ret != 0 )
> -            goto free;
> +            goto put;
>      }
>  
>      ret = parse_microcode_blob(buffer, len);
>      if ( ret <= 0 )
>      {
>          printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
> -        xfree(info);
> -        return -EINVAL;
> +        ret = -EINVAL;
> +        goto put;
>      }
>  
> -    info->error = 0;
> -    info->cpu = cpumask_first(&cpu_online_map);
> +    atomic_set(&cpu_in, 0);
> +    atomic_set(&cpu_out, 0);
> +
> +    /* Calculate the number of online CPU core */
> +    nr_cores = 0;
> +    for_each_online_cpu(cpu)
> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +            nr_cores++;
> +
> +    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
>  
> -    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();
> +    /*
> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> +     *
> +     * - HT siblings must be idle and not execute other code while the other
> +     *   sibling is loading microcode in order to avoid any negative
> +     *   interactions cause by the loading.
> +     *
> +     * - In addition, microcode update on the cores must be serialized until
> +     *   this requirement can be relaxed in the future. Right now, this is

As said above, I'm not sure what you are doing here could be
considered serialized, the previous method was clearly serialized
moving from one CPU to the next one.

Here you are likely updating multiple cores at the same time, which
I'm not saying it's wrong, but doesn't seem to match the commit
description or the comments in the code.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 2/8] microcode/intel: extend microcode_update_match()
  2019-01-28  7:06 ` [PATCH v5 2/8] microcode/intel: extend microcode_update_match() Chao Gao
  2019-01-28 16:55   ` Roger Pau Monné
@ 2019-01-29 10:41   ` Jan Beulich
  2019-01-29 13:52     ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-01-29 10:41 UTC (permalink / raw)
  To: chao.gao; +Cc: andrew.cooper3, wei.liu2, xen-devel, roger.pau

>>> Chao Gao <chao.gao@intel.com> 01/28/19 8:10 AM >>>
>--- a/xen/arch/x86/microcode_intel.c
>+++ b/xen/arch/x86/microcode_intel.c
>@@ -127,14 +127,24 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>return 0;
>}
 >
>-static inline int microcode_update_match(
>-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>-    int sig, int pf)
>+static enum microcode_match_result microcode_update_match(
>+    const void *mc, unsigned int sig, unsigned int pf, unsigned int rev)
>{
>-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>+    const struct microcode_header_intel *mc_header = mc;
>+    const struct extended_sigtable *ext_header;
>+    const struct extended_signature *ext_sig;
>+    unsigned int i;
>+
>+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
 
You may want a tristate return here: I know there are systems where
firmware updates ucode only on core 0 of every socket, in which case we'd
very much like to apply the same microcode on the other cores in case we
find the blob matching what is currently installed. IOW depending how later
patches actually work, you may also want a SAME_UCODE return case.

>+    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;

On top of what Roger has said, isn't mc + MC_HEADER_SIZE the same
as mc_header + 1?

>+    ext_sig = (const void *)ext_header + EXT_HEADER_SIZE;

And (const void *)ext_header + EXT_HEADER_SIZE the same as
(const void *)(ext_header + 1)?

In both cases this would eliminate unnecessary implications of certain
two sub-terms to refer to the same types, i.e. also make the casts less
scary / dangerous.


>--- a/xen/include/asm-x86/microcode.h
>+++ b/xen/include/asm-x86/microcode.h
>@@ -3,6 +3,12 @@
 >
>#include <xen/percpu.h>
 >
>+enum microcode_match_result {
>+    OLD_UCODE, /* signature matched, but revision id isn't newer */
>+    NEW_UCODE, /* signature matched, but revision id is newer */
>+    MIS_UCODE, /* signature mismatched */
>+};

It's not clear at this point of the series or from the commit message whether
this is to be used by AMD code as well. If not, it would better move into
microcode_intel.c.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-01-29 10:37   ` Roger Pau Monné
@ 2019-01-29 10:45     ` Jan Beulich
  2019-01-30 13:44     ` Chao Gao
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-01-29 10:45 UTC (permalink / raw)
  To: roger.pau
  Cc: kevin.tian, wei.liu2, ashok.raj, andrew.cooper3, jun.nakajima,
	xen-devel, tglx, bp, chao.gao

>>> Roger Pau Monné <roger.pau@citrix.com> 01/29/19 11:39 AM >>>
>On Mon, Jan 28, 2019 at 03:06:49PM +0800, Chao Gao wrote:
>> +    /*
>> +     * Initiate an update on all processors which don't have an online sibling
>> +     * thread with a lower thread id. Other sibling threads just await the
>> +     * completion of microcode update.
>> +     */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        ret = microcode_update_cpu();
>
>The description says "Gather all cores and serialize the microcode
>update on them by doing it one-by-one" but it looks like you are doing
>the update in parallel actually?

As discussed in the context of earlier versions, the apparent parallelism
here goes away immediately inside microcode_update_cpu(), which

acquires a global spin lock first thing.


Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-01-28  7:06 ` [PATCH v5 8/8] microcode: update microcode on cores in parallel Chao Gao
@ 2019-01-29 11:27   ` Roger Pau Monné
  2019-01-30 13:36     ` Chao Gao
  2019-02-12 12:51   ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29 11:27 UTC (permalink / raw)
  To: Chao Gao; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Mon, Jan 28, 2019 at 03:06:50PM +0800, Chao Gao wrote:
> Currently, microcode_update_lock and microcode_mutex prevent cores
> from updating microcode in parallel. Below changes are made to support
> parallel microcode update on cores.

Oh, that's what I missed from the previous patch then, and what
serialises the applying of the microcode update.

> 
> microcode_update_lock is removed. The purpose of this lock is to
> prevent logic threads of a same core from updating microcode at the
> same time. But due to using a global lock, it also prevents parallel
> microcode updating on different cores. The original purpose of
> microcode_update_lock is already enforced at the level of
> apply_microcode()'s caller:
> 1. For late microcode update, only one sibiling thread of a core will
> call the apply_microcode().
> 2. For microcode update during system startup or CPU-hotplug, each
> logical thread is woken up one-by-one.
> 3. get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
> late microcode update.
> 
> microcode_mutex is replaced by a rwlock. microcode_mutex was used to
> prevent concurrent accesses to 'uci' and microcode_cache. Now the
> per-cpu variable, 'uci', won't be accessed by remote cpus after most
> fields in 'uci' have been removed; The only shared resource which
> needs to be protected is the microcode_cache. A rwlock allows multiple
> readers (one thread of each core) to access the global cache and
> update microcode simultaneously. Because the rwlock may be held in
> stop_machine context, where interrupt is disabled, irq{save, restore}
> variants are used to get/release the rwlock.
> 
> Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
> only) are still processed sequentially.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>

Thanks, this LGTM, just one question below.

> @@ -285,10 +307,11 @@ static int parse_microcode_blob(const void *buffer, size_t len)
>  static int microcode_update_cpu(void)
>  {
>      int ret;
> +    unsigned long flag;
>  
> -    spin_lock(&microcode_mutex);
> +    read_lock_irqsave(&cache_rwlock, flag);
>      ret = microcode_ops->apply_microcode(smp_processor_id());
> -    spin_unlock(&microcode_mutex);
> +    read_unlock_irqrestore(&cache_rwlock, flag);

Why do you take the lock here, wouldn't it be better to just take it
for find_patch? (ie: like you do for save_patch)

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 0/8] improve late microcode loading
  2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
                   ` (7 preceding siblings ...)
  2019-01-28  7:06 ` [PATCH v5 8/8] microcode: update microcode on cores in parallel Chao Gao
@ 2019-01-29 11:31 ` Roger Pau Monné
  2019-01-29 12:11   ` Chao Gao
  8 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29 11:31 UTC (permalink / raw)
  To: Chao Gao
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel,
	Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit

On Mon, Jan 28, 2019 at 03:06:42PM +0800, Chao Gao wrote:
> Changes in this version:
>  - support parallel microcode updates for all cores (see patch 8)
>  - Address Roger's comments on the last version.
> 
> The intention of this series is to make the late microcode loading
> more reliable by rendezvousing all cpus in stop_machine context.
> This idea comes from Ashok. I am porting his linux patch to Xen
> (see patch 7 for more details).
> 
> This series makes three changes:
>  1. Patch 1-6: introduce a global microcode cache
>  2. Patch 7: synchronize late microcode loading
>  3. Patch 8: support parallel microcodes update on different cores
> 
> Currently, late microcode loading does a lot of things including
> parsing microcode blob, checking the signature/revision and performing
> update. Putting all of them into stop_machine context is a bad idea
> because of complexity (One issue I observed is memory allocation
> triggered one assertion in stop_machine context). In order to simplify
> the load process, I move parsing microcode out of the load process.
> The microcode blob is parsed and a global microcode cache is built on
> a single CPU before rendezvousing all cpus to update microcode. Other
> CPUs just get and load a suitable microcode from the global cache.
> With this global cache, it is safe to put simplified load process to
> stop_machine context.
> 
> Regarding changes to AMD side, I didn't do any test for them due to
> lack of hardware. Could you help to test this series on an AMD machine?
> At least, two basic tests are needed:
> * do a microcode update after system bootup
> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
>   command line and then do a microcode update and online all offlined
>   CPUs via 'xen-hptool'.
> 

Thanks for the series, I think it's a good improvement to current
microcode loading.

I would like to ask how have you tested the series, I don't seem to
find any tool in the current tree to load a microcode to Xen. The only
thing I've found is:

https://lists.xen.org/archives/html/xen-devel/2013-07/txtpyXvYZGRwb.txt

Have you used this tool to test the code?

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 0/8] improve late microcode loading
  2019-01-29 11:31 ` [PATCH v5 0/8] improve late microcode loading Roger Pau Monné
@ 2019-01-29 12:11   ` Chao Gao
  2019-01-29 14:17     ` Roger Pau Monné
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-01-29 12:11 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel,
	Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit

[-- Attachment #1: Type: text/plain, Size: 2333 bytes --]

On Tue, Jan 29, 2019 at 12:31:51PM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:42PM +0800, Chao Gao wrote:
>> Changes in this version:
>>  - support parallel microcode updates for all cores (see patch 8)
>>  - Address Roger's comments on the last version.
>> 
>> The intention of this series is to make the late microcode loading
>> more reliable by rendezvousing all cpus in stop_machine context.
>> This idea comes from Ashok. I am porting his linux patch to Xen
>> (see patch 7 for more details).
>> 
>> This series makes three changes:
>>  1. Patch 1-6: introduce a global microcode cache
>>  2. Patch 7: synchronize late microcode loading
>>  3. Patch 8: support parallel microcodes update on different cores
>> 
>> Currently, late microcode loading does a lot of things including
>> parsing microcode blob, checking the signature/revision and performing
>> update. Putting all of them into stop_machine context is a bad idea
>> because of complexity (One issue I observed is memory allocation
>> triggered one assertion in stop_machine context). In order to simplify
>> the load process, I move parsing microcode out of the load process.
>> The microcode blob is parsed and a global microcode cache is built on
>> a single CPU before rendezvousing all cpus to update microcode. Other
>> CPUs just get and load a suitable microcode from the global cache.
>> With this global cache, it is safe to put simplified load process to
>> stop_machine context.
>> 
>> Regarding changes to AMD side, I didn't do any test for them due to
>> lack of hardware. Could you help to test this series on an AMD machine?
>> At least, two basic tests are needed:
>> * do a microcode update after system bootup
>> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
>>   command line and then do a microcode update and online all offlined
>>   CPUs via 'xen-hptool'.
>> 
>
>Thanks for the series, I think it's a good improvement to current
>microcode loading.
>
>I would like to ask how have you tested the series, I don't seem to
>find any tool in the current tree to load a microcode to Xen. The only
>thing I've found is:
>
>https://lists.xen.org/archives/html/xen-devel/2013-07/txtpyXvYZGRwb.txt
>
>Have you used this tool to test the code?

Yes. I am using this patch with some issues fixed.

Thanks
Chao


[-- Attachment #2: 0001-misc-xenmicrocode-Upload-lib-firmware-some-blob-to-t.patch --]
[-- Type: text/x-diff, Size: 5111 bytes --]

>From ff00662f3e35f661a06dbf143150f300664d5e93 Mon Sep 17 00:00:00 2001
From: Chao Gao <chao.gao@intel.com>
Date: Fri, 9 Mar 2018 20:01:53 +0800
Subject: [PATCH 1/9] misc/xenmicrocode: Upload /lib/firmware/<some blob> to
 the hypervisor

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 tools/libxc/include/xenctrl.h |  1 +
 tools/libxc/xc_misc.c         | 19 +++++++++++
 tools/misc/Makefile           |  4 +++
 tools/misc/xenmicrocode.c     | 73 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 97 insertions(+)
 create mode 100644 tools/misc/xenmicrocode.c

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 31cdda7..c69699b 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1245,6 +1245,7 @@ typedef uint32_t xc_node_to_node_dist_t;
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo);
+int xc_platform_op(xc_interface *xch, struct xen_platform_op *op);
 int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
                 xc_meminfo_t *meminfo, uint32_t *distance);
 int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 5e6714a..76e1bd5 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -226,6 +226,25 @@ int xc_physinfo(xc_interface *xch,
     return 0;
 }
 
+int xc_platform_op(xc_interface *xch, struct xen_platform_op *op)
+{
+    int ret = 0;
+    DECLARE_PLATFORM_OP;
+    DECLARE_HYPERCALL_BOUNCE(op, sizeof(*op), XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+
+    if ( xc_hypercall_bounce_pre(xch, op) )
+    {
+        PERROR("Could not bounce xen_platform_op memory buffer");
+        return -1;
+    }
+    op->interface_version = XENPF_INTERFACE_VERSION;
+
+    platform_op = *op;
+    ret = do_platform_op(xch, &platform_op);
+    xc_hypercall_bounce_post(xch, op);
+    return ret;
+}
+
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo)
 {
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index eaa2879..d522101 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -23,6 +23,7 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmcrash
 INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmctx
 INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
 INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86)     += xenmicrocode
 INSTALL_SBIN                   += xen-ringwatch
 INSTALL_SBIN                   += xen-tmem-list-parse
 INSTALL_SBIN                   += xencov
@@ -118,4 +119,7 @@ xen-lowmemd: xen-lowmemd.o
 xencov: xencov.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xenmicrocode: xenmicrocode.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 -include $(DEPS_INCLUDE)
diff --git a/tools/misc/xenmicrocode.c b/tools/misc/xenmicrocode.c
new file mode 100644
index 0000000..ba16d21
--- /dev/null
+++ b/tools/misc/xenmicrocode.c
@@ -0,0 +1,73 @@
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+#include <inttypes.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <xenctrl.h>
+
+int main(int argc, char *argv[])
+{
+    int fd = 0;
+    unsigned char *fbuf;
+    int len;
+    xc_interface *xc_handle;
+    char *filename;
+    struct stat buf;
+    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
+    struct xen_platform_op op;
+    int ret;
+
+    filename = argv[1];
+    fd = open(filename, O_RDONLY);
+    if (fd <= 0) {
+        printf("Could not open; err: %d(%s)\n", errno, strerror(errno));
+        return errno;
+    }
+    if (stat(filename, &buf) != 0) {
+        printf("Could not open; err: %d(%s)\n", errno, strerror(errno));
+        return errno;
+    }
+
+    printf("%s: %ld\n", filename, buf.st_size);
+    len = buf.st_size;
+    fbuf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
+    if ( (xc_handle = xc_interface_open(0,0,0)) == 0 )
+    {
+        fprintf(stderr, "Error opening xc interface: %d (%s)\n",
+                errno, strerror(errno));
+        return 1;
+    }
+    if (fbuf == MAP_FAILED) {
+        printf("Could not map: error: %d(%s)\n", errno,
+               strerror(errno));
+        return errno;
+    }
+
+    uc = xc_hypercall_buffer_alloc(xc_handle, uc, len);
+    memcpy(uc, fbuf, len);
+
+    set_xen_guest_handle(op.u.microcode.data, uc);
+    op.cmd = XENPF_microcode_update;
+    op.interface_version = XENPF_INTERFACE_VERSION;
+    op.u.microcode.length = len;
+    ret = xc_platform_op(xc_handle, &op);
+    if ( ret )
+        fprintf(stderr, "Error in xc_platform_op %d\n", ret);
+
+    xc_hypercall_buffer_free(xc_handle, uc);
+    xc_interface_close(xc_handle);
+
+    if (munmap(fbuf, len)) {
+        printf("Could not unmap: %d(%s)\n", errno, strerror(errno));
+        return errno;
+    }
+    close(fd);
+    return 0;
+}
-- 
1.8.3.1


[-- Attachment #3: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-01-29  9:58   ` Roger Pau Monné
@ 2019-01-29 12:47     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-29 12:47 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Jan 29, 2019 at 10:58:11AM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:47PM +0800, Chao Gao wrote:
>> During late microcode update, apply_microcode() is invoked in
>> cpu_request_microcode(). To make late microcode update more reliable,
>> we want to put the apply_microcode() into stop_machine context. So
>> we split out it from cpu_request_microcode(). As a consequence,
>> apply_microcode() should be invoked explicitly in the common code.
>> 
>> Also with the global ucode cache, microcode parsing only needs
>> to be done once; cpu_request_microcode() is also moved out of
>> microcode_update_cpu().
>> 
>> On AMD side, svm_host_osvw_init() is supposed to be called after
>> microcode update. As apply_micrcode() won't be called by
>> cpu_request_microcode() now, svm_host_osvw_init() is also moved to the
>> end of apply_microcode().
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>  xen/arch/x86/microcode.c       | 80 +++++++++++++++++++++++++-----------------
>>  xen/arch/x86/microcode_amd.c   | 23 +++++++-----
>>  xen/arch/x86/microcode_intel.c | 20 ++---------
>>  3 files changed, 64 insertions(+), 59 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 0c77e90..936f0b8 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -254,47 +254,42 @@ struct microcode_patch *find_patch(unsigned int cpu)
>>      return NULL;
>>  }
>>  
>> -int microcode_resume_cpu(unsigned int cpu)
>> +/*
>> + * Return the number of ucode patch inserted to the global cache.
>> + * Return negtive value on error.
>> + */
>> +static int parse_microcode_blob(const void *buffer, size_t len)
>>  {
>> -    int err;
>> +    unsigned int cpu = smp_processor_id();
>>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> -
>> -    if ( !microcode_ops )
>> -        return 0;
>> +    int ret;
>>  
>>      spin_lock(&microcode_mutex);
>> -
>> -    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>> -    if ( err )
>> -    {
>> +    ret = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>> +    if ( likely(!ret) )
>> +        ret = microcode_ops->cpu_request_microcode(cpu, buffer, len);
>> +    else
>>          microcode_fini_cpu(cpu);
>> -        spin_unlock(&microcode_mutex);
>> -        return err;
>> -    }
>> -
>> -    err = microcode_ops->apply_microcode(cpu);
>>      spin_unlock(&microcode_mutex);
>>  
>> -    return err;
>> +    return ret;
>>  }
>>  
>> -static int microcode_update_cpu(const void *buf, size_t size)
>> +static int microcode_update_cpu(void)
>>  {
>> -    int err;
>> -    unsigned int cpu = smp_processor_id();
>> -    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> +    int ret;
>>  
>>      spin_lock(&microcode_mutex);
>> -
>> -    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
>> -    if ( likely(!err) )
>> -        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
>> -    else
>> -        microcode_fini_cpu(cpu);
>> -
>> +    ret = microcode_ops->apply_microcode(smp_processor_id());
>
>I've realized that most of this helpers take a cpu id parameter
>(apply_microcode, collect_cpu_info and cpu_request_microcode), but
>that at least on Intel they are required to be called on the current
>CPU, at which point I wonder about their purpose. IMO they just make
>the interface more messy, without adding any functionality.

I agree with you and will remove the 'cpu' parameter from the interfaces
you mentioned above.

>
>>      spin_unlock(&microcode_mutex);
>>  
>> -    return err;
>> +    return ret;
>> +}
>> +
>> +int microcode_resume_cpu(unsigned int cpu)
>> +{
>> +    BUG_ON(cpu != smp_processor_id());
>
>Same here, what's the point of passing a cpu id as parameter when
>there's only one valid value?
>
>> +    return microcode_ops ? microcode_update_cpu() : 0;
>>  }
>>  
>>  static long do_microcode_update(void *_info)
>> @@ -304,7 +299,7 @@ static long do_microcode_update(void *_info)
>>  
>>      BUG_ON(info->cpu != smp_processor_id());
>>  
>> -    error = microcode_update_cpu(info->buffer, info->buffer_size);
>> +    error = microcode_update_cpu();
>>      if ( error )
>>          info->error = error;
>>  
>> @@ -339,10 +334,6 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>          return ret;
>>      }
>>  
>> -    info->buffer_size = len;
>> -    info->error = 0;
>> -    info->cpu = cpumask_first(&cpu_online_map);
>> -
>>      if ( microcode_ops->start_update )
>>      {
>>          ret = microcode_ops->start_update();
>> @@ -353,6 +344,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>          }
>>      }
>>  
>> +    ret = parse_microcode_blob(info->buffer, len);
>> +    if ( ret <= 0 )
>> +    {
>> +        printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
>> +        xfree(info);
>> +        return -EINVAL;
>> +    }
>> +
>> +    info->buffer_size = len;
>> +    info->error = 0;
>> +    info->cpu = cpumask_first(&cpu_online_map);
>
>It looks like you can also get rid of the cpu field in microcode_info,
>since it's always smp_processor_id?

Yes. Will remove the 'cpu' field.

>
>> +
>>      return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>>  }
>>  
>> @@ -396,13 +399,24 @@ int __init early_microcode_update_cpu(bool start_update)
>>      }
>>      if ( data )
>>      {
>> +        static bool parsed = false;
>> +
>>          if ( start_update && microcode_ops->start_update )
>>              rc = microcode_ops->start_update();
>>  
>>          if ( rc )
>>              return rc;
>>  
>> -        return microcode_update_cpu(data, len);
>> +        if ( !parsed )
>> +        {
>> +            rc = parse_microcode_blob(data, len);
>> +            parsed = true;
>> +
>> +            if ( rc <= 0 )
>> +                return -EINVAL;
>> +        }
>> +
>> +        return microcode_update_cpu();
>
>Hm, the usage of parsed here is kind of dangerous. I assume this works
>fine because early_microcode_update_cpu is called from the BSP in
>early_microcode_init, and then concurrent calls done by the APs always
>see parsed as true.

I think APs are woken up one-by-one. See the call site of cpu_up in
__start_xen.

>
>I would however recommend that you move the parsing to
>early_microcode_init, and that early_microcode_update_cpu always
>assume the blob has been parsed.
>
>If that doesn't work for some reason, I would then recommend that you
>gate the parsing based on cpu_id, so "smp_processor_id() == 0"

Do you think it is better:
remove the parsed flag and check whether current cpu is bsp in
early_microcode_init(); bsp calls the early_microcode_update_cpu(). APs
just call microcode_update_cpu().

>
>>      }
>>      else
>>          return -ENOMEM;
>> diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
>> index d86a596..80e274e 100644
>> --- a/xen/arch/x86/microcode_amd.c
>> +++ b/xen/arch/x86/microcode_amd.c
>> @@ -297,6 +297,10 @@ static int apply_microcode(unsigned int cpu)
>>  
>>      uci->cpu_sig.rev = rev;
>>  
>> +#if CONFIG_HVM
>> +    svm_host_osvw_init();
>> +#endif
>> +
>>      return 0;
>>  }
>>  
>> @@ -463,6 +467,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
>>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>>      unsigned int current_cpu_id;
>>      unsigned int equiv_cpu_id;
>> +    unsigned int matched_cnt = 0;
>>  
>>      /* We should bind the task to the CPU */
>>      BUG_ON(cpu != raw_smp_processor_id());
>> @@ -564,11 +569,15 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
>>           * this ucode patch before checking whether it matches with
>>           * current CPU.
>>           */
>> -        if ( save_patch(microcode_patch) && microcode_fits(mc_amd, cpu) )
>> +        if ( save_patch(microcode_patch) )
>>          {
>> -            error = apply_microcode(cpu);
>> -            if ( error )
>> -                break;
>> +            matched_cnt++;
>> +            if ( microcode_fits(mc_amd, cpu) )
>> +            {
>> +                error = apply_microcode(cpu);
>> +                if ( error )
>> +                    break;
>> +            }
>
>In the commit message you mention that apply_microcode won't be called
>by cpu_request_microcode, yet it seems it's called?

lines below "matched_cnt++" should be removed. They slipped in when I
did a rebasing.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info
  2019-01-29  9:25   ` Roger Pau Monné
@ 2019-01-29 13:27     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-29 13:27 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Jan 29, 2019 at 10:25:03AM +0100, Roger Pau Monné wrote:
>Thanks for the cleanup!
>
>On Mon, Jan 28, 2019 at 03:06:46PM +0800, Chao Gao wrote:
>> diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
>> index fc98fed..507da2e 100644
>> --- a/xen/include/asm-x86/microcode.h
>> +++ b/xen/include/asm-x86/microcode.h
>> @@ -19,7 +19,6 @@ struct microcode_patch {
>>  };
>>  
>>  struct microcode_ops {
>> -    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
>>      int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
>>                                   size_t size);
>>      int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
>> @@ -39,11 +38,6 @@ struct cpu_signature {
>>  
>>  struct ucode_cpu_info {
>>      struct cpu_signature cpu_sig;
>> -    union {
>> -        struct microcode_intel *mc_intel;
>> -        struct microcode_amd *mc_amd;
>> -        void *mc_valid;
>> -    } mc;
>>  };
>
>Is there really a need for such structure since it only has one field
>now?
>
>I'm trying to figure out whether this is expanded by further patches,
>but it seems like it's not, if so please remove the struct altogether.
>
>I'm also wondering whether it's needed to store the cpu signature in
>the pcpu area, AFAICT you always call collect_cpu_info before
>apply_microcode, at which point cpu_signature could be stored in the
>stack and passed to apply_microcode as a parameter?
>
>Or apply_microcode could call collect_cpu_info directly. Getting rid
>of the pcpu field would also allow you to get rid of
>microcode_fini_cpu, further cleaning the code.

Your suggestions are viable and will follow them.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size
  2019-01-29 10:26   ` Jan Beulich
@ 2019-01-29 13:34     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-29 13:34 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, wei.liu2, xen-devel, roger.pau

On Tue, Jan 29, 2019 at 03:26:18AM -0700, Jan Beulich wrote:
>>>> Chao Gao <chao.gao@intel.com> 01/28/19 8:06 AM >>>
>>This check has been done in microcode_sanity_check(). Needn't do it
>>again in get_matching_microcode().
>
>But while there are two call sites of get_matching_microcode() only
>one is preceded by a call to microcode_sanity_check(). There's also
>no visible connection between cpu_request_microcode() (which frees
>the blob it has validated) and microcode_resume_match() (and in
>particular the blob passed into there). What am I missing?

microcode_resume_match() takes a cached blob; the blob should have passed
the sanity check. I will explain it in the commit message.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 2/8] microcode/intel: extend microcode_update_match()
  2019-01-29 10:41   ` Jan Beulich
@ 2019-01-29 13:52     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-29 13:52 UTC (permalink / raw)
  To: Jan Beulich; +Cc: andrew.cooper3, wei.liu2, xen-devel, roger.pau

On Tue, Jan 29, 2019 at 03:41:09AM -0700, Jan Beulich wrote:
>>>> Chao Gao <chao.gao@intel.com> 01/28/19 8:10 AM >>>
>>--- a/xen/arch/x86/microcode_intel.c
>>+++ b/xen/arch/x86/microcode_intel.c
>>@@ -127,14 +127,24 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
>>return 0;
>>}
> >
>>-static inline int microcode_update_match(
>>-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
>>-    int sig, int pf)
>>+static enum microcode_match_result microcode_update_match(
>>+    const void *mc, unsigned int sig, unsigned int pf, unsigned int rev)
>>{
>>-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>>+    const struct microcode_header_intel *mc_header = mc;
>>+    const struct extended_sigtable *ext_header;
>>+    const struct extended_signature *ext_sig;
>>+    unsigned int i;
>>+
>>+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
>>+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
> 
>You may want a tristate return here: I know there are systems where
>firmware updates ucode only on core 0 of every socket, in which case we'd
>very much like to apply the same microcode on the other cores in case we
>find the blob matching what is currently installed. IOW depending how later
>patches actually work, you may also want a SAME_UCODE return case.

It seems we don't need it. For CPU hot-plug, we would also meet the same
issue you described. To resolve it, a ucode patch is added to the global
cache without checking against the revision on current CPU even if the
signature is matched.

>
>>+    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
>
>On top of what Roger has said, isn't mc + MC_HEADER_SIZE the same
>as mc_header + 1?
>
>>+    ext_sig = (const void *)ext_header + EXT_HEADER_SIZE;
>
>And (const void *)ext_header + EXT_HEADER_SIZE the same as
>(const void *)(ext_header + 1)?
>
>In both cases this would eliminate unnecessary implications of certain
>two sub-terms to refer to the same types, i.e. also make the casts less
>scary / dangerous.
>

Will do.

>
>>--- a/xen/include/asm-x86/microcode.h
>>+++ b/xen/include/asm-x86/microcode.h
>>@@ -3,6 +3,12 @@
> >
>>#include <xen/percpu.h>
> >
>>+enum microcode_match_result {
>>+    OLD_UCODE, /* signature matched, but revision id isn't newer */
>>+    NEW_UCODE, /* signature matched, but revision id is newer */
>>+    MIS_UCODE, /* signature mismatched */
>>+};
>
>It's not clear at this point of the series or from the commit message whether
>this is to be used by AMD code as well. If not, it would better move into
>microcode_intel.c.

It will be used by AMD code. Will mention this in commit message.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info
  2019-01-29 10:10   ` Roger Pau Monné
@ 2019-01-29 14:11     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-29 14:11 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Jan 29, 2019 at 11:10:51AM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:48PM +0800, Chao Gao wrote:
>> microcode pointer and size were passed to other CPUs to parse
>> microcode locally. Now, parsing microcode is done on one CPU.
>> Other CPUs needn't parse the microcode blob; the pointer and
>> size can be removed.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>  xen/arch/x86/microcode.c | 33 +++++++++++++++++----------------
>>  1 file changed, 17 insertions(+), 16 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 936f0b8..3c2274f 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -190,9 +190,7 @@ DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
>>  
>>  struct microcode_info {
>>      unsigned int cpu;
>
>I think cpu can also be removed as my previous reply to patch 5, at
>which point you only need to store an error? In which case you should
>also remove this struct then.

I agree to all your comments on this patch. Will correct them.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 0/8] improve late microcode loading
  2019-01-29 12:11   ` Chao Gao
@ 2019-01-29 14:17     ` Roger Pau Monné
  0 siblings, 0 replies; 50+ messages in thread
From: Roger Pau Monné @ 2019-01-29 14:17 UTC (permalink / raw)
  To: Chao Gao
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel,
	Boris Ostrovsky, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

On Tue, Jan 29, 2019 at 08:11:33PM +0800, Chao Gao wrote:
> On Tue, Jan 29, 2019 at 12:31:51PM +0100, Roger Pau Monné wrote:
> >On Mon, Jan 28, 2019 at 03:06:42PM +0800, Chao Gao wrote:
> >> Changes in this version:
> >>  - support parallel microcode updates for all cores (see patch 8)
> >>  - Address Roger's comments on the last version.
> >> 
> >> The intention of this series is to make the late microcode loading
> >> more reliable by rendezvousing all cpus in stop_machine context.
> >> This idea comes from Ashok. I am porting his linux patch to Xen
> >> (see patch 7 for more details).
> >> 
> >> This series makes three changes:
> >>  1. Patch 1-6: introduce a global microcode cache
> >>  2. Patch 7: synchronize late microcode loading
> >>  3. Patch 8: support parallel microcodes update on different cores
> >> 
> >> Currently, late microcode loading does a lot of things including
> >> parsing microcode blob, checking the signature/revision and performing
> >> update. Putting all of them into stop_machine context is a bad idea
> >> because of complexity (One issue I observed is memory allocation
> >> triggered one assertion in stop_machine context). In order to simplify
> >> the load process, I move parsing microcode out of the load process.
> >> The microcode blob is parsed and a global microcode cache is built on
> >> a single CPU before rendezvousing all cpus to update microcode. Other
> >> CPUs just get and load a suitable microcode from the global cache.
> >> With this global cache, it is safe to put simplified load process to
> >> stop_machine context.
> >> 
> >> Regarding changes to AMD side, I didn't do any test for them due to
> >> lack of hardware. Could you help to test this series on an AMD machine?
> >> At least, two basic tests are needed:
> >> * do a microcode update after system bootup
> >> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
> >>   command line and then do a microcode update and online all offlined
> >>   CPUs via 'xen-hptool'.
> >> 
> >
> >Thanks for the series, I think it's a good improvement to current
> >microcode loading.
> >
> >I would like to ask how have you tested the series, I don't seem to
> >find any tool in the current tree to load a microcode to Xen. The only
> >thing I've found is:
> >
> >https://lists.xen.org/archives/html/xen-devel/2013-07/txtpyXvYZGRwb.txt
> >
> >Have you used this tool to test the code?
> 
> Yes. I am using this patch with some issues fixed.

Could you please include it in the next version of the series?

We really need this tool in-tree IMO.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-01-29 11:27   ` Roger Pau Monné
@ 2019-01-30 13:36     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-30 13:36 UTC (permalink / raw)
  To: Roger Pau Monné; +Cc: xen-devel, Wei Liu, Jan Beulich, Andrew Cooper

On Tue, Jan 29, 2019 at 12:27:30PM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:50PM +0800, Chao Gao wrote:
>> Currently, microcode_update_lock and microcode_mutex prevent cores
>> from updating microcode in parallel. Below changes are made to support
>> parallel microcode update on cores.
>
>Oh, that's what I missed from the previous patch then, and what
>serialises the applying of the microcode update.
>
>> 
>> microcode_update_lock is removed. The purpose of this lock is to
>> prevent logic threads of a same core from updating microcode at the
>> same time. But due to using a global lock, it also prevents parallel
>> microcode updating on different cores. The original purpose of
>> microcode_update_lock is already enforced at the level of
>> apply_microcode()'s caller:
>> 1. For late microcode update, only one sibiling thread of a core will
>> call the apply_microcode().
>> 2. For microcode update during system startup or CPU-hotplug, each
>> logical thread is woken up one-by-one.
>> 3. get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
>> late microcode update.
>> 
>> microcode_mutex is replaced by a rwlock. microcode_mutex was used to
>> prevent concurrent accesses to 'uci' and microcode_cache. Now the
>> per-cpu variable, 'uci', won't be accessed by remote cpus after most
>> fields in 'uci' have been removed; The only shared resource which
>> needs to be protected is the microcode_cache. A rwlock allows multiple
>> readers (one thread of each core) to access the global cache and
>> update microcode simultaneously. Because the rwlock may be held in
>> stop_machine context, where interrupt is disabled, irq{save, restore}
>> variants are used to get/release the rwlock.
>> 
>> Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
>> only) are still processed sequentially.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>
>Thanks, this LGTM, just one question below.
>
>> @@ -285,10 +307,11 @@ static int parse_microcode_blob(const void *buffer, size_t len)
>>  static int microcode_update_cpu(void)
>>  {
>>      int ret;
>> +    unsigned long flag;
>>  
>> -    spin_lock(&microcode_mutex);
>> +    read_lock_irqsave(&cache_rwlock, flag);
>>      ret = microcode_ops->apply_microcode(smp_processor_id());
>> -    spin_unlock(&microcode_mutex);
>> +    read_unlock_irqrestore(&cache_rwlock, flag);
>
>Why do you take the lock here, wouldn't it be better to just take it
>for find_patch? (ie: like you do for save_patch)

Because find_patch() is expected to return a pointer to a ucode patch.
If a thread is to load this ucode patch, we should hold the lock to
avoid it being freed by another thread.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-01-29 10:37   ` Roger Pau Monné
  2019-01-29 10:45     ` Jan Beulich
@ 2019-01-30 13:44     ` Chao Gao
  1 sibling, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-01-30 13:44 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Kevin Tian, Wei Liu, Jan Beulich, Andrew Cooper, Jun Nakajima,
	xen-devel, Thomas Gleixner, Borislav Petkov, Ashok Raj

On Tue, Jan 29, 2019 at 11:37:12AM +0100, Roger Pau Monné wrote:
>On Mon, Jan 28, 2019 at 03:06:49PM +0800, Chao Gao wrote:
>> This patch ports microcode improvement patches from linux kernel.
>> 
>> Before you read any further: the early loading method is still the
>> preferred one and you should always do that. The following patch is
>> improving the late loading mechanism for long running jobs and cloud use
>> cases.
>> 
>> Gather all cores and serialize the microcode update on them by doing it
>> one-by-one to make the late update process as reliable as possible and
>> avoid potential issues caused by the microcode update.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Tested-by: Chao Gao <chao.gao@intel.com>
>> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
>> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> ---
>>  xen/arch/x86/microcode.c | 125 +++++++++++++++++++++++++++++++++++++----------
>>  1 file changed, 98 insertions(+), 27 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 3c2274f..b7b20cf 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -22,6 +22,7 @@
>>   */
>>  
>>  #include <xen/cpu.h>
>> +#include <xen/cpumask.h>
>>  #include <xen/lib.h>
>>  #include <xen/kernel.h>
>>  #include <xen/init.h>
>> @@ -30,18 +31,25 @@
>>  #include <xen/smp.h>
>>  #include <xen/softirq.h>
>>  #include <xen/spinlock.h>
>> +#include <xen/stop_machine.h>
>>  #include <xen/tasklet.h>
>>  #include <xen/guest_access.h>
>>  #include <xen/earlycpio.h>
>> +#include <xen/watchdog.h>
>>  
>> +#include <asm/delay.h>
>>  #include <asm/msr.h>
>>  #include <asm/processor.h>
>>  #include <asm/setup.h>
>>  #include <asm/microcode.h>
>>  
>> +/* By default, wait for 30000us */
>> +#define MICROCODE_DEFAULT_TIMEOUT_US 30000
>> +
>>  static module_t __initdata ucode_mod;
>>  static signed int __initdata ucode_mod_idx;
>>  static bool_t __initdata ucode_mod_forced;
>> +static unsigned int nr_cores;
>>  
>>  /*
>>   * If we scan the initramfs.cpio for the early microcode code
>> @@ -188,10 +196,11 @@ static DEFINE_SPINLOCK(microcode_mutex);
>>  
>>  DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
>>  
>> -struct microcode_info {
>> -    unsigned int cpu;
>> -    int error;
>> -};
>> +/*
>> + * Count the CPUs that have entered and exited the rendezvous
>> + * during late microcode update.
>> + */
>> +static atomic_t cpu_in, cpu_out;
>>  
>>  static void microcode_fini_cpu(unsigned int cpu)
>>  {
>> @@ -290,30 +299,60 @@ int microcode_resume_cpu(unsigned int cpu)
>>      return microcode_ops ? microcode_update_cpu() : 0;
>>  }
>>  
>> -static long do_microcode_update(void *_info)
>> +/* Wait for all CPUs to rendezvous with a timeout (us) */
>> +static int wait_for_cpus(atomic_t *cnt, unsigned int timeout)
>>  {
>> -    struct microcode_info *info = _info;
>> -    int error;
>> +    unsigned int cpus = num_online_cpus();
>>  
>> -    BUG_ON(info->cpu != smp_processor_id());
>> +    atomic_inc(cnt);
>> +
>> +    while ( atomic_read(cnt) != cpus )
>> +    {
>> +        if ( timeout <= 0 )
>> +        {
>> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
>> +                   smp_processor_id());
>> +            return -EBUSY;
>> +        }
>> +        udelay(1);
>
>udelay will call the rdtsc instruction, is it fine to use it on a
>sibling thread while there's a microcode update in process on the same
>core?

I think it is ok. I just checked the kernel code. It uses the ndelay,
which in turn calls the rdtsc instruction in some cases.

Ashok, what's your opinion?

>
>> +        timeout--;
>> +    }
>> +
>> +    return 0;
>> +}
>> +
>> +static int do_microcode_update(void *unused)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    int ret;
>>  
>> -    error = microcode_update_cpu();
>> -    if ( error )
>> -        info->error = error;
>> +    ret = wait_for_cpus(&cpu_in, MICROCODE_DEFAULT_TIMEOUT_US);
>> +    if ( ret )
>> +        return ret;
>>  
>> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
>> -    if ( info->cpu < nr_cpu_ids )
>> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>> +    /*
>> +     * Initiate an update on all processors which don't have an online sibling
>> +     * thread with a lower thread id. Other sibling threads just await the
>> +     * completion of microcode update.
>> +     */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        ret = microcode_update_cpu();
>
>The description says "Gather all cores and serialize the microcode
>update on them by doing it one-by-one" but it looks like you are doing
>the update in parallel actually?
>
>> +    /*
>> +     * Increase the wait timeout to a safe value here since we're serializing
>> +     * the microcode update and that could take a while on a large number of
>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>> +     * the last CPU finished updating and thus cut short
>> +     */
>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
>> +        panic("Timeout when finishing updating microcode");
>>  
>> -    error = info->error;
>> -    xfree(info);
>> -    return error;
>> +    return ret;
>>  }
>>  
>>  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>  {
>>      int ret;
>> -    struct microcode_info *info;
>> +    unsigned int cpu;
>>      void * buffer;
>>  
>>      if ( len != (uint32_t)len )
>> @@ -334,28 +372,61 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>>      if ( ret != 0 )
>>          goto free;
>>  
>> +    /* cpu_online_map must not change during update */
>> +    if ( !get_cpu_maps() )
>> +    {
>> +        ret = -EBUSY;
>> +        goto free;
>> +    }
>> +
>>      if ( microcode_ops->start_update )
>>      {
>>          ret = microcode_ops->start_update();
>>          if ( ret != 0 )
>> -            goto free;
>> +            goto put;
>>      }
>>  
>>      ret = parse_microcode_blob(buffer, len);
>>      if ( ret <= 0 )
>>      {
>>          printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
>> -        xfree(info);
>> -        return -EINVAL;
>> +        ret = -EINVAL;
>> +        goto put;
>>      }
>>  
>> -    info->error = 0;
>> -    info->cpu = cpumask_first(&cpu_online_map);
>> +    atomic_set(&cpu_in, 0);
>> +    atomic_set(&cpu_out, 0);
>> +
>> +    /* Calculate the number of online CPU core */
>> +    nr_cores = 0;
>> +    for_each_online_cpu(cpu)
>> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +            nr_cores++;
>> +
>> +    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
>>  
>> -    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>> +    /*
>> +     * We intend to disable interrupt for long time, which may lead to
>> +     * watchdog timeout.
>> +     */
>> +    watchdog_disable();
>> +    /*
>> +     * Late loading dance. Why the heavy-handed stop_machine effort?
>> +     *
>> +     * - HT siblings must be idle and not execute other code while the other
>> +     *   sibling is loading microcode in order to avoid any negative
>> +     *   interactions cause by the loading.
>> +     *
>> +     * - In addition, microcode update on the cores must be serialized until
>> +     *   this requirement can be relaxed in the future. Right now, this is
>
>As said above, I'm not sure what you are doing here could be
>considered serialized, the previous method was clearly serialized
>moving from one CPU to the next one.
>
>Here you are likely updating multiple cores at the same time, which
>I'm not saying it's wrong, but doesn't seem to match the commit
>description or the comments in the code.

Indeed, it is a little confusing. will explain why it is serialized in the
commit message.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-01-28  7:06 ` [PATCH v5 3/8] microcode: introduce the global microcode cache Chao Gao
  2019-01-28 17:39   ` Roger Pau Monné
@ 2019-02-08 11:41   ` Jan Beulich
  2019-02-11  3:59     ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-08 11:41 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
>      spin_unlock(&microcode_mutex);
>  }
>  
> +/* Save a ucode patch to the global cache list */
> +bool save_patch(struct microcode_patch *new_patch)
> +{
> +    struct microcode_patch *microcode_patch;
> +
> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
> +    {
> +        enum microcode_match_result result =
> +            microcode_ops->replace_patch(new_patch, microcode_patch);
> +
> +        switch ( result )
> +        {
> +        case OLD_UCODE:
> +            microcode_ops->free_patch(new_patch);
> +            return false;
> +
> +        case NEW_UCODE:
> +            microcode_ops->free_patch(microcode_patch);
> +            return true;
> +
> +        case MIS_UCODE:
> +            continue;
> +
> +        default:
> +            ASSERT_UNREACHABLE();
> +            return 0;

false (or true; either value is going to be fine/wrong here afaict)

Anyway I'm having some difficulty seeing what the intended
meaning of the return value is, and without that being clear I
also can't make up my mind whether I agree with the cases
here.

> +        }
> +    }
> +    list_add_tail(&new_patch->list, &microcode_cache);

Hmm, you add _every_ patch producing MIS_UCODE to the list.
This is going to be a long list then - there are over 100 blobs in
the latest dropping, and this number is only going to grow.
Saving blobs is useful only for cases where mixed steppings are
actually supported. I guess that wouldn't go much beyond the
stepping and/or "pf" differing, but family and model matching up.

Additionally list management generally requires locking. This
function doesn't even have a comment saying what lock(s) is
(are) necessary to be held (same elsewhere).

Finally please add blank lines ahead of the line above as well
as (not just here) ahead of the main return statement of the
function.

> +/* Find a ucode patch who has newer revision than the one in use */
> +struct microcode_patch *find_patch(unsigned int cpu)

Is the caller allowed to alter the returned object? If not, you want
to return a pointer to const.

> +{
> +    struct microcode_patch *microcode_patch;
> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
> +
> +    if ( microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig) )
> +    {
> +        __microcode_fini_cpu(cpu);

This is kind of a library function - I can't see how it could legitimately
invoke "fini", the more that you do here but not ...

> +        return NULL;
> +    }
> +
> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
> +    {
> +        if ( microcode_ops->match_cpu(microcode_patch, cpu) )
> +            return microcode_patch;
> +    }
> +    return NULL;

... here.

> +static enum microcode_match_result replace_patch(struct microcode_patch *new,
> +                                                 struct microcode_patch *old)
> +{
> +    struct microcode_amd *new_mc = new->data;
> +    struct microcode_header_amd *new_header = new_mc->mpb;
> +    struct microcode_amd *old_mc = old->data;
> +    struct microcode_header_amd *old_header = old_mc->mpb;

const (all four of them afaict)

> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -147,6 +147,15 @@ static enum microcode_match_result microcode_update_match(
>      return MIS_UCODE;
>  }
>  
> +static bool match_cpu(const struct microcode_patch *patch, unsigned int cpu)
> +{
> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);

const

> +static enum microcode_match_result replace_patch(struct microcode_patch *new,
> +                                                 struct microcode_patch *old)
> +{
> +    struct microcode_header_intel *old_header = old->data;

const

> +    enum microcode_match_result ret =
> +                microcode_update_match(new->data, old_header->sig,
> +                                       old_header->pf, old_header->rev);
> +
> +    if ( ret == NEW_UCODE )
> +        list_replace(&old->list, &new->list);

I think it would be better to leave actual list maintenance to generic
code. That way the function parameters can also be pointer-to-const.

> @@ -248,6 +277,25 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
>      const struct microcode_header_intel *mc_header = mc;
>      unsigned long total_size = get_totalsize(mc_header);
>      void *new_mc;
> +    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
> +    void *new_mc2 = xmalloc_bytes(total_size);

Why don't you use the already existing new_mc here?

> @@ -276,18 +324,24 @@ static int apply_microcode(unsigned int cpu)
>      unsigned int val[2];
>      unsigned int cpu_num = raw_smp_processor_id();
>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
> +    struct microcode_intel *mc_intel;
> +    struct microcode_patch *patch;
>  
>      /* We should bind the task to the CPU */
>      BUG_ON(cpu_num != cpu);
>  
> -    if ( uci->mc.mc_intel == NULL )
> +    patch = find_patch(cpu);
> +    if ( !patch )
>          return -EINVAL;
>  
> +    mc_intel = patch->data;
> +    BUG_ON(!mc_intel);

I'm not convinced this is a useful check - you never save_patch()
anything that has a NULL pointer here. And general corruption might
put bad values other than NULL into the field.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-01-28  7:06 ` [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
  2019-01-29  9:58   ` Roger Pau Monné
@ 2019-02-08 15:58   ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-02-08 15:58 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> --- a/xen/arch/x86/microcode_amd.c
> +++ b/xen/arch/x86/microcode_amd.c
> @@ -297,6 +297,10 @@ static int apply_microcode(unsigned int cpu)
>  
>      uci->cpu_sig.rev = rev;
>  
> +#if CONFIG_HVM
> +    svm_host_osvw_init();
> +#endif

#ifdef (it's been wrong at where you move it from)

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-01-28  7:06 ` [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading Chao Gao
  2019-01-29 10:37   ` Roger Pau Monné
@ 2019-02-08 16:29   ` Jan Beulich
  2019-02-11  5:40     ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-08 16:29 UTC (permalink / raw)
  To: Chao Gao
  Cc: Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> @@ -30,18 +31,25 @@
>  #include <xen/smp.h>
>  #include <xen/softirq.h>
>  #include <xen/spinlock.h>
> +#include <xen/stop_machine.h>
>  #include <xen/tasklet.h>
>  #include <xen/guest_access.h>
>  #include <xen/earlycpio.h>
> +#include <xen/watchdog.h>
>  
> +#include <asm/delay.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
>  #include <asm/setup.h>
>  #include <asm/microcode.h>
>  
> +/* By default, wait for 30000us */
> +#define MICROCODE_DEFAULT_TIMEOUT_US 30000

This value deserves some explanation as to how it was chosen.

>  static module_t __initdata ucode_mod;
>  static signed int __initdata ucode_mod_idx;
>  static bool_t __initdata ucode_mod_forced;
> +static unsigned int nr_cores;

Could you see about avoiding such a static? You have ...

> +static int do_microcode_update(void *unused)

... an "unused" here after all, and it's the (indirect) caller of the
function which calculates the value.

> +{
> +    unsigned int cpu = smp_processor_id();
> +    int ret;
>  
> -    error = microcode_update_cpu();
> -    if ( error )
> -        info->error = error;
> +    ret = wait_for_cpus(&cpu_in, MICROCODE_DEFAULT_TIMEOUT_US);
> +    if ( ret )
> +        return ret;
>  
> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
> -    if ( info->cpu < nr_cpu_ids )
> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
> +    /*
> +     * Initiate an update on all processors which don't have an online sibling
> +     * thread with a lower thread id. Other sibling threads just await the
> +     * completion of microcode update.
> +     */
> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        ret = microcode_update_cpu();
> +    /*
> +     * Increase the wait timeout to a safe value here since we're serializing
> +     * the microcode update and that could take a while on a large number of
> +     * CPUs. And that is fine as the *actual* timeout will be determined by
> +     * the last CPU finished updating and thus cut short
> +     */
> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
> +        panic("Timeout when finishing updating microcode");

While I expect this to go away again in the next patch, I'd still like to
see this improved, in particular in case the patch here goes in
independently of the next one. After all on a system with 100 cores
the timeout totals to a whopping 3 seconds.

Generally the time needed to wait scales by the number of CPUs still
in need of doing the update. And if a timeout is really to occur, it's
perhaps because of one bad core or socket, not because nothing
works at all. Hence it would seem both nice and possible to scale the
"remaining time to wait" by the (known) number of remaining
processors to respond.

Additionally it would be confusing to see dozens of panics in case
this actually triggers, because all CPUs would hit this at about the
same time.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-02-08 11:41   ` Jan Beulich
@ 2019-02-11  3:59     ` Chao Gao
  2019-02-11 13:16       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-02-11  3:59 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Fri, Feb 08, 2019 at 04:41:19AM -0700, Jan Beulich wrote:
>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
>>      spin_unlock(&microcode_mutex);
>>  }
>>  
>> +/* Save a ucode patch to the global cache list */
>> +bool save_patch(struct microcode_patch *new_patch)
>> +{
>> +    struct microcode_patch *microcode_patch;
>> +
>> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
>> +    {
>> +        enum microcode_match_result result =
>> +            microcode_ops->replace_patch(new_patch, microcode_patch);
>> +
>> +        switch ( result )
>> +        {
>> +        case OLD_UCODE:
>> +            microcode_ops->free_patch(new_patch);
>> +            return false;
>> +
>> +        case NEW_UCODE:
>> +            microcode_ops->free_patch(microcode_patch);
>> +            return true;
>> +
>> +        case MIS_UCODE:
>> +            continue;
>> +
>> +        default:
>> +            ASSERT_UNREACHABLE();
>> +            return 0;
>
>false (or true; either value is going to be fine/wrong here afaict)
>
>Anyway I'm having some difficulty seeing what the intended
>meaning of the return value is, and without that being clear I
>also can't make up my mind whether I agree with the cases
>here.

The return value means whether saving a ucode patch succeeded or failed.
In another word, it also means whether the ucode cache is updated or
not. According to the return value, the caller decides not to do the
expensive late ucode update for some cases (i.e. when admin provides a
old ucode).

>
>> +        }
>> +    }
>> +    list_add_tail(&new_patch->list, &microcode_cache);
>
>Hmm, you add _every_ patch producing MIS_UCODE to the list.
>This is going to be a long list then - there are over 100 blobs in
>the latest dropping, and this number is only going to grow.
>Saving blobs is useful only for cases where mixed steppings are
>actually supported. I guess that wouldn't go much beyond the
>stepping and/or "pf" differing, but family and model matching up.

We can introduce another type (i.e. MIX_UCODE) to denote the ucode
patch which mismatches any saved patches but only differs from current
cpu in stepping or "pf" (or model for AMD).

And save ucodes of MIX_UCODE type and discard ucodes of MIS_UCODE type.

Arch specific callback can determine the type (MIX_UCODE or MIS_UCODE)
according to the supported mixes.

>
>Additionally list management generally requires locking. This
>function doesn't even have a comment saying what lock(s) is
>(are) necessary to be held (same elsewhere).
>
>Finally please add blank lines ahead of the line above as well
>as (not just here) ahead of the main return statement of the
>function.

Will do

>
>> +/* Find a ucode patch who has newer revision than the one in use */
>> +struct microcode_patch *find_patch(unsigned int cpu)
>
>Is the caller allowed to alter the returned object? If not, you want
>to return a pointer to const.

No. Will constify the return value.

>
>> +{
>> +    struct microcode_patch *microcode_patch;
>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>> +
>> +    if ( microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig) )
>> +    {
>> +        __microcode_fini_cpu(cpu);
>
>This is kind of a library function - I can't see how it could legitimately
>invoke "fini", the more that you do here but not ...
>
>> +        return NULL;
>> +    }
>> +
>> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
>> +    {
>> +        if ( microcode_ops->match_cpu(microcode_patch, cpu) )
>> +            return microcode_patch;
>> +    }
>> +    return NULL;
>
>... here.

My understanding is saving the cpu signature and ucode revision can
reduce MSR reads to get such information. So "fini" is invoked when
"collect" failed. The apply_microcode() following find_patch() can
access "uci" without invoking "collect" again.

>
>> +static enum microcode_match_result replace_patch(struct microcode_patch *new,
>> +                                                 struct microcode_patch *old)
>> +{
>> +    struct microcode_amd *new_mc = new->data;
>> +    struct microcode_header_amd *new_header = new_mc->mpb;
>> +    struct microcode_amd *old_mc = old->data;
>> +    struct microcode_header_amd *old_header = old_mc->mpb;
>
>const (all four of them afaict)
>
>> --- a/xen/arch/x86/microcode_intel.c
>> +++ b/xen/arch/x86/microcode_intel.c
>> @@ -147,6 +147,15 @@ static enum microcode_match_result microcode_update_match(
>>      return MIS_UCODE;
>>  }
>>  
>> +static bool match_cpu(const struct microcode_patch *patch, unsigned int cpu)
>> +{
>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>
>const
>
>> +static enum microcode_match_result replace_patch(struct microcode_patch *new,
>> +                                                 struct microcode_patch *old)
>> +{
>> +    struct microcode_header_intel *old_header = old->data;
>
>const
>
>> +    enum microcode_match_result ret =
>> +                microcode_update_match(new->data, old_header->sig,
>> +                                       old_header->pf, old_header->rev);
>> +
>> +    if ( ret == NEW_UCODE )
>> +        list_replace(&old->list, &new->list);
>
>I think it would be better to leave actual list maintenance to generic
>code. That way the function parameters can also be pointer-to-const.

Will do.

>
>> @@ -248,6 +277,25 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
>>      const struct microcode_header_intel *mc_header = mc;
>>      unsigned long total_size = get_totalsize(mc_header);
>>      void *new_mc;
>> +    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
>> +    void *new_mc2 = xmalloc_bytes(total_size);
>
>Why don't you use the already existing new_mc here?

Will reuse the new_mc here.

>
>> @@ -276,18 +324,24 @@ static int apply_microcode(unsigned int cpu)
>>      unsigned int val[2];
>>      unsigned int cpu_num = raw_smp_processor_id();
>>      struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
>> +    struct microcode_intel *mc_intel;
>> +    struct microcode_patch *patch;
>>  
>>      /* We should bind the task to the CPU */
>>      BUG_ON(cpu_num != cpu);
>>  
>> -    if ( uci->mc.mc_intel == NULL )
>> +    patch = find_patch(cpu);
>> +    if ( !patch )
>>          return -EINVAL;
>>  
>> +    mc_intel = patch->data;
>> +    BUG_ON(!mc_intel);
>
>I'm not convinced this is a useful check - you never save_patch()
>anything that has a NULL pointer here. And general corruption might
>put bad values other than NULL into the field.

Then I will remove this check.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-02-08 16:29   ` Jan Beulich
@ 2019-02-11  5:40     ` Chao Gao
  2019-02-11 13:23       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-02-11  5:40 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

On Fri, Feb 08, 2019 at 09:29:32AM -0700, Jan Beulich wrote:
>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>> @@ -30,18 +31,25 @@
>>  #include <xen/smp.h>
>>  #include <xen/softirq.h>
>>  #include <xen/spinlock.h>
>> +#include <xen/stop_machine.h>
>>  #include <xen/tasklet.h>
>>  #include <xen/guest_access.h>
>>  #include <xen/earlycpio.h>
>> +#include <xen/watchdog.h>
>>  
>> +#include <asm/delay.h>
>>  #include <asm/msr.h>
>>  #include <asm/processor.h>
>>  #include <asm/setup.h>
>>  #include <asm/microcode.h>
>>  
>> +/* By default, wait for 30000us */
>> +#define MICROCODE_DEFAULT_TIMEOUT_US 30000
>
>This value deserves some explanation as to how it was chosen.

Will do.

>
>>  static module_t __initdata ucode_mod;
>>  static signed int __initdata ucode_mod_idx;
>>  static bool_t __initdata ucode_mod_forced;
>> +static unsigned int nr_cores;
>
>Could you see about avoiding such a static? You have ...
>
>> +static int do_microcode_update(void *unused)
>
>... an "unused" here after all, and it's the (indirect) caller of the
>function which calculates the value.

Will do.

>
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    int ret;
>>  
>> -    error = microcode_update_cpu();
>> -    if ( error )
>> -        info->error = error;
>> +    ret = wait_for_cpus(&cpu_in, MICROCODE_DEFAULT_TIMEOUT_US);
>> +    if ( ret )
>> +        return ret;
>>  
>> -    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
>> -    if ( info->cpu < nr_cpu_ids )
>> -        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
>> +    /*
>> +     * Initiate an update on all processors which don't have an online sibling
>> +     * thread with a lower thread id. Other sibling threads just await the
>> +     * completion of microcode update.
>> +     */
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        ret = microcode_update_cpu();
>> +    /*
>> +     * Increase the wait timeout to a safe value here since we're serializing
>> +     * the microcode update and that could take a while on a large number of
>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>> +     * the last CPU finished updating and thus cut short
>> +     */
>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
>> +        panic("Timeout when finishing updating microcode");
>
>While I expect this to go away again in the next patch, I'd still like to
>see this improved, in particular in case the patch here goes in
>independently of the next one. After all on a system with 100 cores
>the timeout totals to a whopping 3 seconds.

To be clear, the timeout remains the same in the next patch due to
the serial print clause in apply_microcode().

>
>Generally the time needed to wait scales by the number of CPUs still
>in need of doing the update. And if a timeout is really to occur, it's
>perhaps because of one bad core or socket, not because nothing
>works at all. Hence it would seem both nice and possible to scale the
>"remaining time to wait" by the (known) number of remaining
>processors to respond.

Basically, I think the benefit is we can recognize the failure earlier
if no core called in in a given interval (i.e. 30ms), and trigger a
panic. Considering for such case, even with this optimization, the
system needs reboot, which generally takes several minutes, what's the
value of this optimization?

>
>Additionally it would be confusing to see dozens of panics in case
>this actually triggers, because all CPUs would hit this at about the
>same time.

Will avoid this confusion.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 3/8] microcode: introduce the global microcode cache
  2019-02-11  3:59     ` Chao Gao
@ 2019-02-11 13:16       ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-02-11 13:16 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 11.02.19 at 04:59, <chao.gao@intel.com> wrote:
> On Fri, Feb 08, 2019 at 04:41:19AM -0700, Jan Beulich wrote:
>>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>> @@ -208,6 +210,58 @@ static void microcode_fini_cpu(unsigned int cpu)
>>>      spin_unlock(&microcode_mutex);
>>>  }
>>>  
>>> +/* Save a ucode patch to the global cache list */
>>> +bool save_patch(struct microcode_patch *new_patch)
>>> +{
>>> +    struct microcode_patch *microcode_patch;
>>> +
>>> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
>>> +    {
>>> +        enum microcode_match_result result =
>>> +            microcode_ops->replace_patch(new_patch, microcode_patch);
>>> +
>>> +        switch ( result )
>>> +        {
>>> +        case OLD_UCODE:
>>> +            microcode_ops->free_patch(new_patch);
>>> +            return false;
>>> +
>>> +        case NEW_UCODE:
>>> +            microcode_ops->free_patch(microcode_patch);
>>> +            return true;
>>> +
>>> +        case MIS_UCODE:
>>> +            continue;
>>> +
>>> +        default:
>>> +            ASSERT_UNREACHABLE();
>>> +            return 0;
>>
>>false (or true; either value is going to be fine/wrong here afaict)
>>
>>Anyway I'm having some difficulty seeing what the intended
>>meaning of the return value is, and without that being clear I
>>also can't make up my mind whether I agree with the cases
>>here.
> 
> The return value means whether saving a ucode patch succeeded or failed.
> In another word, it also means whether the ucode cache is updated or
> not. According to the return value, the caller decides not to do the
> expensive late ucode update for some cases (i.e. when admin provides a
> old ucode).

Would you mind extending the comment to the effect?

>>> +/* Find a ucode patch who has newer revision than the one in use */
>>> +struct microcode_patch *find_patch(unsigned int cpu)
>>> +{
>>> +    struct microcode_patch *microcode_patch;
>>> +    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
>>> +
>>> +    if ( microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig) )
>>> +    {
>>> +        __microcode_fini_cpu(cpu);
>>
>>This is kind of a library function - I can't see how it could legitimately
>>invoke "fini", the more that you do here but not ...
>>
>>> +        return NULL;
>>> +    }
>>> +
>>> +    list_for_each_entry(microcode_patch, &microcode_cache, list)
>>> +    {
>>> +        if ( microcode_ops->match_cpu(microcode_patch, cpu) )
>>> +            return microcode_patch;
>>> +    }
>>> +    return NULL;
>>
>>... here.
> 
> My understanding is saving the cpu signature and ucode revision can
> reduce MSR reads to get such information. So "fini" is invoked when
> "collect" failed. The apply_microcode() following find_patch() can
> access "uci" without invoking "collect" again.

I fail to see the connection between "saving the cpu signature and
ucode revision" and the apparent layering violation here. Could you
help me with this?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-02-11  5:40     ` Chao Gao
@ 2019-02-11 13:23       ` Jan Beulich
  2019-02-11 13:35         ` Juergen Gross
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-11 13:23 UTC (permalink / raw)
  To: Chao Gao
  Cc: Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

>>> On 11.02.19 at 06:40, <chao.gao@intel.com> wrote:
> On Fri, Feb 08, 2019 at 09:29:32AM -0700, Jan Beulich wrote:
>>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>> +    /*
>>> +     * Initiate an update on all processors which don't have an online sibling
>>> +     * thread with a lower thread id. Other sibling threads just await the
>>> +     * completion of microcode update.
>>> +     */
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>> +        ret = microcode_update_cpu();
>>> +    /*
>>> +     * Increase the wait timeout to a safe value here since we're serializing
>>> +     * the microcode update and that could take a while on a large number of
>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>> +     * the last CPU finished updating and thus cut short
>>> +     */
>>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
>>> +        panic("Timeout when finishing updating microcode");
>>
>>While I expect this to go away again in the next patch, I'd still like to
>>see this improved, in particular in case the patch here goes in
>>independently of the next one. After all on a system with 100 cores
>>the timeout totals to a whopping 3 seconds.
> 
> To be clear, the timeout remains the same in the next patch due to
> the serial print clause in apply_microcode().
> 
>>
>>Generally the time needed to wait scales by the number of CPUs still
>>in need of doing the update. And if a timeout is really to occur, it's
>>perhaps because of one bad core or socket, not because nothing
>>works at all. Hence it would seem both nice and possible to scale the
>>"remaining time to wait" by the (known) number of remaining
>>processors to respond.
> 
> Basically, I think the benefit is we can recognize the failure earlier
> if no core called in in a given interval (i.e. 30ms), and trigger a
> panic. Considering for such case, even with this optimization, the
> system needs reboot, which generally takes several minutes, what's the
> value of this optimization?

Hmm, on one hand this is a fair point you make. Otoh, why do
you add any timeout at all, if we say we're hosed anyway if the
timeout expires? You could then as well log a message (say
once a second) about how many (or which) CPUs still didn't
respond. The admin can then still reboot the system if desired.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-02-11 13:23       ` Jan Beulich
@ 2019-02-11 13:35         ` Juergen Gross
  2019-02-11 15:28           ` Raj, Ashok
  2019-02-11 16:49           ` Jan Beulich
  0 siblings, 2 replies; 50+ messages in thread
From: Juergen Gross @ 2019-02-11 13:35 UTC (permalink / raw)
  To: Jan Beulich, Chao Gao
  Cc: Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

On 11/02/2019 14:23, Jan Beulich wrote:
>>>> On 11.02.19 at 06:40, <chao.gao@intel.com> wrote:
>> On Fri, Feb 08, 2019 at 09:29:32AM -0700, Jan Beulich wrote:
>>>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>>> +    /*
>>>> +     * Initiate an update on all processors which don't have an online sibling
>>>> +     * thread with a lower thread id. Other sibling threads just await the
>>>> +     * completion of microcode update.
>>>> +     */
>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>> +        ret = microcode_update_cpu();
>>>> +    /*
>>>> +     * Increase the wait timeout to a safe value here since we're serializing
>>>> +     * the microcode update and that could take a while on a large number of
>>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>>> +     * the last CPU finished updating and thus cut short
>>>> +     */
>>>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
>>>> +        panic("Timeout when finishing updating microcode");
>>>
>>> While I expect this to go away again in the next patch, I'd still like to
>>> see this improved, in particular in case the patch here goes in
>>> independently of the next one. After all on a system with 100 cores
>>> the timeout totals to a whopping 3 seconds.
>>
>> To be clear, the timeout remains the same in the next patch due to
>> the serial print clause in apply_microcode().
>>
>>>
>>> Generally the time needed to wait scales by the number of CPUs still
>>> in need of doing the update. And if a timeout is really to occur, it's
>>> perhaps because of one bad core or socket, not because nothing
>>> works at all. Hence it would seem both nice and possible to scale the
>>> "remaining time to wait" by the (known) number of remaining
>>> processors to respond.
>>
>> Basically, I think the benefit is we can recognize the failure earlier
>> if no core called in in a given interval (i.e. 30ms), and trigger a
>> panic. Considering for such case, even with this optimization, the
>> system needs reboot, which generally takes several minutes, what's the
>> value of this optimization?
> 
> Hmm, on one hand this is a fair point you make. Otoh, why do
> you add any timeout at all, if we say we're hosed anyway if the
> timeout expires? You could then as well log a message (say
> once a second) about how many (or which) CPUs still didn't
> respond. The admin can then still reboot the system if desired.

That's not a data center friendly approach.

The ability to do microcode update in an online system might by
risky, but in case of failure requiring access to the console or
power settings of the system isn't nice.

I think doing a panic() after some timeout is a sensible way to
handle a failure.

In case you'd like having a way to wait longer: we could allow the
"noreboot" parameter to be modified at runtime and do the panic only
if opt_noreboot isn't set.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-02-11 13:35         ` Juergen Gross
@ 2019-02-11 15:28           ` Raj, Ashok
  2019-02-11 16:49           ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Raj, Ashok @ 2019-02-11 15:28 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Kevin Tian, Roger Pau Monne, Wei Liu, Jan Beulich, Andrew Cooper,
	Jun Nakajima, Ashok Raj, xen-devel, tglx, Borislav Petkov,
	Chao Gao

On Mon, Feb 11, 2019 at 02:35:30PM +0100, Juergen Gross wrote:
> On 11/02/2019 14:23, Jan Beulich wrote:
> >>>> On 11.02.19 at 06:40, <chao.gao@intel.com> wrote:
> >> On Fri, Feb 08, 2019 at 09:29:32AM -0700, Jan Beulich wrote:
> >>>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> >>>> +    /*
> >>>> +     * Initiate an update on all processors which don't have an online sibling
> >>>> +     * thread with a lower thread id. Other sibling threads just await the
> >>>> +     * completion of microcode update.
> >>>> +     */
> >>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> >>>> +        ret = microcode_update_cpu();
> >>>> +    /*
> >>>> +     * Increase the wait timeout to a safe value here since we're serializing
> >>>> +     * the microcode update and that could take a while on a large number of
> >>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
> >>>> +     * the last CPU finished updating and thus cut short
> >>>> +     */
> >>>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
> >>>> +        panic("Timeout when finishing updating microcode");
> >>>
> >>> While I expect this to go away again in the next patch, I'd still like to
> >>> see this improved, in particular in case the patch here goes in
> >>> independently of the next one. After all on a system with 100 cores
> >>> the timeout totals to a whopping 3 seconds.
> >>
> >> To be clear, the timeout remains the same in the next patch due to
> >> the serial print clause in apply_microcode().
> >>
> >>>
> >>> Generally the time needed to wait scales by the number of CPUs still
> >>> in need of doing the update. And if a timeout is really to occur, it's
> >>> perhaps because of one bad core or socket, not because nothing
> >>> works at all. Hence it would seem both nice and possible to scale the
> >>> "remaining time to wait" by the (known) number of remaining
> >>> processors to respond.
> >>
> >> Basically, I think the benefit is we can recognize the failure earlier
> >> if no core called in in a given interval (i.e. 30ms), and trigger a
> >> panic. Considering for such case, even with this optimization, the
> >> system needs reboot, which generally takes several minutes, what's the
> >> value of this optimization?
> > 
> > Hmm, on one hand this is a fair point you make. Otoh, why do
> > you add any timeout at all, if we say we're hosed anyway if the
> > timeout expires? You could then as well log a message (say
> > once a second) about how many (or which) CPUs still didn't
> > respond. The admin can then still reboot the system if desired.
> 
> That's not a data center friendly approach.
> 
> The ability to do microcode update in an online system might by
> risky, but in case of failure requiring access to the console or
> power settings of the system isn't nice.

Thats right... the reason it cannnot be a perfect number is because
it really depends on several factors. If the other thread is in a 
long flow instruction we can only break at instruction boundary, even
ipi. Say if you were in wbinvd() for example, or some new ISA taking its
sweet time. In the grand scheme of things its less important to fine 
optimnize this number. 

On the other hand not panicking and waiting for sysadmin certainly isn't
datacenter friendly as you had pointed out. 
> 
> I think doing a panic() after some timeout is a sensible way to
> handle a failure.
> 
> In case you'd like having a way to wait longer: we could allow the
> "noreboot" parameter to be modified at runtime and do the panic only
> if opt_noreboot isn't set.
> 
> 
> Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading
  2019-02-11 13:35         ` Juergen Gross
  2019-02-11 15:28           ` Raj, Ashok
@ 2019-02-11 16:49           ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-02-11 16:49 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Kevin Tian, Borislav Petkov, Wei Liu, Ashok Raj, Andrew Cooper,
	Jun Nakajima, xen-devel, tglx, Roger Pau Monne, Chao Gao

>>> On 11.02.19 at 14:35, <jgross@suse.com> wrote:
> On 11/02/2019 14:23, Jan Beulich wrote:
>>>>> On 11.02.19 at 06:40, <chao.gao@intel.com> wrote:
>>> On Fri, Feb 08, 2019 at 09:29:32AM -0700, Jan Beulich wrote:
>>>>>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>>>> +    /*
>>>>> +     * Initiate an update on all processors which don't have an online 
> sibling
>>>>> +     * thread with a lower thread id. Other sibling threads just await the
>>>>> +     * completion of microcode update.
>>>>> +     */
>>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>>> +        ret = microcode_update_cpu();
>>>>> +    /*
>>>>> +     * Increase the wait timeout to a safe value here since we're 
> serializing
>>>>> +     * the microcode update and that could take a while on a large number 
> of
>>>>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>>>>> +     * the last CPU finished updating and thus cut short
>>>>> +     */
>>>>> +    if ( wait_for_cpus(&cpu_out, MICROCODE_DEFAULT_TIMEOUT_US * nr_cores) )
>>>>> +        panic("Timeout when finishing updating microcode");
>>>>
>>>> While I expect this to go away again in the next patch, I'd still like to
>>>> see this improved, in particular in case the patch here goes in
>>>> independently of the next one. After all on a system with 100 cores
>>>> the timeout totals to a whopping 3 seconds.
>>>
>>> To be clear, the timeout remains the same in the next patch due to
>>> the serial print clause in apply_microcode().
>>>
>>>>
>>>> Generally the time needed to wait scales by the number of CPUs still
>>>> in need of doing the update. And if a timeout is really to occur, it's
>>>> perhaps because of one bad core or socket, not because nothing
>>>> works at all. Hence it would seem both nice and possible to scale the
>>>> "remaining time to wait" by the (known) number of remaining
>>>> processors to respond.
>>>
>>> Basically, I think the benefit is we can recognize the failure earlier
>>> if no core called in in a given interval (i.e. 30ms), and trigger a
>>> panic. Considering for such case, even with this optimization, the
>>> system needs reboot, which generally takes several minutes, what's the
>>> value of this optimization?
>> 
>> Hmm, on one hand this is a fair point you make. Otoh, why do
>> you add any timeout at all, if we say we're hosed anyway if the
>> timeout expires? You could then as well log a message (say
>> once a second) about how many (or which) CPUs still didn't
>> respond. The admin can then still reboot the system if desired.
> 
> That's not a data center friendly approach.
> 
> The ability to do microcode update in an online system might by
> risky, but in case of failure requiring access to the console or
> power settings of the system isn't nice.

I realize this, but I also wouldn't bet the system would reboot cleanly
in such a case (i.e. a power cycle may be required anyway).

> I think doing a panic() after some timeout is a sensible way to
> handle a failure.

I don't disagree; I'm just not convinced this is the only "sensible"
way.

> In case you'd like having a way to wait longer: we could allow the
> "noreboot" parameter to be modified at runtime and do the panic only
> if opt_noreboot isn't set.

Why would runtime modification come into play here?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-01-28  7:06 ` [PATCH v5 8/8] microcode: update microcode on cores in parallel Chao Gao
  2019-01-29 11:27   ` Roger Pau Monné
@ 2019-02-12 12:51   ` Jan Beulich
  2019-02-12 13:25     ` Roger Pau Monné
  1 sibling, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-12 12:51 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> @@ -213,21 +214,25 @@ static void microcode_fini_cpu(unsigned int cpu)
>  bool save_patch(struct microcode_patch *new_patch)
>  {
>      struct microcode_patch *microcode_patch;
> +    enum microcode_match_result result = MIS_UCODE;
> +    bool ret;
> +    unsigned long flag;
> +
> +    write_lock_irqsave(&cache_rwlock, flag);

Can the new variable please be named "flags", like we do (almost?)
everywhere else?

>      list_for_each_entry(microcode_patch, &microcode_cache, list)
>      {
> -        enum microcode_match_result result =
> -            microcode_ops->replace_patch(new_patch, microcode_patch);
> +        result = microcode_ops->replace_patch(new_patch, microcode_patch);
>  
>          switch ( result )
>          {
>          case OLD_UCODE:
> -            microcode_ops->free_patch(new_patch);
> -            return false;
> +            ret = false;
> +            goto out;
>  
>          case NEW_UCODE:
> -            microcode_ops->free_patch(microcode_patch);
> -            return true;
> +            ret = true;
> +            goto out;
>  
>          case MIS_UCODE:
>              continue;
> @@ -238,7 +243,27 @@ bool save_patch(struct microcode_patch *new_patch)
>          }
>      }
>      list_add_tail(&new_patch->list, &microcode_cache);
> -    return true;
> +    ret = true;

I don't see the need to update "ret" upwards from here. Afaict all
derivation can be done from "result" below here. Which then puts
under question the utility of the entire switch() above.

I have to admit that I also dislike the use of "goto" here: While
I've learned to accept its use in particular on some error handling
paths, I'm unconvinced that this function can't be written without
its use.

> @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>  
>      mc_intel = patch->data;
>      BUG_ON(!mc_intel);
> -
> -    /* serialize access to the physical write to MSR 0x79 */
> -    spin_lock_irqsave(&microcode_update_lock, flags);
> +    BUG_ON(local_irq_is_enabled());
>  
>      /* write microcode via MSR 0x79 */
>      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
> @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>      val[1] = (uint32_t)(msr_content >> 32);
>  
> -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>      if ( val[1] != mc_intel->hdr.rev )
>      {
>          printk(KERN_ERR "microcode: CPU%d update from revision "

Am I understanding right that you now rely on upper layers in the
call tree to avoid calling into here in parallel for two hyperthreads
of the same core? I can't see how you avoid this situation during
AP bringup, for example. Did I overlook anything in this regard?

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-12 12:51   ` Jan Beulich
@ 2019-02-12 13:25     ` Roger Pau Monné
  2019-02-12 13:55       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-02-12 13:25 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Chao Gao

On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
> >  
> >      mc_intel = patch->data;
> >      BUG_ON(!mc_intel);
> > -
> > -    /* serialize access to the physical write to MSR 0x79 */
> > -    spin_lock_irqsave(&microcode_update_lock, flags);
> > +    BUG_ON(local_irq_is_enabled());
> >  
> >      /* write microcode via MSR 0x79 */
> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
> >      val[1] = (uint32_t)(msr_content >> 32);
> >  
> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
> >      if ( val[1] != mc_intel->hdr.rev )
> >      {
> >          printk(KERN_ERR "microcode: CPU%d update from revision "
> 
> Am I understanding right that you now rely on upper layers in the
> call tree to avoid calling into here in parallel for two hyperthreads
> of the same core? I can't see how you avoid this situation during
> AP bringup, for example. Did I overlook anything in this regard?

IIRC microcode update is done in the serialized part of AP bringup,
before the call to smp_callin, which guarantees serialization.

Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-12 13:25     ` Roger Pau Monné
@ 2019-02-12 13:55       ` Jan Beulich
  2019-02-13  2:30         ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-12 13:55 UTC (permalink / raw)
  To: Roger Pau Monne; +Cc: Andrew Cooper, Wei Liu, xen-devel, Chao Gao

>>> On 12.02.19 at 14:25, <roger.pau@citrix.com> wrote:
> On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
>> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>> >  
>> >      mc_intel = patch->data;
>> >      BUG_ON(!mc_intel);
>> > -
>> > -    /* serialize access to the physical write to MSR 0x79 */
>> > -    spin_lock_irqsave(&microcode_update_lock, flags);
>> > +    BUG_ON(local_irq_is_enabled());
>> >  
>> >      /* write microcode via MSR 0x79 */
>> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>> >      val[1] = (uint32_t)(msr_content >> 32);
>> >  
>> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>> >      if ( val[1] != mc_intel->hdr.rev )
>> >      {
>> >          printk(KERN_ERR "microcode: CPU%d update from revision "
>> 
>> Am I understanding right that you now rely on upper layers in the
>> call tree to avoid calling into here in parallel for two hyperthreads
>> of the same core? I can't see how you avoid this situation during
>> AP bringup, for example. Did I overlook anything in this regard?
> 
> IIRC microcode update is done in the serialized part of AP bringup,
> before the call to smp_callin, which guarantees serialization.

Hmm, yes, right now it is. But I'd call this "happens to be that way"
rather than "is guaranteed to be that way" - prior to commit
f97838bbd9 it did happen later.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-12 13:55       ` Jan Beulich
@ 2019-02-13  2:30         ` Chao Gao
  2019-02-13  7:20           ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-02-13  2:30 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Tue, Feb 12, 2019 at 06:55:20AM -0700, Jan Beulich wrote:
>>>> On 12.02.19 at 14:25, <roger.pau@citrix.com> wrote:
>> On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
>>> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>>> >  
>>> >      mc_intel = patch->data;
>>> >      BUG_ON(!mc_intel);
>>> > -
>>> > -    /* serialize access to the physical write to MSR 0x79 */
>>> > -    spin_lock_irqsave(&microcode_update_lock, flags);
>>> > +    BUG_ON(local_irq_is_enabled());
>>> >  
>>> >      /* write microcode via MSR 0x79 */
>>> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>>> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>>> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>>> >      val[1] = (uint32_t)(msr_content >> 32);
>>> >  
>>> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>>> >      if ( val[1] != mc_intel->hdr.rev )
>>> >      {
>>> >          printk(KERN_ERR "microcode: CPU%d update from revision "
>>> 
>>> Am I understanding right that you now rely on upper layers in the
>>> call tree to avoid calling into here in parallel for two hyperthreads
>>> of the same core? I can't see how you avoid this situation during
>>> AP bringup, for example. Did I overlook anything in this regard?
>> 
>> IIRC microcode update is done in the serialized part of AP bringup,
>> before the call to smp_callin, which guarantees serialization.
>
>Hmm, yes, right now it is. But I'd call this "happens to be that way"
>rather than "is guaranteed to be that way" - prior to commit
>f97838bbd9 it did happen later.

How about employing another lock, "early_ucode_update_lock", to
guarantee serialization.

In particular, early_microcode_update_cpu() and microcode_resume_cpu()
will acquire this lock before ucode update.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-13  2:30         ` Chao Gao
@ 2019-02-13  7:20           ` Jan Beulich
  2019-02-13  8:50             ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-02-13  7:20 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 13.02.19 at 03:30, <chao.gao@intel.com> wrote:
> On Tue, Feb 12, 2019 at 06:55:20AM -0700, Jan Beulich wrote:
>>>>> On 12.02.19 at 14:25, <roger.pau@citrix.com> wrote:
>>> On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
>>>> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>>> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>>>> >  
>>>> >      mc_intel = patch->data;
>>>> >      BUG_ON(!mc_intel);
>>>> > -
>>>> > -    /* serialize access to the physical write to MSR 0x79 */
>>>> > -    spin_lock_irqsave(&microcode_update_lock, flags);
>>>> > +    BUG_ON(local_irq_is_enabled());
>>>> >  
>>>> >      /* write microcode via MSR 0x79 */
>>>> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>>>> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>>>> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>>>> >      val[1] = (uint32_t)(msr_content >> 32);
>>>> >  
>>>> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>>>> >      if ( val[1] != mc_intel->hdr.rev )
>>>> >      {
>>>> >          printk(KERN_ERR "microcode: CPU%d update from revision "
>>>> 
>>>> Am I understanding right that you now rely on upper layers in the
>>>> call tree to avoid calling into here in parallel for two hyperthreads
>>>> of the same core? I can't see how you avoid this situation during
>>>> AP bringup, for example. Did I overlook anything in this regard?
>>> 
>>> IIRC microcode update is done in the serialized part of AP bringup,
>>> before the call to smp_callin, which guarantees serialization.
>>
>>Hmm, yes, right now it is. But I'd call this "happens to be that way"
>>rather than "is guaranteed to be that way" - prior to commit
>>f97838bbd9 it did happen later.
> 
> How about employing another lock, "early_ucode_update_lock", to
> guarantee serialization.
> 
> In particular, early_microcode_update_cpu() and microcode_resume_cpu()
> will acquire this lock before ucode update.

That's a (temporary) option, but would over-serialize things again.
I was rather trying to hint towards a per-core lock.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-13  7:20           ` Jan Beulich
@ 2019-02-13  8:50             ` Chao Gao
  2019-02-13 10:05               ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-02-13  8:50 UTC (permalink / raw)
  To: Jan Beulich; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

On Wed, Feb 13, 2019 at 12:20:51AM -0700, Jan Beulich wrote:
>>>> On 13.02.19 at 03:30, <chao.gao@intel.com> wrote:
>> On Tue, Feb 12, 2019 at 06:55:20AM -0700, Jan Beulich wrote:
>>>>>> On 12.02.19 at 14:25, <roger.pau@citrix.com> wrote:
>>>> On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
>>>>> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>>>> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>>>>> >  
>>>>> >      mc_intel = patch->data;
>>>>> >      BUG_ON(!mc_intel);
>>>>> > -
>>>>> > -    /* serialize access to the physical write to MSR 0x79 */
>>>>> > -    spin_lock_irqsave(&microcode_update_lock, flags);
>>>>> > +    BUG_ON(local_irq_is_enabled());
>>>>> >  
>>>>> >      /* write microcode via MSR 0x79 */
>>>>> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>>>>> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>>>>> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>>>>> >      val[1] = (uint32_t)(msr_content >> 32);
>>>>> >  
>>>>> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>>>>> >      if ( val[1] != mc_intel->hdr.rev )
>>>>> >      {
>>>>> >          printk(KERN_ERR "microcode: CPU%d update from revision "
>>>>> 
>>>>> Am I understanding right that you now rely on upper layers in the
>>>>> call tree to avoid calling into here in parallel for two hyperthreads
>>>>> of the same core? I can't see how you avoid this situation during
>>>>> AP bringup, for example. Did I overlook anything in this regard?
>>>> 
>>>> IIRC microcode update is done in the serialized part of AP bringup,
>>>> before the call to smp_callin, which guarantees serialization.
>>>
>>>Hmm, yes, right now it is. But I'd call this "happens to be that way"
>>>rather than "is guaranteed to be that way" - prior to commit
>>>f97838bbd9 it did happen later.
>> 
>> How about employing another lock, "early_ucode_update_lock", to
>> guarantee serialization.
>> 
>> In particular, early_microcode_update_cpu() and microcode_resume_cpu()
>> will acquire this lock before ucode update.
>
>That's a (temporary) option, but would over-serialize things again.
>I was rather trying to hint towards a per-core lock.

To implement a per-core lock, a core should be conscious of its siblings.
Unfortunately, during AP bringup, ucode update is done prior to the
initialization of 'cpu_sibling_mask'.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v5 8/8] microcode: update microcode on cores in parallel
  2019-02-13  8:50             ` Chao Gao
@ 2019-02-13 10:05               ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-02-13 10:05 UTC (permalink / raw)
  To: Chao Gao; +Cc: Andrew Cooper, Wei Liu, xen-devel, Roger Pau Monne

>>> On 13.02.19 at 09:50, <chao.gao@intel.com> wrote:
> On Wed, Feb 13, 2019 at 12:20:51AM -0700, Jan Beulich wrote:
>>>>> On 13.02.19 at 03:30, <chao.gao@intel.com> wrote:
>>> On Tue, Feb 12, 2019 at 06:55:20AM -0700, Jan Beulich wrote:
>>>>>>> On 12.02.19 at 14:25, <roger.pau@citrix.com> wrote:
>>>>> On Tue, Feb 12, 2019 at 05:51:41AM -0700, Jan Beulich wrote:
>>>>>> >>> On 28.01.19 at 08:06, <chao.gao@intel.com> wrote:
>>>>>> > @@ -314,9 +310,7 @@ static int apply_microcode(unsigned int cpu)
>>>>>> >  
>>>>>> >      mc_intel = patch->data;
>>>>>> >      BUG_ON(!mc_intel);
>>>>>> > -
>>>>>> > -    /* serialize access to the physical write to MSR 0x79 */
>>>>>> > -    spin_lock_irqsave(&microcode_update_lock, flags);
>>>>>> > +    BUG_ON(local_irq_is_enabled());
>>>>>> >  
>>>>>> >      /* write microcode via MSR 0x79 */
>>>>>> >      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>>>>>> > @@ -329,7 +323,6 @@ static int apply_microcode(unsigned int cpu)
>>>>>> >      rdmsrl(MSR_IA32_UCODE_REV, msr_content);
>>>>>> >      val[1] = (uint32_t)(msr_content >> 32);
>>>>>> >  
>>>>>> > -    spin_unlock_irqrestore(&microcode_update_lock, flags);
>>>>>> >      if ( val[1] != mc_intel->hdr.rev )
>>>>>> >      {
>>>>>> >          printk(KERN_ERR "microcode: CPU%d update from revision "
>>>>>> 
>>>>>> Am I understanding right that you now rely on upper layers in the
>>>>>> call tree to avoid calling into here in parallel for two hyperthreads
>>>>>> of the same core? I can't see how you avoid this situation during
>>>>>> AP bringup, for example. Did I overlook anything in this regard?
>>>>> 
>>>>> IIRC microcode update is done in the serialized part of AP bringup,
>>>>> before the call to smp_callin, which guarantees serialization.
>>>>
>>>>Hmm, yes, right now it is. But I'd call this "happens to be that way"
>>>>rather than "is guaranteed to be that way" - prior to commit
>>>>f97838bbd9 it did happen later.
>>> 
>>> How about employing another lock, "early_ucode_update_lock", to
>>> guarantee serialization.
>>> 
>>> In particular, early_microcode_update_cpu() and microcode_resume_cpu()
>>> will acquire this lock before ucode update.
>>
>>That's a (temporary) option, but would over-serialize things again.
>>I was rather trying to hint towards a per-core lock.
> 
> To implement a per-core lock, a core should be conscious of its siblings.
> Unfortunately, during AP bringup, ucode update is done prior to the
> initialization of 'cpu_sibling_mask'.

Okay, in this case a global lock will perhaps do for now (as boot
time AP bringup is serialized in this regard already anyway). I do
think though that the sibling maps should be set up earlier, but
this doesn't look to be as simple as moving ahead the call to
set_cpu_sibling_map(). I wonder anyway why all sorts of stuff
have accumulated ahead of smp_callin(), thus delaying the boot
process in particular on large systems. Clearly something taking
as long as microcode load should, if at all possible, happen
afterwards.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2019-02-13 10:05 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-01-28  7:06 [PATCH v5 0/8] improve late microcode loading Chao Gao
2019-01-28  7:06 ` [PATCH v5 1/8] microcode/intel: remove redundent check against ucode size Chao Gao
2019-01-28 16:40   ` Roger Pau Monné
2019-01-29 10:26   ` Jan Beulich
2019-01-29 13:34     ` Chao Gao
2019-01-28  7:06 ` [PATCH v5 2/8] microcode/intel: extend microcode_update_match() Chao Gao
2019-01-28 16:55   ` Roger Pau Monné
2019-01-28 17:00     ` Jan Beulich
2019-01-29 10:41   ` Jan Beulich
2019-01-29 13:52     ` Chao Gao
2019-01-28  7:06 ` [PATCH v5 3/8] microcode: introduce the global microcode cache Chao Gao
2019-01-28 17:39   ` Roger Pau Monné
2019-01-29  4:41     ` Chao Gao
2019-01-29  8:56       ` Roger Pau Monné
2019-02-08 11:41   ` Jan Beulich
2019-02-11  3:59     ` Chao Gao
2019-02-11 13:16       ` Jan Beulich
2019-01-28  7:06 ` [PATCH v5 4/8] microcode: delete 'mc' field from struct ucode_cpu_info Chao Gao
2019-01-29  9:25   ` Roger Pau Monné
2019-01-29 13:27     ` Chao Gao
2019-01-28  7:06 ` [PATCH v5 5/8] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-01-29  9:58   ` Roger Pau Monné
2019-01-29 12:47     ` Chao Gao
2019-02-08 15:58   ` Jan Beulich
2019-01-28  7:06 ` [PATCH v5 6/8] microcode: delete microcode pointer and size from microcode_info Chao Gao
2019-01-29 10:10   ` Roger Pau Monné
2019-01-29 14:11     ` Chao Gao
2019-01-28  7:06 ` [PATCH v5 7/8] x86/microcode: Synchronize late microcode loading Chao Gao
2019-01-29 10:37   ` Roger Pau Monné
2019-01-29 10:45     ` Jan Beulich
2019-01-30 13:44     ` Chao Gao
2019-02-08 16:29   ` Jan Beulich
2019-02-11  5:40     ` Chao Gao
2019-02-11 13:23       ` Jan Beulich
2019-02-11 13:35         ` Juergen Gross
2019-02-11 15:28           ` Raj, Ashok
2019-02-11 16:49           ` Jan Beulich
2019-01-28  7:06 ` [PATCH v5 8/8] microcode: update microcode on cores in parallel Chao Gao
2019-01-29 11:27   ` Roger Pau Monné
2019-01-30 13:36     ` Chao Gao
2019-02-12 12:51   ` Jan Beulich
2019-02-12 13:25     ` Roger Pau Monné
2019-02-12 13:55       ` Jan Beulich
2019-02-13  2:30         ` Chao Gao
2019-02-13  7:20           ` Jan Beulich
2019-02-13  8:50             ` Chao Gao
2019-02-13 10:05               ` Jan Beulich
2019-01-29 11:31 ` [PATCH v5 0/8] improve late microcode loading Roger Pau Monné
2019-01-29 12:11   ` Chao Gao
2019-01-29 14:17     ` Roger Pau Monné

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.