All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 00/12] improve late microcode loading
@ 2019-03-11  7:57 Chao Gao
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
                   ` (12 more replies)
  0 siblings, 13 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Boris Ostrovsky, Chao Gao, Brian Woods, Suravee Suthikulpanit,
	Roger Pau Monné

Major changes in version 6:
 - run wbinvd before updating microcode (patch 10)
 - add an userspace tool for late microcode update (patch 1)
 - scale time to wait by the number of remaining CPUs to respond 
 - remove 'cpu' parameters from some related callbacks and functins
 - save an ucode patch only if its supported CPU is allowed to mix with
   current cpu.

Changes in version 5:
 - support parallel microcode updates for all cores (see patch 8)
 - Address Roger's comments on the last version.

The intention of this series is to make the late microcode loading
more reliable by rendezvousing all cpus in stop_machine context.
This idea comes from Ashok. I am porting his linux patch to Xen
(see patch 10 and 11 for more details).

This series makes five changes:
 1. Patch 1: an userspace tool for late microcode update
 2. Patch 2-9: introduce a global microcode cache and some cleanup
 3. Patch 10: writeback and invalidate cache before updating microcode
 3. Patch 11: synchronize late microcode loading
 4. Patch 12: support parallel microcodes update on different cores

Currently, late microcode loading does a lot of things including
parsing microcode blob, checking the signature/revision and performing
update. Putting all of them into stop_machine context is a bad idea
because of complexity (One issue I observed is memory allocation
triggered one assertion in stop_machine context). In order to simplify
the load process, I move parsing microcode out of the load process.
The microcode blob is parsed and a global microcode cache is built on
a single CPU before rendezvousing all cpus to update microcode. Other
CPUs just get and load a suitable microcode from the global cache.
With this global cache, it is safe to put simplified load process to
stop_machine context.

Regarding changes to AMD side, I didn't do any test for them due to
lack of hardware. Could you help to test this series on an AMD machine?
At least, two basic tests are needed:
* do a microcode update after system bootup
* don't bring all pCPUs up at bootup by specifying maxcpus option in xen
  command line and then do a microcode update and online all offlined
  CPUs via 'xen-hptool'.

Chao Gao (12):
  misc/xenmicrocode: Upload a microcode blob to the hypervisor
  microcode/intel: use union to get fields without shifting and masking
  microcode/intel: extend microcode_update_match()
  microcode: introduce a global cache of ucode patch
  microcode: only save compatible ucode patches
  microcode: remove struct ucode_cpu_info
  microcode: remove pointless 'cpu' parameter
  microcode: split out apply_microcode() from cpu_request_microcode()
  microcode: remove struct microcode_info
  microcode/intel: Writeback and invalidate caches before updating
    microcode
  x86/microcode: Synchronize late microcode loading
  microcode: update microcode on cores in parallel

 tools/libxc/include/xenctrl.h   |   1 +
 tools/libxc/xc_misc.c           |  20 +++
 tools/misc/Makefile             |   4 +
 tools/misc/xenmicrocode.c       |  89 ++++++++++
 xen/arch/x86/acpi/power.c       |   2 +-
 xen/arch/x86/apic.c             |   2 +-
 xen/arch/x86/microcode.c        | 380 +++++++++++++++++++++++++++-------------
 xen/arch/x86/microcode_amd.c    | 236 ++++++++++++-------------
 xen/arch/x86/microcode_intel.c  | 206 +++++++++++++---------
 xen/arch/x86/smpboot.c          |   5 +-
 xen/arch/x86/spec_ctrl.c        |   2 +-
 xen/include/asm-x86/microcode.h |  40 +++--
 xen/include/asm-x86/processor.h |   3 +-
 13 files changed, 639 insertions(+), 351 deletions(-)
 create mode 100644 tools/misc/xenmicrocode.c

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-12 15:27   ` Roger Pau Monné
                     ` (2 more replies)
  2019-03-11  7:57 ` [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking Chao Gao
                   ` (11 subsequent siblings)
  12 siblings, 3 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Konrad Rzeszutek Wilk,
	Ian Jackson, Chao Gao

This patch provides a tool for late microcode update.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 tools/libxc/include/xenctrl.h |  1 +
 tools/libxc/xc_misc.c         | 20 ++++++++++
 tools/misc/Makefile           |  4 ++
 tools/misc/xenmicrocode.c     | 89 +++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 114 insertions(+)
 create mode 100644 tools/misc/xenmicrocode.c

diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 31cdda7..c69699b 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1245,6 +1245,7 @@ typedef uint32_t xc_node_to_node_dist_t;
 int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo);
+int xc_platform_op(xc_interface *xch, struct xen_platform_op *op);
 int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
                 xc_meminfo_t *meminfo, uint32_t *distance);
 int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
index 5e6714a..061c7a5 100644
--- a/tools/libxc/xc_misc.c
+++ b/tools/libxc/xc_misc.c
@@ -226,6 +226,26 @@ int xc_physinfo(xc_interface *xch,
     return 0;
 }
 
+int xc_platform_op(xc_interface *xch, struct xen_platform_op *op)
+{
+    int ret = 0;
+    DECLARE_PLATFORM_OP;
+    DECLARE_HYPERCALL_BOUNCE(op, sizeof(*op), XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
+
+    if ( xc_hypercall_bounce_pre(xch, op) )
+    {
+        PERROR("Could not bounce xen_platform_op memory buffer");
+        return -1;
+    }
+    op->interface_version = XENPF_INTERFACE_VERSION;
+
+    platform_op = *op;
+    ret = do_platform_op(xch, &platform_op);
+    xc_hypercall_bounce_post(xch, op);
+
+    return ret;
+}
+
 int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
                    xc_cputopo_t *cputopo)
 {
diff --git a/tools/misc/Makefile b/tools/misc/Makefile
index 51adb6f..c297a75 100644
--- a/tools/misc/Makefile
+++ b/tools/misc/Makefile
@@ -22,6 +22,7 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmcrash
 INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmctx
 INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
 INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
+INSTALL_SBIN-$(CONFIG_X86)     += xenmicrocode
 INSTALL_SBIN                   += xen-tmem-list-parse
 INSTALL_SBIN                   += xencov
 INSTALL_SBIN                   += xenlockprof
@@ -114,4 +115,7 @@ xen-lowmemd: xen-lowmemd.o
 xencov: xencov.o
 	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
 
+xenmicrocode: xenmicrocode.o
+	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
+
 -include $(DEPS_INCLUDE)
diff --git a/tools/misc/xenmicrocode.c b/tools/misc/xenmicrocode.c
new file mode 100644
index 0000000..e6c8a3d
--- /dev/null
+++ b/tools/misc/xenmicrocode.c
@@ -0,0 +1,89 @@
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <sys/mman.h>
+#include <errno.h>
+#include <string.h>
+#include <inttypes.h>
+#include <unistd.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <xenctrl.h>
+
+void show_help(void)
+{
+    fprintf(stderr,
+            "xenmicrocode: Xen microcode updating tool\n"
+            "Usage: xenmicrocode <microcode blob>\n");
+}
+
+int main(int argc, char *argv[])
+{
+    int fd, len, ret;
+    char *filename, *buf;
+    struct stat st;
+    struct xen_platform_op op;
+    xc_interface *xch;
+    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
+
+    if (argc < 2)
+    {
+        show_help();
+        return 0;
+    }
+
+    filename = argv[1];
+    fd = open(filename, O_RDONLY);
+    if (fd < 0) {
+        fprintf(stderr, "Could not open %s. (err: %s)\n",
+                filename, strerror(errno));
+        return errno;
+    }
+
+    if (stat(filename, &st) != 0) {
+        fprintf(stderr, "Could not get the size of %s. (err: %s)\n",
+                filename, strerror(errno));
+        return errno;
+    }
+
+    len = st.st_size;
+    buf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
+    if (buf == MAP_FAILED) {
+        fprintf(stderr, "mmap failed. (error: %s)\n", strerror(errno));
+        return errno;
+    }
+
+    xch = xc_interface_open(0,0,0);
+    if (xch == NULL)
+    {
+        fprintf(stderr, "Error opening xc interface. (err: %s)\n",
+                strerror(errno));
+        return errno;
+    }
+
+    uc = xc_hypercall_buffer_alloc(xch, uc, len);
+    if (uc == NULL)
+        return -1;
+
+    memcpy(uc, buf, len);
+    set_xen_guest_handle(op.u.microcode.data, uc);
+    op.cmd = XENPF_microcode_update;
+    op.interface_version = XENPF_INTERFACE_VERSION;
+    op.u.microcode.length = len;
+    ret = xc_platform_op(xch, &op);
+    if ( ret )
+        fprintf(stderr, "Failed to update microcode. (err: %d)\n", ret);
+
+    xc_hypercall_buffer_free(xch, uc);
+    xc_interface_close(xch);
+
+    if (munmap(buf, len)) {
+        printf("Could not unmap: %d(%s)\n", errno, strerror(errno));
+        return errno;
+    }
+    close(fd);
+
+    return 0;
+}
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-12 15:33   ` Roger Pau Monné
  2019-03-11  7:57 ` [PATCH v6 03/12] microcode/intel: extend microcode_update_match() Chao Gao
                   ` (10 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/microcode_intel.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 9657575..22fdeca 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -37,7 +37,14 @@
 struct microcode_header_intel {
     unsigned int hdrver;
     unsigned int rev;
-    unsigned int date;
+    union {
+        struct {
+            uint16_t year;
+            uint8_t day;
+            uint8_t month;
+        };
+        unsigned int date;
+    };
     unsigned int sig;
     unsigned int cksum;
     unsigned int ldrver;
@@ -316,9 +323,9 @@ static int apply_microcode(unsigned int cpu)
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
            cpu_num, uci->cpu_sig.rev, val[1],
-           uci->mc.mc_intel->hdr.date & 0xffff,
-           uci->mc.mc_intel->hdr.date >> 24,
-           (uci->mc.mc_intel->hdr.date >> 16) & 0xff);
+           uci->mc.mc_intel->hdr.year,
+           uci->mc.mc_intel->hdr.month,
+           uci->mc.mc_intel->hdr.day);
     uci->cpu_sig.rev = val[1];
 
     return 0;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 03/12] microcode/intel: extend microcode_update_match()
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
  2019-03-11  7:57 ` [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-11  7:57 ` [PATCH v6 04/12] microcode: introduce a global cache of ucode patch Chao Gao
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

to a more generic function. Then, this function can compare two given
microcodes' signature/revision as well. Comparing two microcodes is
used to update the global microcode cache (introduced by the later
patches in this series) when a new microcode is given.

Note that enum microcode_match_result will be used in common code
(aka microcode.c), it has been placed in the common header.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
---
Changes in v6:
 - eliminate unnecessary type casting in microcode_update_match
 - check if a patch has an extend header

Changes in v5:
 - constify the extended_signature
 - use named enum type for the return value of microcode_update_match
---
 xen/arch/x86/microcode_intel.c  | 48 +++++++++++++++++++----------------------
 xen/include/asm-x86/microcode.h |  6 ++++++
 2 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 22fdeca..ecec83b 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -134,14 +134,28 @@ static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
     return 0;
 }
 
-static inline int microcode_update_match(
-    unsigned int cpu_num, const struct microcode_header_intel *mc_header,
-    int sig, int pf)
+static enum microcode_match_result microcode_update_match(
+    const struct microcode_header_intel *mc_header, unsigned int sig,
+    unsigned int pf, unsigned int rev)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    const struct extended_sigtable *ext_header;
+    const struct extended_signature *ext_sig;
+    unsigned long data_size = get_datasize(mc_header);
+    unsigned int i;
+
+    if ( sigmatch(sig, mc_header->sig, pf, mc_header->pf) )
+        return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
 
-    return (sigmatch(sig, uci->cpu_sig.sig, pf, uci->cpu_sig.pf) &&
-            (mc_header->rev > uci->cpu_sig.rev));
+    if ( get_totalsize(mc_header) == (data_size + MC_HEADER_SIZE) )
+        return MIS_UCODE;
+
+    ext_header = (const void *)(mc_header + 1) + data_size;
+    ext_sig = (const void *)(ext_header + 1);
+    for ( i = 0; i < ext_header->count; i++ )
+        if ( sigmatch(sig, ext_sig[i].sig, pf, ext_sig[i].pf) )
+            return (mc_header->rev > rev) ? NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
 }
 
 static int microcode_sanity_check(void *mc)
@@ -243,31 +257,13 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
-    const struct extended_sigtable *ext_header;
     unsigned long total_size = get_totalsize(mc_header);
-    int ext_sigcount, i;
-    struct extended_signature *ext_sig;
     void *new_mc;
 
-    if ( microcode_update_match(cpu, mc_header,
-                                mc_header->sig, mc_header->pf) )
-        goto find;
-
-    if ( total_size <= (get_datasize(mc_header) + MC_HEADER_SIZE) )
+    if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
+                                uci->cpu_sig.rev) != NEW_UCODE )
         return 0;
 
-    ext_header = mc + get_datasize(mc_header) + MC_HEADER_SIZE;
-    ext_sigcount = ext_header->count;
-    ext_sig = (void *)ext_header + EXT_HEADER_SIZE;
-    for ( i = 0; i < ext_sigcount; i++ )
-    {
-        if ( microcode_update_match(cpu, mc_header,
-                                    ext_sig->sig, ext_sig->pf) )
-            goto find;
-        ext_sig++;
-    }
-    return 0;
- find:
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
              cpu, mc_header->rev, uci->cpu_sig.rev);
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 23ea954..73ebe9a 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -3,6 +3,12 @@
 
 #include <xen/percpu.h>
 
+enum microcode_match_result {
+    OLD_UCODE, /* signature matched, but revision id isn't newer */
+    NEW_UCODE, /* signature matched, but revision id is newer */
+    MIS_UCODE, /* signature mismatched */
+};
+
 struct cpu_signature;
 struct ucode_cpu_info;
 
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (2 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 03/12] microcode/intel: extend microcode_update_match() Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-12 16:53   ` Roger Pau Monné
  2019-03-13 16:36   ` Sergey Dyasli
  2019-03-11  7:57 ` [PATCH v6 05/12] microcode: only save compatible ucode patches Chao Gao
                   ` (8 subsequent siblings)
  12 siblings, 2 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

to replace the current per-cpu cache 'uci->mc'.

Compared to the current per-cpu cache, the benefits of the global
microcode cache are:
1. It reduces the work that need to be done on each CPU. Parsing ucode
file is done once on one CPU. Other CPUs needn't parse ucode file.
Instead, they can find out and load the newest patch from the global
cache.
2. It reduces the memory consumption on a system with many CPU cores.

Two functions, microcode_save_patch() and microcode_find_patch() are
introduced. The former adds one given patch to the global cache. The
latter gets a newer and matched ucode patch from the global cache.
All operations on the cache is expected to be done with the
'microcode_mutex' hold.

Note that I deliberately avoid touching 'uci->mc' as I am going to
remove it completely in the next patch.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v6:
 - constify local variables and function parameters if possible
 - comment that the global cache is protected by 'microcode_mutex'.
   and add assertions to catch violations in microcode_{save/find}_patch()

Changes in v5:
 - reword the commit description
 - find_patch() and save_patch() are abstracted into common functions
   with some hooks for AMD and Intel
---
 xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
 xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
 xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
 xen/include/asm-x86/microcode.h | 13 ++++++
 4 files changed, 215 insertions(+), 15 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 4163f50..e629e6c 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
  */
 static bool_t __initdata ucode_scan;
 
+static LIST_HEAD(microcode_cache);
+
 void __init microcode_set_module(unsigned int idx)
 {
     ucode_mod_idx = idx;
@@ -208,6 +210,64 @@ static void microcode_fini_cpu(unsigned int cpu)
     spin_unlock(&microcode_mutex);
 }
 
+/*
+ * Save an ucode patch to the global cache list.
+ *
+ * Return true if a patch is added to the global cache or it replaces
+ * one old patch in the cache. Otherwise, return false. According to
+ * the return value, the caller decides not to do an expensive ucode
+ * update for some cases (i.e. when admin provides an old ucode).
+ */
+bool microcode_save_patch(struct microcode_patch *new)
+{
+    struct microcode_patch *old;
+
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    list_for_each_entry(old, &microcode_cache, list)
+    {
+        enum microcode_match_result result =
+            microcode_ops->compare_patch(new, old);
+
+        if ( result == OLD_UCODE )
+        {
+            microcode_ops->free_patch(new);
+            return false;
+        }
+        else if ( result == NEW_UCODE )
+        {
+            list_replace(&old->list, &new->list);
+            microcode_ops->free_patch(old);
+            return true;
+        }
+        else /* result == MIS_UCODE */
+            continue;
+    }
+    list_add_tail(&new->list, &microcode_cache);
+
+    return true;
+}
+
+/*
+ * Find an ucode patch which has newer revision than the one in use.
+ *
+ * Return NULL if there is no patch of newer revision.
+ */
+const struct microcode_patch *microcode_find_patch(void)
+{
+    const struct microcode_patch *microcode_patch;
+
+    ASSERT(spin_is_locked(&microcode_mutex));
+
+    list_for_each_entry(microcode_patch, &microcode_cache, list)
+    {
+        if ( microcode_ops->match_cpu(microcode_patch) )
+            return microcode_patch;
+    }
+
+    return NULL;
+}
+
 int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 7a854c0..8897c2c 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -190,24 +190,84 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
     return 1;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    return microcode_fits(patch->data, smp_processor_id());
+}
+
+static struct microcode_patch *alloc_microcode_patch(
+    const struct microcode_amd *mc_amd)
+{
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+    struct microcode_amd *cache = xmalloc(struct microcode_amd);
+    void *mpb = xmalloc_bytes(mc_amd->mpb_size);
+    struct equiv_cpu_entry *equiv_cpu_table =
+                                xmalloc_bytes(mc_amd->equiv_cpu_table_size);
+
+    if ( !microcode_patch || !cache || !mpb || !equiv_cpu_table )
+    {
+        xfree(microcode_patch);
+        xfree(cache);
+        xfree(mpb);
+        xfree(equiv_cpu_table);
+        printk(XENLOG_ERR "microcode: Can not allocate memory\n");
+        return ERR_PTR(-ENOMEM);
+    }
+
+    cache->equiv_cpu_table = equiv_cpu_table;
+    cache->mpb = mpb;
+    memcpy(cache->equiv_cpu_table, mc_amd->equiv_cpu_table,
+           mc_amd->equiv_cpu_table_size);
+    memcpy(cache->mpb, mc_amd->mpb, mc_amd->mpb_size);
+    cache->equiv_cpu_table_size = mc_amd->equiv_cpu_table_size;
+    cache->mpb_size = mc_amd->mpb_size;
+    microcode_patch->data = cache;
+
+    return microcode_patch;
+}
+
+static void free_patch(struct microcode_patch *microcode_patch)
+{
+    struct microcode_amd *mc_amd = microcode_patch->data;
+
+    xfree(mc_amd->equiv_cpu_table);
+    xfree(mc_amd->mpb);
+    xfree(mc_amd);
+    xfree(microcode_patch);
+}
+
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    const struct microcode_amd *new_mc = new->data;
+    const struct microcode_header_amd *new_header = new_mc->mpb;
+    const struct microcode_amd *old_mc = old->data;
+    const struct microcode_header_amd *old_header = old_mc->mpb;
+
+    if ( new_header->processor_rev_id == old_header->processor_rev_id )
+        return (new_header->patch_id > old_header->patch_id) ?
+                NEW_UCODE : OLD_UCODE;
+
+    return MIS_UCODE;
+}
+
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    struct microcode_header_amd *hdr;
+    const struct microcode_header_amd *hdr;
+    const struct microcode_patch *patch;
     int hw_err;
 
     /* We should bind the task to the CPU */
     BUG_ON(raw_smp_processor_id() != cpu);
 
-    if ( mc_amd == NULL )
+    patch = microcode_find_patch();
+    if ( patch == NULL )
         return -EINVAL;
 
-    hdr = mc_amd->mpb;
-    if ( hdr == NULL )
-        return -EINVAL;
+    hdr = patch->data;
 
     spin_lock_irqsave(&microcode_update_lock, flags);
 
@@ -497,7 +557,21 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
-        if ( microcode_fits(mc_amd, cpu) )
+        struct microcode_patch *microcode_patch = alloc_microcode_patch(mc_amd);
+
+        if ( IS_ERR(microcode_patch) )
+        {
+            error = PTR_ERR(microcode_patch);
+            break;
+        }
+
+        /*
+         * In order to support a system with mixed stepping CPUs, save
+         * this ucode patch before checking whether it matches with
+         * current CPU.
+         */
+        if ( microcode_save_patch(microcode_patch) &&
+             microcode_fits(mc_amd, cpu) )
         {
             error = apply_microcode(cpu);
             if ( error )
@@ -639,6 +713,9 @@ static const struct microcode_ops microcode_amd_ops = {
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
     .start_update                     = start_update,
+    .compare_patch                    = compare_patch,
+    .free_patch                       = free_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_amd(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index ecec83b..b556d3d 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -158,6 +158,15 @@ static enum microcode_match_result microcode_update_match(
     return MIS_UCODE;
 }
 
+static bool match_cpu(const struct microcode_patch *patch)
+{
+    struct ucode_cpu_info *uci = &this_cpu(ucode_cpu_info);
+    int ret = microcode_update_match(patch->data, uci->cpu_sig.sig,
+                                     uci->cpu_sig.pf, uci->cpu_sig.rev);
+
+    return ret == NEW_UCODE;
+}
+
 static int microcode_sanity_check(void *mc)
 {
     struct microcode_header_intel *mc_header = mc;
@@ -248,6 +257,21 @@ static int microcode_sanity_check(void *mc)
     return 0;
 }
 
+static void free_patch(struct microcode_patch *patch)
+{
+    xfree(patch->data);
+    xfree(patch);
+}
+
+static enum microcode_match_result compare_patch(
+    const struct microcode_patch *new, const struct microcode_patch *old)
+{
+    const struct microcode_header_intel *old_header = old->data;
+
+    return microcode_update_match(new->data, old_header->sig,
+                                  old_header->pf, old_header->rev);
+}
+
 /*
  * return 0 - no update found
  * return 1 - found update
@@ -258,7 +282,25 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
-    void *new_mc;
+    void *new_mc = xmalloc_bytes(total_size);
+    struct microcode_patch *microcode_patch = xmalloc(struct microcode_patch);
+
+    if ( !microcode_patch || !new_mc )
+    {
+        xfree(microcode_patch);
+        xfree(new_mc);
+        printk(XENLOG_ERR "microcode: Can not allocate memory\n");
+        return -ENOMEM;
+    }
+    memcpy(new_mc, mc, total_size);
+    microcode_patch->data = new_mc;
+
+    /*
+     * In order to support a system with mixed stepping CPUs, save this ucode
+     * patch before checking whether it matches with current CPU.
+     */
+    if ( !microcode_save_patch(microcode_patch) )
+        return 0;
 
     if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
                                 uci->cpu_sig.rev) != NEW_UCODE )
@@ -287,18 +329,23 @@ static int apply_microcode(unsigned int cpu)
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
     struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    const struct microcode_intel *mc_intel;
+    const struct microcode_patch *patch;
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu_num != cpu);
 
-    if ( uci->mc.mc_intel == NULL )
+    patch = microcode_find_patch();
+    if ( !patch )
         return -EINVAL;
 
+    mc_intel = patch->data;
+
     /* serialize access to the physical write to MSR 0x79 */
     spin_lock_irqsave(&microcode_update_lock, flags);
 
     /* write microcode via MSR 0x79 */
-    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)uci->mc.mc_intel->bits);
+    wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
 
     /* As documented in the SDM: Do a CPUID 1 here */
@@ -309,19 +356,19 @@ static int apply_microcode(unsigned int cpu)
     val[1] = (uint32_t)(msr_content >> 32);
 
     spin_unlock_irqrestore(&microcode_update_lock, flags);
-    if ( val[1] != uci->mc.mc_intel->hdr.rev )
+    if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, uci->mc.mc_intel->hdr.rev, val[1]);
+               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
            cpu_num, uci->cpu_sig.rev, val[1],
-           uci->mc.mc_intel->hdr.year,
-           uci->mc.mc_intel->hdr.month,
-           uci->mc.mc_intel->hdr.day);
+           mc_intel->hdr.year,
+           mc_intel->hdr.month,
+           mc_intel->hdr.day);
     uci->cpu_sig.rev = val[1];
 
     return 0;
@@ -406,6 +453,9 @@ static const struct microcode_ops microcode_intel_ops = {
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
+    .compare_patch                    = compare_patch,
+    .free_patch                       = free_patch,
+    .match_cpu                        = match_cpu,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 73ebe9a..7ff42fe 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -1,6 +1,7 @@
 #ifndef ASM_X86__MICROCODE_H
 #define ASM_X86__MICROCODE_H
 
+#include <xen/list.h>
 #include <xen/percpu.h>
 
 enum microcode_match_result {
@@ -12,6 +13,11 @@ enum microcode_match_result {
 struct cpu_signature;
 struct ucode_cpu_info;
 
+struct microcode_patch {
+    struct list_head list;
+    void *data;
+};
+
 struct microcode_ops {
     int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
@@ -19,6 +25,10 @@ struct microcode_ops {
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
     int (*apply_microcode)(unsigned int cpu);
     int (*start_update)(void);
+    enum microcode_match_result (*compare_patch)(
+        const struct microcode_patch *new, const struct microcode_patch *old);
+    void (*free_patch)(struct microcode_patch *patch);
+    bool (*match_cpu)(const struct microcode_patch *patch);
 };
 
 struct cpu_signature {
@@ -39,4 +49,7 @@ struct ucode_cpu_info {
 DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
 extern const struct microcode_ops *microcode_ops;
 
+bool microcode_save_patch(struct microcode_patch *new_patch);
+const struct microcode_patch *microcode_find_patch(void);
+
 #endif /* ASM_X86__MICROCODE_H */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 05/12] microcode: only save compatible ucode patches
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (3 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 04/12] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-12 17:03   ` Roger Pau Monné
  2019-03-11  7:57 ` [PATCH v6 06/12] microcode: remove struct ucode_cpu_info Chao Gao
                   ` (7 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Intel CPU only allows mixing in stepping or 'pf'. If an ucode patch is
for other CPU families or models, it won't be compatible to all CPUs on
current system and even hot-plugged CPUs.  Don't save such patch to
reduce the size of ucode cache.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v6:
 - new
---
 xen/arch/x86/microcode.c        |  8 ++++++++
 xen/arch/x86/microcode_intel.c  | 35 +++++++++++++++++++++++++++++++++++
 xen/include/asm-x86/microcode.h |  1 +
 3 files changed, 44 insertions(+)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index e629e6c..9ffbb15 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -243,6 +243,14 @@ bool microcode_save_patch(struct microcode_patch *new)
         else /* result == MIS_UCODE */
             continue;
     }
+
+    if ( microcode_ops->is_patch_compatible &&
+         !microcode_ops->is_patch_compatible(new) )
+    {
+        xfree(new);
+        return false;
+    }
+
     list_add_tail(&new->list, &microcode_cache);
 
     return true;
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index b556d3d..d973634 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -93,6 +93,8 @@ struct extended_sigtable {
 
 #define exttable_size(et) ((et)->count * EXT_SIGNATURE_SIZE + EXT_HEADER_SIZE)
 
+#define STEPPING_MASK ~0xfU
+
 /* serialize access to the physical write to MSR 0x79 */
 static DEFINE_SPINLOCK(microcode_update_lock);
 
@@ -167,6 +169,38 @@ static bool match_cpu(const struct microcode_patch *patch)
     return ret == NEW_UCODE;
 }
 
+/*
+ * Make sure the patch is compatible with current system. It depends on
+ * how much difference current CPU and the patch's supported CPUs have.
+ *
+ * For Intel cpu, make sure that the patch can be applied to a cpu which has
+ * the same family and model as current CPU.
+ */
+static bool is_patch_compatible(const struct microcode_patch *patch)
+{
+    struct ucode_cpu_info *uci = &this_cpu(ucode_cpu_info);
+    const struct cpu_signature *csig = &uci->cpu_sig;
+    const struct microcode_header_intel *mc_header = patch->data;
+    const struct extended_sigtable *ext_header;
+    const struct extended_signature *ext_sig;
+    unsigned long data_size = get_datasize(mc_header);
+    unsigned int i;
+
+    if ( (csig->sig & STEPPING_MASK) == (mc_header->sig & STEPPING_MASK) )
+        return true;
+
+    if ( get_totalsize(mc_header) == (data_size + MC_HEADER_SIZE) )
+        return false;
+
+    ext_header = (const void *)(mc_header + 1) + data_size;
+    ext_sig = (const void *)(ext_header + 1);
+    for ( i = 0; i < ext_header->count; i++ )
+        if ( (csig->sig & STEPPING_MASK) == (ext_sig[i].sig & STEPPING_MASK) )
+            return true;
+
+    return false;
+}
+
 static int microcode_sanity_check(void *mc)
 {
     struct microcode_header_intel *mc_header = mc;
@@ -456,6 +490,7 @@ static const struct microcode_ops microcode_intel_ops = {
     .compare_patch                    = compare_patch,
     .free_patch                       = free_patch,
     .match_cpu                        = match_cpu,
+    .is_patch_compatible              = is_patch_compatible,
 };
 
 int __init microcode_init_intel(void)
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 7ff42fe..92631b4 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -29,6 +29,7 @@ struct microcode_ops {
         const struct microcode_patch *new, const struct microcode_patch *old);
     void (*free_patch)(struct microcode_patch *patch);
     bool (*match_cpu)(const struct microcode_patch *patch);
+    bool (*is_patch_compatible)(const struct microcode_patch *patch);
 };
 
 struct cpu_signature {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 06/12] microcode: remove struct ucode_cpu_info
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (4 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 05/12] microcode: only save compatible ucode patches Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-11  7:57 ` [PATCH v6 07/12] microcode: remove pointless 'cpu' parameter Chao Gao
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

We can remove the per-cpu cache field in struct ucode_cpu_info since
it has been replaced by a global cache. It would leads to only one field
remaining in ucode_cpu_info. Then, this struct is removed and the
remaining field (cpu signature) is stored in per-cpu area.

Also remove 'microcode_resume_match' from microcode_ops because the
check is done in find_patch(). The cpu status notifier is also
removed. It was used to free the "mc" field to avoid memory leak.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v6:
 - remove the whole struct ucode_cpu_info instead of the per-cpu cache
 in it.
---
 xen/arch/x86/apic.c             |   2 +-
 xen/arch/x86/microcode.c        |  91 +++---------------------------------
 xen/arch/x86/microcode_amd.c    | 100 +++++-----------------------------------
 xen/arch/x86/microcode_intel.c  |  41 +++++-----------
 xen/arch/x86/spec_ctrl.c        |   2 +-
 xen/include/asm-x86/microcode.h |  13 +-----
 6 files changed, 33 insertions(+), 216 deletions(-)

diff --git a/xen/arch/x86/apic.c b/xen/arch/x86/apic.c
index 2a24326..fd31d33 100644
--- a/xen/arch/x86/apic.c
+++ b/xen/arch/x86/apic.c
@@ -1185,7 +1185,7 @@ static void __init check_deadline_errata(void)
     else
         rev = (unsigned long)m->driver_data;
 
-    if ( this_cpu(ucode_cpu_info).cpu_sig.rev >= rev )
+    if ( this_cpu(cpu_sig).rev >= rev )
         return;
 
     setup_clear_cpu_cap(X86_FEATURE_TSC_DEADLINE);
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 9ffbb15..fd1a243 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -186,7 +186,7 @@ const struct microcode_ops *microcode_ops;
 
 static DEFINE_SPINLOCK(microcode_mutex);
 
-DEFINE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 struct microcode_info {
     unsigned int cpu;
@@ -195,21 +195,6 @@ struct microcode_info {
     char buffer[1];
 };
 
-static void __microcode_fini_cpu(unsigned int cpu)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-
-    xfree(uci->mc.mc_valid);
-    memset(uci, 0, sizeof(*uci));
-}
-
-static void microcode_fini_cpu(unsigned int cpu)
-{
-    spin_lock(&microcode_mutex);
-    __microcode_fini_cpu(cpu);
-    spin_unlock(&microcode_mutex);
-}
-
 /*
  * Save an ucode patch to the global cache list.
  *
@@ -279,52 +264,16 @@ const struct microcode_patch *microcode_find_patch(void)
 int microcode_resume_cpu(unsigned int cpu)
 {
     int err;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct cpu_signature nsig;
-    unsigned int cpu2;
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
-    if ( err )
-    {
-        __microcode_fini_cpu(cpu);
-        spin_unlock(&microcode_mutex);
-        return err;
-    }
-
-    if ( uci->mc.mc_valid )
-    {
-        err = microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid);
-        if ( err >= 0 )
-        {
-            if ( err )
-                err = microcode_ops->apply_microcode(cpu);
-            spin_unlock(&microcode_mutex);
-            return err;
-        }
-    }
-
-    nsig = uci->cpu_sig;
-    __microcode_fini_cpu(cpu);
-    uci->cpu_sig = nsig;
-
-    err = -EIO;
-    for_each_online_cpu ( cpu2 )
-    {
-        uci = &per_cpu(ucode_cpu_info, cpu2);
-        if ( uci->mc.mc_valid &&
-             microcode_ops->microcode_resume_match(cpu, uci->mc.mc_valid) > 0 )
-        {
-            err = microcode_ops->apply_microcode(cpu);
-            break;
-        }
-    }
-
-    __microcode_fini_cpu(cpu);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
+    if ( likely(!err) )
+        err = microcode_ops->apply_microcode(cpu);
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -334,16 +283,13 @@ static int microcode_update_cpu(const void *buf, size_t size)
 {
     int err;
     unsigned int cpu = smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, &uci->cpu_sig);
+    err = microcode_ops->collect_cpu_info(cpu, sig);
     if ( likely(!err) )
         err = microcode_ops->cpu_request_microcode(cpu, buf, size);
-    else
-        __microcode_fini_cpu(cpu);
-
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -430,25 +376,6 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-static int microcode_percpu_callback(
-    struct notifier_block *nfb, unsigned long action, void *hcpu)
-{
-    unsigned int cpu = (unsigned long)hcpu;
-
-    switch ( action )
-    {
-    case CPU_DEAD:
-        microcode_fini_cpu(cpu);
-        break;
-    }
-
-    return NOTIFY_DONE;
-}
-
-static struct notifier_block microcode_percpu_nfb = {
-    .notifier_call = microcode_percpu_callback,
-};
-
 int __init early_microcode_update_cpu(bool start_update)
 {
     int rc = 0;
@@ -492,12 +419,8 @@ int __init early_microcode_init(void)
         return rc;
 
     if ( microcode_ops )
-    {
         if ( ucode_mod.mod_end || ucode_blob.size )
             rc = early_microcode_update_cpu(true);
 
-        register_cpu_notifier(&microcode_percpu_nfb);
-    }
-
     return rc;
 }
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 8897c2c..4586e8d 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -155,7 +155,7 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
 static bool_t microcode_fits(const struct microcode_amd *mc_amd,
                              unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
@@ -178,14 +178,14 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
         return 0;
     }
 
-    if ( mc_header->patch_id <= uci->cpu_sig.rev )
+    if ( mc_header->patch_id <= sig->rev )
     {
         pr_debug("microcode: patch is already at required level or greater.\n");
         return 0;
     }
 
     pr_debug("microcode: CPU%d found a matching microcode update with version %#x (current=%#x)\n",
-             cpu, mc_header->patch_id, uci->cpu_sig.rev);
+             cpu, mc_header->patch_id, sig->rev);
 
     return 1;
 }
@@ -254,11 +254,11 @@ static enum microcode_match_result compare_patch(
 static int apply_microcode(unsigned int cpu)
 {
     unsigned long flags;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
     uint32_t rev;
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch;
     int hw_err;
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     /* We should bind the task to the CPU */
     BUG_ON(raw_smp_processor_id() != cpu);
@@ -293,9 +293,9 @@ static int apply_microcode(unsigned int cpu)
     }
 
     printk(KERN_WARNING "microcode: CPU%d updated from revision %#x to %#x\n",
-           cpu, uci->cpu_sig.rev, hdr->patch_id);
+           cpu, sig->rev, hdr->patch_id);
 
-    uci->cpu_sig.rev = rev;
+    sig->rev = rev;
 
     return 0;
 }
@@ -441,14 +441,14 @@ static bool_t check_final_patch_levels(unsigned int cpu)
      * any of the 'final_levels', then we should not update the microcode
      * patch on the cpu as system will hang otherwise.
      */
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     unsigned int i;
 
     if ( boot_cpu_data.x86 != 0x10 )
         return 0;
 
     for ( i = 0; i < ARRAY_SIZE(final_levels); i++ )
-        if ( uci->cpu_sig.rev == final_levels[i] )
+        if ( sig->rev == final_levels[i] )
             return 1;
 
     return 0;
@@ -457,13 +457,12 @@ static bool_t check_final_patch_levels(unsigned int cpu)
 static int cpu_request_microcode(unsigned int cpu, const void *buf,
                                  size_t bufsize)
 {
-    struct microcode_amd *mc_amd, *mc_old;
+    struct microcode_amd *mc_amd;
     size_t offset = 0;
-    size_t last_offset, applied_offset = 0;
-    int error = 0, save_error = 1;
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
     /* We should bind the task to the CPU */
     BUG_ON(cpu != raw_smp_processor_id());
@@ -532,7 +531,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         {
             printk(KERN_ERR "microcode: CPU%d incorrect or corrupt container file\n"
                    "microcode: Failed to update patch level. "
-                   "Current lvl:%#x\n", cpu, uci->cpu_sig.rev);
+                   "Current lvl:%#x\n", cpu, sig->rev);
             break;
         }
     }
@@ -543,17 +542,12 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    mc_old = uci->mc.mc_amd;
-    /* implicitely validates uci->mc.mc_valid */
-    uci->mc.mc_amd = mc_amd;
-
     /*
      * It's possible the data file has multiple matching ucode,
      * lets keep searching till the latest version
      */
     mc_amd->mpb = NULL;
     mc_amd->mpb_size = 0;
-    last_offset = offset;
     while ( (error = get_ucode_from_buffer_amd(mc_amd, buf, bufsize,
                                                &offset)) == 0 )
     {
@@ -576,11 +570,8 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             error = apply_microcode(cpu);
             if ( error )
                 break;
-            applied_offset = last_offset;
         }
 
-        last_offset = offset;
-
         if ( offset >= bufsize )
             break;
 
@@ -609,26 +600,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
             break;
     }
 
-    /* On success keep the microcode patch for
-     * re-apply on resume.
-     */
-    if ( applied_offset )
-    {
-        save_error = get_ucode_from_buffer_amd(
-            mc_amd, buf, bufsize, &applied_offset);
-
-        if ( save_error )
-            error = save_error;
-    }
-
-    if ( save_error )
-    {
-        xfree(mc_amd);
-        uci->mc.mc_amd = mc_old;
-    }
-    else
-        xfree(mc_old);
-
   out:
 #if CONFIG_HVM
     svm_host_osvw_init();
@@ -643,52 +614,6 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
-    struct microcode_amd *mc_amd = uci->mc.mc_amd;
-    const struct microcode_amd *src = mc;
-
-    if ( !microcode_fits(src, cpu) )
-        return 0;
-
-    if ( src != mc_amd )
-    {
-        if ( mc_amd )
-        {
-            xfree(mc_amd->equiv_cpu_table);
-            xfree(mc_amd->mpb);
-            xfree(mc_amd);
-        }
-
-        mc_amd = xmalloc(struct microcode_amd);
-        uci->mc.mc_amd = mc_amd;
-        if ( !mc_amd )
-            return -ENOMEM;
-        mc_amd->equiv_cpu_table = xmalloc_bytes(src->equiv_cpu_table_size);
-        if ( !mc_amd->equiv_cpu_table )
-            goto err1;
-        mc_amd->mpb = xmalloc_bytes(src->mpb_size);
-        if ( !mc_amd->mpb )
-            goto err2;
-
-        mc_amd->equiv_cpu_table_size = src->equiv_cpu_table_size;
-        mc_amd->mpb_size = src->mpb_size;
-        memcpy(mc_amd->mpb, src->mpb, src->mpb_size);
-        memcpy(mc_amd->equiv_cpu_table, src->equiv_cpu_table,
-               src->equiv_cpu_table_size);
-    }
-
-    return 1;
-
-err2:
-    xfree(mc_amd->equiv_cpu_table);
-err1:
-    xfree(mc_amd);
-    uci->mc.mc_amd = NULL;
-    return -ENOMEM;
-}
-
 static int start_update(void)
 {
 #if CONFIG_HVM
@@ -708,7 +633,6 @@ static int start_update(void)
 }
 
 static const struct microcode_ops microcode_amd_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index d973634..a495701 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -162,11 +162,10 @@ static enum microcode_match_result microcode_update_match(
 
 static bool match_cpu(const struct microcode_patch *patch)
 {
-    struct ucode_cpu_info *uci = &this_cpu(ucode_cpu_info);
-    int ret = microcode_update_match(patch->data, uci->cpu_sig.sig,
-                                     uci->cpu_sig.pf, uci->cpu_sig.rev);
+    const struct cpu_signature *sig = &this_cpu(cpu_sig);
 
-    return ret == NEW_UCODE;
+    return microcode_update_match(patch->data, sig->sig, sig->pf, sig->rev) ==
+               NEW_UCODE;
 }
 
 /*
@@ -178,8 +177,7 @@ static bool match_cpu(const struct microcode_patch *patch)
  */
 static bool is_patch_compatible(const struct microcode_patch *patch)
 {
-    struct ucode_cpu_info *uci = &this_cpu(ucode_cpu_info);
-    const struct cpu_signature *csig = &uci->cpu_sig;
+    const struct cpu_signature *csig = &this_cpu(cpu_sig);
     const struct microcode_header_intel *mc_header = patch->data;
     const struct extended_sigtable *ext_header;
     const struct extended_signature *ext_sig;
@@ -313,7 +311,7 @@ static enum microcode_match_result compare_patch(
  */
 static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu);
+    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
@@ -336,23 +334,12 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     if ( !microcode_save_patch(microcode_patch) )
         return 0;
 
-    if ( microcode_update_match(mc, uci->cpu_sig.sig, uci->cpu_sig.pf,
-                                uci->cpu_sig.rev) != NEW_UCODE )
+    if ( microcode_update_match(mc, sig->sig, sig->pf, sig->rev) != NEW_UCODE )
         return 0;
 
     pr_debug("microcode: CPU%d found a matching microcode update with"
              " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, uci->cpu_sig.rev);
-    new_mc = xmalloc_bytes(total_size);
-    if ( new_mc == NULL )
-    {
-        printk(KERN_ERR "microcode: error! Can not allocate memory\n");
-        return -ENOMEM;
-    }
-
-    memcpy(new_mc, mc, total_size);
-    xfree(uci->mc.mc_intel);
-    uci->mc.mc_intel = new_mc;
+             cpu, mc_header->rev, sig->rev);
     return 1;
 }
 
@@ -362,7 +349,7 @@ static int apply_microcode(unsigned int cpu)
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct ucode_cpu_info *uci = &per_cpu(ucode_cpu_info, cpu_num);
+    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch;
 
@@ -394,16 +381,16 @@ static int apply_microcode(unsigned int cpu)
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
                "%#x to %#x failed. Resulting revision is %#x.\n", cpu_num,
-               uci->cpu_sig.rev, mc_intel->hdr.rev, val[1]);
+               sig->rev, mc_intel->hdr.rev, val[1]);
         return -EIO;
     }
     printk(KERN_INFO "microcode: CPU%d updated from revision "
            "%#x to %#x, date = %04x-%02x-%02x \n",
-           cpu_num, uci->cpu_sig.rev, val[1],
+           cpu_num, sig->rev, val[1],
            mc_intel->hdr.year,
            mc_intel->hdr.month,
            mc_intel->hdr.day);
-    uci->cpu_sig.rev = val[1];
+    sig->rev = val[1];
 
     return 0;
 }
@@ -477,13 +464,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
     return error;
 }
 
-static int microcode_resume_match(unsigned int cpu, const void *mc)
-{
-    return get_matching_microcode(mc, cpu);
-}
-
 static const struct microcode_ops microcode_intel_ops = {
-    .microcode_resume_match           = microcode_resume_match,
     .cpu_request_microcode            = cpu_request_microcode,
     .collect_cpu_info                 = collect_cpu_info,
     .apply_microcode                  = apply_microcode,
diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
index ad72ecd..b14d591 100644
--- a/xen/arch/x86/spec_ctrl.c
+++ b/xen/arch/x86/spec_ctrl.c
@@ -304,7 +304,7 @@ static void __init print_details(enum ind_thunk thunk, uint64_t caps)
 /* Calculate whether Retpoline is known-safe on this CPU. */
 static bool __init retpoline_safe(uint64_t caps)
 {
-    unsigned int ucode_rev = this_cpu(ucode_cpu_info).cpu_sig.rev;
+    unsigned int ucode_rev = this_cpu(cpu_sig).rev;
 
     if ( boot_cpu_data.x86_vendor == X86_VENDOR_AMD )
         return true;
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 92631b4..23fc19b 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -11,7 +11,6 @@ enum microcode_match_result {
 };
 
 struct cpu_signature;
-struct ucode_cpu_info;
 
 struct microcode_patch {
     struct list_head list;
@@ -19,7 +18,6 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*microcode_resume_match)(unsigned int cpu, const void *mc);
     int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
                                  size_t size);
     int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
@@ -38,16 +36,7 @@ struct cpu_signature {
     unsigned int rev;
 };
 
-struct ucode_cpu_info {
-    struct cpu_signature cpu_sig;
-    union {
-        struct microcode_intel *mc_intel;
-        struct microcode_amd *mc_amd;
-        void *mc_valid;
-    } mc;
-};
-
-DECLARE_PER_CPU(struct ucode_cpu_info, ucode_cpu_info);
+DECLARE_PER_CPU(struct cpu_signature, cpu_sig);
 extern const struct microcode_ops *microcode_ops;
 
 bool microcode_save_patch(struct microcode_patch *new_patch);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 07/12] microcode: remove pointless 'cpu' parameter
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (5 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 06/12] microcode: remove struct ucode_cpu_info Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-11  7:57 ` [PATCH v6 08/12] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Some callbacks in microcode_ops or related functions take a cpu
id parameter. But at current call sites, the cpu id parameter is
always equal to current cpu id. Some of them even use an assertion
to guarantee this. Remove this redundent 'cpu' parameter.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 12 ++++++------
 xen/arch/x86/microcode_amd.c    | 36 ++++++++++++++----------------------
 xen/arch/x86/microcode_intel.c  | 21 +++++++--------------
 xen/arch/x86/smpboot.c          |  2 +-
 xen/include/asm-x86/microcode.h |  7 +++----
 xen/include/asm-x86/processor.h |  2 +-
 7 files changed, 33 insertions(+), 49 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index 93e967f..b4c3669 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -253,7 +253,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu(0);
+    microcode_resume_cpu();
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index fd1a243..8bfdf95 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -261,19 +261,19 @@ const struct microcode_patch *microcode_find_patch(void)
     return NULL;
 }
 
-int microcode_resume_cpu(unsigned int cpu)
+int microcode_resume_cpu(void)
 {
     int err;
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
 
     if ( !microcode_ops )
         return 0;
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->apply_microcode(cpu);
+        err = microcode_ops->apply_microcode();
     spin_unlock(&microcode_mutex);
 
     return err;
@@ -287,9 +287,9 @@ static int microcode_update_cpu(const void *buf, size_t size)
 
     spin_lock(&microcode_mutex);
 
-    err = microcode_ops->collect_cpu_info(cpu, sig);
+    err = microcode_ops->collect_cpu_info(sig);
     if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(cpu, buf, size);
+        err = microcode_ops->cpu_request_microcode(buf, size);
     spin_unlock(&microcode_mutex);
 
     return err;
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 4586e8d..25935dd 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -78,8 +78,9 @@ struct mpbhdr {
 static DEFINE_SPINLOCK(microcode_update_lock);
 
 /* See comment in start_update() for cases when this routine fails */
-static int collect_cpu_info(unsigned int cpu, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu];
 
     memset(csig, 0, sizeof(*csig));
@@ -152,18 +153,15 @@ static bool_t find_equiv_cpu_id(const struct equiv_cpu_entry *equiv_cpu_table,
     return 0;
 }
 
-static bool_t microcode_fits(const struct microcode_amd *mc_amd,
-                             unsigned int cpu)
+static bool microcode_fits(const struct microcode_amd *mc_amd)
 {
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_amd *mc_header = mc_amd->mpb;
     const struct equiv_cpu_entry *equiv_cpu_table = mc_amd->equiv_cpu_table;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( !find_equiv_cpu_id(equiv_cpu_table, current_cpu_id, &equiv_cpu_id) )
@@ -192,7 +190,7 @@ static bool_t microcode_fits(const struct microcode_amd *mc_amd,
 
 static bool match_cpu(const struct microcode_patch *patch)
 {
-    return microcode_fits(patch->data, smp_processor_id());
+    return microcode_fits(patch->data);
 }
 
 static struct microcode_patch *alloc_microcode_patch(
@@ -251,18 +249,16 @@ static enum microcode_match_result compare_patch(
     return MIS_UCODE;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint32_t rev;
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch;
     int hw_err;
+    unsigned int cpu = smp_processor_id();
     struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
-    /* We should bind the task to the CPU */
-    BUG_ON(raw_smp_processor_id() != cpu);
-
     patch = microcode_find_patch();
     if ( patch == NULL )
         return -EINVAL;
@@ -434,14 +430,14 @@ static const unsigned int final_levels[] = {
     0x010000af
 };
 
-static bool_t check_final_patch_levels(unsigned int cpu)
+static bool check_final_patch_levels(void)
 {
     /*
      * Check the current patch levels on the cpu. If they are equal to
      * any of the 'final_levels', then we should not update the microcode
      * patch on the cpu as system will hang otherwise.
      */
-    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    const struct cpu_signature *sig = &this_cpu(cpu_sig);
     unsigned int i;
 
     if ( boot_cpu_data.x86 != 0x10 )
@@ -454,19 +450,16 @@ static bool_t check_final_patch_levels(unsigned int cpu)
     return 0;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t bufsize)
+static int cpu_request_microcode(const void *buf, size_t bufsize)
 {
     struct microcode_amd *mc_amd;
     size_t offset = 0;
     int error = 0;
     unsigned int current_cpu_id;
     unsigned int equiv_cpu_id;
+    unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     current_cpu_id = cpuid_eax(0x00000001);
 
     if ( *(const uint32_t *)buf != UCODE_MAGIC )
@@ -476,7 +469,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         goto out;
     }
 
-    if ( check_final_patch_levels(cpu) )
+    if ( check_final_patch_levels() )
     {
         printk(XENLOG_INFO
                "microcode: Cannot update microcode patch on the cpu as we hit a final level\n");
@@ -564,10 +557,9 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
          * this ucode patch before checking whether it matches with
          * current CPU.
          */
-        if ( microcode_save_patch(microcode_patch) &&
-             microcode_fits(mc_amd, cpu) )
+        if ( microcode_save_patch(microcode_patch) && microcode_fits(mc_amd) )
         {
-            error = apply_microcode(cpu);
+            error = apply_microcode();
             if ( error )
                 break;
         }
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index a495701..18c833f 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -98,13 +98,12 @@ struct extended_sigtable {
 /* serialize access to the physical write to MSR 0x79 */
 static DEFINE_SPINLOCK(microcode_update_lock);
 
-static int collect_cpu_info(unsigned int cpu_num, struct cpu_signature *csig)
+static int collect_cpu_info(struct cpu_signature *csig)
 {
+    unsigned int cpu_num = smp_processor_id();
     struct cpuinfo_x86 *c = &cpu_data[cpu_num];
     uint64_t msr_content;
 
-    BUG_ON(cpu_num != smp_processor_id());
-
     memset(csig, 0, sizeof(*csig));
 
     if ( (c->x86_vendor != X86_VENDOR_INTEL) || (c->x86 < 6) )
@@ -343,19 +342,16 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
     return 1;
 }
 
-static int apply_microcode(unsigned int cpu)
+static int apply_microcode(void)
 {
     unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    struct cpu_signature *sig = &this_cpu(cpu_sig);
     const struct microcode_intel *mc_intel;
     const struct microcode_patch *patch;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu_num != cpu);
-
     patch = microcode_find_patch();
     if ( !patch )
         return -EINVAL;
@@ -423,17 +419,14 @@ static long get_next_ucode_from_buffer(void **mc, const u8 *buf,
     return offset + total_size;
 }
 
-static int cpu_request_microcode(unsigned int cpu, const void *buf,
-                                 size_t size)
+static int cpu_request_microcode(const void *buf, size_t size)
 {
     long offset = 0;
     int error = 0;
     void *mc;
+    unsigned int cpu = smp_processor_id();
     unsigned int matching_count = 0;
 
-    /* We should bind the task to the CPU */
-    BUG_ON(cpu != raw_smp_processor_id());
-
     while ( (offset = get_next_ucode_from_buffer(&mc, buf, size, offset)) > 0 )
     {
         error = microcode_sanity_check(mc);
@@ -459,7 +452,7 @@ static int cpu_request_microcode(unsigned int cpu, const void *buf,
         error = offset;
 
     if ( !error && matching_count )
-        error = apply_microcode(cpu);
+        error = apply_microcode();
 
     return error;
 }
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 7d1226d..763b0bd 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -366,7 +366,7 @@ void start_secondary(void *unused)
     if ( system_state <= SYS_STATE_smp_boot )
         early_microcode_update_cpu(false);
     else
-        microcode_resume_cpu(cpu);
+        microcode_resume_cpu();
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/microcode.h b/xen/include/asm-x86/microcode.h
index 23fc19b..5cf177e 100644
--- a/xen/include/asm-x86/microcode.h
+++ b/xen/include/asm-x86/microcode.h
@@ -18,10 +18,9 @@ struct microcode_patch {
 };
 
 struct microcode_ops {
-    int (*cpu_request_microcode)(unsigned int cpu, const void *buf,
-                                 size_t size);
-    int (*collect_cpu_info)(unsigned int cpu, struct cpu_signature *csig);
-    int (*apply_microcode)(unsigned int cpu);
+    int (*cpu_request_microcode)(const void *buf, size_t size);
+    int (*collect_cpu_info)(struct cpu_signature *csig);
+    int (*apply_microcode)(void);
     int (*start_update)(void);
     enum microcode_match_result (*compare_patch)(
         const struct microcode_patch *new, const struct microcode_patch *old);
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index f3275ca..186eb8b 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -573,7 +573,7 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(unsigned int cpu);
+int microcode_resume_cpu(void);
 int early_microcode_update_cpu(bool start_update);
 int early_microcode_init(void);
 int microcode_init_intel(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 08/12] microcode: split out apply_microcode() from cpu_request_microcode()
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (6 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 07/12] microcode: remove pointless 'cpu' parameter Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-11  7:57 ` [PATCH v6 09/12] microcode: remove struct microcode_info Chao Gao
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

During late microcode update, apply_microcode() is invoked in
cpu_request_microcode(). To make late microcode update more reliable,
we want to put the apply_microcode() into stop_machine context. So
we split out it from cpu_request_microcode(). As a consequence,
apply_microcode() should be invoked explicitly in the common code.

Also with the global ucode cache, microcode parsing only needs
to be done once; cpu_request_microcode() is also moved out of
microcode_update_cpu().

On AMD side, svm_host_osvw_init() is supposed to be called after
microcode update. As apply_micrcode() won't be called by
cpu_request_microcode() now, svm_host_osvw_init() is moved to the
end of apply_microcode().

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v6:
 - during early microcode update, BSP and APs call different functions.
   Thus AP can bypass parsing microcode blob.
---
 xen/arch/x86/acpi/power.c       |  2 +-
 xen/arch/x86/microcode.c        | 77 +++++++++++++++++++++++++----------------
 xen/arch/x86/microcode_amd.c    | 19 +++++-----
 xen/arch/x86/microcode_intel.c  | 19 ++--------
 xen/arch/x86/smpboot.c          |  5 +--
 xen/include/asm-x86/processor.h |  3 +-
 6 files changed, 62 insertions(+), 63 deletions(-)

diff --git a/xen/arch/x86/acpi/power.c b/xen/arch/x86/acpi/power.c
index b4c3669..aeacb57 100644
--- a/xen/arch/x86/acpi/power.c
+++ b/xen/arch/x86/acpi/power.c
@@ -253,7 +253,7 @@ static int enter_state(u32 state)
 
     console_end_sync();
 
-    microcode_resume_cpu();
+    early_microcode_update_cpu();
 
     if ( !recheck_cpu_features(0) )
         panic("Missing previously available feature(s)\n");
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 8bfdf95..e4e2e74 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -261,38 +261,34 @@ const struct microcode_patch *microcode_find_patch(void)
     return NULL;
 }
 
-int microcode_resume_cpu(void)
+/*
+ * Return the number of ucode patch inserted to the global cache.
+ * Return a negtive value on error.
+ */
+static int microcode_parse_blob(char *buf, uint32_t len)
 {
-    int err;
-    struct cpu_signature *sig = &this_cpu(cpu_sig);
-
-    if ( !microcode_ops )
-        return 0;
+    int ret;
 
     spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->apply_microcode();
+    ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
+    if ( likely(!ret) )
+        ret = microcode_ops->cpu_request_microcode(buf, len);
     spin_unlock(&microcode_mutex);
 
-    return err;
+    return ret;
 }
 
-static int microcode_update_cpu(const void *buf, size_t size)
+static int microcode_update_cpu(void)
 {
-    int err;
-    unsigned int cpu = smp_processor_id();
-    struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    int ret;
 
     spin_lock(&microcode_mutex);
-
-    err = microcode_ops->collect_cpu_info(sig);
-    if ( likely(!err) )
-        err = microcode_ops->cpu_request_microcode(buf, size);
+    ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
+    if ( likely(!ret) )
+        ret = microcode_ops->apply_microcode();
     spin_unlock(&microcode_mutex);
 
-    return err;
+    return ret;
 }
 
 static long do_microcode_update(void *_info)
@@ -302,7 +298,7 @@ static long do_microcode_update(void *_info)
 
     BUG_ON(info->cpu != smp_processor_id());
 
-    error = microcode_update_cpu(info->buffer, info->buffer_size);
+    error = microcode_update_cpu();
     if ( error )
         info->error = error;
 
@@ -337,10 +333,6 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         return ret;
     }
 
-    info->buffer_size = len;
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
-
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
@@ -351,6 +343,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         }
     }
 
+    ret = microcode_parse_blob(info->buffer, len);
+    if ( ret <= 0 )
+    {
+        printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
+        xfree(info);
+        return -EINVAL;
+    }
+
+    info->buffer_size = len;
+    info->error = 0;
+    info->cpu = cpumask_first(&cpu_online_map);
+
     return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
 }
 
@@ -376,7 +380,16 @@ static int __init microcode_init(void)
 }
 __initcall(microcode_init);
 
-int __init early_microcode_update_cpu(bool start_update)
+int early_microcode_update_cpu(void)
+{
+    return microcode_ops ? microcode_update_cpu() : 0;
+}
+
+/*
+ * BSP needs to parse the ucode blob and then apply an update.
+ * APs just apply an update by calling early_microcode_update_cpu().
+ */
+static int __init early_microcode_parse_and_update_cpu(void)
 {
     int rc = 0;
     void *data = NULL;
@@ -394,13 +407,17 @@ int __init early_microcode_update_cpu(bool start_update)
     }
     if ( data )
     {
-        if ( start_update && microcode_ops->start_update )
+        if ( microcode_ops->start_update )
             rc = microcode_ops->start_update();
 
         if ( rc )
             return rc;
 
-        return microcode_update_cpu(data, len);
+        rc = microcode_parse_blob(data, len);
+        if ( rc <= 0 )
+            return -EINVAL;
+
+        return early_microcode_update_cpu();
     }
     else
         return -ENOMEM;
@@ -419,8 +436,10 @@ int __init early_microcode_init(void)
         return rc;
 
     if ( microcode_ops )
+    {
         if ( ucode_mod.mod_end || ucode_blob.size )
-            rc = early_microcode_update_cpu(true);
+            rc = early_microcode_parse_and_update_cpu();
+    }
 
     return rc;
 }
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 25935dd..5c25ff2 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -293,6 +293,10 @@ static int apply_microcode(void)
 
     sig->rev = rev;
 
+#ifdef CONFIG_HVM
+    svm_host_osvw_init();
+#endif
+
     return 0;
 }
 
@@ -459,6 +463,7 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     unsigned int equiv_cpu_id;
     unsigned int cpu = smp_processor_id();
     const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
+    unsigned int matched_cnt = 0;
 
     current_cpu_id = cpuid_eax(0x00000001);
 
@@ -557,12 +562,8 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
          * this ucode patch before checking whether it matches with
          * current CPU.
          */
-        if ( microcode_save_patch(microcode_patch) && microcode_fits(mc_amd) )
-        {
-            error = apply_microcode();
-            if ( error )
-                break;
-        }
+        if ( microcode_save_patch(microcode_patch) )
+            matched_cnt++;
 
         if ( offset >= bufsize )
             break;
@@ -593,17 +594,13 @@ static int cpu_request_microcode(const void *buf, size_t bufsize)
     }
 
   out:
-#if CONFIG_HVM
-    svm_host_osvw_init();
-#endif
-
     /*
      * In some cases we may return an error even if processor's microcode has
      * been updated. For example, the first patch in a container file is loaded
      * successfully but subsequent container file processing encounters a
      * failure.
      */
-    return error;
+    return error ?: matched_cnt;
 }
 
 static int start_update(void)
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index 18c833f..c921ea9 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -310,7 +310,6 @@ static enum microcode_match_result compare_patch(
  */
 static int get_matching_microcode(const void *mc, unsigned int cpu)
 {
-    const struct cpu_signature *sig = &per_cpu(cpu_sig, cpu);
     const struct microcode_header_intel *mc_header = mc;
     unsigned long total_size = get_totalsize(mc_header);
     void *new_mc = xmalloc_bytes(total_size);
@@ -328,18 +327,9 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
     /*
      * In order to support a system with mixed stepping CPUs, save this ucode
-     * patch before checking whether it matches with current CPU.
+     * patch without checking whether it matches with current CPU.
      */
-    if ( !microcode_save_patch(microcode_patch) )
-        return 0;
-
-    if ( microcode_update_match(mc, sig->sig, sig->pf, sig->rev) != NEW_UCODE )
-        return 0;
-
-    pr_debug("microcode: CPU%d found a matching microcode update with"
-             " version %#x (current=%#x)\n",
-             cpu, mc_header->rev, sig->rev);
-    return 1;
+    return microcode_save_patch(microcode_patch);
 }
 
 static int apply_microcode(void)
@@ -451,10 +441,7 @@ static int cpu_request_microcode(const void *buf, size_t size)
     if ( offset < 0 )
         error = offset;
 
-    if ( !error && matching_count )
-        error = apply_microcode();
-
-    return error;
+    return error ?: matching_count;
 }
 
 static const struct microcode_ops microcode_intel_ops = {
diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
index 763b0bd..2a7f30e 100644
--- a/xen/arch/x86/smpboot.c
+++ b/xen/arch/x86/smpboot.c
@@ -363,10 +363,7 @@ void start_secondary(void *unused)
 
     initialize_cpu_data(cpu);
 
-    if ( system_state <= SYS_STATE_smp_boot )
-        early_microcode_update_cpu(false);
-    else
-        microcode_resume_cpu();
+    early_microcode_update_cpu();
 
     /*
      * If MSR_SPEC_CTRL is available, apply Xen's default setting and discard
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 186eb8b..9925916 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -573,8 +573,7 @@ int guest_wrmsr_xen(struct vcpu *v, uint32_t idx, uint64_t val);
 
 void microcode_set_module(unsigned int);
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
-int microcode_resume_cpu(void);
-int early_microcode_update_cpu(bool start_update);
+int early_microcode_update_cpu(void);
 int early_microcode_init(void);
 int microcode_init_intel(void);
 int microcode_init_amd(void);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 09/12] microcode: remove struct microcode_info
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (7 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 08/12] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-11  7:57 ` [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode Chao Gao
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Struct microcode_info is useless now. microcode pointer and size inside
it were passed to other CPUs to parse microcode locally. Now, parsing
microcode blob is done on one of CPUs. Other CPUs needn't parse the
microcode blob; the pointer and size can be removed.

The 'cpu' field is also redundent because it always can be inferred from
current cpu id.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
v6:
 - remove the whole microcode_info instead of two fields of it.
---
 xen/arch/x86/microcode.c | 62 +++++++++++++++++++-----------------------------
 1 file changed, 25 insertions(+), 37 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index e4e2e74..c510808 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -188,13 +188,6 @@ static DEFINE_SPINLOCK(microcode_mutex);
 
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
-struct microcode_info {
-    unsigned int cpu;
-    uint32_t buffer_size;
-    int error;
-    char buffer[1];
-};
-
 /*
  * Save an ucode patch to the global cache list.
  *
@@ -291,30 +284,25 @@ static int microcode_update_cpu(void)
     return ret;
 }
 
-static long do_microcode_update(void *_info)
+static long do_microcode_update(void *unused)
 {
-    struct microcode_info *info = _info;
-    int error;
-
-    BUG_ON(info->cpu != smp_processor_id());
+    int error, cpu;
 
     error = microcode_update_cpu();
     if ( error )
-        info->error = error;
+        return error;
 
-    info->cpu = cpumask_next(info->cpu, &cpu_online_map);
-    if ( info->cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
+    if ( cpu < nr_cpu_ids )
+        return continue_hypercall_on_cpu(cpu, do_microcode_update, NULL);
 
-    error = info->error;
-    xfree(info);
     return error;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
-    struct microcode_info *info;
+    void *buffer;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -322,40 +310,40 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;
 
-    info = xmalloc_bytes(sizeof(*info) + len);
-    if ( info == NULL )
-        return -ENOMEM;
+    buffer = xmalloc_bytes(len);
+    if ( !buffer )
+    {
+        ret = -ENOMEM;
+        goto free;
+    }
 
-    ret = copy_from_guest(info->buffer, buf, len);
-    if ( ret != 0 )
+    if ( copy_from_guest(buffer, buf, len) )
     {
-        xfree(info);
-        return ret;
+        ret = -EFAULT;
+        goto free;
     }
 
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-        {
-            xfree(info);
-            return ret;
-        }
+            goto free;
     }
 
-    ret = microcode_parse_blob(info->buffer, len);
+    ret = microcode_parse_blob(buffer, len);
     if ( ret <= 0 )
     {
         printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
-        xfree(info);
-        return -EINVAL;
+        ret = -EINVAL;
+        goto free;
     }
 
-    info->buffer_size = len;
-    info->error = 0;
-    info->cpu = cpumask_first(&cpu_online_map);
+    return continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
+                                     do_microcode_update, NULL);
 
-    return continue_hypercall_on_cpu(info->cpu, do_microcode_update, info);
+ free:
+    xfree(buffer);
+    return ret;
 }
 
 static int __init microcode_init(void)
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (8 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 09/12] microcode: remove struct microcode_info Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-21 11:08   ` Sergey Dyasli
  2019-03-11  7:57 ` [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading Chao Gao
                   ` (2 subsequent siblings)
  12 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

Updating microcode is less error prone when caches have been flushed and
depending on what exactly the microcode is updating. For example, some
of the issues around certain Broadwell parts can be addressed by doing a
full cache flush.

[linux commit: 91df9fdf51492aec9fed6b4cbd33160886740f47]
Signed-off-by: Chao Gao <chao.gao@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
---
Changes in v6:
 - new
---
 xen/arch/x86/microcode_intel.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index c921ea9..d5ef145 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -351,6 +351,12 @@ static int apply_microcode(void)
     /* serialize access to the physical write to MSR 0x79 */
     spin_lock_irqsave(&microcode_update_lock, flags);
 
+    /*
+     * Writeback and invalidate caches before updating microcode to avoid
+     * internal issues depending on what the microcode is updating.
+     */
+    wbinvd();
+
     /* write microcode via MSR 0x79 */
     wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
     wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (9 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-13  0:07   ` Raj, Ashok
  2019-03-11  7:57 ` [PATCH v6 12/12] microcode: update microcode on cores in parallel Chao Gao
  2019-03-19 20:22 ` [PATCH v6 00/12] improve late microcode loading Woods, Brian
  12 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Wei Liu,
	Jun Nakajima, Andrew Cooper, Jan Beulich, Ashok Raj, Chao Gao,
	Thomas Gleixner, Roger Pau Monné

This patch ports microcode improvement patches from linux kernel.

Before you read any further: the early loading method is still the
preferred one and you should always do that. The following patch is
improving the late loading mechanism for long running jobs and cloud use
cases.

Gather all cores and serialize the microcode update on them by doing it
one-by-one to make the late update process as reliable as possible and
avoid potential issues caused by the microcode update.

Signed-off-by: Chao Gao <chao.gao@intel.com>
Tested-by: Chao Gao <chao.gao@intel.com>
[linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
[linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
Cc: Kevin Tian <kevin.tian@intel.com>
Cc: Jun Nakajima <jun.nakajima@intel.com>
Cc: Ashok Raj <ashok.raj@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
---
Changes in v6:
 - Use one timeout period for rendezvous stage and another for update stage.
 - scale time to wait by the number of remaining cpus to respond.
   It helps to find something wrong earlier and thus we can reboot the
   system earlier.
---
 xen/arch/x86/microcode.c | 149 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 136 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index c510808..96bcef6 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -22,6 +22,7 @@
  */
 
 #include <xen/cpu.h>
+#include <xen/cpumask.h>
 #include <xen/lib.h>
 #include <xen/kernel.h>
 #include <xen/init.h>
@@ -30,15 +31,34 @@
 #include <xen/smp.h>
 #include <xen/softirq.h>
 #include <xen/spinlock.h>
+#include <xen/stop_machine.h>
 #include <xen/tasklet.h>
 #include <xen/guest_access.h>
 #include <xen/earlycpio.h>
+#include <xen/watchdog.h>
 
+#include <asm/delay.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
 #include <asm/setup.h>
 #include <asm/microcode.h>
 
+/*
+ * Before performing a late microcode update on any thread, we
+ * rendezvous all cpus in stop_machine context. The timeout for
+ * waiting for cpu rendezvous is 30ms. It is the timeout used by
+ * live patching
+ */
+#define MICROCODE_CALLIN_TIMEOUT_US 30000
+
+/*
+ * Timeout for each thread to complete update is set to 1s. It is a
+ * conservative choice considering all possible interference (for
+ * instance, sometimes wbinvd takes relative long time). And a perfect
+ * timeout doesn't help a lot except an early shutdown.
+ */
+#define MICROCODE_UPDATE_TIMEOUT_US 1000000
+
 static module_t __initdata ucode_mod;
 static signed int __initdata ucode_mod_idx;
 static bool_t __initdata ucode_mod_forced;
@@ -189,6 +209,12 @@ static DEFINE_SPINLOCK(microcode_mutex);
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
 /*
+ * Count the CPUs that have entered and exited the rendezvous
+ * during late microcode update.
+ */
+static atomic_t cpu_in, cpu_out;
+
+/*
  * Save an ucode patch to the global cache list.
  *
  * Return true if a patch is added to the global cache or it replaces
@@ -284,25 +310,86 @@ static int microcode_update_cpu(void)
     return ret;
 }
 
-static long do_microcode_update(void *unused)
+/* Wait for CPUs to rendezvous with a timeout (us) */
+static int wait_for_cpus(atomic_t *cnt, unsigned int expect,
+                         unsigned int timeout)
 {
-    int error, cpu;
+    while ( atomic_read(cnt) < expect )
+    {
+        if ( timeout <= 0 )
+        {
+            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
+                   smp_processor_id());
+            return -EBUSY;
+        }
+        udelay(1);
+        timeout--;
+    }
+
+    return 0;
+}
 
-    error = microcode_update_cpu();
-    if ( error )
-        return error;
+static int do_microcode_update(void *unused)
+{
+    unsigned int cpu = smp_processor_id();
+    unsigned int cpu_nr = num_online_cpus();
+    unsigned int finished;
+    int ret;
+    static bool error;
 
-    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
-    if ( cpu < nr_cpu_ids )
-        return continue_hypercall_on_cpu(cpu, do_microcode_update, NULL);
 
-    return error;
+    atomic_inc(&cpu_in);
+    ret = wait_for_cpus(&cpu_in, cpu_nr, MICROCODE_CALLIN_TIMEOUT_US);
+    if ( ret )
+        return ret;
+
+    /*
+     * Initiate an update on all processors which don't have an online sibling
+     * thread with a lower thread id. Other sibling threads just await the
+     * completion of microcode update.
+     */
+    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+        ret = microcode_update_cpu();
+    /*
+     * Increase the wait timeout to a safe value here since we're serializing
+     * the microcode update and that could take a while on a large number of
+     * CPUs. And that is fine as the *actual* timeout will be determined by
+     * the last CPU finished updating and thus cut short
+     */
+    atomic_inc(&cpu_out);
+    finished = atomic_read(&cpu_out);
+    while ( !error && finished != cpu_nr )
+    {
+        /*
+         * During each timeout interval, at least a CPU is expected to
+         * finish its update. Otherwise, something goes wrong.
+         */
+        if ( wait_for_cpus(&cpu_out, finished + 1,
+                           MICROCODE_UPDATE_TIMEOUT_US) && !error )
+        {
+            error = true;
+            panic("Timeout when finishing updating microcode (finished %d/%d)",
+                  finished, cpu_nr);
+        }
+
+        finished = atomic_read(&cpu_out);
+    }
+
+    /*
+     * Refresh CPU signature (revision) on threads which didn't call
+     * apply_microcode().
+     */
+    if ( cpu != cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+        ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
+
+    return ret;
 }
 
 int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
 {
     int ret;
     void *buffer;
+    unsigned int cpu, nr_cores;
 
     if ( len != (uint32_t)len )
         return -E2BIG;
@@ -323,11 +410,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         goto free;
     }
 
+    /* cpu_online_map must not change during update */
+    if ( !get_cpu_maps() )
+    {
+        ret = -EBUSY;
+        goto free;
+    }
+
     if ( microcode_ops->start_update )
     {
         ret = microcode_ops->start_update();
         if ( ret != 0 )
-            goto free;
+            goto put;
     }
 
     ret = microcode_parse_blob(buffer, len);
@@ -335,12 +429,41 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     {
         printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
         ret = -EINVAL;
-        goto free;
+        goto put;
     }
 
-    return continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
-                                     do_microcode_update, NULL);
+    atomic_set(&cpu_in, 0);
+    atomic_set(&cpu_out, 0);
+
+    /* Calculate the number of online CPU core */
+    nr_cores = 0;
+    for_each_online_cpu(cpu)
+        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
+            nr_cores++;
+
+    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
+
+    /*
+     * We intend to disable interrupt for long time, which may lead to
+     * watchdog timeout.
+     */
+    watchdog_disable();
+    /*
+     * Late loading dance. Why the heavy-handed stop_machine effort?
+     *
+     * - HT siblings must be idle and not execute other code while the other
+     *   sibling is loading microcode in order to avoid any negative
+     *   interactions cause by the loading.
+     *
+     * - In addition, microcode update on the cores must be serialized until
+     *   this requirement can be relaxed in the future. Right now, this is
+     *   conservative and good.
+     */
+    ret = stop_machine_run(do_microcode_update, NULL, NR_CPUS);
+    watchdog_enable();
 
+ put:
+    put_cpu_maps();
  free:
     xfree(buffer);
     return ret;
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v6 12/12] microcode: update microcode on cores in parallel
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (10 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-03-11  7:57 ` Chao Gao
  2019-03-21 12:24   ` [RFC PATCH v6 13/12] microcode: add sequential application policy Sergey Dyasli
  2019-03-19 20:22 ` [PATCH v6 00/12] improve late microcode loading Woods, Brian
  12 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-11  7:57 UTC (permalink / raw)
  To: xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Chao Gao, Roger Pau Monné

microcode_update_lock and microcode_mutex prevent cores from updating
microcode in parallel. Below changes are made to support parallel
microcode update on cores.

microcode_update_lock is removed. This lock was to prevent logic threads
of a same core from updating microcode at the same time. But due to
using a global lock, it also prevented parallel microcode updating on
different cores. We ensure the siblings of a same core won't update
microcode at the caller level of apply_microcode() instead of inside
it:
1. For late microcode update, do_microcode_update() ensures that only
one sibiling thread of a core can update microcode.
2. For microcode update during system startup or CPU-hotplug, the new
early_ucode_update_lock() can guarantee update serialization of logical
threads. We use this sub-optimal method because it is hard to implement a
per-core lock at that early stage where sibling info isn't initialized yet.
3. get/put_cpu_bitmaps() prevents the concurrency of CPU-hotplug and
late microcode update.

microcode_mutex is replaced by a rwlock. microcode_mutex was used to
prevent concurrent accesses to 'uci' (already removed in previous patch)
and microcode_cache. Now the only shared resource which needs to be
protected is the microcode_cache. A rwlock allows multiple readers (one
thread of each core) to access the global cache and update microcode
simultaneously. Because the rwlock may be held in stop_machine context,
where interrupt is disabled, irq{save, restore} variants are used to
get/release the rwlock.

Note that printk in apply_microcode() and svm_host_osvm_init() (for AMD
only) are still processed sequentially.

Signed-off-by: Chao Gao <chao.gao@intel.com>
---
Changes in v6:
 - introduce early_ucode_update_lock to serialize early ucode update.

Changes in v5:
 - newly add
---
 xen/arch/x86/microcode.c       | 65 ++++++++++++++++++++++++++----------------
 xen/arch/x86/microcode_amd.c   |  8 +-----
 xen/arch/x86/microcode_intel.c |  9 +-----
 3 files changed, 43 insertions(+), 39 deletions(-)

diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 96bcef6..e7df70f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -204,7 +204,9 @@ scan:
 
 const struct microcode_ops *microcode_ops;
 
-static DEFINE_SPINLOCK(microcode_mutex);
+static DEFINE_RWLOCK(cache_rwlock);
+/* Hold this lock during early microcode update  */
+static DEFINE_SPINLOCK(early_ucode_update_lock);
 
 DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
 
@@ -224,40 +226,47 @@ static atomic_t cpu_in, cpu_out;
  */
 bool microcode_save_patch(struct microcode_patch *new)
 {
-    struct microcode_patch *old;
+    struct microcode_patch *old, *free = NULL;
+    enum microcode_match_result result = MIS_UCODE;
+    unsigned long flags;
 
-    ASSERT(spin_is_locked(&microcode_mutex));
+    write_lock_irqsave(&cache_rwlock, flags);
 
     list_for_each_entry(old, &microcode_cache, list)
     {
-        enum microcode_match_result result =
-            microcode_ops->compare_patch(new, old);
+        result = microcode_ops->compare_patch(new, old);
 
         if ( result == OLD_UCODE )
         {
-            microcode_ops->free_patch(new);
-            return false;
+            free = new;
+            break;
         }
         else if ( result == NEW_UCODE )
         {
             list_replace(&old->list, &new->list);
-            microcode_ops->free_patch(old);
-            return true;
+            free = old;
+            break;
         }
         else /* result == MIS_UCODE */
             continue;
     }
 
-    if ( microcode_ops->is_patch_compatible &&
-         !microcode_ops->is_patch_compatible(new) )
+    if ( result == MIS_UCODE )
     {
-        xfree(new);
-        return false;
+        if ( microcode_ops->is_patch_compatible &&
+             !microcode_ops->is_patch_compatible(new) )
+            free = new;
+        else
+            list_add_tail(&new->list, &microcode_cache);
     }
 
-    list_add_tail(&new->list, &microcode_cache);
+    write_unlock_irqrestore(&cache_rwlock, flags);
+
+    /* free useless patches after interrupt enabled */
+    if ( free )
+        microcode_ops->free_patch(free);
 
-    return true;
+    return free != new;
 }
 
 /*
@@ -269,8 +278,6 @@ const struct microcode_patch *microcode_find_patch(void)
 {
     const struct microcode_patch *microcode_patch;
 
-    ASSERT(spin_is_locked(&microcode_mutex));
-
     list_for_each_entry(microcode_patch, &microcode_cache, list)
     {
         if ( microcode_ops->match_cpu(microcode_patch) )
@@ -288,11 +295,9 @@ static int microcode_parse_blob(char *buf, uint32_t len)
 {
     int ret;
 
-    spin_lock(&microcode_mutex);
     ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
     if ( likely(!ret) )
         ret = microcode_ops->cpu_request_microcode(buf, len);
-    spin_unlock(&microcode_mutex);
 
     return ret;
 }
@@ -300,12 +305,15 @@ static int microcode_parse_blob(char *buf, uint32_t len)
 static int microcode_update_cpu(void)
 {
     int ret;
+    unsigned long flags;
 
-    spin_lock(&microcode_mutex);
     ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
-    if ( likely(!ret) )
-        ret = microcode_ops->apply_microcode();
-    spin_unlock(&microcode_mutex);
+    if ( unlikely(ret) )
+        return ret;
+
+    read_lock_irqsave(&cache_rwlock, flags);
+    ret = microcode_ops->apply_microcode();
+    read_unlock_irqrestore(&cache_rwlock, flags);
 
     return ret;
 }
@@ -493,7 +501,16 @@ __initcall(microcode_init);
 
 int early_microcode_update_cpu(void)
 {
-    return microcode_ops ? microcode_update_cpu() : 0;
+    int rc;
+
+    if ( !microcode_ops )
+        return 0;
+
+    spin_lock(&early_ucode_update_lock);
+    rc = microcode_update_cpu();
+    spin_unlock(&early_ucode_update_lock);
+
+    return rc;
 }
 
 /*
diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 5c25ff2..4aa8fdc 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -74,9 +74,6 @@ struct mpbhdr {
     uint8_t data[];
 };
 
-/* serialize access to the physical write */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 /* See comment in start_update() for cases when this routine fails */
 static int collect_cpu_info(struct cpu_signature *csig)
 {
@@ -251,7 +248,6 @@ static enum microcode_match_result compare_patch(
 
 static int apply_microcode(void)
 {
-    unsigned long flags;
     uint32_t rev;
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch;
@@ -265,15 +261,13 @@ static int apply_microcode(void)
 
     hdr = patch->data;
 
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     hw_err = wrmsr_safe(MSR_AMD_PATCHLOADER, (unsigned long)hdr);
 
     /* get patch id after patching */
     rdmsrl(MSR_AMD_PATCHLEVEL, rev);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
-
     /*
      * Some processors leave the ucode blob mapping as UC after the update.
      * Flush the mapping to regain normal cacheability.
diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
index d5ef145..56f3956 100644
--- a/xen/arch/x86/microcode_intel.c
+++ b/xen/arch/x86/microcode_intel.c
@@ -95,9 +95,6 @@ struct extended_sigtable {
 
 #define STEPPING_MASK ~0xfU
 
-/* serialize access to the physical write to MSR 0x79 */
-static DEFINE_SPINLOCK(microcode_update_lock);
-
 static int collect_cpu_info(struct cpu_signature *csig)
 {
     unsigned int cpu_num = smp_processor_id();
@@ -334,7 +331,6 @@ static int get_matching_microcode(const void *mc, unsigned int cpu)
 
 static int apply_microcode(void)
 {
-    unsigned long flags;
     uint64_t msr_content;
     unsigned int val[2];
     unsigned int cpu_num = raw_smp_processor_id();
@@ -347,9 +343,7 @@ static int apply_microcode(void)
         return -EINVAL;
 
     mc_intel = patch->data;
-
-    /* serialize access to the physical write to MSR 0x79 */
-    spin_lock_irqsave(&microcode_update_lock, flags);
+    BUG_ON(local_irq_is_enabled());
 
     /*
      * Writeback and invalidate caches before updating microcode to avoid
@@ -368,7 +362,6 @@ static int apply_microcode(void)
     rdmsrl(MSR_IA32_UCODE_REV, msr_content);
     val[1] = (uint32_t)(msr_content >> 32);
 
-    spin_unlock_irqrestore(&microcode_update_lock, flags);
     if ( val[1] != mc_intel->hdr.rev )
     {
         printk(KERN_ERR "microcode: CPU%d update from revision "
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
@ 2019-03-12 15:27   ` Roger Pau Monné
  2019-03-13  5:05     ` Chao Gao
  2019-03-13  9:24   ` Wei Liu
  2019-03-25  9:38   ` Sergey Dyasli
  2 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-03-12 15:27 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Konrad Rzeszutek Wilk,
	Ian Jackson, xen-devel

On Mon, Mar 11, 2019 at 03:57:25PM +0800, Chao Gao wrote:
> This patch provides a tool for late microcode update.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  tools/libxc/include/xenctrl.h |  1 +
>  tools/libxc/xc_misc.c         | 20 ++++++++++
>  tools/misc/Makefile           |  4 ++
>  tools/misc/xenmicrocode.c     | 89 +++++++++++++++++++++++++++++++++++++++++++

Would you mind naming the tool xen-ucode or xen-microcode?

That seems more inline with the naming schemed used for most of the
tools in misc/.

> +int main(int argc, char *argv[])
> +{
> +    int fd, len, ret;
> +    char *filename, *buf;
> +    struct stat st;
> +    struct xen_platform_op op;
> +    xc_interface *xch;
> +    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
> +
> +    if (argc < 2)
> +    {
> +        show_help();
> +        return 0;
> +    }
> +
> +    filename = argv[1];
> +    fd = open(filename, O_RDONLY);
> +    if (fd < 0) {
> +        fprintf(stderr, "Could not open %s. (err: %s)\n",
> +                filename, strerror(errno));
> +        return errno;
> +    }
> +
> +    if (stat(filename, &st) != 0) {
> +        fprintf(stderr, "Could not get the size of %s. (err: %s)\n",
> +                filename, strerror(errno));
> +        return errno;
> +    }
> +
> +    len = st.st_size;
> +    buf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
> +    if (buf == MAP_FAILED) {
> +        fprintf(stderr, "mmap failed. (error: %s)\n", strerror(errno));
> +        return errno;
> +    }

IMO you could just use fread to read the full contents of the file
into the buffer you allocate and avoid this mmap dance.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking
  2019-03-11  7:57 ` [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking Chao Gao
@ 2019-03-12 15:33   ` Roger Pau Monné
  2019-03-12 16:43     ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-03-12 15:33 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel

On Mon, Mar 11, 2019 at 03:57:26PM +0800, Chao Gao wrote:

Please add "No functional change" to the commit description.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

This can likely go in ahead of the rest of the series.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking
  2019-03-12 15:33   ` Roger Pau Monné
@ 2019-03-12 16:43     ` Jan Beulich
  2019-03-12 18:23       ` Wei Liu
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-03-12 16:43 UTC (permalink / raw)
  To: Roger Pau Monne, Chao Gao
  Cc: Andrew Cooper, xen-devel, Wei Liu, Ashok Raj, Sergey Dyasli

>>> On 12.03.19 at 16:33, <roger.pau@citrix.com> wrote:
> On Mon, Mar 11, 2019 at 03:57:26PM +0800, Chao Gao wrote:
> 
> Please add "No functional change" to the commit description.
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Acked-by: Jan Beulich <jbeulich@suse.com>

> This can likely go in ahead of the rest of the series.

Indeed.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-11  7:57 ` [PATCH v6 04/12] microcode: introduce a global cache of ucode patch Chao Gao
@ 2019-03-12 16:53   ` Roger Pau Monné
  2019-03-12 23:31     ` Raj, Ashok
                       ` (2 more replies)
  2019-03-13 16:36   ` Sergey Dyasli
  1 sibling, 3 replies; 50+ messages in thread
From: Roger Pau Monné @ 2019-03-12 16:53 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel

On Mon, Mar 11, 2019 at 03:57:28PM +0800, Chao Gao wrote:
> to replace the current per-cpu cache 'uci->mc'.
> 
> Compared to the current per-cpu cache, the benefits of the global
> microcode cache are:
> 1. It reduces the work that need to be done on each CPU. Parsing ucode
> file is done once on one CPU. Other CPUs needn't parse ucode file.
> Instead, they can find out and load the newest patch from the global
> cache.
> 2. It reduces the memory consumption on a system with many CPU cores.
> 
> Two functions, microcode_save_patch() and microcode_find_patch() are
> introduced. The former adds one given patch to the global cache. The
> latter gets a newer and matched ucode patch from the global cache.
> All operations on the cache is expected to be done with the
> 'microcode_mutex' hold.
> 
> Note that I deliberately avoid touching 'uci->mc' as I am going to
> remove it completely in the next patch.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v6:
>  - constify local variables and function parameters if possible
>  - comment that the global cache is protected by 'microcode_mutex'.
>    and add assertions to catch violations in microcode_{save/find}_patch()
> 
> Changes in v5:
>  - reword the commit description
>  - find_patch() and save_patch() are abstracted into common functions
>    with some hooks for AMD and Intel
> ---
>  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
>  xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
>  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
>  xen/include/asm-x86/microcode.h | 13 ++++++
>  4 files changed, 215 insertions(+), 15 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 4163f50..e629e6c 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
>   */
>  static bool_t __initdata ucode_scan;
>  
> +static LIST_HEAD(microcode_cache);
> +
>  void __init microcode_set_module(unsigned int idx)
>  {
>      ucode_mod_idx = idx;
> @@ -208,6 +210,64 @@ static void microcode_fini_cpu(unsigned int cpu)
>      spin_unlock(&microcode_mutex);
>  }
>  
> +/*
> + * Save an ucode patch to the global cache list.
> + *
> + * Return true if a patch is added to the global cache or it replaces
> + * one old patch in the cache. Otherwise, return false. According to
> + * the return value, the caller decides not to do an expensive ucode
> + * update for some cases (i.e. when admin provides an old ucode).
> + */
> +bool microcode_save_patch(struct microcode_patch *new)
> +{
> +    struct microcode_patch *old;
> +
> +    ASSERT(spin_is_locked(&microcode_mutex));
> +
> +    list_for_each_entry(old, &microcode_cache, list)
> +    {
> +        enum microcode_match_result result =
> +            microcode_ops->compare_patch(new, old);
> +
> +        if ( result == OLD_UCODE )
> +        {
> +            microcode_ops->free_patch(new);
> +            return false;
> +        }
> +        else if ( result == NEW_UCODE )
> +        {
> +            list_replace(&old->list, &new->list);
> +            microcode_ops->free_patch(old);
> +            return true;
> +        }
> +        else /* result == MIS_UCODE */
> +            continue;

I think this is rather too simplistic, and I'm not sure you really
need a list in order to store the microcodes.

IIRC we agreed that systems with mixed CPU versions are not supported,
hence the same microcode blob should be used to update all the
possible CPUs on the system, so a list should not be needed here.

Also I'm afraid that freeing the old microcode when a new version is
uploaded is not fully correct. For example given the following
scenario:

1. Upload a microcode blob to the hypervisor and apply it to online
CPUs.

2. Upload a microcode blob with a higher version than the previous one,
but which fails to apply. This microcode would replace the
previous one.

3. Online a CPU. This CPU will try to use the last uploaded microcode
and fail, because last uploaded version is broken. Newly onlined CPUs
would then end up with a microcode version different from the
currently running CPUs, likely breaking the system.

I think the best way to solve this is to ditch the list usage and
instead only keep at most two microcode versions cached in the
hypervisor:

 - The microcode version that's currently successfully applied (if any).
 - A microcode version higher than the current version, that has yet
   to be applied.

Once this new microcode version has been applied it will replace the
previously applied version. If the new microcode version fails to
apply it will be discarded, thus keeping a copy of the currently
applied microcode version.

With this approach AFAICT you only need two variables, one to store
the currently applied microcode_patch and another one to save the new
microcode version in order to attempt to apply it.

I think this will also simplify some of the code. Let me know if this
sounds sensible, or if I'm missing something.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 05/12] microcode: only save compatible ucode patches
  2019-03-11  7:57 ` [PATCH v6 05/12] microcode: only save compatible ucode patches Chao Gao
@ 2019-03-12 17:03   ` Roger Pau Monné
  2019-03-13  7:45     ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Roger Pau Monné @ 2019-03-12 17:03 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel

On Mon, Mar 11, 2019 at 03:57:29PM +0800, Chao Gao wrote:
> Intel CPU only allows mixing in stepping or 'pf'. If an ucode patch is
> for other CPU families or models, it won't be compatible to all CPUs on
> current system and even hot-plugged CPUs.  Don't save such patch to
> reduce the size of ucode cache.

Oh, so we are indeed aiming to support systems that can require CPUs
to use different microcode blobs in order to update?

If that's the case, and we are sure this will work, we do indeed need
a list of microcodes, but you will have to somehow flag applied
microcodes in order to prevent them from being replaced with new
microcode versions until Xen knows that such new versions apply
correctly.

Maybe I'm being too simplistic here, but does anyone really have a
system with CPUs with different stepping?

Supporting such scenario adds quite a lot of complexity and even if
theoretically possible I don't think it makes much sense to support
unless there are such systems out there.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking
  2019-03-12 16:43     ` Jan Beulich
@ 2019-03-12 18:23       ` Wei Liu
  0 siblings, 0 replies; 50+ messages in thread
From: Wei Liu @ 2019-03-12 18:23 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, xen-devel,
	Roger Pau Monne, Chao Gao

On Tue, Mar 12, 2019 at 10:43:23AM -0600, Jan Beulich wrote:
> >>> On 12.03.19 at 16:33, <roger.pau@citrix.com> wrote:
> > On Mon, Mar 11, 2019 at 03:57:26PM +0800, Chao Gao wrote:
> > 
> > Please add "No functional change" to the commit description.
> > 
> > Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Acked-by: Jan Beulich <jbeulich@suse.com>
> 
> > This can likely go in ahead of the rest of the series.
> 
> Indeed.

I have picked up this patch in my patch sweeping.

Wei.

> 
> Jan
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-12 16:53   ` Roger Pau Monné
@ 2019-03-12 23:31     ` Raj, Ashok
  2019-03-13  5:28     ` Chao Gao
  2019-03-13  7:39     ` Jan Beulich
  2 siblings, 0 replies; 50+ messages in thread
From: Raj, Ashok @ 2019-03-12 23:31 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	xen-devel, Chao Gao

On Tue, Mar 12, 2019 at 05:53:53PM +0100, Roger Pau Monné wrote:
> On Mon, Mar 11, 2019 at 03:57:28PM +0800, Chao Gao wrote:
> > to replace the current per-cpu cache 'uci->mc'.
> > 
> > Compared to the current per-cpu cache, the benefits of the global
> > microcode cache are:
> > 1. It reduces the work that need to be done on each CPU. Parsing ucode
> > file is done once on one CPU. Other CPUs needn't parse ucode file.
> > Instead, they can find out and load the newest patch from the global
> > cache.
> > 2. It reduces the memory consumption on a system with many CPU cores.
> > 
> > Two functions, microcode_save_patch() and microcode_find_patch() are
> > introduced. The former adds one given patch to the global cache. The
> > latter gets a newer and matched ucode patch from the global cache.
> > All operations on the cache is expected to be done with the
> > 'microcode_mutex' hold.
> > 
> > Note that I deliberately avoid touching 'uci->mc' as I am going to
> > remove it completely in the next patch.
> > 
> > Signed-off-by: Chao Gao <chao.gao@intel.com>
> > ---
> > Changes in v6:
> >  - constify local variables and function parameters if possible
> >  - comment that the global cache is protected by 'microcode_mutex'.
> >    and add assertions to catch violations in microcode_{save/find}_patch()
> > 
> > Changes in v5:
> >  - reword the commit description
> >  - find_patch() and save_patch() are abstracted into common functions
> >    with some hooks for AMD and Intel
> > ---
> >  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
> >  xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
> >  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
> >  xen/include/asm-x86/microcode.h | 13 ++++++
> >  4 files changed, 215 insertions(+), 15 deletions(-)
> > 
> > diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> > index 4163f50..e629e6c 100644

[snip]
> 
> I think this is rather too simplistic, and I'm not sure you really
> need a list in order to store the microcodes.
> 
> IIRC we agreed that systems with mixed CPU versions are not supported,
> hence the same microcode blob should be used to update all the
> possible CPUs on the system, so a list should not be needed here.
> 
> Also I'm afraid that freeing the old microcode when a new version is
> uploaded is not fully correct. For example given the following
> scenario:
> 
> 1. Upload a microcode blob to the hypervisor and apply it to online
> CPUs.
> 
> 2. Upload a microcode blob with a higher version than the previous one,
> but which fails to apply. This microcode would replace the
> previous one.
> 
> 3. Online a CPU. This CPU will try to use the last uploaded microcode
> and fail, because last uploaded version is broken. Newly onlined CPUs
> would then end up with a microcode version different from the
> currently running CPUs, likely breaking the system.
> 
> I think the best way to solve this is to ditch the list usage and
> instead only keep at most two microcode versions cached in the
> hypervisor:
> 
>  - The microcode version that's currently successfully applied (if any).
>  - A microcode version higher than the current version, that has yet
>    to be applied.
> 
> Once this new microcode version has been applied it will replace the
> previously applied version. If the new microcode version fails to
> apply it will be discarded, thus keeping a copy of the currently
> applied microcode version.
> 
> With this approach AFAICT you only need two variables, one to store
> the currently applied microcode_patch and another one to save the new
> microcode version in order to attempt to apply it.

This sounds very reasonable!

> 
> I think this will also simplify some of the code. Let me know if this
> sounds sensible, or if I'm missing something.
> 
> Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-11  7:57 ` [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading Chao Gao
@ 2019-03-13  0:07   ` Raj, Ashok
  2019-03-13  5:02     ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Raj, Ashok @ 2019-03-13  0:07 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Jun Nakajima, Andrew Cooper,
	Jan Beulich, Ashok Raj, xen-devel, Thomas Gleixner,
	Borislav Petkov, Roger Pau Monné

On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
> This patch ports microcode improvement patches from linux kernel.
> 
> Before you read any further: the early loading method is still the
> preferred one and you should always do that. The following patch is
> improving the late loading mechanism for long running jobs and cloud use
> cases.
> 
> Gather all cores and serialize the microcode update on them by doing it
> one-by-one to make the late update process as reliable as possible and
> avoid potential issues caused by the microcode update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Chao Gao <chao.gao@intel.com>
> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v6:
>  - Use one timeout period for rendezvous stage and another for update stage.
>  - scale time to wait by the number of remaining cpus to respond.
>    It helps to find something wrong earlier and thus we can reboot the
>    system earlier.
> ---
>  xen/arch/x86/microcode.c | 149 ++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 136 insertions(+), 13 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index c510808..96bcef6 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -22,6 +22,7 @@
>   */
>  
>  #include <xen/cpu.h>
> +#include <xen/cpumask.h>
>  #include <xen/lib.h>
>  #include <xen/kernel.h>
>  #include <xen/init.h>
> @@ -30,15 +31,34 @@
>  #include <xen/smp.h>
>  #include <xen/softirq.h>
>  #include <xen/spinlock.h>
> +#include <xen/stop_machine.h>
>  #include <xen/tasklet.h>
>  #include <xen/guest_access.h>
>  #include <xen/earlycpio.h>
> +#include <xen/watchdog.h>
>  
> +#include <asm/delay.h>
>  #include <asm/msr.h>
>  #include <asm/processor.h>
>  #include <asm/setup.h>
>  #include <asm/microcode.h>
>  
> +/*
> + * Before performing a late microcode update on any thread, we
> + * rendezvous all cpus in stop_machine context. The timeout for
> + * waiting for cpu rendezvous is 30ms. It is the timeout used by
> + * live patching
> + */
> +#define MICROCODE_CALLIN_TIMEOUT_US 30000
> +
> +/*
> + * Timeout for each thread to complete update is set to 1s. It is a
> + * conservative choice considering all possible interference (for
> + * instance, sometimes wbinvd takes relative long time). And a perfect
> + * timeout doesn't help a lot except an early shutdown.
> + */
> +#define MICROCODE_UPDATE_TIMEOUT_US 1000000
> +
>  static module_t __initdata ucode_mod;
>  static signed int __initdata ucode_mod_idx;
>  static bool_t __initdata ucode_mod_forced;
> @@ -189,6 +209,12 @@ static DEFINE_SPINLOCK(microcode_mutex);
>  DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>  
>  /*
> + * Count the CPUs that have entered and exited the rendezvous
> + * during late microcode update.
> + */
> +static atomic_t cpu_in, cpu_out;
> +
> +/*
>   * Save an ucode patch to the global cache list.
>   *
>   * Return true if a patch is added to the global cache or it replaces
> @@ -284,25 +310,86 @@ static int microcode_update_cpu(void)
>      return ret;
>  }
>  
> -static long do_microcode_update(void *unused)
> +/* Wait for CPUs to rendezvous with a timeout (us) */
> +static int wait_for_cpus(atomic_t *cnt, unsigned int expect,
> +                         unsigned int timeout)
>  {
> -    int error, cpu;
> +    while ( atomic_read(cnt) < expect )
> +    {
> +        if ( timeout <= 0 )

timeout is unsigned int.. it will never be < 0.. flip it to read better maybe?

> +        {
> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
> +                   smp_processor_id());
> +            return -EBUSY;
> +        }
> +        udelay(1);
> +        timeout--;
> +    }
> +
> +    return 0;
> +}
>  
> -    error = microcode_update_cpu();
> -    if ( error )
> -        return error;
> +static int do_microcode_update(void *unused)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    unsigned int cpu_nr = num_online_cpus();
> +    unsigned int finished;
> +    int ret;
> +    static bool error;
>  
> -    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
> -    if ( cpu < nr_cpu_ids )
> -        return continue_hypercall_on_cpu(cpu, do_microcode_update, NULL);
>  
> -    return error;
> +    atomic_inc(&cpu_in);
> +    ret = wait_for_cpus(&cpu_in, cpu_nr, MICROCODE_CALLIN_TIMEOUT_US);
> +    if ( ret )
> +        return ret;
> +
> +    /*
> +     * Initiate an update on all processors which don't have an online sibling
> +     * thread with a lower thread id. Other sibling threads just await the
> +     * completion of microcode update.
> +     */

The above comment isn't clear. Looks like you are doing the update just
to the first cpu in the sibling map? 

> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        ret = microcode_update_cpu();

Does ret have any useful things on if the update failed? Doesn't seem 
to be used before you overwrite later in collect_cpu_info()?


> +    /*
> +     * Increase the wait timeout to a safe value here since we're serializing
> +     * the microcode update and that could take a while on a large number of
> +     * CPUs. And that is fine as the *actual* timeout will be determined by
> +     * the last CPU finished updating and thus cut short
> +     */
> +    atomic_inc(&cpu_out);
> +    finished = atomic_read(&cpu_out);
> +    while ( !error && finished != cpu_nr )

Maybe i'm reading this wrong.. is "error" used uninitialized?

> +    {
> +        /*
> +         * During each timeout interval, at least a CPU is expected to
> +         * finish its update. Otherwise, something goes wrong.
> +         */
> +        if ( wait_for_cpus(&cpu_out, finished + 1,
> +                           MICROCODE_UPDATE_TIMEOUT_US) && !error )
> +        {
> +            error = true;
> +            panic("Timeout when finishing updating microcode (finished %d/%d)",
> +                  finished, cpu_nr);
> +        }
> +
> +        finished = atomic_read(&cpu_out);
> +    }
> +
> +    /*
> +     * Refresh CPU signature (revision) on threads which didn't call
> +     * apply_microcode().
> +     */
> +    if ( cpu != cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
> +
> +    return ret;
>  }
>  
>  int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>  {
>      int ret;
>      void *buffer;
> +    unsigned int cpu, nr_cores;
>  
>      if ( len != (uint32_t)len )
>          return -E2BIG;
> @@ -323,11 +410,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          goto free;
>      }
>  
> +    /* cpu_online_map must not change during update */
> +    if ( !get_cpu_maps() )
> +    {
> +        ret = -EBUSY;
> +        goto free;
> +    }
> +
>      if ( microcode_ops->start_update )
>      {
>          ret = microcode_ops->start_update();
>          if ( ret != 0 )
> -            goto free;
> +            goto put;
>      }
>  
>      ret = microcode_parse_blob(buffer, len);
> @@ -335,12 +429,41 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>      {
>          printk(XENLOG_ERR "No valid or newer ucode found. Update abort!\n");
>          ret = -EINVAL;
> -        goto free;
> +        goto put;
>      }
>  
> -    return continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
> -                                     do_microcode_update, NULL);
> +    atomic_set(&cpu_in, 0);
> +    atomic_set(&cpu_out, 0);
> +
> +    /* Calculate the number of online CPU core */
> +    nr_cores = 0;
> +    for_each_online_cpu(cpu)
> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +            nr_cores++;
> +
> +    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
> +
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();
> +    /*
> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> +     *
> +     * - HT siblings must be idle and not execute other code while the other
> +     *   sibling is loading microcode in order to avoid any negative
> +     *   interactions cause by the loading.
> +     *
> +     * - In addition, microcode update on the cores must be serialized until
> +     *   this requirement can be relaxed in the future. Right now, this is
> +     *   conservative and good.
> +     */
> +    ret = stop_machine_run(do_microcode_update, NULL, NR_CPUS);
> +    watchdog_enable();
>  
> + put:
> +    put_cpu_maps();
>   free:
>      xfree(buffer);
>      return ret;
> -- 
> 1.8.3.1
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-13  0:07   ` Raj, Ashok
@ 2019-03-13  5:02     ` Chao Gao
  2019-03-13  7:54       ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-13  5:02 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Jun Nakajima, Andrew Cooper,
	Jan Beulich, xen-devel, Thomas Gleixner, Borislav Petkov,
	Roger Pau Monné

On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>> This patch ports microcode improvement patches from linux kernel.
>> 
>> Before you read any further: the early loading method is still the
>> preferred one and you should always do that. The following patch is
>> improving the late loading mechanism for long running jobs and cloud use
>> cases.
>> 
>> Gather all cores and serialize the microcode update on them by doing it
>> one-by-one to make the late update process as reliable as possible and
>> avoid potential issues caused by the microcode update.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> Tested-by: Chao Gao <chao.gao@intel.com>
>> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
>> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
>> Cc: Kevin Tian <kevin.tian@intel.com>
>> Cc: Jun Nakajima <jun.nakajima@intel.com>
>> Cc: Ashok Raj <ashok.raj@intel.com>
>> Cc: Borislav Petkov <bp@suse.de>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
>> Cc: Jan Beulich <jbeulich@suse.com>
>> ---
>> Changes in v6:
>>  - Use one timeout period for rendezvous stage and another for update stage.
>>  - scale time to wait by the number of remaining cpus to respond.
>>    It helps to find something wrong earlier and thus we can reboot the
>>    system earlier.
>> ---
>>  xen/arch/x86/microcode.c | 149 ++++++++++++++++++++++++++++++++++++++++++-----
>>  1 file changed, 136 insertions(+), 13 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index c510808..96bcef6 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -22,6 +22,7 @@
>>   */
>>  
>>  #include <xen/cpu.h>
>> +#include <xen/cpumask.h>
>>  #include <xen/lib.h>
>>  #include <xen/kernel.h>
>>  #include <xen/init.h>
>> @@ -30,15 +31,34 @@
>>  #include <xen/smp.h>
>>  #include <xen/softirq.h>
>>  #include <xen/spinlock.h>
>> +#include <xen/stop_machine.h>
>>  #include <xen/tasklet.h>
>>  #include <xen/guest_access.h>
>>  #include <xen/earlycpio.h>
>> +#include <xen/watchdog.h>
>>  
>> +#include <asm/delay.h>
>>  #include <asm/msr.h>
>>  #include <asm/processor.h>
>>  #include <asm/setup.h>
>>  #include <asm/microcode.h>
>>  
>> +/*
>> + * Before performing a late microcode update on any thread, we
>> + * rendezvous all cpus in stop_machine context. The timeout for
>> + * waiting for cpu rendezvous is 30ms. It is the timeout used by
>> + * live patching
>> + */
>> +#define MICROCODE_CALLIN_TIMEOUT_US 30000
>> +
>> +/*
>> + * Timeout for each thread to complete update is set to 1s. It is a
>> + * conservative choice considering all possible interference (for
>> + * instance, sometimes wbinvd takes relative long time). And a perfect
>> + * timeout doesn't help a lot except an early shutdown.
>> + */
>> +#define MICROCODE_UPDATE_TIMEOUT_US 1000000
>> +
>>  static module_t __initdata ucode_mod;
>>  static signed int __initdata ucode_mod_idx;
>>  static bool_t __initdata ucode_mod_forced;
>> @@ -189,6 +209,12 @@ static DEFINE_SPINLOCK(microcode_mutex);
>>  DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>>  
>>  /*
>> + * Count the CPUs that have entered and exited the rendezvous
>> + * during late microcode update.
>> + */
>> +static atomic_t cpu_in, cpu_out;
>> +
>> +/*
>>   * Save an ucode patch to the global cache list.
>>   *
>>   * Return true if a patch is added to the global cache or it replaces
>> @@ -284,25 +310,86 @@ static int microcode_update_cpu(void)
>>      return ret;
>>  }
>>  
>> -static long do_microcode_update(void *unused)
>> +/* Wait for CPUs to rendezvous with a timeout (us) */
>> +static int wait_for_cpus(atomic_t *cnt, unsigned int expect,
>> +                         unsigned int timeout)
>>  {
>> -    int error, cpu;
>> +    while ( atomic_read(cnt) < expect )
>> +    {
>> +        if ( timeout <= 0 )
>
>timeout is unsigned int.. it will never be < 0.. flip it to read better maybe?

Will do.

>
>> +        {
>> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
>> +                   smp_processor_id());
>> +            return -EBUSY;
>> +        }
>> +        udelay(1);
>> +        timeout--;
>> +    }
>> +
>> +    return 0;
>> +}
>>  
>> -    error = microcode_update_cpu();
>> -    if ( error )
>> -        return error;
>> +static int do_microcode_update(void *unused)
>> +{
>> +    unsigned int cpu = smp_processor_id();
>> +    unsigned int cpu_nr = num_online_cpus();
>> +    unsigned int finished;
>> +    int ret;
>> +    static bool error;
>>  
>> -    cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
>> -    if ( cpu < nr_cpu_ids )
>> -        return continue_hypercall_on_cpu(cpu, do_microcode_update, NULL);
>>  
>> -    return error;
>> +    atomic_inc(&cpu_in);
>> +    ret = wait_for_cpus(&cpu_in, cpu_nr, MICROCODE_CALLIN_TIMEOUT_US);
>> +    if ( ret )
>> +        return ret;
>> +
>> +    /*
>> +     * Initiate an update on all processors which don't have an online sibling
>> +     * thread with a lower thread id. Other sibling threads just await the
>> +     * completion of microcode update.
>> +     */
>
>The above comment isn't clear. Looks like you are doing the update just
>to the first cpu in the sibling map? 

Yes. Will refine it.

>
>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>> +        ret = microcode_update_cpu();
>
>Does ret have any useful things on if the update failed? Doesn't seem 
>to be used before you overwrite later in collect_cpu_info()?

It has the reason of failure on error. Actally, there are two reasons:
one is no patch of newer revision, the other is we tried to update but
the microcode revision didn't change. I can check this return value and
print more informative message to admin. And furthermore, for the
latter, we can remove the ucode patch from caches to address Roger's
concern expressed in comments to patch 4 & 5.

>
>
>> +    /*
>> +     * Increase the wait timeout to a safe value here since we're serializing
>> +     * the microcode update and that could take a while on a large number of
>> +     * CPUs. And that is fine as the *actual* timeout will be determined by
>> +     * the last CPU finished updating and thus cut short
>> +     */
>> +    atomic_inc(&cpu_out);
>> +    finished = atomic_read(&cpu_out);
>> +    while ( !error && finished != cpu_nr )
>
>Maybe i'm reading this wrong.. is "error" used uninitialized?

"error" is a static local variable. So it should be initialized to 0.
And if we got an error, the system will panic; there is no need to
re-initialize "error" at the begining of this function.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-12 15:27   ` Roger Pau Monné
@ 2019-03-13  5:05     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-13  5:05 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Konrad Rzeszutek Wilk,
	Ian Jackson, xen-devel

On Tue, Mar 12, 2019 at 04:27:33PM +0100, Roger Pau Monné wrote:
>On Mon, Mar 11, 2019 at 03:57:25PM +0800, Chao Gao wrote:
>> This patch provides a tool for late microcode update.
>> 
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>  tools/libxc/include/xenctrl.h |  1 +
>>  tools/libxc/xc_misc.c         | 20 ++++++++++
>>  tools/misc/Makefile           |  4 ++
>>  tools/misc/xenmicrocode.c     | 89 +++++++++++++++++++++++++++++++++++++++++++
>
>Would you mind naming the tool xen-ucode or xen-microcode?
>
>That seems more inline with the naming schemed used for most of the
>tools in misc/.
>
>> +int main(int argc, char *argv[])
>> +{
>> +    int fd, len, ret;
>> +    char *filename, *buf;
>> +    struct stat st;
>> +    struct xen_platform_op op;
>> +    xc_interface *xch;
>> +    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
>> +
>> +    if (argc < 2)
>> +    {
>> +        show_help();
>> +        return 0;
>> +    }
>> +
>> +    filename = argv[1];
>> +    fd = open(filename, O_RDONLY);
>> +    if (fd < 0) {
>> +        fprintf(stderr, "Could not open %s. (err: %s)\n",
>> +                filename, strerror(errno));
>> +        return errno;
>> +    }
>> +
>> +    if (stat(filename, &st) != 0) {
>> +        fprintf(stderr, "Could not get the size of %s. (err: %s)\n",
>> +                filename, strerror(errno));
>> +        return errno;
>> +    }
>> +
>> +    len = st.st_size;
>> +    buf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
>> +    if (buf == MAP_FAILED) {
>> +        fprintf(stderr, "mmap failed. (error: %s)\n", strerror(errno));
>> +        return errno;
>> +    }
>
>IMO you could just use fread to read the full contents of the file
>into the buffer you allocate and avoid this mmap dance.

Sure. Will follow your advices.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-12 16:53   ` Roger Pau Monné
  2019-03-12 23:31     ` Raj, Ashok
@ 2019-03-13  5:28     ` Chao Gao
  2019-03-13  7:39     ` Jan Beulich
  2 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-13  5:28 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel

Thanks for your great suggestion.

On Tue, Mar 12, 2019 at 05:53:53PM +0100, Roger Pau Monné wrote:
>On Mon, Mar 11, 2019 at 03:57:28PM +0800, Chao Gao wrote:
>> to replace the current per-cpu cache 'uci->mc'.
>> 
>> Compared to the current per-cpu cache, the benefits of the global
>> microcode cache are:
>> 1. It reduces the work that need to be done on each CPU. Parsing ucode
>> file is done once on one CPU. Other CPUs needn't parse ucode file.
>> Instead, they can find out and load the newest patch from the global
>> cache.
>> 2. It reduces the memory consumption on a system with many CPU cores.
>> 
>> Two functions, microcode_save_patch() and microcode_find_patch() are
>> introduced. The former adds one given patch to the global cache. The
>> latter gets a newer and matched ucode patch from the global cache.
>> All operations on the cache is expected to be done with the
>> 'microcode_mutex' hold.
>> 
>> Note that I deliberately avoid touching 'uci->mc' as I am going to
>> remove it completely in the next patch.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v6:
>>  - constify local variables and function parameters if possible
>>  - comment that the global cache is protected by 'microcode_mutex'.
>>    and add assertions to catch violations in microcode_{save/find}_patch()
>> 
>> Changes in v5:
>>  - reword the commit description
>>  - find_patch() and save_patch() are abstracted into common functions
>>    with some hooks for AMD and Intel
>> ---
>>  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
>>  xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
>>  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
>>  xen/include/asm-x86/microcode.h | 13 ++++++
>>  4 files changed, 215 insertions(+), 15 deletions(-)
>> 
>> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
>> index 4163f50..e629e6c 100644
>> --- a/xen/arch/x86/microcode.c
>> +++ b/xen/arch/x86/microcode.c
>> @@ -61,6 +61,8 @@ static struct ucode_mod_blob __initdata ucode_blob;
>>   */
>>  static bool_t __initdata ucode_scan;
>>  
>> +static LIST_HEAD(microcode_cache);
>> +
>>  void __init microcode_set_module(unsigned int idx)
>>  {
>>      ucode_mod_idx = idx;
>> @@ -208,6 +210,64 @@ static void microcode_fini_cpu(unsigned int cpu)
>>      spin_unlock(&microcode_mutex);
>>  }
>>  
>> +/*
>> + * Save an ucode patch to the global cache list.
>> + *
>> + * Return true if a patch is added to the global cache or it replaces
>> + * one old patch in the cache. Otherwise, return false. According to
>> + * the return value, the caller decides not to do an expensive ucode
>> + * update for some cases (i.e. when admin provides an old ucode).
>> + */
>> +bool microcode_save_patch(struct microcode_patch *new)
>> +{
>> +    struct microcode_patch *old;
>> +
>> +    ASSERT(spin_is_locked(&microcode_mutex));
>> +
>> +    list_for_each_entry(old, &microcode_cache, list)
>> +    {
>> +        enum microcode_match_result result =
>> +            microcode_ops->compare_patch(new, old);
>> +
>> +        if ( result == OLD_UCODE )
>> +        {
>> +            microcode_ops->free_patch(new);
>> +            return false;
>> +        }
>> +        else if ( result == NEW_UCODE )
>> +        {
>> +            list_replace(&old->list, &new->list);
>> +            microcode_ops->free_patch(old);
>> +            return true;
>> +        }
>> +        else /* result == MIS_UCODE */
>> +            continue;
>
>I think this is rather too simplistic, and I'm not sure you really
>need a list in order to store the microcodes.
>
>IIRC we agreed that systems with mixed CPU versions are not supported,

In previous version, I also mentioned that one instance might be enough.
(two instances considering the concern you elaborated below)
But I was told that it would be better to support mixed CPUs if they are
allowed by Intel or AMD. And the old per-cpu cache, IMO, also supports
mixed CPUs.

If now we don't want to support it any longer, I agree to your proposal.
I will defer to Jan's opinion.

Thanks
Chao

>hence the same microcode blob should be used to update all the
>possible CPUs on the system, so a list should not be needed here.
>
>Also I'm afraid that freeing the old microcode when a new version is
>uploaded is not fully correct. For example given the following
>scenario:
>
>1. Upload a microcode blob to the hypervisor and apply it to online
>CPUs.
>
>2. Upload a microcode blob with a higher version than the previous one,
>but which fails to apply. This microcode would replace the
>previous one.
>
>3. Online a CPU. This CPU will try to use the last uploaded microcode
>and fail, because last uploaded version is broken. Newly onlined CPUs
>would then end up with a microcode version different from the
>currently running CPUs, likely breaking the system.
>
>I think the best way to solve this is to ditch the list usage and
>instead only keep at most two microcode versions cached in the
>hypervisor:
>
> - The microcode version that's currently successfully applied (if any).
> - A microcode version higher than the current version, that has yet
>   to be applied.
>
>Once this new microcode version has been applied it will replace the
>previously applied version. If the new microcode version fails to
>apply it will be discarded, thus keeping a copy of the currently
>applied microcode version.
>
>With this approach AFAICT you only need two variables, one to store
>the currently applied microcode_patch and another one to save the new
>microcode version in order to attempt to apply it.
>
>I think this will also simplify some of the code. Let me know if this
>sounds sensible, or if I'm missing something.
>
>Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-12 16:53   ` Roger Pau Monné
  2019-03-12 23:31     ` Raj, Ashok
  2019-03-13  5:28     ` Chao Gao
@ 2019-03-13  7:39     ` Jan Beulich
  2019-03-13 10:30       ` Andrew Cooper
  2 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-03-13  7:39 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, xen-devel, Chao Gao

>>> On 12.03.19 at 17:53, <roger.pau@citrix.com> wrote:
> IIRC we agreed that systems with mixed CPU versions are not supported,
> hence the same microcode blob should be used to update all the
> possible CPUs on the system, so a list should not be needed here.

That's not what I recall. We agreed to not store everything, because
mixed-family / mixed-model systems are commonly not supported by
the hw vendors. But mixed stepping systems may well be.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 05/12] microcode: only save compatible ucode patches
  2019-03-12 17:03   ` Roger Pau Monné
@ 2019-03-13  7:45     ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-03-13  7:45 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, xen-devel, Chao Gao

>>> On 12.03.19 at 18:03, <roger.pau@citrix.com> wrote:
> On Mon, Mar 11, 2019 at 03:57:29PM +0800, Chao Gao wrote:
>> Intel CPU only allows mixing in stepping or 'pf'. If an ucode patch is
>> for other CPU families or models, it won't be compatible to all CPUs on
>> current system and even hot-plugged CPUs.  Don't save such patch to
>> reduce the size of ucode cache.
> 
> Oh, so we are indeed aiming to support systems that can require CPUs
> to use different microcode blobs in order to update?
> 
> If that's the case, and we are sure this will work, we do indeed need
> a list of microcodes, but you will have to somehow flag applied
> microcodes in order to prevent them from being replaced with new
> microcode versions until Xen knows that such new versions apply
> correctly.
> 
> Maybe I'm being too simplistic here, but does anyone really have a
> system with CPUs with different stepping?

I think the primary scenario is to see a newer stepping CPU added
into a slightly older system (hotplug or not). Whether people
actually do this I don't know, but I suspect there's a reason hw
vendors state what combinations are supported.

> Supporting such scenario adds quite a lot of complexity and even if
> theoretically possible I don't think it makes much sense to support
> unless there are such systems out there.

What complexity are you thinking about? Of course the expectation
is for hotplugged CPUs to have all features already present CPUs
have. And of the ones we find at boot we settle on a common
minimum set of features (or at least we try to), but in mixed-
stepping systems the common situation would anyway be that
they're all equal in terms of software visible features.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-13  5:02     ` Chao Gao
@ 2019-03-13  7:54       ` Jan Beulich
  2019-03-13  8:02         ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Jan Beulich @ 2019-03-13  7:54 UTC (permalink / raw)
  To: Ashok Raj, Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

>>> On 13.03.19 at 06:02, <chao.gao@intel.com> wrote:
> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>> +        ret = microcode_update_cpu();
>>
>>Does ret have any useful things on if the update failed? Doesn't seem 
>>to be used before you overwrite later in collect_cpu_info()?
> 
> It has the reason of failure on error. Actally, there are two reasons:
> one is no patch of newer revision, the other is we tried to update but
> the microcode revision didn't change. I can check this return value and
> print more informative message to admin. And furthermore, for the
> latter, we can remove the ucode patch from caches to address Roger's
> concern expressed in comments to patch 4 & 5.

Btw, I'm not sure removing such ucode from the cache is appropriate:
It may well apply elsewhere, unless there's a clear indication that the
blob is broken. So perhaps there needs to be special casing of -EIO,
which gets returned when the ucode rev reported by the CPU after
the update does not match expectations.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-13  7:54       ` Jan Beulich
@ 2019-03-13  8:02         ` Jan Beulich
  2019-03-14 12:39           ` Andrew Cooper
  2019-03-14 13:01           ` Chao Gao
  0 siblings, 2 replies; 50+ messages in thread
From: Jan Beulich @ 2019-03-13  8:02 UTC (permalink / raw)
  To: Ashok Raj, Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel, tglx, Borislav Petkov, Roger Pau Monne

>>> On 13.03.19 at 08:54, <JBeulich@suse.com> wrote:
>>>> On 13.03.19 at 06:02, <chao.gao@intel.com> wrote:
>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>>>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>> +        ret = microcode_update_cpu();
>>>
>>>Does ret have any useful things on if the update failed? Doesn't seem 
>>>to be used before you overwrite later in collect_cpu_info()?
>> 
>> It has the reason of failure on error. Actally, there are two reasons:
>> one is no patch of newer revision, the other is we tried to update but
>> the microcode revision didn't change. I can check this return value and
>> print more informative message to admin. And furthermore, for the
>> latter, we can remove the ucode patch from caches to address Roger's
>> concern expressed in comments to patch 4 & 5.
> 
> Btw, I'm not sure removing such ucode from the cache is appropriate:
> It may well apply elsewhere, unless there's a clear indication that the
> blob is broken. So perhaps there needs to be special casing of -EIO,
> which gets returned when the ucode rev reported by the CPU after
> the update does not match expectations.

An to go one step further, perhaps we should also store more than
just the newest variant for a given pf. If the newest fails to apply
but there is another one newer than what's on a CPU, updating to
that may work, and once that intermediate update worked, the
update to the newest version may then work too.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
  2019-03-12 15:27   ` Roger Pau Monné
@ 2019-03-13  9:24   ` Wei Liu
  2019-03-25  9:38   ` Sergey Dyasli
  2 siblings, 0 replies; 50+ messages in thread
From: Wei Liu @ 2019-03-13  9:24 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Konrad Rzeszutek Wilk,
	Ian Jackson, xen-devel

On Mon, Mar 11, 2019 at 03:57:25PM +0800, Chao Gao wrote:
> +    ret = xc_platform_op(xch, &op);
> +    if ( ret )

Since you will be resending anyway, please make the coding style
consistent here. Thanks.

> +        fprintf(stderr, "Failed to update microcode. (err: %d)\n", ret);

ret is -1 here. I think you want errno.

Wei.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-13  7:39     ` Jan Beulich
@ 2019-03-13 10:30       ` Andrew Cooper
  2019-03-13 17:04         ` Andrew Cooper
  0 siblings, 1 reply; 50+ messages in thread
From: Andrew Cooper @ 2019-03-13 10:30 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monne
  Cc: Sergey Dyasli, xen-devel, Wei Liu, Ashok Raj, Chao Gao

On 13/03/2019 07:39, Jan Beulich wrote:
>>>> On 12.03.19 at 17:53, <roger.pau@citrix.com> wrote:
>> IIRC we agreed that systems with mixed CPU versions are not supported,
>> hence the same microcode blob should be used to update all the
>> possible CPUs on the system, so a list should not be needed here.
> That's not what I recall. We agreed to not store everything, because
> mixed-family / mixed-model systems are commonly not supported by
> the hw vendors. But mixed stepping systems may well be.

The difference between Skylake and CascadeLake server CPUs is only in
the stepping (4 vs 6).  Same for Kaby/Coffee/Whiskey/Amber lake (the
mobile and desktop lines have a model each, and newer generations are
just new steppings).

I'll have to defer to Intel as to exactly what is supported, but when I
asked this before, the answer was that no heterogeneity was supported at
all, not even at the stepping level or platform ID level.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-11  7:57 ` [PATCH v6 04/12] microcode: introduce a global cache of ucode patch Chao Gao
  2019-03-12 16:53   ` Roger Pau Monné
@ 2019-03-13 16:36   ` Sergey Dyasli
  2019-03-14  1:39     ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Sergey Dyasli @ 2019-03-13 16:36 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Wei Liu,
	Ashok Raj, Andrew Cooper, Jan Beulich, Roger Pau Monné

On 11/03/2019 07:57, Chao Gao wrote:
> to replace the current per-cpu cache 'uci->mc'.
> 
> Compared to the current per-cpu cache, the benefits of the global
> microcode cache are:
> 1. It reduces the work that need to be done on each CPU. Parsing ucode
> file is done once on one CPU. Other CPUs needn't parse ucode file.
> Instead, they can find out and load the newest patch from the global
> cache.
> 2. It reduces the memory consumption on a system with many CPU cores.
> 
> Two functions, microcode_save_patch() and microcode_find_patch() are
> introduced. The former adds one given patch to the global cache. The
> latter gets a newer and matched ucode patch from the global cache.
> All operations on the cache is expected to be done with the
> 'microcode_mutex' hold.
> 
> Note that I deliberately avoid touching 'uci->mc' as I am going to
> remove it completely in the next patch.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
> Changes in v6:
>  - constify local variables and function parameters if possible
>  - comment that the global cache is protected by 'microcode_mutex'.
>    and add assertions to catch violations in microcode_{save/find}_patch()
> 
> Changes in v5:
>  - reword the commit description
>  - find_patch() and save_patch() are abstracted into common functions
>    with some hooks for AMD and Intel
> ---
>  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
>  xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
>  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
>  xen/include/asm-x86/microcode.h | 13 ++++++
>  4 files changed, 215 insertions(+), 15 deletions(-)

This needs the following change for the AMD part:

diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
index 4aa8fdca38..358f3c44e3 100644
--- a/xen/arch/x86/microcode_amd.c
+++ b/xen/arch/x86/microcode_amd.c
@@ -249,6 +249,7 @@ static enum microcode_match_result compare_patch(
 static int apply_microcode(void)
 {
     uint32_t rev;
+    struct microcode_amd *mc_amd;
     const struct microcode_header_amd *hdr;
     const struct microcode_patch *patch;
     int hw_err;
@@ -259,7 +260,8 @@ static int apply_microcode(void)
     if ( patch == NULL )
         return -EINVAL;

-    hdr = patch->data;
+    mc_amd = patch->data;
+    hdr = mc_amd->mpb;

     BUG_ON(local_irq_is_enabled());


Another thing to mention is that even with this fix applied, I get
early ucode update only on CPU0. With the whole series applied, secondary
CPUs also start doing early ucode update.

--
Thanks,
Sergey

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-13 10:30       ` Andrew Cooper
@ 2019-03-13 17:04         ` Andrew Cooper
  2019-03-14  7:42           ` Jan Beulich
  0 siblings, 1 reply; 50+ messages in thread
From: Andrew Cooper @ 2019-03-13 17:04 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monne
  Cc: Sergey Dyasli, Chao Gao, Wei Liu, Ashok Raj, xen-devel

On 13/03/2019 10:30, Andrew Cooper wrote:
> On 13/03/2019 07:39, Jan Beulich wrote:
>>>>> On 12.03.19 at 17:53, <roger.pau@citrix.com> wrote:
>>> IIRC we agreed that systems with mixed CPU versions are not supported,
>>> hence the same microcode blob should be used to update all the
>>> possible CPUs on the system, so a list should not be needed here.
>> That's not what I recall. We agreed to not store everything, because
>> mixed-family / mixed-model systems are commonly not supported by
>> the hw vendors. But mixed stepping systems may well be.
> The difference between Skylake and CascadeLake server CPUs is only in
> the stepping (4 vs 6).  Same for Kaby/Coffee/Whiskey/Amber lake (the
> mobile and desktop lines have a model each, and newer generations are
> just new steppings).
>
> I'll have to defer to Intel as to exactly what is supported, but when I
> asked this before, the answer was that no heterogeneity was supported at
> all, not even at the stepping level or platform ID level.

So, as it turns out, Xen in practice only supports a single files worth
of microcode from /lib/firmware

This is because:
1) That is the only behaviour drakut has, and
2) drakut has been the default initrd-generator in Linux distros for far
longer than Xen has had boot time microcode loading support, and
3) The runtime microcode loading hypercall is currently completely broken

Therefore in practice, the drakut way of doing things is the only way
which has ever seen any testing, and is therefore the only one which works.

So, making a simplifying assumption of "single stepping/plat_id only" in
the Intel case is fine if it useful.

Beyond that, it would be useful to know what is an isn't supported by
Intel, and whether we are likely to find systems in the wild with mixed
cpu packages.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-13 16:36   ` Sergey Dyasli
@ 2019-03-14  1:39     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-14  1:39 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel,
	Roger Pau Monné

On Wed, Mar 13, 2019 at 04:36:50PM +0000, Sergey Dyasli wrote:
>On 11/03/2019 07:57, Chao Gao wrote:
>> to replace the current per-cpu cache 'uci->mc'.
>> 
>> Compared to the current per-cpu cache, the benefits of the global
>> microcode cache are:
>> 1. It reduces the work that need to be done on each CPU. Parsing ucode
>> file is done once on one CPU. Other CPUs needn't parse ucode file.
>> Instead, they can find out and load the newest patch from the global
>> cache.
>> 2. It reduces the memory consumption on a system with many CPU cores.
>> 
>> Two functions, microcode_save_patch() and microcode_find_patch() are
>> introduced. The former adds one given patch to the global cache. The
>> latter gets a newer and matched ucode patch from the global cache.
>> All operations on the cache is expected to be done with the
>> 'microcode_mutex' hold.
>> 
>> Note that I deliberately avoid touching 'uci->mc' as I am going to
>> remove it completely in the next patch.
>> 
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>> Changes in v6:
>>  - constify local variables and function parameters if possible
>>  - comment that the global cache is protected by 'microcode_mutex'.
>>    and add assertions to catch violations in microcode_{save/find}_patch()
>> 
>> Changes in v5:
>>  - reword the commit description
>>  - find_patch() and save_patch() are abstracted into common functions
>>    with some hooks for AMD and Intel
>> ---
>>  xen/arch/x86/microcode.c        | 60 +++++++++++++++++++++++++++
>>  xen/arch/x86/microcode_amd.c    | 91 +++++++++++++++++++++++++++++++++++++----
>>  xen/arch/x86/microcode_intel.c  | 66 ++++++++++++++++++++++++++----
>>  xen/include/asm-x86/microcode.h | 13 ++++++
>>  4 files changed, 215 insertions(+), 15 deletions(-)

Hi Sergey,

Thanks for your testing.

>
>This needs the following change for the AMD part:
>
>diff --git a/xen/arch/x86/microcode_amd.c b/xen/arch/x86/microcode_amd.c
>index 4aa8fdca38..358f3c44e3 100644
>--- a/xen/arch/x86/microcode_amd.c
>+++ b/xen/arch/x86/microcode_amd.c
>@@ -249,6 +249,7 @@ static enum microcode_match_result compare_patch(
> static int apply_microcode(void)
> {
>     uint32_t rev;
>+    struct microcode_amd *mc_amd;
>     const struct microcode_header_amd *hdr;
>     const struct microcode_patch *patch;
>     int hw_err;
>@@ -259,7 +260,8 @@ static int apply_microcode(void)
>     if ( patch == NULL )
>         return -EINVAL;
>
>-    hdr = patch->data;
>+    mc_amd = patch->data;
>+    hdr = mc_amd->mpb;
>
>     BUG_ON(local_irq_is_enabled());

Yes. Indeed we need this fix.

>
>
>Another thing to mention is that even with this fix applied, I get
>early ucode update only on CPU0.

In this patch, I checked the return value of microcode_save_patch().
If it failed, the CPU wouldn't update ucode. Here, each cpu will
parse microcode itself and add found patches to cache. Only the first
saving would succeed and I believe it causes the issue you found.

>With the whole series applied, secondary
>CPUs also start doing early ucode update.

I assume that you mean secondary CPUs also invoke .apply_microcode
callback.

This is an expected behavior. On early ucode update, the sibling info isn't
initialized. We don't do anything to prevent secondary cpus (for we haven't
identified them) from updating CPU. Secondary CPUs wouldn't find any
newer patch when calling microcode_find_patch() in apply_microcode().

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 04/12] microcode: introduce a global cache of ucode patch
  2019-03-13 17:04         ` Andrew Cooper
@ 2019-03-14  7:42           ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-03-14  7:42 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, xen-devel, Roger Pau Monne, Chao Gao

>>> On 13.03.19 at 18:04, <andrew.cooper3@citrix.com> wrote:
> On 13/03/2019 10:30, Andrew Cooper wrote:
>> On 13/03/2019 07:39, Jan Beulich wrote:
>>>>>> On 12.03.19 at 17:53, <roger.pau@citrix.com> wrote:
>>>> IIRC we agreed that systems with mixed CPU versions are not supported,
>>>> hence the same microcode blob should be used to update all the
>>>> possible CPUs on the system, so a list should not be needed here.
>>> That's not what I recall. We agreed to not store everything, because
>>> mixed-family / mixed-model systems are commonly not supported by
>>> the hw vendors. But mixed stepping systems may well be.
>> The difference between Skylake and CascadeLake server CPUs is only in
>> the stepping (4 vs 6).  Same for Kaby/Coffee/Whiskey/Amber lake (the
>> mobile and desktop lines have a model each, and newer generations are
>> just new steppings).
>>
>> I'll have to defer to Intel as to exactly what is supported, but when I
>> asked this before, the answer was that no heterogeneity was supported at
>> all, not even at the stepping level or platform ID level.
> 
> So, as it turns out, Xen in practice only supports a single files worth
> of microcode from /lib/firmware
> 
> This is because:
> 1) That is the only behaviour drakut has, and
> 2) drakut has been the default initrd-generator in Linux distros for far
> longer than Xen has had boot time microcode loading support, and

I disagree. pulling the microcode blob out of the initrd is only a
secondary option, as far as Xen's history is concerned. My
original early load implementation required specifying a separate
module in the grub configuration; ucode=scan support was added
later by Konrad. Yet the ucode=<num> logic also supports the
full microcode.bin blob to be loaded. (And as to the timeline, I'm
not sure SLE in particular was switched to dracut before Xen had
gained early loading support.)

That said - I agree with you that the overwhelming amount of
testing likely covers the case you mention, and nothing else. So
if for all practical purposes supporting just a single family:model:
stepping tuple is enough - fine with me. Given my other reply
regarding how to deal with load failure though, I'm not sure this
would mean we can resort back to caching just a single blob.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-13  8:02         ` Jan Beulich
@ 2019-03-14 12:39           ` Andrew Cooper
  2019-03-14 18:57             ` Raj, Ashok
  2019-03-14 13:01           ` Chao Gao
  1 sibling, 1 reply; 50+ messages in thread
From: Andrew Cooper @ 2019-03-14 12:39 UTC (permalink / raw)
  To: Jan Beulich, Ashok Raj, Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Jun Nakajima, xen-devel,
	tglx, Borislav Petkov, Roger Pau Monne

On 13/03/2019 08:02, Jan Beulich wrote:
>>>> On 13.03.19 at 08:54, <JBeulich@suse.com> wrote:
>>>>> On 13.03.19 at 06:02, <chao.gao@intel.com> wrote:
>>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>>>> On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>>> +        ret = microcode_update_cpu();
>>>> Does ret have any useful things on if the update failed? Doesn't seem 
>>>> to be used before you overwrite later in collect_cpu_info()?
>>> It has the reason of failure on error. Actally, there are two reasons:
>>> one is no patch of newer revision, the other is we tried to update but
>>> the microcode revision didn't change. I can check this return value and
>>> print more informative message to admin. And furthermore, for the
>>> latter, we can remove the ucode patch from caches to address Roger's
>>> concern expressed in comments to patch 4 & 5.
>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>> It may well apply elsewhere, unless there's a clear indication that the
>> blob is broken. So perhaps there needs to be special casing of -EIO,
>> which gets returned when the ucode rev reported by the CPU after
>> the update does not match expectations.
> An to go one step further, perhaps we should also store more than
> just the newest variant for a given pf. If the newest fails to apply
> but there is another one newer than what's on a CPU, updating to
> that may work, and once that intermediate update worked, the
> update to the newest version may then work too.

I don't think this is sensible.

Running with mismatched microcode is unsupported (for a suitable
definition of mismatched).

At boot, there will be 1 individual blob which is applicable to the
CPUs, and gets loaded on all APs.  (Possibly more than 1 blob, but
remember that only multi-socket servers and top end workstations have a
chance of having mixed steppings in the first place.)

During late load, we will have 1 (or more) blobs provided in the late
hypercall, and we will apply in a logically-atomic fashion in a
rendezvous'd context.

There are a few outcomes from this action.  If the ucode application
fails internally, the system is already lost and will crash.  This is an
inherent risk which people doing late loading need to be aware of (and
why test workloads exist in production setups).

If some cores accept the update but others don't, then we've also got
serious problems and probably system instability.  This is the kind of
problem which needs to be detected during testing, and may require a
change in application strategy, or may in practice prevent a particular
from being declared safe to late load.

The expected case is that all cores accept the blob during the rendezvous. 

As a result, I don't see any need to store more than a single ucode
version (whether this is a single blob or perhaps a set of closely
related blobs for a mixed-stepping system) in the stead state (to cope
with AP boot, suspend/resume, or CPU hotplug), and a second version
which is the proposed-new ucode for late loading.

On late load failure, we should dump enough information to work out
exactly what went on, to determine how best to proceed, but the server
is effectively lost to us.  On late load success, the proposed new
"version" replaces the current "version".

And again - I reiterate the point that I think it is fine to have a
simplifying assumption that we don't have mixed stepping systems to
start with, presuming this is generally in line with Intel's support
statement.  If in practice we find mixed stepping systems which are
supported by an OEM/Intel, we can see about extending the logic.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-13  8:02         ` Jan Beulich
  2019-03-14 12:39           ` Andrew Cooper
@ 2019-03-14 13:01           ` Chao Gao
  2019-03-14 13:08             ` Jan Beulich
  1 sibling, 1 reply; 50+ messages in thread
From: Chao Gao @ 2019-03-14 13:01 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper,
	Jun Nakajima, xen-devel, tglx, Borislav Petkov, Roger Pau Monne

On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote:
>>>> On 13.03.19 at 08:54, <JBeulich@suse.com> wrote:
>>>>> On 13.03.19 at 06:02, <chao.gao@intel.com> wrote:
>>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>>>>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>>> +        ret = microcode_update_cpu();
>>>>
>>>>Does ret have any useful things on if the update failed? Doesn't seem 
>>>>to be used before you overwrite later in collect_cpu_info()?
>>> 
>>> It has the reason of failure on error. Actally, there are two reasons:
>>> one is no patch of newer revision, the other is we tried to update but
>>> the microcode revision didn't change. I can check this return value and
>>> print more informative message to admin. And furthermore, for the
>>> latter, we can remove the ucode patch from caches to address Roger's
>>> concern expressed in comments to patch 4 & 5.
>> 
>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>> It may well apply elsewhere, unless there's a clear indication that the
>> blob is broken.

Yes. Got it. Can we just assume we won't encounter that ucode update
succeeded only on part of cpus and warn a reboot is needed if it happened?
We definitely want to tolerate some kinds of hardware misbehavior. But
for such cases which are unlikely to happen, I prefer to improve this code
when we meet this kind of issue.

>> So perhaps there needs to be special casing of -EIO,
>> which gets returned when the ucode rev reported by the CPU after
>> the update does not match expectations.
>
>An to go one step further, perhaps we should also store more than
>just the newest variant for a given pf. If the newest fails to apply
>but there is another one newer than what's on a CPU, updating to
>that may work, and once that intermediate update worked, the
>update to the newest version may then work too.

Intel SDM doesn't mention this dependency (to apply an ucode relies on a
specific old ucode applied). Perhaps we can also assume we won't fall
into this case.

Hi Ashok,

Do you know whether Intel's ucode update mechanism has such dependency?

Thanks
Chao


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-14 13:01           ` Chao Gao
@ 2019-03-14 13:08             ` Jan Beulich
  0 siblings, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-03-14 13:08 UTC (permalink / raw)
  To: Chao Gao
  Cc: Sergey Dyasli, Kevin Tian, Wei Liu, Ashok Raj, Andrew Cooper,
	Jun Nakajima, xen-devel, tglx, Borislav Petkov, Roger Pau Monne

>>> On 14.03.19 at 14:01, <chao.gao@intel.com> wrote:
> On Wed, Mar 13, 2019 at 02:02:55AM -0600, Jan Beulich wrote:
>>>>> On 13.03.19 at 08:54, <JBeulich@suse.com> wrote:
>>>>>> On 13.03.19 at 06:02, <chao.gao@intel.com> wrote:
>>>> On Tue, Mar 12, 2019 at 05:07:51PM -0700, Raj, Ashok wrote:
>>>>>On Mon, Mar 11, 2019 at 03:57:35PM +0800, Chao Gao wrote:
>>>>>> +    if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
>>>>>> +        ret = microcode_update_cpu();
>>>>>
>>>>>Does ret have any useful things on if the update failed? Doesn't seem 
>>>>>to be used before you overwrite later in collect_cpu_info()?
>>>> 
>>>> It has the reason of failure on error. Actally, there are two reasons:
>>>> one is no patch of newer revision, the other is we tried to update but
>>>> the microcode revision didn't change. I can check this return value and
>>>> print more informative message to admin. And furthermore, for the
>>>> latter, we can remove the ucode patch from caches to address Roger's
>>>> concern expressed in comments to patch 4 & 5.
>>> 
>>> Btw, I'm not sure removing such ucode from the cache is appropriate:
>>> It may well apply elsewhere, unless there's a clear indication that the
>>> blob is broken.
> 
> Yes. Got it. Can we just assume we won't encounter that ucode update
> succeeded only on part of cpus and warn a reboot is needed if it happened?
> We definitely want to tolerate some kinds of hardware misbehavior. But
> for such cases which are unlikely to happen, I prefer to improve this code
> when we meet this kind of issue.

Okay, for both this and ...

>>> So perhaps there needs to be special casing of -EIO,
>>> which gets returned when the ucode rev reported by the CPU after
>>> the update does not match expectations.
>>
>>An to go one step further, perhaps we should also store more than
>>just the newest variant for a given pf. If the newest fails to apply
>>but there is another one newer than what's on a CPU, updating to
>>that may work, and once that intermediate update worked, the
>>update to the newest version may then work too.

... see also Andrew's reply. As long as no-one is aware of mixed-
stepping systems out there in the wild, I'm fine with simplifying
things where possible.

> Intel SDM doesn't mention this dependency (to apply an ucode relies on a
> specific old ucode applied). Perhaps we can also assume we won't fall
> into this case.

Please see Linux'es
arch/x86/kernel/cpu/microcode/intel.c:is_blacklisted()
for one of the possible problematic situations here. Of course
the SDM wouldn't mention it, only the errata document for this
specific model would.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-14 12:39           ` Andrew Cooper
@ 2019-03-14 18:57             ` Raj, Ashok
  2019-03-14 20:25               ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Raj, Ashok @ 2019-03-14 18:57 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Wei Liu, Jan Beulich,
	Mallick, Asit K, Jun Nakajima, Ashok Raj, xen-devel, tglx,
	Chao Gao, Roger Pau Monne

Hi Andrew 

Sorry it took a while, since I had to check with couple others.

+ Asit

On Thu, Mar 14, 2019 at 12:39:46PM +0000, Andrew Cooper wrote:
[snip]
> 
> On late load failure, we should dump enough information to work out
> exactly what went on, to determine how best to proceed, but the server
> is effectively lost to us.  On late load success, the proposed new
> "version" replaces the current "version".
> 
> And again - I reiterate the point that I think it is fine to have a
> simplifying assumption that we don't have mixed stepping systems to
> start with, presuming this is generally in line with Intel's support
> statement.  If in practice we find mixed stepping systems which are
> supported by an OEM/Intel, we can see about extending the logic.

Checking with Asit he says it is in fact permitted to have 1 step behind
even on a multi-socket system. One could be N and other N-1 should be 
supported.

Cheers,
Ashok

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-14 18:57             ` Raj, Ashok
@ 2019-03-14 20:25               ` Thomas Gleixner
  2019-03-15  9:40                 ` Andrew Cooper
  0 siblings, 1 reply; 50+ messages in thread
From: Thomas Gleixner @ 2019-03-14 20:25 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Wei Liu,
	Jun Nakajima, Mallick, Asit K, Andrew Cooper, Peter Zijlstra,
	Jan Beulich, xen-devel, Chao Gao, Roger Pau Monne

[-- Attachment #1: Type: text/plain, Size: 1348 bytes --]

Ashok,

On Thu, 14 Mar 2019, Raj, Ashok wrote:
> On Thu, Mar 14, 2019 at 12:39:46PM +0000, Andrew Cooper wrote:
> > On late load failure, we should dump enough information to work out
> > exactly what went on, to determine how best to proceed, but the server
> > is effectively lost to us.  On late load success, the proposed new
> > "version" replaces the current "version".
> > 
> > And again - I reiterate the point that I think it is fine to have a
> > simplifying assumption that we don't have mixed stepping systems to
> > start with, presuming this is generally in line with Intel's support
> > statement.  If in practice we find mixed stepping systems which are
> > supported by an OEM/Intel, we can see about extending the logic.
> 
> Checking with Asit he says it is in fact permitted to have 1 step behind
> even on a multi-socket system. One could be N and other N-1 should be 
> supported.

That turns into a total disaster if N has an issue fixed ant N-1 requires
microcode + software workaround.

So if N is on the boot socket, then we fail to enable the workaround
because CPU0 has the 'Issue fixed' bit set.

If N-1 is on the boot socket, then we go to do the workaround nevertheless
on N and that might dependend on the issue just be some pointless exercise
or even try to access some MSR which is not available.

*Shudder*

	tglx

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-14 20:25               ` Thomas Gleixner
@ 2019-03-15  9:40                 ` Andrew Cooper
  2019-03-15 10:44                   ` Thomas Gleixner
  0 siblings, 1 reply; 50+ messages in thread
From: Andrew Cooper @ 2019-03-15  9:40 UTC (permalink / raw)
  To: Raj, Ashok
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Wei Liu,
	Jun Nakajima, Mallick, Asit K, Peter Zijlstra, Jan Beulich,
	xen-devel, Thomas Gleixner, Chao Gao, Roger Pau Monne

On 14/03/2019 20:25, Thomas Gleixner wrote:
> Ashok,
>
> On Thu, 14 Mar 2019, Raj, Ashok wrote:
>> On Thu, Mar 14, 2019 at 12:39:46PM +0000, Andrew Cooper wrote:
>>> On late load failure, we should dump enough information to work out
>>> exactly what went on, to determine how best to proceed, but the server
>>> is effectively lost to us.  On late load success, the proposed new
>>> "version" replaces the current "version".
>>>
>>> And again - I reiterate the point that I think it is fine to have a
>>> simplifying assumption that we don't have mixed stepping systems to
>>> start with, presuming this is generally in line with Intel's support
>>> statement.  If in practice we find mixed stepping systems which are
>>> supported by an OEM/Intel, we can see about extending the logic.
>> Checking with Asit he says it is in fact permitted to have 1 step behind
>> even on a multi-socket system. One could be N and other N-1 should be 
>> supported.
> That turns into a total disaster if N has an issue fixed ant N-1 requires
> microcode + software workaround.
>
> So if N is on the boot socket, then we fail to enable the workaround
> because CPU0 has the 'Issue fixed' bit set.
>
> If N-1 is on the boot socket, then we go to do the workaround nevertheless
> on N and that might dependend on the issue just be some pointless exercise
> or even try to access some MSR which is not available.
>
> *Shudder*

Intel: Are you saying that Skylake (06-55-04) is supported in
combination with Cascade Lake B0 (06-55-05) and/or Cascade Lake B1
(06-55-06) ?

The most insidious problem is TSX_FORCE_ABORT between the two Cascade
Lakes.  There really will be an asymmetric existence of an MSR required
for use in one part of the system, and unavailable in the other part of
the system.

To a certain degree, what is technically supported by Intel is also
tempered by what the major OS/VMM vendors are willing to boot on, as
that is ultimately what the customer is paying for.  When the steppings
differed only by the errata fixed, and the silicon was otherwise
identical from software's point of view, supporting a range of adjacent
steppings seems entirely reasonable.

In this case you've got 3 adjacent steppings, *all* of which offer
different architecturally defined features, and will involve software
changes to allow mixed systems to function in a safe way.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading
  2019-03-15  9:40                 ` Andrew Cooper
@ 2019-03-15 10:44                   ` Thomas Gleixner
  0 siblings, 0 replies; 50+ messages in thread
From: Thomas Gleixner @ 2019-03-15 10:44 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Sergey Dyasli, Kevin Tian, Borislav Petkov, Wei Liu, Raj, Ashok,
	Jun Nakajima, Mallick, Asit K, Peter Zijlstra, Jan Beulich,
	xen-devel, Chao Gao, Roger Pau Monne

[-- Attachment #1: Type: text/plain, Size: 2736 bytes --]

On Fri, 15 Mar 2019, Andrew Cooper wrote:
> On 14/03/2019 20:25, Thomas Gleixner wrote:
> > On Thu, 14 Mar 2019, Raj, Ashok wrote:
> >> On Thu, Mar 14, 2019 at 12:39:46PM +0000, Andrew Cooper wrote:
> >>> On late load failure, we should dump enough information to work out
> >>> exactly what went on, to determine how best to proceed, but the server
> >>> is effectively lost to us.  On late load success, the proposed new
> >>> "version" replaces the current "version".
> >>>
> >>> And again - I reiterate the point that I think it is fine to have a
> >>> simplifying assumption that we don't have mixed stepping systems to
> >>> start with, presuming this is generally in line with Intel's support
> >>> statement.  If in practice we find mixed stepping systems which are
> >>> supported by an OEM/Intel, we can see about extending the logic.
> >> Checking with Asit he says it is in fact permitted to have 1 step behind
> >> even on a multi-socket system. One could be N and other N-1 should be 
> >> supported.
> > That turns into a total disaster if N has an issue fixed ant N-1 requires
> > microcode + software workaround.
> >
> > So if N is on the boot socket, then we fail to enable the workaround
> > because CPU0 has the 'Issue fixed' bit set.
> >
> > If N-1 is on the boot socket, then we go to do the workaround nevertheless
> > on N and that might dependend on the issue just be some pointless exercise
> > or even try to access some MSR which is not available.
> >
> > *Shudder*
> 
> Intel: Are you saying that Skylake (06-55-04) is supported in
> combination with Cascade Lake B0 (06-55-05) and/or Cascade Lake B1
> (06-55-06) ?
> 
> The most insidious problem is TSX_FORCE_ABORT between the two Cascade
> Lakes.  There really will be an asymmetric existence of an MSR required
> for use in one part of the system, and unavailable in the other part of
> the system.
> 
> To a certain degree, what is technically supported by Intel is also
> tempered by what the major OS/VMM vendors are willing to boot on, as
> that is ultimately what the customer is paying for.  When the steppings
> differed only by the errata fixed, and the silicon was otherwise
> identical from software's point of view, supporting a range of adjacent
> steppings seems entirely reasonable.

Sure, if the software does not have to worry about the differences then
supporting that is a no-brainer.

> In this case you've got 3 adjacent steppings, *all* of which offer
> different architecturally defined features, and will involve software
> changes to allow mixed systems to function in a safe way.

Let's not go there. We've seen the mess which other architectures created
with big/little CPUs which expose different feature sets.

Thanks,

	tglx

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 00/12] improve late microcode loading
  2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
                   ` (11 preceding siblings ...)
  2019-03-11  7:57 ` [PATCH v6 12/12] microcode: update microcode on cores in parallel Chao Gao
@ 2019-03-19 20:22 ` Woods, Brian
  2019-03-19 21:39   ` Woods, Brian
  12 siblings, 1 reply; 50+ messages in thread
From: Woods, Brian @ 2019-03-19 20:22 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Boris Ostrovsky, Suthikulpanit, Suravee, Roger Pau Monné

On 3/11/19 2:57 AM, Chao Gao wrote:
> Major changes in version 6:
>   - run wbinvd before updating microcode (patch 10)
>   - add an userspace tool for late microcode update (patch 1)
>   - scale time to wait by the number of remaining CPUs to respond
>   - remove 'cpu' parameters from some related callbacks and functins
>   - save an ucode patch only if its supported CPU is allowed to mix with
>     current cpu.
> 
> Changes in version 5:
>   - support parallel microcode updates for all cores (see patch 8)
>   - Address Roger's comments on the last version.
> 
> The intention of this series is to make the late microcode loading
> more reliable by rendezvousing all cpus in stop_machine context.
> This idea comes from Ashok. I am porting his linux patch to Xen
> (see patch 10 and 11 for more details).
> 
> This series makes five changes:
>   1. Patch 1: an userspace tool for late microcode update
>   2. Patch 2-9: introduce a global microcode cache and some cleanup
>   3. Patch 10: writeback and invalidate cache before updating microcode
>   3. Patch 11: synchronize late microcode loading
>   4. Patch 12: support parallel microcodes update on different cores
> 
> Currently, late microcode loading does a lot of things including
> parsing microcode blob, checking the signature/revision and performing
> update. Putting all of them into stop_machine context is a bad idea
> because of complexity (One issue I observed is memory allocation
> triggered one assertion in stop_machine context). In order to simplify
> the load process, I move parsing microcode out of the load process.
> The microcode blob is parsed and a global microcode cache is built on
> a single CPU before rendezvousing all cpus to update microcode. Other
> CPUs just get and load a suitable microcode from the global cache.
> With this global cache, it is safe to put simplified load process to
> stop_machine context.
> 
> Regarding changes to AMD side, I didn't do any test for them due to
> lack of hardware. Could you help to test this series on an AMD machine?
> At least, two basic tests are needed:
> * do a microcode update after system bootup
> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
>    command line and then do a microcode update and online all offlined
>    CPUs via 'xen-hptool'.
> 
> Chao Gao (12):
>    misc/xenmicrocode: Upload a microcode blob to the hypervisor
>    microcode/intel: use union to get fields without shifting and masking
>    microcode/intel: extend microcode_update_match()
>    microcode: introduce a global cache of ucode patch
>    microcode: only save compatible ucode patches
>    microcode: remove struct ucode_cpu_info
>    microcode: remove pointless 'cpu' parameter
>    microcode: split out apply_microcode() from cpu_request_microcode()
>    microcode: remove struct microcode_info
>    microcode/intel: Writeback and invalidate caches before updating
>      microcode
>    x86/microcode: Synchronize late microcode loading
>    microcode: update microcode on cores in parallel
> 
>   tools/libxc/include/xenctrl.h   |   1 +
>   tools/libxc/xc_misc.c           |  20 +++
>   tools/misc/Makefile             |   4 +
>   tools/misc/xenmicrocode.c       |  89 ++++++++++
>   xen/arch/x86/acpi/power.c       |   2 +-
>   xen/arch/x86/apic.c             |   2 +-
>   xen/arch/x86/microcode.c        | 380 +++++++++++++++++++++++++++-------------
>   xen/arch/x86/microcode_amd.c    | 236 ++++++++++++-------------
>   xen/arch/x86/microcode_intel.c  | 206 +++++++++++++---------
>   xen/arch/x86/smpboot.c          |   5 +-
>   xen/arch/x86/spec_ctrl.c        |   2 +-
>   xen/include/asm-x86/microcode.h |  40 +++--
>   xen/include/asm-x86/processor.h |   3 +-
>   13 files changed, 639 insertions(+), 351 deletions(-)
>   create mode 100644 tools/misc/xenmicrocode.c
> 

Sorry for the delay.  These patches fail on F17h.  I'm looking into 
where it fails now.
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 00/12] improve late microcode loading
  2019-03-19 20:22 ` [PATCH v6 00/12] improve late microcode loading Woods, Brian
@ 2019-03-19 21:39   ` Woods, Brian
  2019-03-20  8:58     ` Chao Gao
  0 siblings, 1 reply; 50+ messages in thread
From: Woods, Brian @ 2019-03-19 21:39 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	Boris Ostrovsky, Suthikulpanit, Suravee, Roger Pau Monné

On 3/19/19 3:22 PM, Brian Woods wrote:
> On 3/11/19 2:57 AM, Chao Gao wrote:
>> Major changes in version 6:
>>   - run wbinvd before updating microcode (patch 10)
>>   - add an userspace tool for late microcode update (patch 1)
>>   - scale time to wait by the number of remaining CPUs to respond
>>   - remove 'cpu' parameters from some related callbacks and functins
>>   - save an ucode patch only if its supported CPU is allowed to mix with
>>     current cpu.
>>
>> Changes in version 5:
>>   - support parallel microcode updates for all cores (see patch 8)
>>   - Address Roger's comments on the last version.
>>
>> The intention of this series is to make the late microcode loading
>> more reliable by rendezvousing all cpus in stop_machine context.
>> This idea comes from Ashok. I am porting his linux patch to Xen
>> (see patch 10 and 11 for more details).
>>
>> This series makes five changes:
>>   1. Patch 1: an userspace tool for late microcode update
>>   2. Patch 2-9: introduce a global microcode cache and some cleanup
>>   3. Patch 10: writeback and invalidate cache before updating microcode
>>   3. Patch 11: synchronize late microcode loading
>>   4. Patch 12: support parallel microcodes update on different cores
>>
>> Currently, late microcode loading does a lot of things including
>> parsing microcode blob, checking the signature/revision and performing
>> update. Putting all of them into stop_machine context is a bad idea
>> because of complexity (One issue I observed is memory allocation
>> triggered one assertion in stop_machine context). In order to simplify
>> the load process, I move parsing microcode out of the load process.
>> The microcode blob is parsed and a global microcode cache is built on
>> a single CPU before rendezvousing all cpus to update microcode. Other
>> CPUs just get and load a suitable microcode from the global cache.
>> With this global cache, it is safe to put simplified load process to
>> stop_machine context.
>>
>> Regarding changes to AMD side, I didn't do any test for them due to
>> lack of hardware. Could you help to test this series on an AMD machine?
>> At least, two basic tests are needed:
>> * do a microcode update after system bootup
>> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
>>    command line and then do a microcode update and online all offlined
>>    CPUs via 'xen-hptool'.
>>
>> Chao Gao (12):
>>    misc/xenmicrocode: Upload a microcode blob to the hypervisor
>>    microcode/intel: use union to get fields without shifting and masking
>>    microcode/intel: extend microcode_update_match()
>>    microcode: introduce a global cache of ucode patch
>>    microcode: only save compatible ucode patches
>>    microcode: remove struct ucode_cpu_info
>>    microcode: remove pointless 'cpu' parameter
>>    microcode: split out apply_microcode() from cpu_request_microcode()
>>    microcode: remove struct microcode_info
>>    microcode/intel: Writeback and invalidate caches before updating
>>      microcode
>>    x86/microcode: Synchronize late microcode loading
>>    microcode: update microcode on cores in parallel
>>
>>   tools/libxc/include/xenctrl.h   |   1 +
>>   tools/libxc/xc_misc.c           |  20 +++
>>   tools/misc/Makefile             |   4 +
>>   tools/misc/xenmicrocode.c       |  89 ++++++++++
>>   xen/arch/x86/acpi/power.c       |   2 +-
>>   xen/arch/x86/apic.c             |   2 +-
>>   xen/arch/x86/microcode.c        | 380 
>> +++++++++++++++++++++++++++-------------
>>   xen/arch/x86/microcode_amd.c    | 236 ++++++++++++-------------
>>   xen/arch/x86/microcode_intel.c  | 206 +++++++++++++---------
>>   xen/arch/x86/smpboot.c          |   5 +-
>>   xen/arch/x86/spec_ctrl.c        |   2 +-
>>   xen/include/asm-x86/microcode.h |  40 +++--
>>   xen/include/asm-x86/processor.h |   3 +-
>>   13 files changed, 639 insertions(+), 351 deletions(-)
>>   create mode 100644 tools/misc/xenmicrocode.c
>>
> 
> Sorry for the delay.  These patches fail on F17h.  I'm looking into 
> where it fails now.

Bisecting it says it's commit "microcode: introduce a global cache of 
ucode patch."

The failing commit fails with:
(XEN) [00000085227df312] microcode: CPU0 update from revision 0x8001207 
to 0xffff8304 failed
(XEN) [00000085240578ec] traps.c:1574: GPF (0000): ffff82d080426c88 
[probe_cpuid_faulting+0xe/0xa2] -> ffff82d0803818b2

That microcode revision is WAY off.  It should be 0x8001227 and not 
0xffff8304.  I don't think I'll be able to do much on it before the end 
of today, but let me what information you need or if there's anything I 
should be looking at in particular.

Thanks,
Brian
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 00/12] improve late microcode loading
  2019-03-19 21:39   ` Woods, Brian
@ 2019-03-20  8:58     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-20  8:58 UTC (permalink / raw)
  To: Woods, Brian
  Cc: Sergey Dyasli, Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich,
	xen-devel, Boris Ostrovsky, Suthikulpanit, Suravee,
	Roger Pau Monné

On Tue, Mar 19, 2019 at 09:39:59PM +0000, Woods, Brian wrote:
>On 3/19/19 3:22 PM, Brian Woods wrote:
>> On 3/11/19 2:57 AM, Chao Gao wrote:
>>> Major changes in version 6:
>>>   - run wbinvd before updating microcode (patch 10)
>>>   - add an userspace tool for late microcode update (patch 1)
>>>   - scale time to wait by the number of remaining CPUs to respond
>>>   - remove 'cpu' parameters from some related callbacks and functins
>>>   - save an ucode patch only if its supported CPU is allowed to mix with
>>>     current cpu.
>>>
>>> Changes in version 5:
>>>   - support parallel microcode updates for all cores (see patch 8)
>>>   - Address Roger's comments on the last version.
>>>
>>> The intention of this series is to make the late microcode loading
>>> more reliable by rendezvousing all cpus in stop_machine context.
>>> This idea comes from Ashok. I am porting his linux patch to Xen
>>> (see patch 10 and 11 for more details).
>>>
>>> This series makes five changes:
>>>   1. Patch 1: an userspace tool for late microcode update
>>>   2. Patch 2-9: introduce a global microcode cache and some cleanup
>>>   3. Patch 10: writeback and invalidate cache before updating microcode
>>>   3. Patch 11: synchronize late microcode loading
>>>   4. Patch 12: support parallel microcodes update on different cores
>>>
>>> Currently, late microcode loading does a lot of things including
>>> parsing microcode blob, checking the signature/revision and performing
>>> update. Putting all of them into stop_machine context is a bad idea
>>> because of complexity (One issue I observed is memory allocation
>>> triggered one assertion in stop_machine context). In order to simplify
>>> the load process, I move parsing microcode out of the load process.
>>> The microcode blob is parsed and a global microcode cache is built on
>>> a single CPU before rendezvousing all cpus to update microcode. Other
>>> CPUs just get and load a suitable microcode from the global cache.
>>> With this global cache, it is safe to put simplified load process to
>>> stop_machine context.
>>>
>>> Regarding changes to AMD side, I didn't do any test for them due to
>>> lack of hardware. Could you help to test this series on an AMD machine?
>>> At least, two basic tests are needed:
>>> * do a microcode update after system bootup
>>> * don't bring all pCPUs up at bootup by specifying maxcpus option in xen
>>>    command line and then do a microcode update and online all offlined
>>>    CPUs via 'xen-hptool'.
>>>
>>> Chao Gao (12):
>>>    misc/xenmicrocode: Upload a microcode blob to the hypervisor
>>>    microcode/intel: use union to get fields without shifting and masking
>>>    microcode/intel: extend microcode_update_match()
>>>    microcode: introduce a global cache of ucode patch
>>>    microcode: only save compatible ucode patches
>>>    microcode: remove struct ucode_cpu_info
>>>    microcode: remove pointless 'cpu' parameter
>>>    microcode: split out apply_microcode() from cpu_request_microcode()
>>>    microcode: remove struct microcode_info
>>>    microcode/intel: Writeback and invalidate caches before updating
>>>      microcode
>>>    x86/microcode: Synchronize late microcode loading
>>>    microcode: update microcode on cores in parallel
>>>
>>>   tools/libxc/include/xenctrl.h   |   1 +
>>>   tools/libxc/xc_misc.c           |  20 +++
>>>   tools/misc/Makefile             |   4 +
>>>   tools/misc/xenmicrocode.c       |  89 ++++++++++
>>>   xen/arch/x86/acpi/power.c       |   2 +-
>>>   xen/arch/x86/apic.c             |   2 +-
>>>   xen/arch/x86/microcode.c        | 380 
>>> +++++++++++++++++++++++++++-------------
>>>   xen/arch/x86/microcode_amd.c    | 236 ++++++++++++-------------
>>>   xen/arch/x86/microcode_intel.c  | 206 +++++++++++++---------
>>>   xen/arch/x86/smpboot.c          |   5 +-
>>>   xen/arch/x86/spec_ctrl.c        |   2 +-
>>>   xen/include/asm-x86/microcode.h |  40 +++--
>>>   xen/include/asm-x86/processor.h |   3 +-
>>>   13 files changed, 639 insertions(+), 351 deletions(-)
>>>   create mode 100644 tools/misc/xenmicrocode.c
>>>
>> 
>> Sorry for the delay.  These patches fail on F17h.  I'm looking into 
>> where it fails now.
>
>Bisecting it says it's commit "microcode: introduce a global cache of 
>ucode patch."
>
>The failing commit fails with:
>(XEN) [00000085227df312] microcode: CPU0 update from revision 0x8001207 
>to 0xffff8304 failed
>(XEN) [00000085240578ec] traps.c:1574: GPF (0000): ffff82d080426c88 
>[probe_cpuid_faulting+0xe/0xa2] -> ffff82d0803818b2
>
>That microcode revision is WAY off.  It should be 0x8001227 and not 
>0xffff8304.  I don't think I'll be able to do much on it before the end 
>of today, but let me what information you need or if there's anything I 
>should be looking at in particular.

Thanks for your testing.

Sergey tested it on some AMD machines. He pointed out an error in the
patch 4. I think the failure you observed was caused by that error.
I will fix it in the next version.

I am really sorry for this. I should have you copied on each patch.

[1]: https://lists.xenproject.org/archives/html/xen-devel/2019-03/msg00901.html

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode
  2019-03-11  7:57 ` [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode Chao Gao
@ 2019-03-21 11:08   ` Sergey Dyasli
  0 siblings, 0 replies; 50+ messages in thread
From: Sergey Dyasli @ 2019-03-21 11:08 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Kevin Tian,
	Wei Liu, Ashok Raj, Jun Nakajima, Andrew Cooper, Jan Beulich,
	Thomas Gleixner, Borislav, Petkov, Roger Pau Monné

On 11/03/2019 07:57, Chao Gao wrote:
> Updating microcode is less error prone when caches have been flushed and
> depending on what exactly the microcode is updating. For example, some
> of the issues around certain Broadwell parts can be addressed by doing a
> full cache flush.
> 
> [linux commit: 91df9fdf51492aec9fed6b4cbd33160886740f47]
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> ---
> Changes in v6:
>  - new
> ---
>  xen/arch/x86/microcode_intel.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/xen/arch/x86/microcode_intel.c b/xen/arch/x86/microcode_intel.c
> index c921ea9..d5ef145 100644
> --- a/xen/arch/x86/microcode_intel.c
> +++ b/xen/arch/x86/microcode_intel.c
> @@ -351,6 +351,12 @@ static int apply_microcode(void)
>      /* serialize access to the physical write to MSR 0x79 */
>      spin_lock_irqsave(&microcode_update_lock, flags);
>  
> +    /*
> +     * Writeback and invalidate caches before updating microcode to avoid
> +     * internal issues depending on what the microcode is updating.
> +     */
> +    wbinvd();

While this is a workaround for a specific Broadwell erratum, does wbinvd
need to be performed unconditionally everywhere? High-end Xeons have
up to 38.5M L3 and 28M L2. What is the upper bound of wbinvd latency on
such systems?

> +
>      /* write microcode via MSR 0x79 */
>      wrmsrl(MSR_IA32_UCODE_WRITE, (unsigned long)mc_intel->bits);
>      wrmsrl(MSR_IA32_UCODE_REV, 0x0ULL);
> 

Thanks,
Sergey

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [RFC PATCH v6 13/12] microcode: add sequential application policy
  2019-03-11  7:57 ` [PATCH v6 12/12] microcode: update microcode on cores in parallel Chao Gao
@ 2019-03-21 12:24   ` Sergey Dyasli
  2019-03-21 14:25     ` Chao Gao
  2019-03-26 16:23     ` Jan Beulich
  0 siblings, 2 replies; 50+ messages in thread
From: Sergey Dyasli @ 2019-03-21 12:24 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Wei Liu,
	Ashok Raj, Andrew Cooper, Jan Beulich, Roger Pau Monné

Hi Chao,

Updating microcode in parallel by default should be fine, but I think
there are no guarantees that a parallel application will be fine for
all future microcodes. To retain the ability to update microcode on cores
sequentially and give some choice to a user, I developed the below patch.

Could you consider including something like this into v7?

(The patch is for Xen 4.11 but forward-porting should be trivial)

---
 tools/misc/xen-microcode.c        | 17 +++++++++++++--
 xen/arch/x86/microcode.c          | 35 +++++++++++++++++++++++++++++--
 xen/arch/x86/platform_hypercall.c |  3 ++-
 xen/include/asm-x86/processor.h   |  3 ++-
 xen/include/public/platform.h     |  3 +++
 5 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/tools/misc/xen-microcode.c b/tools/misc/xen-microcode.c
index 2c1b2e362b..55d7d2f5df 100644
--- a/tools/misc/xen-microcode.c
+++ b/tools/misc/xen-microcode.c
@@ -46,7 +46,7 @@ void show_help(void)
 {
     fprintf(stderr,
             "xen-microcode: Xen microcode updating tool\n"
-            "Usage: xen-microcode <microcode blob>\n");
+            "Usage: xen-microcode <microcode blob> <parallel|sequential>\n");

     get_cpu_family(cpuid_eax(1));
     fprintf(stderr,
@@ -62,13 +62,25 @@ int main(int argc, char *argv[])
     struct xen_platform_op op;
     xc_interface *xch;
     DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
+    uint32_t strategy;

-    if (argc < 2)
+    if (argc < 3)
     {
         show_help();
         return 0;
     }

+    if (!strcmp(argv[2], "parallel"))
+        strategy = XENPF_microcode_parallel;
+    else if (!strcmp(argv[2], "sequential"))
+        strategy = XENPF_microcode_sequential;
+    else
+    {
+        show_help();
+        return 0;
+    }
+
+
     filename = argv[1];
     fd = open(filename, O_RDONLY);
     if (fd < 0) {
@@ -107,6 +119,7 @@ int main(int argc, char *argv[])
     op.cmd = XENPF_microcode_update;
     op.interface_version = XENPF_INTERFACE_VERSION;
     op.u.microcode.length = len;
+    op.u.microcode.strategy = strategy;
     ret = xc_platform_op(xch, &op);
     if ( ret )
         fprintf(stderr, "Failed to update microcode. (err: %d)\n", ret);
diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
index 95e39c33bc..63b01b571f 100644
--- a/xen/arch/x86/microcode.c
+++ b/xen/arch/x86/microcode.c
@@ -37,6 +37,8 @@
 #include <xen/earlycpio.h>
 #include <xen/watchdog.h>

+#include <public/platform.h>
+
 #include <asm/delay.h>
 #include <asm/msr.h>
 #include <asm/processor.h>
@@ -216,6 +218,10 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
  */
 static atomic_t cpu_in, cpu_out;

+static uint32_t application_strategy;
+/* The next CPU to perform a ucode update */
+static int next_cpu;
+
 /*
  * Save an ucode patch to the global cache list.
  *
@@ -351,6 +357,16 @@ static int do_microcode_update(void *unused)
     if ( ret )
         return ret;

+    while ( application_strategy == XENPF_microcode_sequential &&
+            cpu != next_cpu )
+    {
+        finished = atomic_read(&cpu_out);
+        if ( wait_for_cpus(&cpu_out, finished + 1,
+                           MICROCODE_UPDATE_TIMEOUT_US) )
+            panic("Timeout during sequential microcode update (finished %d/%d)",
+                  finished, cpu_nr);
+    }
+
     /*
      * Initiate an update on all processors which don't have an online sibling
      * thread with a lower thread id. Other sibling threads just await the
@@ -358,6 +374,10 @@ static int do_microcode_update(void *unused)
      */
     if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
         ret = microcode_update_cpu();
+
+    if ( application_strategy == XENPF_microcode_sequential )
+        next_cpu = cpumask_next(next_cpu, &cpu_online_map);
+
     /*
      * Increase the wait timeout to a safe value here since we're serializing
      * the microcode update and that could take a while on a large number of
@@ -393,7 +413,8 @@ static int do_microcode_update(void *unused)
     return ret;
 }

-int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
+int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len,
+                     uint32_t strategy)
 {
     int ret;
     void *buffer;
@@ -405,6 +426,10 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
     if ( microcode_ops == NULL )
         return -EINVAL;

+    if ( strategy != XENPF_microcode_parallel &&
+         strategy != XENPF_microcode_sequential )
+        return -EINVAL;
+
     buffer = xmalloc_bytes(len);
     if ( !buffer )
     {
@@ -449,13 +474,19 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
         if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
             nr_cores++;

-    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
+    printk(XENLOG_INFO "%d cores are to update their microcode %s\n", nr_cores,
+           strategy == XENPF_microcode_parallel ? "in parallel" :
+                                                  "sequentially");

     /*
      * We intend to disable interrupt for long time, which may lead to
      * watchdog timeout.
      */
     watchdog_disable();
+
+    application_strategy = strategy;
+    if ( strategy == XENPF_microcode_sequential )
+        next_cpu = cpumask_first(&cpu_online_map);
     /*
      * Late loading dance. Why the heavy-handed stop_machine effort?
      *
diff --git a/xen/arch/x86/platform_hypercall.c b/xen/arch/x86/platform_hypercall.c
index ea18c3215a..b5d2c5790b 100644
--- a/xen/arch/x86/platform_hypercall.c
+++ b/xen/arch/x86/platform_hypercall.c
@@ -296,7 +296,8 @@ ret_t do_platform_op(XEN_GUEST_HANDLE_PARAM(xen_platform_op_t) u_xenpf_op)

         ret = microcode_update(
                 guest_handle_to_param(data, const_void),
-                op->u.microcode.length);
+                op->u.microcode.length,
+                op->u.microcode.strategy);
         spin_unlock(&vcpu_alloc_lock);
     }
     break;
diff --git a/xen/include/asm-x86/processor.h b/xen/include/asm-x86/processor.h
index 0a62dfa0a3..612b786e81 100644
--- a/xen/include/asm-x86/processor.h
+++ b/xen/include/asm-x86/processor.h
@@ -573,7 +573,8 @@ int rdmsr_hypervisor_regs(uint32_t idx, uint64_t *val);
 int wrmsr_hypervisor_regs(uint32_t idx, uint64_t val);

 void microcode_set_module(unsigned int);
-int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
+int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len,
+                     uint32_t strategy);
 int early_microcode_update_cpu(void);
 int early_microcode_init(void);
 int microcode_init_intel(void);
diff --git a/xen/include/public/platform.h b/xen/include/public/platform.h
index 94dbc3feb4..4dea94b84d 100644
--- a/xen/include/public/platform.h
+++ b/xen/include/public/platform.h
@@ -114,6 +114,9 @@ struct xenpf_microcode_update {
     /* IN variables. */
     XEN_GUEST_HANDLE(const_void) data;/* Pointer to microcode data */
     uint32_t length;                  /* Length of microcode data. */
+#define XENPF_microcode_parallel   0
+#define XENPF_microcode_sequential 1
+    uint32_t strategy;                /* Application strategy.     */
 };
 typedef struct xenpf_microcode_update xenpf_microcode_update_t;
 DEFINE_XEN_GUEST_HANDLE(xenpf_microcode_update_t);

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH v6 13/12] microcode: add sequential application policy
  2019-03-21 12:24   ` [RFC PATCH v6 13/12] microcode: add sequential application policy Sergey Dyasli
@ 2019-03-21 14:25     ` Chao Gao
  2019-03-26 16:23     ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-03-21 14:25 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Wei Liu, Ashok Raj, Andrew Cooper, Jan Beulich, xen-devel,
	Roger Pau Monné

On Thu, Mar 21, 2019 at 12:24:46PM +0000, Sergey Dyasli wrote:
>Hi Chao,
>
>Updating microcode in parallel by default should be fine, but I think
>there are no guarantees that a parallel application will be fine for
>all future microcodes. To retain the ability to update microcode on cores
>sequentially and give some choice to a user, I developed the below patch.
>
>Could you consider including something like this into v7?

Sure. Will do. Thanks for your patch.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
  2019-03-12 15:27   ` Roger Pau Monné
  2019-03-13  9:24   ` Wei Liu
@ 2019-03-25  9:38   ` Sergey Dyasli
  2019-04-02  2:26     ` Chao Gao
  2 siblings, 1 reply; 50+ messages in thread
From: Sergey Dyasli @ 2019-03-25  9:38 UTC (permalink / raw)
  To: Chao Gao, xen-devel
  Cc: sergey.dyasli@citrix.com >> Sergey Dyasli, Wei Liu,
	Ashok Raj, Konrad Rzeszutek Wilk, Ian Jackson, Roger Pau Monne

On 11/03/2019 07:57, Chao Gao wrote:
> This patch provides a tool for late microcode update.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  tools/libxc/include/xenctrl.h |  1 +
>  tools/libxc/xc_misc.c         | 20 ++++++++++
>  tools/misc/Makefile           |  4 ++
>  tools/misc/xenmicrocode.c     | 89 +++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 114 insertions(+)
>  create mode 100644 tools/misc/xenmicrocode.c
> 
> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
> index 31cdda7..c69699b 100644
> --- a/tools/libxc/include/xenctrl.h
> +++ b/tools/libxc/include/xenctrl.h
> @@ -1245,6 +1245,7 @@ typedef uint32_t xc_node_to_node_dist_t;
>  int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
>  int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
>                     xc_cputopo_t *cputopo);
> +int xc_platform_op(xc_interface *xch, struct xen_platform_op *op);
>  int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
>                  xc_meminfo_t *meminfo, uint32_t *distance);
>  int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
> diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
> index 5e6714a..061c7a5 100644
> --- a/tools/libxc/xc_misc.c
> +++ b/tools/libxc/xc_misc.c
> @@ -226,6 +226,26 @@ int xc_physinfo(xc_interface *xch,
>      return 0;
>  }
>  
> +int xc_platform_op(xc_interface *xch, struct xen_platform_op *op)
> +{
> +    int ret = 0;
> +    DECLARE_PLATFORM_OP;
> +    DECLARE_HYPERCALL_BOUNCE(op, sizeof(*op), XC_HYPERCALL_BUFFER_BOUNCE_BOTH);

So I've found that the bouncing in this function is not needed.
In fact, it gets in a way when Xen is returning information via xen_platform_op.
I ended up having only the single line in this function:

	return do_platform_op(xch, op);

Not sure how correct this is, but it seems to work for ucode application and
XENPF_get_cpu_version.

Thanks,
Sergey

> +
> +    if ( xc_hypercall_bounce_pre(xch, op) )
> +    {
> +        PERROR("Could not bounce xen_platform_op memory buffer");
> +        return -1;
> +    }
> +    op->interface_version = XENPF_INTERFACE_VERSION;
> +
> +    platform_op = *op;
> +    ret = do_platform_op(xch, &platform_op);
> +    xc_hypercall_bounce_post(xch, op);
> +
> +    return ret;
> +}
> +
>  int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
>                     xc_cputopo_t *cputopo)
>  {
> diff --git a/tools/misc/Makefile b/tools/misc/Makefile
> index 51adb6f..c297a75 100644
> --- a/tools/misc/Makefile
> +++ b/tools/misc/Makefile
> @@ -22,6 +22,7 @@ INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmcrash
>  INSTALL_SBIN-$(CONFIG_X86)     += xen-hvmctx
>  INSTALL_SBIN-$(CONFIG_X86)     += xen-lowmemd
>  INSTALL_SBIN-$(CONFIG_X86)     += xen-mfndump
> +INSTALL_SBIN-$(CONFIG_X86)     += xenmicrocode
>  INSTALL_SBIN                   += xen-tmem-list-parse
>  INSTALL_SBIN                   += xencov
>  INSTALL_SBIN                   += xenlockprof
> @@ -114,4 +115,7 @@ xen-lowmemd: xen-lowmemd.o
>  xencov: xencov.o
>  	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
>  
> +xenmicrocode: xenmicrocode.o
> +	$(CC) $(LDFLAGS) -o $@ $< $(LDLIBS_libxenctrl) $(APPEND_LDFLAGS)
> +
>  -include $(DEPS_INCLUDE)
> diff --git a/tools/misc/xenmicrocode.c b/tools/misc/xenmicrocode.c
> new file mode 100644
> index 0000000..e6c8a3d
> --- /dev/null
> +++ b/tools/misc/xenmicrocode.c
> @@ -0,0 +1,89 @@
> +#define _GNU_SOURCE
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <sys/mman.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <inttypes.h>
> +#include <unistd.h>
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <xenctrl.h>
> +
> +void show_help(void)
> +{
> +    fprintf(stderr,
> +            "xenmicrocode: Xen microcode updating tool\n"
> +            "Usage: xenmicrocode <microcode blob>\n");
> +}
> +
> +int main(int argc, char *argv[])
> +{
> +    int fd, len, ret;
> +    char *filename, *buf;
> +    struct stat st;
> +    struct xen_platform_op op;
> +    xc_interface *xch;
> +    DECLARE_HYPERCALL_BUFFER(struct xenpf_microcode_update, uc);
> +
> +    if (argc < 2)
> +    {
> +        show_help();
> +        return 0;
> +    }
> +
> +    filename = argv[1];
> +    fd = open(filename, O_RDONLY);
> +    if (fd < 0) {
> +        fprintf(stderr, "Could not open %s. (err: %s)\n",
> +                filename, strerror(errno));
> +        return errno;
> +    }
> +
> +    if (stat(filename, &st) != 0) {
> +        fprintf(stderr, "Could not get the size of %s. (err: %s)\n",
> +                filename, strerror(errno));
> +        return errno;
> +    }
> +
> +    len = st.st_size;
> +    buf = mmap(0, len, PROT_READ, MAP_PRIVATE, fd, 0);
> +    if (buf == MAP_FAILED) {
> +        fprintf(stderr, "mmap failed. (error: %s)\n", strerror(errno));
> +        return errno;
> +    }
> +
> +    xch = xc_interface_open(0,0,0);
> +    if (xch == NULL)
> +    {
> +        fprintf(stderr, "Error opening xc interface. (err: %s)\n",
> +                strerror(errno));
> +        return errno;
> +    }
> +
> +    uc = xc_hypercall_buffer_alloc(xch, uc, len);
> +    if (uc == NULL)
> +        return -1;
> +
> +    memcpy(uc, buf, len);
> +    set_xen_guest_handle(op.u.microcode.data, uc);
> +    op.cmd = XENPF_microcode_update;
> +    op.interface_version = XENPF_INTERFACE_VERSION;
> +    op.u.microcode.length = len;
> +    ret = xc_platform_op(xch, &op);
> +    if ( ret )
> +        fprintf(stderr, "Failed to update microcode. (err: %d)\n", ret);
> +
> +    xc_hypercall_buffer_free(xch, uc);
> +    xc_interface_close(xch);
> +
> +    if (munmap(buf, len)) {
> +        printf("Could not unmap: %d(%s)\n", errno, strerror(errno));
> +        return errno;
> +    }
> +    close(fd);
> +
> +    return 0;
> +}
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [RFC PATCH v6 13/12] microcode: add sequential application policy
  2019-03-21 12:24   ` [RFC PATCH v6 13/12] microcode: add sequential application policy Sergey Dyasli
  2019-03-21 14:25     ` Chao Gao
@ 2019-03-26 16:23     ` Jan Beulich
  1 sibling, 0 replies; 50+ messages in thread
From: Jan Beulich @ 2019-03-26 16:23 UTC (permalink / raw)
  To: Sergey Dyasli, Chao Gao
  Cc: Andrew Cooper, xen-devel, Wei Liu, Ashok Raj, Roger Pau Monne

>>> On 21.03.19 at 13:24, <sergey.dyasli@citrix.com> wrote:
> @@ -216,6 +218,10 @@ DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>   */
>  static atomic_t cpu_in, cpu_out;
> 
> +static uint32_t application_strategy;
> +/* The next CPU to perform a ucode update */
> +static int next_cpu;

unsigned int (twice).

> --- a/xen/include/asm-x86/processor.h
> +++ b/xen/include/asm-x86/processor.h
> @@ -573,7 +573,8 @@ int rdmsr_hypervisor_regs(uint32_t idx, uint64_t *val);
>  int wrmsr_hypervisor_regs(uint32_t idx, uint64_t val);
> 
>  void microcode_set_module(unsigned int);
> -int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len);
> +int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void), unsigned long len,
> +                     uint32_t strategy);

Again.

> --- a/xen/include/public/platform.h
> +++ b/xen/include/public/platform.h
> @@ -114,6 +114,9 @@ struct xenpf_microcode_update {
>      /* IN variables. */
>      XEN_GUEST_HANDLE(const_void) data;/* Pointer to microcode data */
>      uint32_t length;                  /* Length of microcode data. */
> +#define XENPF_microcode_parallel   0
> +#define XENPF_microcode_sequential 1
> +    uint32_t strategy;                /* Application strategy.     */
>  };

This is not a compatible extension of the interface: For 64-bit
there's no guarantee existing callers would zero the padding
field, while for 32-bit there's no padding field at all, so for an
existing, unaware caller you'd consume random data as input.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor
  2019-03-25  9:38   ` Sergey Dyasli
@ 2019-04-02  2:26     ` Chao Gao
  0 siblings, 0 replies; 50+ messages in thread
From: Chao Gao @ 2019-04-02  2:26 UTC (permalink / raw)
  To: Sergey Dyasli
  Cc: Wei Liu, Ashok Raj, Konrad Rzeszutek Wilk, Ian Jackson,
	xen-devel, Roger Pau Monne

On Mon, Mar 25, 2019 at 09:38:21AM +0000, Sergey Dyasli wrote:
>On 11/03/2019 07:57, Chao Gao wrote:
>> This patch provides a tool for late microcode update.
>> 
>> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
>> Signed-off-by: Chao Gao <chao.gao@intel.com>
>> ---
>>  tools/libxc/include/xenctrl.h |  1 +
>>  tools/libxc/xc_misc.c         | 20 ++++++++++
>>  tools/misc/Makefile           |  4 ++
>>  tools/misc/xenmicrocode.c     | 89 +++++++++++++++++++++++++++++++++++++++++++
>>  4 files changed, 114 insertions(+)
>>  create mode 100644 tools/misc/xenmicrocode.c
>> 
>> diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
>> index 31cdda7..c69699b 100644
>> --- a/tools/libxc/include/xenctrl.h
>> +++ b/tools/libxc/include/xenctrl.h
>> @@ -1245,6 +1245,7 @@ typedef uint32_t xc_node_to_node_dist_t;
>>  int xc_physinfo(xc_interface *xch, xc_physinfo_t *info);
>>  int xc_cputopoinfo(xc_interface *xch, unsigned *max_cpus,
>>                     xc_cputopo_t *cputopo);
>> +int xc_platform_op(xc_interface *xch, struct xen_platform_op *op);
>>  int xc_numainfo(xc_interface *xch, unsigned *max_nodes,
>>                  xc_meminfo_t *meminfo, uint32_t *distance);
>>  int xc_pcitopoinfo(xc_interface *xch, unsigned num_devs,
>> diff --git a/tools/libxc/xc_misc.c b/tools/libxc/xc_misc.c
>> index 5e6714a..061c7a5 100644
>> --- a/tools/libxc/xc_misc.c
>> +++ b/tools/libxc/xc_misc.c
>> @@ -226,6 +226,26 @@ int xc_physinfo(xc_interface *xch,
>>      return 0;
>>  }
>>  
>> +int xc_platform_op(xc_interface *xch, struct xen_platform_op *op)
>> +{
>> +    int ret = 0;
>> +    DECLARE_PLATFORM_OP;
>> +    DECLARE_HYPERCALL_BOUNCE(op, sizeof(*op), XC_HYPERCALL_BUFFER_BOUNCE_BOTH);
>
>So I've found that the bouncing in this function is not needed.
>In fact, it gets in a way when Xen is returning information via xen_platform_op.
>I ended up having only the single line in this function:
>
>	return do_platform_op(xch, op);
>
>Not sure how correct this is, but it seems to work for ucode application and
>XENPF_get_cpu_version.

You are right. The bounce buffer here is pointless, as do_platform_op()
creates bounce buffer anyhow. Will remove it.

Thanks
Chao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2019-04-02  2:22 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-03-11  7:57 [PATCH v6 00/12] improve late microcode loading Chao Gao
2019-03-11  7:57 ` [PATCH v6 01/12] misc/xenmicrocode: Upload a microcode blob to the hypervisor Chao Gao
2019-03-12 15:27   ` Roger Pau Monné
2019-03-13  5:05     ` Chao Gao
2019-03-13  9:24   ` Wei Liu
2019-03-25  9:38   ` Sergey Dyasli
2019-04-02  2:26     ` Chao Gao
2019-03-11  7:57 ` [PATCH v6 02/12] microcode/intel: use union to get fields without shifting and masking Chao Gao
2019-03-12 15:33   ` Roger Pau Monné
2019-03-12 16:43     ` Jan Beulich
2019-03-12 18:23       ` Wei Liu
2019-03-11  7:57 ` [PATCH v6 03/12] microcode/intel: extend microcode_update_match() Chao Gao
2019-03-11  7:57 ` [PATCH v6 04/12] microcode: introduce a global cache of ucode patch Chao Gao
2019-03-12 16:53   ` Roger Pau Monné
2019-03-12 23:31     ` Raj, Ashok
2019-03-13  5:28     ` Chao Gao
2019-03-13  7:39     ` Jan Beulich
2019-03-13 10:30       ` Andrew Cooper
2019-03-13 17:04         ` Andrew Cooper
2019-03-14  7:42           ` Jan Beulich
2019-03-13 16:36   ` Sergey Dyasli
2019-03-14  1:39     ` Chao Gao
2019-03-11  7:57 ` [PATCH v6 05/12] microcode: only save compatible ucode patches Chao Gao
2019-03-12 17:03   ` Roger Pau Monné
2019-03-13  7:45     ` Jan Beulich
2019-03-11  7:57 ` [PATCH v6 06/12] microcode: remove struct ucode_cpu_info Chao Gao
2019-03-11  7:57 ` [PATCH v6 07/12] microcode: remove pointless 'cpu' parameter Chao Gao
2019-03-11  7:57 ` [PATCH v6 08/12] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-03-11  7:57 ` [PATCH v6 09/12] microcode: remove struct microcode_info Chao Gao
2019-03-11  7:57 ` [PATCH v6 10/12] microcode/intel: Writeback and invalidate caches before updating microcode Chao Gao
2019-03-21 11:08   ` Sergey Dyasli
2019-03-11  7:57 ` [PATCH v6 11/12] x86/microcode: Synchronize late microcode loading Chao Gao
2019-03-13  0:07   ` Raj, Ashok
2019-03-13  5:02     ` Chao Gao
2019-03-13  7:54       ` Jan Beulich
2019-03-13  8:02         ` Jan Beulich
2019-03-14 12:39           ` Andrew Cooper
2019-03-14 18:57             ` Raj, Ashok
2019-03-14 20:25               ` Thomas Gleixner
2019-03-15  9:40                 ` Andrew Cooper
2019-03-15 10:44                   ` Thomas Gleixner
2019-03-14 13:01           ` Chao Gao
2019-03-14 13:08             ` Jan Beulich
2019-03-11  7:57 ` [PATCH v6 12/12] microcode: update microcode on cores in parallel Chao Gao
2019-03-21 12:24   ` [RFC PATCH v6 13/12] microcode: add sequential application policy Sergey Dyasli
2019-03-21 14:25     ` Chao Gao
2019-03-26 16:23     ` Jan Beulich
2019-03-19 20:22 ` [PATCH v6 00/12] improve late microcode loading Woods, Brian
2019-03-19 21:39   ` Woods, Brian
2019-03-20  8:58     ` Chao Gao

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.