From: Sergey Dyasli <sergey.dyasli@citrix.com>
To: Chao Gao <chao.gao@intel.com>, <xen-devel@lists.xenproject.org>
Cc: "sergey.dyasli@citrix.com >> Sergey Dyasli"
<sergey.dyasli@citrix.com>, "Kevin Tian" <kevin.tian@intel.com>,
"Ashok Raj" <ashok.raj@intel.com>, "Wei Liu" <wl@xen.org>,
"Andrew Cooper" <andrew.cooper3@citrix.com>,
"Jan Beulich" <jbeulich@suse.com>,
"Jun Nakajima" <jun.nakajima@intel.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
"Borislav Petkov" <bp@suse.de>,
"Roger Pau Monné" <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading
Date: Mon, 19 Aug 2019 11:27:36 +0100 [thread overview]
Message-ID: <ea4e0bee-7151-82da-c293-fec2532f9b92@citrix.com> (raw)
In-Reply-To: <1566177928-19114-14-git-send-email-chao.gao@intel.com>
On 19/08/2019 02:25, Chao Gao wrote:
> This patch ports microcode improvement patches from linux kernel.
>
> Before you read any further: the early loading method is still the
> preferred one and you should always do that. The following patch is
> improving the late loading mechanism for long running jobs and cloud use
> cases.
>
> Gather all cores and serialize the microcode update on them by doing it
> one-by-one to make the late update process as reliable as possible and
> avoid potential issues caused by the microcode update.
>
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Chao Gao <chao.gao@intel.com>
> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v9:
> - log __buildin_return_address(0) when timeout
> - divide CPUs into three logical sets and they will call different
> functions during ucode loading. The 'control thread' is chosen to
> coordinate ucode loading on all CPUs. Since only control thread would
> set 'loading_state', we can get rid of 'cmpxchg' stuff in v8.
> - s/rep_nop/cpu_relax
> - each thread updates its revision number itself
> - add XENLOG_ERR prefix for each line of multi-line log messages
>
> Changes in v8:
> - to support blocking #NMI handling during loading ucode
> * introduce a flag, 'loading_state', to mark the start or end of
> ucode loading.
> * use a bitmap for cpu callin since if cpu may stay in #NMI handling,
> there are two places for a cpu to call in. bitmap won't be counted
> twice.
> * don't wait for all CPUs callout, just wait for CPUs that perform the
> update. We have to do this because some threads may be stuck in NMI
> handling (where cannot reach the rendezvous).
> - emit a warning if the system stays in stop_machine context for more
> than 1s
> - comment that rdtsc is fine while loading an update
> - use cmpxchg() to avoid panic being called on multiple CPUs
> - Propagate revision number to other threads
> - refine comments and prompt messages
>
> Changes in v7:
> - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
> - reword the comment above microcode_update_cpu() to clearly state that
> one thread per core should do the update.
> ---
> xen/arch/x86/microcode.c | 289 +++++++++++++++++++++++++++++++++++++++++++----
> 1 file changed, 267 insertions(+), 22 deletions(-)
>
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index bdd9c9f..91f9e81 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -30,18 +30,52 @@
> #include <xen/smp.h>
> #include <xen/softirq.h>
> #include <xen/spinlock.h>
> +#include <xen/stop_machine.h>
> #include <xen/tasklet.h>
> #include <xen/guest_access.h>
> #include <xen/earlycpio.h>
> +#include <xen/watchdog.h>
>
> +#include <asm/delay.h>
> #include <asm/msr.h>
> #include <asm/processor.h>
> #include <asm/setup.h>
> #include <asm/microcode.h>
>
> +/*
> + * Before performing a late microcode update on any thread, we
> + * rendezvous all cpus in stop_machine context. The timeout for
> + * waiting for cpu rendezvous is 30ms. It is the timeout used by
> + * live patching
> + */
> +#define MICROCODE_CALLIN_TIMEOUT_US 30000
> +
> +/*
> + * Timeout for each thread to complete update is set to 1s. It is a
> + * conservative choice considering all possible interference.
> + */
> +#define MICROCODE_UPDATE_TIMEOUT_US 1000000
> +
> static module_t __initdata ucode_mod;
> static signed int __initdata ucode_mod_idx;
> static bool_t __initdata ucode_mod_forced;
> +static unsigned int nr_cores;
> +
> +/*
> + * These states help to coordinate CPUs during loading an update.
> + *
> + * The semantics of each state is as follow:
> + * - LOADING_PREPARE: initial state of 'loading_state'.
> + * - LOADING_CALLIN: CPUs are allowed to callin.
> + * - LOADING_ENTER: all CPUs have called in. Initiate ucode loading.
> + * - LOADING_EXIT: ucode loading is done or aborted.
> + */
> +static enum {
> + LOADING_PREPARE,
> + LOADING_CALLIN,
> + LOADING_ENTER,
> + LOADING_EXIT,
> +} loading_state;
>
> /*
> * If we scan the initramfs.cpio for the early microcode code
> @@ -190,6 +224,16 @@ static DEFINE_SPINLOCK(microcode_mutex);
> DEFINE_PER_CPU(struct cpu_signature, cpu_sig);
>
> /*
> + * Count the CPUs that have entered, exited the rendezvous and succeeded in
> + * microcode update during late microcode update respectively.
> + *
> + * Note that a bitmap is used for callin to allow cpu to set a bit multiple
> + * times. It is required to do busy-loop in #NMI handling.
> + */
> +static cpumask_t cpu_callin_map;
> +static atomic_t cpu_out, cpu_updated;
> +
> +/*
> * Return a patch that covers current CPU. If there are multiple patches,
> * return the one with the highest revision number. Return error If no
> * patch is found and an error occurs during the parsing process. Otherwise
> @@ -232,6 +276,34 @@ bool microcode_update_cache(struct microcode_patch *patch)
> return true;
> }
>
> +/* Wait for a condition to be met with a timeout (us). */
> +static int wait_for_condition(int (*func)(void *data), void *data,
> + unsigned int timeout)
> +{
> + while ( !func(data) )
> + {
> + if ( !timeout-- )
> + {
> + printk("CPU%u: Timeout in %pS\n",
> + smp_processor_id(), __builtin_return_address(0));
> + return -EBUSY;
> + }
> + udelay(1);
> + }
> +
> + return 0;
> +}
> +
> +static int wait_cpu_callin(void *nr)
> +{
> + return cpumask_weight(&cpu_callin_map) >= (unsigned long)nr;
> +}
> +
> +static int wait_cpu_callout(void *nr)
> +{
> + return atomic_read(&cpu_out) >= (unsigned long)nr;
> +}
> +
> /*
> * Load a microcode update to current CPU.
> *
> @@ -265,37 +337,155 @@ static int microcode_update_cpu(const struct microcode_patch *patch)
> return err;
> }
>
> -static long do_microcode_update(void *patch)
> +static int slave_thread_fn(void)
> +{
> + unsigned int cpu = smp_processor_id();
> + unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> +
> + while ( loading_state != LOADING_CALLIN )
> + cpu_relax();
> +
> + cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> + while ( loading_state != LOADING_EXIT )
> + cpu_relax();
> +
> + /* Copy update revision from the "master" thread. */
> + this_cpu(cpu_sig).rev = per_cpu(cpu_sig, master).rev;
> +
> + return 0;
> +}
> +
> +static int master_thread_fn(const struct microcode_patch *patch)
> +{
> + unsigned int cpu = smp_processor_id();
> + int ret = 0;
> +
> + while ( loading_state != LOADING_CALLIN )
> + cpu_relax();
> +
> + cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> + while ( loading_state != LOADING_ENTER )
> + cpu_relax();
If I'm reading it right, this will wait forever in case when...
> +
> + /*
> + * If an error happened, control thread would set 'loading_state'
> + * to LOADING_EXIT. Don't perform ucode loading for this case
> + */
> + if ( loading_state == LOADING_EXIT )
> + return ret;
> +
> + ret = microcode_ops->apply_microcode(patch);
> + if ( !ret )
> + atomic_inc(&cpu_updated);
> + atomic_inc(&cpu_out);
> +
> + while ( loading_state != LOADING_EXIT )
> + cpu_relax();
> +
> + return ret;
> +}
> +
> +static int control_thread_fn(const struct microcode_patch *patch)
> {
> - unsigned int cpu;
> + unsigned int cpu = smp_processor_id(), done;
> + unsigned long tick;
> + int ret;
>
> - /* Store the patch after a successful loading */
> - if ( !microcode_update_cpu(patch) && patch )
> + /* Allow threads to call in */
> + loading_state = LOADING_CALLIN;
> + smp_mb();
> +
> + cpumask_set_cpu(cpu, &cpu_callin_map);
> +
> + /* Waiting for all threads calling in */
> + ret = wait_for_condition(wait_cpu_callin,
> + (void *)(unsigned long)num_online_cpus(),
> + MICROCODE_CALLIN_TIMEOUT_US);
> + if ( ret ) {
> + loading_state = LOADING_EXIT;
> + return ret;
> + }
...this condition holds. Have you actually tested this case?
> +
> + /* Let master threads load the given ucode update */
> + loading_state = LOADING_ENTER;
> + smp_mb();
> +
> + ret = microcode_ops->apply_microcode(patch);
> + if ( !ret )
> + atomic_inc(&cpu_updated);
> + atomic_inc(&cpu_out);
> +
> + tick = rdtsc_ordered();
> + /* Waiting for master threads finishing update */
> + done = atomic_read(&cpu_out);
> + while ( done != nr_cores )
> {
> - spin_lock(µcode_mutex);
> - microcode_update_cache(patch);
> - spin_unlock(µcode_mutex);
> - patch = NULL;
> + /*
> + * During each timeout interval, at least a CPU is expected to
> + * finish its update. Otherwise, something goes wrong.
> + *
> + * Note that RDTSC (in wait_for_condition()) is safe for threads to
> + * execute while waiting for completion of loading an update.
> + */
> + if ( wait_for_condition(wait_cpu_callout,
> + (void *)(unsigned long)(done + 1),
> + MICROCODE_UPDATE_TIMEOUT_US) )
> + panic("Timeout when finished updating microcode (finished %u/%u)",
> + done, nr_cores);
> +
> + /* Print warning message once if long time is spent here */
> + if ( tick && rdtsc_ordered() - tick >= cpu_khz * 1000 )
> + {
> + printk(XENLOG_WARNING
> + "WARNING: UPDATING MICROCODE HAS CONSUMED MORE THAN 1 SECOND!\n");
> + tick = 0;
> + }
> + done = atomic_read(&cpu_out);
> }
>
> - if ( microcode_ops->end_update )
> - microcode_ops->end_update();
> + /* Mark loading is done to unblock other threads */
> + loading_state = LOADING_EXIT;
> + smp_mb();
>
> - cpu = cpumask_next(smp_processor_id(), &cpu_online_map);
> - if ( cpu < nr_cpu_ids )
> - return continue_hypercall_on_cpu(cpu, do_microcode_update, patch);
> + return ret;
> +}
>
> - /* Free the patch if no CPU has loaded it successfully. */
> - if ( patch )
> - microcode_free_patch(patch);
> +static int do_microcode_update(void *patch)
> +{
> + unsigned int cpu = smp_processor_id();
> + /*
> + * "master" thread is the one with the lowest thread id among all siblings
> + * thread in a core or a compute unit. It is chosen to load a microcode
> + * update.
> + */
> + unsigned int master = cpumask_first(this_cpu(cpu_sibling_mask));
> + int ret;
>
> - return 0;
> + /*
> + * The control thread set state to coordinate ucode loading. Master threads
> + * load the given ucode patch. Slave threads just wait for the completion
> + * of the ucode loading process.
> + */
> + if ( cpu == cpumask_first(&cpu_online_map) )
> + ret = control_thread_fn(patch);
> + else if ( cpu == master )
> + ret = master_thread_fn(patch);
> + else
> + ret = slave_thread_fn();
> +
> + if ( microcode_ops->end_update )
> + microcode_ops->end_update();
> +
> + return ret;
> }
>
> int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
> {
> int ret;
> void *buffer;
> + unsigned int cpu, updated;
> struct microcode_patch *patch;
>
> if ( len != (uint32_t)len )
> @@ -314,11 +504,18 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
> goto free;
> }
>
> + /* cpu_online_map must not change during update */
> + if ( !get_cpu_maps() )
> + {
> + ret = -EBUSY;
> + goto free;
> + }
> +
> if ( microcode_ops->start_update )
> {
> ret = microcode_ops->start_update();
> if ( ret != 0 )
> - goto free;
> + goto put;
> }
>
> patch = parse_blob(buffer, len);
> @@ -326,19 +523,67 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
> {
> ret = PTR_ERR(patch);
> printk(XENLOG_INFO "Parsing microcode blob error %d\n", ret);
> - goto free;
> + goto put;
> }
>
> if ( !patch )
> {
> printk(XENLOG_INFO "No ucode found. Update aborted!\n");
> ret = -EINVAL;
> - goto free;
> + goto put;
> + }
> +
> + cpumask_clear(&cpu_callin_map);
> + atomic_set(&cpu_out, 0);
> + atomic_set(&cpu_updated, 0);
> + loading_state = LOADING_PREPARE;
> +
> + /* Calculate the number of online CPU core */
> + nr_cores = 0;
> + for_each_online_cpu(cpu)
> + if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> + nr_cores++;
> +
> + printk(XENLOG_INFO "%u cores are to update their microcode\n", nr_cores);
> +
> + /*
> + * We intend to disable interrupt for long time, which may lead to
> + * watchdog timeout.
> + */
> + watchdog_disable();
> + /*
> + * Late loading dance. Why the heavy-handed stop_machine effort?
> + *
> + * - HT siblings must be idle and not execute other code while the other
> + * sibling is loading microcode in order to avoid any negative
> + * interactions cause by the loading.
> + *
> + * - In addition, microcode update on the cores must be serialized until
> + * this requirement can be relaxed in the future. Right now, this is
> + * conservative and good.
> + */
> + ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
> + watchdog_enable();
> +
> + updated = atomic_read(&cpu_updated);
> + if ( updated > 0 )
> + {
> + spin_lock(µcode_mutex);
> + microcode_update_cache(patch);
> + spin_unlock(µcode_mutex);
> }
> + else
> + microcode_free_patch(patch);
>
> - ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
> - do_microcode_update, patch);
> + if ( updated && updated != nr_cores )
> + printk(XENLOG_ERR "ERROR: Updating microcode succeeded on %u cores and failed\n"
> + XENLOG_ERR "on other %u cores. A system with differing microcode \n"
> + XENLOG_ERR "revisions is considered unstable. Please reboot and do not\n"
> + XENLOG_ERR "load the microcode that triggersthis warning!\n",
> + updated, nr_cores - updated);
>
> + put:
> + put_cpu_maps();
> free:
> xfree(buffer);
> return ret;
>
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
next prev parent reply other threads:[~2019-08-19 10:28 UTC|newest]
Thread overview: 57+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-19 1:25 [Xen-devel] [PATCH v9 00/15] improve late microcode loading Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 01/15] microcode/intel: extend microcode_update_match() Chao Gao
2019-08-28 15:12 ` Jan Beulich
2019-08-29 7:15 ` Chao Gao
2019-08-29 7:14 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 02/15] microcode/amd: fix memory leak Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 03/15] microcode/amd: distinguish old and mismatched ucode in microcode_fits() Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 04/15] microcode: introduce a global cache of ucode patch Chao Gao
2019-08-22 11:11 ` Roger Pau Monné
2019-08-28 15:21 ` Jan Beulich
2019-08-29 10:18 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 05/15] microcode: clean up microcode_resume_cpu Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 06/15] microcode: remove struct ucode_cpu_info Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 07/15] microcode: remove pointless 'cpu' parameter Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 08/15] microcode/amd: call svm_host_osvw_init() in common code Chao Gao
2019-08-22 13:08 ` Roger Pau Monné
2019-08-28 15:26 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 09/15] microcode: pass a patch pointer to apply_microcode() Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 10/15] microcode: split out apply_microcode() from cpu_request_microcode() Chao Gao
2019-08-22 13:59 ` Roger Pau Monné
2019-08-29 10:06 ` Jan Beulich
2019-08-30 3:22 ` Chao Gao
2019-08-30 7:25 ` Jan Beulich
2019-08-29 10:19 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 11/15] microcode: unify loading update during CPU resuming and AP wakeup Chao Gao
2019-08-22 14:10 ` Roger Pau Monné
2019-08-22 16:44 ` Chao Gao
2019-08-23 9:09 ` Roger Pau Monné
2019-08-29 7:37 ` Chao Gao
2019-08-29 8:16 ` Roger Pau Monné
2019-08-29 10:26 ` Jan Beulich
2019-08-29 10:29 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 12/15] microcode: reduce memory allocation and copy when creating a patch Chao Gao
2019-08-23 8:11 ` Roger Pau Monné
2019-08-26 7:03 ` Chao Gao
2019-08-26 8:11 ` Roger Pau Monné
2019-08-29 10:47 ` Jan Beulich
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 13/15] x86/microcode: Synchronize late microcode loading Chao Gao
2019-08-19 10:27 ` Sergey Dyasli [this message]
2019-08-19 14:49 ` Chao Gao
2019-08-29 12:06 ` Jan Beulich
2019-08-30 3:30 ` Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 14/15] microcode: remove microcode_update_lock Chao Gao
2019-08-19 1:25 ` [Xen-devel] [PATCH v9 15/15] microcode: block #NMI handling when loading an ucode Chao Gao
2019-08-23 8:46 ` Sergey Dyasli
2019-08-26 8:07 ` Chao Gao
2019-08-27 4:52 ` Chao Gao
2019-08-28 8:52 ` Sergey Dyasli
2019-08-29 12:11 ` Jan Beulich
2019-08-30 6:35 ` Chao Gao
2019-09-09 5:52 ` Chao Gao
2019-09-09 6:16 ` Jan Beulich
2019-08-29 12:22 ` Jan Beulich
2019-08-30 6:33 ` Chao Gao
2019-08-30 7:30 ` Jan Beulich
2019-08-22 7:51 ` [Xen-devel] [PATCH v9 00/15] improve late microcode loading Sergey Dyasli
2019-08-22 15:39 ` Chao Gao
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ea4e0bee-7151-82da-c293-fec2532f9b92@citrix.com \
--to=sergey.dyasli@citrix.com \
--cc=andrew.cooper3@citrix.com \
--cc=ashok.raj@intel.com \
--cc=bp@suse.de \
--cc=chao.gao@intel.com \
--cc=jbeulich@suse.com \
--cc=jun.nakajima@intel.com \
--cc=kevin.tian@intel.com \
--cc=roger.pau@citrix.com \
--cc=tglx@linutronix.de \
--cc=wl@xen.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).