xen-devel.lists.xenproject.org archive mirror
From: "Jan Beulich" <JBeulich@suse.com>
To: "Chao Gao" <chao.gao@intel.com>
Cc: Sergey Dyasli <sergey.dyasli@citrix.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Ashok Raj <ashok.raj@intel.com>, Wei Liu <wl@xen.org>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>,
	xen-devel <xen-devel@lists.xenproject.org>,
	tglx@linutronix.de, Borislav Petkov <bp@suse.de>,
	Roger Pau Monne <roger.pau@citrix.com>
Subject: Re: [Xen-devel] [PATCH v7 08/10] x86/microcode: Synchronize late microcode loading
Date: Wed, 05 Jun 2019 08:09:43 -0600
Message-ID: <5CF7CD2702000078002358F4@prv1-mh.provo.novell.com>
In-Reply-To: <1558945891-3015-9-git-send-email-chao.gao@intel.com>

>>> On 27.05.19 at 10:31, <chao.gao@intel.com> wrote:
> This patch ports microcode improvement patches from linux kernel.
> 
> Before you read any further: the early loading method is still the
> preferred one and you should always do that. The following patch is
> improving the late loading mechanism for long running jobs and cloud use
> cases.
> 
> Gather all cores and serialize the microcode update on them by doing it
> one-by-one to make the late update process as reliable as possible and
> avoid potential issues caused by the microcode update.
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> Tested-by: Chao Gao <chao.gao@intel.com>
> [linux commit: a5321aec6412b20b5ad15db2d6b916c05349dbff]
> [linux commit: bb8c13d61a629276a162c1d2b1a20a815cbcfbb7]
> Cc: Kevin Tian <kevin.tian@intel.com>
> Cc: Jun Nakajima <jun.nakajima@intel.com>
> Cc: Ashok Raj <ashok.raj@intel.com>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Andrew Cooper <andrew.cooper3@citrix.com>
> Cc: Jan Beulich <jbeulich@suse.com>
> ---
> Changes in v7:
>  - Check whether 'timeout' is 0 rather than "<=0" since it is unsigned int.
>  - reword the comment above microcode_update_cpu() to clearly state that
>  one thread per core should do the update.
> 
> Changes in v6:
>  - Use one timeout period for rendezvous stage and another for update stage.
>  - scale time to wait by the number of remaining cpus to respond.
>    It helps to find something wrong earlier and thus we can reboot the
>    system earlier.
> ---
>  xen/arch/x86/microcode.c | 171 ++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 155 insertions(+), 16 deletions(-)
> 
> diff --git a/xen/arch/x86/microcode.c b/xen/arch/x86/microcode.c
> index 23cf550..f4a417e 100644
> --- a/xen/arch/x86/microcode.c
> +++ b/xen/arch/x86/microcode.c
> @@ -22,6 +22,7 @@
>   */
>  
>  #include <xen/cpu.h>
> +#include <xen/cpumask.h>

It seems vanishingly unlikely that you would need this explicit #include
here, but it certainly isn't wrong.

> @@ -270,31 +296,90 @@ bool microcode_update_cache(struct microcode_patch *patch)
>      return true;
>  }
>  
> -static long do_microcode_update(void *patch)
> +/* Wait for CPUs to rendezvous with a timeout (us) */
> +static int wait_for_cpus(atomic_t *cnt, unsigned int expect,
> +                         unsigned int timeout)
>  {
> -    int error, cpu;
> -
> -    error = microcode_update_cpu(patch);
> -    if ( error )
> +    while ( atomic_read(cnt) < expect )
>      {
> -        microcode_ops->free_patch(microcode_cache);
> -        return error;
> +        if ( !timeout )
> +        {
> +            printk("CPU%d: Timeout when waiting for CPUs calling in\n",
> +                   smp_processor_id());
> +            return -EBUSY;
> +        }
> +        udelay(1);
> +        timeout--;
>      }

Neither a comment here nor the description addresses this: I don't
recall any clarification as to whether it is fine for a thread to
issue RDTSC while ucode is being updated by another thread on the
same core.

> +static int do_microcode_update(void *patch)
> +{
> +    unsigned int cpu = smp_processor_id();
> +    unsigned int cpu_nr = num_online_cpus();
> +    unsigned int finished;
> +    int ret;
> +    static bool error;
>  
> -    microcode_update_cache(patch);
> +    atomic_inc(&cpu_in);
> +    ret = wait_for_cpus(&cpu_in, cpu_nr, MICROCODE_CALLIN_TIMEOUT_US);
> +    if ( ret )
> +        return ret;
>  
> -    return error;
> +    ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));
> +    /*
> +     * Load microcode update on only one logical processor per core.
> +     * Here, among logical processors of a core, the one with the
> +     * lowest thread id is chosen to perform the loading.
> +     */
> +    if ( !ret && (cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu))) )

At the very least it's not obvious whether this hyper-threading-centric
view ("logical processor") also applies to AMD's compute unit model
(which reuses cpu_sibling_mask). It does, as the respective MSRs are
per-compute-unit rather than per-core, but I'd appreciate it if the
wording could be adjusted to explicitly name both cases (multiple
threads per core and multiple cores per CU).
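
To illustrate why a single "first CPU in the sibling mask" check covers
both cases, here is a minimal stand-alone sketch. It is a toy model, not
Xen's cpumask API: the mask is a plain 64-bit bitmap, and first_cpu() and
is_update_leader() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy model (NOT Xen's cpumask API): one bit per CPU sharing the same
 * microcode update scope, i.e. either the threads of one core or the
 * cores of one compute unit.
 */
static unsigned int first_cpu(uint64_t sibling_mask)
{
    unsigned int cpu = 0;

    assert(sibling_mask != 0);   /* a CPU is always in its own mask */
    while ( !(sibling_mask & 1) )
    {
        sibling_mask >>= 1;
        cpu++;
    }

    return cpu;
}

/* Only the lowest-numbered CPU of each sibling group performs the load. */
static int is_update_leader(unsigned int cpu, uint64_t sibling_mask)
{
    return cpu == first_cpu(sibling_mask);
}
```

With a mask of 0xC (CPUs 2 and 3), CPU 2 is the leader and CPU 3 is not,
regardless of whether the two are threads of a core or cores of a CU.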

> +    {
> +        ret = microcode_ops->apply_microcode(patch);
> +        if ( !ret )
> +            atomic_inc(&cpu_updated);
> +    }
> +    /*
> +     * Increase the wait timeout to a safe value here since we're serializing

I'm struggling with the "increase": I don't see anything being increased
here. You simply use a larger timeout than above.

> +     * the microcode update and that could take a while on a large number of
> +     * CPUs. And that is fine as the *actual* timeout will be determined by
> +     * the last CPU finished updating and thus cut short
> +     */
> +    atomic_inc(&cpu_out);
> +    finished = atomic_read(&cpu_out);
> +    while ( !error && finished != cpu_nr )
> +    {
> +        /*
> +         * During each timeout interval, at least a CPU is expected to
> +         * finish its update. Otherwise, something goes wrong.
> +         */
> +        if ( wait_for_cpus(&cpu_out, finished + 1,
> +                           MICROCODE_UPDATE_TIMEOUT_US) && !error )
> +        {
> +            error = true;
> +            panic("Timeout when finishing updating microcode (finished %d/%d)",
> +                  finished, cpu_nr);

Why the setting of "error" when you panic anyway?

And please use format specifiers matching the types of the
further arguments (i.e. twice %u here, but please check other
code as well).
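
A minimal sketch of what I mean, using snprintf() as a stand-in for
panic()/printk(); format_progress() is a hypothetical helper and not part
of the patch:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: snprintf() stands in for Xen's panic()/printk(). */
static int format_progress(char *buf, size_t len,
                           unsigned int finished, unsigned int cpu_nr)
{
    assert(buf != NULL && len > 0);

    /* Both arguments are unsigned int, hence %u rather than %d. */
    return snprintf(buf, len,
                    "Timeout when finishing updating microcode (finished %u/%u)",
                    finished, cpu_nr);
}
```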

Furthermore (and I'm sure I've given this comment before) if
you really hit the limit, how many panic() invocations are there
going to be? You run this function on all CPUs after all.
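
One way to guarantee a single report is to let the CPUs race for a flag
and have only the winner act. The sketch below uses C11 atomics as a
stand-in for whatever Xen primitive would be appropriate; all names are
hypothetical, and the counter merely stands in for the one panic() call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag timeout_reported = ATOMIC_FLAG_INIT;
static unsigned int reports_issued;   /* stand-in for the single panic() */

/* Returns true for exactly one caller, however many CPUs see the timeout. */
static bool claim_timeout_report(void)
{
    return !atomic_flag_test_and_set(&timeout_reported);
}

static void handle_timeout(unsigned int finished, unsigned int cpu_nr)
{
    assert(finished < cpu_nr);   /* a timeout implies stragglers remain */

    if ( claim_timeout_report() )
        reports_issued++;        /* only the winner gets here */
}
```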

On the whole, taking a 256-thread system as an example, you
allow the whole process to take over 4 min without calling
panic(). Leaving aside guests, I don't think Xen itself would
survive this in all cases. We've found the need to process
softirqs with far smaller delays, in particular from key handlers
producing lots of output. At the very least there should be a
bold warning logged if the system has been in stop-machine
state for, say, longer than 100ms (value subject to discussion).
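
For reference, the arithmetic behind the "over 4 min" figure, as a rough
sketch. The per-step timeout of 1 s is an assumption on my part (the value
of MICROCODE_UPDATE_TIMEOUT_US is not in the quoted hunk); the helper
names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: 1 s per update step; the real value is not quoted above. */
#define ASSUMED_UPDATE_TIMEOUT_US 1000000ULL
/* Warning threshold suggested above (explicitly subject to discussion). */
#define STOP_MACHINE_WARN_US      100000ULL

/* Upper bound on the time spent if every step stays just below its timeout. */
static uint64_t worst_case_us(unsigned int nr_cpus)
{
    assert(nr_cpus > 0);

    return (uint64_t)nr_cpus * ASSUMED_UPDATE_TIMEOUT_US;
}

/* True once the time spent in stop-machine state warrants a warning. */
static int should_warn(uint64_t elapsed_us)
{
    return elapsed_us > STOP_MACHINE_WARN_US;
}
```

Under that assumption, 256 CPUs give a bound of 256 s, i.e. 4 min 16 s,
which is more than three orders of magnitude above the 100ms threshold.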

> +        }
> +
> +        finished = atomic_read(&cpu_out);
> +    }
> +
> +    /*
> +     * Refresh CPU signature (revision) on threads which didn't call
> +     * apply_microcode().
> +     */
> +    if ( cpu != cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +        ret = microcode_ops->collect_cpu_info(&this_cpu(cpu_sig));

Another option would be for the CPU doing the update to simply
propagate the new value to all its siblings' cpu_sig values.
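
Roughly along these lines; this is a toy model only, with an array
standing in for the per-CPU cpu_sig data and a small bitmap for
cpu_sibling_mask. The struct, field, and function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NR_CPUS 8

/* Toy stand-in for the per-CPU cpu_sig data; the field name is illustrative. */
struct toy_cpu_sig {
    uint32_t rev;
};

static struct toy_cpu_sig toy_sig[TOY_NR_CPUS];

/* Copy the updating CPU's revision to every CPU in its sibling mask. */
static void propagate_rev(unsigned int leader, uint8_t sibling_mask)
{
    unsigned int cpu;

    assert(leader < TOY_NR_CPUS && (sibling_mask & (1u << leader)));

    for ( cpu = 0; cpu < TOY_NR_CPUS; cpu++ )
        if ( sibling_mask & (1u << cpu) )
            toy_sig[cpu].rev = toy_sig[leader].rev;
}
```

This would make the second collect_cpu_info() call on the non-updating
siblings unnecessary, at the price of trusting the leader's value.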

> @@ -337,12 +429,59 @@ int microcode_update(XEN_GUEST_HANDLE_PARAM(const_void) buf, unsigned long len)
>          if ( patch )
>              microcode_ops->free_patch(patch);
>          ret = -EINVAL;
> -        goto free;
> +        goto put;
>      }
>  
> -    ret = continue_hypercall_on_cpu(cpumask_first(&cpu_online_map),
> -                                    do_microcode_update, patch);
> +    atomic_set(&cpu_in, 0);
> +    atomic_set(&cpu_out, 0);
> +    atomic_set(&cpu_updated, 0);
> +
> +    /* Calculate the number of online CPU core */
> +    nr_cores = 0;
> +    for_each_online_cpu(cpu)
> +        if ( cpu == cpumask_first(per_cpu(cpu_sibling_mask, cpu)) )
> +            nr_cores++;
> +
> +    printk(XENLOG_INFO "%d cores are to update their microcode\n", nr_cores);
> +
> +    /*
> +     * We intend to disable interrupt for long time, which may lead to
> +     * watchdog timeout.
> +     */
> +    watchdog_disable();
> +    /*
> +     * Late loading dance. Why the heavy-handed stop_machine effort?
> +     *
> +     * - HT siblings must be idle and not execute other code while the other
> +     *   sibling is loading microcode in order to avoid any negative
> +     *   interactions cause by the loading.
> +     *
> +     * - In addition, microcode update on the cores must be serialized until
> +     *   this requirement can be relaxed in the future. Right now, this is
> +     *   conservative and good.
> +     */
> +    ret = stop_machine_run(do_microcode_update, patch, NR_CPUS);
> +    watchdog_enable();
> +
> +    if ( atomic_read(&cpu_updated) == nr_cores )
> +    {
> +        spin_lock(&microcode_mutex);
> +        microcode_update_cache(patch);
> +        spin_unlock(&microcode_mutex);
> +    }
> +    else if ( atomic_read(&cpu_updated) == 0 )
> +        microcode_ops->free_patch(patch);
> +    else
> +    {
> +        printk("Updating microcode succeeded on part of CPUs and failed on\n"
> +               "others due to an unknown reason. A system with different\n"
> +               "microcode revisions is considered unstable. Please reboot and\n"
> +               "do not load the microcode that triggers this warning\n");
> +        microcode_ops->free_patch(patch);
> +    }

As said on an earlier patch, I think the cache can be updated if at
least one CPU loaded the blob successfully. Additionally I'd like to
ask that you log the number of successfully updated cores. And
finally perhaps "differing" instead of "different" and omit "due to
an unknown reason"?
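
To make the suggested policy concrete, a small sketch: the cache is
updated as soon as one core succeeded, and the count gets logged in every
case. All names are hypothetical, and snprintf() stands in for printk().

```c
#include <assert.h>
#include <stdio.h>

enum ucode_outcome { UCODE_NONE, UCODE_PARTIAL, UCODE_ALL };

static enum ucode_outcome classify(unsigned int updated, unsigned int nr_cores)
{
    assert(updated <= nr_cores);

    if ( updated == 0 )
        return UCODE_NONE;

    return updated == nr_cores ? UCODE_ALL : UCODE_PARTIAL;
}

/* Cache the patch as soon as at least one core loaded it successfully. */
static int should_cache(unsigned int updated)
{
    return updated > 0;
}

/* Log the number of successfully updated cores (both values unsigned). */
static int format_result(char *buf, size_t len,
                         unsigned int updated, unsigned int nr_cores)
{
    return snprintf(buf, len, "microcode updated on %u of %u cores\n",
                    updated, nr_cores);
}
```

The UCODE_PARTIAL case is where the "differing microcode revisions"
warning would be emitted, in addition to the count.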

Jan


