xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Dongli Zhang <dongli.zhang@oracle.com>,
	xen-devel@lists.xenproject.org, x86@kernel.org
Cc: jgross@suse.com, sstabellini@kernel.org, tglx@linutronix.de,
	mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
	hpa@zytor.com, joe.jin@oracle.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
Date: Mon, 1 Nov 2021 13:34:03 -0400	[thread overview]
Message-ID: <1f19674f-e608-1faa-5656-fec853297198@oracle.com> (raw)
In-Reply-To: <20211028012543.8776-1-dongli.zhang@oracle.com>


On 10/27/21 9:25 PM, Dongli Zhang wrote:
> The sched_clock() can be used very early since
> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
> with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from
> 0"), kdump kernel in Xen HVM guest may panic at very early stage when
> accessing &__this_cpu_read(xen_vcpu)->time as in below:
>
> setup_arch()
>   -> init_hypervisor_platform()
>       -> x86_init.hyper.init_platform = xen_hvm_guest_init()
>           -> xen_hvm_init_time_ops()
>               -> xen_clocksource_read()
>                   -> src = &__this_cpu_read(xen_vcpu)->time;
>
> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
> embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
> used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.
>
> However, when Xen HVM guest panic on vcpu >= 32, since
> xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
> vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.
>
> This patch delays xen_hvm_init_time_ops() to later in
> xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
> registered when the boot vcpu is >= 32.
>
> Another option is to always delay xen_hvm_init_time_ops() for any vcpus
> (including vcpu=0). Since to delay xen_hvm_init_time_ops() may lead to
> clock backward issue,


This is referring to https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg01516.html I assume?


>   it is preferred to avoid that for regular boot (The
> pv_sched_clock=native_sched_clock() is used at the very beginning until
> xen_sched_clock() is registered). That requires to adjust
> xen_sched_clock_offset. That's why we only delay xen_hvm_init_time_ops()
> for vcpu>=32.


We delay only on VCPU>=32 because we want to avoid the clock going backwards due to hypervisor problem pointed to be the link above, not because we need to adjust xen_sched_clock_offset (which we could if we wanted).


>
> This issue can be reproduced on purpose via below command at the guest
> side when kdump/kexec is enabled:
>
> "taskset -c 33 echo c > /proc/sysrq-trigger"
>
> Reference:
> https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
> Cc: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
>    - Add commit message to explain why xen_hvm_init_time_ops() is delayed
>      for any vcpus. (Suggested by Boris Ostrovsky)
>    - Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related
>      code in xen_hvm_guest_init(). (suggested by Juergen Gross)
>
>   arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
>   arch/x86/xen/smp_hvm.c       |  8 ++++++++
>   2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index e68ea5f4ad1c..7734dec52794 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
>   	WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
>   	xen_unplug_emulated_devices();
>   	x86_init.irqs.intr_init = xen_init_IRQ;
> -	xen_hvm_init_time_ops();
> +
> +	/*
> +	 * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'
> +	 * and the VM would use them until xen_vcpu_setup() is used to
> +	 * allocate/relocate them at arbitrary address.
> +	 *
> +	 * However, when Xen HVM guest panic on vcpu >= MAX_VIRT_CPUS,
> +	 * per_cpu(xen_vcpu, cpu) is still NULL at this stage. To access
> +	 * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
> +	 *
> +	 * Therefore we delay xen_hvm_init_time_ops() to
> +	 * xen_hvm_smp_prepare_boot_cpu() when boot vcpu is >= MAX_VIRT_CPUS.
> +	 */
> +	if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> +		pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
> +			xen_vcpu_nr(0));
> +	else
> +		xen_hvm_init_time_ops();
> +
>   	xen_hvm_init_mmu_ops();
>   
>   #ifdef CONFIG_KEXEC_CORE
> diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
> index 6ff3c887e0b9..f99043df8bb5 100644
> --- a/arch/x86/xen/smp_hvm.c
> +++ b/arch/x86/xen/smp_hvm.c
> @@ -19,6 +19,14 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
>   	 */
>   	xen_vcpu_setup(0);
>   
> +	/*
> +	 * The xen_hvm_init_time_ops() is delayed from
> +	 * xen_hvm_guest_init() to here to avoid panic when the kernel
> +	 * boots from vcpu>=MAX_VIRT_CPUS (32).
> +	 */


How about

   /* Deferred call to xen_hvm_init_time_ops(). See comment in xen_hvm_guest_init() */


-boris



> +	if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> +		xen_hvm_init_time_ops();
> +
>   	/*
>   	 * The alternative logic (which patches the unlock/lock) runs before
>   	 * the smp bootup up code is activated. Hence we need to set this up


  reply	other threads:[~2021-11-01 17:34 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-28  1:25 [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
2021-11-01 17:34 ` Boris Ostrovsky [this message]
2021-11-04 17:59   ` Dongli Zhang
2021-11-05 20:41     ` Boris Ostrovsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1f19674f-e608-1faa-5656-fec853297198@oracle.com \
    --to=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=dongli.zhang@oracle.com \
    --cc=hpa@zytor.com \
    --cc=jgross@suse.com \
    --cc=joe.jin@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=sstabellini@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).