From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: Dongli Zhang <dongli.zhang@oracle.com>,
xen-devel@lists.xenproject.org, x86@kernel.org
Cc: jgross@suse.com, sstabellini@kernel.org, tglx@linutronix.de,
mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com,
hpa@zytor.com, joe.jin@oracle.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
Date: Mon, 1 Nov 2021 13:34:03 -0400 [thread overview]
Message-ID: <1f19674f-e608-1faa-5656-fec853297198@oracle.com> (raw)
In-Reply-To: <20211028012543.8776-1-dongli.zhang@oracle.com>
On 10/27/21 9:25 PM, Dongli Zhang wrote:
> The sched_clock() can be used very early since
> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
> with commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock time from
> 0"), kdump kernel in Xen HVM guest may panic at very early stage when
> accessing &__this_cpu_read(xen_vcpu)->time as in below:
>
> setup_arch()
> -> init_hypervisor_platform()
> -> x86_init.hyper.init_platform = xen_hvm_guest_init()
> -> xen_hvm_init_time_ops()
> -> xen_clocksource_read()
> -> src = &__this_cpu_read(xen_vcpu)->time;
>
> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
> embedded inside 'shared_info' during early stage until xen_vcpu_setup() is
> used to allocate/relocate 'vcpu_info' for boot cpu at arbitrary address.
>
> However, when Xen HVM guest panic on vcpu >= 32, since
> xen_vcpu_info_reset(0) would set per_cpu(xen_vcpu, cpu) = NULL when
> vcpu >= 32, xen_clocksource_read() on vcpu >= 32 would panic.
>
> This patch delays xen_hvm_init_time_ops() to later in
> xen_hvm_smp_prepare_boot_cpu() after the 'vcpu_info' for boot vcpu is
> registered when the boot vcpu is >= 32.
>
> Another option is to always delay xen_hvm_init_time_ops() for any vcpus
> (including vcpu=0). Since to delay xen_hvm_init_time_ops() may lead to
> clock backward issue,
This is referring to https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg01516.html I assume?
> it is preferred to avoid that for regular boot (The
> pv_sched_clock=native_sched_clock() is used at the very beginning until
> xen_sched_clock() is registered). That requires to adjust
> xen_sched_clock_offset. That's why we only delay xen_hvm_init_time_ops()
> for vcpu>=32.
We delay only on VCPU>=32 because we want to avoid the clock going backwards due to hypervisor problem pointed to be the link above, not because we need to adjust xen_sched_clock_offset (which we could if we wanted).
>
> This issue can be reproduced on purpose via below command at the guest
> side when kdump/kexec is enabled:
>
> "taskset -c 33 echo c > /proc/sysrq-trigger"
>
> Reference:
> https://lists.xenproject.org/archives/html/xen-devel/2021-10/msg00571.html
> Cc: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> Changed since v1:
> - Add commit message to explain why xen_hvm_init_time_ops() is delayed
> for any vcpus. (Suggested by Boris Ostrovsky)
> - Add a comment in xen_hvm_smp_prepare_boot_cpu() referencing the related
> code in xen_hvm_guest_init(). (suggested by Juergen Gross)
>
> arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
> arch/x86/xen/smp_hvm.c | 8 ++++++++
> 2 files changed, 27 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index e68ea5f4ad1c..7734dec52794 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
> WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
> xen_unplug_emulated_devices();
> x86_init.irqs.intr_init = xen_init_IRQ;
> - xen_hvm_init_time_ops();
> +
> + /*
> + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info'
> + * and the VM would use them until xen_vcpu_setup() is used to
> + * allocate/relocate them at arbitrary address.
> + *
> + * However, when Xen HVM guest panic on vcpu >= MAX_VIRT_CPUS,
> + * per_cpu(xen_vcpu, cpu) is still NULL at this stage. To access
> + * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
> + *
> + * Therefore we delay xen_hvm_init_time_ops() to
> + * xen_hvm_smp_prepare_boot_cpu() when boot vcpu is >= MAX_VIRT_CPUS.
> + */
> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> + pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
> + xen_vcpu_nr(0));
> + else
> + xen_hvm_init_time_ops();
> +
> xen_hvm_init_mmu_ops();
>
> #ifdef CONFIG_KEXEC_CORE
> diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
> index 6ff3c887e0b9..f99043df8bb5 100644
> --- a/arch/x86/xen/smp_hvm.c
> +++ b/arch/x86/xen/smp_hvm.c
> @@ -19,6 +19,14 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
> */
> xen_vcpu_setup(0);
>
> + /*
> + * The xen_hvm_init_time_ops() is delayed from
> + * xen_hvm_guest_init() to here to avoid panic when the kernel
> + * boots from vcpu>=MAX_VIRT_CPUS (32).
> + */
How about
/* Deferred call to xen_hvm_init_time_ops(). See comment in xen_hvm_guest_init() */
-boris
> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> + xen_hvm_init_time_ops();
> +
> /*
> * The alternative logic (which patches the unlock/lock) runs before
> * the smp bootup up code is activated. Hence we need to set this up
next prev parent reply other threads:[~2021-11-01 17:34 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-28 1:25 [PATCH v2 1/1] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
2021-11-01 17:34 ` Boris Ostrovsky [this message]
2021-11-04 17:59 ` Dongli Zhang
2021-11-05 20:41 ` Boris Ostrovsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1f19674f-e608-1faa-5656-fec853297198@oracle.com \
--to=boris.ostrovsky@oracle.com \
--cc=bp@alien8.de \
--cc=dave.hansen@linux.intel.com \
--cc=dongli.zhang@oracle.com \
--cc=hpa@zytor.com \
--cc=jgross@suse.com \
--cc=joe.jin@oracle.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=sstabellini@kernel.org \
--cc=tglx@linutronix.de \
--cc=x86@kernel.org \
--cc=xen-devel@lists.xenproject.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).