* [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
2021-10-12 7:24 [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Dongli Zhang
@ 2021-10-12 7:24 ` Dongli Zhang
2021-10-12 8:48 ` Juergen Gross
2021-10-12 17:17 ` Boris Ostrovsky
2021-10-12 7:24 ` [PATCH xen 2/2] xen: update system time immediately when VCPUOP_register_vcpu_info Dongli Zhang
2021-10-12 8:47 ` [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Juergen Gross
2 siblings, 2 replies; 8+ messages in thread
From: Dongli Zhang @ 2021-10-12 7:24 UTC (permalink / raw)
To: xen-devel
Cc: linux-kernel, x86, boris.ostrovsky, jgross, sstabellini, tglx,
mingo, bp, hpa, andrew.cooper3, george.dunlap, iwj, jbeulich,
julien, wl, joe.jin
sched_clock() can be used very early since upstream
commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
with upstream commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock
time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early
stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:
setup_arch()
-> init_hypervisor_platform()
-> x86_init.hyper.init_platform = xen_hvm_guest_init()
-> xen_hvm_init_time_ops()
-> xen_clocksource_read()
-> src = &__this_cpu_read(xen_vcpu)->time;
This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
structures embedded inside 'shared_info' during the early stage, until
xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
cpu at an arbitrary address.
However, when a Xen HVM guest panics on a vcpu >= 32,
xen_vcpu_info_reset(0) sets per_cpu(xen_vcpu, cpu) = NULL for
vcpu >= 32, so xen_clocksource_read() on a vcpu >= 32 panics.
This patch delays xen_hvm_init_time_ops() to later in
xen_hvm_smp_prepare_boot_cpu(), after the 'vcpu_info' for the boot vcpu is
registered, when the boot vcpu is >= 32.
This issue can be reproduced on purpose via the below command on the guest
side when kdump/kexec is enabled:
"taskset -c 33 echo c > /proc/sysrq-trigger"
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
arch/x86/xen/smp_hvm.c | 3 +++
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
index e68ea5f4ad1c..152279416d9a 100644
--- a/arch/x86/xen/enlighten_hvm.c
+++ b/arch/x86/xen/enlighten_hvm.c
@@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
xen_unplug_emulated_devices();
x86_init.irqs.intr_init = xen_init_IRQ;
- xen_hvm_init_time_ops();
+
+ /*
+ * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info',
+ * and the VM uses them until xen_vcpu_setup() is used to
+ * allocate/relocate them at an arbitrary address.
+ *
+ * However, when a Xen HVM guest panics on a vcpu >= MAX_VIRT_CPUS,
+ * per_cpu(xen_vcpu, cpu) is still NULL at this stage. Accessing
+ * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
+ *
+ * Therefore we delay xen_hvm_init_time_ops() to
+ * xen_hvm_smp_prepare_boot_cpu() when the boot vcpu is >= MAX_VIRT_CPUS.
+ */
+ if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+ pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
+ xen_vcpu_nr(0));
+ else
+ xen_hvm_init_time_ops();
+
xen_hvm_init_mmu_ops();
#ifdef CONFIG_KEXEC_CORE
diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
index 6ff3c887e0b9..60cd4fafd188 100644
--- a/arch/x86/xen/smp_hvm.c
+++ b/arch/x86/xen/smp_hvm.c
@@ -19,6 +19,9 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
*/
xen_vcpu_setup(0);
+ if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
+ xen_hvm_init_time_ops();
+
/*
* The alternative logic (which patches the unlock/lock) runs before
* the smp bootup up code is activated. Hence we need to set this up
--
2.17.1
* Re: [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
2021-10-12 7:24 ` [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
@ 2021-10-12 8:48 ` Juergen Gross
2021-10-12 17:17 ` Boris Ostrovsky
1 sibling, 0 replies; 8+ messages in thread
From: Juergen Gross @ 2021-10-12 8:48 UTC (permalink / raw)
To: Dongli Zhang, xen-devel
Cc: linux-kernel, x86, boris.ostrovsky, sstabellini, tglx, mingo, bp,
hpa, andrew.cooper3, george.dunlap, iwj, jbeulich, julien, wl,
joe.jin
On 12.10.21 09:24, Dongli Zhang wrote:
> sched_clock() can be used very early since upstream
> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
> with upstream commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock
> time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early
> stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:
>
> setup_arch()
> -> init_hypervisor_platform()
> -> x86_init.hyper.init_platform = xen_hvm_guest_init()
> -> xen_hvm_init_time_ops()
> -> xen_clocksource_read()
> -> src = &__this_cpu_read(xen_vcpu)->time;
>
> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
> structures embedded inside 'shared_info' during the early stage, until
> xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
> cpu at an arbitrary address.
>
> However, when a Xen HVM guest panics on a vcpu >= 32,
> xen_vcpu_info_reset(0) sets per_cpu(xen_vcpu, cpu) = NULL for
> vcpu >= 32, so xen_clocksource_read() on a vcpu >= 32 panics.
>
> This patch delays xen_hvm_init_time_ops() to later in
> xen_hvm_smp_prepare_boot_cpu(), after the 'vcpu_info' for the boot vcpu is
> registered, when the boot vcpu is >= 32.
>
> This issue can be reproduced on purpose via the below command on the guest
> side when kdump/kexec is enabled:
>
> "taskset -c 33 echo c > /proc/sysrq-trigger"
>
> Cc: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> ---
> arch/x86/xen/enlighten_hvm.c | 20 +++++++++++++++++++-
> arch/x86/xen/smp_hvm.c | 3 +++
> 2 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index e68ea5f4ad1c..152279416d9a 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -216,7 +216,25 @@ static void __init xen_hvm_guest_init(void)
> WARN_ON(xen_cpuhp_setup(xen_cpu_up_prepare_hvm, xen_cpu_dead_hvm));
> xen_unplug_emulated_devices();
> x86_init.irqs.intr_init = xen_init_IRQ;
> - xen_hvm_init_time_ops();
> +
> + /*
> + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info',
> + * and the VM uses them until xen_vcpu_setup() is used to
> + * allocate/relocate them at an arbitrary address.
> + *
> + * However, when a Xen HVM guest panics on a vcpu >= MAX_VIRT_CPUS,
> + * per_cpu(xen_vcpu, cpu) is still NULL at this stage. Accessing
> + * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
> + *
> + * Therefore we delay xen_hvm_init_time_ops() to
> + * xen_hvm_smp_prepare_boot_cpu() when the boot vcpu is >= MAX_VIRT_CPUS.
> + */
> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> + pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
> + xen_vcpu_nr(0));
> + else
> + xen_hvm_init_time_ops();
> +
> xen_hvm_init_mmu_ops();
>
> #ifdef CONFIG_KEXEC_CORE
> diff --git a/arch/x86/xen/smp_hvm.c b/arch/x86/xen/smp_hvm.c
> index 6ff3c887e0b9..60cd4fafd188 100644
> --- a/arch/x86/xen/smp_hvm.c
> +++ b/arch/x86/xen/smp_hvm.c
> @@ -19,6 +19,9 @@ static void __init xen_hvm_smp_prepare_boot_cpu(void)
> */
> xen_vcpu_setup(0);
>
> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
> + xen_hvm_init_time_ops();
> +
Please add a comment referencing the related code in
xen_hvm_guest_init().
Juergen
* Re: [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
2021-10-12 7:24 ` [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
2021-10-12 8:48 ` Juergen Gross
@ 2021-10-12 17:17 ` Boris Ostrovsky
2021-10-25 5:20 ` Dongli Zhang
1 sibling, 1 reply; 8+ messages in thread
From: Boris Ostrovsky @ 2021-10-12 17:17 UTC (permalink / raw)
To: Dongli Zhang, xen-devel
Cc: linux-kernel, x86, jgross, sstabellini, tglx, mingo, bp, hpa,
andrew.cooper3, george.dunlap, iwj, jbeulich, julien, wl,
joe.jin
On 10/12/21 3:24 AM, Dongli Zhang wrote:
> sched_clock() can be used very early since upstream
> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
> with upstream commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock
> time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early
> stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:
Please drop "upstream". It's always upstream here.
> +
> + /*
> + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info',
> + * and the VM uses them until xen_vcpu_setup() is used to
> + * allocate/relocate them at an arbitrary address.
> + *
> + * However, when a Xen HVM guest panics on a vcpu >= MAX_VIRT_CPUS,
> + * per_cpu(xen_vcpu, cpu) is still NULL at this stage. Accessing
> + * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
> + *
> + * Therefore we delay xen_hvm_init_time_ops() to
> + * xen_hvm_smp_prepare_boot_cpu() when the boot vcpu is >= MAX_VIRT_CPUS.
> + */
> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
What about always deferring this when panicking? Would that work?
Deciding whether to defer based on the cpu number feels a bit awkward.
-boris
> + pr_info("Delay xen_hvm_init_time_ops() as kernel is running on vcpu=%d\n",
> + xen_vcpu_nr(0));
> + else
> + xen_hvm_init_time_ops();
> +
> xen_hvm_init_mmu_ops();
>
* Re: [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32
2021-10-12 17:17 ` Boris Ostrovsky
@ 2021-10-25 5:20 ` Dongli Zhang
0 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2021-10-25 5:20 UTC (permalink / raw)
To: Boris Ostrovsky, xen-devel
Cc: linux-kernel, x86, jgross, sstabellini, tglx, mingo, bp, hpa,
andrew.cooper3, george.dunlap, iwj, jbeulich, julien, wl,
joe.jin
Hi Boris,
On 10/12/21 10:17 AM, Boris Ostrovsky wrote:
>
> On 10/12/21 3:24 AM, Dongli Zhang wrote:
>> sched_clock() can be used very early since upstream
>> commit 857baa87b642 ("sched/clock: Enable sched clock early"). In addition,
>> with upstream commit 38669ba205d1 ("x86/xen/time: Output xen sched_clock
>> time from 0"), the kdump kernel in a Xen HVM guest may panic at a very early
>> stage when accessing &__this_cpu_read(xen_vcpu)->time, as shown below:
>
>
> Please drop "upstream". It's always upstream here.
>
>
>> +
>> + /*
>> + * Only MAX_VIRT_CPUS 'vcpu_info' are embedded inside 'shared_info',
>> + * and the VM uses them until xen_vcpu_setup() is used to
>> + * allocate/relocate them at an arbitrary address.
>> + *
>> + * However, when a Xen HVM guest panics on a vcpu >= MAX_VIRT_CPUS,
>> + * per_cpu(xen_vcpu, cpu) is still NULL at this stage. Accessing
>> + * per_cpu(xen_vcpu, cpu) via xen_clocksource_read() would panic.
>> + *
>> + * Therefore we delay xen_hvm_init_time_ops() to
>> + * xen_hvm_smp_prepare_boot_cpu() when the boot vcpu is >= MAX_VIRT_CPUS.
>> + */
>> + if (xen_vcpu_nr(0) >= MAX_VIRT_CPUS)
>
>
> What about always deferring this when panicing? Would that work?
>
>
> Deciding whether to defer based on cpu number feels a bit awkward.
>
>
> -boris
>
I did some tests and I do not think this works well. I prefer to delay the
initialization only for VCPU >= 32.
This is the syslog if we always delay xen_hvm_init_time_ops(), regardless
of whether the VCPU is >= 32.
[ 0.032372] Booting paravirtualized kernel on Xen HVM
[ 0.032376] clocksource: refined-jiffies: mask: 0xffffffff max_cycles:
0xffffffff, max_idle_ns: 1910969940391419 ns
[ 0.037683] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:64
nr_node_ids:2
[ 0.041876] percpu: Embedded 49 pages/cpu s162968 r8192 d29544 u262144
--> The clock jumps backwards from 0.041876 to 0.000010.
[ 0.000010] Built 2 zonelists, mobility grouping on. Total pages: 2015744
[ 0.000012] Policy zone: Normal
[ 0.000014] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.15.0-rc6xen+
root=UUID=2a5975ab-a059-4697-9aee-7a53ddfeea21 ro text console=ttyS0,115200n8
console=tty1 crashkernel=512M-:192M
This is because the initial pv_sched_clock is native_sched_clock(), and it
switches to xen_sched_clock() in xen_hvm_init_time_ops(). Is it fine to always
have a backward clock jump for the non-kdump kernel?
To avoid the backward jump, we could register a dummy clocksource that always
returns 0 before xen_hvm_init_time_ops(). I do not think this is reasonable.
Thank you very much!
Dongli Zhang
* [PATCH xen 2/2] xen: update system time immediately when VCPUOP_register_vcpu_info
2021-10-12 7:24 [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Dongli Zhang
2021-10-12 7:24 ` [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
@ 2021-10-12 7:24 ` Dongli Zhang
2021-10-12 8:47 ` [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Juergen Gross
2 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2021-10-12 7:24 UTC (permalink / raw)
To: xen-devel
Cc: linux-kernel, x86, boris.ostrovsky, jgross, sstabellini, tglx,
mingo, bp, hpa, andrew.cooper3, george.dunlap, iwj, jbeulich,
julien, wl, joe.jin
The guest may access the pv vcpu_time_info immediately after
VCPUOP_register_vcpu_info. This borrows the idea of
VCPUOP_register_vcpu_time_memory_area, where
force_update_vcpu_system_time() is called immediately when the new memory
area is registered.
Otherwise, we may observe clock drift on the VM side if the VM accesses
the clocksource immediately after VCPUOP_register_vcpu_info().
Cc: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
---
xen/common/domain.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/xen/common/domain.c b/xen/common/domain.c
index 40d67ec342..c879f6723b 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -1695,6 +1695,8 @@ long do_vcpu_op(int cmd, unsigned int vcpuid, XEN_GUEST_HANDLE_PARAM(void) arg)
rc = map_vcpu_info(v, info.mfn, info.offset);
domain_unlock(d);
+ force_update_vcpu_system_time(v);
+
break;
}
--
2.17.1
* Re: [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue
2021-10-12 7:24 [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Dongli Zhang
2021-10-12 7:24 ` [PATCH linux 1/2] xen: delay xen_hvm_init_time_ops() if kdump is boot on vcpu>=32 Dongli Zhang
2021-10-12 7:24 ` [PATCH xen 2/2] xen: update system time immediately when VCPUOP_register_vcpu_info Dongli Zhang
@ 2021-10-12 8:47 ` Juergen Gross
2021-10-12 15:50 ` Dongli Zhang
2 siblings, 1 reply; 8+ messages in thread
From: Juergen Gross @ 2021-10-12 8:47 UTC (permalink / raw)
To: Dongli Zhang, xen-devel
Cc: linux-kernel, x86, boris.ostrovsky, sstabellini, tglx, mingo, bp,
hpa, andrew.cooper3, george.dunlap, iwj, jbeulich, julien, wl,
joe.jin
On 12.10.21 09:24, Dongli Zhang wrote:
> When kdump/kexec is enabled on the HVM VM side, a kernel panic traps to
> the Xen side with reason=soft_reset. As a result, Xen reboots the VM
> with the kdump kernel.
>
> Unfortunately, when the VM panics with the below command line ...
>
> "taskset -c 33 echo c > /proc/sysrq-trigger"
>
> ... the kdump kernel panics at an early stage ...
>
> PANIC: early exception 0x0e IP 10:ffffffffa8c66876 error 0 cr2 0x20
> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc5xen #1
> [ 0.000000] Hardware name: Xen HVM domU
> [ 0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
> ... ...
> [ 0.000000] RSP: 0000:ffffffffaa203e20 EFLAGS: 00010082 ORIG_RAX: 0000000000000000
> [ 0.000000] RAX: 0000000000000003 RBX: 0000000000010000 RCX: 00000000ffffdfff
> [ 0.000000] RDX: 0000000000000003 RSI: 00000000ffffdfff RDI: 0000000000000020
> [ 0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09: 0000000000000001
> [ 0.000000] R10: ffffffffaa203e00 R11: ffffffffaa203c70 R12: 0000000040000004
> [ 0.000000] R13: ffffffffaa203e5c R14: ffffffffaa203e58 R15: 0000000000000000
> [ 0.000000] FS: 0000000000000000(0000) GS:ffffffffaa95e000(0000) knlGS:0000000000000000
> [ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 0.000000] CR2: 0000000000000020 CR3: 00000000ec9e0000 CR4: 00000000000406a0
> [ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 0.000000] Call Trace:
> [ 0.000000] ? xen_init_time_common+0x11/0x55
> [ 0.000000] ? xen_hvm_init_time_ops+0x23/0x45
> [ 0.000000] ? xen_hvm_guest_init+0x214/0x251
> [ 0.000000] ? 0xffffffffa8c00000
> [ 0.000000] ? setup_arch+0x440/0xbd6
> [ 0.000000] ? start_kernel+0x6a/0x689
> [ 0.000000] ? secondary_startup_64_no_verify+0xc2/0xcb
>
> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
> structures embedded inside 'shared_info' during the early stage, until
> xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
> cpu at an arbitrary address.
>
>
> The 1st patch fixes the issue on the VM kernel side. However, we may
> observe clock drift on the VM side due to the issue on the Xen hypervisor
> side. This is because the pv vcpu_time_info is not updated upon
> VCPUOP_register_vcpu_info.
>
> The 2nd patch calls force_update_vcpu_system_time() on the Xen side upon
> VCPUOP_register_vcpu_info, to avoid the VM clock drift during kdump kernel
> boot.
Please don't mix patches for multiple projects in one series.
In cases like this it is fine to mention the other project's patch
verbally instead.
Juergen
* Re: [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue
2021-10-12 8:47 ` [PATCH 0/2] Fix the Xen HVM kdump/kexec boot panic issue Juergen Gross
@ 2021-10-12 15:50 ` Dongli Zhang
0 siblings, 0 replies; 8+ messages in thread
From: Dongli Zhang @ 2021-10-12 15:50 UTC (permalink / raw)
To: Juergen Gross, xen-devel
Cc: linux-kernel, x86, boris.ostrovsky, sstabellini, tglx, mingo, bp,
hpa, andrew.cooper3, george.dunlap, iwj, jbeulich, julien, wl,
joe.jin
Hi Juergen,
On 10/12/21 1:47 AM, Juergen Gross wrote:
> On 12.10.21 09:24, Dongli Zhang wrote:
>> When kdump/kexec is enabled on the HVM VM side, a kernel panic traps to
>> the Xen side with reason=soft_reset. As a result, Xen reboots the VM
>> with the kdump kernel.
>>
>> Unfortunately, when the VM panics with the below command line ...
>>
>> "taskset -c 33 echo c > /proc/sysrq-trigger"
>>
>> ... the kdump kernel panics at an early stage ...
>>
>> PANIC: early exception 0x0e IP 10:ffffffffa8c66876 error 0 cr2 0x20
>> [ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 5.15.0-rc5xen #1
>> [ 0.000000] Hardware name: Xen HVM domU
>> [ 0.000000] RIP: 0010:pvclock_clocksource_read+0x6/0xb0
>> ... ...
>> [ 0.000000] RSP: 0000:ffffffffaa203e20 EFLAGS: 00010082 ORIG_RAX:
>> 0000000000000000
>> [ 0.000000] RAX: 0000000000000003 RBX: 0000000000010000 RCX: 00000000ffffdfff
>> [ 0.000000] RDX: 0000000000000003 RSI: 00000000ffffdfff RDI: 0000000000000020
>> [ 0.000000] RBP: 0000000000011000 R08: 0000000000000000 R09: 0000000000000001
>> [ 0.000000] R10: ffffffffaa203e00 R11: ffffffffaa203c70 R12: 0000000040000004
>> [ 0.000000] R13: ffffffffaa203e5c R14: ffffffffaa203e58 R15: 0000000000000000
>> [ 0.000000] FS: 0000000000000000(0000) GS:ffffffffaa95e000(0000)
>> knlGS:0000000000000000
>> [ 0.000000] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 0.000000] CR2: 0000000000000020 CR3: 00000000ec9e0000 CR4: 00000000000406a0
>> [ 0.000000] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 0.000000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> [ 0.000000] Call Trace:
>> [ 0.000000] ? xen_init_time_common+0x11/0x55
>> [ 0.000000] ? xen_hvm_init_time_ops+0x23/0x45
>> [ 0.000000] ? xen_hvm_guest_init+0x214/0x251
>> [ 0.000000] ? 0xffffffffa8c00000
>> [ 0.000000] ? setup_arch+0x440/0xbd6
>> [ 0.000000] ? start_kernel+0x6a/0x689
>> [ 0.000000] ? secondary_startup_64_no_verify+0xc2/0xcb
>>
>> This is because Xen HVM supports at most MAX_VIRT_CPUS=32 'vcpu_info'
>> structures embedded inside 'shared_info' during the early stage, until
>> xen_vcpu_setup() is used to allocate/relocate the 'vcpu_info' for the boot
>> cpu at an arbitrary address.
>>
>>
>> The 1st patch fixes the issue on the VM kernel side. However, we may
>> observe clock drift on the VM side due to the issue on the Xen hypervisor
>> side. This is because the pv vcpu_time_info is not updated upon
>> VCPUOP_register_vcpu_info.
>>
>> The 2nd patch calls force_update_vcpu_system_time() on the Xen side upon
>> VCPUOP_register_vcpu_info, to avoid the VM clock drift during kdump kernel
>> boot.
>
> Please don't mix patches for multiple projects in one series.
>
> In cases like this it is fine to mention the other project's patch
> verbally instead.
>
I will split the patchset in v2 and email the patches to the different
projects.
1. Fix at HVM domU side (kdump kernel panic)
2. Fix at Xen hypervisor side (clock drift issue in kdump kernel)
3. To report (and seek help with) the fact that soft_reset does not work with
mainline Xen, so I am not able to test my patchset with the most recent
mainline Xen.
Thank you very much!
Dongli Zhang