* [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
@ 2017-11-06 0:55 Wanpeng Li
2017-11-06 9:29 ` Paolo Bonzini
0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06 0:55 UTC (permalink / raw)
To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li
From: Wanpeng Li <wanpeng.li@hotmail.com>
watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G OE 4.14.0-rc4+ #4
RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
Call Trace:
? get_kvmclock_ns+0xa3/0x140 [kvm]
get_time_ref_counter+0x5a/0x80 [kvm]
kvm_hv_process_stimers+0x120/0x5f0 [kvm]
? kvm_hv_process_stimers+0x120/0x5f0 [kvm]
? preempt_schedule+0x27/0x30
? ___preempt_schedule+0x16/0x18
kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
? kvm_arch_vcpu_load+0x47/0x230 [kvm]
kvm_vcpu_ioctl+0x33a/0x620 [kvm]
? kvm_vcpu_ioctl+0x33a/0x620 [kvm]
? kvm_vm_ioctl_check_extension_generic+0x3b/0x40 [kvm]
? kvm_dev_ioctl+0x279/0x6c0 [kvm]
do_vfs_ioctl+0xa1/0x5d0
? __fget+0x73/0xa0
SyS_ioctl+0x79/0x90
entry_SYSCALL_64_fastpath+0x1e/0xa9
This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
results in softlockup.
This patch fixes it by disabling preemption just for rdtsc() and __this_cpu_read()
section in order to reduce preemption disable time, the per-CPU variable is just
read operation and the master clock is active, so it is as safe as how it is used
in kvm_guest_time_update().
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
arch/x86/kvm/x86.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34c85aa..2542f9b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1781,6 +1781,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
struct kvm_arch *ka = &kvm->arch;
struct pvclock_vcpu_time_info hv_clock;
u64 ret;
+ unsigned long this_tsc_khz, host_tsc;
spin_lock(&ka->pvclock_gtod_sync_lock);
if (!ka->use_master_clock) {
@@ -1795,13 +1796,17 @@ u64 get_kvmclock_ns(struct kvm *kvm)
/* both __this_cpu_read() and rdtsc() should be on the same cpu */
get_cpu();
- kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
- &hv_clock.tsc_shift,
- &hv_clock.tsc_to_system_mul);
- ret = __pvclock_read_cycles(&hv_clock, rdtsc());
+ this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
+ host_tsc = rdtsc();
put_cpu();
+ /* With all the info we got, fill in the values */
+ kvm_get_time_scale(NSEC_PER_SEC, this_tsc_khz * 1000LL,
+ &hv_clock.tsc_shift,
+ &hv_clock.tsc_to_system_mul);
+ ret = __pvclock_read_cycles(&hv_clock, host_tsc);
+
return ret;
}
--
2.7.4
^ permalink raw reply related [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
2017-11-06 0:55 [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp Wanpeng Li
@ 2017-11-06 9:29 ` Paolo Bonzini
2017-11-06 10:06 ` Wanpeng Li
0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2017-11-06 9:29 UTC (permalink / raw)
To: Wanpeng Li, linux-kernel, kvm; +Cc: Radim Krčmář, Wanpeng Li
On 06/11/2017 01:55, Wanpeng Li wrote:
> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
> results in softlockup.
Apart from the pr_debug, kvm_get_time_scale should take less than a
microsecond. The patch is fine, but can you confirm that pr_debug is
the culprit?
Thanks,
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
2017-11-06 9:29 ` Paolo Bonzini
@ 2017-11-06 10:06 ` Wanpeng Li
2017-11-06 10:21 ` Paolo Bonzini
0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06 10:06 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li
2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 06/11/2017 01:55, Wanpeng Li wrote:
>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>> results in softlockup.
>
> Apart from the pr_debug, kvm_get_time_scale should take less than a
> microsecond. The patch is fine, but can you confirm that pr_debug is
> the culprit?
I can still encounter softlockup after removing the pr_debug.
Regards,
Wanpeng Li
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
2017-11-06 10:06 ` Wanpeng Li
@ 2017-11-06 10:21 ` Paolo Bonzini
2017-11-06 10:49 ` Wanpeng Li
0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2017-11-06 10:21 UTC (permalink / raw)
To: Wanpeng Li; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li
On 06/11/2017 11:06, Wanpeng Li wrote:
> 2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>> On 06/11/2017 01:55, Wanpeng Li wrote:
>>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>>> results in softlockup.
>>
>> Apart from the pr_debug, kvm_get_time_scale should take less than a
>> microsecond. The patch is fine, but can you confirm that pr_debug is
>> the culprit?
>
> I can still encounter softlockup after removing the pr_debug.
Is kvm_get_time_scale getting into an infinite loop then? That would be
the actual bug.
Paolo
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
2017-11-06 10:21 ` Paolo Bonzini
@ 2017-11-06 10:49 ` Wanpeng Li
0 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06 10:49 UTC (permalink / raw)
To: Paolo Bonzini; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li
2017-11-06 18:21 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 06/11/2017 11:06, Wanpeng Li wrote:
>> 2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>> On 06/11/2017 01:55, Wanpeng Li wrote:
>>>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>>>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>>>> results in softlockup.
>>>
>>> Apart from the pr_debug, kvm_get_time_scale should take less than a
>>> microsecond. The patch is fine, but can you confirm that pr_debug is
>>> the culprit?
>>
>> I can still encounter softlockup after removing the pr_debug.
>
> Is kvm_get_time_scale getting into an infinite loop then? That would be
> the actual bug.
I think so, kvm_get_time_scale almost occupy 100% cpu utilization.
: static void kvm_get_time_scale(uint64_t scaled_hz,
uint64_t base_hz,
: s8 *pshift, u32 *pmultiplier)
: {
8.97 : 21a34: test %rdx,%rbx
0.00 : 21a37: jne 21a86 <kvm_get_time_scale+0xb6>
: uint64_t tps64;
: uint32_t tps32;
:
: tps64 = base_hz;
: scaled64 = scaled_hz;
: while (tps64 > scaled64*2 || tps64 &
0xffffffff00000000ULL) {
28.01 : 21a39: test %r13d,%r13d
: tps64 >>= 1;
0.00 : 21a3c: js 21a86 <kvm_get_time_scale+0xb6>
: shift--;
6.16 : 21a3e: add %r13d,%r13d
: uint64_t tps64;
: uint32_t tps32;
:
: tps64 = base_hz;
: scaled64 = scaled_hz;
: while (tps64 > scaled64*2 || tps64 &
0xffffffff00000000ULL) {
7.22 : 21a41: mov %r13d,%eax
17.86 : 21a44: add $0x1,%r14d
0.00 : 21a48: cmp %rax,%rbx
: tps64 >>= 1;
: shift--;
: }
:
: tps32 = (uint32_t)tps64;
: while (tps32 <= scaled64 || scaled64 &
0xffffffff00000000ULL) {
31.78 : 21a4b: jae 21a34 <kvm_get_time_scale+0x64>
0.00 : 21a4d: mov %r15,%rdi
: while (tps64 > scaled64*2 || tps64 &
0xffffffff00000000ULL) {
: tps64 >>= 1;
: shift--;
: }
Regards,
Wanpeng Li
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2017-11-06 10:49 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-06 0:55 [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp Wanpeng Li
2017-11-06 9:29 ` Paolo Bonzini
2017-11-06 10:06 ` Wanpeng Li
2017-11-06 10:21 ` Paolo Bonzini
2017-11-06 10:49 ` Wanpeng Li
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.