* [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
@ 2017-11-06  0:55 Wanpeng Li
  2017-11-06  9:29 ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06  0:55 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

 watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [qemu-system-x86:10185]
 CPU: 6 PID: 10185 Comm: qemu-system-x86 Tainted: G           OE   4.14.0-rc4+ #4
 RIP: 0010:kvm_get_time_scale+0x4e/0xa0 [kvm]
 Call Trace:
  ? get_kvmclock_ns+0xa3/0x140 [kvm]
  get_time_ref_counter+0x5a/0x80 [kvm]
  kvm_hv_process_stimers+0x120/0x5f0 [kvm]
  ? kvm_hv_process_stimers+0x120/0x5f0 [kvm]
  ? preempt_schedule+0x27/0x30
  ? ___preempt_schedule+0x16/0x18
  kvm_arch_vcpu_ioctl_run+0x4b4/0x1690 [kvm]
  ? kvm_arch_vcpu_load+0x47/0x230 [kvm]
  kvm_vcpu_ioctl+0x33a/0x620 [kvm]
  ? kvm_vcpu_ioctl+0x33a/0x620 [kvm]
  ? kvm_vm_ioctl_check_extension_generic+0x3b/0x40 [kvm]
  ? kvm_dev_ioctl+0x279/0x6c0 [kvm]
  do_vfs_ioctl+0xa1/0x5d0
  ? __fget+0x73/0xa0
  SyS_ioctl+0x79/0x90
  entry_SYSCALL_64_fastpath+0x1e/0xa9

This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and 
cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which 
results in softlockup.

This patch fixes it by disabling preemption only around the rdtsc() and __this_cpu_read() 
calls, shortening the preemption-disabled section. The per-CPU variable is only read and 
the master clock is active, so this is as safe as its use in kvm_guest_time_update().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/x86.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 34c85aa..2542f9b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1781,6 +1781,7 @@ u64 get_kvmclock_ns(struct kvm *kvm)
 	struct kvm_arch *ka = &kvm->arch;
 	struct pvclock_vcpu_time_info hv_clock;
 	u64 ret;
+	unsigned long this_tsc_khz, host_tsc;
 
 	spin_lock(&ka->pvclock_gtod_sync_lock);
 	if (!ka->use_master_clock) {
@@ -1795,13 +1796,17 @@ u64 get_kvmclock_ns(struct kvm *kvm)
 	/* both __this_cpu_read() and rdtsc() should be on the same cpu */
 	get_cpu();
 
-	kvm_get_time_scale(NSEC_PER_SEC, __this_cpu_read(cpu_tsc_khz) * 1000LL,
-			   &hv_clock.tsc_shift,
-			   &hv_clock.tsc_to_system_mul);
-	ret = __pvclock_read_cycles(&hv_clock, rdtsc());
+	this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
+	host_tsc = rdtsc();
 
 	put_cpu();
 
+	/* With all the info we got, fill in the values */
+	kvm_get_time_scale(NSEC_PER_SEC, this_tsc_khz * 1000LL,
+			   &hv_clock.tsc_shift,
+			   &hv_clock.tsc_to_system_mul);
+	ret = __pvclock_read_cycles(&hv_clock, host_tsc);
+
 	return ret;
 }
 
-- 
2.7.4


* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
  2017-11-06  0:55 [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp Wanpeng Li
@ 2017-11-06  9:29 ` Paolo Bonzini
  2017-11-06 10:06   ` Wanpeng Li
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2017-11-06  9:29 UTC (permalink / raw)
  To: Wanpeng Li, linux-kernel, kvm; +Cc: Radim Krčmář, Wanpeng Li

On 06/11/2017 01:55, Wanpeng Li wrote:
> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and 
> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which 
> results in softlockup.

Apart from the pr_debug, kvm_get_time_scale should take less than a
microsecond.  The patch is fine, but can you confirm that pr_debug is
the culprit?

Thanks,

Paolo


* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
  2017-11-06  9:29 ` Paolo Bonzini
@ 2017-11-06 10:06   ` Wanpeng Li
  2017-11-06 10:21     ` Paolo Bonzini
  0 siblings, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06 10:06 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li

2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 06/11/2017 01:55, Wanpeng Li wrote:
>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>> results in softlockup.
>
> Apart from the pr_debug, kvm_get_time_scale should take less than a
> microsecond.  The patch is fine, but can you confirm that pr_debug is
> the culprit?

I can still encounter softlockup after removing the pr_debug.

Regards,
Wanpeng Li


* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
  2017-11-06 10:06   ` Wanpeng Li
@ 2017-11-06 10:21     ` Paolo Bonzini
  2017-11-06 10:49       ` Wanpeng Li
  0 siblings, 1 reply; 5+ messages in thread
From: Paolo Bonzini @ 2017-11-06 10:21 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li

On 06/11/2017 11:06, Wanpeng Li wrote:
> 2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>> On 06/11/2017 01:55, Wanpeng Li wrote:
>>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>>> results in softlockup.
>>
>> Apart from the pr_debug, kvm_get_time_scale should take less than a
>> microsecond.  The patch is fine, but can you confirm that pr_debug is
>> the culprit?
> 
> I can still encounter softlockup after removing the pr_debug.

Is kvm_get_time_scale getting into an infinite loop then?  That would be
the actual bug.

Paolo


* Re: [PATCH] KVM: X86: Fix softlockup when get the current kvmclock timestamp
  2017-11-06 10:21     ` Paolo Bonzini
@ 2017-11-06 10:49       ` Wanpeng Li
  0 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2017-11-06 10:49 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: linux-kernel, kvm, Radim Krčmář, Wanpeng Li

2017-11-06 18:21 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> On 06/11/2017 11:06, Wanpeng Li wrote:
>> 2017-11-06 17:29 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
>>> On 06/11/2017 01:55, Wanpeng Li wrote:
>>>> This can be reproduced when running kvm-unit-tests/hyperv_stimer.flat and
>>>> cpu-hotplug stress simultaneously. kvm_get_time_scale() takes too long which
>>>> results in softlockup.
>>>
>>> Apart from the pr_debug, kvm_get_time_scale should take less than a
>>> microsecond.  The patch is fine, but can you confirm that pr_debug is
>>> the culprit?
>>
>> I can still encounter softlockup after removing the pr_debug.
>
> Is kvm_get_time_scale getting into an infinite loop then?  That would be
> the actual bug.

I think so; kvm_get_time_scale takes almost 100% of the CPU:

         :      static void kvm_get_time_scale(uint64_t scaled_hz, uint64_t base_hz,
         :                                     s8 *pshift, u32 *pmultiplier)
         :      {
    8.97 :        21a34:       test   %rdx,%rbx
    0.00 :        21a37:       jne    21a86 <kvm_get_time_scale+0xb6>
         :              uint64_t tps64;
         :              uint32_t tps32;
         :
         :              tps64 = base_hz;
         :              scaled64 = scaled_hz;
         :              while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
   28.01 :        21a39:       test   %r13d,%r13d
         :                      tps64 >>= 1;
    0.00 :        21a3c:       js     21a86 <kvm_get_time_scale+0xb6>
         :                      shift--;
    6.16 :        21a3e:       add    %r13d,%r13d
         :              uint64_t tps64;
         :              uint32_t tps32;
         :
         :              tps64 = base_hz;
         :              scaled64 = scaled_hz;
         :              while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
    7.22 :        21a41:       mov    %r13d,%eax
   17.86 :        21a44:       add    $0x1,%r14d
    0.00 :        21a48:       cmp    %rax,%rbx
         :                      tps64 >>= 1;
         :                      shift--;
         :              }
         :
         :              tps32 = (uint32_t)tps64;
         :              while (tps32 <= scaled64 || scaled64 & 0xffffffff00000000ULL) {
   31.78 :        21a4b:       jae    21a34 <kvm_get_time_scale+0x64>
    0.00 :        21a4d:       mov    %r15,%rdi
         :              while (tps64 > scaled64*2 || tps64 & 0xffffffff00000000ULL) {
         :                      tps64 >>= 1;
         :                      shift--;
         :              }

Regards,
Wanpeng Li

