linux-kernel.vger.kernel.org archive mirror
* Re: [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug
       [not found] <SG2PR02MB1550F4ECAD6D74591300EBBA805D0@SG2PR02MB1550.apcprd02.prod.outlook.com>
@ 2016-06-07 10:39 ` Paolo Bonzini
  2016-06-07 11:16   ` Wanpeng Li
  2016-06-07 11:52   ` Wanpeng Li
  0 siblings, 2 replies; 5+ messages in thread
From: Paolo Bonzini @ 2016-06-07 10:39 UTC
  To: Wanpeng Li, linux-kernel, kvm
  Cc: Radim Krčmář, Ingo Molnar, Peter Zijlstra (Intel),
	Rik van Riel, Thomas Gleixner, Frederic Weisbecker, John Stultz



On 07/06/2016 09:59, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> I observed that st sometimes momentarily reads 100%, and then idle reads
> 100% even though there is a cpu hog on the guest cpu, after the cpu comes
> back from hotplug (N.B. this cannot always be readily reproduced). I added
> a trace to capture it, as below:
> 
> cpuhp/1-12    [001] d.h1   167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
> cpuhp/1-12    [001] d.h1   167.461659: account_process_tick: steal_jiffies = 1291
> <idle>-0     [001] d.h1   167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
> <idle>-0     [001] d.h1   167.462664: account_process_tick: steal_jiffies = 18446744072437
> 
> The steal clock warps and then steal_jiffies underflows.
> 
> Rik also pointed out to me:
> 
> | I have seen stuff like that with live migration too, in the past
> 
> The root cause of the steal clock warp during hotplug is that
> kvm_steal_time is reset to 0 after the cpu comes back from hotplug, when
> it should keep the preexisting guest value. This patch fixes it by not
> resetting kvm_steal_time during guest cpu hotplug.

Improved commit message:

Sometimes, after CPU hotplug you can observe a spike in stolen time 
(100%) followed by the CPU being marked as 100% idle when it's actually 
busy with a CPU hog task.  The trace looks like the following:

cpuhp/1-12    [001] d.h1   167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0         
cpuhp/1-12    [001] d.h1   167.461659: account_process_tick: steal_jiffies = 1291          
<idle>-0     [001] d.h1   167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000          
<idle>-0     [001] d.h1   167.462664: account_process_tick: steal_jiffies = 18446744072437

The sudden decrease of "steal" causes steal_jiffies to underflow.
The root cause is kvm_steal_time being reset to 0 after hot-plugging
back in a CPU.  Instead, the preexisting value can be used, which is
what the core scheduler code expects.

John Stultz also reported a similar issue after guest S3.
------
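
The arithmetic behind the trace can be reproduced outside the kernel.
The sketch below is not the scheduler code; it is a minimal model of what
the trace implies: subtract prev_steal_time from the steal clock as an
unsigned 64-bit value, convert the result to jiffies, and advance
prev_steal_time by whole jiffies only. It assumes HZ=1000 (one jiffy is
1,000,000 ns), which is consistent with the numbers in the trace.
struct steal_state and account_steal() are hypothetical names used only
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: HZ = 1000, so one jiffy is 1,000,000 ns. */
#define NSEC_PER_JIFFY 1000000ULL

/* Hypothetical stand-in for the per-CPU state shown in the trace. */
struct steal_state {
	uint64_t prev_steal_time;	/* stolen ns already accounted */
};

/*
 * Account one tick and return the number of jiffies treated as stolen.
 * The subtraction is unsigned, so a steal clock that went backwards
 * wraps around to a huge value instead of becoming negative.
 */
uint64_t account_steal(struct steal_state *s, uint64_t steal_clock_ns)
{
	uint64_t delta, jiffies;

	assert(s != NULL);

	delta = steal_clock_ns - s->prev_steal_time;
	jiffies = delta / NSEC_PER_JIFFY;

	/* Only whole jiffies are accounted; the remainder carries over. */
	s->prev_steal_time += jiffies * NSEC_PER_JIFFY;

	return jiffies;
}
```

Starting from prev_steal_time = 0 and feeding it the two steal readings
from the trace (1291385514, then 18732255) returns 1291 and then
18446744072437, the same steal_jiffies values that were logged.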

Please also add

Cc: John Stultz <john.stultz@linaro.org>

Thanks,

Paolo


* Re: [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug
  2016-06-07 10:39 ` [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug Paolo Bonzini
@ 2016-06-07 11:16   ` Wanpeng Li
  2016-06-07 11:52   ` Wanpeng Li
  1 sibling, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2016-06-07 11:16 UTC
  To: Paolo Bonzini
  Cc: Wanpeng Li, linux-kernel, kvm, Radim Krčmář,
	Ingo Molnar, Peter Zijlstra (Intel),
	Rik van Riel, Thomas Gleixner, Frederic Weisbecker, John Stultz

2016-06-07 18:39 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
[...]
>
> Please also add
>
> Cc: John Stultz <john.stultz@linaro.org>

Thanks, Paolo! Your help is always greatly appreciated. :)

Regards,
Wanpeng Li


* Re: [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug
  2016-06-07 10:39 ` [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug Paolo Bonzini
  2016-06-07 11:16   ` Wanpeng Li
@ 2016-06-07 11:52   ` Wanpeng Li
  2016-06-07 18:21     ` John Stultz
  1 sibling, 1 reply; 5+ messages in thread
From: Wanpeng Li @ 2016-06-07 11:52 UTC
  To: Paolo Bonzini
  Cc: Wanpeng Li, linux-kernel, kvm, Radim Krčmář,
	Ingo Molnar, Peter Zijlstra (Intel),
	Rik van Riel, Thomas Gleixner, Frederic Weisbecker, John Stultz

2016-06-07 18:39 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
[...]
>
> John Stultz also reported a similar issue after guest S3.

That is because CPUs are hot-unplugged during S3.

Regards,
Wanpeng Li


* Re: [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug
  2016-06-07 11:52   ` Wanpeng Li
@ 2016-06-07 18:21     ` John Stultz
  0 siblings, 0 replies; 5+ messages in thread
From: John Stultz @ 2016-06-07 18:21 UTC
  To: Wanpeng Li
  Cc: Paolo Bonzini, Wanpeng Li, linux-kernel, kvm,
	Radim Krčmář, Ingo Molnar, Peter Zijlstra (Intel),
	Rik van Riel, Thomas Gleixner, Frederic Weisbecker

On Tue, Jun 7, 2016 at 4:52 AM, Wanpeng Li <kernellwp@gmail.com> wrote:
> 2016-06-07 18:39 GMT+08:00 Paolo Bonzini <pbonzini@redhat.com>:
> [...]
>>
>> John Stultz also reported a similar issue after guest S3.
>
> That is because CPUs are hot-unplugged during S3.

While I'm excited to finally see some progress on this, I
unfortunately can't verify this fixes the issue I saw, as my qemu/kvm
test environments have regressed far enough that suspend/resume no
longer works. :P

I'll see about trying to re-build qemu from source to see if the
latest code works.

thanks
-john


* [PATCH v4 1/3] KVM: fix steal clock warp during guest cpu hotplug
@ 2016-06-07  8:33 Wanpeng Li
  0 siblings, 0 replies; 5+ messages in thread
From: Wanpeng Li @ 2016-06-07  8:33 UTC
  To: linux-kernel, kvm
  Cc: Wanpeng Li, Paolo Bonzini, Radim Krčmář,
	Ingo Molnar, Peter Zijlstra (Intel),
	Rik van Riel, Thomas Gleixner, Frederic Weisbecker

From: Wanpeng Li <wanpeng.li@hotmail.com>

I observed that st sometimes momentarily reads 100%, and then idle reads
100% even though there is a cpu hog on the guest cpu, after the cpu comes
back from hotplug (N.B. this cannot always be readily reproduced). I added
a trace to capture it, as below:

cpuhp/1-12    [001] d.h1   167.461657: account_process_tick: steal = 1291385514, prev_steal_time = 0
cpuhp/1-12    [001] d.h1   167.461659: account_process_tick: steal_jiffies = 1291
<idle>-0     [001] d.h1   167.462663: account_process_tick: steal = 18732255, prev_steal_time = 1291000000
<idle>-0     [001] d.h1   167.462664: account_process_tick: steal_jiffies = 18446744072437

The steal clock warps and then steal_jiffies underflows.

Rik also pointed out to me:

| I have seen stuff like that with live migration too, in the past

The root cause of the steal clock warp during hotplug is that
kvm_steal_time is reset to 0 after the cpu comes back from hotplug, when
it should keep the preexisting guest value. This patch fixes it by not
resetting kvm_steal_time during guest cpu hotplug.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v2 -> v3:
 * fix the root cause
v1 -> v2:
 * update patch subject, description and comments
 * deal with the case where steal time suddenly increases by a ludicrous amount

 arch/x86/kernel/kvm.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index eea2a6f..1ef5e48 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -301,8 +301,6 @@ static void kvm_register_steal_time(void)
 	if (!has_steal_clock)
 		return;
 
-	memset(st, 0, sizeof(*st));
-
 	wrmsrl(MSR_KVM_STEAL_TIME, (slow_virt_to_phys(st) | KVM_MSR_ENABLED));
 	pr_info("kvm-stealtime: cpu %d, msr %llx\n",
 		cpu, (unsigned long long) slow_virt_to_phys(st));
-- 
1.9.1
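
As a rough illustration of why dropping the memset() matters on the guest
side, here is a toy model of re-registration after a CPU comes back
online. It is only a sketch under loud assumptions: struct toy_steal_time
and the toy_* functions are hypothetical stand-ins, not the real
kvm_steal_time layout or the real registration path, and the host side is
not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the guest's steal-time area. */
struct toy_steal_time {
	uint64_t steal;		/* cumulative stolen time in ns */
};

/*
 * Models re-registration when a CPU comes back online.
 * reset != 0 mimics the removed memset(); reset == 0 mimics the
 * patched code, which leaves the preexisting value in place.
 */
void toy_register_steal_time(struct toy_steal_time *st, int reset)
{
	assert(st != NULL);
	if (reset)
		memset(st, 0, sizeof(*st));
}

/* Returns 1 if the counter did not go backwards across re-registration. */
int toy_clock_is_monotonic(uint64_t before_ns, int reset)
{
	struct toy_steal_time st = { .steal = before_ns };

	toy_register_steal_time(&st, reset);
	return st.steal >= before_ns;
}
```

With a non-zero value accumulated before the hotplug cycle, the reset
variant reports a counter that went backwards, which is the precondition
for the unsigned underflow seen in the trace; the non-reset variant keeps
the counter monotonic.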


