* Xen: decreasing cpu steal clock counter
@ 2017-08-31 10:44 Valentin Vidic
2017-08-31 13:37 ` Greg KH
0 siblings, 1 reply; 5+ messages in thread
From: Valentin Vidic @ 2017-08-31 10:44 UTC
To: stable; +Cc: Michael Lass
The following behavior of the steal counter is observed
in a Xen guest running a 4.9 kernel:
$ while sleep 1; do head -1 /proc/stat ; done
cpu 1556 0 1429 314195002 5529 0 64 14370419283 0 0
cpu 1556 0 1429 314195402 5529 0 64 3601506907 0 0
cpu 1556 0 1429 314195802 5529 0 64 1833790429262 0 0
cpu 1556 0 1429 314196203 5529 0 64 1821957766874 0 0
cpu 1556 0 1429 314196603 5529 0 64 1810766851628 0 0
cpu 1556 0 1429 314197002 5529 0 64 1792853828090 0 0
Could this patch or some variation of it be included in
the 4.9 LTS?
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5
The problem exists in versions from 4.8 up to (but not including) 4.11,
so older LTS kernels should not be affected.
More details on this problem here:
https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest/
--
Valentin
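The symptom in the /proc/stat samples above is that the eighth numeric field (steal) goes backwards between reads. As a minimal sketch, assuming the aggregate "cpu" line layout shown in those samples, the check could be written in C as follows. The helper names parse_steal() and steal_went_backwards() are hypothetical and are not part of any existing API.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Parse the steal field (8th value after the "cpu" label) from the
 * aggregate "cpu" line of /proc/stat. Returns true on success.
 * Hypothetical helper; it is not meant for the per-CPU "cpuN" lines.
 */
static bool parse_steal(const char *line, unsigned long long *steal)
{
	unsigned long long user, nice, sys, idle, iowait, irq, softirq;

	return sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu %llu",
		      &user, &nice, &sys, &idle, &iowait, &irq, &softirq,
		      steal) == 8;
}

/*
 * A cumulative counter should never decrease between two samples.
 * Returns true only if both lines parse and the steal value dropped.
 */
static bool steal_went_backwards(const char *prev_line, const char *cur_line)
{
	unsigned long long prev, cur;

	if (!parse_steal(prev_line, &prev) || !parse_steal(cur_line, &cur))
		return false;
	return cur < prev;
}
```

Feeding two consecutive lines from the loop output to steal_went_backwards() flags a drop such as the one between the first and second samples.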
* Re: Xen: decreasing cpu steal clock counter
2017-08-31 10:44 Xen: decreasing cpu steal clock counter Valentin Vidic
@ 2017-08-31 13:37 ` Greg KH
2017-08-31 13:51 ` Valentin Vidic
0 siblings, 1 reply; 5+ messages in thread
From: Greg KH @ 2017-08-31 13:37 UTC
To: Valentin Vidic; +Cc: stable, Michael Lass
On Thu, Aug 31, 2017 at 12:44:51PM +0200, Valentin Vidic wrote:
> The following behavior of the steal counter is observed
> in a Xen guest running a 4.9 kernel:
>
> $ while sleep 1; do head -1 /proc/stat ; done
> cpu 1556 0 1429 314195002 5529 0 64 14370419283 0 0
> cpu 1556 0 1429 314195402 5529 0 64 3601506907 0 0
> cpu 1556 0 1429 314195802 5529 0 64 1833790429262 0 0
> cpu 1556 0 1429 314196203 5529 0 64 1821957766874 0 0
> cpu 1556 0 1429 314196603 5529 0 64 1810766851628 0 0
> cpu 1556 0 1429 314197002 5529 0 64 1792853828090 0 0
>
> Could this patch or some variation of it be included in
> the 4.9 LTS?
What patch?
> https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5
>
> Problem exists in versions 4.8 until 4.11 so older LTS
> kernels should not be affected.
>
> More details on this problem here:
>
> https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest/
What is the git commit id of the patch in Linus's tree that resolves
this issue?
thanks,
greg k-h
* Re: Xen: decreasing cpu steal clock counter
2017-08-31 13:37 ` Greg KH
@ 2017-08-31 13:51 ` Valentin Vidic
2017-08-31 14:08 ` Greg KH
0 siblings, 1 reply; 5+ messages in thread
From: Valentin Vidic @ 2017-08-31 13:51 UTC
To: Greg KH; +Cc: stable, Michael Lass
[-- Attachment #1: Type: text/plain, Size: 544 bytes --]
On Thu, Aug 31, 2017 at 03:37:09PM +0200, Greg KH wrote:
> What patch?
Attaching the patch from this link:
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5
> What is the git commit id of the patch in Linus's tree that resolves
> this issue?
The issue was fixed later in 4.11, but this might be a bigger change
so not sure if you want to take that:
2b1f967d80e8e5d7361f0e1654c842869570f573
sched/cputime: Complete nsec conversion of tick based accounting
--
Valentin
[-- Attachment #2: handle-decreasing-steal-clock.patch --]
[-- Type: text/x-diff, Size: 2959 bytes --]
From 4b66621a06a94d22629661a9262f92b8cf5b7ca9 Mon Sep 17 00:00:00 2001
From: Michael Lass <bevan@bi-co.net>
Date: Sun, 6 Aug 2017 18:09:21 +0200
Subject: [PATCH] sched/cputime: handle decreasing steal clock
On some flaky Xen hosts, the steal clock returned by paravirt_steal_clock is
not monotonically increasing but can slightly decrease. Currently this results
in an overflow of u64 steal. Before giving this number to account_steal_time()
it is converted into cputime, so the target cpustat counter
cpustat[CPUTIME_STEAL] is not overflowing as well but instead increased by a
large amount. Due to the conversion to cputime and back into nanoseconds,
this_rq()->prev_steal_time does not correctly reflect the latest reported steal
clock afterwards, resulting in erratic behavior such as backwards running
cpustat[CPUTIME_STEAL]. The following is a trace from userspace of the value for
steal time reported in /proc/stat:
  time    stolen           diff
  ----    ------           ----
   0ms    784
 100ms    1844670130367    1844670129583
 200ms    1844664564089    -5566278
 300ms    1844659554439    -5009650
 400ms    1844655101417    -4453022
This issue was probably introduced by the following commits, which deactivate a
check for (steal < 0) in the Xen pv guest codepath and allow unlimited jumps of
the cpustat counters (both introduced in v4.8):
ecb23dc6f2eff0ce64dd60351a81f376f13b12cc
03cbc732639ddcad15218c4b2046d255851ff1e3
As a workaround, ignore decreasing values of the steal clock. By not updating
this_rq()->prev_steal_time we make sure that steal time is only accounted as
soon as the steal clock rises above the value that was already observed and
accounted for earlier.
In current kernel versions (v4.11 and higher) this issue should not exist since
conversion between nsec and cputime has been eliminated. Therefore all values
will overflow, i.e. decrease as reported by the host system.
---
kernel/sched/cputime.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 5ebee3164e64..5f039f7f9294 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -262,10 +262,19 @@ static __always_inline cputime_t steal_account_process_time(cputime_t maxtime)
 #ifdef CONFIG_PARAVIRT
 	if (static_key_false(&paravirt_steal_enabled)) {
 		cputime_t steal_cputime;
-		u64 steal;
-
-		steal = paravirt_steal_clock(smp_processor_id());
-		steal -= this_rq()->prev_steal_time;
+		u64 steal_time;
+		s64 steal;
+
+		steal_time = paravirt_steal_clock(smp_processor_id());
+		steal = steal_time - this_rq()->prev_steal_time;
+
+		if (unlikely(steal < 0)) {
+			printk_ratelimited(KERN_DEBUG "cputime: steal_clock for "
+				"processor %d decreased: %llu -> %llu, "
+				"ignoring\n", smp_processor_id(),
+				this_rq()->prev_steal_time, steal_time);
+			return 0;
+		}
 
 		steal_cputime = min(nsecs_to_cputime(steal), maxtime);
 		account_steal_time(steal_cputime);
--
2.14.0
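To illustrate the arithmetic the patch above guards against, here is a rough user-space sketch in C. It is a simplified model and not the kernel code: the function names steal_delta_unsigned() and steal_delta_guarded() are hypothetical, the cputime conversion and the maxtime clamp are left out, and the previous value is simply set to the latest clock reading on success.

```c
#include <stdint.h>

/*
 * Pre-patch behaviour: a plain unsigned subtraction. If the steal clock
 * goes backwards (now < prev) the result wraps to a huge positive value.
 */
static uint64_t steal_delta_unsigned(uint64_t now, uint64_t prev)
{
	return now - prev;
}

/*
 * Simplified model of the patched logic: interpret the difference as
 * signed and ignore negative deltas. *prev is left untouched in that
 * case, so time is only accounted again once the clock rises above the
 * value that was already accounted for.
 *
 * The cast relies on two's-complement conversion, as the s64 assignment
 * in the patch does.
 */
static uint64_t steal_delta_guarded(uint64_t now, uint64_t *prev)
{
	int64_t delta = (int64_t)(now - *prev);

	if (delta < 0)
		return 0;	/* clock decreased: account nothing */

	*prev = now;
	return (uint64_t)delta;
}
```

With a previous value of 1000 and a reading of 900, the unsigned version yields a value close to 2^64, while the guarded version returns 0 and keeps 1000 as the reference until a reading above 1000 arrives.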
* Re: Xen: decreasing cpu steal clock counter
2017-08-31 13:51 ` Valentin Vidic
@ 2017-08-31 14:08 ` Greg KH
2017-08-31 15:05 ` Valentin Vidic
0 siblings, 1 reply; 5+ messages in thread
From: Greg KH @ 2017-08-31 14:08 UTC
To: Valentin Vidic; +Cc: stable, Michael Lass
On Thu, Aug 31, 2017 at 03:51:39PM +0200, Valentin Vidic wrote:
> On Thu, Aug 31, 2017 at 03:37:09PM +0200, Greg KH wrote:
> > What patch?
>
> Attaching the patch from this link:
>
> https://bugs.debian.org/cgi-bin/bugreport.cgi?att=1;bug=871608;filename=handle-decreasing-steal-clock.patch;msg=5
I can't do anything with non-upstream patches for stable kernels. You
have read
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
right?
> > What is the git commit id of the patch in Linus's tree that resolves
> > this issue?
>
> The issue was fixed later in 4.11, but this might be a bigger change
> so not sure if you want to take that:
>
> 2b1f967d80e8e5d7361f0e1654c842869570f573
> sched/cputime: Complete nsec conversion of tick based accounting
I always would rather take the original change that is in Linus's tree,
as 99% of the time we take something different, it ends up being wrong.
But I kind of doubt the above git commit id is the right one to take :(
I need some feedback from the Xen maintainers before I can do anything
else.
thanks,
greg k-h
* Re: Xen: decreasing cpu steal clock counter
2017-08-31 14:08 ` Greg KH
@ 2017-08-31 15:05 ` Valentin Vidic
0 siblings, 0 replies; 5+ messages in thread
From: Valentin Vidic @ 2017-08-31 15:05 UTC
To: Greg KH; +Cc: stable, Michael Lass
On Thu, Aug 31, 2017 at 04:08:51PM +0200, Greg KH wrote:
> I always would rather take the original change that is in Linus's tree,
> as 99% of the time we take something different, it ends up being wrong.
>
> But I kind of doubt the above git commit id is the right one to take :(
>
> I need some feedback from the Xen maintainers before I can do anything
> else.
No problem, thanks for the reply. I will try to extract and test the
changes for this problem from the later kernel version and also check
with the Xen people.
--
Valentin