All of lore.kernel.org
 help / color / mirror / Atom feed
From: Stanislaw Gruszka <sgruszka@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: xen.list@daevel.fr, Rik van Riel <riel@redhat.com>,
	peterz@infradead.org, Dongli Zhang <dongli.zhang@oracle.com>,
	dario.faggioli@citrix.com, bevan@bi-co.net,
	linux-kernel@vger.kernel.org, Xiaolong Ye <xiaolong.ye@intel.com>,
	mingo@redhat.com, Frederic Weisbecker <fweisbec@gmail.com>,
	joao.m.martins@oracle.com, xen-devel@lists.xen.org,
	Wanpeng Li <wanpeng.li@hotmail.com>
Subject: Re: [PATCH 1/1] sched/cputime: do not decrease steal time after live migration on xen
Date: Tue, 10 Oct 2017 14:42:01 +0200	[thread overview]
Message-ID: <20171010124201.GD8263__32078.7792363719$1507639512$gmane$org@redhat.com> (raw)
In-Reply-To: <20171010105925.mla7tpdh6stlxie3@gmail.com>

On Tue, Oct 10, 2017 at 12:59:26PM +0200, Ingo Molnar wrote:
> 
> (Cc:-ed more gents involved in kernel/sched/cputime.c work. Full patch quoted 
> below.)
> 
> * Dongli Zhang <dongli.zhang@oracle.com> wrote:
> 
> > After guest live migration on xen, steal time in /proc/stat
> > (cpustat[CPUTIME_STEAL]) might decrease because steal returned by
> > paravirt_steal_clock() might be less than this_rq()->prev_steal_time.
> > 
> > For instance, steal time of each vcpu is 335 before live migration.
> > 
> > cpu  198 0 368 200064 1962 0 0 1340 0 0
> > cpu0 38 0 81 50063 492 0 0 335 0 0
> > cpu1 65 0 97 49763 634 0 0 335 0 0
> > cpu2 38 0 81 50098 462 0 0 335 0 0
> > cpu3 56 0 107 50138 374 0 0 335 0 0
> > 
> > After live migration, steal time is reduced to 312.
> > 
> > cpu  200 0 370 200330 1971 0 0 1248 0 0
> > cpu0 38 0 82 50123 500 0 0 312 0 0
> > cpu1 65 0 97 49832 634 0 0 312 0 0
> > cpu2 39 0 82 50167 462 0 0 312 0 0
> > cpu3 56 0 107 50207 374 0 0 312 0 0
> > 
> > The code in this patch is borrowed from do_stolen_accounting() which has
> > already been removed from linux source code since commit ecb23dc6 ("xen:
> > add steal_clock support on x86").
> > 
> > Similar and more severe issue would impact prior linux 4.8-4.10 as
> > discussed by Michael Las at
> > https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest.
> > Unlike the issue discussed by Michael Las which would overflow steal time
> > and lead to 100% st usage in top command for linux 4.8-4.10, the issue for
> > linux 4.11+ would only decrease but not overflow steal time after live
> > migration.
> > 
> > References: https://0xstubs.org/debugging-a-flaky-cpu-steal-time-counter-on-a-paravirtualized-xen-guest
> > Signed-off-by: Dongli Zhang <dongli.zhang@oracle.com>
> > ---
> >  kernel/sched/cputime.c | 13 ++++++++++---
> >  1 file changed, 10 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> > index 14d2dbf..57d09cab 100644
> > --- a/kernel/sched/cputime.c
> > +++ b/kernel/sched/cputime.c
> > @@ -238,10 +238,17 @@ static __always_inline u64 steal_account_process_time(u64 maxtime)
> >  {
> >  #ifdef CONFIG_PARAVIRT
> >  	if (static_key_false(&paravirt_steal_enabled)) {
> > -		u64 steal;
> > +		u64 steal, steal_time;
> > +		s64 steal_delta;
> > +
> > +		steal_time = paravirt_steal_clock(smp_processor_id());
> > +		steal = steal_delta = steal_time - this_rq()->prev_steal_time;
> > +
> > +		if (unlikely(steal_delta < 0)) {
> > +			this_rq()->prev_steal_time = steal_time;

I don't think setting prev_steal_time to smaller value is right
thing to do. 

Beside, I don't think we need to check for overflow condition for
cputime variables (it will happen after 279 years :-). So instead
of introducing signed steal_delta variable I would just add
below check, which should be sufficient to fix the problem:

	if (unlikely(steal <= this_rq()->prev_steal_time))
		return 0;

Thanks
Stanislaw

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2017-10-10 12:42 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-10  9:14 [PATCH 1/1] sched/cputime: do not decrease steal time after live migration on xen Dongli Zhang
2017-10-10 10:07 ` Jan Beulich
2017-10-10 10:07 ` [Xen-devel] " Jan Beulich
2017-10-10 10:59 ` Ingo Molnar
2017-10-10 10:59 ` Ingo Molnar
2017-10-10 12:42   ` Stanislaw Gruszka [this message]
2017-10-10 12:42   ` Stanislaw Gruszka
2017-10-10 12:48     ` Peter Zijlstra
2017-10-10 12:48     ` Peter Zijlstra
2017-10-10 14:01       ` Rik van Riel
2017-10-10 14:01       ` Rik van Riel
2017-10-11  7:47         ` Dongli Zhang
2017-10-11  7:47         ` Dongli Zhang
2017-10-11  7:29     ` Dongli Zhang
2017-10-11  7:29     ` Dongli Zhang
2017-10-10 11:58 ` Peter Zijlstra
2017-10-10 11:58 ` Peter Zijlstra
2017-10-10  9:14 Dongli Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='20171010124201.GD8263__32078.7792363719$1507639512$gmane$org@redhat.com' \
    --to=sgruszka@redhat.com \
    --cc=bevan@bi-co.net \
    --cc=dario.faggioli@citrix.com \
    --cc=dongli.zhang@oracle.com \
    --cc=fweisbec@gmail.com \
    --cc=joao.m.martins@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=wanpeng.li@hotmail.com \
    --cc=xen-devel@lists.xen.org \
    --cc=xen.list@daevel.fr \
    --cc=xiaolong.ye@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.