All of lore.kernel.org
 help / color / mirror / Atom feed
From: Wanpeng Li <kernellwp@gmail.com>
To: Rik van Riel <riel@redhat.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Wanpeng Li <wanpeng.li@hotmail.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Radim Krcmar <rkrcmar@redhat.com>, Mike Galbraith <efault@gmx.de>
Subject: Re: [PATCH] time,virt: resync steal time when guest & host lose sync
Date: Tue, 16 Aug 2016 14:54:50 +0800	[thread overview]
Message-ID: <CANRm+CyxRQjewCNQ96CKw0ZSk0zW5CTAuPwjsmCVSdFKs_Sfpg@mail.gmail.com> (raw)
In-Reply-To: <1471313483.32433.33.camel@redhat.com>

2016-08-16 10:11 GMT+08:00 Rik van Riel <riel@redhat.com>:
> On Tue, 2016-08-16 at 09:31 +0800, Wanpeng Li wrote:
>> 2016-08-15 23:00 GMT+08:00 Rik van Riel <riel@redhat.com>:
>> > On Mon, 2016-08-15 at 16:53 +0800, Wanpeng Li wrote:
>> > > 2016-08-12 23:58 GMT+08:00 Rik van Riel <riel@redhat.com>:
>> > > [...]
>> > > > Wanpeng, does the patch below work for you?
>> > >
>> > > It will break steal time for full dynticks guest, and there is a
>> > > calltrace of thread_group_cputime_adjusted call stack, RIP is
>> > > cputime_adjust+0xff/0x130.
>> >
>> > How?  This patch is equivalent to passing ULONG_MAX to
>> > steal_account_process_time, which you tried to no ill
>> > effect before.
>>
>> https://lkml.org/lkml/2016/6/8/404/ Paolo original suggested to add
>> the max cputime limit to the vtime, when the cpu is running in nohz
>> full mode and stop the tick, jiffies will be updated depends on clock
>> source instead of clock event device in
>> guest(tick_nohz_update_jiffies() callsite, ktime_get()), so it will
>> not be affected by lost clock ticks, my patch keeps the limit for
>> vtime and remove the limit to non-vtime. However, your patch removes
>> the limit for both scenarios and results in the below calltrace for
>> vtime.
>
> I understand what it does.
>
> What I would like to understand is WHY enforcing the limit
> is the right thing when using vtime, and the wrong thing
> in all other scenarios.

I observed that function get_vtime_delta() underflow which means that
delta < other when debugging your bugfix patch, I believe that is why
Paolo suggested to add the max cputime limit to vtime, he also pointed
out the potentional underflow before
https://lkml.org/lkml/2016/6/8/404/

>
> Can you explain why you change the limit to ULONG_MAX in
> three call sites, but not in the last one?
>
> What is different about the first three, versus the last
> one?
>
> Are you sure it should be changed in three places, and
> not in eg. two?
>
> This seems like something we should try to understand,
> rather than patch up haphazardly.
>
> The changelog of your patch could use an explanation of
> why the change is the correct way to go.
>
>> >
>> > Do you have the full call trace?
>
> OK, so you are seeing a divide by zero in cputime_adjust.
>
> Specifically, this would be scale_stime getting passed
> a (utime + stime) that adds up to 0. Stranger still, that
> only gets called if neither utime or stime is 0, meaning
> that one of utime or stime is negative, at the exact same
> magnitude as the other.
>
> Looking at thread_group_cputime(), I see some room for
> rounding errors.
>
>         do {
>                 seq = nextseq;
>                 flags = read_seqbegin_or_lock_irqsave(&sig->stats_lock,
> &seq);
>                 times->utime = sig->utime;
>                 times->stime = sig->stime;
>                 times->sum_exec_runtime = sig->sum_sched_runtime;
>
>                 for_each_thread(tsk, t) {
>                         task_cputime(t, &utime, &stime);
>                         times->utime += utime;
>                         times->stime += stime;
>                         times->sum_exec_runtime +=
> task_sched_runtime(t);
>                 }
>                 /* If lockless access failed, take the lock. */
>                 nextseq = 1;
>         } while (need_seqretry(&sig->stats_lock, seq));
>
> Specifically, task_cputime calls vtime_delta, which works
> off jiffies, while task_sched_runtime works straight off
> the sched clock.

I try to replace task_sched_runtime() by t->se.sum_exec_runtime just
for testing (refer to Commit d670ec13178d0 "posix-cpu-timers: Cure SMP
wobbles"), divide zero still appear.

>
> I can see how this would lead to a non-zero ->sum_exec_runtime,
> while ->utime and/or ->stime are still at zero. This is fine.
>
> What I do not see is how ->utime or ->stime could end up negative,
> which is what would be needed to hit that divide by zero.

stime is negative since underflow incurred in function
get_vtime_delta(), however, it is u64, so it is a very big number, and
the total is 0 after for loop in scale_time(), but not 0 before that.

Regards,
Wanpeng Li

  reply	other threads:[~2016-08-16  6:54 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-07-13 14:50 [GIT PULL] cputime fixes and cleanups Frederic Weisbecker
2016-07-13 14:50 ` [PATCH 1/5] sched,time: Count actually elapsed irq & softirq time Frederic Weisbecker
2016-07-14 10:37   ` [tip:timers/nohz] sched/cputime: " tip-bot for Rik van Riel
2016-08-09  3:59   ` [PATCH 1/5] sched,time: " Wanpeng Li
2016-08-09 14:06     ` Rik van Riel
2016-08-09 23:07       ` Wanpeng Li
2016-08-10  7:51         ` Wanpeng Li
2016-08-09 23:25       ` Wanpeng Li
2016-08-09 23:31         ` Wanpeng Li
2016-08-09 23:35         ` Wanpeng Li
2016-08-09 23:39         ` Wanpeng Li
2016-08-10  5:07           ` Rik van Riel
2016-08-10  6:33             ` Wanpeng Li
2016-08-10 16:52           ` [PATCH] time,virt: resync steal time when guest & host lose sync Rik van Riel
2016-08-11 10:11             ` Wanpeng Li
2016-08-12  2:44               ` Rik van Riel
2016-08-12  7:09                 ` Wanpeng Li
2016-08-12 15:58                   ` Rik van Riel
2016-08-13 15:36                     ` Frederic Weisbecker
2016-08-15  8:53                     ` Wanpeng Li
2016-08-15 11:38                       ` Wanpeng Li
2016-08-15 15:00                       ` Rik van Riel
2016-08-15 22:19                         ` Wanpeng Li
2016-08-16  1:31                         ` Wanpeng Li
2016-08-16  2:11                           ` Rik van Riel
2016-08-16  6:54                             ` Wanpeng Li [this message]
2016-08-16 14:01                               ` Rik van Riel
2016-08-16 23:08                                 ` Wanpeng Li
2016-08-12 16:33             ` Paolo Bonzini
2016-08-12 17:23               ` Rik van Riel
2016-08-13  7:18                 ` Paolo Bonzini
2016-08-13  8:42             ` Ingo Molnar
2016-08-14  1:50               ` Rik van Riel
2016-08-18  8:23               ` Wanpeng Li
2016-07-13 14:50 ` [PATCH 2/5] nohz,cputime: Replace VTIME_GEN irq time code with IRQ_TIME_ACCOUNTING code Frederic Weisbecker
2016-07-14 10:37   ` [tip:timers/nohz] sched/cputime: " tip-bot for Rik van Riel
2016-07-13 14:50 ` [PATCH 3/5] sched: Complete cleanup of old vtime gen irqtime accounting Frederic Weisbecker
2016-07-14 10:38   ` [tip:timers/nohz] sched/cputime: Clean up the old vtime gen irqtime accounting completely tip-bot for Frederic Weisbecker
2016-07-13 14:50 ` [PATCH 4/5] sched: Reorganize vtime native irqtime accounting headers Frederic Weisbecker
2016-07-14 10:38   ` [tip:timers/nohz] sched/cputime: " tip-bot for Frederic Weisbecker
2016-07-13 14:50 ` [PATCH 5/5] time: Drop local_irq_save/restore from irqtime_account_irq Frederic Weisbecker
2016-07-14 10:38   ` [tip:timers/nohz] sched/cputime: Drop local_irq_save/restore from irqtime_account_irq() tip-bot for Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CANRm+CyxRQjewCNQ96CKw0ZSk0zW5CTAuPwjsmCVSdFKs_Sfpg@mail.gmail.com \
    --to=kernellwp@gmail.com \
    --cc=efault@gmx.de \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=rkrcmar@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=wanpeng.li@hotmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.