From: David Gibson <david@gibson.dropbear.id.au>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: John Stultz <john.stultz@linaro.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Liav Rehana <liavr@mellanox.com>,
	Chris Metcalf <cmetcalf@mellanox.com>,
	Richard Cochran <richardcochran@gmail.com>,
	Ingo Molnar <mingo@kernel.org>,
	Prarit Bhargava <prarit@redhat.com>,
	Laurent Vivier <lvivier@redhat.com>,
	"Christopher S . Hall" <christopher.s.hall@intel.com>,
	"4.6+" <stable@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] timekeeping: Change type of nsec variable to unsigned in its calculation.
Date: Sat, 3 Dec 2016 11:33:09 +1100
Message-ID: <20161203003309.GL10089@umbus.fritz.box>
In-Reply-To: <alpine.DEB.2.20.1612020921500.4295@nanos>

On Fri, Dec 02, 2016 at 09:36:42AM +0100, Thomas Gleixner wrote:
> On Fri, 2 Dec 2016, David Gibson wrote:
> > On Thu, Dec 01, 2016 at 12:59:51PM +0100, Thomas Gleixner wrote:
> > > So I assume that you are talking about a VM which was not scheduled by the
> > > host due to overcommitment (who ever thought that this is a good idea) or
> > > whatever other reason (yes, people were complaining about wreckage caused
> > > by stopping kernels with debuggers) for a long enough time to trigger that
> > > overflow situation. If that's the case then the unsigned conversion will
> > > just make it more unlikely but it still will happen.
> > 
> > It was essentially the stopped by debugger case.  I forget exactly
> > why, but the guest was being explicitly stopped from outside, it
> > wasn't just scheduling lag.  I think it was something in the vicinity
> > of 10 minutes stopped.
> 
> Ok. Debuggers stopping stuff is one issue, but if I understood Liav
> correctly, then he is seeing the issue on a heavily loaded machine.

Right.  I can't speak to other situations which might trigger this.

> Liav, can you please describe the scenario in detail? Are you observing
> this on bare metal or in a VM which gets scheduled out long enough or was
> there debugging/hypervisor intervention involved?
> 
> > It's long enough ago that I can't be sure, but I thought we'd tried
> > various different stoppage periods, which should have also triggered
> > the unsigned overflow you're describing, and didn't observe the crash
> > once the change was applied.  Note that there have been other changes
> > to the timekeeping code since then, which might have made a
> > difference.
> > 
> > I agree that it's not reasonable for the guest to be entirely
> > unaffected by such a large stoppage: I'd have no complaints if the
> > guest time was messed up, and/or it spewed warnings.  But complete
> > guest death seems a rather more fragile response to the situation than
> > we'd like.
> 
> Guest death? Is it really dead/crashed or just stuck in that endless loop
> trying to add that huge negative value piecewise?

Well, I don't know.  But the point was it was unusable from the
console, and didn't come back any time soon.
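
For illustration only, here is a minimal sketch of why a negative
nanosecond value can look like a hang.  It assumes the value ends up as
the unsigned dividend of a subtract-in-a-loop division in the style of
the __iter_div_u64_rem() helper mentioned in this thread.  The names
iter_div_rem() and iter_div_steps() are made up for this example and
are not kernel code.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for a subtract-in-a-loop division helper.
 * Assumes divisor is non-zero.
 */
static uint32_t iter_div_rem(uint64_t dividend, uint32_t divisor,
			     uint64_t *remainder)
{
	uint32_t quotient = 0;

	/* One pass per whole multiple of divisor contained in dividend. */
	while (dividend >= divisor) {
		dividend -= divisor;
		quotient++;
	}
	*remainder = dividend;
	return quotient;
}

/*
 * Number of loop passes iter_div_rem() would need for the same inputs,
 * computed directly so that huge dividends can be inspected safely.
 * Assumes divisor is non-zero.
 */
static uint64_t iter_div_steps(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}
```

With a divisor of 1000000000 (nanoseconds per second), a dividend of
2500000000 needs two passes, while (uint64_t)(int64_t)-1 needs roughly
1.8e10 passes, which would look like a stuck guest from the console.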

> That's at least what Liav was describing as he mentioned
> __iter_div_u64_rem() explicitly.
> 
> While I'm less worried about debuggers, I worry about the real thing.
> 
> I agree that we should not starve after resume from a debug stop, but in
> that case the least of my worries is time going backwards.
> 
> Though if the signed mult overrun is observable in a live system, then we
> need to worry about time going backwards even with the unsigned
> conversion. Simply because once we fixed the starvation issue people with
> insane enough setups will trigger the unsigned overrun and complain about
> time going backwards.
> 
> Thanks,
> 
> 	tglx
> 
> 
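
To put the signed versus unsigned point in concrete terms, a rough
sketch follows.  It assumes the nanosecond value comes from a 64-bit
product of a cycle delta and a 32-bit multiplier.  The function names
and the multiplier used in the note below are hypothetical and not
taken from any real clocksource.

```c
#include <stdint.h>

/*
 * Would the product have its sign bit set if stored in a signed
 * 64-bit variable?  The multiply is done in unsigned arithmetic,
 * which wraps modulo 2^64 and is well defined in C.
 */
static int product_negative_as_s64(uint64_t delta, uint32_t mult)
{
	uint64_t product = delta * (uint64_t)mult;

	return (product >> 63) != 0;
}

/* Would the product no longer fit in an unsigned 64-bit variable? */
static int product_wraps_u64(uint64_t delta, uint32_t mult)
{
	return mult != 0 && delta > UINT64_MAX / mult;
}
```

With mult = 1 << 24, a delta of 1 << 39 already sets the sign bit,
while the unsigned product only wraps from a delta of 1 << 40.  That is
one extra bit of headroom: the unsigned type delays the overrun but
does not remove it.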

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson



Thread overview: 22+ messages
2016-11-19  4:53 [PATCH] timekeeping: Change type of nsec variable to unsigned in its calculation John Stultz
2016-11-28 22:50 ` John Stultz
2016-11-29 14:22 ` Thomas Gleixner
2016-11-29 23:57   ` David Gibson
2016-11-30 23:21     ` Thomas Gleixner
2016-12-01  2:12       ` David Gibson
2016-12-01 11:59         ` Thomas Gleixner
2016-12-01 20:23           ` John Stultz
2016-12-01 20:46             ` Thomas Gleixner
2016-12-01 21:19               ` John Stultz
2016-12-01 22:44                 ` Thomas Gleixner
2016-12-01 23:03                   ` John Stultz
2016-12-01 23:08                     ` Thomas Gleixner
2016-12-01 23:32           ` David Gibson
2016-12-02  8:36             ` Thomas Gleixner
2016-12-03  0:33               ` David Gibson [this message]
  -- strict thread matches above, loose matches on Subject: below --
2016-09-26  6:13 Liav Rehana
2016-09-26  5:45 Liav Rehana
2016-09-26  6:02 ` John Stultz
2016-09-27  0:01 ` Thomas Gleixner
2016-09-27  5:10   ` Liav Rehana
2016-09-27 14:18     ` Thomas Gleixner
