From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S933268AbcLAVTE (ORCPT <rfc822;w@1wt.eu>);
        Thu, 1 Dec 2016 16:19:04 -0500
Received: from mail-oi0-f41.google.com ([209.85.218.41]:33541 "EHLO
        mail-oi0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1751676AbcLAVTC (ORCPT
        <rfc822;linux-kernel@vger.kernel.org>);
        Thu, 1 Dec 2016 16:19:02 -0500
MIME-Version: 1.0
In-Reply-To: <alpine.DEB.2.20.1612012127260.3666@nanos>
References: <1479531216-25361-1-git-send-email-john.stultz@linaro.org>
 <alpine.DEB.2.20.1611291520070.4358@nanos> <20161129235727.GA19891@umbus>
 <alpine.DEB.2.20.1611302355070.3619@nanos> <20161201021233.GI19891@umbus>
 <alpine.DEB.2.20.1612011110270.3453@nanos> <CALAqxLXSEYMrzZOPhg1ezzatqgRp9w_iyg=a_yZZXmK2rGDxOg@mail.gmail.com>
 <alpine.DEB.2.20.1612012127260.3666@nanos>
From: John Stultz <john.stultz@linaro.org>
Date: Thu, 1 Dec 2016 13:19:00 -0800
Message-ID: <CALAqxLXg2i6uiWcq21LK-ZsPvtugbuJa7Y8U0upXczS_o9aZOQ@mail.gmail.com>
Subject: Re: [PATCH] timekeeping: Change type of nsec variable to unsigned in
 its calculation.
To: Thomas Gleixner <tglx@linutronix.de>
Cc: David Gibson <david@gibson.dropbear.id.au>,
        lkml <linux-kernel@vger.kernel.org>, Liav Rehana <liavr@mellanox.com>,
        Chris Metcalf <cmetcalf@mellanox.com>,
        Richard Cochran <richardcochran@gmail.com>,
        Ingo Molnar <mingo@kernel.org>, Prarit Bhargava <prarit@redhat.com>,
        Laurent Vivier <lvivier@redhat.com>,
        "Christopher S . Hall" <christopher.s.hall@intel.com>,
        "4.6+" <stable@vger.kernel.org>, Peter Zijlstra <peterz@infradead.org>
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 1, 2016 at 12:46 PM, Thomas Gleixner <tglx@linutronix.de> wrote:
> On Thu, 1 Dec 2016, John Stultz wrote:
>> I would also suggest:
> >> 3) If the systems are halted for longer than the timekeeping core
>> expects, the system will "miss" or "lose" some portion of that halted
>> time, but otherwise the system will function properly.  Which is the
>> result with this patch.
>
> Wrong. This is not the result with this patch.
>
> If the time advances enough to overflow the unsigned mult, which is
> entirely possible as it takes just twice the time of the negative overflow,
> then time will go backwards again and that's not 'miss' or 'lose', that's
> just broken.

Eh? If you overflow the 64 bits on the mult, the shift (which is likely
large if you're actually hitting the overflow) brings the value back
down to a smaller value. Time doesn't go backwards, it's just smaller
than it ought to be (since the high bits were lost).
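
To make the arithmetic concrete, here is a minimal userspace sketch of
the conversion. It is not the kernel code: the helper names are made up
for illustration, and the signed variant relies on gcc/clang behaviour
for the cast and the shift.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace stand-ins for the (delta * mult) >> shift
 * conversion; not the kernel's actual functions.
 */

/* Unsigned variant: the product wraps modulo 2^64, then is shifted down. */
static uint64_t cyc_to_ns_unsigned(uint64_t delta, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);	/* shifting by >= 64 is undefined in C */
	return (delta * mult) >> shift;
}

/*
 * Signed variant: the same product reinterpreted as a signed 64-bit
 * value. The cast and the right shift of a negative value are
 * implementation-defined; gcc and clang wrap and shift arithmetically.
 */
static int64_t cyc_to_ns_signed(uint64_t delta, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	return (int64_t)(delta * mult) >> shift;
}
```

Assuming mult = 1 << 24 and shift = 24 (one nanosecond per cycle), a
delta of 2^39 cycles makes the product 2^63: the signed variant goes
negative while the unsigned one still returns 2^39. At a delta of
2^40 + 5 the product wraps past 64 bits and the unsigned variant
returns 5, which is far too small but not negative.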

> If we want to prevent that, then we either have to clamp the delta value,
> which is the worst choice or use 128bit math to avoid the overflow.

I'm not yet convinced that either of these approaches is really needed.

> >> I'm not sure if it's really worth trying to recover that time or be
>> perfect in those situations. Especially since on narrow clocksources
>> you'll have the same result.
>
> We can deal with the 64bit overflow at least for wide clocksources which
> all virtualization infected architectures provide in a sane way.

Another approach would be to push back on the virtualization
environments to step in and virtualize a solution if they've idled a
host for too long. They could do what the old tick-based
virtualization environments used to do: trigger a few timer interrupts
while slowly removing a fake negative clocksource offset, allowing time
to catch up more normally after a long stall.

Or they could require clocksources that have smaller shift values to
allow longer idle periods.
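
As a rough illustration of why the shift matters, the sketch below
computes how long a clocksource can go unread before delta * mult no
longer fits in 64 bits. The helper names and the 1 GHz frequency in the
note are assumptions for the example, not taken from any real driver.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical helper: ns-per-cycle scaled by 2^shift, truncated. */
static uint32_t hz_to_mult(uint32_t freq_hz, uint32_t shift)
{
	uint64_t mult;

	assert(freq_hz != 0 && shift < 32);	/* keeps the shift below in range */
	mult = (NSEC_PER_SEC << shift) / freq_hz;
	assert(mult <= UINT32_MAX);		/* mult is treated as 32-bit here */
	return (uint32_t)mult;
}

/*
 * Longest interval, in nanoseconds, for which delta * mult still fits
 * in an unsigned 64-bit value.
 */
static uint64_t max_idle_ns(uint32_t mult, uint32_t shift)
{
	uint64_t max_cycles;

	assert(mult != 0 && shift < 64);
	max_cycles = UINT64_MAX / mult;		/* largest delta that does not wrap */
	return (max_cycles * mult) >> shift;
}
```

For a 1 GHz counter this gives about 2^40 ns (roughly 18 minutes) with
shift = 24, and about 2^44 ns (roughly 4.9 hours) with shift = 20. With
a signed product the limits are half of that.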

> For bare metal systems with narrow clocksources the whole issue is
> nonexistent. We can make the 128bit math depend on both a config switch and a
> static key, so bare metal will not have to take the burden.

Bare metal machines also sometimes run virtualization. I'm not sure
the two are usefully exclusive.

thanks
-john