Re: Skylake (XPS 13 9350) TSC is way off

From: John Stultz <john.stultz@linaro.org>
To: Andy Lutomirski <luto@kernel.org>
Cc: "Brown, Len" <len.brown@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	X86 ML <x86@kernel.org>,
	"Hunter, Adrian" <adrian.hunter@intel.com>
Subject: Re: Skylake (XPS 13 9350) TSC is way off
Date: Wed, 2 Dec 2015 15:55:07 -0800
Message-ID: <CALAqxLUdnvkHP9qKFmAkHfLy=5vz-GOTz9dJx2bDQMj9TqzZbA@mail.gmail.com>
In-Reply-To: <CALCETrUL=DdwhoPK2TaPwjMrpfVm8cn62ASfNjZfO3NcwQ7H8g@mail.gmail.com>

On Wed, Dec 2, 2015 at 3:42 PM, Andy Lutomirski <luto@kernel.org> wrote:
> On Wed, Dec 2, 2015 at 3:38 PM, John Stultz <john.stultz@linaro.org> wrote:
>> On Wed, Dec 2, 2015 at 3:25 PM, Andy Lutomirski <luto@kernel.org> wrote:
>>> In case it's at all useful, adjtimex -p says:
>>>
>>>          mode: 0
>>>        offset: 0
>>>     frequency: 135641
>>>      maxerror: 37498
>>>      esterror: 1532
>>>        status: 8192
>>> time_constant: 2
>>>     precision: 1
>>>     tolerance: 32768000
>>>          tick: 10000
>>>      raw time:  1449098317s 671243180us = 1449098317.671243180
>>>
>>> this suggests a rather small correction, so I really have no idea what
>>> "Adjusting tsc more than 11% (8039115 vs 7759462)" means.
>>>
>>> John, you wrote this code.  What does the error message mean?
>>
>> Basically the internal correction adjustments are getting pulled further
>> than they are supposed to (it's concerning since in some cases we push the
>> clocksource mult value to be quite large, and so making a large
>> adjustment could possibly cause an overflow).
>>
>> A while back I had intended to cap the max adjustment, but out of
>> caution I put in a warning instead to see how often this might occur.
>>
>> I've seen it reported sometimes while folks were running trinity or
>> under a VM (suggesting that due to system delays timekeeping
>> management may have been delayed and the internal time error had grown
>> quite far, so the internal correction was being somewhat aggressive).
>> Though more recently (3.17 era) we've changed the internal adjustment
>> code to try to be more conservative to avoid over-steering w/ NOHZ, so
>> I'd expect fewer of these.
>>
>
> The trouble for me is that it's not clear from the message what rate
> doesn't agree with what rate (kernel's unadjusted rate vs adjtimex's
> request?), and the units are incomprehensible.  If the issue is that
> adjtimex(2) has asked for X PPM of adjustment and X is greater than Y,
> could we display that directly?

Sorry, yes, it's the clocksource adjusted mult vs original mult values.

And it has to do with the internal correction logic (i.e., what we've
actually done), not the adjtimex requested adjustment (what we
were asked to do).
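
As a rough illustration of how to read the two numbers in that message
(a sketch only, not the kernel code): assuming the first number is the
currently adjusted mult and the second is the reference mult value it is
compared against, the relative difference can be worked out like this.
The helper name relative_diff_percent is made up for the example.

```python
def relative_diff_percent(adjusted_mult: int, reference_mult: int) -> float:
    """Return how far adjusted_mult is from reference_mult, in percent.

    Hypothetical helper for illustration; the kernel does its own
    integer comparison and does not compute a percentage like this.
    """
    return 100.0 * (adjusted_mult - reference_mult) / reference_mult


# The two values from the warning quoted earlier in the thread.
diff = relative_diff_percent(8039115, 7759462)
print(f"adjusted mult is {diff:.2f}% away from the reference value")
```

With the values from the report this comes out to roughly 3.6% between
the two printed numbers.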

>> On a hunch, are you running chrony instead of ntpd?
>
> Yes, this is indeed chrony.

Ok. I'll have to look closer. The last time that message came up it
was in a report of a bug that chrony uncovered with the internal
correction being too slow. I know chrony is much more aggressive
compared to ntpd in tweaking the freq value for the initial converging
correction at startup, so maybe that along with something else is
causing us to get out of spec.

From the values I see in the message, it looks like you're not
overflowing mult, and the smallish freq value from adjtimex you're
seeing now looks fine, so I suspect you're not actually hitting a
problem, but we're momentarily outside what the code is designed for.
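
To put the adjtimex numbers in readable units: the frequency and
tolerance fields are scaled ppm, i.e. parts per million with a 16-bit
binary fraction. A minimal sketch of the conversion, using the values
from the adjtimex -p output quoted above (the function name
scaled_ppm_to_ppm is just for this example):

```python
SCALED_PPM_SHIFT = 16  # adjtimex freq/tolerance carry a 16-bit fraction


def scaled_ppm_to_ppm(scaled: int) -> float:
    """Convert an adjtimex scaled-ppm value to plain ppm."""
    return scaled / float(1 << SCALED_PPM_SHIFT)


# Values taken from the adjtimex -p output in this thread.
freq_ppm = scaled_ppm_to_ppm(135641)
tolerance_ppm = scaled_ppm_to_ppm(32768000)
print(f"frequency: {freq_ppm:.3f} ppm, tolerance: {tolerance_ppm:.0f} ppm")
```

So the requested frequency correction is only about 2 ppm, against a
500 ppm tolerance, which is why it looks like a small correction.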

Still need to figure out why and fix that, of course. :)

thanks
-john