Re: [PATCH] x86/time: use correct (local) time stamp in constant-TSC calibration fast path

From: Joao Martins <joao.m.martins@oracle.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>,
	Wei Liu <wei.liu2@citrix.com>,
	Dario Faggioli <dario.faggioli@eu.citrix.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Subject: Re: [PATCH] x86/time: use correct (local) time stamp in constant-TSC calibration fast path
Date: Thu, 9 Jun 2016 16:00:20 +0100	[thread overview]
Message-ID: <57598484.1040300@oracle.com> (raw)
In-Reply-To: <575983BE02000078000F37BB@prv-mh.provo.novell.com>

On 06/09/2016 01:57 PM, Jan Beulich wrote:
>>>> On 09.06.16 at 14:11, <joao.m.martins@oracle.com> wrote:
>> On 06/09/2016 01:01 PM, Jan Beulich wrote:
>>> This looks like a copy and paste mistake in commit 1b6a99892d ("x86:
>>> Simpler time handling when TSC is constant across all power saving
>>> states"), responsible for occasional many-microsecond cross-CPU skew of
>>> what NOW() returns.
>>>
>>> Also improve the correlation between local TSC and stime stamps
>>> obtained at the end of the two calibration handlers: Compute the stime
>>> one from the TSC one, instead of doing another rdtsc() for that
>>> compuation.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>>> ---
>>> As to 4.7 inclusion: This of course looks like a pretty blatant mistake
>>> that has been there for many years (and hence many releases). There's
>>> certainly non-zero risk that I'm overlooking something here (despite
>>> Joao apparently having come to the same conclusion), so I can't really
>>> make up my mind on whether to request this patch to be put there right
>>> away, or rather having linger in -unstable for a while.
>>>
>> Initially I thought of this as a fix too, but then wouldn't having
>> t->stime_local_stamp be c->stime_local_stamp, render no use to the
>> platform timer reads done on calibration? Unless we would change
>> update_vcpu_system to use stime_master_stamp instead?
>> stime_master_stamp field isn't used anywhere other than the dom0 injected
>> cpu_frequency_change or when at boot seeding the cpu_time struct on
>> init_percpu_time (and the already mentioned use on local_time_calibration) ?
>> init_percpu_time also takes a different read of the platform timer per
>> cpu and could probably be inherited by a read done on the boot processor
>> and written on remaining CPUs, so that all would start from the same stamp.
>> IOW - this sounds like time we are turning stime to be totally TSC except
>> when initially seeding each cpu_time?
> 
> Did you also look at the "slow" path in local_time_calibration()? That
> does use the master stamp.
Yeah I am aware of its usage there - my comment was only on the fast path. I think
the latter is the most common case too.

> So in effect for the fast path the patch
> changes the situation from c->stime_local_stamp being effectively
> unused to c->stime_master_stamp being so. In the former case, if
> that really hadn't been a typo, deleting the write of that field from
> time_calibration_std_rendezvous() would have made sense, as
> get_s_time() certainly is more overhead than the simply memory
> read and write needed for keeping c->stime_master_stamp up to
> date (despite not being used).
I agree, but what I meant previously was more of a concern meaning: CPU 0 is doing an
expensive read_platform_time (e.g. HPET supposedly microseconds order, plus a
non-contented lock) to set stime_master_stamp that doesn't get used at all -
effectively not using the clocksource set initially at boot.

What if verify_tsc_reliability clears out X86_FEATURE_TSC_RELIABLE when failing
the warp test? The calibration function is set early on right after interrupts are
enabled and the time warp check later on when all CPUs are up. So on problematic
platforms it's possible that std_rendezvous is used with a constant TSC but still
deemed unreliable. We still keep incrementing deltas at roughly about the same time,
but in effect with this change the stime_local_stamp would be TSC-only based thus
leading to warps with an unreliable TSC? And there's also the CPU hotplug/onlining
case that we once discussed.

Joao

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel