From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH 01/13] x86/time.c: Use system time to calculate elapsed_nsec in tsc_get_info() Date: Fri, 9 Oct 2015 12:31:53 -0400 Message-ID: <5617EBF9.6040906@oracle.com> References: <1443424438-13404-1-git-send-email-haozhong.zhang@intel.com> <1443424438-13404-2-git-send-email-haozhong.zhang@intel.com> <5617801402000078000A992C@prv-mh.provo.novell.com> <5617C410.2090208@oracle.com> <20151009140005.GA10042@hzzhang-OptiPlex-9020.sh.intel.com> <5617F52C02000078000A9CCE@prv-mh.provo.novell.com> <5617E6D2.10608@oracle.com> <5618052202000078000A9D6F@prv-mh.provo.novell.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <5618052202000078000A9D6F@prv-mh.provo.novell.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: Jan Beulich , Haozhong Zhang Cc: Kevin Tian , Keir Fraser , Suravee Suthikulpanit , Andrew Cooper , xen-devel@lists.xen.org, Aravind Gopalakrishnan , Jun Nakajima List-Id: xen-devel@lists.xenproject.org On 10/09/2015 12:19 PM, Jan Beulich wrote: >>>> On 09.10.15 at 18:09, wrote: >> On 10/09/2015 11:11 AM, Jan Beulich wrote: >>>>>> On 09.10.15 at 16:00, wrote: >>>> On Fri, Oct 09, 2015 at 09:41:36AM -0400, Boris Ostrovsky wrote: >>>>> On 10/09/2015 02:51 AM, Jan Beulich wrote: >>>>>>>>> On 28.09.15 at 09:13, wrote: >>>>>>> When the TSC mode of a domain is TSC_MODE_DEFAULT and no TSC emulation >>>>>>> is used, the existing tsc_get_info() calculates elapsed_nsec by scaling >>>>>>> the host TSC with a ratio between guest TSC rate and >>>>>>> nanoseconds. However, the result will be incorrect if the guest TSC rate >>>>>>> differs from the host TSC rate. This patch fixes this problem by using >>>>>>> the system time as elapsed_nsec. >>>>>> For both this and patch 2, while at a first glance (and taking into >>>>>> account just the visible patch context) what you say seems to >>>>>> make sense, the explanation is far from sufficient namely when >>>>>> looking at the function as a whole. For one, effects on existing >>>>>> cases need to be explicitly described, in particular why SVM's TSC >>>>>> ratio code works without that change (or whether it has been >>>>>> broken all along, in which case these would become backporting >>>>>> candidates; input from SVM maintainers would be appreciated >>>>>> too). That may in particular mean being more specific about >>>>>> what is actually wrong with scaling the host TSC here (i.e. in >>>>>> which way both results differ), when supposedly that matches >>>>>> what the hardware does when TSC ratio is supported. >>>>> If elapsed_nsec is the time that guest has been running then how can >>>>> get_s_time(), which is system time, be the right answer here? But what >>>>> confuses me even more is that existing code is not doing that neither. >>>>> >>>>> Shouldn't elapsed_nsec be offset by d->arch.vtsc_offset on the get side? >>>>> I.e. >>>>> >>>>> *elapsed_nsec = get_s_time() - d->arch.vtsc_offset? >>>>> >>>> Yes, I should minus d->arch.vtsc_offset here. >>> In which case - afaict - the code becomes identical to that of the >>> TSC_MODE_ALWAYS_EMULATE case as well as the >>> TSC_MODE_DEFAULT w/ d->arch.vtsc true. Which seems quite >>> unlikely to be correct. >> *elapsed_nsec = *gtsc_khz = 0; ? Because we are effectively in >> TSC_MODE_NEVER. > How that? Talk here has been about TSC_MODE_DEFAULT... AFAIUI, TSC_MODE_DEFAULT is a shorthand for saying "I will let the hypervisor pick whether the guest will be in TSC_MODE_ALWAYS_EMULATE or TSC_MODE_NEVER". d->arch.vtsc is what ends up being internal implementation of user-provided mode (for the most parts; I think hvm_cpuid() being the only true exception --- and perhaps it needs to be looked at). So if we have d->arch.vtsc=0 (which is the case we are talking about here) then we are really in NEVER mode -boris > >> That can't be right... > Why not? tsc_set_info() doesn't care about any of its other input > values when that mode is in effect. > > Jan >