From mboxrd@z Thu Jan 1 00:00:00 1970 From: Boris Ostrovsky Subject: Re: [PATCH 01/13] x86/time.c: Use system time to calculate elapsed_nsec in tsc_get_info() Date: Fri, 9 Oct 2015 14:59:54 -0400 Message-ID: <56180EAA.6050405@oracle.com> References: <1443424438-13404-1-git-send-email-haozhong.zhang@intel.com> <1443424438-13404-2-git-send-email-haozhong.zhang@intel.com> <5617801402000078000A992C@prv-mh.provo.novell.com> <5617C410.2090208@oracle.com> <20151009140005.GA10042@hzzhang-OptiPlex-9020.sh.intel.com> <5617F52C02000078000A9CCE@prv-mh.provo.novell.com> <5617E6D2.10608@oracle.com> <5618052202000078000A9D6F@prv-mh.provo.novell.com> <5617EBF9.6040906@oracle.com> <20151009165112.GB23023@hzzhang-OptiPlex-9020.sh.intel.com> Mime-Version: 1.0 Content-Type: text/plain; charset="us-ascii"; Format="flowed" Content-Transfer-Encoding: 7bit Return-path: In-Reply-To: <20151009165112.GB23023@hzzhang-OptiPlex-9020.sh.intel.com> List-Unsubscribe: , List-Post: List-Help: List-Subscribe: , Sender: xen-devel-bounces@lists.xen.org Errors-To: xen-devel-bounces@lists.xen.org To: xen-devel@lists.xen.org List-Id: xen-devel@lists.xenproject.org On 10/09/2015 12:51 PM, Haozhong Zhang wrote: > On Fri, Oct 09, 2015 at 12:31:53PM -0400, Boris Ostrovsky wrote: >> On 10/09/2015 12:19 PM, Jan Beulich wrote: >>>>>> On 09.10.15 at 18:09, wrote: >>>> On 10/09/2015 11:11 AM, Jan Beulich wrote: >>>>>>>> On 09.10.15 at 16:00, wrote: >>>>>> On Fri, Oct 09, 2015 at 09:41:36AM -0400, Boris Ostrovsky wrote: >>>>>>> On 10/09/2015 02:51 AM, Jan Beulich wrote: >>>>>>>>>>> On 28.09.15 at 09:13, wrote: >>>>>>>>> When the TSC mode of a domain is TSC_MODE_DEFAULT and no TSC emulation >>>>>>>>> is used, the existing tsc_get_info() calculates elapsed_nsec by scaling >>>>>>>>> the host TSC with a ratio between guest TSC rate and >>>>>>>>> nanoseconds. However, the result will be incorrect if the guest TSC rate >>>>>>>>> differs from the host TSC rate. This patch fixes this problem by using >>>>>>>>> the system time as elapsed_nsec. >>>>>>>> For both this and patch 2, while at a first glance (and taking into >>>>>>>> account just the visible patch context) what you say seems to >>>>>>>> make sense, the explanation is far from sufficient namely when >>>>>>>> looking at the function as a whole. For one, effects on existing >>>>>>>> cases need to be explicitly described, in particular why SVM's TSC >>>>>>>> ratio code works without that change (or whether it has been >>>>>>>> broken all along, in which case these would become backporting >>>>>>>> candidates; input from SVM maintainers would be appreciated >>>>>>>> too). That may in particular mean being more specific about >>>>>>>> what is actually wrong with scaling the host TSC here (i.e. in >>>>>>>> which way both results differ), when supposedly that matches >>>>>>>> what the hardware does when TSC ratio is supported. >>>>>>> If elapsed_nsec is the time that guest has been running then how can >>>>>>> get_s_time(), which is system time, be the right answer here? But what >>>>>>> confuses me even more is that existing code is not doing that neither. >>>>>>> >>>>>>> Shouldn't elapsed_nsec be offset by d->arch.vtsc_offset on the get side? >>>>>>> I.e. >>>>>>> >>>>>>> *elapsed_nsec = get_s_time() - d->arch.vtsc_offset? >>>>>>> >>>>>> Yes, I should minus d->arch.vtsc_offset here. >>>>> In which case - afaict - the code becomes identical to that of the >>>>> TSC_MODE_ALWAYS_EMULATE case as well as the >>>>> TSC_MODE_DEFAULT w/ d->arch.vtsc true. Which seems quite >>>>> unlikely to be correct. >>>> *elapsed_nsec = *gtsc_khz = 0; ? Because we are effectively in >>>> TSC_MODE_NEVER. >>> How that? Talk here has been about TSC_MODE_DEFAULT... >> AFAIUI, TSC_MODE_DEFAULT is a shorthand for saying "I will let the >> hypervisor pick whether the guest will be in TSC_MODE_ALWAYS_EMULATE or >> TSC_MODE_NEVER". d->arch.vtsc is what ends up being internal implementation >> of user-provided mode (for the most parts; I think hvm_cpuid() being the >> only true exception --- and perhaps it needs to be looked at). >> >> So if we have d->arch.vtsc=0 (which is the case we are talking about here) >> then we are really in NEVER mode >> > Not quite understand this. Is tsc_set_info() the only place to set > d->arch.tsc_mode ? Yes. > Though it may decide d->arch.vtsc should be 1, it > still sets d->arch.tsc_mode to the user provided TSC mode for a > non-pvh domain. And then in tsc_get_info(), it should never fall into > TSC_MODE_NEVER_EMULATE branch if d->arch.tsc_mode is not. I was trying to say that TSC behavior in current incarnation is equivalent to _NEVER if d->arch.vtsc is 0. But when we call tsc_get_info() we can not handle it out of _NEVER case (because, as you pointed out, d->arch.vtsc may change after migration). And we don't. -boris > > - Haozhong > >> -boris >> >>>> That can't be right... >>> Why not? tsc_set_info() doesn't care about any of its other input >>> values when that mode is in effect. >>> >>> Jan >>> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel