All of lore.kernel.org
 help / color / mirror / Atom feed
From: Haozhong Zhang <haozhong.zhang@intel.com>
To: Jan Beulich <JBeulich@suse.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Kevin Tian <kevin.tian@intel.com>, Keir Fraser <keir@xen.org>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	xen-devel@lists.xen.org,
	Aravind Gopalakrishnan <Aravind.Gopalakrishnan@amd.com>,
	Jun Nakajima <jun.nakajima@intel.com>
Subject: Re: [PATCH 01/13] x86/time.c: Use system time to calculate elapsed_nsec in tsc_get_info()
Date: Wed, 14 Oct 2015 10:45:04 +0800	[thread overview]
Message-ID: <20151014024504.GA3531@hzzhang-OptiPlex-9020.sh.intel.com> (raw)
In-Reply-To: <5617801402000078000A992C@prv-mh.provo.novell.com>

On Fri, Oct 09, 2015 at 12:51:32AM -0600, Jan Beulich wrote:
> >>> On 28.09.15 at 09:13, <haozhong.zhang@intel.com> wrote:
> > When the TSC mode of a domain is TSC_MODE_DEFAULT and no TSC emulation
> > is used, the existing tsc_get_info() calculates elapsed_nsec by scaling
> > the host TSC with a ratio between guest TSC rate and
> > nanoseconds. However, the result will be incorrect if the guest TSC rate
> > differs from the host TSC rate. This patch fixes this problem by using
> > the system time as elapsed_nsec.
> 
> For both this and patch 2, while at a first glance (and taking into
> account just the visible patch context) what you say seems to
> make sense, the explanation is far from sufficient namely when
> looking at the function as a whole. For one, effects on existing
> cases need to be explicitly described, in particular why SVM's TSC
> ratio code works without that change (or whether it has been
> broken all along, in which case these would become backporting
> candidates; input from SVM maintainers would be appreciated
> too). That may in particular mean being more specific about
> what is actually wrong with scaling the host TSC here (i.e. in
> which way both results differ), when supposedly that matches
> what the hardware does when TSC ratio is supported.
>

I just found that patch 1 is in fact not necessary for supporting VMX
TSC scaling/SVM TSC ratio, because

 1. VMX TSC scaling and SVM TSC ratio are only used for HVM domains.
 
 2. The value of elapsed_nsec, which is modified by patch 1, is used
    to compute d->arch.vtsc_offset by tsc_set_info() for domains using
    TSC_MODE_[DEFAULT|ALWAYS_EMULATE].
    
 3. d->arch.vtsc_offset is then used in three places:
   - gtime_to_gtsc() and gtsc_to_gtime()
    In these two functions, d->arch.vtsc_offset does not take effect
    for HVM domains.
   - cpuid_time_leaf()
    It's only used for domains using TSC_MODE_PVRDTSCP.

Therefore, I think patch 1 can be removed.


However, patch 2 is still necessary. The existing tsc_get_info() uses
the host TSC frequency as the guest TSC frequency for a domain in
TSC_MODE_DEFAULT, which could cause errors in the following example:
 - A domain d using TSC_MODE_DEFAULT is created on host A, then
   migrated to host B, and finally migrated to host C.
 - The host TSC frequencies of three hosts are f_a, f_b and f_c
   respectively and f_a != f_b and f_b != f_c.
 - Both host B and host C support TSC scaling (either VMX TSC scaling
   or SVM TSC ratio).

In above example w/o patch 2,
 1. Initially, d->arch.tsc_khz == f_a.
 
 2. In the first migration, tsc_get_info() on host A passes f_a as the
    guest TSC frequency to tsc_set_info() on host B, so that after the
    migration it's still that d->arch.tsc_khz == f_a. As TSC scaling
    takes effect, guest programs can still observe TSC in frequency f_a.
    So far so good.
    
 3. However, in the second migration, f_b (!= f_a) is passed as the
    guest TSC frequency to tsc_set_info() on host C so that after the
    migration d->arch.tsc_khz is not f_a any more. As TSC scaling
    takes effect on host C as well, the TSC frequency observed by
    guest programs changes and may break some TSC sensitive programs

    At least in my test for VMX TSC scaling, guest Linux kernel would
    complain tsc clocksource is unstable. SVM TSC ratio should have
    the same problem.

W/ patch 2, tsc_get_info() in the above case always gets the guest TSC
frequency from d->arch.tsc_khz. Then in the first migration above, it
behaves the same as before, while in the second migration it can
maintain the guest TSC frequency correctly.

> Then a reason needs to be given why the similar logic in the
> PVRDTSCP case does not also get adjusted.
>

It's just because I didn't consider about PVRDTSCP case. I will add it
in the next version.

- Haozhong

> Plus, looking at the respective code in tsc_set_info(), I'm
> getting the impression that what you're trying to do is not in line
> with what is intended so far: Especially the comment there
> suggests that the intention is for the guest TSC to be made
> match the host one. Considering migration this indeed looks
> suspicious, but then that would need changing too.
> 
> Jan
> 

  parent reply	other threads:[~2015-10-14  2:45 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-28  7:13 [PATCH 00/13] Add VMX TSC scaling support Haozhong Zhang
2015-09-28  7:13 ` [PATCH 01/13] x86/time.c: Use system time to calculate elapsed_nsec in tsc_get_info() Haozhong Zhang
2015-10-09  6:51   ` Jan Beulich
2015-10-09 13:41     ` Boris Ostrovsky
2015-10-09 14:00       ` Haozhong Zhang
2015-10-09 15:11         ` Jan Beulich
2015-10-09 16:09           ` Boris Ostrovsky
2015-10-09 16:19             ` Jan Beulich
2015-10-09 16:31               ` Boris Ostrovsky
2015-10-09 16:51                 ` Haozhong Zhang
2015-10-09 18:59                   ` Boris Ostrovsky
2015-10-09 14:39       ` Jan Beulich
2015-10-09 15:37         ` Boris Ostrovsky
2015-10-09 16:39           ` Haozhong Zhang
2015-10-09 16:44             ` Boris Ostrovsky
2015-10-09 14:35     ` Haozhong Zhang
2015-10-09 14:43       ` Jan Beulich
2015-10-09 15:56         ` Boris Ostrovsky
2015-10-14  2:45     ` Haozhong Zhang [this message]
2015-10-14  9:40       ` Jan Beulich
2015-10-14 10:00         ` Haozhong Zhang
2015-10-14 10:20           ` Jan Beulich
2015-10-14 10:24             ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 02/13] x86/time.c: Get the correct guest TSC rate " Haozhong Zhang
2015-09-28  7:13 ` [PATCH 03/13] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
2015-10-22 12:53   ` Jan Beulich
2015-10-22 14:40     ` Haozhong Zhang
2015-10-22 14:51       ` Jan Beulich
2015-10-22 15:57         ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 04/13] x86/hvm: Setup " Haozhong Zhang
2015-10-22 13:13   ` Jan Beulich
2015-10-22 15:55     ` Haozhong Zhang
2015-10-22 16:05       ` Jan Beulich
2015-10-22 16:39         ` Haozhong Zhang
2015-10-23  7:44     ` Haozhong Zhang
2015-10-23  7:59       ` Jan Beulich
2015-10-23  8:18         ` Haozhong Zhang
2015-10-23  8:31           ` Jan Beulich
2015-10-23  8:40             ` Haozhong Zhang
2015-10-23  9:18               ` Jan Beulich
2015-09-28  7:13 ` [PATCH 05/13] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
2015-10-22 13:52   ` Jan Beulich
2015-10-23  0:49     ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 06/13] x86/hvm: Scale host TSC when setting/getting guest TSC Haozhong Zhang
2015-10-22 14:17   ` Jan Beulich
2015-10-22 15:44     ` Boris Ostrovsky
2015-10-22 16:23       ` Haozhong Zhang
2015-10-27 20:16       ` Aravind Gopalakrishnan
2015-10-28  1:51         ` Haozhong Zhang
2015-11-09  7:43         ` Haozhong Zhang
2015-11-12 13:50           ` George Dunlap
2015-10-22 16:03     ` Haozhong Zhang
2015-10-27  1:54     ` Haozhong Zhang
2015-10-27  8:15       ` Jan Beulich
2015-10-27  8:25         ` Haozhong Zhang
2015-10-27  8:44     ` Haozhong Zhang
2015-10-27 13:10       ` Boris Ostrovsky
2015-10-27 13:55         ` Boris Ostrovsky
2015-10-27 16:13           ` haozhong.zhang
2015-10-27 16:13         ` haozhong.zhang
2015-09-28  7:13 ` [PATCH 07/13] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
2015-10-22 14:54   ` Jan Beulich
2015-09-28  7:13 ` [PATCH 08/13] x86/hvm: Detect TSC scaling through hvm_funcs in tsc_set_info() Haozhong Zhang
2015-10-22 15:01   ` Jan Beulich
2015-09-28  7:13 ` [PATCH 09/13] x86/time.c: Scale host TSC in pvclock properly Haozhong Zhang
2015-09-28 16:36   ` Boris Ostrovsky
2015-09-29  0:19     ` Haozhong Zhang
2015-10-22 15:50   ` Boris Ostrovsky
2015-10-22 16:44     ` Haozhong Zhang
2015-10-22 19:15       ` Boris Ostrovsky
2015-09-28  7:13 ` [PATCH 10/13] vmx: Detect and initialize VMX RDTSC(P) scaling Haozhong Zhang
2015-10-27 13:19   ` Jan Beulich
2015-10-27 16:17     ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 11/13] vmx: Use scaled host TSC to calculate TSC offset Haozhong Zhang
2015-10-22 15:55   ` Boris Ostrovsky
2015-10-22 17:12     ` Haozhong Zhang
2015-10-22 19:19       ` Boris Ostrovsky
2015-10-23  0:52         ` Haozhong Zhang
2015-10-27 13:29   ` Jan Beulich
2015-10-27 16:21     ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 12/13] vmx: Add a call-back to apply TSC scaling ratio to hardware Haozhong Zhang
2015-09-28 16:02   ` Boris Ostrovsky
2015-09-29  1:07     ` Haozhong Zhang
2015-09-29  9:33       ` Andrew Cooper
2015-09-29 10:02         ` Haozhong Zhang
2015-09-29 10:25           ` Andrew Cooper
2015-09-29 13:59             ` Haozhong Zhang
2015-10-27 13:33   ` Jan Beulich
2015-10-28  2:41     ` Haozhong Zhang
2015-09-28  7:13 ` [PATCH 13/13] tools/libxl: Add 'vtsc_khz' option to set guest TSC rate Haozhong Zhang
2015-09-28 11:47   ` Julien Grall
2015-09-28 12:11     ` Haozhong Zhang
2015-09-28 14:19   ` Wei Liu
2015-09-29  0:40     ` Haozhong Zhang
2015-09-29  9:20       ` Wei Liu
2015-09-29  9:50         ` Haozhong Zhang
2015-09-29 10:24           ` Julien Grall
2015-09-29 10:07       ` Ian Campbell
2015-09-29 10:33         ` Wei Liu
2015-09-29 12:57         ` Haozhong Zhang
2015-09-29 10:04   ` Ian Campbell
2015-09-29 10:13     ` Haozhong Zhang
2015-09-29 10:24       ` Andrew Cooper
2015-09-29 10:28         ` Ian Campbell
2015-09-29 10:31           ` Andrew Cooper
2015-09-29 13:53           ` Haozhong Zhang
2015-09-29 13:56             ` Andrew Cooper
2015-09-29 14:01               ` Haozhong Zhang
2015-09-29 14:37                 ` Ian Campbell
2015-09-29 15:16                   ` Haozhong Zhang
2015-09-28 10:51 ` [PATCH 00/13] Add VMX TSC scaling support Andrew Cooper
2015-09-28 13:48   ` Boris Ostrovsky
2015-11-22 17:54 ` Haozhong Zhang
2015-11-23 15:37   ` Boris Ostrovsky
2015-11-24 13:05     ` Haozhong Zhang
2015-11-24 14:19       ` Boris Ostrovsky
2015-11-24 14:25       ` Haozhong Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151014024504.GA3531@hzzhang-OptiPlex-9020.sh.intel.com \
    --to=haozhong.zhang@intel.com \
    --cc=Aravind.Gopalakrishnan@amd.com \
    --cc=JBeulich@suse.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=jun.nakajima@intel.com \
    --cc=keir@xen.org \
    --cc=kevin.tian@intel.com \
    --cc=suravee.suthikulpanit@amd.com \
    --cc=xen-devel@lists.xen.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.