All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/14] Add VMX TSC scaling support
@ 2015-12-06 20:58 Haozhong Zhang
  2015-12-06 20:58 ` [PATCH v2 01/14] svm: Fix incorrect TSC scaling Haozhong Zhang
                   ` (14 more replies)
  0 siblings, 15 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patchset adds support for VMX TSC scaling feature which is
available on Intel Skylake Server CPU. The specification of VMX TSC
scaling can be found at
http://www.intel.com/content/www/us/en/processors/timestamp-counter-scaling-virtualization-white-paper.html

VMX TSC scaling allows guest TSC which is read by guest rdtsc(p)
instructions increases in a rate that is customized by the hypervisor
and can be different than the host TSC frequency. Basically, VMX TSC
scaling adds a 64-bit field called TSC multiplier in VMCS so that, if
VMX TSC scaling is enabled, TSC read by guest rdtsc(p) instructions
will be calculated by the following formula:

  guest EDX:EAX = (Host TSC * TSC multiplier) >> 48 + VMX TSC Offset

where, Host TSC = Host MSR_IA32_TSC + Host MSR_IA32_TSC_ADJUST.

If the destination host supports VMX TSC scaling, this patchset allows
guest programs in a HVM container in the default TSC mode or PVRDTSCP
(native_paravirt) TSC mode to observe the same TSC frequency across
the migration.

This patchset is composed of following four parts.
 1. Patch 1 - 6 fix bugs in the current TSC scaling implementation
    on SVM TSC ratio.
 2. Patch 7 - 12 abstract the possible common code between VMX TSC
    scaling and SVM TSC ratio.
 3. Patch 13 adds the arch-specific code to support VMX TSC scaling.
 4. Patch 14 updates corresponding documents.

Compared to v1, some patches in v2 are reordered or merged and some
new patches are added in v2 as well. Relations of patches between v1
and v2 are listed below.
  v2                 Relation to v1
 -----------------------------------------
  Patch 1-3          New
  Patch 4            Update of Patch 2
  Patch 5            Reorder of Patch 6
  Patch 6            Patch 9
  Patch 7            New
  Patch 8-10         Update of Patch 3-5
  Patch 11-12        Patch 7-8
  Patch 13           Merge of Patch 10-12
  Patch 14           New

Changes in v2:
 * Remove unnecessary v1 patch 1&13.
 * Add and move all bug-fix patches to the beginning of this series.
   (Patch 1 - 6)
 * Update changes in tsc_set_info() and tsc_get_info() to make both
   functions consistent with each other. (Patch 2 - 4)
 * Move a part of scaling logic out of [vmx|svm]_set_tsc_offset().
   (Patch 7)
 * Remove redundant hvm_funcs.tsc_scaling_ratio_rsvd. (Patch 8)
 * Reimplement functions that calculate TSC ratio and scale TSC.
   (Patch 9&10)
 * Merge setting VMX TSC multiplier into patch 13.
 * Move initialing tsc_scaling_ratio in VMX ahead to
   vmx_vcpu_initialise() so as to make construct_vmcs() naturally
   use this field instead of a constant. (Patch 13)
 * Update documents related to tsc_mode.
 * Other code cleanup and style fixes.

Haozhong Zhang (14):
  svm: Fix incorrect TSC scaling
  x86/time.c: Fix domain type check in tsc_set_info()
  x86/time.c: Use correct guest TSC frequency in tsc_set_info()
  x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  x86/hvm: Scale host TSC when setting/getting guest TSC
  x86/time.c: Scale host TSC in pvclock properly
  svm: Remove redundant TSC scaling in svm_set_tsc_offset()
  x86/hvm: Collect information of TSC scaling ratio
  x86/hvm: Setup TSC scaling ratio
  x86/hvm: Replace architecture TSC scaling by a common function
  x86/hvm: Move saving/loading vcpu's TSC to common code
  x86/hvm: Detect TSC scaling through hvm_funcs
  vmx: Add VMX RDTSC(P) scaling support
  docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt

 docs/man/xl.cfg.pod.5              |  15 +++++-
 docs/misc/tscmode.txt              |  14 ++++++
 xen/arch/x86/hvm/hvm.c             |  65 ++++++++++++++++++++----
 xen/arch/x86/hvm/svm/svm.c         |  72 +++++++++++++++-----------
 xen/arch/x86/hvm/vmx/vmcs.c        |  12 +++--
 xen/arch/x86/hvm/vmx/vmx.c         |  20 ++++++--
 xen/arch/x86/time.c                |  57 ++++++++++++++++-----
 xen/include/asm-x86/hvm/hvm.h      |  18 +++++++
 xen/include/asm-x86/hvm/svm/svm.h  |   3 --
 xen/include/asm-x86/hvm/vcpu.h     |   2 +
 xen/include/asm-x86/hvm/vmx/vmcs.h |   7 +++
 xen/include/asm-x86/math64.h       | 100 +++++++++++++++++++++++++++++++++++++
 12 files changed, 322 insertions(+), 63 deletions(-)
 create mode 100644 xen/include/asm-x86/math64.h

-- 
2.6.3

^ permalink raw reply	[flat|nested] 61+ messages in thread

* [PATCH v2 01/14] svm: Fix incorrect TSC scaling
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 16:49   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info() Haozhong Zhang
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

SVM TSC ratio is incorrectly used in the current
svm_get_tsc_offset(). This patch replaces the scaling logic in
svm_get_tsc_offset() with a correct implementation.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/svm/svm.c | 41 ++++++++++++++++++++++++++++++++---------
 1 file changed, 32 insertions(+), 9 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index e2db4db..1e6529e 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -773,18 +773,41 @@ static int svm_get_guest_pat(struct vcpu *v, u64 *gpat)
     return 1;
 }
 
-static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
-    uint64_t ratio)
+static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
 {
-    uint64_t offset;
+    uint64_t mult, frac, scaled_host_tsc;
+
+    if ( ratio == DEFAULT_TSC_RATIO )
+        return host_tsc;
+
+    /*
+     * Suppose the most significant 32 bits of host_tsc and ratio are
+     * tsc_h and mult, and the least 32 bits of them are tsc_l and frac,
+     * then
+     *     host_tsc * ratio * 2^-32
+     *     = host_tsc * (mult * 2^32 + frac) * 2^-32
+     *     = host_tsc * mult + (tsc_h * 2^32 + tsc_l) * frac * 2^-32
+     *     = host_tsc * mult + tsc_h * frac + ((tsc_l * frac) >> 32)
+     *
+     * Multiplications in the last two terms are between 32-bit integers,
+     * so both of them can fit in 64-bit integers.
+     *
+     * Because mult is usually less than 10 in practice, it's very rare
+     * that host_tsc * mult can overflow a 64-bit integer.
+     */
+    mult = ratio >> 32;
+    frac = ratio & ((1ULL << 32) - 1);
+    scaled_host_tsc  = host_tsc * mult;
+    scaled_host_tsc += (host_tsc >> 32) * frac;
+    scaled_host_tsc += ((host_tsc & ((1ULL << 32) - 1)) * frac) >> 32;
 
-    if (ratio == DEFAULT_TSC_RATIO)
-        return guest_tsc - host_tsc;
+    return scaled_host_tsc;
+}
 
-    /* calculate hi,lo parts in 64bits to prevent overflow */
-    offset = (((host_tsc >> 32U) * (ratio >> 32U)) << 32U) +
-          (host_tsc & 0xffffffffULL) * (ratio & 0xffffffffULL);
-    return guest_tsc - offset;
+static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
+    uint64_t ratio)
+{
+    return guest_tsc - scale_tsc(host_tsc, ratio);
 }
 
 static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info()
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
  2015-12-06 20:58 ` [PATCH v2 01/14] svm: Fix incorrect TSC scaling Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 16:49   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency " Haozhong Zhang
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

Replace is_hvm_domain() in tsc_set_info() by has_hvm_container_domain()
to keep consistent with other domain type checks in tsc_set_info().

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/time.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 30d52c4..b5223cf 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1878,7 +1878,7 @@ void tsc_set_info(struct domain *d,
         break;
     }
     d->arch.incarnation = incarnation + 1;
-    if ( is_hvm_domain(d) )
+    if ( has_hvm_container_domain(d) )
     {
         hvm_set_rdtsc_exiting(d, d->arch.vtsc);
         if ( d->vcpu && d->vcpu[0] && incarnation == 0 )
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency in tsc_set_info()
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
  2015-12-06 20:58 ` [PATCH v2 01/14] svm: Fix incorrect TSC scaling Haozhong Zhang
  2015-12-06 20:58 ` [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info() Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 16:57   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info() Haozhong Zhang
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

When TSC_MODE_PVRDTSCP is used for a HVM container and TSC scaling is
available, use the non-zero value of argument gtsc_khz of tsc_set_info()
as the guest TSC frequency rather than using the host TSC
frequency. Otherwise, TSC scaling will not be able get the correct ratio
between the host and guest TSC frequencies.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/time.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index b5223cf..1091e69 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1803,6 +1803,8 @@ void tsc_set_info(struct domain *d,
                   uint32_t tsc_mode, uint64_t elapsed_nsec,
                   uint32_t gtsc_khz, uint32_t incarnation)
 {
+    bool_t enable_tsc_scaling;
+
     if ( is_idle_domain(d) || is_hardware_domain(d) )
     {
         d->arch.vtsc = 0;
@@ -1864,7 +1866,9 @@ void tsc_set_info(struct domain *d,
     case TSC_MODE_PVRDTSCP:
         d->arch.vtsc = !boot_cpu_has(X86_FEATURE_RDTSCP) ||
                        !host_tsc_is_safe();
-        d->arch.tsc_khz = cpu_khz;
+        enable_tsc_scaling = has_hvm_container_domain(d) &&
+                             cpu_has_tsc_ratio && d->arch.vtsc;
+        d->arch.tsc_khz = (enable_tsc_scaling && gtsc_khz) ? gtsc_khz : cpu_khz;
         set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
         d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);
         if ( d->arch.vtsc )
@@ -1872,7 +1876,10 @@ void tsc_set_info(struct domain *d,
         else {
             /* when using native TSC, offset is nsec relative to power-on
              * of physical machine */
-            d->arch.vtsc_offset = scale_delta(rdtsc(), &d->arch.vtsc_to_ns) -
+            struct time_scale *scale = enable_tsc_scaling ?
+                                       &this_cpu(cpu_time).tsc_scale :
+                                       &d->arch.vtsc_to_ns;
+            d->arch.vtsc_offset = scale_delta(rdtsc(), scale) -
                                   elapsed_nsec;
         }
         break;
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (2 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency " Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 17:53   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC Haozhong Zhang
                   ` (10 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

When the TSC mode of a HVM container is TSC_MODE_DEFAULT or
TSC_MODE_PVRDTSCP and no TSC emulation is used, the existing
tsc_get_info() uses the host TSC frequency (cpu_khz) as the guest TSC
frequency. However, tsc_set_info() may set the guest TSC frequency to a
value different than the host. In order to keep consistent to
tsc_set_info(), this patch makes tsc_get_info() use the value set by
tsc_set_info() as the guest TSC frequency.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/time.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 1091e69..95df4f1 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1749,6 +1749,9 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
                   uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
                   uint32_t *incarnation)
 {
+    bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
+                                cpu_has_tsc_ratio;
+
     *incarnation = d->arch.incarnation;
     *tsc_mode = d->arch.tsc_mode;
 
@@ -1769,7 +1772,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
         }
         tsc = rdtsc();
         *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns);
-        *gtsc_khz = cpu_khz;
+        *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : cpu_khz;
         break;
     case TSC_MODE_PVRDTSCP:
         if ( d->arch.vtsc )
@@ -1779,10 +1782,13 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
         }
         else
         {
+            struct time_scale *scale = enable_tsc_scaling ?
+                                       &this_cpu(cpu_time).tsc_scale :
+                                       &d->arch.vtsc_to_ns;
             tsc = rdtsc();
-            *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns) -
+            *elapsed_nsec = scale_delta(tsc, scale) -
                             d->arch.vtsc_offset;
-            *gtsc_khz = 0; /* ignored by tsc_set_info */
+            *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : 0;
         }
         break;
     }
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (3 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info() Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 17:54   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly Haozhong Zhang
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

The existing hvm_[set|get]_guest_tsc_fixed() calculate the guest TSC by
adding the TSC offset to the host TSC. When the TSC scaling is enabled,
the host TSC should be scaled first. This patch adds the scaling logic
to those two functions.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/hvm.c        | 17 +++++++----------
 xen/arch/x86/hvm/svm/svm.c    | 12 ++++++++++++
 xen/include/asm-x86/hvm/hvm.h |  2 ++
 3 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 6c2b512..0e63c33 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -60,6 +60,7 @@
 #include <asm/hvm/nestedhvm.h>
 #include <asm/hvm/event.h>
 #include <asm/hvm/vmx/vmx.h>
+#include <asm/hvm/svm/svm.h> /* for cpu_has_tsc_ratio */
 #include <asm/altp2m.h>
 #include <asm/mtrr.h>
 #include <asm/apic.h>
@@ -310,13 +311,11 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
         tsc = hvm_get_guest_time_fixed(v, at_tsc);
         tsc = gtime_to_gtsc(v->domain, tsc);
     }
-    else if ( at_tsc )
-    {
-        tsc = at_tsc;
-    }
     else
     {
-        tsc = rdtsc();
+        tsc = at_tsc ?: rdtsc();
+        if ( cpu_has_tsc_ratio )
+            tsc = hvm_funcs.scale_tsc(v, tsc);
     }
 
     delta_tsc = guest_tsc - tsc;
@@ -344,13 +343,11 @@ u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
         tsc = hvm_get_guest_time_fixed(v, at_tsc);
         tsc = gtime_to_gtsc(v->domain, tsc);
     }
-    else if ( at_tsc )
-    {
-        tsc = at_tsc;
-    }
     else
     {
-        tsc = rdtsc();
+        tsc = at_tsc ?: rdtsc();
+        if ( cpu_has_tsc_ratio )
+            tsc = hvm_funcs.scale_tsc(v, tsc);
     }
 
     return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 1e6529e..9d5f2c3 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -804,6 +804,16 @@ static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
     return scaled_host_tsc;
 }
 
+static uint64_t svm_scale_tsc(struct vcpu *v, uint64_t tsc)
+{
+    struct domain *d = v->domain;
+
+    if ( !cpu_has_tsc_ratio || d->arch.vtsc )
+        return tsc;
+
+    return scale_tsc(tsc, vcpu_tsc_ratio(v));
+}
+
 static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
     uint64_t ratio)
 {
@@ -2270,6 +2280,8 @@ static struct hvm_function_table __initdata svm_function_table = {
     .nhvm_vmcx_hap_enabled = nsvm_vmcb_hap_enabled,
     .nhvm_intr_blocked = nsvm_intr_blocked,
     .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
+
+    .scale_tsc            = svm_scale_tsc,
 };
 
 void svm_vmexit_handler(struct cpu_user_regs *regs)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index f80e143..aba63ab 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -212,6 +212,8 @@ struct hvm_function_table {
     void (*altp2m_vcpu_update_vmfunc_ve)(struct vcpu *v);
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
     int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
+
+    uint64_t (*scale_tsc)(struct vcpu *v, uint64_t tsc);
 };
 
 extern struct hvm_function_table hvm_funcs;
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (4 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:14   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset() Haozhong Zhang
                   ` (8 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patch makes the pvclock return the scaled host TSC and
corresponding scaling parameters to HVM domains if guest TSC is not
emulated and TSC scaling is enabled.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/time.c | 16 ++++++++++++----
 1 file changed, 12 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 95df4f1..732d1e9 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
     }
     else
     {
-        tsc_stamp = t->local_tsc_stamp;
-
-        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
-        _u.tsc_shift         = (s8)t->tsc_scale.shift;
+        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
+        {
+            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
+            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
+            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
+        }
+        else
+        {
+            tsc_stamp            = t->local_tsc_stamp;
+            _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
+            _u.tsc_shift         = (s8)t->tsc_scale.shift;
+        }
     }
 
     _u.tsc_timestamp = tsc_stamp;
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset()
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (5 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:25   ` Boris Ostrovsky
  2015-12-06 20:58 ` [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

Now every caller passes an already scaled offset to
svm_set_tsc_offset(), so it's not necessary to recalculate a scaled TSC
offset in svm_set_tsc_offset().

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/svm/svm.c | 15 ++-------------
 1 file changed, 2 insertions(+), 13 deletions(-)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 9d5f2c3..3d22dfa 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -826,19 +826,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
     struct vmcb_struct *n1vmcb, *n2vmcb;
     uint64_t n2_tsc_offset = 0;
     struct domain *d = v->domain;
-    uint64_t host_tsc, guest_tsc;
-
-    guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
-
-    /* Re-adjust the offset value when TSC_RATIO is available */
-    if ( cpu_has_tsc_ratio && !d->arch.vtsc )
-    {
-        if ( at_tsc )
-            host_tsc = at_tsc;
-        else
-            host_tsc = rdtsc();
-        offset = svm_get_tsc_offset(host_tsc, guest_tsc, vcpu_tsc_ratio(v));
-    }
+    uint64_t guest_tsc;
 
     if ( !nestedhvm_enabled(d) ) {
         vmcb_set_tsc_offset(vmcb, offset);
@@ -854,6 +842,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
         n2_tsc_offset = vmcb_get_tsc_offset(n2vmcb) -
             vmcb_get_tsc_offset(n1vmcb);
         if ( svm->ns_tscratio != DEFAULT_TSC_RATIO ) {
+            guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
             n2_tsc_offset = svm_get_tsc_offset(guest_tsc,
                 guest_tsc + n2_tsc_offset, svm->ns_tscratio);
         }
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (6 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset() Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:31   ` Boris Ostrovsky
  2015-12-10 10:19   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 09/14] x86/hvm: Setup " Haozhong Zhang
                   ` (6 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

Both VMX TSC scaling and SVM TSC ratio use the 64-bit TSC scaling ratio,
but the number of fractional bits of the ratio is different between VMX
and SVM. This patch adds the architecture code to collect the number of
fractional bits and other related information into fields of struct
hvm_function_table so that they can be used in the common code.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/svm/svm.c    |  7 +++++++
 xen/include/asm-x86/hvm/hvm.h | 12 ++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 3d22dfa..9f741d0 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -1451,6 +1451,9 @@ const struct hvm_function_table * __init start_svm(void)
     if ( !cpu_has_svm_nrips )
         clear_bit(SVM_FEATURE_DECODEASSISTS, &svm_feature_flags);
 
+    if ( cpu_has_tsc_ratio )
+        svm_function_table.tsc_scaling_supported = 1;
+
 #define P(p,s) if ( p ) { printk(" - %s\n", s); printed = 1; }
     P(cpu_has_svm_npt, "Nested Page Tables (NPT)");
     P(cpu_has_svm_lbrv, "Last Branch Record (LBR) Virtualisation");
@@ -2271,6 +2274,10 @@ static struct hvm_function_table __initdata svm_function_table = {
     .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
 
     .scale_tsc            = svm_scale_tsc,
+
+    .default_tsc_scaling_ratio   = DEFAULT_TSC_RATIO,
+    .max_tsc_scaling_ratio       = ~TSC_RATIO_RSVD_BITS,
+    .tsc_scaling_ratio_frac_bits = 32,
 };
 
 void svm_vmexit_handler(struct cpu_user_regs *regs)
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index aba63ab..8b10a67 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -100,6 +100,18 @@ struct hvm_function_table {
     unsigned int hap_capabilities;
 
     /*
+     * Parameters of hardware-assisted TSC scaling.
+     */
+    /* is TSC scaling supported? */
+    bool_t   tsc_scaling_supported;
+    /* number of bits of the fractional part of TSC scaling ratio */
+    uint8_t  tsc_scaling_ratio_frac_bits;
+    /* default TSC scaling ratio (no scaling) */
+    uint64_t default_tsc_scaling_ratio;
+    /* maxmimum-allowed TSC scaling ratio */
+    uint64_t max_tsc_scaling_ratio;
+
+    /*
      * Initialise/destroy HVM domain/vcpu resources
      */
     int  (*domain_initialise)(struct domain *d);
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 09/14] x86/hvm: Setup TSC scaling ratio
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (7 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:42   ` Boris Ostrovsky
  2015-12-10 10:27   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
                   ` (5 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patch adds a field tsc_scaling_ratio in struct hvm_vcpu to
record the TSC scaling ratio, and sets it up when tsc_set_info() is
called for a vcpu or when a vcpu is restored or reset.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/hvm.c            | 30 ++++++++++++++++++++++++++++++
 xen/arch/x86/hvm/svm/svm.c        |  6 ++++--
 xen/arch/x86/time.c               | 13 ++++++++++++-
 xen/include/asm-x86/hvm/hvm.h     |  5 +++++
 xen/include/asm-x86/hvm/svm/svm.h |  3 ---
 xen/include/asm-x86/hvm/vcpu.h    |  2 ++
 xen/include/asm-x86/math64.h      | 30 ++++++++++++++++++++++++++++++
 7 files changed, 83 insertions(+), 6 deletions(-)
 create mode 100644 xen/include/asm-x86/math64.h

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 0e63c33..52a0ef8 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -65,6 +65,7 @@
 #include <asm/mtrr.h>
 #include <asm/apic.h>
 #include <asm/vm_event.h>
+#include <asm/math64.h>
 #include <public/sched.h>
 #include <public/hvm/ioreq.h>
 #include <public/version.h>
@@ -301,6 +302,29 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
     return 1;
 }
 
+void hvm_setup_tsc_scaling(struct vcpu *v)
+{
+    u64 ratio;
+
+    if ( !hvm_funcs.tsc_scaling_supported )
+        return;
+
+    /*
+     * The multiplication of the first two terms may overflow a 64-bit
+     * integer, so use mul_u64_u32_div() instead to keep precision.
+     */
+    ratio = mul_u64_u32_div(1ULL << hvm_funcs.tsc_scaling_ratio_frac_bits,
+                            v->domain->arch.tsc_khz, cpu_khz);
+
+    if ( ratio == 0 || ratio > hvm_funcs.max_tsc_scaling_ratio )
+        return;
+
+    v->arch.hvm_vcpu.tsc_scaling_ratio = ratio;
+
+    if ( hvm_funcs.setup_tsc_scaling )
+        hvm_funcs.setup_tsc_scaling(v);
+}
+
 void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
 {
     uint64_t tsc;
@@ -2024,6 +2048,9 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
     if ( hvm_funcs.load_cpu_ctxt(v, &ctxt) < 0 )
         return -EINVAL;
 
+    if ( hvm_funcs.tsc_scaling_supported )
+        hvm_setup_tsc_scaling(v);
+
     v->arch.hvm_vcpu.msr_tsc_aux = ctxt.msr_tsc_aux;
 
     seg.limit = ctxt.idtr_limit;
@@ -5584,6 +5611,9 @@ void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
     hvm_set_segment_register(v, x86_seg_gdtr, &reg);
     hvm_set_segment_register(v, x86_seg_idtr, &reg);
 
+    if ( hvm_funcs.tsc_scaling_supported )
+        hvm_setup_tsc_scaling(v);
+
     /* Sync AP's TSC with BSP's. */
     v->arch.hvm_vcpu.cache_tsc_offset =
         v->domain->vcpu[0]->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 9f741d0..d291327 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -811,7 +811,7 @@ static uint64_t svm_scale_tsc(struct vcpu *v, uint64_t tsc)
     if ( !cpu_has_tsc_ratio || d->arch.vtsc )
         return tsc;
 
-    return scale_tsc(tsc, vcpu_tsc_ratio(v));
+    return scale_tsc(tsc, v->arch.hvm_vcpu.tsc_scaling_ratio);
 }
 
 static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
@@ -987,7 +987,7 @@ static inline void svm_tsc_ratio_save(struct vcpu *v)
 static inline void svm_tsc_ratio_load(struct vcpu *v)
 {
     if ( cpu_has_tsc_ratio && !v->domain->arch.vtsc ) 
-        wrmsrl(MSR_AMD64_TSC_RATIO, vcpu_tsc_ratio(v));
+        wrmsrl(MSR_AMD64_TSC_RATIO, v->arch.hvm_vcpu.tsc_scaling_ratio);
 }
 
 static void svm_ctxt_switch_from(struct vcpu *v)
@@ -1193,6 +1193,8 @@ static int svm_vcpu_initialise(struct vcpu *v)
 
     svm_guest_osvw_init(v);
 
+    v->arch.hvm_vcpu.tsc_scaling_ratio = DEFAULT_TSC_RATIO;
+
     return 0;
 }
 
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 732d1e9..d4a94eb 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -1902,8 +1902,19 @@ void tsc_set_info(struct domain *d,
     if ( has_hvm_container_domain(d) )
     {
         hvm_set_rdtsc_exiting(d, d->arch.vtsc);
-        if ( d->vcpu && d->vcpu[0] && incarnation == 0 )
+        if ( d->vcpu && d->vcpu[0] )
         {
+             /*
+             * TSC scaling ratio on BSP is set here during initial boot, while
+             * the same TSC scaling ratio on APs will be set in
+             * hvm_vcpu_reset_state().
+             */
+            if ( !d->arch.vtsc && hvm_funcs.tsc_scaling_supported )
+                hvm_setup_tsc_scaling(d->vcpu[0]);
+
+            if ( incarnation )
+                return;
+
             /*
              * set_tsc_offset() is called from hvm_vcpu_initialise() before
              * tsc_set_info(). New vtsc mode may require recomputing TSC
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index 8b10a67..e74524e 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -226,6 +226,9 @@ struct hvm_function_table {
     int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 
     uint64_t (*scale_tsc)(struct vcpu *v, uint64_t tsc);
+
+    /* Architecture function to setup TSC scaling ratio */
+    void (*setup_tsc_scaling)(struct vcpu *v);
 };
 
 extern struct hvm_function_table hvm_funcs;
@@ -261,6 +264,8 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
 u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
 #define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed(v, 0)
 
+void hvm_setup_tsc_scaling(struct vcpu *v);
+
 int hvm_set_mode(struct vcpu *v, int mode);
 void hvm_init_guest_time(struct domain *d);
 void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
index d60ec23..c954b7e 100644
--- a/xen/include/asm-x86/hvm/svm/svm.h
+++ b/xen/include/asm-x86/hvm/svm/svm.h
@@ -97,9 +97,6 @@ extern u32 svm_feature_flags;
 /* TSC rate */
 #define DEFAULT_TSC_RATIO       0x0000000100000000ULL
 #define TSC_RATIO_RSVD_BITS     0xffffff0000000000ULL
-#define TSC_RATIO(g_khz, h_khz) ( (((u64)(g_khz)<<32)/(u64)(h_khz)) & \
-                                  ~TSC_RATIO_RSVD_BITS )
-#define vcpu_tsc_ratio(v)       TSC_RATIO((v)->domain->arch.tsc_khz, cpu_khz)
 
 extern void svm_host_osvw_reset(void);
 extern void svm_host_osvw_init(void);
diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
index 152d9f3..901c988 100644
--- a/xen/include/asm-x86/hvm/vcpu.h
+++ b/xen/include/asm-x86/hvm/vcpu.h
@@ -175,6 +175,8 @@ struct hvm_vcpu {
     u64                 msr_tsc_adjust;
     u64                 msr_xss;
 
+    u64                 tsc_scaling_ratio;
+
     union {
         struct arch_vmx_struct vmx;
         struct arch_svm_struct svm;
diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
new file mode 100644
index 0000000..9af6aee
--- /dev/null
+++ b/xen/include/asm-x86/math64.h
@@ -0,0 +1,30 @@
+#ifndef __X86_MATH64
+#define __X86_MATH64
+
+/*
+ * (a * mul) / divisor
+ */
+static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
+{
+    union {
+        u64 ll;
+        struct {
+            u32 low, high;
+        } l;
+    } u, rl, rh;
+
+    u.ll = a;
+    rl.ll = (u64)u.l.low * mul;
+    rh.ll = (u64)u.l.high * mul + rl.l.high;
+
+    /* Bits 32-63 of the result will be in rh.l.low. */
+    rl.l.high = do_div(rh.ll, divisor);
+
+    /* Bits 0-31 of the result will be in rl.l.low. */
+    do_div(rl.ll, divisor);
+
+    rl.l.high = rh.l.low;
+    return rl.ll;
+}
+
+#endif
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (8 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 09/14] x86/hvm: Setup " Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:55   ` Boris Ostrovsky
  2015-12-10 10:29   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
                   ` (4 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patch implements a common function hvm_scale_tsc() to scale TSC by
using TSC scaling information collected by architecture code.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/hvm.c        | 18 +++++++++--
 xen/arch/x86/hvm/svm/svm.c    | 12 --------
 xen/arch/x86/time.c           |  2 +-
 xen/include/asm-x86/hvm/hvm.h |  3 +-
 xen/include/asm-x86/math64.h  | 70 +++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 88 insertions(+), 17 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 52a0ef8..1e3d89f 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -302,6 +302,20 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
     return 1;
 }
 
+u64 hvm_scale_tsc(const struct vcpu *v, u64 tsc)
+{
+    u64 ratio = v->arch.hvm_vcpu.tsc_scaling_ratio;
+
+    if ( !hvm_funcs.tsc_scaling_supported ||
+         ratio == hvm_funcs.default_tsc_scaling_ratio )
+        return tsc;
+
+    BUG_ON(hvm_funcs.tsc_scaling_ratio_frac_bits >= 64 ||
+           hvm_funcs.tsc_scaling_ratio_frac_bits == 0);
+
+    return mul_u64_u64_shr(tsc, ratio, hvm_funcs.tsc_scaling_ratio_frac_bits);
+}
+
 void hvm_setup_tsc_scaling(struct vcpu *v)
 {
     u64 ratio;
@@ -339,7 +353,7 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
     {
         tsc = at_tsc ?: rdtsc();
         if ( cpu_has_tsc_ratio )
-            tsc = hvm_funcs.scale_tsc(v, tsc);
+            tsc = hvm_scale_tsc(v, tsc);
     }
 
     delta_tsc = guest_tsc - tsc;
@@ -371,7 +385,7 @@ u64 hvm_get_guest_tsc_fixed(struct vcpu *v, uint64_t at_tsc)
     {
         tsc = at_tsc ?: rdtsc();
         if ( cpu_has_tsc_ratio )
-            tsc = hvm_funcs.scale_tsc(v, tsc);
+            tsc = hvm_scale_tsc(v, tsc);
     }
 
     return tsc + v->arch.hvm_vcpu.cache_tsc_offset;
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index d291327..9e28245 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -804,16 +804,6 @@ static uint64_t scale_tsc(uint64_t host_tsc, uint64_t ratio)
     return scaled_host_tsc;
 }
 
-static uint64_t svm_scale_tsc(struct vcpu *v, uint64_t tsc)
-{
-    struct domain *d = v->domain;
-
-    if ( !cpu_has_tsc_ratio || d->arch.vtsc )
-        return tsc;
-
-    return scale_tsc(tsc, v->arch.hvm_vcpu.tsc_scaling_ratio);
-}
-
 static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
     uint64_t ratio)
 {
@@ -2275,8 +2265,6 @@ static struct hvm_function_table __initdata svm_function_table = {
     .nhvm_intr_blocked = nsvm_intr_blocked,
     .nhvm_hap_walk_L1_p2m = nsvm_hap_walk_L1_p2m,
 
-    .scale_tsc            = svm_scale_tsc,
-
     .default_tsc_scaling_ratio   = DEFAULT_TSC_RATIO,
     .max_tsc_scaling_ratio       = ~TSC_RATIO_RSVD_BITS,
     .tsc_scaling_ratio_frac_bits = 32,
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index d4a94eb..2277158 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -817,7 +817,7 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
     {
         if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
         {
-            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
+            tsc_stamp            = hvm_scale_tsc(v, t->local_tsc_stamp);
             _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
             _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
         }
diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
index e74524e..8fc59d5 100644
--- a/xen/include/asm-x86/hvm/hvm.h
+++ b/xen/include/asm-x86/hvm/hvm.h
@@ -225,8 +225,6 @@ struct hvm_function_table {
     bool_t (*altp2m_vcpu_emulate_ve)(struct vcpu *v);
     int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
 
-    uint64_t (*scale_tsc)(struct vcpu *v, uint64_t tsc);
-
     /* Architecture function to setup TSC scaling ratio */
     void (*setup_tsc_scaling)(struct vcpu *v);
 };
@@ -264,6 +262,7 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
 u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
 #define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed(v, 0)
 
+u64 hvm_scale_tsc(const struct vcpu *v, u64 tsc);
 void hvm_setup_tsc_scaling(struct vcpu *v);
 
 int hvm_set_mode(struct vcpu *v, int mode);
diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
index 9af6aee..c6a472b 100644
--- a/xen/include/asm-x86/math64.h
+++ b/xen/include/asm-x86/math64.h
@@ -27,4 +27,74 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
     return rl.ll;
 }
 
+/*
+ * Multiply two 64-bit unsigned integers a and b. The most and least
+ * significant 64 bits of the 128-bit result are returned through hi
+ * and lo respectively.
+ */
+static inline void mul64(u64 *lo, u64 *hi, u64 a, u64 b)
+{
+    typedef union {
+        u64 ll;
+        struct {
+            u32 low, high;
+        } l;
+    } LL;
+    LL rl, rm, rn, rh, a0, b0;
+    u64 c;
+
+    a0.ll = a;
+    b0.ll = b;
+
+    rl.ll = (u64)a0.l.low * b0.l.low;
+    rm.ll = (u64)a0.l.low * b0.l.high;
+    rn.ll = (u64)a0.l.high * b0.l.low;
+    rh.ll = (u64)a0.l.high * b0.l.high;
+
+    c = (u64)rl.l.high + rm.l.low + rn.l.low;
+    rl.l.high = c;
+    c >>= 32;
+    c = c + rm.l.high + rn.l.high + rh.l.low;
+    rh.l.low = c;
+    rh.l.high += (u32)(c >> 32);
+
+    *lo = rl.ll;
+    *hi = rh.ll;
+}
+
+/*
+ * Right shift a 128-bit unsigned integer by n bits. The most and
+ * least significant 64 bits of the 128-bit integer are passed through
+ * hi and lo respectively. The most and least significant 64 bits of
+ * the result are also returned through hi and lo respectively.
+ */
+static inline void rshift128(u64 *lo, u64 *hi, unsigned int n)
+{
+    u64 h;
+    if ( !n )
+        return;
+    h = *hi >> (n & 63);
+    if ( n >= 64 )
+    {
+        *hi = 0;
+        *lo = h;
+    }
+    else
+    {
+        *lo = (*lo >> n) | (*hi << (64 - n));
+        *hi = h;
+    }
+}
+
+/*
+ * (a * mul) >> n
+ */
+static inline u64 mul_u64_u64_shr(u64 a, u64 mul, unsigned int n)
+{
+    u64 lo, hi;
+    mul64(&lo, &hi, a, mul);
+    rshift128(&lo, &hi, n);
+    return lo;
+}
+
 #endif
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (9 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:56   ` Boris Ostrovsky
  2015-12-10 10:30   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs Haozhong Zhang
                   ` (3 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

Both VMX and SVM save/load vcpu's TSC when saving/loading vcpu's
context, so this patch moves saving/loading vcpu's TSC to the common
functions hvm_[save|load]_cpu_ctxt().

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/hvm/hvm.c     | 4 ++++
 xen/arch/x86/hvm/svm/svm.c | 5 -----
 xen/arch/x86/hvm/vmx/vmx.c | 5 -----
 3 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
index 1e3d89f..12c47cd 100644
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1773,6 +1773,8 @@ static int hvm_save_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
         /* Architecture-specific vmcs/vmcb bits */
         hvm_funcs.save_cpu_ctxt(v, &ctxt);
 
+        ctxt.tsc = hvm_get_guest_tsc_fixed(v, d->arch.hvm_domain.sync_tsc);
+
         ctxt.msr_tsc_aux = hvm_msr_tsc_aux(v);
 
         hvm_get_segment_register(v, x86_seg_idtr, &seg);
@@ -2067,6 +2069,8 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
 
     v->arch.hvm_vcpu.msr_tsc_aux = ctxt.msr_tsc_aux;
 
+    hvm_set_guest_tsc_fixed(v, ctxt.tsc, d->arch.hvm_domain.sync_tsc);
+
     seg.limit = ctxt.idtr_limit;
     seg.base = ctxt.idtr_base;
     hvm_set_segment_register(v, x86_seg_idtr, &seg);
diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
index 9e28245..876ea76 100644
--- a/xen/arch/x86/hvm/svm/svm.c
+++ b/xen/arch/x86/hvm/svm/svm.c
@@ -357,9 +357,6 @@ static void svm_save_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     data->msr_syscall_mask = vmcb->sfmask;
     data->msr_efer         = v->arch.hvm_vcpu.guest_efer;
     data->msr_flags        = -1ULL;
-
-    data->tsc = hvm_get_guest_tsc_fixed(v,
-                                        v->domain->arch.hvm_domain.sync_tsc);
 }
 
 
@@ -374,8 +371,6 @@ static void svm_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     vmcb->sfmask     = data->msr_syscall_mask;
     v->arch.hvm_vcpu.guest_efer = data->msr_efer;
     svm_update_guest_efer(v);
-
-    hvm_set_guest_tsc_fixed(v, data->tsc, v->domain->arch.hvm_domain.sync_tsc);
 }
 
 static void svm_save_vmcb_ctxt(struct vcpu *v, struct hvm_hw_cpu *ctxt)
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 9ad6d82..8b5bb11 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -581,9 +581,6 @@ static void vmx_save_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
     data->msr_lstar        = guest_state->msrs[VMX_INDEX_MSR_LSTAR];
     data->msr_star         = guest_state->msrs[VMX_INDEX_MSR_STAR];
     data->msr_syscall_mask = guest_state->msrs[VMX_INDEX_MSR_SYSCALL_MASK];
-
-    data->tsc = hvm_get_guest_tsc_fixed(v,
-                                        v->domain->arch.hvm_domain.sync_tsc);
 }
 
 static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
@@ -598,8 +595,6 @@ static void vmx_load_cpu_state(struct vcpu *v, struct hvm_hw_cpu *data)
 
     v->arch.hvm_vmx.cstar     = data->msr_cstar;
     v->arch.hvm_vmx.shadow_gs = data->shadow_gs;
-
-    hvm_set_guest_tsc_fixed(v, data->tsc, v->domain->arch.hvm_domain.sync_tsc);
 }
 
 
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (10 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-07 18:58   ` Boris Ostrovsky
  2015-12-10 10:31   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support Haozhong Zhang
                   ` (2 subsequent siblings)
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patch uses hvm_funcs.tsc_scaling_supported instead of the
architecture code to detect the TSC scaling support.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
---
 xen/arch/x86/time.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 2277158..9f610c6 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -37,7 +37,6 @@
 #include <asm/hpet.h>
 #include <io_ports.h>
 #include <asm/setup.h> /* for early_time_init */
-#include <asm/hvm/svm/svm.h> /* for cpu_has_tsc_ratio */
 #include <public/arch-x86/cpuid.h>
 
 /* opt_clocksource: Force clocksource to one of: pit, hpet, acpi. */
@@ -815,7 +814,7 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
     }
     else
     {
-        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
+        if ( is_hvm_domain(d) && hvm_funcs.tsc_scaling_supported )
         {
             tsc_stamp            = hvm_scale_tsc(v, t->local_tsc_stamp);
             _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
@@ -1758,7 +1757,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
                   uint32_t *incarnation)
 {
     bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
-                                cpu_has_tsc_ratio;
+                                hvm_funcs.tsc_scaling_supported;
 
     *incarnation = d->arch.incarnation;
     *tsc_mode = d->arch.tsc_mode;
@@ -1867,7 +1866,7 @@ void tsc_set_info(struct domain *d,
          */
         if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
              (has_hvm_container_domain(d) ?
-              d->arch.tsc_khz == cpu_khz || cpu_has_tsc_ratio :
+              d->arch.tsc_khz == cpu_khz || hvm_funcs.tsc_scaling_supported :
               incarnation == 0) )
         {
     case TSC_MODE_NEVER_EMULATE:
@@ -1881,7 +1880,7 @@ void tsc_set_info(struct domain *d,
         d->arch.vtsc = !boot_cpu_has(X86_FEATURE_RDTSCP) ||
                        !host_tsc_is_safe();
         enable_tsc_scaling = has_hvm_container_domain(d) &&
-                             cpu_has_tsc_ratio && d->arch.vtsc;
+                             hvm_funcs.tsc_scaling_supported && d->arch.vtsc;
         d->arch.tsc_khz = (enable_tsc_scaling && gtsc_khz) ? gtsc_khz : cpu_khz;
         set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
         d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (11 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-10 10:33   ` Tian, Kevin
  2015-12-06 20:58 ` [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt Haozhong Zhang
  2015-12-06 21:14 ` [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

This patch adds the initialization and setup code for VMX TSC scaling.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 xen/arch/x86/hvm/vmx/vmcs.c        | 12 +++++++++---
 xen/arch/x86/hvm/vmx/vmx.c         | 15 +++++++++++++++
 xen/include/asm-x86/hvm/vmx/vmcs.h |  7 +++++++
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
index 000d06e..fb490bf 100644
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -149,6 +149,7 @@ static void __init vmx_display_features(void)
     P(cpu_has_vmx_vmfunc, "VM Functions");
     P(cpu_has_vmx_virt_exceptions, "Virtualisation Exceptions");
     P(cpu_has_vmx_pml, "Page Modification Logging");
+    P(cpu_has_vmx_tsc_scaling, "TSC Scaling");
 #undef P
 
     if ( !printed )
@@ -242,7 +243,8 @@ static int vmx_init_vmcs_config(void)
                SECONDARY_EXEC_ENABLE_INVPCID |
                SECONDARY_EXEC_ENABLE_VM_FUNCTIONS |
                SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS |
-               SECONDARY_EXEC_XSAVES);
+               SECONDARY_EXEC_XSAVES |
+               SECONDARY_EXEC_TSC_SCALING);
         rdmsrl(MSR_IA32_VMX_MISC, _vmx_misc_cap);
         if ( _vmx_misc_cap & VMX_MISC_VMWRITE_ALL )
             opt |= SECONDARY_EXEC_ENABLE_VMCS_SHADOWING;
@@ -998,7 +1000,7 @@ static int construct_vmcs(struct vcpu *v)
     __vmwrite(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_control);
 
     v->arch.hvm_vmx.exec_control = vmx_cpu_based_exec_control;
-    if ( d->arch.vtsc )
+    if ( d->arch.vtsc && !cpu_has_vmx_tsc_scaling )
         v->arch.hvm_vmx.exec_control |= CPU_BASED_RDTSC_EXITING;
 
     v->arch.hvm_vmx.secondary_exec_control = vmx_secondary_exec_control;
@@ -1277,6 +1279,9 @@ static int construct_vmcs(struct vcpu *v)
     if ( cpu_has_vmx_xsaves )
         __vmwrite(XSS_EXIT_BITMAP, 0);
 
+    if ( cpu_has_vmx_tsc_scaling )
+        __vmwrite(TSC_MULTIPLIER, v->arch.hvm_vcpu.tsc_scaling_ratio);
+
     vmx_vmcs_exit(v);
 
     /* PVH: paging mode is updated by arch_set_info_guest(). */
@@ -1859,7 +1864,8 @@ void vmcs_dump_vcpu(struct vcpu *v)
            vmr32(VM_EXIT_REASON), vmr(EXIT_QUALIFICATION));
     printk("IDTVectoring: info=%08x errcode=%08x\n",
            vmr32(IDT_VECTORING_INFO), vmr32(IDT_VECTORING_ERROR_CODE));
-    printk("TSC Offset = 0x%016lx\n", vmr(TSC_OFFSET));
+    printk("TSC Offset = 0x%016lx  TSC Multiplier = 0x%016lx\n",
+           vmr(TSC_OFFSET), vmr(TSC_MULTIPLIER));
     if ( (v->arch.hvm_vmx.exec_control & CPU_BASED_TPR_SHADOW) ||
          (vmx_pin_based_exec_control & PIN_BASED_POSTED_INTERRUPT) )
         printk("TPR Threshold = 0x%02x  PostedIntrVec = 0x%02x\n",
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index 8b5bb11..c0ea74c 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -109,6 +109,7 @@ static int vmx_vcpu_initialise(struct vcpu *v)
     v->arch.schedule_tail    = vmx_do_resume;
     v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
     v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
+    v->arch.hvm_vcpu.tsc_scaling_ratio = VMX_TSC_MULTIPLIER_DEFAULT;
 
     if ( (rc = vmx_create_vmcs(v)) != 0 )
     {
@@ -1111,6 +1112,13 @@ static void vmx_handle_cd(struct vcpu *v, unsigned long value)
     }
 }
 
+static void vmx_setup_tsc_scaling(struct vcpu *v)
+{
+    vmx_vmcs_enter(v);
+    __vmwrite(TSC_MULTIPLIER, v->arch.hvm_vcpu.tsc_scaling_ratio);
+    vmx_vmcs_exit(v);
+}
+
 static void vmx_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
 {
     vmx_vmcs_enter(v);
@@ -1980,6 +1988,10 @@ static struct hvm_function_table __initdata vmx_function_table = {
     .altp2m_vcpu_update_vmfunc_ve = vmx_vcpu_update_vmfunc_ve,
     .altp2m_vcpu_emulate_ve = vmx_vcpu_emulate_ve,
     .altp2m_vcpu_emulate_vmfunc = vmx_vcpu_emulate_vmfunc,
+    .default_tsc_scaling_ratio   = VMX_TSC_MULTIPLIER_DEFAULT,
+    .max_tsc_scaling_ratio       = VMX_TSC_MULTIPLIER_MAX,
+    .tsc_scaling_ratio_frac_bits = 48,
+    .setup_tsc_scaling           = vmx_setup_tsc_scaling,
 };
 
 const struct hvm_function_table * __init start_vmx(void)
@@ -2032,6 +2044,9 @@ const struct hvm_function_table * __init start_vmx(void)
          && cpu_has_vmx_secondary_exec_control )
         vmx_function_table.pvh_supported = 1;
 
+    if ( cpu_has_vmx_tsc_scaling )
+        vmx_function_table.tsc_scaling_supported = 1;
+
     setup_vmcs_dump();
 
     return &vmx_function_table;
diff --git a/xen/include/asm-x86/hvm/vmx/vmcs.h b/xen/include/asm-x86/hvm/vmx/vmcs.h
index b3b0946..ff991ea 100644
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -236,6 +236,7 @@ extern u32 vmx_vmentry_control;
 #define SECONDARY_EXEC_ENABLE_PML               0x00020000
 #define SECONDARY_EXEC_ENABLE_VIRT_EXCEPTIONS   0x00040000
 #define SECONDARY_EXEC_XSAVES                   0x00100000
+#define SECONDARY_EXEC_TSC_SCALING              0x02000000
 extern u32 vmx_secondary_exec_control;
 
 #define VMX_EPT_EXEC_ONLY_SUPPORTED                         0x00000001
@@ -258,6 +259,9 @@ extern u64 vmx_ept_vpid_cap;
 #define VMX_MISC_CR3_TARGET                     0x01ff0000
 #define VMX_MISC_VMWRITE_ALL                    0x20000000
 
+#define VMX_TSC_MULTIPLIER_DEFAULT              0x0001000000000000ULL
+#define VMX_TSC_MULTIPLIER_MAX                  0xffffffffffffffffULL
+
 #define cpu_has_wbinvd_exiting \
     (vmx_secondary_exec_control & SECONDARY_EXEC_WBINVD_EXITING)
 #define cpu_has_vmx_virtualize_apic_accesses \
@@ -303,6 +307,8 @@ extern u64 vmx_ept_vpid_cap;
     (vmx_secondary_exec_control & SECONDARY_EXEC_ENABLE_PML)
 #define cpu_has_vmx_xsaves \
     (vmx_secondary_exec_control & SECONDARY_EXEC_XSAVES)
+#define cpu_has_vmx_tsc_scaling \
+    (vmx_secondary_exec_control & SECONDARY_EXEC_TSC_SCALING)
 
 #define VMCS_RID_TYPE_MASK              0x80000000
 
@@ -378,6 +384,7 @@ enum vmcs_field {
     VMWRITE_BITMAP                  = 0x00002028,
     VIRT_EXCEPTION_INFO             = 0x0000202a,
     XSS_EXIT_BITMAP                 = 0x0000202c,
+    TSC_MULTIPLIER                  = 0x00002032,
     GUEST_PHYSICAL_ADDRESS          = 0x00002400,
     VMCS_LINK_POINTER               = 0x00002800,
     GUEST_IA32_DEBUGCTL             = 0x00002802,
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (12 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support Haozhong Zhang
@ 2015-12-06 20:58 ` Haozhong Zhang
  2015-12-10 10:40   ` Tian, Kevin
  2015-12-06 21:14 ` [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
  14 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 20:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Haozhong Zhang, Kevin Tian, Keir Fraser, Jan Beulich,
	Jun Nakajima, Andrew Cooper, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 docs/man/xl.cfg.pod.5 | 15 ++++++++++++++-
 docs/misc/tscmode.txt | 14 ++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
index 2aca8dd..7e19a9b 100644
--- a/docs/man/xl.cfg.pod.5
+++ b/docs/man/xl.cfg.pod.5
@@ -1313,9 +1313,18 @@ deprecated. Options are:
 
 =item B<"default">
 
-Guest rdtsc/p executed natively when monotonicity can be guaranteed
+Guest rdtsc/p is executed natively when monotonicity can be guaranteed
 and emulated otherwise (with frequency scaled if necessary).
 
+If a HVM container in B<default> TSC mode is not migrated from other hosts
+and the host TSC monotonicity can be guaranteed, the guest and host TSC
+frequencies will be the same.
+
+If a HVM container in B<default> TSC mode is migrated to a host that can
+guarantee the TSC monotonicity and supports Intel VMX TSC scaling/AMD SVM
+TSC ratio, guest rdtsc/p will still execute natively after migration and the
+guest TSC frequencies before and after migration will be the same.
+
 =item B<"always_emulate">
 
 Guest rdtsc/p always emulated at 1GHz (kernel and user). Guest rdtsc/p
@@ -1337,6 +1346,10 @@ determine when a restore/migration has occurred and assumes guest
 obtains/uses pvclock-like mechanism to adjust for monotonicity and
 frequency changes.
 
+If a HVM container in B<native_paravirt> TSC mode can execute both guest
+rdtsc and guest rdtscp natively, then the guest TSC frequency will be
+determined in the similar way to that of B<default> TSC mode.
+
 =back
 
 Please see F<docs/misc/tscmode.txt> for more information on this option.
diff --git a/docs/misc/tscmode.txt b/docs/misc/tscmode.txt
index e8c84e8..f3b70be 100644
--- a/docs/misc/tscmode.txt
+++ b/docs/misc/tscmode.txt
@@ -297,3 +297,17 @@ and also much faster than nearly all OS-provided time mechanisms.
 While pvrtscp is too complex for most apps, certain enterprise
 TSC-sensitive high-TSC-frequency apps may find it useful to
 obtain a significant performance gain.
+
+Hardware TSC Scaling
+
+Intel VMX TSC scaling and AMD SVM TSC ratio allow the guest TSC read
+by guest rdtsc/p increasing in the different frequency than the host
+TSC frequency.
+
+For a HVM container is in default TSC mode (tsc_mode=0) or PVRDTSCP
+mode (tsc_mode=3) and can execute both guest rdtsc and rdtscp
+natively, if it is not migrated from other hosts, the guest and host
+TSC frequencies will be the same. If it is migrated to a host
+supporting Intel VMX TSC scaling/AMD SVM TSC ratio and can still
+execute guest rdtsc and rdtscp natively, the guest TSC frequencies
+before and after migration will be the same.
-- 
2.6.3

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
                   ` (13 preceding siblings ...)
  2015-12-06 20:58 ` [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt Haozhong Zhang
@ 2015-12-06 21:14 ` Haozhong Zhang
  2015-12-07  9:03   ` Egger, Christoph
  2015-12-07 10:37   ` Jan Beulich
  14 siblings, 2 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-06 21:14 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky

[-- Attachment #1: Type: text/plain, Size: 505 bytes --]

I have tested this patchset on both Intel and AMD systems.

For the test on AMD systems, I made two rounds of tests. The first
round only applied bug-fix patches 1-6. The second round applied the
entire series. In both rounds, I used the test process at
http://lists.xenproject.org/archives/html/xen-devel/2015-11/msg02590.html
except that patch 13 there was replace by the patch in the attachment.
In both rounds, I got the expect results.

Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>

Haozhong

[-- Attachment #2: v2-0015-tools-libxl-Add-vtsc_khz-option-to-set-guest-TSC-.patch --]
[-- Type: text/plain, Size: 4102 bytes --]

>From a2ba834e4f3988345661c1084765ea4c9a808941 Mon Sep 17 00:00:00 2001
From: Haozhong Zhang <haozhong.zhang@intel.com>
Date: Wed, 19 Aug 2015 16:26:29 +0800
Subject: [PATCH v2 15/15] tools/libxl: Add 'vtsc_khz' option to set guest TSC
 rate

This patch adds an option 'vtsc_khz' to allow users to set vcpu's TSC
rate in KHz. In the case that tsc_mode = 'default' or 'native_paravirt',
the default value of 'vtsc_khz' option is the host TSC rate which is
used when 'vtsc_khz' option is set to 0 or does not appear in the
configuration. In all other cases of tsc_mode, 'vtsc_khz' option is just
ignored.

Another purpose of adding this option is to keep vcpu's TSC rate across
guest reboot. In existing code, a new domain is created from the
configuration of the previous domain which was just rebooted. vcpu's TSC
rate is not stored in the configuration and the host TSC rate is the
used as vcpu's TSC rate. This works fine unless the previous domain was
migrated from another host machine with a different host TSC rate than
the current one.

Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
---
 tools/libxl/libxl_types.idl |  1 +
 tools/libxl/libxl_x86.c     |  4 +++-
 tools/libxl/xl_cmdimpl.c    | 23 +++++++++++++++++++++++
 3 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 6808f2b..5480e46 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -420,6 +420,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
     ("numa_placement",  libxl_defbool),
     ("tsc_mode",        libxl_tsc_mode),
+    ("vtsc_khz",        uint32),
     ("max_memkb",       MemKB),
     ("target_memkb",    MemKB),
     ("video_memkb",     MemKB),
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 183b6c2..f27b079 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -280,6 +280,7 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
 {
     int ret = 0;
     int tsc_mode;
+    uint32_t vtsc_khz;
     uint32_t rtc_timeoffset;
     libxl_ctx *ctx = libxl__gc_owner(gc);
 
@@ -304,7 +305,8 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
     default:
         abort();
     }
-    xc_domain_set_tsc_info(ctx->xch, domid, tsc_mode, 0, 0, 0);
+    vtsc_khz = d_config->b_info.vtsc_khz;
+    xc_domain_set_tsc_info(ctx->xch, domid, tsc_mode, 0, vtsc_khz, 0);
     if (libxl_defbool_val(d_config->b_info.disable_migrate))
         xc_domain_disable_migrate(ctx->xch, domid);
     rtc_timeoffset = d_config->b_info.rtc_timeoffset;
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index 2b6371d..2705f8c 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -1464,6 +1464,29 @@ static void parse_config_data(const char *config_source,
         }
     }
 
+    /* "vtsc_khz" option works only if "tsc_mode" option is "default"
+     * or "native_paravirt". In this case, if "vtsc_khz" option is set
+     * to 0, we will reset it to the host TSC rate. In all other
+     * cases, we just ignore any given value and always set it to 0.
+     */
+    if (!xlu_cfg_get_long(config, "vtsc_khz", &l, 0))
+        b_info->vtsc_khz = l;
+    if (b_info->tsc_mode == LIBXL_TSC_MODE_DEFAULT ||
+        b_info->tsc_mode == LIBXL_TSC_MODE_NATIVE_PARAVIRT) {
+        if (b_info->vtsc_khz == 0) {
+            libxl_physinfo physinfo;
+            if (!libxl_get_physinfo(ctx, &physinfo))
+                b_info->vtsc_khz = physinfo.cpu_khz;
+            else
+                fprintf(stderr, "WARNING: cannot get host TSC rate.\n");
+        }
+    } else {
+        fprintf(stderr, "WARNING: ignoring \"vtsc_khz\" option. "
+                "\"vtsc_khz\" option works only if "
+                "\"tsc_mode\" option is \"default\" or \"native_paravirt\".\n");
+        b_info->vtsc_khz = 0;
+    }
+
     if (!xlu_cfg_get_long(config, "rtc_timeoffset", &l, 0))
         b_info->rtc_timeoffset = l;
 
-- 
2.4.8


[-- Attachment #3: Type: text/plain, Size: 126 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

^ permalink raw reply related	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-06 21:14 ` [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
@ 2015-12-07  9:03   ` Egger, Christoph
  2015-12-07 10:16     ` Haozhong Zhang
  2015-12-07 10:37   ` Jan Beulich
  1 sibling, 1 reply; 61+ messages in thread
From: Egger, Christoph @ 2015-12-07  9:03 UTC (permalink / raw)
  To: xen-devel, Keir Fraser, Jan Beulich, Andrew Cooper,
	Boris Ostrovsky, Suravee Suthikulpanit, Aravind Gopalakrishnan,
	Jun Nakajima, Kevin Tian

Did you consider nested virtualization?
L1 hypervisor may have a different tsc scaling
and L2 guest again may have a different tsc scale ratio.

Christoph

On 2015/12/06 22:14, Haozhong Zhang wrote:
> I have tested this patchset on both Intel and AMD systems.
> 
> For the test on AMD systems, I made two rounds of tests. The first
> round only applied bug-fix patches 1-6. The second round applied the
> entire series. In both rounds, I used the test process at
> http://lists.xenproject.org/archives/html/xen-devel/2015-11/msg02590.html
> except that patch 13 there was replace by the patch in the attachment.
> In both rounds, I got the expect results.
> 
> Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Haozhong
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel


Amazon Development Center Germany GmbH
Krausenstr. 38
10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-07  9:03   ` Egger, Christoph
@ 2015-12-07 10:16     ` Haozhong Zhang
  2015-12-07 17:04       ` Andrew Cooper
  0 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-07 10:16 UTC (permalink / raw)
  To: Egger, Christoph
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 12/07/15 10:03, Egger, Christoph wrote:
> Did you consider nested virtualization?
> L1 hypervisor may have a different tsc scaling
> and L2 guest again may have a different tsc scale ratio.
>

Oh, I forgot this. I'll check the nested TSC scaling code (mostly
about nested SVM TSC ratio, because this patch series does not expose
VMX TSC scaling to L1 guest).

BTW, are there any practical usage scenarios of nested TSC scaling? If
any, I may also need to consider adding nested virtualization support
to VMX TSC scaling.

Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-06 21:14 ` [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
  2015-12-07  9:03   ` Egger, Christoph
@ 2015-12-07 10:37   ` Jan Beulich
  2015-12-07 10:56     ` Haozhong Zhang
  1 sibling, 1 reply; 61+ messages in thread
From: Jan Beulich @ 2015-12-07 10:37 UTC (permalink / raw)
  To: Haozhong Zhang
  Cc: Kevin Tian, Keir Fraser, Suravee Suthikulpanit, Andrew Cooper,
	xen-devel, Aravind Gopalakrishnan, Jun Nakajima, Boris Ostrovsky

>>> On 06.12.15 at 22:14, <haozhong.zhang@intel.com> wrote:
> I have tested this patchset on both Intel and AMD systems.
> 
> For the test on AMD systems, I made two rounds of tests. The first
> round only applied bug-fix patches 1-6. The second round applied the
> entire series. In both rounds, I used the test process at
> http://lists.xenproject.org/archives/html/xen-devel/2015-11/msg02590.html 
> except that patch 13 there was replace by the patch in the attachment.
> In both rounds, I got the expect results.
> 
> Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>

Thanks for making the AMD status explicit. Yet I don't think a
Tested-by by the author of a patch series makes much sense.
It's simply implied that you test your changes.

Jan

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-07 10:37   ` Jan Beulich
@ 2015-12-07 10:56     ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-07 10:56 UTC (permalink / raw)
  To: Jan Beulich, Boris Ostrovsky, Aravind Gopalakrishnan,
	Suravee Suthikulpanit
  Cc: Andrew Cooper, Kevin Tian, Keir Fraser, Jun Nakajima, xen-devel

On 12/07/15 03:37, Jan Beulich wrote:
> >>> On 06.12.15 at 22:14, <haozhong.zhang@intel.com> wrote:
> > I have tested this patchset on both Intel and AMD systems.
> > 
> > For the test on AMD systems, I made two rounds of tests. The first
> > round only applied bug-fix patches 1-6. The second round applied the
> > entire series. In both rounds, I used the test process at
> > http://lists.xenproject.org/archives/html/xen-devel/2015-11/msg02590.html 
> > except that patch 13 there was replace by the patch in the attachment.
> > In both rounds, I got the expect results.
> > 
> > Tested-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Thanks for making the AMD status explicit. Yet I don't think a
> Tested-by by the author of a patch series makes much sense.
> It's simply implied that you test your changes.
> 
Yes, I understand. I just put them there in case they could be useful.

Boris and other SVM maintainers, can you help to review patch 1 - 6
that intend to fix bugs in the existing TSC scaling implementation on
SVM TSC ratio?

Thanks,
Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 01/14] svm: Fix incorrect TSC scaling
  2015-12-06 20:58 ` [PATCH v2 01/14] svm: Fix incorrect TSC scaling Haozhong Zhang
@ 2015-12-07 16:49   ` Boris Ostrovsky
  0 siblings, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 16:49 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> SVM TSC ratio is incorrectly used in the current
> svm_get_tsc_offset(). This patch replaces the scaling logic in
> svm_get_tsc_offset() with a correct implementation.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info()
  2015-12-06 20:58 ` [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info() Haozhong Zhang
@ 2015-12-07 16:49   ` Boris Ostrovsky
  0 siblings, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 16:49 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> Replace is_hvm_domain() in tsc_set_info() by has_hvm_container_domain()
> to keep consistent with other domain type checks in tsc_set_info().
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency in tsc_set_info()
  2015-12-06 20:58 ` [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency " Haozhong Zhang
@ 2015-12-07 16:57   ` Boris Ostrovsky
  2015-12-08  0:30     ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 16:57 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> When TSC_MODE_PVRDTSCP is used for a HVM container and TSC scaling is
> available, use the non-zero value of argument gtsc_khz of tsc_set_info()
> as the guest TSC frequency rather than using the host TSC
> frequency. Otherwise, TSC scaling will not be able get the correct ratio
> between the host and guest TSC frequencies.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>   xen/arch/x86/time.c | 11 +++++++++--
>   1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index b5223cf..1091e69 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1803,6 +1803,8 @@ void tsc_set_info(struct domain *d,
>                     uint32_t tsc_mode, uint64_t elapsed_nsec,
>                     uint32_t gtsc_khz, uint32_t incarnation)
>   {
> +    bool_t enable_tsc_scaling;
> +
>       if ( is_idle_domain(d) || is_hardware_domain(d) )
>       {
>           d->arch.vtsc = 0;
> @@ -1864,7 +1866,9 @@ void tsc_set_info(struct domain *d,
>       case TSC_MODE_PVRDTSCP:
>           d->arch.vtsc = !boot_cpu_has(X86_FEATURE_RDTSCP) ||
>                          !host_tsc_is_safe();
> -        d->arch.tsc_khz = cpu_khz;
> +        enable_tsc_scaling = has_hvm_container_domain(d) &&
> +                             cpu_has_tsc_ratio && d->arch.vtsc;
> +        d->arch.tsc_khz = (enable_tsc_scaling && gtsc_khz) ? gtsc_khz : cpu_khz;
>           set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
>           d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);
>           if ( d->arch.vtsc )
> @@ -1872,7 +1876,10 @@ void tsc_set_info(struct domain *d,
>           else {
>               /* when using native TSC, offset is nsec relative to power-on
>                * of physical machine */
> -            d->arch.vtsc_offset = scale_delta(rdtsc(), &d->arch.vtsc_to_ns) -
> +            struct time_scale *scale = enable_tsc_scaling ?

Do we need this test here? Per previous chunk, enable_tsc_scaling will 
be zero since d->arch.vtsc is false.

Actually, if you are trying to use (or, rather, account for) TSC 
scaling, shouldn't it be !arch.vtsc there?


-boris

> +                                       &this_cpu(cpu_time).tsc_scale :
> +                                       &d->arch.vtsc_to_ns;
> +            d->arch.vtsc_offset = scale_delta(rdtsc(), scale) -
>                                     elapsed_nsec;
>           }
>           break;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-07 10:16     ` Haozhong Zhang
@ 2015-12-07 17:04       ` Andrew Cooper
  2015-12-08  2:13         ` Haozhong Zhang
  2015-12-10 10:43         ` Tian, Kevin
  0 siblings, 2 replies; 61+ messages in thread
From: Andrew Cooper @ 2015-12-07 17:04 UTC (permalink / raw)
  To: Egger, Christoph, xen-devel, Keir Fraser, Jan Beulich,
	Boris Ostrovsky, Suravee Suthikulpanit, Aravind Gopalakrishnan,
	Jun Nakajima, Kevin Tian

On 07/12/15 10:16, Haozhong Zhang wrote:
> On 12/07/15 10:03, Egger, Christoph wrote:
>> Did you consider nested virtualization?
>> L1 hypervisor may have a different tsc scaling
>> and L2 guest again may have a different tsc scale ratio.
>>
> Oh, I forgot this. I'll check the nested TSC scaling code (mostly
> about nested SVM TSC ratio, because this patch series does not expose
> VMX TSC scaling to L1 guest).
>
> BTW, are there any practical usage scenarios of nested TSC scaling? If
> any, I may also need to consider adding nested virtualization support
> to VMX TSC scaling.

I would say that there are genuine uses for nested TSC scaling.  An L1
hypervisor is going to want to scale for the same reasons as the L0
hypervisor.

Having said that, if TSC scaling is correctly hidden from the L1 guests,
I would advise against conflating the two issues together.  i.e. getting
nested TSC scaling working is not a prerequisite for accepting this series.

~Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  2015-12-06 20:58 ` [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info() Haozhong Zhang
@ 2015-12-07 17:53   ` Boris Ostrovsky
  2015-12-08  1:08     ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 17:53 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> When the TSC mode of a HVM container is TSC_MODE_DEFAULT or
> TSC_MODE_PVRDTSCP and no TSC emulation is used, the existing
> tsc_get_info() uses the host TSC frequency (cpu_khz) as the guest TSC
> frequency. However, tsc_set_info() may set the guest TSC frequency to a
> value different than the host. In order to keep consistent to
> tsc_set_info(), this patch makes tsc_get_info() use the value set by
> tsc_set_info() as the guest TSC frequency.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>   xen/arch/x86/time.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index 1091e69..95df4f1 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1749,6 +1749,9 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>                     uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
>                     uint32_t *incarnation)
>   {
> +    bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
> +                                cpu_has_tsc_ratio;

&& !d->arch.vtsc ?

(assuming my comment to the previous patch is correct).


> +
>       *incarnation = d->arch.incarnation;
>       *tsc_mode = d->arch.tsc_mode;
>   
> @@ -1769,7 +1772,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>           }
>           tsc = rdtsc();
>           *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns);
> -        *gtsc_khz = cpu_khz;
> +        *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : cpu_khz;
>           break;
>       case TSC_MODE_PVRDTSCP:
>           if ( d->arch.vtsc )
> @@ -1779,10 +1782,13 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>           }
>           else
>           {
> +            struct time_scale *scale = enable_tsc_scaling ?
> +                                       &this_cpu(cpu_time).tsc_scale :
> +                                       &d->arch.vtsc_to_ns;

IIUIC tsc_scale is host property and so why would it be used if TSC 
scaling is available to guests?

-boris


>               tsc = rdtsc();
> -            *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns) -
> +            *elapsed_nsec = scale_delta(tsc, scale) -
>                               d->arch.vtsc_offset;
> -            *gtsc_khz = 0; /* ignored by tsc_set_info */
> +            *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : 0;
>           }
>           break;
>       }

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC
  2015-12-06 20:58 ` [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC Haozhong Zhang
@ 2015-12-07 17:54   ` Boris Ostrovsky
  0 siblings, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 17:54 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> The existing hvm_[set|get]_guest_tsc_fixed() calculate the guest TSC by
> adding the TSC offset to the host TSC. When the TSC scaling is enabled,
> the host TSC should be scaled first. This patch adds the scaling logic
> to those two functions.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>


Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-06 20:58 ` [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly Haozhong Zhang
@ 2015-12-07 18:14   ` Boris Ostrovsky
  2015-12-08  1:52     ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:14 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Joao Martins

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> This patch makes the pvclock return the scaled host TSC and
> corresponding scaling parameters to HVM domains if guest TSC is not
> emulated and TSC scaling is enabled.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

+Joao who has been staring at this code.

Joao, can you run this series through your test with non-native 
frequency? 
(http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html 
provides an interface to set it in config file).


> ---
>   xen/arch/x86/time.c | 16 ++++++++++++----
>   1 file changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index 95df4f1..732d1e9 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
>       }
>       else
>       {
> -        tsc_stamp = t->local_tsc_stamp;
> -
> -        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
> -        _u.tsc_shift         = (s8)t->tsc_scale.shift;
> +        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
> +        {
> +            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
> +            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
> +            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;

I am not sure this is correct (which is why I asked Joao to look at this 
and test). The scaler below is calculated as result of TSC calibration 
across physical CPUs and what you use above (vtsc_to_ns) is an 
uncalibrated value.

-boris

> +        }
> +        else
> +        {
> +            tsc_stamp            = t->local_tsc_stamp;
> +            _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
> +            _u.tsc_shift         = (s8)t->tsc_scale.shift;
> +        }
>       }
>   
>       _u.tsc_timestamp = tsc_stamp;

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset()
  2015-12-06 20:58 ` [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset() Haozhong Zhang
@ 2015-12-07 18:25   ` Boris Ostrovsky
  2015-12-08  1:54     ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:25 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> Now every caller passes an already scaled offset to
> svm_set_tsc_offset(), so it's not necessary to recalculate a scaled TSC
> offset in svm_set_tsc_offset().
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>   xen/arch/x86/hvm/svm/svm.c | 15 ++-------------
>   1 file changed, 2 insertions(+), 13 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 9d5f2c3..3d22dfa 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -826,19 +826,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
>       struct vmcb_struct *n1vmcb, *n2vmcb;
>       uint64_t n2_tsc_offset = 0;
>       struct domain *d = v->domain;
> -    uint64_t host_tsc, guest_tsc;
> -
> -    guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
> -
> -    /* Re-adjust the offset value when TSC_RATIO is available */
> -    if ( cpu_has_tsc_ratio && !d->arch.vtsc )
> -    {
> -        if ( at_tsc )
> -            host_tsc = at_tsc;
> -        else
> -            host_tsc = rdtsc();
> -        offset = svm_get_tsc_offset(host_tsc, guest_tsc, vcpu_tsc_ratio(v));
> -    }
> +    uint64_t guest_tsc;

You could declare it at the first use site below.

Other than that,

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

>   
>       if ( !nestedhvm_enabled(d) ) {
>           vmcb_set_tsc_offset(vmcb, offset);
> @@ -854,6 +842,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
>           n2_tsc_offset = vmcb_get_tsc_offset(n2vmcb) -
>               vmcb_get_tsc_offset(n1vmcb);
>           if ( svm->ns_tscratio != DEFAULT_TSC_RATIO ) {
> +            guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
>               n2_tsc_offset = svm_get_tsc_offset(guest_tsc,
>                   guest_tsc + n2_tsc_offset, svm->ns_tscratio);
>           }

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio
  2015-12-06 20:58 ` [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
@ 2015-12-07 18:31   ` Boris Ostrovsky
  2015-12-10 10:19   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:31 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> Both VMX TSC scaling and SVM TSC ratio use the 64-bit TSC scaling ratio,
> but the number of fractional bits of the ratio is different between VMX
> and SVM. This patch adds the architecture code to collect the number of
> fractional bits and other related information into fields of struct
> hvm_function_table so that they can be used in the common code.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 09/14] x86/hvm: Setup TSC scaling ratio
  2015-12-06 20:58 ` [PATCH v2 09/14] x86/hvm: Setup " Haozhong Zhang
@ 2015-12-07 18:42   ` Boris Ostrovsky
  2015-12-08  1:58     ` Haozhong Zhang
  2015-12-10 10:27   ` Tian, Kevin
  1 sibling, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:42 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> This patch adds a field tsc_scaling_ratio in struct hvm_vcpu to
> record the TSC scaling ratio, and sets it up when tsc_set_info() is
> called for a vcpu or when a vcpu is restored or reset.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

with a few nits below.

> ---
>   xen/arch/x86/hvm/hvm.c            | 30 ++++++++++++++++++++++++++++++
>   xen/arch/x86/hvm/svm/svm.c        |  6 ++++--
>   xen/arch/x86/time.c               | 13 ++++++++++++-
>   xen/include/asm-x86/hvm/hvm.h     |  5 +++++
>   xen/include/asm-x86/hvm/svm/svm.h |  3 ---
>   xen/include/asm-x86/hvm/vcpu.h    |  2 ++
>   xen/include/asm-x86/math64.h      | 30 ++++++++++++++++++++++++++++++
>   7 files changed, 83 insertions(+), 6 deletions(-)
>   create mode 100644 xen/include/asm-x86/math64.h
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 0e63c33..52a0ef8 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -65,6 +65,7 @@
>   #include <asm/mtrr.h>
>   #include <asm/apic.h>
>   #include <asm/vm_event.h>
> +#include <asm/math64.h>
>   #include <public/sched.h>
>   #include <public/hvm/ioreq.h>
>   #include <public/version.h>
> @@ -301,6 +302,29 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
>       return 1;
>   }
>   
> +void hvm_setup_tsc_scaling(struct vcpu *v)
> +{
> +    u64 ratio;
> +
> +    if ( !hvm_funcs.tsc_scaling_supported )
> +        return;
> +
> +    /*
> +     * The multiplication of the first two terms may overflow a 64-bit
> +     * integer, so use mul_u64_u32_div() instead to keep precision.
> +     */
> +    ratio = mul_u64_u32_div(1ULL << hvm_funcs.tsc_scaling_ratio_frac_bits,
> +                            v->domain->arch.tsc_khz, cpu_khz);
> +
> +    if ( ratio == 0 || ratio > hvm_funcs.max_tsc_scaling_ratio )
> +        return;
> +
> +    v->arch.hvm_vcpu.tsc_scaling_ratio = ratio;
> +
> +    if ( hvm_funcs.setup_tsc_scaling )
> +        hvm_funcs.setup_tsc_scaling(v);
> +}
> +
>   void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
>   {
>       uint64_t tsc;
> @@ -2024,6 +2048,9 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
>       if ( hvm_funcs.load_cpu_ctxt(v, &ctxt) < 0 )
>           return -EINVAL;
>   
> +    if ( hvm_funcs.tsc_scaling_supported )
> +        hvm_setup_tsc_scaling(v);
> +


I think you can drop the test here (and below) since you do the same 
check inside hvm_setup_tsc_scaling()


>       v->arch.hvm_vcpu.msr_tsc_aux = ctxt.msr_tsc_aux;
>   
>       seg.limit = ctxt.idtr_limit;
> @@ -5584,6 +5611,9 @@ void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
>       hvm_set_segment_register(v, x86_seg_gdtr, &reg);
>       hvm_set_segment_register(v, x86_seg_idtr, &reg);
>   
> +    if ( hvm_funcs.tsc_scaling_supported )
> +        hvm_setup_tsc_scaling(v);
> +
>       /* Sync AP's TSC with BSP's. */
>       v->arch.hvm_vcpu.cache_tsc_offset =
>           v->domain->vcpu[0]->arch.hvm_vcpu.cache_tsc_offset;
> diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> index 9f741d0..d291327 100644
> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -811,7 +811,7 @@ static uint64_t svm_scale_tsc(struct vcpu *v, uint64_t tsc)
>       if ( !cpu_has_tsc_ratio || d->arch.vtsc )
>           return tsc;
>   
> -    return scale_tsc(tsc, vcpu_tsc_ratio(v));
> +    return scale_tsc(tsc, v->arch.hvm_vcpu.tsc_scaling_ratio);
>   }
>   
>   static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
> @@ -987,7 +987,7 @@ static inline void svm_tsc_ratio_save(struct vcpu *v)
>   static inline void svm_tsc_ratio_load(struct vcpu *v)
>   {
>       if ( cpu_has_tsc_ratio && !v->domain->arch.vtsc )
> -        wrmsrl(MSR_AMD64_TSC_RATIO, vcpu_tsc_ratio(v));
> +        wrmsrl(MSR_AMD64_TSC_RATIO, v->arch.hvm_vcpu.tsc_scaling_ratio);
>   }
>   
>   static void svm_ctxt_switch_from(struct vcpu *v)
> @@ -1193,6 +1193,8 @@ static int svm_vcpu_initialise(struct vcpu *v)
>   
>       svm_guest_osvw_init(v);
>   
> +    v->arch.hvm_vcpu.tsc_scaling_ratio = DEFAULT_TSC_RATIO;
> +
>       return 0;
>   }
>   
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index 732d1e9..d4a94eb 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -1902,8 +1902,19 @@ void tsc_set_info(struct domain *d,
>       if ( has_hvm_container_domain(d) )
>       {
>           hvm_set_rdtsc_exiting(d, d->arch.vtsc);
> -        if ( d->vcpu && d->vcpu[0] && incarnation == 0 )
> +        if ( d->vcpu && d->vcpu[0] )
>           {
> +             /*
> +             * TSC scaling ratio on BSP is set here during initial boot, while

During domain creation rather then during boot.

> +             * the same TSC scaling ratio on APs will be set in
> +             * hvm_vcpu_reset_state().
> +             */
> +            if ( !d->arch.vtsc && hvm_funcs.tsc_scaling_supported )
> +                hvm_setup_tsc_scaling(d->vcpu[0]);
> +
> +            if ( incarnation )
> +                return;
> +
>               /*
>                * set_tsc_offset() is called from hvm_vcpu_initialise() before
>                * tsc_set_info(). New vtsc mode may require recomputing TSC
> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index 8b10a67..e74524e 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -226,6 +226,9 @@ struct hvm_function_table {
>       int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
>   
>       uint64_t (*scale_tsc)(struct vcpu *v, uint64_t tsc);
> +
> +    /* Architecture function to setup TSC scaling ratio */
> +    void (*setup_tsc_scaling)(struct vcpu *v);
>   };
>   
>   extern struct hvm_function_table hvm_funcs;
> @@ -261,6 +264,8 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
>   u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
>   #define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed(v, 0)
>   
> +void hvm_setup_tsc_scaling(struct vcpu *v);
> +
>   int hvm_set_mode(struct vcpu *v, int mode);
>   void hvm_init_guest_time(struct domain *d);
>   void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
> diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
> index d60ec23..c954b7e 100644
> --- a/xen/include/asm-x86/hvm/svm/svm.h
> +++ b/xen/include/asm-x86/hvm/svm/svm.h
> @@ -97,9 +97,6 @@ extern u32 svm_feature_flags;
>   /* TSC rate */
>   #define DEFAULT_TSC_RATIO       0x0000000100000000ULL
>   #define TSC_RATIO_RSVD_BITS     0xffffff0000000000ULL
> -#define TSC_RATIO(g_khz, h_khz) ( (((u64)(g_khz)<<32)/(u64)(h_khz)) & \
> -                                  ~TSC_RATIO_RSVD_BITS )
> -#define vcpu_tsc_ratio(v)       TSC_RATIO((v)->domain->arch.tsc_khz, cpu_khz)
>   
>   extern void svm_host_osvw_reset(void);
>   extern void svm_host_osvw_init(void);
> diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
> index 152d9f3..901c988 100644
> --- a/xen/include/asm-x86/hvm/vcpu.h
> +++ b/xen/include/asm-x86/hvm/vcpu.h
> @@ -175,6 +175,8 @@ struct hvm_vcpu {
>       u64                 msr_tsc_adjust;
>       u64                 msr_xss;
>   
> +    u64                 tsc_scaling_ratio;
> +
>       union {
>           struct arch_vmx_struct vmx;
>           struct arch_svm_struct svm;
> diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
> new file mode 100644
> index 0000000..9af6aee
> --- /dev/null
> +++ b/xen/include/asm-x86/math64.h
> @@ -0,0 +1,30 @@
> +#ifndef __X86_MATH64
> +#define __X86_MATH64
> +
> +/*
> + * (a * mul) / divisor
> + */
> +static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)

Since this is taken from Linux code I thinkit's worth mentioning so.

-boris

> +{
> +    union {
> +        u64 ll;
> +        struct {
> +            u32 low, high;
> +        } l;
> +    } u, rl, rh;
> +
> +    u.ll = a;
> +    rl.ll = (u64)u.l.low * mul;
> +    rh.ll = (u64)u.l.high * mul + rl.l.high;
> +
> +    /* Bits 32-63 of the result will be in rh.l.low. */
> +    rl.l.high = do_div(rh.ll, divisor);
> +
> +    /* Bits 0-31 of the result will be in rl.l.low. */
> +    do_div(rl.ll, divisor);
> +
> +    rl.l.high = rh.l.low;
> +    return rl.ll;
> +}
> +
> +#endif

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function
  2015-12-06 20:58 ` [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
@ 2015-12-07 18:55   ` Boris Ostrovsky
  2015-12-08  2:06     ` Haozhong Zhang
  2015-12-10 10:29   ` Tian, Kevin
  1 sibling, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:55 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> This patch implements a common function hvm_scale_tsc() to scale TSC by
> using TSC scaling information collected by architecture code.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>   xen/arch/x86/hvm/hvm.c        | 18 +++++++++--
>   xen/arch/x86/hvm/svm/svm.c    | 12 --------
>   xen/arch/x86/time.c           |  2 +-
>   xen/include/asm-x86/hvm/hvm.h |  3 +-
>   xen/include/asm-x86/math64.h  | 70 +++++++++++++++++++++++++++++++++++++++++++
>   5 files changed, 88 insertions(+), 17 deletions(-)
>
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 52a0ef8..1e3d89f 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -302,6 +302,20 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
>       return 1;
>   }
>   
> +u64 hvm_scale_tsc(const struct vcpu *v, u64 tsc)
> +{
> +    u64 ratio = v->arch.hvm_vcpu.tsc_scaling_ratio;
> +
> +    if ( !hvm_funcs.tsc_scaling_supported ||
> +         ratio == hvm_funcs.default_tsc_scaling_ratio )
> +        return tsc;
> +
> +    BUG_ON(hvm_funcs.tsc_scaling_ratio_frac_bits >= 64 ||
> +           hvm_funcs.tsc_scaling_ratio_frac_bits == 0);

Since these values never change, I am not sure it's worth testing this. 
And if you think it is then I'd do it somewhere in initialization code.

> +
> +    return mul_u64_u64_shr(tsc, ratio, hvm_funcs.tsc_scaling_ratio_frac_bits);
> +}
> +



> diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
> index 9af6aee..c6a472b 100644
> --- a/xen/include/asm-x86/math64.h
> +++ b/xen/include/asm-x86/math64.h
> @@ -27,4 +27,74 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
>       return rl.ll;
>   }
>   
> +/*
> + * Multiply two 64-bit unsigned integers a and b. The most and least
> + * significant 64 bits of the 128-bit result are returned through hi
> + * and lo respectively.
> + */
> +static inline void mul64(u64 *lo, u64 *hi, u64 a, u64 b)

Please move these routines (as well as one in previous patch into a 
standalone math patch (and provide attribution to wherever they came from).

-boris

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code
  2015-12-06 20:58 ` [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
@ 2015-12-07 18:56   ` Boris Ostrovsky
  2015-12-10 10:30   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:56 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> Both VMX and SVM save/load vcpu's TSC when saving/loading vcpu's
> context, so this patch moves saving/loading vcpu's TSC to the common
> functions hvm_[save|load]_cpu_ctxt().
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>   xen/arch/x86/hvm/hvm.c     | 4 ++++
>   xen/arch/x86/hvm/svm/svm.c | 5 -----
>   xen/arch/x86/hvm/vmx/vmx.c | 5 -----
>   3 files changed, 4 insertions(+), 10 deletions(-)

Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs
  2015-12-06 20:58 ` [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs Haozhong Zhang
@ 2015-12-07 18:58   ` Boris Ostrovsky
  2015-12-10 10:31   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-07 18:58 UTC (permalink / raw)
  To: Haozhong Zhang, xen-devel
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, Aravind Gopalakrishnan, Suravee Suthikulpanit

On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> This patch uses hvm_funcs.tsc_scaling_supported instead of the
> architecture code to detect the TSC scaling support.
>
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>
> ---
>   xen/arch/x86/time.c | 9 ++++-----
>   1 file changed, 4 insertions(+), 5 deletions(-)
>
>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency in tsc_set_info()
  2015-12-07 16:57   ` Boris Ostrovsky
@ 2015-12-08  0:30     ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  0:30 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/07/15 11:57, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >When TSC_MODE_PVRDTSCP is used for a HVM container and TSC scaling is
> >available, use the non-zero value of argument gtsc_khz of tsc_set_info()
> >as the guest TSC frequency rather than using the host TSC
> >frequency. Otherwise, TSC scaling will not be able get the correct ratio
> >between the host and guest TSC frequencies.
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >  xen/arch/x86/time.c | 11 +++++++++--
> >  1 file changed, 9 insertions(+), 2 deletions(-)
> >
> >diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >index b5223cf..1091e69 100644
> >--- a/xen/arch/x86/time.c
> >+++ b/xen/arch/x86/time.c
> >@@ -1803,6 +1803,8 @@ void tsc_set_info(struct domain *d,
> >                    uint32_t tsc_mode, uint64_t elapsed_nsec,
> >                    uint32_t gtsc_khz, uint32_t incarnation)
> >  {
> >+    bool_t enable_tsc_scaling;
> >+
> >      if ( is_idle_domain(d) || is_hardware_domain(d) )
> >      {
> >          d->arch.vtsc = 0;
> >@@ -1864,7 +1866,9 @@ void tsc_set_info(struct domain *d,
> >      case TSC_MODE_PVRDTSCP:
> >          d->arch.vtsc = !boot_cpu_has(X86_FEATURE_RDTSCP) ||
> >                         !host_tsc_is_safe();
> >-        d->arch.tsc_khz = cpu_khz;
> >+        enable_tsc_scaling = has_hvm_container_domain(d) &&
> >+                             cpu_has_tsc_ratio && d->arch.vtsc;
> >+        d->arch.tsc_khz = (enable_tsc_scaling && gtsc_khz) ? gtsc_khz : cpu_khz;
> >          set_time_scale(&d->arch.vtsc_to_ns, d->arch.tsc_khz * 1000 );
> >          d->arch.ns_to_vtsc = scale_reciprocal(d->arch.vtsc_to_ns);
> >          if ( d->arch.vtsc )
> >@@ -1872,7 +1876,10 @@ void tsc_set_info(struct domain *d,
> >          else {
> >              /* when using native TSC, offset is nsec relative to power-on
> >               * of physical machine */
> >-            d->arch.vtsc_offset = scale_delta(rdtsc(), &d->arch.vtsc_to_ns) -
> >+            struct time_scale *scale = enable_tsc_scaling ?
> 
> Do we need this test here? Per previous chunk, enable_tsc_scaling will be
> zero since d->arch.vtsc is false.
> 
> Actually, if you are trying to use (or, rather, account for) TSC scaling,
> shouldn't it be !arch.vtsc there?
>

It really should be !arch.vtsc. Don't know why I missed ! here but my
testing code did not.

Thanks,
Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  2015-12-07 17:53   ` Boris Ostrovsky
@ 2015-12-08  1:08     ` Haozhong Zhang
  2015-12-08 18:26       ` Boris Ostrovsky
  0 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  1:08 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/07/15 12:53, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >When the TSC mode of a HVM container is TSC_MODE_DEFAULT or
> >TSC_MODE_PVRDTSCP and no TSC emulation is used, the existing
> >tsc_get_info() uses the host TSC frequency (cpu_khz) as the guest TSC
> >frequency. However, tsc_set_info() may set the guest TSC frequency to a
> >value different than the host. In order to keep consistent to
> >tsc_set_info(), this patch makes tsc_get_info() use the value set by
> >tsc_set_info() as the guest TSC frequency.
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >  xen/arch/x86/time.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> >diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >index 1091e69..95df4f1 100644
> >--- a/xen/arch/x86/time.c
> >+++ b/xen/arch/x86/time.c
> >@@ -1749,6 +1749,9 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >                    uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
> >                    uint32_t *incarnation)
> >  {
> >+    bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
> >+                                cpu_has_tsc_ratio;
> 
> && !d->arch.vtsc ?
> 
> (assuming my comment to the previous patch is correct).
>

enable_tsc_scaling in tsc_get_info() is always used in the condition
!d->arch.vtsc, so it's not necessary to include it again in
enable_tsc_scaling.

> 
> >+
> >      *incarnation = d->arch.incarnation;
> >      *tsc_mode = d->arch.tsc_mode;
> >@@ -1769,7 +1772,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >          }
> >          tsc = rdtsc();
> >          *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns);
> >-        *gtsc_khz = cpu_khz;
> >+        *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : cpu_khz;
> >          break;
> >      case TSC_MODE_PVRDTSCP:
> >          if ( d->arch.vtsc )
> >@@ -1779,10 +1782,13 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >          }
> >          else
> >          {
> >+            struct time_scale *scale = enable_tsc_scaling ?
> >+                                       &this_cpu(cpu_time).tsc_scale :
> >+                                       &d->arch.vtsc_to_ns;
> 
> IIUIC tsc_scale is host property and so why would it be used if TSC scaling
> is available to guests?
>

scale is used below to convert a host TSC to nanosec. When TSC scaling is used,
d->arch.vtsc_to_ns may not base on the host TSC frequency, so I turn to the host
tsc_scale instead.

Haozhong

> -boris
> 
> 
> >              tsc = rdtsc();
> >-            *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns) -
> >+            *elapsed_nsec = scale_delta(tsc, scale) -
> >                              d->arch.vtsc_offset;
> >-            *gtsc_khz = 0; /* ignored by tsc_set_info */
> >+            *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : 0;
> >          }
> >          break;
> >      }
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-07 18:14   ` Boris Ostrovsky
@ 2015-12-08  1:52     ` Haozhong Zhang
  2015-12-08 19:21       ` Boris Ostrovsky
  0 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  1:52 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Joao Martins

On 12/07/15 13:14, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >This patch makes the pvclock return the scaled host TSC and
> >corresponding scaling parameters to HVM domains if guest TSC is not
> >emulated and TSC scaling is enabled.
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> +Joao who has been staring at this code.
> 
> Joao, can you run this series through your test with non-native frequency?
> (http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
> provides an interface to set it in config file).
> 
> 
> >---
> >  xen/arch/x86/time.c | 16 ++++++++++++----
> >  1 file changed, 12 insertions(+), 4 deletions(-)
> >
> >diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >index 95df4f1..732d1e9 100644
> >--- a/xen/arch/x86/time.c
> >+++ b/xen/arch/x86/time.c
> >@@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
> >      }
> >      else
> >      {
> >-        tsc_stamp = t->local_tsc_stamp;
> >-
> >-        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
> >-        _u.tsc_shift         = (s8)t->tsc_scale.shift;
> >+        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
> >+        {
> >+            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
> >+            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
> >+            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
> 
> I am not sure this is correct (which is why I asked Joao to look at this and
> test). The scaler below is calculated as result of TSC calibration across
> physical CPUs and what you use above (vtsc_to_ns) is an uncalibrated value.
>

Because guest TSC is synchronized among all vcpus of a domain, I think
it's safe to use d->arch.vtsc_to_ns here.

Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset()
  2015-12-07 18:25   ` Boris Ostrovsky
@ 2015-12-08  1:54     ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  1:54 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/07/15 13:25, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >Now every caller passes an already scaled offset to
> >svm_set_tsc_offset(), so it's not necessary to recalculate a scaled TSC
> >offset in svm_set_tsc_offset().
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >  xen/arch/x86/hvm/svm/svm.c | 15 ++-------------
> >  1 file changed, 2 insertions(+), 13 deletions(-)
> >
> >diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> >index 9d5f2c3..3d22dfa 100644
> >--- a/xen/arch/x86/hvm/svm/svm.c
> >+++ b/xen/arch/x86/hvm/svm/svm.c
> >@@ -826,19 +826,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
> >      struct vmcb_struct *n1vmcb, *n2vmcb;
> >      uint64_t n2_tsc_offset = 0;
> >      struct domain *d = v->domain;
> >-    uint64_t host_tsc, guest_tsc;
> >-
> >-    guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
> >-
> >-    /* Re-adjust the offset value when TSC_RATIO is available */
> >-    if ( cpu_has_tsc_ratio && !d->arch.vtsc )
> >-    {
> >-        if ( at_tsc )
> >-            host_tsc = at_tsc;
> >-        else
> >-            host_tsc = rdtsc();
> >-        offset = svm_get_tsc_offset(host_tsc, guest_tsc, vcpu_tsc_ratio(v));
> >-    }
> >+    uint64_t guest_tsc;
> 
> You could declare it at the first use site below.
>

will move to the first use site.

Haozhong

> Other than that,
> 
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> >      if ( !nestedhvm_enabled(d) ) {
> >          vmcb_set_tsc_offset(vmcb, offset);
> >@@ -854,6 +842,7 @@ static void svm_set_tsc_offset(struct vcpu *v, u64 offset, u64 at_tsc)
> >          n2_tsc_offset = vmcb_get_tsc_offset(n2vmcb) -
> >              vmcb_get_tsc_offset(n1vmcb);
> >          if ( svm->ns_tscratio != DEFAULT_TSC_RATIO ) {
> >+            guest_tsc = hvm_get_guest_tsc_fixed(v, at_tsc);
> >              n2_tsc_offset = svm_get_tsc_offset(guest_tsc,
> >                  guest_tsc + n2_tsc_offset, svm->ns_tscratio);
> >          }
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 09/14] x86/hvm: Setup TSC scaling ratio
  2015-12-07 18:42   ` Boris Ostrovsky
@ 2015-12-08  1:58     ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  1:58 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/07/15 13:42, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >This patch adds a field tsc_scaling_ratio in struct hvm_vcpu to
> >record the TSC scaling ratio, and sets it up when tsc_set_info() is
> >called for a vcpu or when a vcpu is restored or reset.
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
> 
> with a few nits below.
> 
> >---
> >  xen/arch/x86/hvm/hvm.c            | 30 ++++++++++++++++++++++++++++++
> >  xen/arch/x86/hvm/svm/svm.c        |  6 ++++--
> >  xen/arch/x86/time.c               | 13 ++++++++++++-
> >  xen/include/asm-x86/hvm/hvm.h     |  5 +++++
> >  xen/include/asm-x86/hvm/svm/svm.h |  3 ---
> >  xen/include/asm-x86/hvm/vcpu.h    |  2 ++
> >  xen/include/asm-x86/math64.h      | 30 ++++++++++++++++++++++++++++++
> >  7 files changed, 83 insertions(+), 6 deletions(-)
> >  create mode 100644 xen/include/asm-x86/math64.h
> >
> >diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >index 0e63c33..52a0ef8 100644
> >--- a/xen/arch/x86/hvm/hvm.c
> >+++ b/xen/arch/x86/hvm/hvm.c
> >@@ -65,6 +65,7 @@
> >  #include <asm/mtrr.h>
> >  #include <asm/apic.h>
> >  #include <asm/vm_event.h>
> >+#include <asm/math64.h>
> >  #include <public/sched.h>
> >  #include <public/hvm/ioreq.h>
> >  #include <public/version.h>
> >@@ -301,6 +302,29 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
> >      return 1;
> >  }
> >+void hvm_setup_tsc_scaling(struct vcpu *v)
> >+{
> >+    u64 ratio;
> >+
> >+    if ( !hvm_funcs.tsc_scaling_supported )
> >+        return;
> >+
> >+    /*
> >+     * The multiplication of the first two terms may overflow a 64-bit
> >+     * integer, so use mul_u64_u32_div() instead to keep precision.
> >+     */
> >+    ratio = mul_u64_u32_div(1ULL << hvm_funcs.tsc_scaling_ratio_frac_bits,
> >+                            v->domain->arch.tsc_khz, cpu_khz);
> >+
> >+    if ( ratio == 0 || ratio > hvm_funcs.max_tsc_scaling_ratio )
> >+        return;
> >+
> >+    v->arch.hvm_vcpu.tsc_scaling_ratio = ratio;
> >+
> >+    if ( hvm_funcs.setup_tsc_scaling )
> >+        hvm_funcs.setup_tsc_scaling(v);
> >+}
> >+
> >  void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc)
> >  {
> >      uint64_t tsc;
> >@@ -2024,6 +2048,9 @@ static int hvm_load_cpu_ctxt(struct domain *d, hvm_domain_context_t *h)
> >      if ( hvm_funcs.load_cpu_ctxt(v, &ctxt) < 0 )
> >          return -EINVAL;
> >+    if ( hvm_funcs.tsc_scaling_supported )
> >+        hvm_setup_tsc_scaling(v);
> >+
> 
> 
> I think you can drop the test here (and below) since you do the same check
> inside hvm_setup_tsc_scaling()
>

Yes, the check here is redundant.

> 
> >      v->arch.hvm_vcpu.msr_tsc_aux = ctxt.msr_tsc_aux;
> >      seg.limit = ctxt.idtr_limit;
> >@@ -5584,6 +5611,9 @@ void hvm_vcpu_reset_state(struct vcpu *v, uint16_t cs, uint16_t ip)
> >      hvm_set_segment_register(v, x86_seg_gdtr, &reg);
> >      hvm_set_segment_register(v, x86_seg_idtr, &reg);
> >+    if ( hvm_funcs.tsc_scaling_supported )
> >+        hvm_setup_tsc_scaling(v);
> >+
> >      /* Sync AP's TSC with BSP's. */
> >      v->arch.hvm_vcpu.cache_tsc_offset =
> >          v->domain->vcpu[0]->arch.hvm_vcpu.cache_tsc_offset;
> >diff --git a/xen/arch/x86/hvm/svm/svm.c b/xen/arch/x86/hvm/svm/svm.c
> >index 9f741d0..d291327 100644
> >--- a/xen/arch/x86/hvm/svm/svm.c
> >+++ b/xen/arch/x86/hvm/svm/svm.c
> >@@ -811,7 +811,7 @@ static uint64_t svm_scale_tsc(struct vcpu *v, uint64_t tsc)
> >      if ( !cpu_has_tsc_ratio || d->arch.vtsc )
> >          return tsc;
> >-    return scale_tsc(tsc, vcpu_tsc_ratio(v));
> >+    return scale_tsc(tsc, v->arch.hvm_vcpu.tsc_scaling_ratio);
> >  }
> >  static uint64_t svm_get_tsc_offset(uint64_t host_tsc, uint64_t guest_tsc,
> >@@ -987,7 +987,7 @@ static inline void svm_tsc_ratio_save(struct vcpu *v)
> >  static inline void svm_tsc_ratio_load(struct vcpu *v)
> >  {
> >      if ( cpu_has_tsc_ratio && !v->domain->arch.vtsc )
> >-        wrmsrl(MSR_AMD64_TSC_RATIO, vcpu_tsc_ratio(v));
> >+        wrmsrl(MSR_AMD64_TSC_RATIO, v->arch.hvm_vcpu.tsc_scaling_ratio);
> >  }
> >  static void svm_ctxt_switch_from(struct vcpu *v)
> >@@ -1193,6 +1193,8 @@ static int svm_vcpu_initialise(struct vcpu *v)
> >      svm_guest_osvw_init(v);
> >+    v->arch.hvm_vcpu.tsc_scaling_ratio = DEFAULT_TSC_RATIO;
> >+
> >      return 0;
> >  }
> >diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >index 732d1e9..d4a94eb 100644
> >--- a/xen/arch/x86/time.c
> >+++ b/xen/arch/x86/time.c
> >@@ -1902,8 +1902,19 @@ void tsc_set_info(struct domain *d,
> >      if ( has_hvm_container_domain(d) )
> >      {
> >          hvm_set_rdtsc_exiting(d, d->arch.vtsc);
> >-        if ( d->vcpu && d->vcpu[0] && incarnation == 0 )
> >+        if ( d->vcpu && d->vcpu[0] )
> >          {
> >+             /*
> >+             * TSC scaling ratio on BSP is set here during initial boot, while
> 
> During domain creation rather then during boot.
>

Yes, will change.

> >+             * the same TSC scaling ratio on APs will be set in
> >+             * hvm_vcpu_reset_state().
> >+             */
> >+            if ( !d->arch.vtsc && hvm_funcs.tsc_scaling_supported )
> >+                hvm_setup_tsc_scaling(d->vcpu[0]);
> >+
> >+            if ( incarnation )
> >+                return;
> >+
> >              /*
> >               * set_tsc_offset() is called from hvm_vcpu_initialise() before
> >               * tsc_set_info(). New vtsc mode may require recomputing TSC
> >diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> >index 8b10a67..e74524e 100644
> >--- a/xen/include/asm-x86/hvm/hvm.h
> >+++ b/xen/include/asm-x86/hvm/hvm.h
> >@@ -226,6 +226,9 @@ struct hvm_function_table {
> >      int (*altp2m_vcpu_emulate_vmfunc)(struct cpu_user_regs *regs);
> >      uint64_t (*scale_tsc)(struct vcpu *v, uint64_t tsc);
> >+
> >+    /* Architecture function to setup TSC scaling ratio */
> >+    void (*setup_tsc_scaling)(struct vcpu *v);
> >  };
> >  extern struct hvm_function_table hvm_funcs;
> >@@ -261,6 +264,8 @@ void hvm_set_guest_tsc_fixed(struct vcpu *v, u64 guest_tsc, u64 at_tsc);
> >  u64 hvm_get_guest_tsc_fixed(struct vcpu *v, u64 at_tsc);
> >  #define hvm_get_guest_tsc(v) hvm_get_guest_tsc_fixed(v, 0)
> >+void hvm_setup_tsc_scaling(struct vcpu *v);
> >+
> >  int hvm_set_mode(struct vcpu *v, int mode);
> >  void hvm_init_guest_time(struct domain *d);
> >  void hvm_set_guest_time(struct vcpu *v, u64 guest_time);
> >diff --git a/xen/include/asm-x86/hvm/svm/svm.h b/xen/include/asm-x86/hvm/svm/svm.h
> >index d60ec23..c954b7e 100644
> >--- a/xen/include/asm-x86/hvm/svm/svm.h
> >+++ b/xen/include/asm-x86/hvm/svm/svm.h
> >@@ -97,9 +97,6 @@ extern u32 svm_feature_flags;
> >  /* TSC rate */
> >  #define DEFAULT_TSC_RATIO       0x0000000100000000ULL
> >  #define TSC_RATIO_RSVD_BITS     0xffffff0000000000ULL
> >-#define TSC_RATIO(g_khz, h_khz) ( (((u64)(g_khz)<<32)/(u64)(h_khz)) & \
> >-                                  ~TSC_RATIO_RSVD_BITS )
> >-#define vcpu_tsc_ratio(v)       TSC_RATIO((v)->domain->arch.tsc_khz, cpu_khz)
> >  extern void svm_host_osvw_reset(void);
> >  extern void svm_host_osvw_init(void);
> >diff --git a/xen/include/asm-x86/hvm/vcpu.h b/xen/include/asm-x86/hvm/vcpu.h
> >index 152d9f3..901c988 100644
> >--- a/xen/include/asm-x86/hvm/vcpu.h
> >+++ b/xen/include/asm-x86/hvm/vcpu.h
> >@@ -175,6 +175,8 @@ struct hvm_vcpu {
> >      u64                 msr_tsc_adjust;
> >      u64                 msr_xss;
> >+    u64                 tsc_scaling_ratio;
> >+
> >      union {
> >          struct arch_vmx_struct vmx;
> >          struct arch_svm_struct svm;
> >diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
> >new file mode 100644
> >index 0000000..9af6aee
> >--- /dev/null
> >+++ b/xen/include/asm-x86/math64.h
> >@@ -0,0 +1,30 @@
> >+#ifndef __X86_MATH64
> >+#define __X86_MATH64
> >+
> >+/*
> >+ * (a * mul) / divisor
> >+ */
> >+static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
> 
> Since this is taken from Linux code I thinkit's worth mentioning so.
>

Yes, I'll add comments for it (and other code taken from Linux).

Haozhong

> -boris
> 
> >+{
> >+    union {
> >+        u64 ll;
> >+        struct {
> >+            u32 low, high;
> >+        } l;
> >+    } u, rl, rh;
> >+
> >+    u.ll = a;
> >+    rl.ll = (u64)u.l.low * mul;
> >+    rh.ll = (u64)u.l.high * mul + rl.l.high;
> >+
> >+    /* Bits 32-63 of the result will be in rh.l.low. */
> >+    rl.l.high = do_div(rh.ll, divisor);
> >+
> >+    /* Bits 0-31 of the result will be in rl.l.low. */
> >+    do_div(rl.ll, divisor);
> >+
> >+    rl.l.high = rh.l.low;
> >+    return rl.ll;
> >+}
> >+
> >+#endif
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function
  2015-12-07 18:55   ` Boris Ostrovsky
@ 2015-12-08  2:06     ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  2:06 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/07/15 13:55, Boris Ostrovsky wrote:
> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >This patch implements a common function hvm_scale_tsc() to scale TSC by
> >using TSC scaling information collected by architecture code.
> >
> >Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >---
> >  xen/arch/x86/hvm/hvm.c        | 18 +++++++++--
> >  xen/arch/x86/hvm/svm/svm.c    | 12 --------
> >  xen/arch/x86/time.c           |  2 +-
> >  xen/include/asm-x86/hvm/hvm.h |  3 +-
> >  xen/include/asm-x86/math64.h  | 70 +++++++++++++++++++++++++++++++++++++++++++
> >  5 files changed, 88 insertions(+), 17 deletions(-)
> >
> >diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> >index 52a0ef8..1e3d89f 100644
> >--- a/xen/arch/x86/hvm/hvm.c
> >+++ b/xen/arch/x86/hvm/hvm.c
> >@@ -302,6 +302,20 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
> >      return 1;
> >  }
> >+u64 hvm_scale_tsc(const struct vcpu *v, u64 tsc)
> >+{
> >+    u64 ratio = v->arch.hvm_vcpu.tsc_scaling_ratio;
> >+
> >+    if ( !hvm_funcs.tsc_scaling_supported ||
> >+         ratio == hvm_funcs.default_tsc_scaling_ratio )
> >+        return tsc;
> >+
> >+    BUG_ON(hvm_funcs.tsc_scaling_ratio_frac_bits >= 64 ||
> >+           hvm_funcs.tsc_scaling_ratio_frac_bits == 0);
> 
> Since these values never change, I am not sure it's worth testing this. And
> if you think it is then I'd do it somewhere in initialization code.
>

I'll remove this unnecessary check.

> >+
> >+    return mul_u64_u64_shr(tsc, ratio, hvm_funcs.tsc_scaling_ratio_frac_bits);
> >+}
> >+
> 
> 
> 
> >diff --git a/xen/include/asm-x86/math64.h b/xen/include/asm-x86/math64.h
> >index 9af6aee..c6a472b 100644
> >--- a/xen/include/asm-x86/math64.h
> >+++ b/xen/include/asm-x86/math64.h
> >@@ -27,4 +27,74 @@ static inline u64 mul_u64_u32_div(u64 a, u32 mul, u32 divisor)
> >      return rl.ll;
> >  }
> >+/*
> >+ * Multiply two 64-bit unsigned integers a and b. The most and least
> >+ * significant 64 bits of the 128-bit result are returned through hi
> >+ * and lo respectively.
> >+ */
> >+static inline void mul64(u64 *lo, u64 *hi, u64 a, u64 b)
> 
> Please move these routines (as well as one in previous patch into a
> standalone math patch (and provide attribution to wherever they came from).
>

I'll move them to a standalone patch and add comments for their source.

Thanks,
Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-07 17:04       ` Andrew Cooper
@ 2015-12-08  2:13         ` Haozhong Zhang
  2015-12-10 10:43         ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-08  2:13 UTC (permalink / raw)
  To: Andrew Cooper
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima, Egger,
	Christoph, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 12/07/15 17:04, Andrew Cooper wrote:
> On 07/12/15 10:16, Haozhong Zhang wrote:
> > On 12/07/15 10:03, Egger, Christoph wrote:
> >> Did you consider nested virtualization?
> >> L1 hypervisor may have a different tsc scaling
> >> and L2 guest again may have a different tsc scale ratio.
> >>
> > Oh, I forgot this. I'll check the nested TSC scaling code (mostly
> > about nested SVM TSC ratio, because this patch series does not expose
> > VMX TSC scaling to L1 guest).
> >
> > BTW, are there any practical usage scenarios of nested TSC scaling? If
> > any, I may also need to consider adding nested virtualization support
> > to VMX TSC scaling.
> 
> I would say that there are genuine uses for nested TSC scaling.  An L1
> hypervisor is going to want to scale for the same reasons as the L0
> hypervisor.
>
nice to know this.

> Having said that, if TSC scaling is correctly hidden from the L1 guests,
> I would advise against conflating the two issues together.  i.e. getting
> nested TSC scaling working is not a prerequisite for accepting this series.
>

agree, I prefer to posting other standalone patches to support nested
VMX TSC scaling.

Thanks,
Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  2015-12-08  1:08     ` Haozhong Zhang
@ 2015-12-08 18:26       ` Boris Ostrovsky
  2015-12-10  0:14         ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-08 18:26 UTC (permalink / raw)
  To: xen-devel, Keir Fraser, Jan Beulich, Andrew Cooper,
	Suravee Suthikulpanit, Aravind Gopalakrishnan, Jun Nakajima,
	Kevin Tian

On 12/07/2015 08:08 PM, Haozhong Zhang wrote:
> On 12/07/15 12:53, Boris Ostrovsky wrote:
>> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
>>> When the TSC mode of a HVM container is TSC_MODE_DEFAULT or
>>> TSC_MODE_PVRDTSCP and no TSC emulation is used, the existing
>>> tsc_get_info() uses the host TSC frequency (cpu_khz) as the guest TSC
>>> frequency. However, tsc_set_info() may set the guest TSC frequency to a
>>> value different than the host. In order to keep consistent to
>>> tsc_set_info(), this patch makes tsc_get_info() use the value set by
>>> tsc_set_info() as the guest TSC frequency.
>>>
>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>>> ---
>>>   xen/arch/x86/time.c | 12 +++++++++---
>>>   1 file changed, 9 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
>>> index 1091e69..95df4f1 100644
>>> --- a/xen/arch/x86/time.c
>>> +++ b/xen/arch/x86/time.c
>>> @@ -1749,6 +1749,9 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>>>                     uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
>>>                     uint32_t *incarnation)
>>>   {
>>> +    bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
>>> +                                cpu_has_tsc_ratio;
>> && !d->arch.vtsc ?
>>
>> (assuming my comment to the previous patch is correct).
>>
> enable_tsc_scaling in tsc_get_info() is always used in the condition
> !d->arch.vtsc, so it's not necessary to include it again in
> enable_tsc_scaling.
>
>>> +
>>>       *incarnation = d->arch.incarnation;
>>>       *tsc_mode = d->arch.tsc_mode;
>>> @@ -1769,7 +1772,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>>>           }
>>>           tsc = rdtsc();
>>>           *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns);
>>> -        *gtsc_khz = cpu_khz;
>>> +        *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : cpu_khz;
>>>           break;
>>>       case TSC_MODE_PVRDTSCP:
>>>           if ( d->arch.vtsc )
>>> @@ -1779,10 +1782,13 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
>>>           }
>>>           else
>>>           {
>>> +            struct time_scale *scale = enable_tsc_scaling ?
>>> +                                       &this_cpu(cpu_time).tsc_scale :
>>> +                                       &d->arch.vtsc_to_ns;
>> IIUIC tsc_scale is host property and so why would it be used if TSC scaling
>> is available to guests?
>>
> scale is used below to convert a host TSC to nanosec. When TSC scaling is used,
> d->arch.vtsc_to_ns may not base on the host TSC frequency, so I turn to the host
> tsc_scale instead.

OK. Can we then use tsc_scale for both with and without TSC scaling? (on 
the set side too).

(And as a side note, I think that the comment in 
include/asm-x86/domain.h describing vtsc_to_ns as being used for 
emulated cases is rather misleading. We use it for both emulated and native)

-boris


>
> Haozhong
>
>> -boris
>>
>>
>>>               tsc = rdtsc();
>>> -            *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns) -
>>> +            *elapsed_nsec = scale_delta(tsc, scale) -
>>>                               d->arch.vtsc_offset;
>>> -            *gtsc_khz = 0; /* ignored by tsc_set_info */
>>> +            *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : 0;
>>>           }
>>>           break;
>>>       }

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-08  1:52     ` Haozhong Zhang
@ 2015-12-08 19:21       ` Boris Ostrovsky
  2015-12-10  0:23         ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Boris Ostrovsky @ 2015-12-08 19:21 UTC (permalink / raw)
  To: xen-devel, Keir Fraser, Jan Beulich, Andrew Cooper,
	Suravee Suthikulpanit, Aravind Gopalakrishnan, Jun Nakajima,
	Kevin Tian, Joao Martins

On 12/07/2015 08:52 PM, Haozhong Zhang wrote:
> On 12/07/15 13:14, Boris Ostrovsky wrote:
>> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
>>> This patch makes the pvclock return the scaled host TSC and
>>> corresponding scaling parameters to HVM domains if guest TSC is not
>>> emulated and TSC scaling is enabled.
>>>
>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>> +Joao who has been staring at this code.
>>
>> Joao, can you run this series through your test with non-native frequency?
>> (http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
>> provides an interface to set it in config file).
>>
>>
>>> ---
>>>   xen/arch/x86/time.c | 16 ++++++++++++----
>>>   1 file changed, 12 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
>>> index 95df4f1..732d1e9 100644
>>> --- a/xen/arch/x86/time.c
>>> +++ b/xen/arch/x86/time.c
>>> @@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
>>>       }
>>>       else
>>>       {
>>> -        tsc_stamp = t->local_tsc_stamp;
>>> -
>>> -        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
>>> -        _u.tsc_shift         = (s8)t->tsc_scale.shift;
>>> +        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
>>> +        {
>>> +            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
>>> +            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
>>> +            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
>> I am not sure this is correct (which is why I asked Joao to look at this and
>> test). The scaler below is calculated as result of TSC calibration across
>> physical CPUs and what you use above (vtsc_to_ns) is an uncalibrated value.
>>
> Because guest TSC is synchronized among all vcpus of a domain, I think
> it's safe to use d->arch.vtsc_to_ns here.

This is only guaranteed if we have constant/reliable TSC. So perhaps you 
should set tsc_scaling_supported only when either (or both?) of these 
TSC properties are true. Which is probably the case anyway but may be 
worth explicitly checking in start_svm/vmx? (and use 
tsc_scaling_supported instead of cpu_has_tsc_ratio in the 'if' statement)

And just like I asked in the previous email --- should we then use the 
same scaler (which would be vtsc_to_ns) in both cases? At least for 
guests in HVM containers (it may work for PV guests as well, but it 
would need to be confirmed).

Also, I noticed that this routine uses is_hvm_domain(). I think it 
should be has_hvm_container_domain().

-boris

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info()
  2015-12-08 18:26       ` Boris Ostrovsky
@ 2015-12-10  0:14         ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-10  0:14 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit

On 12/08/15 13:26, Boris Ostrovsky wrote:
> On 12/07/2015 08:08 PM, Haozhong Zhang wrote:
> >On 12/07/15 12:53, Boris Ostrovsky wrote:
> >>On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >>>When the TSC mode of a HVM container is TSC_MODE_DEFAULT or
> >>>TSC_MODE_PVRDTSCP and no TSC emulation is used, the existing
> >>>tsc_get_info() uses the host TSC frequency (cpu_khz) as the guest TSC
> >>>frequency. However, tsc_set_info() may set the guest TSC frequency to a
> >>>value different than the host. In order to keep consistent to
> >>>tsc_set_info(), this patch makes tsc_get_info() use the value set by
> >>>tsc_set_info() as the guest TSC frequency.
> >>>
> >>>Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >>>---
> >>>  xen/arch/x86/time.c | 12 +++++++++---
> >>>  1 file changed, 9 insertions(+), 3 deletions(-)
> >>>
> >>>diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >>>index 1091e69..95df4f1 100644
> >>>--- a/xen/arch/x86/time.c
> >>>+++ b/xen/arch/x86/time.c
> >>>@@ -1749,6 +1749,9 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >>>                    uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
> >>>                    uint32_t *incarnation)
> >>>  {
> >>>+    bool_t enable_tsc_scaling = has_hvm_container_domain(d) &&
> >>>+                                cpu_has_tsc_ratio;
> >>&& !d->arch.vtsc ?
> >>
> >>(assuming my comment to the previous patch is correct).
> >>
> >enable_tsc_scaling in tsc_get_info() is always used in the condition
> >!d->arch.vtsc, so it's not necessary to include it again in
> >enable_tsc_scaling.
> >
> >>>+
> >>>      *incarnation = d->arch.incarnation;
> >>>      *tsc_mode = d->arch.tsc_mode;
> >>>@@ -1769,7 +1772,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >>>          }
> >>>          tsc = rdtsc();
> >>>          *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns);
> >>>-        *gtsc_khz = cpu_khz;
> >>>+        *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : cpu_khz;
> >>>          break;
> >>>      case TSC_MODE_PVRDTSCP:
> >>>          if ( d->arch.vtsc )
> >>>@@ -1779,10 +1782,13 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
> >>>          }
> >>>          else
> >>>          {
> >>>+            struct time_scale *scale = enable_tsc_scaling ?
> >>>+                                       &this_cpu(cpu_time).tsc_scale :
> >>>+                                       &d->arch.vtsc_to_ns;
> >>IIUIC tsc_scale is host property and so why would it be used if TSC scaling
> >>is available to guests?
> >>
> >scale is used below to convert a host TSC to nanosec. When TSC scaling is used,
> >d->arch.vtsc_to_ns may not base on the host TSC frequency, so I turn to the host
> >tsc_scale instead.
> 
> OK. Can we then use tsc_scale for both with and without TSC scaling? (on the
> set side too).
>
Yes, we can, because early_time_init() calculates tsc_scale based on
cpu_khz. I'll change it in the next version.

> (And as a side note, I think that the comment in include/asm-x86/domain.h
> describing vtsc_to_ns as being used for emulated cases is rather misleading.
> We use it for both emulated and native)
>
agree, I'll update those comments in the next version.

Haozhong

> -boris
> 
> 
> >
> >Haozhong
> >
> >>-boris
> >>
> >>
> >>>              tsc = rdtsc();
> >>>-            *elapsed_nsec = scale_delta(tsc, &d->arch.vtsc_to_ns) -
> >>>+            *elapsed_nsec = scale_delta(tsc, scale) -
> >>>                              d->arch.vtsc_offset;
> >>>-            *gtsc_khz = 0; /* ignored by tsc_set_info */
> >>>+            *gtsc_khz = enable_tsc_scaling ? d->arch.tsc_khz : 0;
> >>>          }
> >>>          break;
> >>>      }
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-08 19:21       ` Boris Ostrovsky
@ 2015-12-10  0:23         ` Haozhong Zhang
  2015-12-10 11:56           ` Joao Martins
  0 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-10  0:23 UTC (permalink / raw)
  To: Boris Ostrovsky
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Joao Martins

On 12/08/15 14:21, Boris Ostrovsky wrote:
> On 12/07/2015 08:52 PM, Haozhong Zhang wrote:
> >On 12/07/15 13:14, Boris Ostrovsky wrote:
> >>On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >>>This patch makes the pvclock return the scaled host TSC and
> >>>corresponding scaling parameters to HVM domains if guest TSC is not
> >>>emulated and TSC scaling is enabled.
> >>>
> >>>Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >>+Joao who has been staring at this code.
> >>
> >>Joao, can you run this series through your test with non-native frequency?
> >>(http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
> >>provides an interface to set it in config file).
> >>
> >>
> >>>---
> >>>  xen/arch/x86/time.c | 16 ++++++++++++----
> >>>  1 file changed, 12 insertions(+), 4 deletions(-)
> >>>
> >>>diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> >>>index 95df4f1..732d1e9 100644
> >>>--- a/xen/arch/x86/time.c
> >>>+++ b/xen/arch/x86/time.c
> >>>@@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
> >>>      }
> >>>      else
> >>>      {
> >>>-        tsc_stamp = t->local_tsc_stamp;
> >>>-
> >>>-        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
> >>>-        _u.tsc_shift         = (s8)t->tsc_scale.shift;
> >>>+        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
> >>>+        {
> >>>+            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
> >>>+            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
> >>>+            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
> >>I am not sure this is correct (which is why I asked Joao to look at this and
> >>test). The scaler below is calculated as result of TSC calibration across
> >>physical CPUs and what you use above (vtsc_to_ns) is an uncalibrated value.
> >>
> >Because guest TSC is synchronized among all vcpus of a domain, I think
> >it's safe to use d->arch.vtsc_to_ns here.
> 
> This is only guaranteed if we have constant/reliable TSC. So perhaps you
> should set tsc_scaling_supported only when either (or both?) of these TSC
> properties are true. Which is probably the case anyway but may be worth
> explicitly checking in start_svm/vmx?

Yes, I'll add the additional check in the next version.

> (and use tsc_scaling_supported instead
> of cpu_has_tsc_ratio in the 'if' statement)

This one is only for bug fix, so tsc_scaling_supported has not been
introduced. Patch 8 introduces tsc_scaling_supported and patch 12
replaces all cpu_has_tsc_ratio with it.

> 
> And just like I asked in the previous email --- should we then use the same
> scaler (which would be vtsc_to_ns) in both cases? At least for guests in HVM
> containers (it may work for PV guests as well, but it would need to be
> confirmed).
>
Yes, but I'll check PV code first.

> Also, I noticed that this routine uses is_hvm_domain(). I think it should be
> has_hvm_container_domain().
>
forgot updating here, will do in the next version.

Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio
  2015-12-06 20:58 ` [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
  2015-12-07 18:31   ` Boris Ostrovsky
@ 2015-12-10 10:19   ` Tian, Kevin
  2015-12-10 10:27     ` Zhang, Haozhong
  1 sibling, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:19 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
ratio
> 
> Both VMX TSC scaling and SVM TSC ratio use the 64-bit TSC scaling ratio,
> but the number of fractional bits of the ratio is different between VMX
> and SVM. This patch adds the architecture code to collect the number of
> fractional bits and other related information into fields of struct
> hvm_function_table so that they can be used in the common code.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one comment


> diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> index aba63ab..8b10a67 100644
> --- a/xen/include/asm-x86/hvm/hvm.h
> +++ b/xen/include/asm-x86/hvm/hvm.h
> @@ -100,6 +100,18 @@ struct hvm_function_table {
>      unsigned int hap_capabilities;
> 
>      /*
> +     * Parameters of hardware-assisted TSC scaling.
> +     */
> +    /* is TSC scaling supported? */
> +    bool_t   tsc_scaling_supported;
> +    /* number of bits of the fractional part of TSC scaling ratio */
> +    uint8_t  tsc_scaling_ratio_frac_bits;
> +    /* default TSC scaling ratio (no scaling) */
> +    uint64_t default_tsc_scaling_ratio;
> +    /* maxmimum-allowed TSC scaling ratio */

maxmimum -> maximum

> +    uint64_t max_tsc_scaling_ratio;
> +
> +    /*
>       * Initialise/destroy HVM domain/vcpu resources
>       */
>      int  (*domain_initialise)(struct domain *d);
> --
> 2.6.3

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio
  2015-12-10 10:19   ` Tian, Kevin
@ 2015-12-10 10:27     ` Zhang, Haozhong
  0 siblings, 0 replies; 61+ messages in thread
From: Zhang, Haozhong @ 2015-12-10 10:27 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	xen-devel, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky

On 12/10/15 18:19, Tian, Kevin wrote:
> > From: Zhang, Haozhong
> > Sent: Monday, December 07, 2015 4:59 AM
> ratio
> > 
> > Both VMX TSC scaling and SVM TSC ratio use the 64-bit TSC scaling ratio,
> > but the number of fractional bits of the ratio is different between VMX
> > and SVM. This patch adds the architecture code to collect the number of
> > fractional bits and other related information into fields of struct
> > hvm_function_table so that they can be used in the common code.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> 
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>, with one comment
> 
> 
> > diff --git a/xen/include/asm-x86/hvm/hvm.h b/xen/include/asm-x86/hvm/hvm.h
> > index aba63ab..8b10a67 100644
> > --- a/xen/include/asm-x86/hvm/hvm.h
> > +++ b/xen/include/asm-x86/hvm/hvm.h
> > @@ -100,6 +100,18 @@ struct hvm_function_table {
> >      unsigned int hap_capabilities;
> > 
> >      /*
> > +     * Parameters of hardware-assisted TSC scaling.
> > +     */
> > +    /* is TSC scaling supported? */
> > +    bool_t   tsc_scaling_supported;
> > +    /* number of bits of the fractional part of TSC scaling ratio */
> > +    uint8_t  tsc_scaling_ratio_frac_bits;
> > +    /* default TSC scaling ratio (no scaling) */
> > +    uint64_t default_tsc_scaling_ratio;
> > +    /* maxmimum-allowed TSC scaling ratio */
> 
> maxmimum -> maximum

will fix in the next version

Thanks,
Haozhong

> 
> > +    uint64_t max_tsc_scaling_ratio;
> > +
> > +    /*
> >       * Initialise/destroy HVM domain/vcpu resources
> >       */
> >      int  (*domain_initialise)(struct domain *d);
> > --
> > 2.6.3
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 09/14] x86/hvm: Setup TSC scaling ratio
  2015-12-06 20:58 ` [PATCH v2 09/14] x86/hvm: Setup " Haozhong Zhang
  2015-12-07 18:42   ` Boris Ostrovsky
@ 2015-12-10 10:27   ` Tian, Kevin
  2015-12-10 10:37     ` Zhang, Haozhong
  1 sibling, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:27 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
> 
> This patch adds a field tsc_scaling_ratio in struct hvm_vcpu to
> record the TSC scaling ratio, and sets it up when tsc_set_info() is
> called for a vcpu or when a vcpu is restored or reset.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>  xen/arch/x86/hvm/hvm.c            | 30
> ++++++++++++++++++++++++++++++
>  xen/arch/x86/hvm/svm/svm.c        |  6 ++++--
>  xen/arch/x86/time.c               | 13 ++++++++++++-
>  xen/include/asm-x86/hvm/hvm.h     |  5 +++++
>  xen/include/asm-x86/hvm/svm/svm.h |  3 ---
>  xen/include/asm-x86/hvm/vcpu.h    |  2 ++
>  xen/include/asm-x86/math64.h      | 30
> ++++++++++++++++++++++++++++++
>  7 files changed, 83 insertions(+), 6 deletions(-)
>  create mode 100644 xen/include/asm-x86/math64.h
> 
> diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> index 0e63c33..52a0ef8 100644
> --- a/xen/arch/x86/hvm/hvm.c
> +++ b/xen/arch/x86/hvm/hvm.c
> @@ -65,6 +65,7 @@
>  #include <asm/mtrr.h>
>  #include <asm/apic.h>
>  #include <asm/vm_event.h>
> +#include <asm/math64.h>
>  #include <public/sched.h>
>  #include <public/hvm/ioreq.h>
>  #include <public/version.h>
> @@ -301,6 +302,29 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
>      return 1;
>  }
> 
> +void hvm_setup_tsc_scaling(struct vcpu *v)
> +{
> +    u64 ratio;
> +
> +    if ( !hvm_funcs.tsc_scaling_supported )
> +        return;
> +
> +    /*
> +     * The multiplication of the first two terms may overflow a 64-bit
> +     * integer, so use mul_u64_u32_div() instead to keep precision.
> +     */
> +    ratio = mul_u64_u32_div(1ULL << hvm_funcs.tsc_scaling_ratio_frac_bits,
> +                            v->domain->arch.tsc_khz, cpu_khz);
> +
> +    if ( ratio == 0 || ratio > hvm_funcs.max_tsc_scaling_ratio )
> +        return;

How will you check such error in other places? tsc_scaling_ratio is
left w/ default value, while if you don't detect the issue that that
ratio will be used for wrong scale...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function
  2015-12-06 20:58 ` [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
  2015-12-07 18:55   ` Boris Ostrovsky
@ 2015-12-10 10:29   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:29 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
> 
> This patch implements a common function hvm_scale_tsc() to scale TSC by
> using TSC scaling information collected by architecture code.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>, except the mul64 part.

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code
  2015-12-06 20:58 ` [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
  2015-12-07 18:56   ` Boris Ostrovsky
@ 2015-12-10 10:30   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:30 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
common code
> 
> Both VMX and SVM save/load vcpu's TSC when saving/loading vcpu's
> context, so this patch moves saving/loading vcpu's TSC to the common
> functions hvm_[save|load]_cpu_ctxt().
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs
  2015-12-06 20:58 ` [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs Haozhong Zhang
  2015-12-07 18:58   ` Boris Ostrovsky
@ 2015-12-10 10:31   ` Tian, Kevin
  1 sibling, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:31 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
> 
> This patch uses hvm_funcs.tsc_scaling_supported instead of the
> architecture code to detect the TSC scaling support.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> Acked-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support
  2015-12-06 20:58 ` [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support Haozhong Zhang
@ 2015-12-10 10:33   ` Tian, Kevin
  0 siblings, 0 replies; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:33 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
>
> This patch adds the initialization and setup code for VMX TSC scaling.
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>

Acked-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 09/14] x86/hvm: Setup TSC scaling ratio
  2015-12-10 10:27   ` Tian, Kevin
@ 2015-12-10 10:37     ` Zhang, Haozhong
  0 siblings, 0 replies; 61+ messages in thread
From: Zhang, Haozhong @ 2015-12-10 10:37 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	xen-devel, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky

On 12/10/15 18:27, Tian, Kevin wrote:
> > From: Zhang, Haozhong
> > Sent: Monday, December 07, 2015 4:59 AM
> > 
> > This patch adds a field tsc_scaling_ratio in struct hvm_vcpu to
> > record the TSC scaling ratio, and sets it up when tsc_set_info() is
> > called for a vcpu or when a vcpu is restored or reset.
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > ---
> >  xen/arch/x86/hvm/hvm.c            | 30
> > ++++++++++++++++++++++++++++++
> >  xen/arch/x86/hvm/svm/svm.c        |  6 ++++--
> >  xen/arch/x86/time.c               | 13 ++++++++++++-
> >  xen/include/asm-x86/hvm/hvm.h     |  5 +++++
> >  xen/include/asm-x86/hvm/svm/svm.h |  3 ---
> >  xen/include/asm-x86/hvm/vcpu.h    |  2 ++
> >  xen/include/asm-x86/math64.h      | 30
> > ++++++++++++++++++++++++++++++
> >  7 files changed, 83 insertions(+), 6 deletions(-)
> >  create mode 100644 xen/include/asm-x86/math64.h
> > 
> > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c
> > index 0e63c33..52a0ef8 100644
> > --- a/xen/arch/x86/hvm/hvm.c
> > +++ b/xen/arch/x86/hvm/hvm.c
> > @@ -65,6 +65,7 @@
> >  #include <asm/mtrr.h>
> >  #include <asm/apic.h>
> >  #include <asm/vm_event.h>
> > +#include <asm/math64.h>
> >  #include <public/sched.h>
> >  #include <public/hvm/ioreq.h>
> >  #include <public/version.h>
> > @@ -301,6 +302,29 @@ int hvm_set_guest_pat(struct vcpu *v, u64 guest_pat)
> >      return 1;
> >  }
> > 
> > +void hvm_setup_tsc_scaling(struct vcpu *v)
> > +{
> > +    u64 ratio;
> > +
> > +    if ( !hvm_funcs.tsc_scaling_supported )
> > +        return;
> > +
> > +    /*
> > +     * The multiplication of the first two terms may overflow a 64-bit
> > +     * integer, so use mul_u64_u32_div() instead to keep precision.
> > +     */
> > +    ratio = mul_u64_u32_div(1ULL << hvm_funcs.tsc_scaling_ratio_frac_bits,
> > +                            v->domain->arch.tsc_khz, cpu_khz);
> > +
> > +    if ( ratio == 0 || ratio > hvm_funcs.max_tsc_scaling_ratio )
> > +        return;
> 
> How will you check such error in other places? tsc_scaling_ratio is
> left w/ default value, while if you don't detect the issue that that
> ratio will be used for wrong scale...
>

The intention here is to fall back to the default ratio so that it
would work like no TSC scaling is used. However, I forgot here to fall
back v->domain->arch.tsc_khz and others to default values (i.e. values
used when no TSC scaling). I'll add them in the next version.

Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt
  2015-12-06 20:58 ` [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt Haozhong Zhang
@ 2015-12-10 10:40   ` Tian, Kevin
  2015-12-10 10:56     ` Zhang, Haozhong
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:40 UTC (permalink / raw)
  To: Zhang, Haozhong, xen-devel
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	Aravind Gopalakrishnan, Suravee Suthikulpanit, Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Monday, December 07, 2015 4:59 AM
g and tscmode.txt
> 
> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> ---
>  docs/man/xl.cfg.pod.5 | 15 ++++++++++++++-
>  docs/misc/tscmode.txt | 14 ++++++++++++++
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> index 2aca8dd..7e19a9b 100644
> --- a/docs/man/xl.cfg.pod.5
> +++ b/docs/man/xl.cfg.pod.5
> @@ -1313,9 +1313,18 @@ deprecated. Options are:
> 
>  =item B<"default">
> 
> -Guest rdtsc/p executed natively when monotonicity can be guaranteed
> +Guest rdtsc/p is executed natively when monotonicity can be guaranteed
>  and emulated otherwise (with frequency scaled if necessary).
> 
> +If a HVM container in B<default> TSC mode is not migrated from other hosts

"migrated from" -> "migrated to"?

> +and the host TSC monotonicity can be guaranteed, the guest and host TSC
> +frequencies will be the same.
> +
> +If a HVM container in B<default> TSC mode is migrated to a host that can
> +guarantee the TSC monotonicity and supports Intel VMX TSC scaling/AMD SVM

and -> or? Do we think TSC scaling a must to ensure TSC monotonicity? It comes
to the rescue only when host can't ensure monotonicity...

> +TSC ratio, guest rdtsc/p will still execute natively after migration and the
> +guest TSC frequencies before and after migration will be the same.

will be the same before and after migration.

> +
>  =item B<"always_emulate">
> 
>  Guest rdtsc/p always emulated at 1GHz (kernel and user). Guest rdtsc/p
> @@ -1337,6 +1346,10 @@ determine when a restore/migration has occurred and
> assumes guest
>  obtains/uses pvclock-like mechanism to adjust for monotonicity and
>  frequency changes.
> 
> +If a HVM container in B<native_paravirt> TSC mode can execute both guest
> +rdtsc and guest rdtscp natively, then the guest TSC frequency will be
> +determined in the similar way to that of B<default> TSC mode.
> +
>  =back
> 
>  Please see F<docs/misc/tscmode.txt> for more information on this option.
> diff --git a/docs/misc/tscmode.txt b/docs/misc/tscmode.txt
> index e8c84e8..f3b70be 100644
> --- a/docs/misc/tscmode.txt
> +++ b/docs/misc/tscmode.txt
> @@ -297,3 +297,17 @@ and also much faster than nearly all OS-provided time
> mechanisms.
>  While pvrtscp is too complex for most apps, certain enterprise
>  TSC-sensitive high-TSC-frequency apps may find it useful to
>  obtain a significant performance gain.
> +
> +Hardware TSC Scaling
> +
> +Intel VMX TSC scaling and AMD SVM TSC ratio allow the guest TSC read
> +by guest rdtsc/p increasing in the different frequency than the host

"in the different" -> "in a different"

> +TSC frequency.
> +
> +For a HVM container is in default TSC mode (tsc_mode=0) or PVRDTSCP

For a HVM container *which* is

> +mode (tsc_mode=3) and can execute both guest rdtsc and rdtscp
> +natively, if it is not migrated from other hosts, the guest and host
> +TSC frequencies will be the same. 

"the guest and host TSC frequencies remain the same if the guest is
not migrated to other host."

and the condition is that the host supports constant TSC feature.

> If it is migrated to a host
> +supporting Intel VMX TSC scaling/AMD SVM TSC ratio and can still
> +execute guest rdtsc and rdtscp natively, the guest TSC frequencies
> +before and after migration will be the same.
> --
> 2.6.3

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-07 17:04       ` Andrew Cooper
  2015-12-08  2:13         ` Haozhong Zhang
@ 2015-12-10 10:43         ` Tian, Kevin
  2015-12-10 11:13           ` Haozhong Zhang
  1 sibling, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2015-12-10 10:43 UTC (permalink / raw)
  To: Andrew Cooper, Egger, Christoph, xen-devel, Keir Fraser,
	Jan Beulich, Boris Ostrovsky, Suravee Suthikulpanit,
	Aravind Gopalakrishnan, Nakajima, Jun

> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> Sent: Tuesday, December 08, 2015 1:04 AM
> 
> On 07/12/15 10:16, Haozhong Zhang wrote:
> > On 12/07/15 10:03, Egger, Christoph wrote:
> >> Did you consider nested virtualization?
> >> L1 hypervisor may have a different tsc scaling
> >> and L2 guest again may have a different tsc scale ratio.
> >>
> > Oh, I forgot this. I'll check the nested TSC scaling code (mostly
> > about nested SVM TSC ratio, because this patch series does not expose
> > VMX TSC scaling to L1 guest).
> >
> > BTW, are there any practical usage scenarios of nested TSC scaling? If
> > any, I may also need to consider adding nested virtualization support
> > to VMX TSC scaling.
> 
> I would say that there are genuine uses for nested TSC scaling.  An L1
> hypervisor is going to want to scale for the same reasons as the L0
> hypervisor.
> 
> Having said that, if TSC scaling is correctly hidden from the L1 guests,
> I would advise against conflating the two issues together.  i.e. getting
> nested TSC scaling working is not a prerequisite for accepting this series.
> 

Why is nested TSC scaling even an issue? If L0 hypervisors cross hosts
can always guarantee constant TSC frequency through TSC scaling, L1
hypervisor will never see inconstant TSC frequency so why bother to
expose TSC scaling at all?

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt
  2015-12-10 10:40   ` Tian, Kevin
@ 2015-12-10 10:56     ` Zhang, Haozhong
  0 siblings, 0 replies; 61+ messages in thread
From: Zhang, Haozhong @ 2015-12-10 10:56 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Andrew Cooper,
	xen-devel, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky

On 12/10/15 18:40, Tian, Kevin wrote:
> > From: Zhang, Haozhong
> > Sent: Monday, December 07, 2015 4:59 AM
> g and tscmode.txt
> > 
> > Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> > ---
> >  docs/man/xl.cfg.pod.5 | 15 ++++++++++++++-
> >  docs/misc/tscmode.txt | 14 ++++++++++++++
> >  2 files changed, 28 insertions(+), 1 deletion(-)
> > 
> > diff --git a/docs/man/xl.cfg.pod.5 b/docs/man/xl.cfg.pod.5
> > index 2aca8dd..7e19a9b 100644
> > --- a/docs/man/xl.cfg.pod.5
> > +++ b/docs/man/xl.cfg.pod.5
> > @@ -1313,9 +1313,18 @@ deprecated. Options are:
> > 
> >  =item B<"default">
> > 
> > -Guest rdtsc/p executed natively when monotonicity can be guaranteed
> > +Guest rdtsc/p is executed natively when monotonicity can be guaranteed
> >  and emulated otherwise (with frequency scaled if necessary).
> > 
> > +If a HVM container in B<default> TSC mode is not migrated from other hosts
> 
> "migrated from" -> "migrated to"?
>

I mean "migrated from" here. If the current host supports TSC scaling
and a domain is migrated from another host w/ different host TSC
frequency, then domain may have a different guest TSC frequency than
the current host. Thus, "not migrated from other hosts" is used here
to eliminate such case.

> > +and the host TSC monotonicity can be guaranteed, the guest and host TSC
> > +frequencies will be the same.
> > +
> > +If a HVM container in B<default> TSC mode is migrated to a host that can
> > +guarantee the TSC monotonicity and supports Intel VMX TSC scaling/AMD SVM
> 
> and -> or? Do we think TSC scaling a must to ensure TSC monotonicity? It comes
> to the rescue only when host can't ensure monotonicity...
>

No, I intend to describe the guest behavior when hardware TSC scaling is used.

Really I should say "_host_ TSC monotonicity" here.

> > +TSC ratio, guest rdtsc/p will still execute natively after migration and the
> > +guest TSC frequencies before and after migration will be the same.
> 
> will be the same before and after migration.
>

will modify in the next version.

> > +
> >  =item B<"always_emulate">
> > 
> >  Guest rdtsc/p always emulated at 1GHz (kernel and user). Guest rdtsc/p
> > @@ -1337,6 +1346,10 @@ determine when a restore/migration has occurred and
> > assumes guest
> >  obtains/uses pvclock-like mechanism to adjust for monotonicity and
> >  frequency changes.
> > 
> > +If a HVM container in B<native_paravirt> TSC mode can execute both guest
> > +rdtsc and guest rdtscp natively, then the guest TSC frequency will be
> > +determined in the similar way to that of B<default> TSC mode.
> > +
> >  =back
> > 
> >  Please see F<docs/misc/tscmode.txt> for more information on this option.
> > diff --git a/docs/misc/tscmode.txt b/docs/misc/tscmode.txt
> > index e8c84e8..f3b70be 100644
> > --- a/docs/misc/tscmode.txt
> > +++ b/docs/misc/tscmode.txt
> > @@ -297,3 +297,17 @@ and also much faster than nearly all OS-provided time
> > mechanisms.
> >  While pvrtscp is too complex for most apps, certain enterprise
> >  TSC-sensitive high-TSC-frequency apps may find it useful to
> >  obtain a significant performance gain.
> > +
> > +Hardware TSC Scaling
> > +
> > +Intel VMX TSC scaling and AMD SVM TSC ratio allow the guest TSC read
> > +by guest rdtsc/p increasing in the different frequency than the host
> 
> "in the different" -> "in a different"
>

will modify

> > +TSC frequency.
> > +
> > +For a HVM container is in default TSC mode (tsc_mode=0) or PVRDTSCP
> 
> For a HVM container *which* is
>

stupid error... will modify

> > +mode (tsc_mode=3) and can execute both guest rdtsc and rdtscp
> > +natively, if it is not migrated from other hosts, the guest and host
> > +TSC frequencies will be the same. 
> 
> "the guest and host TSC frequencies remain the same if the guest is
> not migrated to other host."
> 
> and the condition is that the host supports constant TSC feature.
>

Yes, I'll modify in the next version.

Thanks,
Haozhong

> > If it is migrated to a host
> > +supporting Intel VMX TSC scaling/AMD SVM TSC ratio and can still
> > +execute guest rdtsc and rdtscp natively, the guest TSC frequencies
> > +before and after migration will be the same.
> > --
> > 2.6.3
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-10 10:43         ` Tian, Kevin
@ 2015-12-10 11:13           ` Haozhong Zhang
  2015-12-11  7:35             ` Tian, Kevin
  0 siblings, 1 reply; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-10 11:13 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Keir Fraser, Suravee Suthikulpanit, Nakajima, Jun, Andrew Cooper,
	Egger, Christoph, xen-devel, Aravind Gopalakrishnan, Jan Beulich,
	Boris Ostrovsky

On 12/10/15 10:43, Tian, Kevin wrote:
> > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> > Sent: Tuesday, December 08, 2015 1:04 AM
> > 
> > On 07/12/15 10:16, Haozhong Zhang wrote:
> > > On 12/07/15 10:03, Egger, Christoph wrote:
> > >> Did you consider nested virtualization?
> > >> L1 hypervisor may have a different tsc scaling
> > >> and L2 guest again may have a different tsc scale ratio.
> > >>
> > > Oh, I forgot this. I'll check the nested TSC scaling code (mostly
> > > about nested SVM TSC ratio, because this patch series does not expose
> > > VMX TSC scaling to L1 guest).
> > >
> > > BTW, are there any practical usage scenarios of nested TSC scaling? If
> > > any, I may also need to consider adding nested virtualization support
> > > to VMX TSC scaling.
> > 
> > I would say that there are genuine uses for nested TSC scaling.  An L1
> > hypervisor is going to want to scale for the same reasons as the L0
> > hypervisor.
> > 
> > Having said that, if TSC scaling is correctly hidden from the L1 guests,
> > I would advise against conflating the two issues together.  i.e. getting
> > nested TSC scaling working is not a prerequisite for accepting this series.
> > 
> 
> Why is nested TSC scaling even an issue? If L0 hypervisors cross hosts
> can always guarantee constant TSC frequency through TSC scaling, L1
> hypervisor will never see inconstant TSC frequency so why bother to
> expose TSC scaling at all?
>

If exposing TSC scaling to L1, then L0 hypervisor will need to adapt
the TSC scaling ratio set by L1 hypervisor, which has not been done by
this patch series.

Consider an example that the host TSC frequency is freq_0 and
1. L0 Xen sets TSC scaling ratio to a non-identical one ratio_0.

2. Then L1 hypervisor will observe a "host" TSC frequency (freq_0 * ratio_0).

3. If L1 hypervisor sets a TSC scaling ratio ratio_1, it intends to
   provide a guest TSC frequency (freq_0 * ratio_0 * ratio_1).

   However, if ratio_1 is directly written into nested VMCS that is
   later applied to the host CPU, then the L2 guest will get an
   incorrect guest TSC frequency (freq_0 * ratio_1).


Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-10  0:23         ` Haozhong Zhang
@ 2015-12-10 11:56           ` Joao Martins
  2015-12-29 18:22             ` Joao Martins
  0 siblings, 1 reply; 61+ messages in thread
From: Joao Martins @ 2015-12-10 11:56 UTC (permalink / raw)
  To: Boris Ostrovsky, Haozhong Zhang
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit



On 12/10/2015 12:23 AM, Haozhong Zhang wrote:
> On 12/08/15 14:21, Boris Ostrovsky wrote:
>> On 12/07/2015 08:52 PM, Haozhong Zhang wrote:
>>> On 12/07/15 13:14, Boris Ostrovsky wrote:
>>>> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
>>>>> This patch makes the pvclock return the scaled host TSC and
>>>>> corresponding scaling parameters to HVM domains if guest TSC is not
>>>>> emulated and TSC scaling is enabled.
>>>>>
>>>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>>>> +Joao who has been staring at this code.
>>>>
Apologies for late response but I was out in the beginning of the week and was
still catching up.

>>>> Joao, can you run this series through your test with non-native frequency?
>>>> (http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
>>>> provides an interface to set it in config file).
>>>>
OK, I will run it through my time warp tests.

>>>>
>>>>> ---
>>>>>  xen/arch/x86/time.c | 16 ++++++++++++----
>>>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>>>
>>>>> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
>>>>> index 95df4f1..732d1e9 100644
>>>>> --- a/xen/arch/x86/time.c
>>>>> +++ b/xen/arch/x86/time.c
>>>>> @@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
>>>>>      }
>>>>>      else
>>>>>      {
>>>>> -        tsc_stamp = t->local_tsc_stamp;
>>>>> -
>>>>> -        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
>>>>> -        _u.tsc_shift         = (s8)t->tsc_scale.shift;
>>>>> +        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
>>>>> +        {
>>>>> +            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
>>>>> +            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
>>>>> +            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
>>>> I am not sure this is correct (which is why I asked Joao to look at this and
>>>> test). The scaler below is calculated as result of TSC calibration across
>>>> physical CPUs and what you use above (vtsc_to_ns) is an uncalibrated value.
>>>>
>>> Because guest TSC is synchronized among all vcpus of a domain, I think
>>> it's safe to use d->arch.vtsc_to_ns here.
>>
>> This is only guaranteed if we have constant/reliable TSC. So perhaps you
>> should set tsc_scaling_supported only when either (or both?) of these TSC
>> properties are true. Which is probably the case anyway but may be worth
>> explicitly checking in start_svm/vmx?
> 
> Yes, I'll add the additional check in the next version.
> 
I believe constant TSC to be the only feature that is checked on
local_time_calibration.

>> (and use tsc_scaling_supported instead
>> of cpu_has_tsc_ratio in the 'if' statement)
> 
> This one is only for bug fix, so tsc_scaling_supported has not been
> introduced. Patch 8 introduces tsc_scaling_supported and patch 12
> replaces all cpu_has_tsc_ratio with it.
> 
>>
>> And just like I asked in the previous email --- should we then use the same
>> scaler (which would be vtsc_to_ns) in both cases? At least for guests in HVM
>> containers (it may work for PV guests as well, but it would need to be
>> confirmed).
>>
> Yes, but I'll check PV code first.
> 
>> Also, I noticed that this routine uses is_hvm_domain(). I think it should be
>> has_hvm_container_domain().
>>
> forgot updating here, will do in the next version.
> 
> Haozhong
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> http://lists.xen.org/xen-devel
> 

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-10 11:13           ` Haozhong Zhang
@ 2015-12-11  7:35             ` Tian, Kevin
  2015-12-11 10:09               ` Andrew Cooper
  0 siblings, 1 reply; 61+ messages in thread
From: Tian, Kevin @ 2015-12-11  7:35 UTC (permalink / raw)
  To: Zhang, Haozhong
  Cc: Keir Fraser, Suravee Suthikulpanit, Nakajima, Jun, Andrew Cooper,
	Egger, Christoph, xen-devel, Aravind Gopalakrishnan, Jan Beulich,
	Boris Ostrovsky

> From: Zhang, Haozhong
> Sent: Thursday, December 10, 2015 7:13 PM
> 
> On 12/10/15 10:43, Tian, Kevin wrote:
> > > From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
> > > Sent: Tuesday, December 08, 2015 1:04 AM
> > >
> > > On 07/12/15 10:16, Haozhong Zhang wrote:
> > > > On 12/07/15 10:03, Egger, Christoph wrote:
> > > >> Did you consider nested virtualization?
> > > >> L1 hypervisor may have a different tsc scaling
> > > >> and L2 guest again may have a different tsc scale ratio.
> > > >>
> > > > Oh, I forgot this. I'll check the nested TSC scaling code (mostly
> > > > about nested SVM TSC ratio, because this patch series does not expose
> > > > VMX TSC scaling to L1 guest).
> > > >
> > > > BTW, are there any practical usage scenarios of nested TSC scaling? If
> > > > any, I may also need to consider adding nested virtualization support
> > > > to VMX TSC scaling.
> > >
> > > I would say that there are genuine uses for nested TSC scaling.  An L1
> > > hypervisor is going to want to scale for the same reasons as the L0
> > > hypervisor.
> > >
> > > Having said that, if TSC scaling is correctly hidden from the L1 guests,
> > > I would advise against conflating the two issues together.  i.e. getting
> > > nested TSC scaling working is not a prerequisite for accepting this series.
> > >
> >
> > Why is nested TSC scaling even an issue? If L0 hypervisors cross hosts
> > can always guarantee constant TSC frequency through TSC scaling, L1
> > hypervisor will never see inconstant TSC frequency so why bother to
> > expose TSC scaling at all?
> >
> 
> If exposing TSC scaling to L1, then L0 hypervisor will need to adapt
> the TSC scaling ratio set by L1 hypervisor, which has not been done by
> this patch series.
> 
> Consider an example that the host TSC frequency is freq_0 and
> 1. L0 Xen sets TSC scaling ratio to a non-identical one ratio_0.
> 
> 2. Then L1 hypervisor will observe a "host" TSC frequency (freq_0 * ratio_0).
> 
> 3. If L1 hypervisor sets a TSC scaling ratio ratio_1, it intends to
>    provide a guest TSC frequency (freq_0 * ratio_0 * ratio_1).
> 
>    However, if ratio_1 is directly written into nested VMCS that is
>    later applied to the host CPU, then the L2 guest will get an
>    incorrect guest TSC frequency (freq_0 * ratio_1).
> 
> 

My point is why we need expose TSC scaling to L1, if L0 has ensures
constant TSC w/ TSC scaling ratio...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 00/14] Add VMX TSC scaling support
  2015-12-11  7:35             ` Tian, Kevin
@ 2015-12-11 10:09               ` Andrew Cooper
  0 siblings, 0 replies; 61+ messages in thread
From: Andrew Cooper @ 2015-12-11 10:09 UTC (permalink / raw)
  To: Tian, Kevin, Zhang, Haozhong
  Cc: Keir Fraser, Jan Beulich, Nakajima, Jun, Egger, Christoph,
	xen-devel, Aravind Gopalakrishnan, Suravee Suthikulpanit,
	Boris Ostrovsky

On 11/12/15 07:35, Tian, Kevin wrote:
>> From: Zhang, Haozhong
>> Sent: Thursday, December 10, 2015 7:13 PM
>>
>> On 12/10/15 10:43, Tian, Kevin wrote:
>>>> From: Andrew Cooper [mailto:andrew.cooper3@citrix.com]
>>>> Sent: Tuesday, December 08, 2015 1:04 AM
>>>>
>>>> On 07/12/15 10:16, Haozhong Zhang wrote:
>>>>> On 12/07/15 10:03, Egger, Christoph wrote:
>>>>>> Did you consider nested virtualization?
>>>>>> L1 hypervisor may have a different tsc scaling
>>>>>> and L2 guest again may have a different tsc scale ratio.
>>>>>>
>>>>> Oh, I forgot this. I'll check the nested TSC scaling code (mostly
>>>>> about nested SVM TSC ratio, because this patch series does not expose
>>>>> VMX TSC scaling to L1 guest).
>>>>>
>>>>> BTW, are there any practical usage scenarios of nested TSC scaling? If
>>>>> any, I may also need to consider adding nested virtualization support
>>>>> to VMX TSC scaling.
>>>> I would say that there are genuine uses for nested TSC scaling.  An L1
>>>> hypervisor is going to want to scale for the same reasons as the L0
>>>> hypervisor.
>>>>
>>>> Having said that, if TSC scaling is correctly hidden from the L1 guests,
>>>> I would advise against conflating the two issues together.  i.e. getting
>>>> nested TSC scaling working is not a prerequisite for accepting this series.
>>>>
>>> Why is nested TSC scaling even an issue? If L0 hypervisors cross hosts
>>> can always guarantee constant TSC frequency through TSC scaling, L1
>>> hypervisor will never see inconstant TSC frequency so why bother to
>>> expose TSC scaling at all?
>>>
>> If exposing TSC scaling to L1, then L0 hypervisor will need to adapt
>> the TSC scaling ratio set by L1 hypervisor, which has not been done by
>> this patch series.
>>
>> Consider an example that the host TSC frequency is freq_0 and
>> 1. L0 Xen sets TSC scaling ratio to a non-identical one ratio_0.
>>
>> 2. Then L1 hypervisor will observe a "host" TSC frequency (freq_0 * ratio_0).
>>
>> 3. If L1 hypervisor sets a TSC scaling ratio ratio_1, it intends to
>>    provide a guest TSC frequency (freq_0 * ratio_0 * ratio_1).
>>
>>    However, if ratio_1 is directly written into nested VMCS that is
>>    later applied to the host CPU, then the L2 guest will get an
>>    incorrect guest TSC frequency (freq_0 * ratio_1).
>>
>>
> My point is why we need expose TSC scaling to L1, if L0 has ensures
> constant TSC w/ TSC scaling ratio...

Because a set of L1 hypervisors might want to migrate VMs between them.

Or a more interesting setup migrating VMs between L0 and L1
hypervisors.  (I tried this once and it didn't blow up)

Once constant TSC has been advertised to a guest, any future hypervisor
it might find itself on must be able to guarantee to provide the same
frequency.

~Andrew

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-10 11:56           ` Joao Martins
@ 2015-12-29 18:22             ` Joao Martins
  2015-12-30  1:07               ` Haozhong Zhang
  0 siblings, 1 reply; 61+ messages in thread
From: Joao Martins @ 2015-12-29 18:22 UTC (permalink / raw)
  To: Haozhong Zhang
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky



On 12/10/2015 11:56 AM, Joao Martins wrote:
> 
> 
> On 12/10/2015 12:23 AM, Haozhong Zhang wrote:
>> On 12/08/15 14:21, Boris Ostrovsky wrote:
>>> On 12/07/2015 08:52 PM, Haozhong Zhang wrote:
>>>> On 12/07/15 13:14, Boris Ostrovsky wrote:
>>>>> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
>>>>>> This patch makes the pvclock return the scaled host TSC and
>>>>>> corresponding scaling parameters to HVM domains if guest TSC is not
>>>>>> emulated and TSC scaling is enabled.
>>>>>>
>>>>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
>>>>> +Joao who has been staring at this code.
>>>>>
> Apologies for late response but I was out in the beginning of the week and was
> still catching up.
> 
>>>>> Joao, can you run this series through your test with non-native frequency?
>>>>> (http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
>>>>> provides an interface to set it in config file).
>>>>>
> OK, I will run it through my time warp tests.
I have using these for a while now and so far I got no issues with a non-native
frequency [I don't have a Skylake processor in my possession to test it
natively]. In addition I rebased my vdso tests on top of your series and no
warps too. But will raise out any flags if I found anything wrong.

> 
>>>>>
>>>>>> ---
>>>>>>  xen/arch/x86/time.c | 16 ++++++++++++----
>>>>>>  1 file changed, 12 insertions(+), 4 deletions(-)
>>>>>>
>>>>>> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
>>>>>> index 95df4f1..732d1e9 100644
>>>>>> --- a/xen/arch/x86/time.c
>>>>>> +++ b/xen/arch/x86/time.c
>>>>>> @@ -815,10 +815,18 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
>>>>>>      }
>>>>>>      else
>>>>>>      {
>>>>>> -        tsc_stamp = t->local_tsc_stamp;
>>>>>> -
>>>>>> -        _u.tsc_to_system_mul = t->tsc_scale.mul_frac;
>>>>>> -        _u.tsc_shift         = (s8)t->tsc_scale.shift;
>>>>>> +        if ( is_hvm_domain(d) && cpu_has_tsc_ratio )
>>>>>> +        {
>>>>>> +            tsc_stamp            = hvm_funcs.scale_tsc(v, t->local_tsc_stamp);
>>>>>> +            _u.tsc_to_system_mul = d->arch.vtsc_to_ns.mul_frac;
>>>>>> +            _u.tsc_shift         = d->arch.vtsc_to_ns.shift;
>>>>> I am not sure this is correct (which is why I asked Joao to look at this and
>>>>> test). The scaler below is calculated as result of TSC calibration across
>>>>> physical CPUs and what you use above (vtsc_to_ns) is an uncalibrated value.
>>>>>
>>>> Because guest TSC is synchronized among all vcpus of a domain, I think
>>>> it's safe to use d->arch.vtsc_to_ns here.
>>>
>>> This is only guaranteed if we have constant/reliable TSC. So perhaps you
>>> should set tsc_scaling_supported only when either (or both?) of these TSC
>>> properties are true. Which is probably the case anyway but may be worth
>>> explicitly checking in start_svm/vmx?
>>
>> Yes, I'll add the additional check in the next version.
>>
> I believe constant TSC to be the only feature that is checked on
> local_time_calibration.
> 
>>> (and use tsc_scaling_supported instead
>>> of cpu_has_tsc_ratio in the 'if' statement)
>>
>> This one is only for bug fix, so tsc_scaling_supported has not been
>> introduced. Patch 8 introduces tsc_scaling_supported and patch 12
>> replaces all cpu_has_tsc_ratio with it.
>>
>>>
>>> And just like I asked in the previous email --- should we then use the same
>>> scaler (which would be vtsc_to_ns) in both cases? At least for guests in HVM
>>> containers (it may work for PV guests as well, but it would need to be
>>> confirmed).
>>>
>> Yes, but I'll check PV code first.
>>
>>> Also, I noticed that this routine uses is_hvm_domain(). I think it should be
>>> has_hvm_container_domain().
>>>
>> forgot updating here, will do in the next version.
>>
>> Haozhong
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@lists.xen.org
>> http://lists.xen.org/xen-devel
>>

^ permalink raw reply	[flat|nested] 61+ messages in thread

* Re: [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly
  2015-12-29 18:22             ` Joao Martins
@ 2015-12-30  1:07               ` Haozhong Zhang
  0 siblings, 0 replies; 61+ messages in thread
From: Haozhong Zhang @ 2015-12-30  1:07 UTC (permalink / raw)
  To: Joao Martins
  Cc: Kevin Tian, Keir Fraser, Jan Beulich, Jun Nakajima,
	Andrew Cooper, xen-devel, Aravind Gopalakrishnan,
	Suravee Suthikulpanit, Boris Ostrovsky

On 12/29/15 18:22, Joao Martins wrote:
> 
> 
> On 12/10/2015 11:56 AM, Joao Martins wrote:
> > 
> > 
> > On 12/10/2015 12:23 AM, Haozhong Zhang wrote:
> >> On 12/08/15 14:21, Boris Ostrovsky wrote:
> >>> On 12/07/2015 08:52 PM, Haozhong Zhang wrote:
> >>>> On 12/07/15 13:14, Boris Ostrovsky wrote:
> >>>>> On 12/06/2015 03:58 PM, Haozhong Zhang wrote:
> >>>>>> This patch makes the pvclock return the scaled host TSC and
> >>>>>> corresponding scaling parameters to HVM domains if guest TSC is not
> >>>>>> emulated and TSC scaling is enabled.
> >>>>>>
> >>>>>> Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
> >>>>> +Joao who has been staring at this code.
> >>>>>
> > Apologies for late response but I was out in the beginning of the week and was
> > still catching up.
> > 
> >>>>> Joao, can you run this series through your test with non-native frequency?
> >>>>> (http://lists.xenproject.org/archives/html/xen-devel/2015-12/msg00683.html
> >>>>> provides an interface to set it in config file).
> >>>>>
> > OK, I will run it through my time warp tests.
> I have using these for a while now and so far I got no issues with a non-native
> frequency [I don't have a Skylake processor in my possession to test it
> natively]. In addition I rebased my vdso tests on top of your series and no
> warps too. But will raise out any flags if I found anything wrong.
>

Thanks for the test!

Haozhong

^ permalink raw reply	[flat|nested] 61+ messages in thread

end of thread, other threads:[~2015-12-30  1:07 UTC | newest]

Thread overview: 61+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-06 20:58 [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
2015-12-06 20:58 ` [PATCH v2 01/14] svm: Fix incorrect TSC scaling Haozhong Zhang
2015-12-07 16:49   ` Boris Ostrovsky
2015-12-06 20:58 ` [PATCH v2 02/14] x86/time.c: Fix domain type check in tsc_set_info() Haozhong Zhang
2015-12-07 16:49   ` Boris Ostrovsky
2015-12-06 20:58 ` [PATCH v2 03/14] x86/time.c: Use correct guest TSC frequency " Haozhong Zhang
2015-12-07 16:57   ` Boris Ostrovsky
2015-12-08  0:30     ` Haozhong Zhang
2015-12-06 20:58 ` [PATCH v2 04/14] x86/time.c: Use correct guest TSC frequency in tsc_get_info() Haozhong Zhang
2015-12-07 17:53   ` Boris Ostrovsky
2015-12-08  1:08     ` Haozhong Zhang
2015-12-08 18:26       ` Boris Ostrovsky
2015-12-10  0:14         ` Haozhong Zhang
2015-12-06 20:58 ` [PATCH v2 05/14] x86/hvm: Scale host TSC when setting/getting guest TSC Haozhong Zhang
2015-12-07 17:54   ` Boris Ostrovsky
2015-12-06 20:58 ` [PATCH v2 06/14] x86/time.c: Scale host TSC in pvclock properly Haozhong Zhang
2015-12-07 18:14   ` Boris Ostrovsky
2015-12-08  1:52     ` Haozhong Zhang
2015-12-08 19:21       ` Boris Ostrovsky
2015-12-10  0:23         ` Haozhong Zhang
2015-12-10 11:56           ` Joao Martins
2015-12-29 18:22             ` Joao Martins
2015-12-30  1:07               ` Haozhong Zhang
2015-12-06 20:58 ` [PATCH v2 07/14] svm: Remove redundant TSC scaling in svm_set_tsc_offset() Haozhong Zhang
2015-12-07 18:25   ` Boris Ostrovsky
2015-12-08  1:54     ` Haozhong Zhang
2015-12-06 20:58 ` [PATCH v2 08/14] x86/hvm: Collect information of TSC scaling ratio Haozhong Zhang
2015-12-07 18:31   ` Boris Ostrovsky
2015-12-10 10:19   ` Tian, Kevin
2015-12-10 10:27     ` Zhang, Haozhong
2015-12-06 20:58 ` [PATCH v2 09/14] x86/hvm: Setup " Haozhong Zhang
2015-12-07 18:42   ` Boris Ostrovsky
2015-12-08  1:58     ` Haozhong Zhang
2015-12-10 10:27   ` Tian, Kevin
2015-12-10 10:37     ` Zhang, Haozhong
2015-12-06 20:58 ` [PATCH v2 10/14] x86/hvm: Replace architecture TSC scaling by a common function Haozhong Zhang
2015-12-07 18:55   ` Boris Ostrovsky
2015-12-08  2:06     ` Haozhong Zhang
2015-12-10 10:29   ` Tian, Kevin
2015-12-06 20:58 ` [PATCH v2 11/14] x86/hvm: Move saving/loading vcpu's TSC to common code Haozhong Zhang
2015-12-07 18:56   ` Boris Ostrovsky
2015-12-10 10:30   ` Tian, Kevin
2015-12-06 20:58 ` [PATCH v2 12/14] x86/hvm: Detect TSC scaling through hvm_funcs Haozhong Zhang
2015-12-07 18:58   ` Boris Ostrovsky
2015-12-10 10:31   ` Tian, Kevin
2015-12-06 20:58 ` [PATCH v2 13/14] vmx: Add VMX RDTSC(P) scaling support Haozhong Zhang
2015-12-10 10:33   ` Tian, Kevin
2015-12-06 20:58 ` [PATCH v2 14/14] docs: Add descriptions of TSC scaling in xl.cfg and tscmode.txt Haozhong Zhang
2015-12-10 10:40   ` Tian, Kevin
2015-12-10 10:56     ` Zhang, Haozhong
2015-12-06 21:14 ` [PATCH v2 00/14] Add VMX TSC scaling support Haozhong Zhang
2015-12-07  9:03   ` Egger, Christoph
2015-12-07 10:16     ` Haozhong Zhang
2015-12-07 17:04       ` Andrew Cooper
2015-12-08  2:13         ` Haozhong Zhang
2015-12-10 10:43         ` Tian, Kevin
2015-12-10 11:13           ` Haozhong Zhang
2015-12-11  7:35             ` Tian, Kevin
2015-12-11 10:09               ` Andrew Cooper
2015-12-07 10:37   ` Jan Beulich
2015-12-07 10:56     ` Haozhong Zhang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.