* [PATCH v8] new config option vtsc_tolerance_khz to avoid TSC emulation
From: Olaf Hering @ 2018-04-01 20:29 UTC
To: xen-devel
Cc: Olaf Hering, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Marek Marczykowski-Górecki,
Tim Deegan, Julien Grall, Jan Beulich
Add an option to control when vTSC emulation will be activated for a
domU with tsc_mode=default. Without such an option, each TSC access from
the domU will be emulated, which causes a significant performance drop for
workloads that make use of rdtsc.
One way to avoid TSC emulation is to run domUs with tsc_mode=native.
This has the drawback that migrating a domU from a "2.3GHz" class host
to a "2.4GHz" class host may change the rate at which the TSC counter
increases; the domU may not be prepared for that.
With the new option the host admin can decide how a domU should behave
when it is migrated across systems of the same class. Since there is
always some jitter when Xen calibrates the cpu_khz value, all hosts of
the same class will most likely have slightly different values. As a
result, vTSC emulation is otherwise unavoidable. Data collected during the
incident which triggered this change showed a jitter of up to 200 kHz across
systems of the same class.
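As a rough illustration of the decision described above, here is a minimal
sketch. It is not the code in this patch: the helper name
tsc_within_tolerance and its parameters are made up for this example, and
the host_tsc_is_safe() and HVM TSC scaling conditions that tsc_set_info()
also evaluates are left out.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper, for illustration only: returns true when TSC access
 * may stay native for a domU that expects guest_khz on a host calibrated
 * to host_khz, given a tolerance in kHz.
 */
static bool tsc_within_tolerance(unsigned long host_khz, uint32_t guest_khz,
                                 uint16_t tolerance_khz)
{
    long diff;

    /* guest_khz == 0 models a freshly created domU: it adopts the host value. */
    if ( guest_khz == 0 )
        return true;

    /* Absolute difference between the calibrated and the expected frequency. */
    diff = labs((long)host_khz - (long)guest_khz);

    /* A tolerance of 0 only accepts an exact match. */
    return diff <= (long)tolerance_khz;
}
```

With a tolerance of 200 kHz, a domU expecting 2300150 kHz would stay native
on a host calibrated to 2300000 kHz, while a move to a 2400000 kHz host would
still be outside the tolerance.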
Existing padding fields are reused to store vtsc_tolerance_khz as u16.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
--
v8:
- adjust also python stream checker for added tolerance member
v7:
- use uint16 in libxl_types.idl to match type used elsewhere in the patch
v6:
- mention default value in xl.cfg
- tsc_set_info: remove usage of __func__, use %d for domid
- tsc_set_info: use ABS to calculate khz_diff
v5:
- reduce functionality to allow setting of the tolerance value
only at initial domU startup
v4:
- add missing copyback in XEN_DOMCTL_set_vtsc_tolerance_khz
v3:
- rename vtsc_khz_tolerance to vtsc_tolerance_khz
- separate domctls to adjust values
- more docs
- update libxl.h
- update python tests
- flask check bound to tsc permissions
- not runtime tested due to dlsym() build errors in staging
---
docs/man/xen-tscmode.pod.7 | 16 ++++++++++++++++
docs/man/xl.cfg.pod.5.in | 10 ++++++++++
docs/specs/libxc-migration-stream.pandoc | 6 ++++--
tools/libxc/include/xenctrl.h | 2 ++
tools/libxc/xc_domain.c | 4 ++++
tools/libxc/xc_sr_common_x86.c | 6 ++++--
tools/libxc/xc_sr_stream_format.h | 3 ++-
tools/libxl/libxl.h | 6 ++++++
tools/libxl/libxl_types.idl | 1 +
tools/libxl/libxl_x86.c | 3 ++-
tools/python/xen/lowlevel/xc/xc.c | 2 +-
tools/python/xen/migration/libxc.py | 8 ++++----
tools/xl/xl_parse.c | 3 +++
xen/arch/x86/domain.c | 2 +-
xen/arch/x86/domctl.c | 2 ++
xen/arch/x86/time.c | 30 +++++++++++++++++++++++++++---
xen/include/asm-x86/domain.h | 1 +
xen/include/asm-x86/time.h | 6 ++++--
xen/include/public/domctl.h | 3 ++-
19 files changed, 96 insertions(+), 18 deletions(-)
diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7
index 3bbc96f201..122ae36679 100644
--- a/docs/man/xen-tscmode.pod.7
+++ b/docs/man/xen-tscmode.pod.7
@@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated
=back
+If the tsc_mode is set to "default" the decision to emulate TSC can be
+tweaked further with the "vtsc_tolerance_khz" option.
+
To understand this in more detail, the rest of this document must
be read.
@@ -211,6 +214,19 @@ is emulated. Note that, though emulated, the "apparent" TSC frequency
will be the TSC frequency of the initial physical machine, even after
migration.
+Since the calibration of the TSC frequency may not be 100% accurate, the
+exact value of the frequency can change even across reboots. This also
+means that several otherwise identical systems can have slightly different
+TSC frequencies. As a result, TSC access will be emulated if a domU is
+migrated from one host to another, identical host. To avoid the
+performance impact of TSC emulation, a certain tolerance of the measured
+host TSC frequency can be specified with "vtsc_tolerance_khz". If the
+measured "cpu_khz" value is within the tolerance range, TSC access
+remains native. Otherwise it will be emulated. This allows migrating
+domUs between identical hardware. If the domU is migrated to a
+different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system,
+the TSC will be emulated to maintain the frequency expected by the domU.
+
For environments where both TSC-safeness AND highest performance
even across migration is a requirement, application code can be specially
modified to use an algorithm explicitly designed into Xen for this purpose.
diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
index 2c1a6e1422..aff16052ef 100644
--- a/docs/man/xl.cfg.pod.5.in
+++ b/docs/man/xl.cfg.pod.5.in
@@ -1891,6 +1891,16 @@ determined in a similar way to that of B<default> TSC mode.
Please see B<xen-tscmode(7)> for more information on this option.
+=item B<vtsc_tolerance_khz="KHZ">
+
+B<(x86 only, relevant only for tsc_mode=default)>
+When a domU is started, the CPU frequency of the host is used by the domU for
+TSC related time measurement. Once the domU is either migrated or
+saved/restored on another host, that CPU frequency has to be emulated to avoid
+time drift. To avoid the performance penalty of the TSC emulation, allow a
+certain amount of jitter of the measured CPU frequency on the hosts the domU
+is supposed to run on. Default value is 0, i.e. no tolerance.
+
=item B<localtime=BOOLEAN>
Set the real time clock to local time or to UTC. False (0) by default,
diff --git a/docs/specs/libxc-migration-stream.pandoc b/docs/specs/libxc-migration-stream.pandoc
index 73421ff393..0d0f17edb1 100644
--- a/docs/specs/libxc-migration-stream.pandoc
+++ b/docs/specs/libxc-migration-stream.pandoc
@@ -3,7 +3,7 @@
Andrew Cooper <<andrew.cooper3@citrix.com>>
Wen Congyang <<wency@cn.fujitsu.com>>
Yang Hongyang <<hongyang.yang@easystack.cn>>
-% Revision 2
+% Revision 3
Introduction
============
@@ -472,7 +472,7 @@ XEN\_DOMCTL\_{get,set}tscinfo hypercall sub-ops.
+------------------------+------------------------+
| nsec |
+------------------------+------------------------+
- | incarnation | (reserved) |
+ | incarnation | tolerance | (reserved) |
+------------------------+------------------------+
--------------------------------------------------------------------
@@ -485,6 +485,8 @@ khz TSC frequency, in kHz.
nsec Elapsed time, in nanoseconds.
incarnation Incarnation.
+
+tolerance Amount of jitter, in kHz, the domU can handle after migration.
--------------------------------------------------------------------
\clearpage
diff --git a/tools/libxc/include/xenctrl.h b/tools/libxc/include/xenctrl.h
index 058e832c47..96bdd5609d 100644
--- a/tools/libxc/include/xenctrl.h
+++ b/tools/libxc/include/xenctrl.h
@@ -1360,6 +1360,7 @@ int xc_domain_set_tsc_info(xc_interface *xch,
uint32_t tsc_mode,
uint64_t elapsed_nsec,
uint32_t gtsc_khz,
+ uint16_t vtsc_tolerance_khz,
uint32_t incarnation);
int xc_domain_get_tsc_info(xc_interface *xch,
@@ -1367,6 +1368,7 @@ int xc_domain_get_tsc_info(xc_interface *xch,
uint32_t *tsc_mode,
uint64_t *elapsed_nsec,
uint32_t *gtsc_khz,
+ uint16_t *vtsc_tolerance_khz,
uint32_t *incarnation);
int xc_domain_disable_migrate(xc_interface *xch, uint32_t domid);
diff --git a/tools/libxc/xc_domain.c b/tools/libxc/xc_domain.c
index 26b4b908b9..36acc1c45f 100644
--- a/tools/libxc/xc_domain.c
+++ b/tools/libxc/xc_domain.c
@@ -852,6 +852,7 @@ int xc_domain_set_tsc_info(xc_interface *xch,
uint32_t tsc_mode,
uint64_t elapsed_nsec,
uint32_t gtsc_khz,
+ uint16_t vtsc_tolerance_khz,
uint32_t incarnation)
{
DECLARE_DOMCTL;
@@ -860,6 +861,7 @@ int xc_domain_set_tsc_info(xc_interface *xch,
domctl.u.tsc_info.tsc_mode = tsc_mode;
domctl.u.tsc_info.elapsed_nsec = elapsed_nsec;
domctl.u.tsc_info.gtsc_khz = gtsc_khz;
+ domctl.u.tsc_info.vtsc_tolerance_khz = vtsc_tolerance_khz;
domctl.u.tsc_info.incarnation = incarnation;
return do_domctl(xch, &domctl);
}
@@ -869,6 +871,7 @@ int xc_domain_get_tsc_info(xc_interface *xch,
uint32_t *tsc_mode,
uint64_t *elapsed_nsec,
uint32_t *gtsc_khz,
+ uint16_t *vtsc_tolerance_khz,
uint32_t *incarnation)
{
int rc;
@@ -882,6 +885,7 @@ int xc_domain_get_tsc_info(xc_interface *xch,
*tsc_mode = domctl.u.tsc_info.tsc_mode;
*elapsed_nsec = domctl.u.tsc_info.elapsed_nsec;
*gtsc_khz = domctl.u.tsc_info.gtsc_khz;
+ *vtsc_tolerance_khz = domctl.u.tsc_info.vtsc_tolerance_khz;
*incarnation = domctl.u.tsc_info.incarnation;
}
return rc;
diff --git a/tools/libxc/xc_sr_common_x86.c b/tools/libxc/xc_sr_common_x86.c
index 98f1cef30f..ea3e551a83 100644
--- a/tools/libxc/xc_sr_common_x86.c
+++ b/tools/libxc/xc_sr_common_x86.c
@@ -12,7 +12,8 @@ int write_tsc_info(struct xc_sr_context *ctx)
};
if ( xc_domain_get_tsc_info(xch, ctx->domid, &tsc.mode,
- &tsc.nsec, &tsc.khz, &tsc.incarnation) < 0 )
+ &tsc.nsec, &tsc.khz, &tsc.vtsc_tolerance,
+ &tsc.incarnation) < 0 )
{
PERROR("Unable to obtain TSC information");
return -1;
@@ -34,7 +35,8 @@ int handle_tsc_info(struct xc_sr_context *ctx, struct xc_sr_record *rec)
}
if ( xc_domain_set_tsc_info(xch, ctx->domid, tsc->mode,
- tsc->nsec, tsc->khz, tsc->incarnation) )
+ tsc->nsec, tsc->khz, tsc->vtsc_tolerance,
+ tsc->incarnation) )
{
PERROR("Unable to set TSC information");
return -1;
diff --git a/tools/libxc/xc_sr_stream_format.h b/tools/libxc/xc_sr_stream_format.h
index 15ff1c7efb..9b52f6ace6 100644
--- a/tools/libxc/xc_sr_stream_format.h
+++ b/tools/libxc/xc_sr_stream_format.h
@@ -121,7 +121,8 @@ struct xc_sr_rec_tsc_info
uint32_t khz;
uint64_t nsec;
uint32_t incarnation;
- uint32_t _res1;
+ uint16_t vtsc_tolerance;
+ uint16_t _res1;
};
/* HVM_PARAMS */
diff --git a/tools/libxl/libxl.h b/tools/libxl/libxl.h
index bffc5a16c7..230dd01c24 100644
--- a/tools/libxl/libxl.h
+++ b/tools/libxl/libxl.h
@@ -354,6 +354,12 @@
#define LIBXL_HAVE_BUILDINFO_BOOTLOADER 1
#define LIBXL_HAVE_BUILDINFO_BOOTLOADER_ARGS 1
+/*
+ * LIBXL_HAVE_VTSC_TOLERANCE_KHZ indicates that libxl_domain_build_info
+ * has the vtsc_tolerance_khz field.
+ */
+#define LIBXL_HAVE_VTSC_TOLERANCE_KHZ 1
+
/*
* libxl ABI compatibility
*
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index 01ec1d1afa..bb99776401 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -466,6 +466,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
("vcpu_soft_affinity", Array(libxl_bitmap, "num_vcpu_soft_affinity")),
("numa_placement", libxl_defbool),
("tsc_mode", libxl_tsc_mode),
+ ("vtsc_tolerance_khz", uint16),
("max_memkb", MemKB),
("target_memkb", MemKB),
("video_memkb", MemKB),
diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c
index 1e9f98961b..ab5ff9aa8b 100644
--- a/tools/libxl/libxl_x86.c
+++ b/tools/libxl/libxl_x86.c
@@ -313,7 +313,8 @@ int libxl__arch_domain_create(libxl__gc *gc, libxl_domain_config *d_config,
default:
abort();
}
- xc_domain_set_tsc_info(ctx->xch, domid, tsc_mode, 0, 0, 0);
+ xc_domain_set_tsc_info(ctx->xch, domid, tsc_mode, 0, 0,
+ d_config->b_info.vtsc_tolerance_khz, 0);
if (libxl_defbool_val(d_config->b_info.disable_migrate))
xc_domain_disable_migrate(ctx->xch, domid);
rtc_timeoffset = d_config->b_info.rtc_timeoffset;
diff --git a/tools/python/xen/lowlevel/xc/xc.c b/tools/python/xen/lowlevel/xc/xc.c
index f501764100..e73e2cafc7 100644
--- a/tools/python/xen/lowlevel/xc/xc.c
+++ b/tools/python/xen/lowlevel/xc/xc.c
@@ -1522,7 +1522,7 @@ static PyObject *pyxc_domain_set_tsc_info(XcObject *self, PyObject *args)
if (!PyArg_ParseTuple(args, "ii", &dom, &tsc_mode))
return NULL;
- if (xc_domain_set_tsc_info(self->xc_handle, dom, tsc_mode, 0, 0, 0) != 0)
+ if (xc_domain_set_tsc_info(self->xc_handle, dom, tsc_mode, 0, 0, 0, 0) != 0)
return pyxc_error_to_exception(self->xc_handle);
Py_INCREF(zero);
diff --git a/tools/python/xen/migration/libxc.py b/tools/python/xen/migration/libxc.py
index f24448a9ef..abcda617e4 100644
--- a/tools/python/xen/migration/libxc.py
+++ b/tools/python/xen/migration/libxc.py
@@ -114,7 +114,7 @@ X86_PV_P2M_FRAMES_FORMAT = "II"
X86_PV_VCPU_HDR_FORMAT = "II"
# tsc_info
-TSC_INFO_FORMAT = "IIQII"
+TSC_INFO_FORMAT = "IIQIHH"
# hvm_params
HVM_PARAMS_ENTRY_FORMAT = "QQ"
@@ -363,14 +363,14 @@ class VerifyLibxc(VerifyBase):
if len(content) != sz:
raise RecordError("Length should be %u bytes" % (sz, ))
- mode, khz, nsec, incarn, res1 = unpack(TSC_INFO_FORMAT, content)
+ mode, khz, nsec, incarn, tolerance, res1 = unpack(TSC_INFO_FORMAT, content)
if res1 != 0:
raise StreamError("Reserved bits set in TSC_INFO: 0x%08x"
% (res1, ))
- self.info(" Mode %u, %u kHz, %u ns, incarnation %d"
- % (mode, khz, nsec, incarn))
+ self.info(" Mode %u, %u kHz, %u ns, incarnation %d, tolerance %u kHz"
+ % (mode, khz, nsec, incarn, tolerance))
def verify_record_hvm_context(self, content):
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index e6c54483e0..1915640d64 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1126,6 +1126,9 @@ void parse_config_data(const char *config_source,
}
}
+ if (!xlu_cfg_get_long(config, "vtsc_tolerance_khz", &l, 0))
+ b_info->vtsc_tolerance_khz = l < 0 || l > UINT16_MAX ? UINT16_MAX : l;
+
if (!xlu_cfg_get_long(config, "rtc_timeoffset", &l, 0))
b_info->rtc_timeoffset = l;
diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index fbb320da9c..d40b91721e 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -561,7 +561,7 @@ int arch_domain_create(struct domain *d,
ASSERT_UNREACHABLE(); /* Not HVM and not PV? */
/* initialize default tsc behavior in case tools don't */
- tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0);
+ tsc_set_info(d, TSC_MODE_DEFAULT, 0UL, 0, 0, 0);
/* PV/PVH guests get an emulated PIT too for video BIOSes to use. */
pit_init(d, cpu_khz);
diff --git a/xen/arch/x86/domctl.c b/xen/arch/x86/domctl.c
index 8fbbf3aeb3..d86ff58482 100644
--- a/xen/arch/x86/domctl.c
+++ b/xen/arch/x86/domctl.c
@@ -939,6 +939,7 @@ long arch_do_domctl(
tsc_get_info(d, &domctl->u.tsc_info.tsc_mode,
&domctl->u.tsc_info.elapsed_nsec,
&domctl->u.tsc_info.gtsc_khz,
+ &domctl->u.tsc_info.vtsc_tolerance_khz,
&domctl->u.tsc_info.incarnation);
domain_unpause(d);
copyback = true;
@@ -954,6 +955,7 @@ long arch_do_domctl(
tsc_set_info(d, domctl->u.tsc_info.tsc_mode,
domctl->u.tsc_info.elapsed_nsec,
domctl->u.tsc_info.gtsc_khz,
+ domctl->u.tsc_info.vtsc_tolerance_khz,
domctl->u.tsc_info.incarnation);
domain_unpause(d);
}
diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 84c1c0c082..c96d643acb 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -2064,7 +2064,7 @@ int host_tsc_is_safe(void)
*/
void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
uint64_t *elapsed_nsec, uint32_t *gtsc_khz,
- uint32_t *incarnation)
+ uint16_t *vtsc_tolerance_khz, uint32_t *incarnation)
{
bool enable_tsc_scaling = is_hvm_domain(d) &&
hvm_tsc_scaling_supported && !d->arch.vtsc;
@@ -2080,6 +2080,7 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
*elapsed_nsec = *gtsc_khz = 0;
break;
case TSC_MODE_DEFAULT:
+ *vtsc_tolerance_khz = d->arch.vtsc_tolerance_khz;
if ( d->arch.vtsc )
{
case TSC_MODE_ALWAYS_EMULATE:
@@ -2122,7 +2123,8 @@ void tsc_get_info(struct domain *d, uint32_t *tsc_mode,
*/
void tsc_set_info(struct domain *d,
uint32_t tsc_mode, uint64_t elapsed_nsec,
- uint32_t gtsc_khz, uint32_t incarnation)
+ uint32_t gtsc_khz, uint16_t vtsc_tolerance_khz,
+ uint32_t incarnation)
{
ASSERT(!is_system_domain(d));
@@ -2134,9 +2136,12 @@ void tsc_set_info(struct domain *d,
switch ( d->arch.tsc_mode = tsc_mode )
{
+ bool disable_vtsc;
bool enable_tsc_scaling;
case TSC_MODE_DEFAULT:
+ d->arch.vtsc_tolerance_khz = vtsc_tolerance_khz;
+ /* Fallthrough. */
case TSC_MODE_ALWAYS_EMULATE:
d->arch.vtsc_offset = get_s_time() - elapsed_nsec;
d->arch.tsc_khz = gtsc_khz ?: cpu_khz;
@@ -2149,8 +2154,25 @@ void tsc_set_info(struct domain *d,
* When a guest is created, gtsc_khz is passed in as zero, making
* d->arch.tsc_khz == cpu_khz. Thus no need to check incarnation.
*/
+ disable_vtsc = d->arch.tsc_khz == cpu_khz;
+
+ if ( tsc_mode == TSC_MODE_DEFAULT && gtsc_khz &&
+ d->arch.vtsc_tolerance_khz )
+ {
+ long khz_diff;
+
+ khz_diff = ABS((long)(cpu_khz - gtsc_khz));
+ disable_vtsc = khz_diff <= d->arch.vtsc_tolerance_khz;
+
+ printk(XENLOG_G_INFO "d%d: host has %lu kHz,"
+ " domU expects %u kHz,"
+ " difference of %ld is %s tolerance of %u\n",
+ d->domain_id, cpu_khz, gtsc_khz, khz_diff,
+ disable_vtsc ? "within" : "outside",
+ d->arch.vtsc_tolerance_khz);
+ }
if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
- (d->arch.tsc_khz == cpu_khz ||
+ (disable_vtsc ||
(is_hvm_domain(d) &&
hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )
{
@@ -2239,6 +2261,8 @@ static void dump_softtsc(unsigned char key)
printk(",ofs=%#"PRIx64, d->arch.vtsc_offset);
if ( d->arch.tsc_khz )
printk(",khz=%"PRIu32, d->arch.tsc_khz);
+ if ( d->arch.vtsc_tolerance_khz )
+ printk(",tol=%"PRIu16, d->arch.vtsc_tolerance_khz);
if ( d->arch.incarnation )
printk(",inc=%"PRIu32, d->arch.incarnation);
#if !defined(NDEBUG) || defined(CONFIG_PERF_COUNTERS)
diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h
index a12ae47f1b..7743995934 100644
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -374,6 +374,7 @@ struct arch_domain
uint64_t vtsc_offset; /* adjustment for save/restore/migrate */
uint32_t tsc_khz; /* cached guest khz for certain emulated or
hardware TSC scaling cases */
+ uint32_t vtsc_tolerance_khz; /* domU handles that much jitter in cpu_khz */
struct time_scale vtsc_to_ns; /* scaling for certain emulated or
hardware TSC scaling cases */
struct time_scale ns_to_vtsc; /* scaling for certain emulated or
diff --git a/xen/include/asm-x86/time.h b/xen/include/asm-x86/time.h
index b3ae832df4..ef9be7a701 100644
--- a/xen/include/asm-x86/time.h
+++ b/xen/include/asm-x86/time.h
@@ -61,10 +61,12 @@ u64 gtime_to_gtsc(struct domain *d, u64 time);
u64 gtsc_to_gtime(struct domain *d, u64 tsc);
void tsc_set_info(struct domain *d, uint32_t tsc_mode, uint64_t elapsed_nsec,
- uint32_t gtsc_khz, uint32_t incarnation);
+ uint32_t gtsc_khz, uint16_t vtsc_tolerance_khz,
+ uint32_t incarnation);
void tsc_get_info(struct domain *d, uint32_t *tsc_mode, uint64_t *elapsed_nsec,
- uint32_t *gtsc_khz, uint32_t *incarnation);
+ uint32_t *gtsc_khz, uint16_t *vtsc_tolerance_khz,
+ uint32_t *incarnation);
void force_update_vcpu_system_time(struct vcpu *v);
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index ec7a860afc..70a58ae2e4 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -702,7 +702,8 @@ struct xen_domctl_tsc_info {
uint32_t tsc_mode;
uint32_t gtsc_khz;
uint32_t incarnation;
- uint32_t pad;
+ uint16_t vtsc_tolerance_khz;
+ uint16_t pad;
uint64_aligned_t elapsed_nsec;
};
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
* Re: [PATCH v8] new config option vtsc_tolerance_khz to avoid TSC emulation
From: Wei Liu @ 2018-04-02 8:49 UTC
To: Olaf Hering
Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Marek Marczykowski-Górecki,
xen-devel, Julien Grall, Jan Beulich
On Sun, Apr 01, 2018 at 10:29:58PM +0200, Olaf Hering wrote:
> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such an option, each TSC access from
> the domU will be emulated, which causes a significant performance drop for
> workloads that make use of rdtsc.
>
> One way to avoid TSC emulation is to run domUs with tsc_mode=native.
> This has the drawback that migrating a domU from a "2.3GHz" class host
> to a "2.4GHz" class host may change the rate at which the TSC counter
> increases; the domU may not be prepared for that.
>
> With the new option the host admin can decide how a domU should behave
> when it is migrated across systems of the same class. Since there is
> always some jitter when Xen calibrates the cpu_khz value, all hosts of
> the same class will most likely have slightly different values. As a
> result, vTSC emulation is otherwise unavoidable. Data collected during the
> incident which triggered this change showed a jitter of up to 200 kHz across
> systems of the same class.
>
> Existing padding fields are reused to store vtsc_tolerance_khz as u16.
>
> Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
* Re: [PATCH v8] new config option vtsc_tolerance_khz to avoid TSC emulation
From: Jan Beulich @ 2018-04-09 14:19 UTC
To: Olaf Hering
Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Marek Marczykowski-Górecki,
xen-devel, Julien Grall
>>> On 01.04.18 at 22:29, <olaf@aepfle.de> wrote:
> @@ -34,7 +35,8 @@ int handle_tsc_info(struct xc_sr_context *ctx, struct xc_sr_record *rec)
> }
>
> if ( xc_domain_set_tsc_info(xch, ctx->domid, tsc->mode,
> - tsc->nsec, tsc->khz, tsc->incarnation) )
> + tsc->nsec, tsc->khz, tsc->vtsc_tolerance,
> + tsc->incarnation) )
Is there any guarantee that old hypervisors will send this field as zero
(rather than some random value)? If so, I think this should be said
explicitly in the commit message, together with the fact that you
re-use padding fields.
Hypervisor side provisionally (upon Andrew finding his prior
concerns addressed)
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Jan
* Re: [PATCH v8] new config option vtsc_tolerance_khz to avoid TSC emulation
From: Olaf Hering @ 2018-04-09 14:55 UTC
To: Jan Beulich
Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Marek Marczykowski-Górecki,
xen-devel, Julien Grall
On Mon, 09 Apr 2018 08:19:53 -0600,
"Jan Beulich" <JBeulich@suse.com> wrote:
> Is there any guarantee that old hypervisors will send this field as zero
> (rather than some random value)? If so, I think this should be said
> explicitly in the commit message, together with the fact that you
> re-use padding fields.
I have to double check, but I'm sure the whole struct is initialized with zero. The commit message already has "Existing padding fields are reused to store vtsc_tolerance_khz as u16."
Thanks,
Olaf
* Re: [PATCH v8] new config option vtsc_tolerance_khz to avoid TSC emulation
From: Jan Beulich @ 2018-04-09 15:10 UTC
To: Olaf Hering
Cc: Tim Deegan, Stefano Stabellini, Wei Liu, George Dunlap,
Andrew Cooper, Ian Jackson, Marek Marczykowski-Górecki,
xen-devel, Julien Grall
>>> On 09.04.18 at 16:55, <olaf@aepfle.de> wrote:
> On Mon, 09 Apr 2018 08:19:53 -0600,
> "Jan Beulich" <JBeulich@suse.com> wrote:
>
>> Is there any guarantee that old hypervisors will send this field as zero
>> (rather than some random value)? If so, I think this should be said
>> explicitly in the commit message, together with the fact that you
>> re-use padding fields.
>
> I have to double check, but I'm sure the whole struct is initialized
> with zero. The commit message already has "Existing padding fields are
> reused to store vtsc_tolerance_khz as u16."
It's that sentence that I've referred to - it talks about padding, but
doesn't make clear that this padding is zero at all times.
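To make the concern concrete, here is a small sketch under stated
assumptions. The struct names tsc_info_old and tsc_info_new and the helper
tolerance_from_old_record are hypothetical stand-ins modelled on
struct xc_sr_rec_tsc_info before and after the patch. It only shows that
the new 16-bit field overlays the old reserved word, so a receiver sees a
tolerance of 0 only if the sender really wrote zero there.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical copy of the record layout before the patch. */
struct tsc_info_old {
    uint32_t mode;
    uint32_t khz;
    uint64_t nsec;
    uint32_t incarnation;
    uint32_t _res1;
};

/* Hypothetical copy of the record layout after the patch. */
struct tsc_info_new {
    uint32_t mode;
    uint32_t khz;
    uint64_t nsec;
    uint32_t incarnation;
    uint16_t vtsc_tolerance;
    uint16_t _res1;
};

/* The record size is unchanged and the new field sits where _res1 began. */
_Static_assert(sizeof(struct tsc_info_old) == sizeof(struct tsc_info_new),
               "record size must not change");
_Static_assert(offsetof(struct tsc_info_old, _res1) ==
               offsetof(struct tsc_info_new, vtsc_tolerance),
               "tolerance must overlay the old reserved word");

/* Interpret a record written in the old layout through the new layout. */
static uint16_t tolerance_from_old_record(const struct tsc_info_old *old)
{
    struct tsc_info_new rec;

    memcpy(&rec, old, sizeof(rec));
    return rec.vtsc_tolerance;
}
```

A zero-filled old record yields a tolerance of 0, which matches the default.
Any non-zero bytes left in the old reserved word would be read as a
tolerance value.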
Jan