* [PATCH v3 0/3] x86/vdso: Add Hyper-V TSC page clocksource support
@ 2017-03-03 13:21 ` Vitaly Kuznetsov
0 siblings, 0 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-03-03 13:21 UTC (permalink / raw)
To: x86, Andy Lutomirski, Thomas Gleixner
Cc: Ingo Molnar, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
Stephen Hemminger, Dexuan Cui, linux-kernel, devel,
virtualization
Hi,
the merge window is about to close, so I hope it's OK to make another try here.
Changes since v2:
- Add explicit READ_ONCE() to not rely on 'volatile' [Andy Lutomirski]
- rdtsc() -> rdtsc_ordered() [Andy Lutomirski]
- virt_rmb() -> smp_rmb() [Thomas Gleixner, Andy Lutomirski]
Thomas, Andy, it seems the only blocker for the series was the ambiguity in the
TSC page read algorithm. I contacted Microsoft (through K. Y.) and asked what
we should do when we see 'seq=0'. The answer is:
"I have confirmed that the only invalid value is 0 (notwithstanding what the
spec says). I have asked the Spec to be updated and the current code we have
is correct - it treats 0 as the only invalid value. The invalid value indicates
that the TSC page cannot be used as a time source and a different source is to
be used and this state is going to persist and so looping is not an option."
I agree it would be wiser to have two invalid values, one for 'try again' and
another for permanent failure, in case that is ever needed. But this is not the
case, so I keep my algorithm untouched.
I can see two more reasons for us to keep it:
1) It is what we already have in the kernel.
2) It is what Windows does (see the disassembly in c35b82ef0294).
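For illustration only (not part of the series): a minimal user-space sketch of
the read protocol discussed above, where a sequence of 0 is the only invalid
value and is treated as persistent, so the reader returns instead of spinning.
The names struct tsc_page_model, tsc_page_model_read() and TSC_PAGE_INVALID are
hypothetical, C11 atomics stand in for READ_ONCE()/smp_rmb(), the TSC value is
passed in as a parameter instead of being read with RDTSC, and
unsigned __int128 assumes a GCC or Clang 64-bit target.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct ms_hyperv_tsc_page. */
struct tsc_page_model {
	_Atomic uint32_t sequence;	/* 0 means "page cannot be used" */
	_Atomic uint64_t scale;
	_Atomic uint64_t offset;
};

#define TSC_PAGE_INVALID UINT64_MAX

/*
 * Returns the reference time computed from 'tsc', or TSC_PAGE_INVALID when
 * the page reports sequence 0.
 */
static uint64_t tsc_page_model_read(struct tsc_page_model *pg, uint64_t tsc)
{
	assert(pg != NULL);

	for (;;) {
		/* The acquire load keeps the reads below from moving above it. */
		uint32_t seq = atomic_load_explicit(&pg->sequence,
						    memory_order_acquire);
		if (seq == 0)
			return TSC_PAGE_INVALID;	/* persistent: do not loop */

		uint64_t scale = atomic_load_explicit(&pg->scale,
						      memory_order_relaxed);
		uint64_t offset = atomic_load_explicit(&pg->offset,
						       memory_order_relaxed);
		/* ((tsc * scale) >> 64) + offset */
		uint64_t ticks =
			(uint64_t)(((unsigned __int128)tsc * scale) >> 64) + offset;

		/* Order the reads above before re-reading the sequence. */
		atomic_thread_fence(memory_order_acquire);
		if (atomic_load_explicit(&pg->sequence,
					 memory_order_relaxed) == seq)
			return ticks;
		/* The sequence changed: the page was updated mid-read, retry. */
	}
}
```

A caller that gets TSC_PAGE_INVALID back is expected to switch to a different
time source rather than retry.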
Original description:
Hyper-V TSC page clocksource is suitable for vDSO, however, the protocol
defined by the hypervisor is different from VCLOCK_PVCLOCK. Implement the
required support. Simple sysbench test shows the following results:
Before:
# time sysbench --test=memory --max-requests=500000 run
...
real 1m22.618s
user 0m50.193s
sys 0m32.268s
After:
# time sysbench --test=memory --max-requests=500000 run
...
real 0m47.241s
user 0m47.117s
sys 0m0.008s
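For illustration only: the drop in 'sys' time above is consistent with clock
reads no longer entering the kernel, although the numbers come from sysbench
and not from the sketch below. A rough way to look at the per-call cost
directly is a small loop around clock_gettime(); avg_clock_gettime_ns() is a
hypothetical helper, and the result depends on the machine and on the active
clocksource.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Hypothetical helper: average cost in nanoseconds of one clock_gettime(). */
static double avg_clock_gettime_ns(long iterations)
{
	struct timespec start, end, tmp;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (long i = 0; i < iterations; i++)
		clock_gettime(CLOCK_MONOTONIC, &tmp);
	clock_gettime(CLOCK_MONOTONIC, &end);

	/* Elapsed wall time of the whole loop, in nanoseconds. */
	int64_t elapsed = (int64_t)(end.tv_sec - start.tv_sec) * 1000000000LL
			+ (int64_t)(end.tv_nsec - start.tv_nsec);

	return (double)elapsed / (double)iterations;
}
```

Calling avg_clock_gettime_ns(1000000) before and after switching clocksources
gives a coarse comparison; running the loop under strace shows whether the
calls reach the kernel at all.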
Vitaly Kuznetsov (3):
x86/hyperv: implement hv_get_tsc_page()
x86/hyperv: move TSC reading method to asm/mshyperv.h
x86/vdso: Add VCLOCK_HVCLOCK vDSO clock read method
arch/x86/entry/vdso/vclock_gettime.c | 24 +++++++++++++++
arch/x86/entry/vdso/vdso-layout.lds.S | 3 +-
arch/x86/entry/vdso/vdso2c.c | 3 ++
arch/x86/entry/vdso/vma.c | 7 +++++
arch/x86/hyperv/hv_init.c | 48 +++++++++--------------------
arch/x86/include/asm/clocksource.h | 3 +-
arch/x86/include/asm/mshyperv.h | 58 +++++++++++++++++++++++++++++++++++
arch/x86/include/asm/vdso.h | 1 +
drivers/hv/Kconfig | 3 ++
9 files changed, 114 insertions(+), 36 deletions(-)
--
2.9.3
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v3 1/3] x86/hyperv: implement hv_get_tsc_page()
2017-03-03 13:21 ` Vitaly Kuznetsov
(?)
(?)
@ 2017-03-03 13:21 ` Vitaly Kuznetsov
2017-03-11 13:51 ` [tip:x86/vdso] x86/hyperv: Implement hv_get_tsc_page() tip-bot for Vitaly Kuznetsov
-1 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-03-03 13:21 UTC (permalink / raw)
To: x86, Andy Lutomirski, Thomas Gleixner
Cc: Ingo Molnar, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
Stephen Hemminger, Dexuan Cui, linux-kernel, devel,
virtualization
To use Hyper-V TSC page clocksource from vDSO we need to make tsc_pg
available. Implement hv_get_tsc_page() and add CONFIG_HYPERV_TSCPAGE to
make #ifdef-s simple.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
arch/x86/hyperv/hv_init.c | 9 +++++++--
arch/x86/include/asm/mshyperv.h | 8 ++++++++
drivers/hv/Kconfig | 3 +++
3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index db64baf0..6b64cae 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -27,10 +27,15 @@
#include <linux/clockchips.h>
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_HYPERV_TSCPAGE
static struct ms_hyperv_tsc_page *tsc_pg;
+struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
+{
+ return tsc_pg;
+}
+
static u64 read_hv_clock_tsc(struct clocksource *arg)
{
u64 current_tick;
@@ -139,7 +144,7 @@ void hyperv_init(void)
/*
* Register Hyper-V specific clocksource.
*/
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_HYPERV_TSCPAGE
if (ms_hyperv.features & HV_X64_MSR_REFERENCE_TSC_AVAILABLE) {
union hv_x64_msr_hypercall_contents tsc_msr;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 7c9c895..d324dce 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -176,4 +176,12 @@ void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
void hyperv_cleanup(void);
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+#else
+static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
+{
+ return NULL;
+}
+#endif
#endif
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 0403b51..c29cd53 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -7,6 +7,9 @@ config HYPERV
Select this option to run Linux as a Hyper-V client operating
system.
+config HYPERV_TSCPAGE
+ def_bool HYPERV && X86_64
+
config HYPERV_UTILS
tristate "Microsoft Hyper-V Utilities driver"
depends on HYPERV && CONNECTOR && NLS
--
2.9.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 2/3] x86/hyperv: move TSC reading method to asm/mshyperv.h
2017-03-03 13:21 ` Vitaly Kuznetsov
` (3 preceding siblings ...)
(?)
@ 2017-03-03 13:21 ` Vitaly Kuznetsov
2017-03-03 19:31 ` Stephen Hemminger
` (2 more replies)
-1 siblings, 3 replies; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-03-03 13:21 UTC (permalink / raw)
To: x86, Andy Lutomirski, Thomas Gleixner
Cc: Ingo Molnar, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
Stephen Hemminger, Dexuan Cui, linux-kernel, devel,
virtualization
In preparation for making the Hyper-V TSC page suitable for vDSO, move
the TSC page reading logic to asm/mshyperv.h. While at it, do the
following:
- Document the reading algorithm.
- Simplify the code a bit.
- Add explicit READ_ONCE() to not rely on 'volatile'.
- Add explicit barriers to prevent re-ordering (we need to read sequence
strictly before and after).
- Use mul_u64_u64_shr() instead of assembly, gcc generates a single 'mul'
instruction on x86_64 anyway.
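For illustration only: a user-space sketch of the arithmetic that
mul_u64_u64_shr() performs in the reading algorithm, written with
unsigned __int128 (a GCC/Clang extension on 64-bit targets). The names
mul_u64_u64_shr_model() and reference_time() are hypothetical stand-ins, not
kernel functions, and the shift is assumed to be less than 128.

```c
#include <stdint.h>

/* User-space stand-in for the kernel's mul_u64_u64_shr(a, b, shift). */
static inline uint64_t mul_u64_u64_shr_model(uint64_t a, uint64_t b,
					     unsigned int shift)
{
	/* Full 128-bit product, then keep the bits above 'shift'. */
	return (uint64_t)(((unsigned __int128)a * b) >> shift);
}

/* ReferenceTime = ((tsc * scale) >> 64) + offset */
static inline uint64_t reference_time(uint64_t tsc, uint64_t scale,
				      uint64_t offset)
{
	return mul_u64_u64_shr_model(tsc, scale, 64) + offset;
}
```

With a shift of 64 this keeps only the upper half of the product, which is
what the removed 'mulq' assembly took from the high result register.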
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
arch/x86/hyperv/hv_init.c | 36 ++++-------------------------
arch/x86/include/asm/mshyperv.h | 50 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 54 insertions(+), 32 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 6b64cae..63dd00e 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -38,39 +38,11 @@ struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
static u64 read_hv_clock_tsc(struct clocksource *arg)
{
- u64 current_tick;
+ u64 current_tick = hv_read_tsc_page(tsc_pg);
+
+ if (current_tick == U64_MAX)
+ rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
- if (tsc_pg->tsc_sequence != 0) {
- /*
- * Use the tsc page to compute the value.
- */
-
- while (1) {
- u64 tmp;
- u32 sequence = tsc_pg->tsc_sequence;
- u64 cur_tsc;
- u64 scale = tsc_pg->tsc_scale;
- s64 offset = tsc_pg->tsc_offset;
-
- rdtscll(cur_tsc);
- /* current_tick = ((cur_tsc *scale) >> 64) + offset */
- asm("mulq %3"
- : "=d" (current_tick), "=a" (tmp)
- : "a" (cur_tsc), "r" (scale));
-
- current_tick += offset;
- if (tsc_pg->tsc_sequence == sequence)
- return current_tick;
-
- if (tsc_pg->tsc_sequence != 0)
- continue;
- /*
- * Fallback using MSR method.
- */
- break;
- }
- }
- rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
return current_tick;
}
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index d324dce..4ff25436 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -178,6 +178,56 @@ void hyperv_cleanup(void);
#endif
#ifdef CONFIG_HYPERV_TSCPAGE
struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
+{
+ u64 scale, offset, current_tick, cur_tsc;
+ u32 sequence;
+
+ /*
+ * The protocol for reading Hyper-V TSC page is specified in Hypervisor
+ * Top-Level Functional Specification ver. 3.0 and above. To get the
+ * reference time we must do the following:
+ * - READ ReferenceTscSequence
+ * A special '0' value indicates the time source is unreliable and we
+ * need to use something else. The currently published specification
+ * versions (up to 4.0b) contain a mistake and wrongly claim '-1'
+ * instead of '0' as the special value, see commit c35b82ef0294.
+ * - ReferenceTime =
+ * ((RDTSC() * ReferenceTscScale) >> 64) + ReferenceTscOffset
+ * - READ ReferenceTscSequence again. In case its value has changed
+ * since our first reading we need to discard ReferenceTime and repeat
+ * the whole sequence as the hypervisor was updating the page in
+ * between.
+ */
+ while (1) {
+ sequence = READ_ONCE(tsc_pg->tsc_sequence);
+ if (!sequence)
+ break;
+ /*
+ * Make sure we read sequence before we read other values from
+ * TSC page.
+ */
+ smp_rmb();
+
+ scale = READ_ONCE(tsc_pg->tsc_scale);
+ offset = READ_ONCE(tsc_pg->tsc_offset);
+ cur_tsc = rdtsc_ordered();
+
+ current_tick = mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
+
+ /*
+ * Make sure we read sequence after we read all other values
+ * from TSC page.
+ */
+ smp_rmb();
+
+ if (READ_ONCE(tsc_pg->tsc_sequence) == sequence)
+ return current_tick;
+ }
+
+ return U64_MAX;
+}
+
#else
static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
--
2.9.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH v3 3/3] x86/vdso: Add VCLOCK_HVCLOCK vDSO clock read method
2017-03-03 13:21 ` Vitaly Kuznetsov
` (5 preceding siblings ...)
(?)
@ 2017-03-03 13:21 ` Vitaly Kuznetsov
2017-03-11 13:52 ` [tip:x86/vdso] " tip-bot for Vitaly Kuznetsov
-1 siblings, 1 reply; 15+ messages in thread
From: Vitaly Kuznetsov @ 2017-03-03 13:21 UTC (permalink / raw)
To: x86, Andy Lutomirski, Thomas Gleixner
Cc: Ingo Molnar, H. Peter Anvin, K. Y. Srinivasan, Haiyang Zhang,
Stephen Hemminger, Dexuan Cui, linux-kernel, devel,
virtualization
Hyper-V TSC page clocksource is suitable for vDSO, however, the protocol
defined by the hypervisor is different from VCLOCK_PVCLOCK. Implement the
required support by adding hvclock_page VVAR.
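For illustration only: a user-space sketch of how the new read method signals
that the fast path cannot be used. When the page read yields the invalid
marker, the mode is set to "none" and the caller is expected to fall back to
the real syscall. All MODEL_* constants and model_* functions are hypothetical;
the page reading is passed in as a parameter so the sketch stays testable.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_VCLOCK_NONE	0
#define MODEL_VCLOCK_HVCLOCK	3
#define MODEL_INVALID		UINT64_MAX

/* Mirrors the shape of vread_hvclock(): demote the mode on an invalid read. */
static uint64_t model_vread_hvclock(uint64_t page_reading, int *mode)
{
	if (page_reading != MODEL_INVALID)
		return page_reading;

	*mode = MODEL_VCLOCK_NONE;
	return 0;
}

/*
 * Returns true when the fast path produced a value in *cycles, false when
 * the caller has to fall back to the syscall.
 */
static bool model_fast_gettime(uint64_t page_reading, uint64_t *cycles)
{
	int mode = MODEL_VCLOCK_HVCLOCK;

	*cycles = model_vread_hvclock(page_reading, &mode);
	return mode != MODEL_VCLOCK_NONE;
}
```

The design choice this models is that the reader never blocks or retries on an
invalid page; it reports the condition once and leaves the recovery to the
caller.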
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
---
arch/x86/entry/vdso/vclock_gettime.c | 24 ++++++++++++++++++++++++
arch/x86/entry/vdso/vdso-layout.lds.S | 3 ++-
arch/x86/entry/vdso/vdso2c.c | 3 +++
arch/x86/entry/vdso/vma.c | 7 +++++++
arch/x86/hyperv/hv_init.c | 3 +++
arch/x86/include/asm/clocksource.h | 3 ++-
arch/x86/include/asm/vdso.h | 1 +
7 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 9d4d6e1..fa8dbfc 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -17,6 +17,7 @@
#include <asm/unistd.h>
#include <asm/msr.h>
#include <asm/pvclock.h>
+#include <asm/mshyperv.h>
#include <linux/math64.h>
#include <linux/time.h>
#include <linux/kernel.h>
@@ -32,6 +33,11 @@ extern u8 pvclock_page
__attribute__((visibility("hidden")));
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+extern u8 hvclock_page
+ __attribute__((visibility("hidden")));
+#endif
+
#ifndef BUILD_VDSO32
notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
@@ -141,6 +147,20 @@ static notrace u64 vread_pvclock(int *mode)
return last;
}
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+static notrace u64 vread_hvclock(int *mode)
+{
+ const struct ms_hyperv_tsc_page *tsc_pg =
+ (const struct ms_hyperv_tsc_page *)&hvclock_page;
+ u64 current_tick = hv_read_tsc_page(tsc_pg);
+
+ if (current_tick != U64_MAX)
+ return current_tick;
+
+ *mode = VCLOCK_NONE;
+ return 0;
+}
+#endif
notrace static u64 vread_tsc(void)
{
@@ -173,6 +193,10 @@ notrace static inline u64 vgetsns(int *mode)
else if (gtod->vclock_mode == VCLOCK_PVCLOCK)
cycles = vread_pvclock(mode);
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+ else if (gtod->vclock_mode == VCLOCK_HVCLOCK)
+ cycles = vread_hvclock(mode);
+#endif
else
return 0;
v = (cycles - gtod->cycle_last) & gtod->mask;
diff --git a/arch/x86/entry/vdso/vdso-layout.lds.S b/arch/x86/entry/vdso/vdso-layout.lds.S
index a708aa9..8ebb4b6 100644
--- a/arch/x86/entry/vdso/vdso-layout.lds.S
+++ b/arch/x86/entry/vdso/vdso-layout.lds.S
@@ -25,7 +25,7 @@ SECTIONS
* segment.
*/
- vvar_start = . - 2 * PAGE_SIZE;
+ vvar_start = . - 3 * PAGE_SIZE;
vvar_page = vvar_start;
/* Place all vvars at the offsets in asm/vvar.h. */
@@ -36,6 +36,7 @@ SECTIONS
#undef EMIT_VVAR
pvclock_page = vvar_start + PAGE_SIZE;
+ hvclock_page = vvar_start + 2 * PAGE_SIZE;
. = SIZEOF_HEADERS;
diff --git a/arch/x86/entry/vdso/vdso2c.c b/arch/x86/entry/vdso/vdso2c.c
index 491020b..0780a44 100644
--- a/arch/x86/entry/vdso/vdso2c.c
+++ b/arch/x86/entry/vdso/vdso2c.c
@@ -74,6 +74,7 @@ enum {
sym_vvar_page,
sym_hpet_page,
sym_pvclock_page,
+ sym_hvclock_page,
sym_VDSO_FAKE_SECTION_TABLE_START,
sym_VDSO_FAKE_SECTION_TABLE_END,
};
@@ -82,6 +83,7 @@ const int special_pages[] = {
sym_vvar_page,
sym_hpet_page,
sym_pvclock_page,
+ sym_hvclock_page,
};
struct vdso_sym {
@@ -94,6 +96,7 @@ struct vdso_sym required_syms[] = {
[sym_vvar_page] = {"vvar_page", true},
[sym_hpet_page] = {"hpet_page", true},
[sym_pvclock_page] = {"pvclock_page", true},
+ [sym_hvclock_page] = {"hvclock_page", true},
[sym_VDSO_FAKE_SECTION_TABLE_START] = {
"VDSO_FAKE_SECTION_TABLE_START", false
},
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 572cee3..71f5d3a 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -21,6 +21,7 @@
#include <asm/page.h>
#include <asm/desc.h>
#include <asm/cpufeature.h>
+#include <asm/mshyperv.h>
#if defined(CONFIG_X86_64)
unsigned int __read_mostly vdso64_enabled = 1;
@@ -120,6 +121,12 @@ static int vvar_fault(const struct vm_special_mapping *sm,
vmf->address,
__pa(pvti) >> PAGE_SHIFT);
}
+ } else if (sym_offset == image->sym_hvclock_page) {
+ struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
+
+ if (tsc_pg && vclock_was_used(VCLOCK_HVCLOCK))
+ ret = vm_insert_pfn(vma, vmf->address,
+ vmalloc_to_pfn(tsc_pg));
}
if (ret == 0 || ret == -EBUSY)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 63dd00e..d08ac5e 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -132,6 +132,9 @@ void hyperv_init(void)
tsc_msr.guest_physical_address = vmalloc_to_pfn(tsc_pg);
wrmsrl(HV_X64_MSR_REFERENCE_TSC, tsc_msr.as_uint64);
+
+ hyperv_cs_tsc.archdata.vclock_mode = VCLOCK_HVCLOCK;
+
clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
return;
}
diff --git a/arch/x86/include/asm/clocksource.h b/arch/x86/include/asm/clocksource.h
index eae33c7..47bea8c 100644
--- a/arch/x86/include/asm/clocksource.h
+++ b/arch/x86/include/asm/clocksource.h
@@ -6,7 +6,8 @@
#define VCLOCK_NONE 0 /* No vDSO clock available. */
#define VCLOCK_TSC 1 /* vDSO should use vread_tsc. */
#define VCLOCK_PVCLOCK 2 /* vDSO should use vread_pvclock. */
-#define VCLOCK_MAX 2
+#define VCLOCK_HVCLOCK 3 /* vDSO should use vread_hvclock. */
+#define VCLOCK_MAX 3
struct arch_clocksource_data {
int vclock_mode;
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 2444189..bccdf49 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -20,6 +20,7 @@ struct vdso_image {
long sym_vvar_page;
long sym_hpet_page;
long sym_pvclock_page;
+ long sym_hvclock_page;
long sym_VDSO32_NOTE_MASK;
long sym___kernel_sigreturn;
long sym___kernel_rt_sigreturn;
--
2.9.3
^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v3 2/3] x86/hyperv: move TSC reading method to asm/mshyperv.h
2017-03-03 13:21 ` Vitaly Kuznetsov
@ 2017-03-03 19:31 ` Stephen Hemminger
2017-03-04 9:07 ` Thomas Gleixner
2017-03-03 19:31 ` Stephen Hemminger
2017-03-11 13:52 ` [tip:x86/vdso] x86/hyperv: Move " tip-bot for Vitaly Kuznetsov
2 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2017-03-03 19:31 UTC (permalink / raw)
To: Vitaly Kuznetsov
Cc: x86, Andy Lutomirski, Thomas Gleixner, Stephen Hemminger,
Haiyang Zhang, linux-kernel, virtualization, Ingo Molnar,
H. Peter Anvin, devel
Minor coding comments
> diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
> index d324dce..4ff25436 100644
> --- a/arch/x86/include/asm/mshyperv.h
> +++ b/arch/x86/include/asm/mshyperv.h
> @@ -178,6 +178,56 @@ void hyperv_cleanup(void);
> #endif
> #ifdef CONFIG_HYPERV_TSCPAGE
> struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
> +static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
> +{
> + u64 scale, offset, current_tick, cur_tsc;
> + u32 sequence;
> +
> + /*
> + * The protocol for reading Hyper-V TSC page is specified in Hypervisor
> + * Top-Level Functional Specification ver. 3.0 and above. To get the
> + * reference time we must do the following:
> + * - READ ReferenceTscSequence
> + * A special '0' value indicates the time source is unreliable and we
> + * need to use something else. The currently published specification
> + * versions (up to 4.0b) contain a mistake and wrongly claim '-1'
> + * instead of '0' as the special value, see commit c35b82ef0294.
> + * - ReferenceTime =
> + * ((RDTSC() * ReferenceTscScale) >> 64) + ReferenceTscOffset
> + * - READ ReferenceTscSequence again. In case its value has changed
> + * since our first reading we need to discard ReferenceTime and repeat
> + * the whole sequence as the hypervisor was updating the page in
> + * between.
> + */
> + while (1) {
> + sequence = READ_ONCE(tsc_pg->tsc_sequence);
> + if (!sequence)
> + break;
It would be clearer to just return U64_MAX here (and not fall out of the loop),
since this is the only case that breaks. Also, since this failure only occurs if
the host clock is not available, the check should probably be marked unlikely().
> + /*
> + * Make sure we read sequence before we read other values from
> + * TSC page.
> + */
> + smp_rmb();
> +
> + scale = READ_ONCE(tsc_pg->tsc_scale);
> + offset = READ_ONCE(tsc_pg->tsc_offset);
> + cur_tsc = rdtsc_ordered();
Since you already have smp_ barriers and rdtsc_ordered is a barrier,
the compiler barriers (READ_ONCE()) shouldn't be necessary.
> +
> + current_tick = mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
> +
> + /*
> + * Make sure we read sequence after we read all other values
> + * from TSC page.
> + */
> + smp_rmb();
> +
> + if (READ_ONCE(tsc_pg->tsc_sequence) == sequence)
> + return current_tick;
> + }
Why not make a do { } while loop out of this?
do {
...
} while (unlikely(READ_ONCE(tsc_pg->tsc_sequence) != sequence));
return current_tick;
Also, there is no need to calculate the tick value until we have good data. As in:
static inline u32 hv_clock_sequence(const struct ms_hyperv_tsc_page *tsc_pg)
{
u32 sequence =
return sequence;
}
static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
{
u64 scale, offset, cur_tsc;
u32 start;
/*
* The protocol for reading Hyper-V TSC page is specified in Hypervisor
* Top-Level Functional Specification ver. 3.0 and above. To get the
* reference time we must do the following:
* - READ ReferenceTscSequence
* A special '0' value indicates the time source is unreliable and we
* need to use something else. The currently published specification
* versions (up to 4.0b) contain a mistake and wrongly claim '-1'
* instead of '0' as the special value, see commit c35b82ef0294.
* - ReferenceTime =
* ((RDTSC() * ReferenceTscScale) >> 64) + ReferenceTscOffset
* - READ ReferenceTscSequence again. In case its value has changed
* since our first reading we need to discard ReferenceTime and repeat
* the whole sequence as the hypervisor was updating the page in
* between.
*/
do {
start = READ_ONCE(tsc_pg->tsc_sequence);
smp_rmb();
if (unlikely(!start))
return U64_MAX;
scale = tsc_pg->tsc_scale;
offset = tsc_pg->tsc_offset;
/*
* Make sure we read sequence after we read all other values
* from TSC page.
*/
smp_rmb();
} while (unlikely(READ_ONCE(tsc_pg->tsc_sequence) != start));
cur_tsc = rdtsc_ordered();
return mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
}
* Re: [PATCH v3 2/3] x86/hyperv: move TSC reading method to asm/mshyperv.h
2017-03-03 19:31 ` Stephen Hemminger
@ 2017-03-04 9:07 ` Thomas Gleixner
0 siblings, 0 replies; 15+ messages in thread
From: Thomas Gleixner @ 2017-03-04 9:07 UTC (permalink / raw)
To: Stephen Hemminger
Cc: Vitaly Kuznetsov, x86, Andy Lutomirski, Stephen Hemminger,
Haiyang Zhang, linux-kernel, virtualization, Ingo Molnar,
H. Peter Anvin, devel
On Fri, 3 Mar 2017, Stephen Hemminger wrote:
> static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
> {
> u64 scale, offset, cur_tsc;
> u32 start;
>
> /*
> * The protocol for reading Hyper-V TSC page is specified in Hypervisor
> * Top-Level Functional Specification ver. 3.0 and above. To get the
> * reference time we must do the following:
> * - READ ReferenceTscSequence
> * A special '0' value indicates the time source is unreliable and we
> * need to use something else. The currently published specification
> * versions (up to 4.0b) contain a mistake and wrongly claim '-1'
> * instead of '0' as the special value, see commit c35b82ef0294.
> * - ReferenceTime =
> * ((RDTSC() * ReferenceTscScale) >> 64) + ReferenceTscOffset
> * - READ ReferenceTscSequence again. In case its value has changed
> * since our first reading we need to discard ReferenceTime and repeat
> * the whole sequence as the hypervisor was updating the page in
> * between.
> */
> do {
> start = READ_ONCE(tsc_pg->tsc_sequence);
> smp_rmb();
>
> if (unlikely(!start))
> return U64_MAX;
>
> scale = tsc_pg->tsc_scale;
> offset = tsc_pg->tsc_offset;
>
> /*
> * Make sure we read sequence after we read all other values
> * from TSC page.
> */
> smp_rmb();
> } while (unlikely(READ_ONCE(tsc_pg->tsc_sequence) != start));
>
> cur_tsc = rdtsc_ordered();
That's wrong. You need to read the TSC value together with the scale and
offset. That needs to be "atomic". You can only do the mult/shift
outside.
> return mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
> }
Thanks,
tglx
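To illustrate the point above, here is a minimal userspace sketch (not the
kernel code) of a retry loop that samples the TSC inside the
sequence-protected region and does only the multiply/shift outside. All
names ending in _sketch and the demo_* helpers are hypothetical; volatile
reads stand in for READ_ONCE(), atomic_thread_fence() stands in for
smp_rmb(), and unsigned __int128 (a GCC/Clang extension on 64-bit targets)
stands in for mul_u64_u64_shr().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the three fields the reader uses. */
struct tsc_page_sketch {
	volatile uint32_t sequence;
	volatile uint64_t scale;
	volatile uint64_t offset;
};

/*
 * Read sequence, then scale/offset/TSC, then sequence again. If the
 * sequence changed, the snapshot is discarded and the loop repeats.
 * A sequence of 0 means the page is unusable; UINT64_MAX is returned.
 */
static uint64_t read_tsc_page_sketch(const struct tsc_page_sketch *pg,
				     uint64_t (*read_tsc)(void))
{
	uint64_t scale, offset, tsc;
	uint32_t seq;

	assert(pg && read_tsc);

	do {
		seq = pg->sequence;
		if (seq == 0)
			return UINT64_MAX;
		atomic_thread_fence(memory_order_acquire);

		scale = pg->scale;
		offset = pg->offset;
		tsc = read_tsc();	/* sampled inside the protected region */

		atomic_thread_fence(memory_order_acquire);
	} while (pg->sequence != seq);

	/* Only the arithmetic happens outside the loop. */
	return (uint64_t)(((unsigned __int128)tsc * scale) >> 64) + offset;
}

/* Demo state: a page that gets rewritten during the first TSC read. */
static struct tsc_page_sketch demo_page = {
	.sequence = 1, .scale = 1ULL << 63, .offset = 100,
};
static int demo_calls;

/* Fake TSC; its first call also simulates a hypervisor page update. */
static uint64_t demo_tsc(void)
{
	if (demo_calls++ == 0) {
		demo_page.scale = 1ULL << 62;
		demo_page.offset = 500;
		demo_page.sequence = 2;
	}
	return 4000;
}
```

With the demo helpers, the first pass sees the sequence change and is
thrown away, so the result is computed from the TSC value, scale and
offset of the second pass only. If the TSC were read after the loop
instead, it could be paired with a scale/offset that a later page update
had already replaced.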
* [tip:x86/vdso] x86/hyperv: Implement hv_get_tsc_page()
2017-03-03 13:21 ` Vitaly Kuznetsov
@ 2017-03-11 13:51 ` tip-bot for Vitaly Kuznetsov
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Vitaly Kuznetsov @ 2017-03-11 13:51 UTC (permalink / raw)
To: linux-tip-commits
Cc: mingo, decui, linux-kernel, vkuznets, haiyangz, hpa, luto,
sthemmin, tglx, kys
Commit-ID: bd2a9adaadb8defcaf6c284bca7ff41634105f51
Gitweb: http://git.kernel.org/tip/bd2a9adaadb8defcaf6c284bca7ff41634105f51
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
AuthorDate: Fri, 3 Mar 2017 14:21:40 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 11 Mar 2017 14:47:28 +0100
x86/hyperv: Implement hv_get_tsc_page()
To use Hyper-V TSC page clocksource from vDSO we need to make tsc_pg
available. Implement hv_get_tsc_page() and add CONFIG_HYPERV_TSCPAGE to
make #ifdef-s simple.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170303132142.25595-2-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/hyperv/hv_init.c | 9 +++++++--
arch/x86/include/asm/mshyperv.h | 8 ++++++++
drivers/hv/Kconfig | 3 +++
3 files changed, 18 insertions(+), 2 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 8bef70e..bb1ea58 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -27,10 +27,15 @@
#include <linux/clockchips.h>
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_HYPERV_TSCPAGE
static struct ms_hyperv_tsc_page *tsc_pg;
+struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
+{
+ return tsc_pg;
+}
+
static u64 read_hv_clock_tsc(struct clocksource *arg)
{
u64 current_tick;
@@ -139,7 +144,7 @@ void hyperv_init(void)
/*
* Register Hyper-V specific clocksource.
*/
-#ifdef CONFIG_X86_64
+#ifdef CONFIG_HYPERV_TSCPAGE
if (ms_hyperv.features & HV_X64_MSR_REFERENCE_TSC_AVAILABLE) {
union hv_x64_msr_hypercall_contents tsc_msr;
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 7c9c895..d324dce 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -176,4 +176,12 @@ void hyperv_report_panic(struct pt_regs *regs);
bool hv_is_hypercall_page_setup(void);
void hyperv_cleanup(void);
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+#else
+static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
+{
+ return NULL;
+}
+#endif
#endif
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 0403b51..c29cd53 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -7,6 +7,9 @@ config HYPERV
Select this option to run Linux as a Hyper-V client operating
system.
+config HYPERV_TSCPAGE
+ def_bool HYPERV && X86_64
+
config HYPERV_UTILS
tristate "Microsoft Hyper-V Utilities driver"
depends on HYPERV && CONNECTOR && NLS
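The accessor-plus-stub arrangement in this patch can be sketched in plain C
as follows. This is only an illustration of the pattern: SKETCH_HAVE_TSCPAGE
and the *_sketch names are hypothetical stand-ins for CONFIG_HYPERV_TSCPAGE
and the kernel symbols, and the struct is reduced to a single field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced stand-in for struct ms_hyperv_tsc_page. */
struct tsc_page_sketch {
	unsigned int sequence;
};

#ifdef SKETCH_HAVE_TSCPAGE
/* Feature built in: the accessor hands out the real page. */
static struct tsc_page_sketch the_page_sketch;

static inline struct tsc_page_sketch *get_tsc_page_sketch(void)
{
	return &the_page_sketch;
}
#else
/* Feature compiled out: same signature, always NULL. */
static inline struct tsc_page_sketch *get_tsc_page_sketch(void)
{
	return NULL;
}
#endif

/* Callers need no #ifdef of their own; they only check for NULL. */
static int can_map_tsc_page_sketch(void)
{
	return get_tsc_page_sketch() != NULL;
}
```

The benefit is that call sites compile the same way in both configurations
and only need a NULL check.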
* [tip:x86/vdso] x86/hyperv: Move TSC reading method to asm/mshyperv.h
2017-03-03 13:21 ` Vitaly Kuznetsov
2017-03-03 19:31 ` Stephen Hemminger
2017-03-03 19:31 ` Stephen Hemminger
@ 2017-03-11 13:52 ` tip-bot for Vitaly Kuznetsov
2 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Vitaly Kuznetsov @ 2017-03-11 13:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: vkuznets, sthemmin, luto, kys, haiyangz, mingo, hpa, decui, tglx,
linux-kernel
Commit-ID: 0733379b512ce36ba0b10942f9597b74f579f063
Gitweb: http://git.kernel.org/tip/0733379b512ce36ba0b10942f9597b74f579f063
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
AuthorDate: Fri, 3 Mar 2017 14:21:41 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 11 Mar 2017 14:47:28 +0100
x86/hyperv: Move TSC reading method to asm/mshyperv.h
As a preparation for making the Hyper-V TSC page suitable for vDSO, move
the TSC page reading logic to asm/mshyperv.h. While at it, do the
following:
- Document the reading algorithm.
- Simplify the code a bit.
- Add explicit READ_ONCE() to not rely on 'volatile'.
- Add explicit barriers to prevent re-ordering (we need to read sequence
strictly before and after)
- Use mul_u64_u64_shr() instead of assembly, gcc generates a single 'mul'
instruction on x86_64 anyway.
[ tglx: Simplified the loop ]
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170303132142.25595-3-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/hyperv/hv_init.c | 36 ++++----------------------------
arch/x86/include/asm/mshyperv.h | 46 +++++++++++++++++++++++++++++++++++++++++
2 files changed, 50 insertions(+), 32 deletions(-)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index bb1ea58..7f51523 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -38,39 +38,11 @@ struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
static u64 read_hv_clock_tsc(struct clocksource *arg)
{
- u64 current_tick;
+ u64 current_tick = hv_read_tsc_page(tsc_pg);
+
+ if (current_tick == U64_MAX)
+ rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
- if (tsc_pg->tsc_sequence != 0) {
- /*
- * Use the tsc page to compute the value.
- */
-
- while (1) {
- u64 tmp;
- u32 sequence = tsc_pg->tsc_sequence;
- u64 cur_tsc;
- u64 scale = tsc_pg->tsc_scale;
- s64 offset = tsc_pg->tsc_offset;
-
- rdtscll(cur_tsc);
- /* current_tick = ((cur_tsc *scale) >> 64) + offset */
- asm("mulq %3"
- : "=d" (current_tick), "=a" (tmp)
- : "a" (cur_tsc), "r" (scale));
-
- current_tick += offset;
- if (tsc_pg->tsc_sequence == sequence)
- return current_tick;
-
- if (tsc_pg->tsc_sequence != 0)
- continue;
- /*
- * Fallback using MSR method.
- */
- break;
- }
- }
- rdmsrl(HV_X64_MSR_TIME_REF_COUNT, current_tick);
return current_tick;
}
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index d324dce..fba1007 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -178,6 +178,52 @@ void hyperv_cleanup(void);
#endif
#ifdef CONFIG_HYPERV_TSCPAGE
struct ms_hyperv_tsc_page *hv_get_tsc_page(void);
+static inline u64 hv_read_tsc_page(const struct ms_hyperv_tsc_page *tsc_pg)
+{
+ u64 scale, offset, cur_tsc;
+ u32 sequence;
+
+ /*
+ * The protocol for reading Hyper-V TSC page is specified in Hypervisor
+ * Top-Level Functional Specification ver. 3.0 and above. To get the
+ * reference time we must do the following:
+ * - READ ReferenceTscSequence
+ * A special '0' value indicates the time source is unreliable and we
+ * need to use something else. The currently published specification
+ * versions (up to 4.0b) contain a mistake and wrongly claim '-1'
+ * instead of '0' as the special value, see commit c35b82ef0294.
+ * - ReferenceTime =
+ * ((RDTSC() * ReferenceTscScale) >> 64) + ReferenceTscOffset
+ * - READ ReferenceTscSequence again. In case its value has changed
+ * since our first reading we need to discard ReferenceTime and repeat
+ * the whole sequence as the hypervisor was updating the page in
+ * between.
+ */
+ do {
+ sequence = READ_ONCE(tsc_pg->tsc_sequence);
+ if (!sequence)
+ return U64_MAX;
+ /*
+ * Make sure we read sequence before we read other values from
+ * TSC page.
+ */
+ smp_rmb();
+
+ scale = READ_ONCE(tsc_pg->tsc_scale);
+ offset = READ_ONCE(tsc_pg->tsc_offset);
+ cur_tsc = rdtsc_ordered();
+
+ /*
+ * Make sure we read sequence after we read all other values
+ * from TSC page.
+ */
+ smp_rmb();
+
+ } while (READ_ONCE(tsc_pg->tsc_sequence) != sequence);
+
+ return mul_u64_u64_shr(cur_tsc, scale, 64) + offset;
+}
+
#else
static inline struct ms_hyperv_tsc_page *hv_get_tsc_page(void)
{
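As a worked example of the ((RDTSC() * ReferenceTscScale) >> 64) +
ReferenceTscOffset formula, the userspace sketch below computes the high 64
bits of a 64x64 multiply with unsigned __int128 (a GCC/Clang extension) in
place of mul_u64_u64_shr(). The helper scale_for_sketch() is hypothetical:
it assumes the scale is chosen so that a TSC running at tsc_hz maps onto the
10 MHz (100 ns) reference rate that the clocksource is registered with. How
the hypervisor actually derives the scale is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* High 64 bits of a * b, i.e. what mul_u64_u64_shr(a, b, 64) yields. */
static uint64_t mul_hi64_sketch(uint64_t a, uint64_t b)
{
	return (uint64_t)(((unsigned __int128)a * b) >> 64);
}

/*
 * Hypothetical: a 0.64 fixed-point scale converting tsc_hz ticks into
 * 10 MHz reference ticks, scale = 2^64 * 10^7 / tsc_hz (rounded down).
 */
static uint64_t scale_for_sketch(uint64_t tsc_hz)
{
	/* Below 10 MHz the scale would not fit in 64 bits. */
	assert(tsc_hz > 10000000ULL);
	return (uint64_t)((((unsigned __int128)1 << 64) * 10000000ULL) / tsc_hz);
}

/* ReferenceTime = ((tsc * scale) >> 64) + offset */
static uint64_t ref_time_sketch(uint64_t tsc, uint64_t scale, uint64_t offset)
{
	return mul_hi64_sketch(tsc, scale) + offset;
}
```

For a 2.5 GHz TSC this gives a scale of 73786976294838206, and one second
worth of TSC ticks (2500000000) maps to 9999999 reference ticks. The result
is one short of 10^7 because the scale is rounded down.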
* [tip:x86/vdso] x86/vdso: Add VCLOCK_HVCLOCK vDSO clock read method
2017-03-03 13:21 ` Vitaly Kuznetsov
@ 2017-03-11 13:52 ` tip-bot for Vitaly Kuznetsov
0 siblings, 0 replies; 15+ messages in thread
From: tip-bot for Vitaly Kuznetsov @ 2017-03-11 13:52 UTC (permalink / raw)
To: linux-tip-commits
Cc: vkuznets, decui, luto, sthemmin, hpa, kys, mingo, linux-kernel,
haiyangz, tglx
Commit-ID: 90b20432aeb850ef84086a72893cd9411479d896
Gitweb: http://git.kernel.org/tip/90b20432aeb850ef84086a72893cd9411479d896
Author: Vitaly Kuznetsov <vkuznets@redhat.com>
AuthorDate: Fri, 3 Mar 2017 14:21:42 +0100
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Sat, 11 Mar 2017 14:47:28 +0100
x86/vdso: Add VCLOCK_HVCLOCK vDSO clock read method
Hyper-V TSC page clocksource is suitable for vDSO, however, the protocol
defined by the hypervisor is different from VCLOCK_PVCLOCK. Implement the
required support by adding hvclock_page VVAR.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: devel@linuxdriverproject.org
Cc: "K. Y. Srinivasan" <kys@microsoft.com>
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/20170303132142.25595-4-vkuznets@redhat.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
arch/x86/entry/vdso/vclock_gettime.c | 24 ++++++++++++++++++++++++
arch/x86/entry/vdso/vdso-layout.lds.S | 3 ++-
arch/x86/entry/vdso/vdso2c.c | 3 +++
arch/x86/entry/vdso/vma.c | 7 +++++++
arch/x86/hyperv/hv_init.c | 3 +++
arch/x86/include/asm/clocksource.h | 3 ++-
arch/x86/include/asm/vdso.h | 1 +
7 files changed, 42 insertions(+), 2 deletions(-)
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 9d4d6e1..fa8dbfc 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -17,6 +17,7 @@
#include <asm/unistd.h>
#include <asm/msr.h>
#include <asm/pvclock.h>
+#include <asm/mshyperv.h>
#include <linux/math64.h>
#include <linux/time.h>
#include <linux/kernel.h>
@@ -32,6 +33,11 @@ extern u8 pvclock_page
__attribute__((visibility("hidden")));
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+extern u8 hvclock_page
+ __attribute__((visibility("hidden")));
+#endif
+
#ifndef BUILD_VDSO32
notrace static long vdso_fallback_gettime(long clock, struct timespec *ts)
@@ -141,6 +147,20 @@ static notrace u64 vread_pvclock(int *mode)
return last;
}
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+static notrace u64 vread_hvclock(int *mode)
+{
+ const struct ms_hyperv_tsc_page *tsc_pg =
+ (const struct ms_hyperv_tsc_page *)&hvclock_page;
+ u64 current_tick = hv_read_tsc_page(tsc_pg);
+
+ if (current_tick != U64_MAX)
+ return current_tick;
+
+ *mode = VCLOCK_NONE;
+ return 0;
+}
+#endif
notrace static u64 vread_tsc(void)
{
@@ -173,6 +193,10 @@ notrace static inline u64 vgetsns(int *mode)
else if (gtod->vclock_mode == VCLOCK_PVCLOCK)
cycles = vread_pvclock(mode);
#endif
+#ifdef CONFIG_HYPERV_TSCPAGE
+ else if (gtod->vclock_mode == VCLOCK_HVCLOCK)
+ cycles = vread_hvclock(mode);
+#endif
else
return 0;
v = (cycles - gtod->cycle_last) & gtod->mask;
diff --git a/arch/x86/entry/vdso/vdso-layout.lds.S b/arch/x86/entry/vdso/vdso-layout.lds.S
index a708aa9..8ebb4b6 100644
--- a/arch/x86/entry/vdso/vdso-layout.lds.S
+++ b/arch/x86/entry/vdso/vdso-layout.lds.S
@@ -25,7 +25,7 @@ SECTIONS
* segment.
*/
- vvar_start = . - 2 * PAGE_SIZE;
+ vvar_start = . - 3 * PAGE_SIZE;
vvar_page = vvar_start;
/* Place all vvars at the offsets in asm/vvar.h. */
@@ -36,6 +36,7 @@ SECTIONS
#undef EMIT_VVAR
pvclock_page = vvar_start + PAGE_SIZE;
+ hvclock_page = vvar_start + 2 * PAGE_SIZE;
. = SIZEOF_HEADERS;
diff --git a/arch/x86/entry/vdso/vdso2c.c b/arch/x86/entry/vdso/vdso2c.c
index 491020b..0780a44 100644
--- a/arch/x86/entry/vdso/vdso2c.c
+++ b/arch/x86/entry/vdso/vdso2c.c
@@ -74,6 +74,7 @@ enum {
sym_vvar_page,
sym_hpet_page,
sym_pvclock_page,
+ sym_hvclock_page,
sym_VDSO_FAKE_SECTION_TABLE_START,
sym_VDSO_FAKE_SECTION_TABLE_END,
};
@@ -82,6 +83,7 @@ const int special_pages[] = {
sym_vvar_page,
sym_hpet_page,
sym_pvclock_page,
+ sym_hvclock_page,
};
struct vdso_sym {
@@ -94,6 +96,7 @@ struct vdso_sym required_syms[] = {
[sym_vvar_page] = {"vvar_page", true},
[sym_hpet_page] = {"hpet_page", true},
[sym_pvclock_page] = {"pvclock_page", true},
+ [sym_hvclock_page] = {"hvclock_page", true},
[sym_VDSO_FAKE_SECTION_TABLE_START] = {
"VDSO_FAKE_SECTION_TABLE_START", false
},
diff --git a/arch/x86/entry/vdso/vma.c b/arch/x86/entry/vdso/vma.c
index 226ca70..faf80fd 100644
--- a/arch/x86/entry/vdso/vma.c
+++ b/arch/x86/entry/vdso/vma.c
@@ -22,6 +22,7 @@
#include <asm/page.h>
#include <asm/desc.h>
#include <asm/cpufeature.h>
+#include <asm/mshyperv.h>
#if defined(CONFIG_X86_64)
unsigned int __read_mostly vdso64_enabled = 1;
@@ -121,6 +122,12 @@ static int vvar_fault(const struct vm_special_mapping *sm,
vmf->address,
__pa(pvti) >> PAGE_SHIFT);
}
+ } else if (sym_offset == image->sym_hvclock_page) {
+ struct ms_hyperv_tsc_page *tsc_pg = hv_get_tsc_page();
+
+ if (tsc_pg && vclock_was_used(VCLOCK_HVCLOCK))
+ ret = vm_insert_pfn(vma, vmf->address,
+ vmalloc_to_pfn(tsc_pg));
}
if (ret == 0 || ret == -EBUSY)
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c
index 7f51523..2b01421 100644
--- a/arch/x86/hyperv/hv_init.c
+++ b/arch/x86/hyperv/hv_init.c
@@ -132,6 +132,9 @@ void hyperv_init(void)
tsc_msr.guest_physical_address = vmalloc_to_pfn(tsc_pg);
wrmsrl(HV_X64_MSR_REFERENCE_TSC, tsc_msr.as_uint64);
+
+ hyperv_cs_tsc.archdata.vclock_mode = VCLOCK_HVCLOCK;
+
clocksource_register_hz(&hyperv_cs_tsc, NSEC_PER_SEC/100);
return;
}
diff --git a/arch/x86/include/asm/clocksource.h b/arch/x86/include/asm/clocksource.h
index eae33c7..47bea8c 100644
--- a/arch/x86/include/asm/clocksource.h
+++ b/arch/x86/include/asm/clocksource.h
@@ -6,7 +6,8 @@
#define VCLOCK_NONE 0 /* No vDSO clock available. */
#define VCLOCK_TSC 1 /* vDSO should use vread_tsc. */
#define VCLOCK_PVCLOCK 2 /* vDSO should use vread_pvclock. */
-#define VCLOCK_MAX 2
+#define VCLOCK_HVCLOCK 3 /* vDSO should use vread_hvclock. */
+#define VCLOCK_MAX 3
struct arch_clocksource_data {
int vclock_mode;
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index 2444189..bccdf49 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -20,6 +20,7 @@ struct vdso_image {
long sym_vvar_page;
long sym_hpet_page;
long sym_pvclock_page;
+ long sym_hvclock_page;
long sym_VDSO32_NOTE_MASK;
long sym___kernel_sigreturn;
long sym___kernel_rt_sigreturn;
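The way vread_hvclock() turns the U64_MAX sentinel into a mode change can be
sketched in userspace as below. The *_sketch names and SK_* constants are
hypothetical. The conversion to nanoseconds is deliberately simplified to
"ticks * 100" (10 MHz reference ticks), whereas the real vDSO applies the
clocksource mult/shift, and the assumption that callers take the syscall
path once the mode is VCLOCK_NONE is based on how the other vclock modes
behave, not on code shown in this patch.

```c
#include <assert.h>
#include <stdint.h>

enum { SK_VCLOCK_NONE = 0, SK_VCLOCK_HVCLOCK = 3 };

/* Mirrors vread_hvclock(): a sentinel reading downgrades the mode. */
static uint64_t vread_hvclock_sketch(uint64_t page_reading, int *mode)
{
	if (page_reading != UINT64_MAX)
		return page_reading;

	*mode = SK_VCLOCK_NONE;
	return 0;
}

/*
 * Returns 1 if the fast path produced a time in *ns_out, or 0 if the
 * caller has to fall back to the real system call.
 */
static int gettime_sketch(uint64_t page_reading, uint64_t *ns_out)
{
	int mode = SK_VCLOCK_HVCLOCK;
	uint64_t ticks = vread_hvclock_sketch(page_reading, &mode);

	if (mode == SK_VCLOCK_NONE)
		return 0;

	*ns_out = ticks * 100;	/* simplified: 100 ns per reference tick */
	return 1;
}
```

The in-kernel clocksource read handles the same sentinel differently: it
falls back to reading HV_X64_MSR_TIME_REF_COUNT, which is not available to
userspace code in the vDSO.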