* [PATCH 0/6] KVM support for TSC scaling
@ 2011-02-09 17:29 Joerg Roedel
  2011-02-09 17:29 ` [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept Joerg Roedel
                   ` (6 more replies)
  0 siblings, 7 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Zachary Amsden

Hi Avi, Marcelo,

here is the patch-set to implement the TSC-scaling feature of upcoming
AMD CPUs. When this feature is supported, the CPU provides a new MSR
that holds a multiplier for the hardware TSC. The multiplier is applied
to the value returned by rdtsc[p] and by reads of MSR 0x10. This feature
can be used to emulate a given TSC frequency for the guest.

Patch 1 is not directly related to this patch-set; it only fixes a bug
that prevented me from testing these patches. In fact it fixes the same
bug Andre sent a patch for. After the discussion about his patch he told
me to just post mine, so here it is.

Thanks,

	Joerg

Diff-stat:

 arch/x86/include/asm/kvm_host.h  |    4 ++
 arch/x86/include/asm/msr-index.h |    1 +
 arch/x86/kvm/svm.c               |   91 +++++++++++++++++++++++++++++++++++++-
 arch/x86/kvm/vmx.c               |   12 +++++
 arch/x86/kvm/x86.c               |   60 ++++++++++++++++++++++---
 include/linux/kvm.h              |    4 ++
 6 files changed, 164 insertions(+), 8 deletions(-)

Shortlog:

Joerg Roedel (6):
      KVM: SVM: Advance instruction pointer in dr_intercept
      KVM: SVM: Implement infrastructure for TSC_RATE_MSR
      KVM: X86: Let kvm-clock report the right tsc frequency
      KVM: SVM: Propagate requested TSC frequency on vcpu init
      KVM: X86: Delegate tsc-offset calculation to architecture code
      KVM: X86: Implement userspace interface to set virtual_tsc_khz



^ permalink raw reply	[flat|nested] 29+ messages in thread

* [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-22 11:14   ` Roedel, Joerg
  2011-02-09 17:29 ` [PATCH 2/6] KVM: SVM: Implement infrastructure for TSC_RATE_MSR Joerg Roedel
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

The dr_interception function uses a new cpu-feature called
decode-assists when it is available. That code-path does
not advance the guest-rip, so the guest dead-loops over
mov-dr instructions. This patch fixes that by skipping the
emulated instruction.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 73a8f1d..bfb4948 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -2777,6 +2777,8 @@ static int dr_interception(struct vcpu_svm *svm)
 			kvm_register_write(&svm->vcpu, reg, val);
 	}
 
+	skip_emulated_instruction(&svm->vcpu);
+
 	return 1;
 }
 
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 2/6] KVM: SVM: Implement infrastructure for TSC_RATE_MSR
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
  2011-02-09 17:29 ` [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-09 17:29 ` [PATCH 3/6] KVM: X86: Let kvm-clock report the right tsc frequency Joerg Roedel
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

This patch enhances the kvm_amd module with functions to
support the TSC_RATE_MSR which can be used to set a given
tsc frequency for the guest vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/msr-index.h |    1 +
 arch/x86/kvm/svm.c               |   37 ++++++++++++++++++++++++++++++++++++-
 2 files changed, 37 insertions(+), 1 deletions(-)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 5bfafb6..fdac548 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -106,6 +106,7 @@
    complete list. */
 
 #define MSR_AMD64_PATCH_LEVEL		0x0000008b
+#define MSR_AMD64_TSC_RATIO		0xc0000104
 #define MSR_AMD64_NB_CFG		0xc001001f
 #define MSR_AMD64_PATCH_LOADER		0xc0010020
 #define MSR_AMD64_OSVW_ID_LENGTH	0xc0010140
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index bfb4948..c96c0a6 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -63,6 +63,8 @@ MODULE_LICENSE("GPL");
 
 #define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
 
+#define TSC_RATIO_RSVD          0xffffff0000000000ULL
+
 static bool erratum_383_found __read_mostly;
 
 static const u32 host_save_user_msrs[] = {
@@ -142,6 +144,12 @@ struct vcpu_svm {
 	unsigned int3_injected;
 	unsigned long int3_rip;
 	u32 apf_reason;
+
+	struct {
+		bool enabled;
+		u64  ratio;
+	} tsc_scale;
+
 };
 
 #define MSR_INVALID			0xffffffffU
@@ -852,6 +860,25 @@ static void init_sys_seg(struct vmcb_seg *seg, uint32_t type)
 	seg->base = 0;
 }
 
+static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+	u64 _tsc = tsc;
+
+	if (svm->tsc_scale.enabled) {
+		u64 mult, frac;
+
+		mult  = svm->tsc_scale.ratio >> 32;
+		frac  = svm->tsc_scale.ratio & ((1ULL << 32) - 1);
+
+		_tsc *= mult;
+		_tsc += (tsc >> 32) * frac;
+		_tsc += ((tsc & ((1ULL << 32) - 1)) * frac) >> 32;
+	}
+
+	return _tsc;
+}
+
 static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -2808,7 +2835,9 @@ static int svm_get_msr(struct kvm_vcpu *vcpu, unsigned ecx, u64 *data)
 	case MSR_IA32_TSC: {
 		struct vmcb *vmcb = get_host_vmcb(svm);
 
-		*data = vmcb->control.tsc_offset + native_read_tsc();
+		*data = vmcb->control.tsc_offset +
+			svm_scale_tsc(vcpu, native_read_tsc());
+
 		break;
 	}
 	case MSR_STAR:
@@ -3564,6 +3593,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	clgi();
 
+	if (static_cpu_has(X86_FEATURE_TSCRATEMSR) && svm->tsc_scale.enabled)
+		wrmsrl(MSR_AMD64_TSC_RATIO, svm->tsc_scale.ratio);
+
 	local_irq_enable();
 
 	asm volatile (
@@ -3647,6 +3679,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	local_irq_disable();
 
+	if (static_cpu_has(X86_FEATURE_TSCRATEMSR) && svm->tsc_scale.enabled)
+		wrmsr(MSR_AMD64_TSC_RATIO, 0, 1);
+
 	vcpu->arch.cr2 = svm->vmcb->save.cr2;
 	vcpu->arch.regs[VCPU_REGS_RAX] = svm->vmcb->save.rax;
 	vcpu->arch.regs[VCPU_REGS_RSP] = svm->vmcb->save.rsp;
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread
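
For readers following the arithmetic in svm_scale_tsc() from patch 2/6
above: the ratio appears to be an 8.32 fixed-point number, with the
integer part in bits 39:32 and a 32-bit fraction below it. That layout is
inferred from TSC_RATIO_RSVD and the shifts in the patch, not from a
hardware manual. The following is a standalone userspace sketch of the
same multiply, not the kernel code; scale_tsc() and RATIO_RSVD are
illustrative names.

```c
#include <assert.h>
#include <stdint.h>

/* Bits 63:40 are treated as reserved, mirroring TSC_RATIO_RSVD. */
#define RATIO_RSVD 0xffffff0000000000ULL

/*
 * Multiply tsc by an 8.32 fixed-point ratio without a 128-bit type.
 * With ratio = mult + frac / 2^32 and tsc = hi * 2^32 + lo:
 *   tsc * ratio = tsc * mult + hi * frac + (lo * frac) / 2^32
 * The result wraps modulo 2^64, as the counter itself does.
 */
static uint64_t scale_tsc(uint64_t tsc, uint64_t ratio)
{
	uint64_t mult, frac, scaled;

	/* Precondition of this sketch: reserved bits are clear. */
	assert((ratio & RATIO_RSVD) == 0);

	mult = ratio >> 32;
	frac = ratio & 0xffffffffULL;

	scaled  = tsc * mult;
	scaled += (tsc >> 32) * frac;
	scaled += ((tsc & 0xffffffffULL) * frac) >> 32;

	return scaled;
}
```

Under this reading a ratio of 1ULL << 32 means 1.0, which matches the
wrmsr(MSR_AMD64_TSC_RATIO, 0, 1) used in the patch to restore the host
value after the guest exits (low word 0, high word 1).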

* [PATCH 3/6] KVM: X86: Let kvm-clock report the right tsc frequency
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
  2011-02-09 17:29 ` [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept Joerg Roedel
  2011-02-09 17:29 ` [PATCH 2/6] KVM: SVM: Implement infrastructure for TSC_RATE_MSR Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-09 17:29 ` [PATCH 4/6] KVM: SVM: Propagate requested TSC frequency on vcpu init Joerg Roedel
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

This patch changes the kvm_guest_time_update function to use
the TSC frequency the guest actually has for updating its clock.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    2 ++
 arch/x86/kvm/svm.c              |    8 ++++++++
 arch/x86/kvm/vmx.c              |    6 ++++++
 arch/x86/kvm/x86.c              |   12 ++++++++++--
 4 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ffd7f8d..9686950 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -592,6 +592,8 @@ struct kvm_x86_ops {
 
 	void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
+	bool (*use_virtual_tsc_khz)(struct kvm_vcpu *vcpu);
+
 	void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
 	const struct trace_print_flags *exit_reasons_str;
 };
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index c96c0a6..f51f757 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -905,6 +905,13 @@ static void svm_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment)
 	mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
 }
 
+static bool svm_use_virtual_tsc_khz(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_svm *svm = to_svm(vcpu);
+
+	return svm->tsc_scale.enabled;
+}
+
 static void init_vmcb(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -3976,6 +3983,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 
 	.write_tsc_offset = svm_write_tsc_offset,
 	.adjust_tsc_offset = svm_adjust_tsc_offset,
+	.use_virtual_tsc_khz = svm_use_virtual_tsc_khz,
 
 	.set_tdp_cr3 = set_tdp_cr3,
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ae4f02d..c227a6b 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1164,6 +1164,11 @@ static void vmx_adjust_tsc_offset(struct kvm_vcpu *vcpu, s64 adjustment)
 	vmcs_write64(TSC_OFFSET, offset + adjustment);
 }
 
+static bool vmx_use_virtual_tsc_khz(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -4443,6 +4448,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 
 	.write_tsc_offset = vmx_write_tsc_offset,
 	.adjust_tsc_offset = vmx_adjust_tsc_offset,
+	.use_virtual_tsc_khz = vmx_use_virtual_tsc_khz,
 
 	.set_tdp_cr3 = vmx_set_cr3,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8575d85..597abc8 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -979,6 +979,14 @@ static inline int kvm_tsc_changes_freq(void)
 	return ret;
 }
 
+static u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu)
+{
+	if (kvm_x86_ops->use_virtual_tsc_khz(vcpu))
+		return vcpu->kvm->arch.virtual_tsc_khz;
+	else
+		return __this_cpu_read(cpu_tsc_khz);
+}
+
 static inline u64 nsec_to_cycles(u64 nsec)
 {
 	u64 ret;
@@ -1010,6 +1018,7 @@ static u64 compute_guest_tsc(struct kvm_vcpu *vcpu, s64 kernel_ns)
 	return tsc;
 }
 
+
 void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 {
 	struct kvm *kvm = vcpu->kvm;
@@ -1072,8 +1081,7 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	local_irq_save(flags);
 	kvm_get_msr(v, MSR_IA32_TSC, &tsc_timestamp);
 	kernel_ns = get_kernel_ns();
-	this_tsc_khz = __this_cpu_read(cpu_tsc_khz);
-
+	this_tsc_khz = vcpu_tsc_khz(v);
 	if (unlikely(this_tsc_khz == 0)) {
 		local_irq_restore(flags);
 		kvm_make_request(KVM_REQ_CLOCK_UPDATE, v);
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* [PATCH 4/6] KVM: SVM: Propagate requested TSC frequency on vcpu init
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
                   ` (2 preceding siblings ...)
  2011-02-09 17:29 ` [PATCH 3/6] KVM: X86: Let kvm-clock report the right tsc frequency Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-09 17:29 ` [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code Joerg Roedel
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

This patch implements the propagation of the VM
virtual_tsc_khz into each vcpu data-structure to enable the
tsc-scaling feature.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/kvm/svm.c |   32 ++++++++++++++++++++++++++++++++
 1 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f51f757..29833a7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -879,6 +879,35 @@ static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 	return _tsc;
 }
 
+static bool svm_vcpu_init_tsc(struct kvm *kvm, struct vcpu_svm *svm)
+{
+	u64 raw_tsc, tsc, new_tsc;
+	u64 ratio;
+	u64 khz;
+
+	/* TSC scaling supported? */
+	if (!boot_cpu_has(X86_FEATURE_TSCRATEMSR))
+		goto out;
+
+	/* Guest tsc same frequency as host tsc? */
+	if (kvm->arch.virtual_tsc_khz == tsc_khz)
+		goto out;
+
+	khz = kvm->arch.virtual_tsc_khz;
+
+	/* TSC scaling required  - calculate ratio */
+	ratio = khz << 32;
+	do_div(ratio, tsc_khz);
+	if (ratio == 0 || ratio & TSC_RATIO_RSVD)
+		return false;
+
+	svm->tsc_scale.ratio   = ratio;
+	svm->tsc_scale.enabled = true;
+
+out:
+	return true;
+}
+
 static void svm_write_tsc_offset(struct kvm_vcpu *vcpu, u64 offset)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1084,6 +1113,9 @@ static struct kvm_vcpu *svm_create_vcpu(struct kvm *kvm, unsigned int id)
 	if (err)
 		goto free_svm;
 
+	if (!svm_vcpu_init_tsc(kvm, svm))
+		goto uninit;
+
 	err = -ENOMEM;
 	page = alloc_page(GFP_KERNEL);
 	if (!page)
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread
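
The ratio calculation in svm_vcpu_init_tsc() from patch 4/6 above can be
tried out in userspace. The sketch below is only an illustration under
the assumption that the ratio is an 8.32 fixed-point value; tsc_ratio()
and RATIO_RSVD are made-up names, and the explicit check for a zero host
frequency is an addition that the patch does not contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bits 63:40 are treated as reserved, mirroring TSC_RATIO_RSVD. */
#define RATIO_RSVD 0xffffff0000000000ULL

/*
 * Compute ratio = guest_khz / host_khz as an 8.32 fixed-point value.
 * Returns false when the ratio cannot be programmed: the host frequency
 * is zero, the ratio truncates to zero, or its integer part needs more
 * than 8 bits.
 */
static bool tsc_ratio(uint32_t guest_khz, uint32_t host_khz, uint64_t *ratio)
{
	uint64_t r;

	assert(ratio != NULL);

	if (host_khz == 0)
		return false;

	/* guest_khz is 32 bits wide, so the shift cannot overflow. */
	r = ((uint64_t)guest_khz << 32) / host_khz;
	if (r == 0 || (r & RATIO_RSVD))
		return false;

	*ratio = r;
	return true;
}
```

For example, a 1 GHz guest on a 2 GHz host gives 0x80000000 (0.5), while
a guest frequency 400 times the host frequency is rejected because the
integer part does not fit into 8 bits.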

* [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
                   ` (3 preceding siblings ...)
  2011-02-09 17:29 ` [PATCH 4/6] KVM: SVM: Propagate requested TSC frequency on vcpu init Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-11 22:12   ` Zachary Amsden
  2011-02-09 17:29 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
  2011-02-13 15:19 ` [PATCH 0/6] KVM support for TSC scaling Avi Kivity
  6 siblings, 1 reply; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

With TSC scaling in SVM the tsc-offset needs to be
calculated differently. This patch propagates this
calculation into the architecture specific modules so that
this complexity can be handled there.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/svm.c              |   11 ++++++++++-
 arch/x86/kvm/vmx.c              |    6 ++++++
 arch/x86/kvm/x86.c              |   10 +++++-----
 4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9686950..8c40425 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -593,6 +593,7 @@ struct kvm_x86_ops {
 	void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
 
 	bool (*use_virtual_tsc_khz)(struct kvm_vcpu *vcpu);
+	u64 (*compute_tsc_offset)(struct kvm_vcpu *vcpu, u64 target_tsc);
 
 	void (*get_exit_info)(struct kvm_vcpu *vcpu, u64 *info1, u64 *info2);
 	const struct trace_print_flags *exit_reasons_str;
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 29833a7..f938585 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -881,7 +881,6 @@ static u64 svm_scale_tsc(struct kvm_vcpu *vcpu, u64 tsc)
 
 static bool svm_vcpu_init_tsc(struct kvm *kvm, struct vcpu_svm *svm)
 {
-	u64 raw_tsc, tsc, new_tsc;
 	u64 ratio;
 	u64 khz;
 
@@ -941,6 +940,15 @@ static bool svm_use_virtual_tsc_khz(struct kvm_vcpu *vcpu)
 	return svm->tsc_scale.enabled;
 }
 
+static u64 svm_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
+{
+	u64 tsc;
+
+	tsc = svm_scale_tsc(vcpu, native_read_tsc());
+
+	return target_tsc - tsc;
+}
+
 static void init_vmcb(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *control = &svm->vmcb->control;
@@ -4016,6 +4024,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.write_tsc_offset = svm_write_tsc_offset,
 	.adjust_tsc_offset = svm_adjust_tsc_offset,
 	.use_virtual_tsc_khz = svm_use_virtual_tsc_khz,
+	.compute_tsc_offset = svm_compute_tsc_offset,
 
 	.set_tdp_cr3 = set_tdp_cr3,
 };
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index c227a6b..9bbdf1f 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -1169,6 +1169,11 @@ static bool vmx_use_virtual_tsc_khz(struct kvm_vcpu *vcpu)
 	return false;
 }
 
+static u64 vmx_compute_tsc_offset(struct kvm_vcpu *vcpu, u64 target_tsc)
+{
+	return target_tsc - native_read_tsc();
+}
+
 /*
  * Reads an msr value (of 'msr_index') into 'pdata'.
  * Returns 0 on success, non-0 otherwise.
@@ -4449,6 +4454,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.write_tsc_offset = vmx_write_tsc_offset,
 	.adjust_tsc_offset = vmx_adjust_tsc_offset,
 	.use_virtual_tsc_khz = vmx_use_virtual_tsc_khz,
+	.compute_tsc_offset = vmx_compute_tsc_offset,
 
 	.set_tdp_cr3 = vmx_set_cr3,
 };
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 597abc8..6caaf4b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -987,7 +987,7 @@ static u64 vcpu_tsc_khz(struct kvm_vcpu *vcpu)
 		return __this_cpu_read(cpu_tsc_khz);
 }
 
-static inline u64 nsec_to_cycles(u64 nsec)
+static inline u64 nsec_to_cycles(struct kvm_vcpu *vcpu, u64 nsec)
 {
 	u64 ret;
 
@@ -995,7 +995,7 @@ static inline u64 nsec_to_cycles(u64 nsec)
 	if (kvm_tsc_changes_freq())
 		printk_once(KERN_WARNING
 		 "kvm: unreliable cycle conversion on adjustable rate TSC\n");
-	ret = nsec * __this_cpu_read(cpu_tsc_khz);
+	ret = nsec * vcpu_tsc_khz(vcpu);
 	do_div(ret, USEC_PER_SEC);
 	return ret;
 }
@@ -1027,7 +1027,7 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 	s64 sdiff;
 
 	spin_lock_irqsave(&kvm->arch.tsc_write_lock, flags);
-	offset = data - native_read_tsc();
+	offset = kvm_x86_ops->compute_tsc_offset(vcpu, data);
 	ns = get_kernel_ns();
 	elapsed = ns - kvm->arch.last_tsc_nsec;
 	sdiff = data - kvm->arch.last_tsc_write;
@@ -1043,13 +1043,13 @@ void kvm_write_tsc(struct kvm_vcpu *vcpu, u64 data)
 	 * In that case, for a reliable TSC, we can match TSC offsets,
 	 * or make a best guest using elapsed value.
 	 */
-	if (sdiff < nsec_to_cycles(5ULL * NSEC_PER_SEC) &&
+	if (sdiff < nsec_to_cycles(vcpu, 5ULL * NSEC_PER_SEC) &&
 	    elapsed < 5ULL * NSEC_PER_SEC) {
 		if (!check_tsc_unstable()) {
 			offset = kvm->arch.last_tsc_offset;
 			pr_debug("kvm: matched tsc offset for %llu\n", data);
 		} else {
-			u64 delta = nsec_to_cycles(elapsed);
+			u64 delta = nsec_to_cycles(vcpu, elapsed);
 			offset += delta;
 			pr_debug("kvm: adjusted tsc offset by %llu\n", delta);
 		}
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread
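
A note on the unit conversion that patch 5/6 above threads the vcpu
through: nsec_to_cycles() multiplies nanoseconds by a frequency in kHz
and divides by USEC_PER_SEC. The sketch below restates that in userspace
with the frequency passed in explicitly; the overflow assertion is an
addition for the sketch and is not part of the kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/*
 * Convert nanoseconds to TSC cycles at a rate given in kHz:
 *   cycles = ns * (khz * 10^3 cycles/s) / (10^9 ns/s) = ns * khz / 10^6
 */
static uint64_t nsec_to_cycles(uint64_t nsec, uint64_t khz)
{
	/* The sketch refuses inputs whose product would overflow 64 bits. */
	assert(khz == 0 || nsec <= UINT64_MAX / khz);

	return nsec * khz / USEC_PER_SEC;
}
```

The five-second window in kvm_write_tsc() is therefore 10^10 cycles for
a guest running at 2 GHz but only 5 * 10^9 cycles at 1 GHz, which is
presumably why the conversion has to use the guest frequency once
scaling is active.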

* [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
                   ` (4 preceding siblings ...)
  2011-02-09 17:29 ` [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code Joerg Roedel
@ 2011-02-09 17:29 ` Joerg Roedel
  2011-02-13 15:12   ` Avi Kivity
  2011-02-13 15:19 ` [PATCH 0/6] KVM support for TSC scaling Avi Kivity
  6 siblings, 1 reply; 29+ messages in thread
From: Joerg Roedel @ 2011-02-09 17:29 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti
  Cc: kvm, linux-kernel, Zachary Amsden, Joerg Roedel

This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/svm.c              |    3 +++
 arch/x86/kvm/x86.c              |   38 ++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h             |    4 ++++
 4 files changed, 46 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 8c40425..dfdd0aa 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -633,6 +633,7 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
 u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 extern bool tdp_enabled;
+extern int kvm_has_tsc_control;
 
 enum emulation_result {
 	EMULATE_DONE,       /* no further processing */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f938585..0b8f4f7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -797,6 +797,9 @@ static __init int svm_hardware_setup(void)
 	if (boot_cpu_has(X86_FEATURE_FXSR_OPT))
 		kvm_enable_efer_bits(EFER_FFXSR);
 
+	if (boot_cpu_has(X86_FEATURE_TSCRATEMSR))
+		kvm_has_tsc_control = 1;
+
 	if (nested) {
 		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 6caaf4b..1ac94cc 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -99,6 +99,9 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 int ignore_msrs = 0;
 module_param_named(ignore_msrs, ignore_msrs, bool, S_IRUGO | S_IWUSR);
 
+int kvm_has_tsc_control;
+EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
+
 #define KVM_NR_SHARED_MSRS 16
 
 struct kvm_shared_msrs_global {
@@ -2024,6 +2027,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XCRS:
 		r = cpu_has_xsave;
 		break;
+	case KVM_CAP_TSC_CONTROL:
+		r = kvm_has_tsc_control;
+		break;
 	default:
 		r = 0;
 		break;
@@ -3575,6 +3581,38 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_TSC_KHZ: {
+		u32 user_tsc_khz;
+
+		if (!kvm_has_tsc_control)
+			break;
+
+		/*
+		 * We force the tsc frequency to be set before any
+		 * vcpu is created
+		 */
+		if (atomic_read(&kvm->online_vcpus) > 0)
+			goto out;
+
+		user_tsc_khz = arg;
+
+		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
+
+		r = 0;
+		goto out;
+	}
+	case KVM_GET_TSC_KHZ: {
+
+		if (!kvm_has_tsc_control)
+			break;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &kvm->arch.virtual_tsc_khz, sizeof(__u32)))
+			goto out;
+
+		r = 0;
+		goto out;
+	}
 
 	default:
 		;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index ea2dc1a..a96ff92 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -541,6 +541,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_GET_PVINFO 57
 #define KVM_CAP_PPC_IRQ_LEVEL 58
 #define KVM_CAP_ASYNC_PF 59
+#define KVM_CAP_TSC_CONTROL 60
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -677,6 +678,9 @@ struct kvm_clock_data {
 #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
 /* Available with KVM_CAP_PPC_GET_PVINFO */
 #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
+/* Available with KVM_CAP_TSC_CONTROL */
+#define KVM_SET_TSC_KHZ           _IOW(KVMIO,  0xa2, __u32)
+#define KVM_GET_TSC_KHZ           _IOR(KVMIO,  0xa3, __u32)
 
 /*
  * ioctls for vcpu fds
-- 
1.7.1



^ permalink raw reply related	[flat|nested] 29+ messages in thread

* Re: [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code
  2011-02-09 17:29 ` [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code Joerg Roedel
@ 2011-02-11 22:12   ` Zachary Amsden
  2011-02-21 17:16     ` Roedel, Joerg
  0 siblings, 1 reply; 29+ messages in thread
From: Zachary Amsden @ 2011-02-11 22:12 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Avi Kivity, Marcelo Tosatti, kvm, linux-kernel

On 02/09/2011 12:29 PM, Joerg Roedel wrote:
> With TSC scaling in SVM the tsc-offset needs to be
> calculated differently. This patch propagates this
> calculation into the architecture specific modules so that
> this complexity can be handled there.
>
> Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h |    1 +
>   arch/x86/kvm/svm.c              |   11 ++++++++++-
>   arch/x86/kvm/vmx.c              |    6 ++++++
>   arch/x86/kvm/x86.c              |   10 +++++-----
>   4 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 9686950..8c40425 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -593,6 +593,7 @@ struct kvm_x86_ops {
>   	void (*write_tsc_offset)(struct kvm_vcpu *vcpu, u64 offset);
>
>   	bool (*use_virtual_tsc_khz)(struct kvm_vcpu *vcpu);
> +	u64 (*compute_tsc_offset)(struct kvm_vcpu *vcpu, u64 target_tsc);
>    

So I've gone over this series and the only issue I see so far is with 
this patch, and it doesn't have to do with upstream code, rather with 
code I was about to send.

Logically, the compensation done by adjust_tsc_offset should also be 
scaled; currently, this happens only for two reasons, both of which are 
meant to deal with unstable TSCs; since TSC scaling won't happen on 
those processors with unstable TSCs, we don't need to worry about it there.

I have an upcoming patch which does complicate things a bit, which deals 
with host suspend.  In that case, the host TSC goes backwards and the 
offsets need to be recomputed.  However, there is no convenient time to 
set the offset again; on VMX, the hardware will not yet be set up and so 
can't easily write the offset field in the VMCS.  We also can't put a 
synchronization barrier on all the VCPUs to write the offset before they 
start running without getting into a difficult situation with locking.

So instead, the approach I took was to re-use the adjust_tsc_offset 
function and accumulate offsets to apply.

For SVM with TSC scaling, the offset to apply as an adjustment in this 
case needs to be scaled.  Setting guest TSC (gtsc) equal to the new 
guest TSC (gtsc'), we have:

gtsc  = htsc * mult + offset
gtsc' = htsc' * mult + offset'
gtsc' = gtsc
offset' = htsc * mult + offset - htsc' * mult
offset' = (htsc - htsc') * mult + offset

so the delta offset needs to be (htsc - htsc') * mult

We will instead be passing (htsc - htsc') as the adjustment value; the 
solution seems simple, we have to scale it up as well in the 
adjust_tsc_offset function.

However, the problem is that we need a new architecture specific 
function or API change because not all call sites for adjust_tsc want to 
have adjustments scaled - the call site dealing with tsc_catchup is 
actually working in guest cycles already, so should not be scaled again.

We could have separate functions to adjust TSC cycles by either guest or 
host cycle amounts, or pass a flag to adjust_tsc_offset indicating 
whether the adjustment is to be applied in guest cycles or host cycles.

The resulting API will be slightly asymmetric, as compute_tsc_offset 
lets the generic code compute in terms of hardware offsets, but in the 
adjustment case, there isn't an easy way to expose the ability to 
compute in hardware offset terms.

One slight pity is that we won't be able to reuse 
svm_compute_tsc_offset, as the applied delta won't be based off a read 
of the tsc.  I can't really find a better API though, in case offsets 
are computed differently on different hardware (such as multiplying 
after the offset), then we need a function to convert guest cycles back 
to hardware cycles.

As usual, with the TSC code, it is going to require a lot of commenting 
to explain this.

Your code in general looks good.

Cheers,

Zach

^ permalink raw reply	[flat|nested] 29+ messages in thread
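
The derivation in the mail above (the new offset equals the old offset
plus (htsc - htsc') * mult) can be checked numerically. The sketch below
is a userspace illustration only: scale(), guest_tsc() and
rebase_offset() are hypothetical names, the ratio is assumed to be 8.32
fixed point, and the example values are chosen so that the fixed-point
multiply is exact. With arbitrary values, truncation can make the two
sides differ by one cycle.

```c
#include <assert.h>
#include <stdint.h>

/* 8.32 fixed-point multiply, same structure as svm_scale_tsc(). */
static uint64_t scale(uint64_t cycles, uint64_t ratio)
{
	uint64_t mult = ratio >> 32;
	uint64_t frac = ratio & 0xffffffffULL;

	return cycles * mult + (cycles >> 32) * frac +
	       (((cycles & 0xffffffffULL) * frac) >> 32);
}

/* Guest view: gtsc = htsc * mult + offset (offset applied after scaling). */
static uint64_t guest_tsc(uint64_t htsc, uint64_t ratio, uint64_t offset)
{
	return scale(htsc, ratio) + offset;
}

/*
 * Offset that keeps the guest TSC unchanged when the host TSC jumps from
 * htsc_old back to htsc_new:
 *   offset' = (htsc - htsc') * mult + offset
 */
static uint64_t rebase_offset(uint64_t offset, uint64_t htsc_old,
			      uint64_t htsc_new, uint64_t ratio)
{
	/* Only the backwards jump discussed in the mail is handled here. */
	assert(htsc_old >= htsc_new);

	return offset + scale(htsc_old - htsc_new, ratio);
}
```

With a ratio of 1.5, an offset of 500 and a host TSC that falls from
1000000 to 2000, the scaled delta preserves the guest value of 1500500,
whereas adding the raw host delta of 998000 would leave the guest at
1001500.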

* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-02-09 17:29 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
@ 2011-02-13 15:12   ` Avi Kivity
  2011-02-21 17:17     ` Roedel, Joerg
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-02-13 15:12 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/09/2011 07:29 PM, Joerg Roedel wrote:
> This patch implements two new vm-ioctls to get and set the
> virtual_tsc_khz if the machine supports tsc-scaling. Setting
> the tsc-frequency is only possible before userspace creates
> any vcpu.
>
> Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> ---
>   arch/x86/include/asm/kvm_host.h |    1 +
>   arch/x86/kvm/svm.c              |    3 +++
>   arch/x86/kvm/x86.c              |   38 ++++++++++++++++++++++++++++++++++++++
>   include/linux/kvm.h             |    4 ++++
>   4 files changed, 46 insertions(+), 0 deletions(-)
>

Documentation/kvm/api.txt +++++++++++++

> @@ -633,6 +633,7 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
>   u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
>
>   extern bool tdp_enabled;
> +extern int kvm_has_tsc_control;

bool


> +	case KVM_SET_TSC_KHZ: {
> +		u32 user_tsc_khz;
> +
> +		if (!kvm_has_tsc_control)
> +			break;
> +
> +		/*
> +		 * We force the tsc frequency to be set before any
> +		 * vcpu is created
> +		 */
> +		if (atomic_read(&kvm->online_vcpus)>  0)
> +			goto out;

What if a vcpu is created here?  No locking AFAICS.

> +
> +		user_tsc_khz = arg;
> +
> +		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
> +

Error check for impossible values (0, values the tsc multiplier can't 
reach)?

> +		r = 0;
> +		goto out;
> +	}
> +	case KVM_GET_TSC_KHZ: {
> +
> +		if (!kvm_has_tsc_control)
> +			break;
> +
> +		r = -EFAULT;
> +		if (copy_to_user(argp,&kvm->arch.virtual_tsc_khz, sizeof(__u32)))
> +			goto out;

Should be the return value, no?

> +
> +		r = 0;
> +		goto out;
> +	}
>
>   	default:
>   		;
>
>
> @@ -677,6 +678,9 @@ struct kvm_clock_data {
>   #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
>   /* Available with KVM_CAP_PPC_GET_PVINFO */
>   #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
> +/* Available with KVM_CAP_TSC_CONTROL */
> +#define KVM_SET_TSC_KHZ           _IOW(KVMIO,  0xa2, __u32)
> +#define KVM_GET_TSC_KHZ           _IOR(KVMIO,  0xa3, __u32)
>   

_IO() - use arg or return value
_IOW/_IOR - copy_to/from_user()

pick one, but don't mix.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 29+ messages in thread
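
To illustrate the last review comment above (_IO() versus _IOW/_IOR):
the macro used to define an ioctl number encodes whether the argument is
a pointer to user memory and how large that memory is. The sketch below
uses only the encoding helpers from <linux/ioctl.h>; the DEMO_* names
are hypothetical and exist only to put the two variants side by side,
and 0xAE is the value of KVMIO in <linux/kvm.h>.

```c
#include <assert.h>
#include <stdint.h>
#include <linux/ioctl.h>

#define KVMIO 0xAE

/* Variant 1: plain _IO(); the value travels in arg or the return value. */
#define DEMO_SET_TSC_KHZ_VAL  _IO(KVMIO, 0xa2)
/* Variant 2: _IOW(); arg is a user pointer to a 32-bit value. */
#define DEMO_SET_TSC_KHZ_PTR  _IOW(KVMIO, 0xa2, uint32_t)

/* Checks what each request number tells the kernel and tools like strace. */
static void check_tsc_ioctl_encodings(void)
{
	/* _IO() encodes no direction and no payload size. */
	assert(_IOC_DIR(DEMO_SET_TSC_KHZ_VAL) == _IOC_NONE);
	assert(_IOC_SIZE(DEMO_SET_TSC_KHZ_VAL) == 0);

	/* _IOW() encodes "userspace writes 4 bytes through a pointer". */
	assert(_IOC_DIR(DEMO_SET_TSC_KHZ_PTR) == _IOC_WRITE);
	assert(_IOC_SIZE(DEMO_SET_TSC_KHZ_PTR) == sizeof(uint32_t));
}
```

In the patch as posted, KVM_SET_TSC_KHZ is declared with _IOW() but the
handler reads the frequency straight from arg, which is the mix the
review asks to avoid.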

* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
                   ` (5 preceding siblings ...)
  2011-02-09 17:29 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
@ 2011-02-13 15:19 ` Avi Kivity
  2011-02-21 17:28   ` Roedel, Joerg
  6 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-02-13 15:19 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/09/2011 07:29 PM, Joerg Roedel wrote:
> Hi Avi, Marcelo,
>
> here is the patch-set to implement the TSC-scaling feature of upcoming
> AMD CPUs. When this feature is supported the CPU provides a new MSR
> which holds a multiplier for the hardware TSC which is applied on the
> value rdtsc[p] and reads of MSR 0x10. This feature can be used to
> emulate a given tsc frequency for the guest.
> Patch 1 is not directly related to this patch-set because it only fixes
> a bug which prevented me from testing these patches. In fact it fixes
> the same bug Andre sent a patch for. But after the discussion about his
> patch he told me to just post my patch and thus here it is.
>

Questions:
- the tsc multiplier really is a multiplier, right?  Not an addend that 
is added every cycle.

So

     wrmsr(TSC, 1e9)
     wrmsr(TSC_MULT, 2.0000)
     t = rdtsc()

will return about 2e9, not 1e9 + 2*(time to execute the code snippet) ?

- what's the cost of wrmsr(TSC_MULT)?

There are really two ways to implement this feature.  One is fully 
generic, like you did.  The other is to implement it at the host level - 
have a sysfs file and/or kernel parameter for the desired tsc frequency, 
write it once, and forget about it.  Trust management to set the host 
tsc frequency to the same value on all hosts in a migration cluster.

The advantages of the simpler implementation are, well, that it's 
simpler, and second that it avoids two wrmsrs per exit.  We could 
combine both implementations, and have

   if (guest_mult != host_mult)
       wrmsr(TSC_MULT, guest_mult)

etc.  But I'd like to understand if there are additional motivations for 
per-guest tsc frequency.
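
The combined approach can be sketched in C as follows. This is only a toy model, not KVM code: `struct cpu_model`, `model_wrmsr_tsc_mult()` and `switch_tsc_mult()` are hypothetical names, and the MSR is simulated by a struct field so that the number of (expensive) writes can be counted.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical stand-in for one CPU's TSC multiplier MSR. */
struct cpu_model {
	uint64_t tsc_mult;	/* value currently loaded in the modelled MSR */
	unsigned msr_writes;	/* how many times the costly write was issued */
};

/* Models wrmsr(TSC_MULT, mult); a real write would be the expensive part. */
static void model_wrmsr_tsc_mult(struct cpu_model *cpu, uint64_t mult)
{
	cpu->tsc_mult = mult;
	cpu->msr_writes++;
}

/*
 * Load 'wanted' only if it differs from the value already in place.
 * Returns true if a write was needed.
 */
static bool switch_tsc_mult(struct cpu_model *cpu, uint64_t wanted)
{
	if (cpu->tsc_mult == wanted)
		return false;

	model_wrmsr_tsc_mult(cpu, wanted);
	return true;
}
```

With a check like this, guests whose multiplier matches the host's never pay for the write on entry or exit, while guests with a different rate still work.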

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code
  2011-02-11 22:12   ` Zachary Amsden
@ 2011-02-21 17:16     ` Roedel, Joerg
  0 siblings, 0 replies; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-21 17:16 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Avi Kivity, Marcelo Tosatti, kvm, linux-kernel

(Sorry for the delay, I had to spend some days sick at home :-( )

On Fri, Feb 11, 2011 at 05:12:29PM -0500, Zachary Amsden wrote:
> On 02/09/2011 12:29 PM, Joerg Roedel wrote:

> So I've gone over this series and the only issue I see so far is with 
> this patch, and it doesn't have to do with upstream code, rather with 
> code I was about to send.
> 
> Logically, the compensation done by adjust_tsc_offset should also be 
> scaled; currently, this happens only for two reasons, both of which are 
> meant to deal with unstable TSCs; since TSC scaling won't happen on 
> those processors with unstable TSCs, we don't need to worry about it there.

The tsc_offset is applied after the TSC is scaled so there is no good
way to scale the offset with the TSC value itself.
What we can do is to use guest-tsc values only when we calculate an
adjustment. So any tsc-offset adjustment made with adjust_tsc_offset()
needs to be a function of guest-tsc values. One call-place of the
function already does this and the other one can be converted easily.
I'll do that in the next version of this patch-set.
From what I understand of your upcoming patch the accumulation of
tsc-offsets could also be calculated from guest-tsc values instead of
native_read_tsc() values, no?
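
A minimal sketch of that idea, under stated assumptions: the guest TSC is modelled as the scaled host TSC plus an offset, and the scale factor is expressed as guest_khz/host_khz rather than as the hardware ratio. All names (`host_to_guest_cycles()`, `struct vcpu_tsc`, `adjust_offset_by_host_delta()`) are hypothetical and not taken from the patch set.

```c
#include <stdint.h>

/*
 * Convert a cycle count measured at host_khz into guest cycles at
 * guest_khz. Splitting into quotient and remainder keeps the
 * intermediate products inside 64 bits. host_khz must be non-zero.
 */
static uint64_t host_to_guest_cycles(uint64_t host_cycles,
				     uint32_t host_khz, uint32_t guest_khz)
{
	uint64_t q = host_cycles / host_khz;
	uint64_t r = host_cycles % host_khz;

	return q * guest_khz + (r * guest_khz) / host_khz;
}

struct vcpu_tsc {
	uint32_t host_khz;
	uint32_t guest_khz;
	int64_t  offset;	/* applied after scaling, in guest cycles */
};

/* What the guest would read: scale first, then add the offset. */
static uint64_t guest_tsc(const struct vcpu_tsc *v, uint64_t host_tsc)
{
	return host_to_guest_cycles(host_tsc, v->host_khz, v->guest_khz)
		+ (uint64_t)v->offset;
}

/* An adjustment measured in host cycles is scaled before it is applied. */
static void adjust_offset_by_host_delta(struct vcpu_tsc *v, uint64_t host_delta)
{
	v->offset += (int64_t)host_to_guest_cycles(host_delta,
						   v->host_khz, v->guest_khz);
}
```

Because the offset is added after scaling, adding a raw host delta to it would be wrong by the scale factor; converting the delta first keeps the offset expressed in guest cycles.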

Regards,

	Joerg

-- 
AMD Operating System Research Center

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632



* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-02-13 15:12   ` Avi Kivity
@ 2011-02-21 17:17     ` Roedel, Joerg
  0 siblings, 0 replies; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-21 17:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On Sun, Feb 13, 2011 at 10:12:14AM -0500, Avi Kivity wrote:
> On 02/09/2011 07:29 PM, Joerg Roedel wrote:
> > This patch implements two new vm-ioctls to get and set the
> > virtual_tsc_khz if the machine supports tsc-scaling. Setting
> > the tsc-frequency is only possible before userspace creates
> > any vcpu.
> >
> > Signed-off-by: Joerg Roedel<joerg.roedel@amd.com>
> > ---
> >   arch/x86/include/asm/kvm_host.h |    1 +
> >   arch/x86/kvm/svm.c              |    3 +++
> >   arch/x86/kvm/x86.c              |   38 ++++++++++++++++++++++++++++++++++++++
> >   include/linux/kvm.h             |    4 ++++
> >   4 files changed, 46 insertions(+), 0 deletions(-)
> >
> 
> Documentation/kvm/api.txt +++++++++++++
> 
> > @@ -633,6 +633,7 @@ int kvm_pv_mmu_op(struct kvm_vcpu *vcpu, unsigned long bytes,
> >   u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
> >
> >   extern bool tdp_enabled;
> > +extern int kvm_has_tsc_control;
> 
> bool
> 
> 
> > +	case KVM_SET_TSC_KHZ: {
> > +		u32 user_tsc_khz;
> > +
> > +		if (!kvm_has_tsc_control)
> > +			break;
> > +
> > +		/*
> > +		 * We force the tsc frequency to be set before any
> > +		 * vcpu is created
> > +		 */
> > +		if (atomic_read(&kvm->online_vcpus)>  0)
> > +			goto out;
> 
> What if a vcpu is created here?  No locking AFAICS.
> 
> > +
> > +		user_tsc_khz = arg;
> > +
> > +		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
> > +
> 
> Error check for impossible values (0, values the tsc multiplier can't 
> reach)?
> 
> > +		r = 0;
> > +		goto out;
> > +	}
> > +	case KVM_GET_TSC_KHZ: {
> > +
> > +		if (!kvm_has_tsc_control)
> > +			break;
> > +
> > +		r = -EFAULT;
> > +		if (copy_to_user(argp,&kvm->arch.virtual_tsc_khz, sizeof(__u32)))
> > +			goto out;
> 
> Should be the return value, no?
> 
> > +
> > +		r = 0;
> > +		goto out;
> > +	}
> >
> >   	default:
> >   		;
> >
> >
> > @@ -677,6 +678,9 @@ struct kvm_clock_data {
> >   #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
> >   /* Available with KVM_CAP_PPC_GET_PVINFO */
> >   #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
> > +/* Available with KVM_CAP_TSC_CONTROL */
> > +#define KVM_SET_TSC_KHZ           _IOW(KVMIO,  0xa2, __u32)
> > +#define KVM_GET_TSC_KHZ           _IOR(KVMIO,  0xa3, __u32)
> >   
> 
> _IO() - use arg or return value
> _IOW/_IOR - copy_to/from_user()
> 
> pick one, but don't mix.

Thanks, I'll fix these issues in the next version.
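
The requested check for impossible values could look roughly like the sketch below. It assumes the multiplier is an 8.32 fixed-point number with 40 valid bits (which matches the reserved-bits mask used in this series and AMD's documented TSC ratio layout, but is an assumption as far as this thread goes). `tsc_khz_to_ratio()`, `RATIO_MIN` and `RATIO_MAX` are hypothetical names.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed limits of an 8.32 fixed-point multiplier (40 valid bits). */
#define RATIO_MIN 0x0000000000000001ULL
#define RATIO_MAX 0x000000ffffffffffULL

/*
 * Compute the multiplier that makes a host running at host_khz appear
 * to run at guest_khz. Returns false for values the multiplier cannot
 * express (zero frequencies, or a ratio outside the valid range).
 */
static bool tsc_khz_to_ratio(uint32_t host_khz, uint32_t guest_khz,
			     uint64_t *ratio)
{
	uint64_t r;

	if (host_khz == 0 || guest_khz == 0)
		return false;

	/* guest_khz is 32 bits wide, so the shift cannot overflow. */
	r = ((uint64_t)guest_khz << 32) / host_khz;
	if (r < RATIO_MIN || r > RATIO_MAX)
		return false;

	*ratio = r;
	return true;
}
```

A caller would reject the ioctl with an error whenever this returns false, instead of silently programming a truncated ratio.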

Regards,

	Joerg


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-13 15:19 ` [PATCH 0/6] KVM support for TSC scaling Avi Kivity
@ 2011-02-21 17:28   ` Roedel, Joerg
  2011-02-21 21:25     ` Zachary Amsden
  2011-02-22 10:11     ` Avi Kivity
  0 siblings, 2 replies; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-21 17:28 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On Sun, Feb 13, 2011 at 10:19:19AM -0500, Avi Kivity wrote:
> On 02/09/2011 07:29 PM, Joerg Roedel wrote:
> > Hi Avi, Marcelo,
> >
> > here is the patch-set to implement the TSC-scaling feature of upcoming
> > AMD CPUs. When this feature is supported the CPU provides a new MSR
> > which holds a multiplier for the hardware TSC which is applied on the
> > value rdtsc[p] and reads of MSR 0x10. This feature can be used to
> > emulate a given tsc frequency for the guest.
> > Patch 1 is not directly related to this patch-set because it only fixes
> > a bug which prevented me from testing these patches. In fact it fixes
> > the same bug Andre sent a patch for. But after the discussion about his
> > patch he told me to just post my patch and thus here it is.
> >
> 
> Questions:
> - the tsc multiplier really is a multiplier, right?  Not an addend that 
> is added every cycle.

Yes, it is a real multiplier. But writes to the TSC-MSR will change the
unscaled TSC value.

> 
> So
> 
>      wrmsr(TSC, 1e9)
>      wrmsr(TSC_MULT, 2.0000)
>      t = rdtsc()
> 
> will return about 2e9, not 1e9 + 2*(time to execute the code snippet) ?

Right. And if you exchange the two wrmsr calls it will still give you
the same result.
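
The semantics described here can be modelled in a few lines of C. This is a toy model only: the 8.32 fixed-point ratio format is an assumption, and `struct tsc_model`, `write_tsc()`, `write_ratio()`, `tick()` and `read_tsc()` are hypothetical names. Writes to the TSC change the unscaled counter, and the multiplier is applied on every read.

```c
#include <stdint.h>

#define RATIO_ONE (1ULL << 32)	/* 1.0 in the assumed 8.32 fixed-point format */

struct tsc_model {
	uint64_t raw;	/* unscaled counter; this is what a TSC write changes */
	uint64_t ratio;	/* multiplier, 8.32 fixed point */
};

static void write_tsc(struct tsc_model *t, uint64_t v)   { t->raw = v; }
static void write_ratio(struct tsc_model *t, uint64_t r) { t->ratio = r; }
static void tick(struct tsc_model *t, uint64_t cycles)   { t->raw += cycles; }

/* Scaled read: floor(raw * ratio / 2^32), computed without 128-bit math. */
static uint64_t read_tsc(const struct tsc_model *t)
{
	uint64_t mult = t->ratio >> 32;			/* integer part */
	uint64_t frac = t->ratio & 0xffffffffULL;	/* fractional part */
	uint64_t res;

	res  = t->raw * mult;
	res += (t->raw >> 32) * frac;
	res += ((t->raw & 0xffffffffULL) * frac) >> 32;
	return res;
}
```

In this model, writing 1e9 to the TSC and 2.0 to the ratio yields a read of 2e9 regardless of the order of the two writes, and 100 further raw cycles show up as 200 scaled cycles.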

> - what's the cost of wrmsr(TSC_MULT)?

Hard to tell by now because I only have numbers for pre-production
hardware. 

> There are really two ways to implement this feature.  One is fully 
> generic, like you did.  The other is to implement it at the host level - 
> have a sysfs file and/or kernel parameter for the desired tsc frequency, 
> write it once, and forget about it.  Trust management to set the host 
> tsc frequency to the same value on all hosts in a migration cluster.

The motivation here is mostly flexibility. Scaling the TSC for the
whole migration cluster only makes sense if all hosts there support the
feature. But the most likely scenario is that existing migration
clusters will be extended by new machines and guests will be migrated
there. And these guests should be able to see the same TSC frequency on
the new host as they had on the old one. The older machines in the
cluster may even have different TSC frequencies. With this flexible
implementation those scenarios are possible. A host-wide setting for the
scaling will make the feature useless in those (common) scenarios.

Regards,

	Joerg


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-21 17:28   ` Roedel, Joerg
@ 2011-02-21 21:25     ` Zachary Amsden
  2011-02-22 10:11     ` Avi Kivity
  1 sibling, 0 replies; 29+ messages in thread
From: Zachary Amsden @ 2011-02-21 21:25 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Avi Kivity, Marcelo Tosatti, kvm, linux-kernel

On 02/21/2011 12:28 PM, Roedel, Joerg wrote:
> On Sun, Feb 13, 2011 at 10:19:19AM -0500, Avi Kivity wrote:
>    
>> On 02/09/2011 07:29 PM, Joerg Roedel wrote:
>>      
>>> Hi Avi, Marcelo,
>>>
>>> here is the patch-set to implement the TSC-scaling feature of upcoming
>>> AMD CPUs. When this feature is supported the CPU provides a new MSR
>>> which holds a multiplier for the hardware TSC which is applied on the
>>> value rdtsc[p] and reads of MSR 0x10. This feature can be used to
>>> emulate a given tsc frequency for the guest.
>>> Patch 1 is not directly related to this patch-set because it only fixes
>>> a bug which prevented me from testing these patches. In fact it fixes
>>> the same bug Andre sent a patch for. But after the discussion about his
>>> patch he told me to just post my patch and thus here it is.
>>>
>>>        
>> Questions:
>> - the tsc multiplier really is a multiplier, right?  Not an addend that
>> is added every cycle.
>>      
> Yes, it is a real multiplier. But writes to the TSC-MSR will change the
> unscaled TSC value.
>
>    
>> So
>>
>>       wrmsr(TSC, 1e9)
>>       wrmsr(TSC_MULT, 2.0000)
>>       t = rdtsc()
>>
>> will return about 2e9, not 1e9 + 2*(time to execute the code snippet) ?
>>      
> Right. And if you exchange the two wrmsr calls it will still give you
> the same result.
>
>    
>> - what's the cost of wrmsr(TSC_MULT)?
>>      
> Hard to tell by now because I only have numbers for pre-production
> hardware.
>
>    
>> There are really two ways to implement this feature.  One is fully
>> generic, like you did.  The other is to implement it at the host level -
>> have a sysfs file and/or kernel parameter for the desired tsc frequency,
>> write it once, and forget about it.  Trust management to set the host
>> tsc frequency to the same value on all hosts in a migration cluster.
>>      
> The motivation here is mostly flexibility. Scaling the TSC for the
> whole migration cluster only makes sense if all hosts there support the
> feature. But the most likely scenario is that existing migration
> clusters will be extended by new machines and guests will be migrated
> there. And these guests should be able to see the same TSC frequency on
> the new host as they had on the old one. The older machines in the
> cluster may even have different TSC frequencies. With this flexible
> implementation those scenarios are possible. A host-wide setting for the
> scaling will make the feature useless in those (common) scenarios.
>    

It's also possible to scale the TSCs of the cluster to be matching 
outside of the framework of KVM.  In that case, the VCPU client (qemu) 
simply needs to be smart enough to not request the TSC rate be scaled.  
That approach is completely compatible with this implementation.

If you do indeed want to have mixed speed VMs running on a single host, 
that can also be done with the approach here.

Combining the two - supporting a standard cluster rate via host scaling, 
plus a variable rate for martian VMs (those not conforming to the 
standard cluster rate) would require some more work, as the multiplier 
written back on exit from a martian would not be 1.0, rather something 
else.  Everything else should work as long as tsc_khz still expresses 
the natural rate of the TSC, even when scaled to a standard cluster 
rate.  In that case, you can also pursue Avi's suggestion of skipping 
the MSR loads for VMs where the rate matches the host rate.

Adding an export to the kernel indicating the currently applied scaling 
rate may not be a bad idea if you want to support such an implementation 
in the future.

I did have one slight concern about scaling in general.  What happens 
when the CPU khz rate is not uniformly detected across machines or 
clusters?  In general, it does vary a bit, I see differences out to the 
5th digit of precision on the same machine.  This is close enough to be 
within the range of NTP correction (500 ppm), but also small enough to 
represent real clock differences (and of course, there is some 
measurement error).

If you are within the threshold where NTP can correct the time, you may 
not want to apply a multiplier to the TSC at all.  Again, this decision 
can be made in the userspace component, but it's an important 
consideration to bring up for the qemu patches that will be required to 
support this.
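
As a rough illustration of that userspace decision, the sketch below compares two measured rates in parts per million and only asks for scaling when the difference exceeds a tolerance. The function names are hypothetical, and the 500 ppm figure is simply the NTP correction range mentioned above, passed in as a parameter rather than hard-coded.

```c
#include <stdint.h>
#include <stdbool.h>

/* Difference between two rates in ppm of b_khz; b_khz must be non-zero. */
static uint64_t khz_diff_ppm(uint32_t a_khz, uint32_t b_khz)
{
	uint64_t diff = a_khz > b_khz ? (uint64_t)a_khz - b_khz
				      : (uint64_t)b_khz - a_khz;

	/* diff < 2^32 and 10^6 < 2^20, so the product fits in 64 bits. */
	return diff * 1000000ULL / b_khz;
}

/* Request scaling only if the mismatch is larger than the tolerance. */
static bool should_request_scaling(uint32_t guest_khz, uint32_t host_khz,
				   uint32_t tolerance_ppm)
{
	return khz_diff_ppm(guest_khz, host_khz) > tolerance_ppm;
}
```

For example, a guest calibrated at 2400000 kHz landing on a host measured at 2400100 kHz differs by about 41 ppm, which this policy would leave to NTP.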

Zach


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-21 17:28   ` Roedel, Joerg
  2011-02-21 21:25     ` Zachary Amsden
@ 2011-02-22 10:11     ` Avi Kivity
  2011-02-22 10:35       ` Roedel, Joerg
  1 sibling, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-02-22 10:11 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/21/2011 07:28 PM, Roedel, Joerg wrote:
> >  - what's the cost of wrmsr(TSC_MULT)?
>
> Hard to tell by now because I only have numbers for pre-production
> hardware.

Can you ask your hardware people what the cost will likely be?  msrs are 
often expensive, and here we have two in the lightweight exit path.

> >  There are really two ways to implement this feature.  One is fully
> >  generic, like you did.  The other is to implement it at the host level -
> >  have a sysfs file and/or kernel parameter for the desired tsc frequency,
> >  write it once, and forget about it.  Trust management to set the host
> >  tsc frequency to the same value on all hosts in a migration cluster.
>
> The motivation here is mostly flexibility. Scaling the TSC for the
> whole migration cluster only makes sense if all hosts there support the
> feature. But the most likely scenario is that existing migration
> clusters will be extended by new machines and guests will be migrated
> there. And these guests should be able to see the same TSC frequency on
> the new host as they had on the old one. The older machines in the
> cluster may even have different TSC frequencies. With this flexible
> implementation those scenarios are possible. A host-wide setting for the
> scaling will make the feature useless in those (common) scenarios.

This doesn't really work, since we don't know on what host the TSC 
calibration loop ran:

- start guest on host H1
- migrate it around, now it's on host H2
- guest reboots, reruns calibration loop
- migrate it around some more, now it's on host H3
- migrate to host with tsc multiplier Hnew

So, what should we set the multiplier to? H1, H2, or H3's tsc rate?


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-22 10:11     ` Avi Kivity
@ 2011-02-22 10:35       ` Roedel, Joerg
  2011-02-22 10:41         ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-22 10:35 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On Tue, Feb 22, 2011 at 05:11:42AM -0500, Avi Kivity wrote:
> On 02/21/2011 07:28 PM, Roedel, Joerg wrote:
> > >  - what's the cost of wrmsr(TSC_MULT)?
> >
> > Hard to tell by now because I only have numbers for pre-production
> > hardware.
> 
> Can you ask your hardware people what the cost will likely be?  msrs are 
> often expensive, and here we have two in the lightweight exit path.

Will do.

> This doesn't really work, since we don't know on what host the TSC 
> calibration loop ran:
> 
> - start guest on host H1
> - migrate it around, now it's on host H2
> - guest reboots, reruns calibration loop
> - migrate it around some more, now it's on host H3
> - migrate to host with tsc multiplier Hnew
> 
> So, what should we set the multiplier to? H1, H2, or H3's tsc rate?

This scenario doesn't matter. If the guest already detected its TSC to
be unstable there is nothing we can do and it doesn't really matter what
we set the tsc frequency to. Therefore software will always set the
guest tsc frequency to the same value it had on the last host.

In the above scenario this would be the host tsc frequency of H3. If
the guest is later migrated away from the host with the TSC multiplier,
this frequency is passed along with it. Software can read the guest tsc
frequency using the ioctl.

	Joerg


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-22 10:35       ` Roedel, Joerg
@ 2011-02-22 10:41         ` Avi Kivity
  2011-02-22 11:11           ` Roedel, Joerg
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-02-22 10:41 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/22/2011 12:35 PM, Roedel, Joerg wrote:
> >  This doesn't really work, since we don't know on what host the TSC
> >  calibration loop ran:
> >
> >  - start guest on host H1
> >  - migrate it around, now it's on host H2
> >  - guest reboots, reruns calibration loop
> >  - migrate it around some more, now it's on host H3
> >  - migrate to host with tsc multiplier Hnew
> >
> >  So, what should we set the multiplier to? H1, H2, or H3's tsc rate?
>
> This scenario doesn't matter. If the guest already detected its TSC to
> be unstable there is nothing we can do and it doesn't really matter what
> we set the tsc frequency to. Therefore software will always set the
> guest tsc frequency to the same value it had on the last host.

Ok, so your scenario is

- boot on host H1
- no intervening migrations
- migrate to host Hnew
- all succeeding migrations are only to new hosts or back to H1

This is somewhat artificial, and not very different from an all-new cluster.

[the whole thing is kind of sad; we went through a huge effort to make 
clocks work on virtual machines in spite of the tsc issues; then we have 
a hardware solution, but can't use it because of old hardware.  Same 
thing happens with the effort put into shadow in the pre-npt days]


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-22 10:41         ` Avi Kivity
@ 2011-02-22 11:11           ` Roedel, Joerg
  2011-02-22 14:11             ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-22 11:11 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On Tue, Feb 22, 2011 at 05:41:53AM -0500, Avi Kivity wrote:
> On 02/22/2011 12:35 PM, Roedel, Joerg wrote:
> > >  This doesn't really work, since we don't know on what host the TSC
> > >  calibration loop ran:
> > >
> > >  - start guest on host H1
> > >  - migrate it around, now it's on host H2
> > >  - guest reboots, reruns calibration loop
> > >  - migrate it around some more, now it's on host H3
> > >  - migrate to host with tsc multiplier Hnew
> > >
> > >  So, what should we set the multiplier to? H1, H2, or H3's tsc rate?
> >
> > This scenario doesn't matter. If the guest already detected its TSC to
> > be unstable there is nothing we can do and it doesn't really matter what
> > we set the tsc frequency to. Therefore software will always set the
> > guest tsc frequency to the same value it had on the last host.
> 
> Ok, so your scenario is
> 
> - boot on host H1
> - no intervening migrations
> - migrate to host Hnew
> - all succeeding migrations are only to new hosts or back to H1
> 
> This is somewhat artificial, and not very different from an all-new cluster.

This is at least the scenario where the new hardware feature will make
sense. It's clear that migrating a guest between hosts without
tsc-scaling will make the tsc appear unstable for the guest. This is
basically the same situation as we have today.
In fact, for older hosts the feature can be emulated in software by
trapping tsc accesses from the guest. Isn't this what Zachary has been
working on? During my implementation I understood tsc-scaling as a
hardware supported way to do this. And that's the reason I implemented it
the way it is.

> [the whole thing is kind of sad; we went through a huge effort to make 
> clocks work on virtual machines in spite of the tsc issues; then we have 
> a hardware solution, but can't use it because of old hardware.  Same 
> thing happens with the effort put into shadow in the pre-npt days]

The shadow code has a revival as it is required for emulating
nested-npt and nested-ept, so the effort still has value :)

	Joerg


* Re: [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept
  2011-02-09 17:29 ` [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept Joerg Roedel
@ 2011-02-22 11:14   ` Roedel, Joerg
  2011-02-22 14:01     ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-22 11:14 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: kvm, linux-kernel, Zachary Amsden

On Wed, Feb 09, 2011 at 12:29:39PM -0500, Joerg Roedel wrote:
> In the dr_intercept function a new cpu-feature called
> decode-assists is implemented and used when available. This
> code-path does not advance the guest-rip causing the guest
> to dead-loop over mov-dr instructions. This is fixed by this
> patch.
> 
> Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
> ---
>  arch/x86/kvm/svm.c |    2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
> index 73a8f1d..bfb4948 100644
> --- a/arch/x86/kvm/svm.c
> +++ b/arch/x86/kvm/svm.c
> @@ -2777,6 +2777,8 @@ static int dr_interception(struct vcpu_svm *svm)
>  			kvm_register_write(&svm->vcpu, reg, val);
>  	}
>  
> +	skip_emulated_instruction(&svm->vcpu);
> +
>  	return 1;
>  }

Btw. Can you meanwhile apply this patch? It fixes a bug which sends the
guest into an endless loop when decode assists is available on the host.
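
Why the missing call leads to an endless loop can be shown with a toy model. Nothing below is KVM code: `struct toy_vcpu`, `toy_dr_intercept()`, `toy_run()` and the instruction address and length are made up for illustration. The point is only that an intercept handler which emulates the instruction but leaves the instruction pointer unchanged will be re-entered for the same instruction forever.

```c
#include <stdint.h>
#include <stdbool.h>

#define MOV_DR_ADDR 0x1000ULL	/* hypothetical address of the mov-dr */
#define MOV_DR_LEN  3		/* hypothetical instruction length */

struct toy_vcpu {
	uint64_t rip;
	unsigned exits;
};

/* Emulates the intercepted instruction; optionally steps past it. */
static void toy_dr_intercept(struct toy_vcpu *v, bool skip_insn)
{
	v->exits++;
	/* ... the debug register access would be emulated here ... */
	if (skip_insn)
		v->rip += MOV_DR_LEN;
}

/*
 * Re-enter the guest until it gets past the mov-dr. Returns false if
 * the guest is still stuck on it after max_exits intercepts.
 */
static bool toy_run(struct toy_vcpu *v, bool skip_insn, unsigned max_exits)
{
	v->rip = MOV_DR_ADDR;
	v->exits = 0;

	while (v->rip == MOV_DR_ADDR) {
		if (v->exits == max_exits)
			return false;
		toy_dr_intercept(v, skip_insn);
	}
	return true;
}
```

With `skip_insn` set, the guest moves on after a single exit; without it, every re-entry faults on the same instruction until the exit budget runs out.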

	Joerg


* Re: [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept
  2011-02-22 11:14   ` Roedel, Joerg
@ 2011-02-22 14:01     ` Avi Kivity
  2011-02-22 14:33       ` Roedel, Joerg
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-02-22 14:01 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/22/2011 01:14 PM, Roedel, Joerg wrote:
> On Wed, Feb 09, 2011 at 12:29:39PM -0500, Joerg Roedel wrote:
> >  In the dr_intercept function a new cpu-feature called
> >  decode-assists is implemented and used when available. This
> >  code-path does not advance the guest-rip causing the guest
> >  to dead-loop over mov-dr instructions. This is fixed by this
> >  patch.
> >
>
> Btw. Can you meanwhile apply this patch? It fixes a bug which sends the
> guest into an endless loop when decode assists is available on the host.

Yes of course - should have done that myself.  Applied now, and queued 
for 2.6.38.


* Re: [PATCH 0/6] KVM support for TSC scaling
  2011-02-22 11:11           ` Roedel, Joerg
@ 2011-02-22 14:11             ` Avi Kivity
  0 siblings, 0 replies; 29+ messages in thread
From: Avi Kivity @ 2011-02-22 14:11 UTC (permalink / raw)
  To: Roedel, Joerg; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On 02/22/2011 01:11 PM, Roedel, Joerg wrote:
> >
> >  Ok, so your scenario is
> >
> >  - boot on host H1
> >  - no intervening migrations
> >  - migrate to host Hnew
> >  - all succeeding migrations are only to new hosts or back to H1
> >
> >  This is somewhat artificial, and not very different from an all-new cluster.
>
> This is at least the scenario where the new hardware feature will make
> sense. It's clear that migrating a guest between hosts without
> tsc-scaling will make the tsc appear unstable for the guest. This is
> basically the same situation as we have today.
> In fact, for older hosts the feature can be emulated in software by
> trapping tsc accesses from the guest. Isn't this what Zachary has been
> working on?

Yes.  It's of dubious value though, you get a stable tsc but it's 
incredibly slow.

>   During my implementation I understood tsc-scaling as a
> hardware supported way to do this. And that's the reason I implemented it
> the way it is.

Right.  The only question is what the added guest switch cost is.  If it's 
expensive (say, >= 100 cycles) then we need a mode where we can drop 
this cost by applying the same multiplier to all guests and the host 
(can be done as an add-on optimization patch).  If however we end up 
always recommending that all hosts use the same virtual tsc rate, why 
should we support individual rates for guests?

It does make sense from a generality point of view, we provide 
mechanism, not policy, just make sure that the policies we like are 
optimized as far as they can go.

> >  [the whole thing is kind of sad; we went through a huge effort to make
> >  clocks work on virtual machines in spite of the tsc issues; then we have
> >  a hardware solution, but can't use it because of old hardware.  Same
> >  thing happens with the effort put into shadow in the pre-npt days]
>
> The shadow code has a revival as it is required for emulating
> nested-npt and nested-ept, so the effort still has value :)

Yes.  Some of it though is unused (unsync pages).  And it's hard for me 
to see nested svm itself used in production due to the huge performance 
hit for I/O.  Maybe an emulated iommu (so we can do virtio device 
assignment, or even real device assignment all the way from the host) 
will help, or even more hardware support a la s390.


* Re: [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept
  2011-02-22 14:01     ` Avi Kivity
@ 2011-02-22 14:33       ` Roedel, Joerg
  0 siblings, 0 replies; 29+ messages in thread
From: Roedel, Joerg @ 2011-02-22 14:33 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, kvm, linux-kernel, Zachary Amsden

On Tue, Feb 22, 2011 at 09:01:51AM -0500, Avi Kivity wrote:
> On 02/22/2011 01:14 PM, Roedel, Joerg wrote:
> > On Wed, Feb 09, 2011 at 12:29:39PM -0500, Joerg Roedel wrote:
> > >  In the dr_intercept function a new cpu-feature called
> > >  decode-assists is implemented and used when available. This
> > >  code-path does not advance the guest-rip causing the guest
> > >  to dead-loop over mov-dr instructions. This is fixed by this
> > >  patch.
> > >
> >
> > Btw. Can you meanwhile apply this patch? It fixes a bug which sends the
> > guest into an endless loop when decode assists is available on the host.
> 
> Yes of course - should have done that myself.  Applied now, and queued 
> for 2.6.38.

Great, thanks.


* [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-25  8:44 [PATCH 0/6] TSC scaling support for KVM v3 Joerg Roedel
@ 2011-03-25  8:44 ` Joerg Roedel
  0 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-03-25  8:44 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Zachary Amsden, kvm, Joerg Roedel

This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 Documentation/kvm/api.txt       |   23 +++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h |    7 +++++++
 arch/x86/kvm/svm.c              |   20 ++++++++++++++++++++
 arch/x86/kvm/x86.c              |   35 +++++++++++++++++++++++++++++++++++
 include/linux/kvm.h             |    5 +++++
 5 files changed, 90 insertions(+), 0 deletions(-)

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index 9bef4e4..1b9eaa7 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -1263,6 +1263,29 @@ struct kvm_assigned_msix_entry {
 	__u16 padding[3];
 };
 
+4.54 KVM_SET_TSC_KHZ
+
+Capability: KVM_CAP_TSC_CONTROL
+Architectures: x86
+Type: vcpu ioctl
+Parameters: virtual tsc_khz
+Returns: 0 on success, -1 on error
+
+Specifies the tsc frequency for the virtual machine. The unit of the
+frequency is KHz.
+
+4.55 KVM_GET_TSC_KHZ
+
+Capability: KVM_CAP_GET_TSC_KHZ
+Architectures: x86
+Type: vcpu ioctl
+Parameters: none
+Returns: virtual tsc-khz on success, negative value on error
+
+Returns the tsc frequency of the guest. The unit of the return value is
+KHz. If the host has unstable tsc this ioctl returns -EIO instead as an
+error.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 7f48528..473a3be 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -632,6 +632,13 @@ u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 extern bool tdp_enabled;
 
+/* control of guest tsc rate supported? */
+extern bool kvm_has_tsc_control;
+/* minimum supported tsc_khz for guests */
+extern u32  kvm_min_guest_tsc_khz;
+/* maximum supported tsc_khz for guests */
+extern u32  kvm_max_guest_tsc_khz;
+
 enum emulation_result {
 	EMULATE_DONE,       /* no further processing */
 	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 38a4bcc..a5c1b5b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -64,6 +64,8 @@ MODULE_LICENSE("GPL");
 #define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
 
 #define TSC_RATIO_RSVD          0xffffff0000000000ULL
+#define TSC_RATIO_MIN		0x0000000000000001ULL
+#define TSC_RATIO_MAX		0x000000ffffffffffULL
 
 static bool erratum_383_found __read_mostly;
 
@@ -197,6 +199,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm);
 static int nested_svm_vmexit(struct vcpu_svm *svm);
 static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
 				      bool has_error_code, u32 error_code);
+static u64 __scale_tsc(u64 ratio, u64 tsc);
 
 enum {
 	VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -807,6 +810,23 @@ static __init int svm_hardware_setup(void)
 	if (boot_cpu_has(X86_FEATURE_FXSR_OPT))
 		kvm_enable_efer_bits(EFER_FFXSR);
 
+	if (boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
+		u64 max;
+
+		kvm_has_tsc_control = true;
+
+		/*
+		 * Make sure the user can only configure tsc_khz values that
+		 * fit into a signed integer.
+		 * A min value is not calculated because it will always
+		 * be 1 on all machines and a value of 0 is used to disable
+		 * tsc-scaling for the vcpu.
+		 */
+		max = min(0x7fffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
+
+		kvm_max_guest_tsc_khz = max;
+	}
+
 	if (nested) {
 		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2f0b552..5cc9a44 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -100,6 +100,11 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 int ignore_msrs = 0;
 module_param_named(ignore_msrs, ignore_msrs, bool, S_IRUGO | S_IWUSR);
 
+bool kvm_has_tsc_control;
+EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
+u32  kvm_max_guest_tsc_khz;
+EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
+
 #define KVM_NR_SHARED_MSRS 16
 
 struct kvm_shared_msrs_global {
@@ -1999,6 +2004,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
 	case KVM_CAP_XSAVE:
 	case KVM_CAP_ASYNC_PF:
+	case KVM_CAP_GET_TSC_KHZ:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -2025,6 +2031,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XCRS:
 		r = cpu_has_xsave;
 		break;
+	case KVM_CAP_TSC_CONTROL:
+		r = kvm_has_tsc_control;
+		break;
 	default:
 		r = 0;
 		break;
@@ -3057,6 +3066,32 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_vcpu_ioctl_x86_set_xcrs(vcpu, u.xcrs);
 		break;
 	}
+	case KVM_SET_TSC_KHZ: {
+		u32 user_tsc_khz;
+
+		r = -EINVAL;
+		if (!kvm_has_tsc_control)
+			break;
+
+		user_tsc_khz = (u32)arg;
+
+		if (user_tsc_khz >= kvm_max_guest_tsc_khz)
+			goto out;
+
+		kvm_x86_ops->set_tsc_khz(vcpu, user_tsc_khz);
+
+		r = 0;
+		goto out;
+	}
+	case KVM_GET_TSC_KHZ: {
+		r = -EIO;
+		if (check_tsc_unstable())
+			goto out;
+
+		r = vcpu_tsc_khz(vcpu);
+
+		goto out;
+	}
 	default:
 		r = -EINVAL;
 	}
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index ea2dc1a..2f63ebe 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -541,6 +541,8 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_GET_PVINFO 57
 #define KVM_CAP_PPC_IRQ_LEVEL 58
 #define KVM_CAP_ASYNC_PF 59
+#define KVM_CAP_TSC_CONTROL 60
+#define KVM_CAP_GET_TSC_KHZ 61
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -677,6 +679,9 @@ struct kvm_clock_data {
 #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
 /* Available with KVM_CAP_PPC_GET_PVINFO */
 #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
+/* Available with KVM_CAP_TSC_CONTROL */
+#define KVM_SET_TSC_KHZ           _IO(KVMIO,  0xa2)
+#define KVM_GET_TSC_KHZ           _IO(KVMIO,  0xa3)
 
 /*
  * ioctls for vcpu fds
-- 
1.7.1




* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-24 10:44       ` Avi Kivity
@ 2011-03-24 10:47         ` Joerg Roedel
  0 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-03-24 10:47 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Joerg Roedel, Marcelo Tosatti, Zachary Amsden, kvm

On Thu, Mar 24, 2011 at 12:44:59PM +0200, Avi Kivity wrote:
> On 03/24/2011 12:41 PM, Joerg Roedel wrote:
>> Okay, I'll change that. But I would prefer to keep this as a vm ioctl. A
>> vcpu ioctl might be more flexible but I doubt anybody has a use-case for
>> different tsc_khz values in one VM.
>
> My motivation is simplification.

Okay, so I'll change that too. Thanks for your feedback.

	Joerg



* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-24 10:41     ` Joerg Roedel
@ 2011-03-24 10:44       ` Avi Kivity
  2011-03-24 10:47         ` Joerg Roedel
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-03-24 10:44 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Joerg Roedel, Marcelo Tosatti, Zachary Amsden, kvm

On 03/24/2011 12:41 PM, Joerg Roedel wrote:
> >>
> >>  +4.54 KVM_SET_TSC_KHZ
> >>  +
> >>  +Capability: KVM_CAP_TSC_CONTROL
> >>  +Architectures: x86
> >>  +Type: vm ioctl
> >>  +Parameters: __u32 (in)
> >>  +Returns: 0 on success, -1 on error
> >>  +
> >>  +Specifies the tsc frequency for the virtual machine. This IOCTL must be
> >>  +used before any vcpu is created. The unit of the frequency is KHz.
> >
> >  Should it not be a vcpu ioctl?
>
> The idea was that a vm ioctl will make sure that all vcpus in one vm
> have the same tsc frequency. With a vm ioctl we force this. So the tsc
> frequency is basically a vm capability which is mirrored into each vcpu
> data structure for performance reasons.
>

Yes - it doesn't make sense to have each vcpu run with a different tsc.  
But we do the same with cpuid, and since the tsc really is a per-cpu 
thing, then if it simplifies the code I think it's okay to make 
userspace call it per-cpu.

> >>  +
> >>  +		r = 0;
> >>  +		goto out;
> >>  +	}
> >>  +	case KVM_GET_TSC_KHZ: {
> >>  +		u32 vtsc_khz = kvm->arch.virtual_tsc_khz;
> >>  +
> >>  +		r = -EIO;
> >>  +		if (check_tsc_unstable())
> >>  +			goto out;
> >>  +
> >>  +		r = -EFAULT;
> >>  +		if (copy_to_user(argp, &vtsc_khz, sizeof(__u32)))
> >>  +			goto out;
> >
> >  And an ordinary return here.
>
> Okay, I'll change that. But I would prefer to keep this as a vm ioctl. A
> vcpu ioctl might be more flexible but I doubt anybody has a use-case for
> different tsc_khz values in one VM.

My motivation is simplification.

-- 
error compiling committee.c: too many arguments to function



* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-24 10:14   ` Avi Kivity
@ 2011-03-24 10:41     ` Joerg Roedel
  2011-03-24 10:44       ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Joerg Roedel @ 2011-03-24 10:41 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Joerg Roedel, Marcelo Tosatti, Zachary Amsden, kvm

On Thu, Mar 24, 2011 at 12:14:36PM +0200, Avi Kivity wrote:
> On 03/24/2011 09:40 AM, Joerg Roedel wrote:
>> This patch implements two new vm-ioctls to get and set the
>> virtual_tsc_khz if the machine supports tsc-scaling. Setting
>> the tsc-frequency is only possible before userspace creates
>> any vcpu.
>>
>>
>> +4.54 KVM_SET_TSC_KHZ
>> +
>> +Capability: KVM_CAP_TSC_CONTROL
>> +Architectures: x86
>> +Type: vm ioctl
>> +Parameters: __u32 (in)
>> +Returns: 0 on success, -1 on error
>> +
>> +Specifies the tsc frequency for the virtual machine. This IOCTL must be
>> +used before any vcpu is created. The unit of the frequency is KHz.
>
> Should it not be a vcpu ioctl?

The idea was that a vm ioctl will make sure that all vcpus in one vm
have the same tsc frequency. With a vm ioctl we force this. So the tsc
frequency is basically a vm capability which is mirrored into each vcpu
data structure for performance reasons.

> Need to return an error if the frequency cannot be accommodated.  In  
> theory we need to provide the range of supported frequencies, but we can  
> live without it (and it will be very difficult to provide if we take  
> accuracy into account).

Yes, -EINVAL is reported if the frequency cannot be accommodated. But
userspace can't find out the lower or upper bounds of the valid range.
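
For illustration, the upper bound in question can be derived from the width of the ratio constants in the patch. The following is only a minimal sketch, assuming the ratio is an 8.32 fixed-point value (which is what TSC_RATIO_MAX and TSC_RATIO_RSVD suggest). The names scale_tsc_sketch() and max_guest_tsc_khz_sketch() are hypothetical and are not the kernel's __scale_tsc(). The clamp to 0x7fffffff follows the later revision of the patch, which keeps the value inside a signed integer.

```c
#include <stdint.h>

#define TSC_RATIO_MAX 0x000000ffffffffffULL	/* value taken from the patch */

/*
 * Multiply tsc by an 8.32 fixed-point ratio, i.e. (tsc * ratio) >> 32.
 * The product is split into partial products so that no 128-bit type is
 * needed. Assumption: the scaled result fits into 64 bits.
 */
static uint64_t scale_tsc_sketch(uint64_t ratio, uint64_t tsc)
{
	uint64_t int_part = ratio >> 32;		/* integer bits */
	uint64_t frac = ratio & 0xffffffffULL;		/* fractional bits */
	uint64_t tsc_hi = tsc >> 32;
	uint64_t tsc_lo = tsc & 0xffffffffULL;

	return tsc * int_part + tsc_hi * frac + ((tsc_lo * frac) >> 32);
}

/* Largest guest frequency in KHz, clamped to fit into a signed integer. */
static uint32_t max_guest_tsc_khz_sketch(uint32_t host_tsc_khz)
{
	uint64_t max = scale_tsc_sketch(TSC_RATIO_MAX, host_tsc_khz);

	return max > 0x7fffffffULL ? 0x7fffffffU : (uint32_t)max;
}
```

With a host tsc_khz of 2000000 (2 GHz) this sketch yields 511999999, just under 256 times the host frequency. A ratio of 1ULL << 32 leaves the input unchanged.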
>
>> +
>> +4.55 KVM_GET_TSC_KHZ
>> +
>> +Capability: KVM_CAP_GET_TSC_KHZ
>> +Architectures: x86
>> +Type: vm ioctl
>> +Parameters: __u32 (out)
>> +Returns: 0 on success, -1 on error
>> +
>> +Returns the tsc frequency of the guest. The unit of the return value is
>> +KHz. If the host has unstable tsc this ioctl returns an error.
>
> Which error?
>
> (it's important since this is an expected error, unlike most others)
>
>> @@ -3580,6 +3591,52 @@ long kvm_arch_vm_ioctl(struct file *filp,
>>   		r = 0;
>>   		break;
>>   	}
>> +	case KVM_SET_TSC_KHZ: {
>> +		u32 user_tsc_khz;
>> +
>> +		if (!kvm_has_tsc_control)
>> +			break;
>> +
>> +		r = -EFAULT;
>> +		if (copy_from_user(&user_tsc_khz, argp, sizeof(__u32)))
>> +			goto out;
>
> Can just use the input value (arg) instead of copy_from_user().
>
>> +
>> +		r = -EINVAL;
>> +		if (user_tsc_khz < kvm_min_guest_tsc_khz ||
>> +		    user_tsc_khz > kvm_max_guest_tsc_khz)
>
> <= and >= are probably safer.

Right, I'll change it.
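
A minimal sketch of the stricter check, for illustration only. The helper name check_guest_tsc_khz() is hypothetical. Treating both limits as exclusive is presumably the safer choice because the limits are themselves the result of a rounded fixed-point computation; that reading is an assumption and is not spelled out in the review.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns 0 if khz lies strictly between the two
 * limits and -EINVAL otherwise. The limits themselves are rejected.
 */
static int check_guest_tsc_khz(uint32_t khz, uint32_t min_khz, uint32_t max_khz)
{
	if (khz <= min_khz || khz >= max_khz)
		return -EINVAL;

	return 0;
}
```

A request equal to either limit is rejected here, whereas the posted check with < and > would have accepted it.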

>
>> +			goto out;
>> +
>> +		mutex_lock(&kvm->lock);
>> +		/*
>> +		 * We force the tsc frequency to be set before any
>> +		 * vcpu is created
>> +		 */
>> +		if (atomic_read(&kvm->online_vcpus) > 0) {
>> +			mutex_unlock(&kvm->lock);
>> +			goto out;
>> +		}
>> +
>> +		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
>> +
>> +		mutex_unlock(&kvm->lock);
>
> Making these vcpu ioctls will remove the locking.
>
>> +
>> +		r = 0;
>> +		goto out;
>> +	}
>> +	case KVM_GET_TSC_KHZ: {
>> +		u32 vtsc_khz = kvm->arch.virtual_tsc_khz;
>> +
>> +		r = -EIO;
>> +		if (check_tsc_unstable())
>> +			goto out;
>> +
>> +		r = -EFAULT;
>> +		if (copy_to_user(argp, &vtsc_khz, sizeof(__u32)))
>> +			goto out;
>
> And an ordinary return here.

Okay, I'll change that. But I would prefer to keep this as a vm ioctl. A
vcpu ioctl might be more flexible but I doubt anybody has a use-case for
different tsc_khz values in one VM.

	Joerg



* Re: [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-24  7:40 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
@ 2011-03-24 10:14   ` Avi Kivity
  2011-03-24 10:41     ` Joerg Roedel
  0 siblings, 1 reply; 29+ messages in thread
From: Avi Kivity @ 2011-03-24 10:14 UTC (permalink / raw)
  To: Joerg Roedel; +Cc: Marcelo Tosatti, Zachary Amsden, kvm

On 03/24/2011 09:40 AM, Joerg Roedel wrote:
> This patch implements two new vm-ioctls to get and set the
> virtual_tsc_khz if the machine supports tsc-scaling. Setting
> the tsc-frequency is only possible before userspace creates
> any vcpu.
>
>
> +4.54 KVM_SET_TSC_KHZ
> +
> +Capability: KVM_CAP_TSC_CONTROL
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: __u32 (in)
> +Returns: 0 on success, -1 on error
> +
> +Specifies the tsc frequency for the virtual machine. This IOCTL must be
> +used before any vcpu is created. The unit of the frequency is KHz.

Should it not be a vcpu ioctl?

KVM_CAP_TSC_CONTROL depends on both kvm and the underlying cpu.  I guess 
it's okay.

Need to return an error if the frequency cannot be accommodated.  In 
theory we need to provide the range of supported frequencies, but we can 
live without it (and it will be very difficult to provide if we take 
accuracy into account).

> +
> +4.55 KVM_GET_TSC_KHZ
> +
> +Capability: KVM_CAP_GET_TSC_KHZ
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: __u32 (out)
> +Returns: 0 on success, -1 on error
> +
> +Returns the tsc frequency of the guest. The unit of the return value is
> +KHz. If the host has unstable tsc this ioctl returns an error.

Which error?

(it's important since this is an expected error, unlike most others)
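
To illustrate why the specific error code matters to userspace, here is a rough sketch of a caller that separates the expected "unstable tsc" case from real failures. It assumes the form the ioctl takes in the later revision (no parameter, frequency in the return value, -EIO for an unstable host tsc). The names get_tsc_khz_sketch() and tsc_khz_is_unavailable() are hypothetical, and the request number is passed in by the caller so that the sketch does not depend on a particular <linux/kvm.h>.

```c
#include <errno.h>
#include <sys/ioctl.h>

/*
 * Hypothetical wrapper around an argument-less ioctl that returns the
 * frequency directly. 'request' would be KVM_GET_TSC_KHZ.
 * Returns the frequency in KHz, or a negative errno value on failure.
 */
static long get_tsc_khz_sketch(int fd, unsigned long request)
{
	int ret = ioctl(fd, request, 0);

	return ret < 0 ? -(long)errno : (long)ret;
}

/* -EIO is the expected "host tsc is unstable" answer, not a caller bug. */
static int tsc_khz_is_unavailable(long ret)
{
	return ret == -EIO;
}
```

A caller could fall back to another clock source when tsc_khz_is_unavailable() is true and treat any other negative value as a genuine error.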

> @@ -3580,6 +3591,52 @@ long kvm_arch_vm_ioctl(struct file *filp,
>   		r = 0;
>   		break;
>   	}
> +	case KVM_SET_TSC_KHZ: {
> +		u32 user_tsc_khz;
> +
> +		if (!kvm_has_tsc_control)
> +			break;
> +
> +		r = -EFAULT;
> +		if (copy_from_user(&user_tsc_khz, argp, sizeof(__u32)))
> +			goto out;

Can just use the input value (arg) instead of copy_from_user().
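
As a sketch of the difference, the two request encodings can be compared directly. This assumes a Linux build environment with <linux/ioctl.h>. The macro names ending in _BY_POINTER and _BY_VALUE and the helper ioctl_payload_size() are made up for this illustration; 0xAE is the value of KVMIO.

```c
#include <stdint.h>
#include <linux/ioctl.h>

#define KVMIO_SKETCH 0xAE	/* same value as KVMIO in <linux/kvm.h> */

/* Pointer style: the request number encodes the size of the pointed-to object. */
#define SET_TSC_KHZ_BY_POINTER	_IOW(KVMIO_SKETCH, 0xa2, uint32_t)
/* Value style: no payload is encoded, the frequency travels in 'arg' itself. */
#define SET_TSC_KHZ_BY_VALUE	_IO(KVMIO_SKETCH, 0xa2)

/* Size of the user buffer encoded in an ioctl request number, in bytes. */
static unsigned int ioctl_payload_size(unsigned long request)
{
	return _IOC_SIZE(request);
}
```

With the value style the handler can read the frequency from arg and neither copy_from_user() nor its -EFAULT path is needed.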

> +
> +		r = -EINVAL;
> +		if (user_tsc_khz < kvm_min_guest_tsc_khz ||
> +		    user_tsc_khz > kvm_max_guest_tsc_khz)

<= and >= are probably safer.

> +			goto out;
> +
> +		mutex_lock(&kvm->lock);
> +		/*
> +		 * We force the tsc frequency to be set before any
> +		 * vcpu is created
> +		 */
> +		if (atomic_read(&kvm->online_vcpus) > 0) {
> +			mutex_unlock(&kvm->lock);
> +			goto out;
> +		}
> +
> +		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
> +
> +		mutex_unlock(&kvm->lock);

Making these vcpu ioctls will remove the locking.

> +
> +		r = 0;
> +		goto out;
> +	}
> +	case KVM_GET_TSC_KHZ: {
> +		u32 vtsc_khz = kvm->arch.virtual_tsc_khz;
> +
> +		r = -EIO;
> +		if (check_tsc_unstable())
> +			goto out;
> +
> +		r = -EFAULT;
> +		if (copy_to_user(argp, &vtsc_khz, sizeof(__u32)))
> +			goto out;

And an ordinary return here.

> +
> +		r = 0;
> +		goto out;
> +	}
>

-- 
error compiling committee.c: too many arguments to function



* [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-24  7:40 [PATCH 0/6][RESEND] TSC scaling support for KVM v2 Joerg Roedel
@ 2011-03-24  7:40 ` Joerg Roedel
  2011-03-24 10:14   ` Avi Kivity
  0 siblings, 1 reply; 29+ messages in thread
From: Joerg Roedel @ 2011-03-24  7:40 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Zachary Amsden, kvm, Joerg Roedel

This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 Documentation/kvm/api.txt       |   22 +++++++++++++++
 arch/x86/include/asm/kvm_host.h |    7 +++++
 arch/x86/kvm/svm.c              |   15 ++++++++++
 arch/x86/kvm/x86.c              |   57 +++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h             |    5 +++
 5 files changed, 106 insertions(+), 0 deletions(-)

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index 9bef4e4..e4e4a44 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -1263,6 +1263,28 @@ struct kvm_assigned_msix_entry {
 	__u16 padding[3];
 };
 
+4.54 KVM_SET_TSC_KHZ
+
+Capability: KVM_CAP_TSC_CONTROL
+Architectures: x86
+Type: vm ioctl
+Parameters: __u32 (in)
+Returns: 0 on success, -1 on error
+
+Specifies the tsc frequency for the virtual machine. This IOCTL must be
+used before any vcpu is created. The unit of the frequency is KHz.
+
+4.55 KVM_GET_TSC_KHZ
+
+Capability: KVM_CAP_GET_TSC_KHZ
+Architectures: x86
+Type: vm ioctl
+Parameters: __u32 (out)
+Returns: 0 on success, -1 on error
+
+Returns the tsc frequency of the guest. The unit of the return value is
+KHz. If the host has unstable tsc this ioctl returns an error.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6fe1e84..5b96b76 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -632,6 +632,13 @@ u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 extern bool tdp_enabled;
 
+/* control of guest tsc rate supported? */
+extern bool kvm_has_tsc_control;
+/* minimum supported tsc_khz for guests */
+extern u32  kvm_min_guest_tsc_khz;
+/* maximum supported tsc_khz for guests */
+extern u32  kvm_max_guest_tsc_khz;
+
 enum emulation_result {
 	EMULATE_DONE,       /* no further processing */
 	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 32d444f..368577b 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -64,6 +64,8 @@ MODULE_LICENSE("GPL");
 #define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
 
 #define TSC_RATIO_RSVD          0xffffff0000000000ULL
+#define TSC_RATIO_MIN		0x0000000000000001ULL
+#define TSC_RATIO_MAX		0x000000ffffffffffULL
 
 static bool erratum_383_found __read_mostly;
 
@@ -198,6 +200,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm);
 static int nested_svm_vmexit(struct vcpu_svm *svm);
 static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
 				      bool has_error_code, u32 error_code);
+static u64 __scale_tsc(u64 ratio, u64 tsc);
 
 enum {
 	VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -799,6 +802,18 @@ static __init int svm_hardware_setup(void)
 	if (boot_cpu_has(X86_FEATURE_FXSR_OPT))
 		kvm_enable_efer_bits(EFER_FFXSR);
 
+	if (boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
+		u64 min, max;
+
+		kvm_has_tsc_control = true;
+
+		min = max(1ULL,          __scale_tsc(tsc_khz, TSC_RATIO_MIN));
+		max = min(0xffffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
+
+		kvm_min_guest_tsc_khz = min;
+		kvm_max_guest_tsc_khz = max;
+	}
+
 	if (nested) {
 		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7754079..416b016 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -100,6 +100,13 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 int ignore_msrs = 0;
 module_param_named(ignore_msrs, ignore_msrs, bool, S_IRUGO | S_IWUSR);
 
+bool kvm_has_tsc_control;
+EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
+u32  kvm_min_guest_tsc_khz;
+EXPORT_SYMBOL_GPL(kvm_min_guest_tsc_khz);
+u32  kvm_max_guest_tsc_khz;
+EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
+
 #define KVM_NR_SHARED_MSRS 16
 
 struct kvm_shared_msrs_global {
@@ -2000,6 +2007,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
 	case KVM_CAP_XSAVE:
 	case KVM_CAP_ASYNC_PF:
+	case KVM_CAP_GET_TSC_KHZ:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -2026,6 +2034,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XCRS:
 		r = cpu_has_xsave;
 		break;
+	case KVM_CAP_TSC_CONTROL:
+		r = kvm_has_tsc_control;
+		break;
 	default:
 		r = 0;
 		break;
@@ -3580,6 +3591,52 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_TSC_KHZ: {
+		u32 user_tsc_khz;
+
+		if (!kvm_has_tsc_control)
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&user_tsc_khz, argp, sizeof(__u32)))
+			goto out;
+
+		r = -EINVAL;
+		if (user_tsc_khz < kvm_min_guest_tsc_khz ||
+		    user_tsc_khz > kvm_max_guest_tsc_khz)
+			goto out;
+
+		mutex_lock(&kvm->lock);
+		/*
+		 * We force the tsc frequency to be set before any
+		 * vcpu is created
+		 */
+		if (atomic_read(&kvm->online_vcpus) > 0) {
+			mutex_unlock(&kvm->lock);
+			goto out;
+		}
+
+		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
+
+		mutex_unlock(&kvm->lock);
+
+		r = 0;
+		goto out;
+	}
+	case KVM_GET_TSC_KHZ: {
+		u32 vtsc_khz = kvm->arch.virtual_tsc_khz;
+
+		r = -EIO;
+		if (check_tsc_unstable())
+			goto out;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &vtsc_khz, sizeof(__u32)))
+			goto out;
+
+		r = 0;
+		goto out;
+	}
 
 	default:
 		;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index ea2dc1a..ea16c57 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -541,6 +541,8 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_GET_PVINFO 57
 #define KVM_CAP_PPC_IRQ_LEVEL 58
 #define KVM_CAP_ASYNC_PF 59
+#define KVM_CAP_TSC_CONTROL 60
+#define KVM_CAP_GET_TSC_KHZ 61
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -677,6 +679,9 @@ struct kvm_clock_data {
 #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
 /* Available with KVM_CAP_PPC_GET_PVINFO */
 #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
+/* Available with KVM_CAP_TSC_CONTROL */
+#define KVM_SET_TSC_KHZ           _IOW(KVMIO,  0xa2, __u32)
+#define KVM_GET_TSC_KHZ           _IOR(KVMIO,  0xa2, __u32)
 
 /*
  * ioctls for vcpu fds
-- 
1.7.1




* [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz
  2011-03-15  9:36 [PATCH 0/6] TSC scaling support for KVM v2 Joerg Roedel
@ 2011-03-15  9:36 ` Joerg Roedel
  0 siblings, 0 replies; 29+ messages in thread
From: Joerg Roedel @ 2011-03-15  9:36 UTC (permalink / raw)
  To: Avi Kivity, Marcelo Tosatti; +Cc: Zachary Amsden, kvm, Joerg Roedel

This patch implements two new vm-ioctls to get and set the
virtual_tsc_khz if the machine supports tsc-scaling. Setting
the tsc-frequency is only possible before userspace creates
any vcpu.

Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
---
 Documentation/kvm/api.txt       |   22 +++++++++++++++
 arch/x86/include/asm/kvm_host.h |    7 +++++
 arch/x86/kvm/svm.c              |   15 ++++++++++
 arch/x86/kvm/x86.c              |   56 +++++++++++++++++++++++++++++++++++++++
 include/linux/kvm.h             |    5 +++
 5 files changed, 105 insertions(+), 0 deletions(-)

diff --git a/Documentation/kvm/api.txt b/Documentation/kvm/api.txt
index ad85797..adc9c23 100644
--- a/Documentation/kvm/api.txt
+++ b/Documentation/kvm/api.txt
@@ -1263,6 +1263,28 @@ struct kvm_assigned_msix_entry {
 	__u16 padding[3];
 };
 
+4.54 KVM_SET_TSC_KHZ
+
+Capability: KVM_CAP_TSC_CONTROL
+Architectures: x86
+Type: vm ioctl
+Parameters: __u32 (in)
+Returns: 0 on success, -1 on error
+
+Specifies the tsc frequency for the virtual machine. This IOCTL must be
+used before any vcpu is created. The unit of the frequency is KHz.
+
+4.55 KVM_GET_TSC_KHZ
+
+Capability: KVM_CAP_GET_TSC_KHZ
+Architectures: x86
+Type: vm ioctl
+Parameters: __u32 (out)
+Returns: 0 on success, -1 on error
+
+Returns the tsc frequency of the guest. The unit of the return value is
+KHz. If the host has unstable tsc this ioctl returns an error.
+
 5. The kvm_run structure
 
 Application code obtains a pointer to the kvm_run structure by
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 16db838..2471fc9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -630,6 +630,13 @@ u8 kvm_get_guest_memory_type(struct kvm_vcpu *vcpu, gfn_t gfn);
 
 extern bool tdp_enabled;
 
+/* control of guest tsc rate supported? */
+extern bool kvm_has_tsc_control;
+/* minimum supported tsc_khz for guests */
+extern u32  kvm_min_guest_tsc_khz;
+/* maximum supported tsc_khz for guests */
+extern u32  kvm_max_guest_tsc_khz;
+
 enum emulation_result {
 	EMULATE_DONE,       /* no further processing */
 	EMULATE_DO_MMIO,      /* kvm_run filled with mmio request */
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index b8c6d28..ed7d608 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -64,6 +64,8 @@ MODULE_LICENSE("GPL");
 #define DEBUGCTL_RESERVED_BITS (~(0x3fULL))
 
 #define TSC_RATIO_RSVD          0xffffff0000000000ULL
+#define TSC_RATIO_MIN		0x0000000000000001ULL
+#define TSC_RATIO_MAX		0x000000ffffffffffULL
 
 static bool erratum_383_found __read_mostly;
 
@@ -198,6 +200,7 @@ static int nested_svm_intercept(struct vcpu_svm *svm);
 static int nested_svm_vmexit(struct vcpu_svm *svm);
 static int nested_svm_check_exception(struct vcpu_svm *svm, unsigned nr,
 				      bool has_error_code, u32 error_code);
+static u64 __scale_tsc(u64 ratio, u64 tsc);
 
 enum {
 	VMCB_INTERCEPTS, /* Intercept vectors, TSC offset,
@@ -799,6 +802,18 @@ static __init int svm_hardware_setup(void)
 	if (boot_cpu_has(X86_FEATURE_FXSR_OPT))
 		kvm_enable_efer_bits(EFER_FFXSR);
 
+	if (boot_cpu_has(X86_FEATURE_TSCRATEMSR)) {
+		u64 min, max;
+
+		kvm_has_tsc_control = true;
+
+		min = max(1ULL,          __scale_tsc(tsc_khz, TSC_RATIO_MIN));
+		max = min(0xffffffffULL, __scale_tsc(tsc_khz, TSC_RATIO_MAX));
+
+		kvm_min_guest_tsc_khz = min;
+		kvm_max_guest_tsc_khz = max;
+	}
+
 	if (nested) {
 		printk(KERN_INFO "kvm: Nested Virtualization enabled\n");
 		kvm_enable_efer_bits(EFER_SVME | EFER_LMSLE);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index aecd926..de313f1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -100,6 +100,13 @@ EXPORT_SYMBOL_GPL(kvm_x86_ops);
 int ignore_msrs = 0;
 module_param_named(ignore_msrs, ignore_msrs, bool, S_IRUGO | S_IWUSR);
 
+bool kvm_has_tsc_control;
+u32  kvm_min_guest_tsc_khz;
+u32  kvm_max_guest_tsc_khz;
+EXPORT_SYMBOL_GPL(kvm_has_tsc_control);
+EXPORT_SYMBOL_GPL(kvm_min_guest_tsc_khz);
+EXPORT_SYMBOL_GPL(kvm_max_guest_tsc_khz);
+
 #define KVM_NR_SHARED_MSRS 16
 
 struct kvm_shared_msrs_global {
@@ -2000,6 +2007,7 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
 	case KVM_CAP_XSAVE:
 	case KVM_CAP_ASYNC_PF:
+	case KVM_CAP_GET_TSC_KHZ:
 		r = 1;
 		break;
 	case KVM_CAP_COALESCED_MMIO:
@@ -2026,6 +2034,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 	case KVM_CAP_XCRS:
 		r = cpu_has_xsave;
 		break;
+	case KVM_CAP_TSC_CONTROL:
+		r = kvm_has_tsc_control;
+		break;
 	default:
 		r = 0;
 		break;
@@ -3580,6 +3591,51 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = 0;
 		break;
 	}
+	case KVM_SET_TSC_KHZ: {
+		u32 user_tsc_khz;
+
+		if (!kvm_has_tsc_control)
+			break;
+
+		r = -EFAULT;
+		if (copy_from_user(&user_tsc_khz, argp, sizeof(__u32)))
+			goto out;
+
+		r = -EINVAL;
+		if (user_tsc_khz < kvm_min_guest_tsc_khz ||
+		    user_tsc_khz > kvm_max_guest_tsc_khz)
+			goto out;
+
+		mutex_lock(&kvm->lock);
+		/*
+		 * We force the tsc frequency to be set before any
+		 * vcpu is created
+		 */
+		if (atomic_read(&kvm->online_vcpus) > 0) {
+			mutex_unlock(&kvm->lock);
+			goto out;
+		}
+
+		kvm_arch_set_tsc_khz(kvm, user_tsc_khz);
+
+		mutex_unlock(&kvm->lock);
+
+		r = 0;
+		goto out;
+	}
+	case KVM_GET_TSC_KHZ: {
+
+		r = -EIO;
+		if (check_tsc_unstable())
+			goto out;
+
+		r = -EFAULT;
+		if (copy_to_user(argp, &kvm->arch.virtual_tsc_khz, sizeof(__u32)))
+			goto out;
+
+		r = 0;
+		goto out;
+	}
 
 	default:
 		;
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index ea2dc1a..ea16c57 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -541,6 +541,8 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_PPC_GET_PVINFO 57
 #define KVM_CAP_PPC_IRQ_LEVEL 58
 #define KVM_CAP_ASYNC_PF 59
+#define KVM_CAP_TSC_CONTROL 60
+#define KVM_CAP_GET_TSC_KHZ 61
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -677,6 +679,9 @@ struct kvm_clock_data {
 #define KVM_SET_PIT2              _IOW(KVMIO,  0xa0, struct kvm_pit_state2)
 /* Available with KVM_CAP_PPC_GET_PVINFO */
 #define KVM_PPC_GET_PVINFO	  _IOW(KVMIO,  0xa1, struct kvm_ppc_pvinfo)
+/* Available with KVM_CAP_TSC_CONTROL */
+#define KVM_SET_TSC_KHZ           _IOW(KVMIO,  0xa2, __u32)
+#define KVM_GET_TSC_KHZ           _IOR(KVMIO,  0xa2, __u32)
 
 /*
  * ioctls for vcpu fds
-- 
1.7.1




end of thread, other threads:[~2011-03-25  8:45 UTC | newest]

Thread overview: 29+ messages
2011-02-09 17:29 [PATCH 0/6] KVM support for TSC scaling Joerg Roedel
2011-02-09 17:29 ` [PATCH 1/6] KVM: SVM: Advance instruction pointer in dr_intercept Joerg Roedel
2011-02-22 11:14   ` Roedel, Joerg
2011-02-22 14:01     ` Avi Kivity
2011-02-22 14:33       ` Roedel, Joerg
2011-02-09 17:29 ` [PATCH 2/6] KVM: SVM: Implement infrastructure for TSC_RATE_MSR Joerg Roedel
2011-02-09 17:29 ` [PATCH 3/6] KVM: X86: Let kvm-clock report the right tsc frequency Joerg Roedel
2011-02-09 17:29 ` [PATCH 4/6] KVM: SVM: Propagate requested TSC frequency on vcpu init Joerg Roedel
2011-02-09 17:29 ` [PATCH 5/6] KVM: X86: Delegate tsc-offset calculation to architecture code Joerg Roedel
2011-02-11 22:12   ` Zachary Amsden
2011-02-21 17:16     ` Roedel, Joerg
2011-02-09 17:29 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
2011-02-13 15:12   ` Avi Kivity
2011-02-21 17:17     ` Roedel, Joerg
2011-02-13 15:19 ` [PATCH 0/6] KVM support for TSC scaling Avi Kivity
2011-02-21 17:28   ` Roedel, Joerg
2011-02-21 21:25     ` Zachary Amsden
2011-02-22 10:11     ` Avi Kivity
2011-02-22 10:35       ` Roedel, Joerg
2011-02-22 10:41         ` Avi Kivity
2011-02-22 11:11           ` Roedel, Joerg
2011-02-22 14:11             ` Avi Kivity
2011-03-15  9:36 [PATCH 0/6] TSC scaling support for KVM v2 Joerg Roedel
2011-03-15  9:36 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
2011-03-24  7:40 [PATCH 0/6][RESEND] TSC scaling support for KVM v2 Joerg Roedel
2011-03-24  7:40 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
2011-03-24 10:14   ` Avi Kivity
2011-03-24 10:41     ` Joerg Roedel
2011-03-24 10:44       ` Avi Kivity
2011-03-24 10:47         ` Joerg Roedel
2011-03-25  8:44 [PATCH 0/6] TSC scaling support for KVM v3 Joerg Roedel
2011-03-25  8:44 ` [PATCH 6/6] KVM: X86: Implement userspace interface to set virtual_tsc_khz Joerg Roedel
