kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation
@ 2020-05-08 20:36 Jim Mattson
  2020-05-08 20:36 ` [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned Jim Mattson
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Jim Mattson @ 2020-05-08 20:36 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Jim Mattson

I'm still not entirely convinced that the Linux hrtimer can be used to
accurately emulate the VMX-preemption timer, but it definitely doesn't
work if the hrtimer is on a different logical processor from the vCPU
thread that needs to get kicked out of VMX non-root operation.

With these changes, the kvm-unit-test (sent separately) that verifies
that a guest can't actually observe a delayed VMX-preemption timer
VM-exit passes 99.999% of the time on a 2GHz Skylake system.

It might be possible to improve that pass rate even more by increasing
the scaling factor in the virtual IA32_VMX_MISC[4:0], but you'd have
to be even more of a stickler than I to go to that extreme.

By the way, what is the point of migrating the hrtimers for the APIC
and the PIT, since they aren't even pinned to begin with?

The subject line of the first patch was crafted for you, Sean. :-D

Jim Mattson (3):
  KVM: nVMX: Really make emulated nested preemption timer pinned
  KVM: nVMX: Change emulated VMX-preemption timer hrtimer to absolute
  KVM: nVMX: Migrate the VMX-preemption timer

 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/irq.c              |  2 ++
 arch/x86/kvm/vmx/nested.c       |  5 +++--
 arch/x86/kvm/vmx/vmx.c          | 11 +++++++++++
 4 files changed, 18 insertions(+), 2 deletions(-)

-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned
  2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
@ 2020-05-08 20:36 ` Jim Mattson
  2020-05-08 22:16   ` Sean Christopherson
  2020-05-08 20:36 ` [PATCH 2/3] KVM: nVMX: Change emulated VMX-preemption timer hrtimer to absolute Jim Mattson
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 9+ messages in thread
From: Jim Mattson @ 2020-05-08 20:36 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Jim Mattson, Peter Shier, Oliver Upton

The PINNED bit is ignored by hrtimer_init. It is only considered when
starting the timer.

When the hrtimer isn't pinned to the same logical processor as the
vCPU thread to be interrupted, the emulated VMX-preemption timer
often fails to adhere to the architectural specification.

Fixes: f15a75eedc18e ("KVM: nVMX: make emulated nested preemption timer pinned")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index fd78ffbde644..1f7fe6f47cbc 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2041,7 +2041,7 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu)
 	preemption_timeout *= 1000000;
 	do_div(preemption_timeout, vcpu->arch.virtual_tsc_khz);
 	hrtimer_start(&vmx->nested.preemption_timer,
-		      ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL);
+		      ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL_PINNED);
 }
 
 static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] KVM: nVMX: Change emulated VMX-preemption timer hrtimer to absolute
  2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
  2020-05-08 20:36 ` [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned Jim Mattson
@ 2020-05-08 20:36 ` Jim Mattson
  2020-05-08 20:36 ` [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer Jim Mattson
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Jim Mattson @ 2020-05-08 20:36 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Jim Mattson, Peter Shier, Oliver Upton

Prepare for migration of this hrtimer, by changing it from relative to
absolute. (I couldn't get migration to work with a relative timer.)

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/kvm/vmx/nested.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 1f7fe6f47cbc..3ca792e3cd66 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -2041,7 +2041,8 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu)
 	preemption_timeout *= 1000000;
 	do_div(preemption_timeout, vcpu->arch.virtual_tsc_khz);
 	hrtimer_start(&vmx->nested.preemption_timer,
-		      ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL_PINNED);
+		      ktime_add_ns(ktime_get(), preemption_timeout),
+		      HRTIMER_MODE_ABS_PINNED);
 }
 
 static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
@@ -4614,7 +4615,7 @@ static int enter_vmx_operation(struct kvm_vcpu *vcpu)
 		goto out_shadow_vmcs;
 
 	hrtimer_init(&vmx->nested.preemption_timer, CLOCK_MONOTONIC,
-		     HRTIMER_MODE_REL_PINNED);
+		     HRTIMER_MODE_ABS_PINNED);
 	vmx->nested.preemption_timer.function = vmx_preemption_timer_fn;
 
 	vmx->nested.vpid02 = allocate_vpid();
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer
  2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
  2020-05-08 20:36 ` [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned Jim Mattson
  2020-05-08 20:36 ` [PATCH 2/3] KVM: nVMX: Change emulated VMX-preemption timer hrtimer to absolute Jim Mattson
@ 2020-05-08 20:36 ` Jim Mattson
  2020-05-09 13:17   ` Paolo Bonzini
  2020-05-09 13:14 ` [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Paolo Bonzini
  2020-06-19 18:49 ` Jim Mattson
  4 siblings, 1 reply; 9+ messages in thread
From: Jim Mattson @ 2020-05-08 20:36 UTC (permalink / raw)
  To: kvm, Paolo Bonzini; +Cc: Jim Mattson, Peter Shier, Oliver Upton

The hrtimer used to emulate the VMX-preemption timer must be pinned to
the same logical processor as the vCPU thread to be interrupted if we
want to have any hope of adhering to the architectural specification
of the VMX-preemption timer. Even with this change, the emulated
VMX-preemption timer VM-exit occasionally arrives too late.

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/irq.c              |  2 ++
 arch/x86/kvm/vmx/vmx.c          | 11 +++++++++++
 3 files changed, 15 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 42a2d0d3984a..a47c71d13039 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1254,6 +1254,8 @@ struct kvm_x86_ops {
 
 	bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
 	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
+
+	void (*migrate_timers)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_init_ops {
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index e330e7d125f7..54f7ea68083b 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -159,6 +159,8 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
 {
 	__kvm_migrate_apic_timer(vcpu);
 	__kvm_migrate_pit_timer(vcpu);
+	if (kvm_x86_ops.migrate_timers)
+		kvm_x86_ops.migrate_timers(vcpu);
 }
 
 bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index c2c6335a998c..3896dea72082 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7687,6 +7687,16 @@ static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
 	return to_vmx(vcpu)->nested.vmxon;
 }
 
+static void vmx_migrate_timers(struct kvm_vcpu *vcpu)
+{
+	if (is_guest_mode(vcpu)) {
+		struct hrtimer *timer = &to_vmx(vcpu)->nested.preemption_timer;
+
+		if (hrtimer_try_to_cancel(timer) == 1)
+			hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
+	}
+}
+
 static void hardware_unsetup(void)
 {
 	if (nested)
@@ -7838,6 +7848,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
 	.nested_get_evmcs_version = NULL,
 	.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
 	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
+	.migrate_timers = vmx_migrate_timers,
 };
 
 static __init int hardware_setup(void)
-- 
2.26.2.645.ge9eca65c58-goog


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned
  2020-05-08 20:36 ` [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned Jim Mattson
@ 2020-05-08 22:16   ` Sean Christopherson
  0 siblings, 0 replies; 9+ messages in thread
From: Sean Christopherson @ 2020-05-08 22:16 UTC (permalink / raw)
  To: Jim Mattson; +Cc: kvm, Paolo Bonzini, Peter Shier, Oliver Upton

LOL, love the shortlog :-)

On Fri, May 08, 2020 at 01:36:41PM -0700, Jim Mattson wrote:
> The PINNED bit is ignored by hrtimer_init. It is only considered when
> starting the timer.
> 
> When the hrtimer isn't pinned to the same logical processor as the
> vCPU thread to be interrupted, the emulated VMX-preemption timer
> often fails to adhere to the architectural specification.
> 
> Fixes: f15a75eedc18e ("KVM: nVMX: make emulated nested preemption timer pinned")
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/kvm/vmx/nested.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index fd78ffbde644..1f7fe6f47cbc 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -2041,7 +2041,7 @@ static void vmx_start_preemption_timer(struct kvm_vcpu *vcpu)
>  	preemption_timeout *= 1000000;
>  	do_div(preemption_timeout, vcpu->arch.virtual_tsc_khz);
>  	hrtimer_start(&vmx->nested.preemption_timer,
> -		      ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL);
> +		      ns_to_ktime(preemption_timeout), HRTIMER_MODE_REL_PINNED);
>  }
>  
>  static u64 nested_vmx_calc_efer(struct vcpu_vmx *vmx, struct vmcs12 *vmcs12)
> -- 
> 2.26.2.645.ge9eca65c58-goog
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation
  2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
                   ` (2 preceding siblings ...)
  2020-05-08 20:36 ` [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer Jim Mattson
@ 2020-05-09 13:14 ` Paolo Bonzini
  2020-06-19 18:49 ` Jim Mattson
  4 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-05-09 13:14 UTC (permalink / raw)
  To: Jim Mattson, kvm

On 08/05/20 22:36, Jim Mattson wrote:
> 
> It might be possible to improve that pass rate even more by increasing
> the scaling factor in the virtual IA32_VMX_MISC[4:0], but you'd have
> to be even more of a stickler than I to go to that extreme.
> 
> By the way, what is the point of migrating the hrtimers for the APIC
> and the PIT, since they aren't even pinned to begin with?

Unless housekeeping CPUs for timers are configured, the timer will run
on the CPU it is started on, even without pinning; see
get_nohz_timer_target.

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer
  2020-05-08 20:36 ` [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer Jim Mattson
@ 2020-05-09 13:17   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-05-09 13:17 UTC (permalink / raw)
  To: Jim Mattson, kvm; +Cc: Peter Shier, Oliver Upton

On 08/05/20 22:36, Jim Mattson wrote:
> The hrtimer used to emulate the VMX-preemption timer must be pinned to
> the same logical processor as the vCPU thread to be interrupted if we
> want to have any hope of adhering to the architectural specification
> of the VMX-preemption timer. Even with this change, the emulated
> VMX-preemption timer VM-exit occasionally arrives too late.
> 
> Signed-off-by: Jim Mattson <jmattson@google.com>
> Reviewed-by: Peter Shier <pshier@google.com>
> Reviewed-by: Oliver Upton <oupton@google.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  2 ++
>  arch/x86/kvm/irq.c              |  2 ++
>  arch/x86/kvm/vmx/vmx.c          | 11 +++++++++++
>  3 files changed, 15 insertions(+)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 42a2d0d3984a..a47c71d13039 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1254,6 +1254,8 @@ struct kvm_x86_ops {
>  
>  	bool (*apic_init_signal_blocked)(struct kvm_vcpu *vcpu);
>  	int (*enable_direct_tlbflush)(struct kvm_vcpu *vcpu);
> +
> +	void (*migrate_timers)(struct kvm_vcpu *vcpu);
>  };
>  
>  struct kvm_x86_init_ops {
> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> index e330e7d125f7..54f7ea68083b 100644
> --- a/arch/x86/kvm/irq.c
> +++ b/arch/x86/kvm/irq.c
> @@ -159,6 +159,8 @@ void __kvm_migrate_timers(struct kvm_vcpu *vcpu)
>  {
>  	__kvm_migrate_apic_timer(vcpu);
>  	__kvm_migrate_pit_timer(vcpu);
> +	if (kvm_x86_ops.migrate_timers)
> +		kvm_x86_ops.migrate_timers(vcpu);
>  }
>  
>  bool kvm_arch_irqfd_allowed(struct kvm *kvm, struct kvm_irqfd *args)
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index c2c6335a998c..3896dea72082 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7687,6 +7687,16 @@ static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
>  	return to_vmx(vcpu)->nested.vmxon;
>  }
>  
> +static void vmx_migrate_timers(struct kvm_vcpu *vcpu)
> +{
> +	if (is_guest_mode(vcpu)) {
> +		struct hrtimer *timer = &to_vmx(vcpu)->nested.preemption_timer;
> +
> +		if (hrtimer_try_to_cancel(timer) == 1)
> +			hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
> +	}
> +}
> +
>  static void hardware_unsetup(void)
>  {
>  	if (nested)
> @@ -7838,6 +7848,7 @@ static struct kvm_x86_ops vmx_x86_ops __initdata = {
>  	.nested_get_evmcs_version = NULL,
>  	.need_emulation_on_page_fault = vmx_need_emulation_on_page_fault,
>  	.apic_init_signal_blocked = vmx_apic_init_signal_blocked,
> +	.migrate_timers = vmx_migrate_timers,
>  };
>  
>  static __init int hardware_setup(void)
> 

Queued all, thanks.

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation
  2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
                   ` (3 preceding siblings ...)
  2020-05-09 13:14 ` [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Paolo Bonzini
@ 2020-06-19 18:49 ` Jim Mattson
  2020-06-19 22:52   ` Paolo Bonzini
  4 siblings, 1 reply; 9+ messages in thread
From: Jim Mattson @ 2020-06-19 18:49 UTC (permalink / raw)
  To: kvm list, Paolo Bonzini

On Fri, May 8, 2020 at 1:36 PM Jim Mattson <jmattson@google.com> wrote:
>
> I'm still not entirely convinced that the Linux hrtimer can be used to
> accurately emulate the VMX-preemption timer...

It can't, for several reasons:

1) The conversion between wall-clock time and TSC frequency, based on
tsc_khz, isn't precise enough.
2) The base clock for the hrtimer, CLOCK_MONOTONIC, can be slewed,
whereas the TSC cannot.
3) The VMX-preemption timer is suspended during MWAIT; the hrtimer is not.

Is there any reason that VMX-preemption timer emulation shouldn't just
be a second client of the hv_timer, along with lapic timer emulation?

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation
  2020-06-19 18:49 ` Jim Mattson
@ 2020-06-19 22:52   ` Paolo Bonzini
  0 siblings, 0 replies; 9+ messages in thread
From: Paolo Bonzini @ 2020-06-19 22:52 UTC (permalink / raw)
  To: Jim Mattson, kvm list

On 19/06/20 20:49, Jim Mattson wrote:
> On Fri, May 8, 2020 at 1:36 PM Jim Mattson <jmattson@google.com> wrote:
>>
>> I'm still not entirely convinced that the Linux hrtimer can be used to
>> accurately emulate the VMX-preemption timer...
> 
> It can't, for several reasons:
> 
> 1) The conversion between wall-clock time and TSC frequency, based on
> tsc_khz, isn't precise enough.
> 2) The base clock for the hrtimer, CLOCK_MONOTONIC, can be slewed,
> whereas the TSC cannot.
> 3) The VMX-preemption timer is suspended during MWAIT; the hrtimer is not.
> 
> Is there any reason that VMX-preemption timer emulation shouldn't just
> be a second client of the hv_timer, along with lapic timer emulation?

I don't think so, you'd just need logic to multiplex it.

Paolo


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-06-19 22:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-08 20:36 [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Jim Mattson
2020-05-08 20:36 ` [PATCH 1/3] KVM: nVMX: Really make emulated nested preemption timer pinned Jim Mattson
2020-05-08 22:16   ` Sean Christopherson
2020-05-08 20:36 ` [PATCH 2/3] KVM: nVMX: Change emulated VMX-preemption timer hrtimer to absolute Jim Mattson
2020-05-08 20:36 ` [PATCH 3/3] KVM: nVMX: Migrate the VMX-preemption timer Jim Mattson
2020-05-09 13:17   ` Paolo Bonzini
2020-05-09 13:14 ` [PATCH 0/3] Pin the hrtimer used for VMX-preemption timer emulation Paolo Bonzini
2020-06-19 18:49 ` Jim Mattson
2020-06-19 22:52   ` Paolo Bonzini

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).