* [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)
@ 2010-12-03 22:39 Anthony Liguori
  2010-12-03 23:32 ` Joerg Roedel
  0 siblings, 1 reply; 4+ messages in thread
From: Anthony Liguori @ 2010-12-03 22:39 UTC (permalink / raw)
  To: kvm
  Cc: Avi Kivity, Marcelo Tosatti, Chris Wright, Srivatsa Vaddagiri,
	Anthony Liguori

In certain use-cases, we want to allocate guests fixed time slices where idle
guest cycles leave the machine idling.  There are many approaches to achieve
this, but the most direct is to simply avoid trapping the HLT instruction, which
lets the guest execute the instruction directly and put the processor to sleep.

Introduce this as a module-level option for kvm-vmx.ko, since if you do this
for one guest you probably want to do it for all.  The option defaults to 1,
which keeps the existing behavior of trapping HLT.  A similar option is
possible for AMD, but I don't have easy access to AMD test hardware.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
v2 -> v3
 - Clear HLT activity state on exception injection to fix issue with async PF

v1 -> v2
 - Rename parameter to yield_on_hlt
 - Remove __read_mostly
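
For readers skimming the thread, the effect of the parameter on the required
execution controls can be sketched in user space as follows. This is only an
illustration, not the kernel code: required_exec_controls() is a hypothetical
helper, only a subset of the real "min" mask is shown, and the bit values are
assumed to match the CPU_BASED_* definitions in arch/x86/include/asm/vmx.h.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Bits in the primary processor-based VM-execution controls.
 * Values are assumed to mirror arch/x86/include/asm/vmx.h.
 */
#define CPU_BASED_HLT_EXITING      0x00000080u
#define CPU_BASED_INVLPG_EXITING   0x00000200u
#define CPU_BASED_MWAIT_EXITING    0x00000400u
#define CPU_BASED_MONITOR_EXITING  0x20000000u

/*
 * Hypothetical helper: build a reduced version of the "min" mask that
 * setup_vmcs_config() requires the CPU to support. HLT exiting is only
 * required when yield_on_hlt is set, so with yield_on_hlt=0 the guest
 * executes HLT natively instead of causing a VM exit.
 */
static uint32_t required_exec_controls(bool yield_on_hlt)
{
	uint32_t min = CPU_BASED_MWAIT_EXITING |
		       CPU_BASED_MONITOR_EXITING |
		       CPU_BASED_INVLPG_EXITING;

	if (yield_on_hlt)
		min |= CPU_BASED_HLT_EXITING;

	return min;
}
```

With yield_on_hlt=1 (the default) the mask includes CPU_BASED_HLT_EXITING, as
before the patch; with yield_on_hlt=0 the bit is left out and the other
required controls are unchanged.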

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 42d9590..9642c22 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -297,6 +297,12 @@ enum vmcs_field {
 #define GUEST_INTR_STATE_SMI		0x00000004
 #define GUEST_INTR_STATE_NMI		0x00000008
 
+/* GUEST_ACTIVITY_STATE flags */
+#define GUEST_ACTIVITY_ACTIVE		0
+#define GUEST_ACTIVITY_HLT		1
+#define GUEST_ACTIVITY_SHUTDOWN		2
+#define GUEST_ACTIVITY_WAIT_SIPI	3
+
 /*
  * Exit Qualifications for MOV for Control Register Access
  */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index caa967e..e8e64cb 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -69,6 +69,9 @@ module_param(emulate_invalid_guest_state, bool, S_IRUGO);
 static int __read_mostly vmm_exclusive = 1;
 module_param(vmm_exclusive, bool, S_IRUGO);
 
+static int yield_on_hlt = 1;
+module_param(yield_on_hlt, bool, S_IRUGO);
+
 #define KVM_GUEST_CR0_MASK_UNRESTRICTED_GUEST				\
 	(X86_CR0_WP | X86_CR0_NE | X86_CR0_NW | X86_CR0_CD)
 #define KVM_GUEST_CR0_MASK						\
@@ -1016,6 +1019,10 @@ static void vmx_queue_exception(struct kvm_vcpu *vcpu, unsigned nr,
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
 	u32 intr_info = nr | INTR_INFO_VALID_MASK;
 
+	/* Cannot inject an exception if guest activity state is HLT */
+	if (vmcs_read32(GUEST_ACTIVITY_STATE) == GUEST_ACTIVITY_HLT)
+		vmcs_write32(GUEST_ACTIVITY_STATE, GUEST_ACTIVITY_ACTIVE);
+
 	if (has_error_code) {
 		vmcs_write32(VM_ENTRY_EXCEPTION_ERROR_CODE, error_code);
 		intr_info |= INTR_INFO_DELIVER_CODE_MASK;
@@ -1419,7 +1426,7 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 				&_pin_based_exec_control) < 0)
 		return -EIO;
 
-	min = CPU_BASED_HLT_EXITING |
+	min =
 #ifdef CONFIG_X86_64
 	      CPU_BASED_CR8_LOAD_EXITING |
 	      CPU_BASED_CR8_STORE_EXITING |
@@ -1432,6 +1439,10 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	      CPU_BASED_MWAIT_EXITING |
 	      CPU_BASED_MONITOR_EXITING |
 	      CPU_BASED_INVLPG_EXITING;
+
+	if (yield_on_hlt)
+		min |= CPU_BASED_HLT_EXITING;
+
 	opt = CPU_BASED_TPR_SHADOW |
 	      CPU_BASED_USE_MSR_BITMAPS |
 	      CPU_BASED_ACTIVATE_SECONDARY_CONTROLS;
-- 
1.7.0.4



* Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)
  2010-12-03 22:39 [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3) Anthony Liguori
@ 2010-12-03 23:32 ` Joerg Roedel
  2010-12-03 23:38   ` Anthony Liguori
  0 siblings, 1 reply; 4+ messages in thread
From: Joerg Roedel @ 2010-12-03 23:32 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: kvm, Avi Kivity, Marcelo Tosatti, Chris Wright, Srivatsa Vaddagiri

On Fri, Dec 03, 2010 at 04:39:22PM -0600, Anthony Liguori wrote:
> +	if (yield_on_hlt)
> +		min |= CPU_BASED_HLT_EXITING;

This approach won't work out on AMD because in HLT the CPU may enter
C1e. In C1e the local APIC timer interrupt is not delivered anymore and
when this is the current timer in use the CPU may miss timer ticks or
never come out of HLT again. The guest has no chance to work around
this as the Linux idle routine does.
If you really want active idling of a guest, it should idle in the
hypervisor where it can work around such problems.

	Joerg



* Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)
  2010-12-03 23:32 ` Joerg Roedel
@ 2010-12-03 23:38   ` Anthony Liguori
  2010-12-04  8:53     ` Joerg Roedel
  0 siblings, 1 reply; 4+ messages in thread
From: Anthony Liguori @ 2010-12-03 23:38 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: Anthony Liguori, kvm, Avi Kivity, Marcelo Tosatti, Chris Wright,
	Srivatsa Vaddagiri

On 12/03/2010 05:32 PM, Joerg Roedel wrote:
> On Fri, Dec 03, 2010 at 04:39:22PM -0600, Anthony Liguori wrote:
>    
>> +	if (yield_on_hlt)
>> +		min |= CPU_BASED_HLT_EXITING;
>>      
> This approach won't work out on AMD because in HLT the CPU may enter
> C1e. In C1e the local APIC timer interrupt is not delivered anymore and
> when this is the current timer in use the CPU may miss timer ticks or
> never come out of HLT again. The guest has no chance to work around
> this as the Linux idle routine does.
>    

And this doesn't break old software on bare metal?

Regards,

Anthony Liguori

> If you really want active idling of a guest, it should idle in the
> hypervisor where it can work around such problems.
>
> 	Joerg
>



* Re: [PATCH] kvm-vmx: add module parameter to avoid trapping HLT instructions (v3)
  2010-12-03 23:38   ` Anthony Liguori
@ 2010-12-04  8:53     ` Joerg Roedel
  0 siblings, 0 replies; 4+ messages in thread
From: Joerg Roedel @ 2010-12-04  8:53 UTC (permalink / raw)
  To: Anthony Liguori
  Cc: Anthony Liguori, kvm, Avi Kivity, Marcelo Tosatti, Chris Wright,
	Srivatsa Vaddagiri

On Fri, Dec 03, 2010 at 05:38:06PM -0600, Anthony Liguori wrote:
> On 12/03/2010 05:32 PM, Joerg Roedel wrote:
>> On Fri, Dec 03, 2010 at 04:39:22PM -0600, Anthony Liguori wrote:
>>    
>>> +	if (yield_on_hlt)
>>> +		min |= CPU_BASED_HLT_EXITING;
>>>      
>> This approach won't work out on AMD because in HLT the CPU may enter
>> C1e. In C1e the local APIC timer interrupt is not delivered anymore and
>> when this is the current timer in use the CPU may miss timer ticks or
>> never come out of HLT again. The guest has no chance to work around
>> this as the Linux idle routine does.
>>    
>
> And this doesn't break old software on bare metal?

Yes, it does. In fact, this behavior is documented as Erratum 400 for AMD
CPUs. Linux has had a workaround for it for quite some time. You can have a
look at the c1e_idle routine for details.
C1e can also be disabled by the OS. But there are BIOSes which re-enable
it in SMI. So there is a chance that it gets re-enabled without a
vmexit.

	Joerg

