All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] nVMX: Fixes to run Xen as L1
@ 2014-03-31 21:00 Bandan Das
  2014-03-31 21:00 ` [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept Bandan Das
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Bandan Das @ 2014-03-31 21:00 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Gleb Natapov, Jan Kiszka

Minor changes to enable Xen as a L1 hypervisor.

Tested with a Haswell host, Xen-4.3 L1 and debian6 L2

v2: 
* Remove advertising single context invalidation for emulated invept
  Patch "KVM: nVMX: check for null vmcs12 when L1 does invept" from v1 
  is now obsolete and is removed
* Reorder patches "KVM: nVMX: Advertise support for interrupt acknowledgement"
  and "nVMX: Ack and write vector info to intr_info if L1 asks us to"
* Add commit description to 2/3 and change comment for nested_exit_intr_ack_set

Jan, I will send a separate unit-test patch

Bandan Das (3):
  KVM: nVMX: Don't advertise single context invalidation for invept
  KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  KVM: nVMX: Advertise support for interrupt acknowledgement

 arch/x86/kvm/irq.c |  1 +
 arch/x86/kvm/vmx.c | 35 ++++++++++++++++++++++++-----------
 2 files changed, 25 insertions(+), 11 deletions(-)

-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-03-31 21:00 [PATCH v2 0/3] nVMX: Fixes to run Xen as L1 Bandan Das
@ 2014-03-31 21:00 ` Bandan Das
  2014-04-10 20:47   ` Marcelo Tosatti
  2014-03-31 21:00 ` [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to Bandan Das
  2014-03-31 21:00 ` [PATCH v2 3/3] KVM: nVMX: Advertise support for interrupt acknowledgement Bandan Das
  2 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-03-31 21:00 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Gleb Natapov, Jan Kiszka

For single context invalidation, we fall through to global
invalidation in handle_invept() except for one case - when
the operand supplied by L1 is different from what we have in
vmcs12. However, typically hypervisors will only call invept
for the currently loaded eptp, so the condition will
never be true.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3927528..3e7f60c 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2331,12 +2331,11 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 			 VMX_EPT_INVEPT_BIT;
 		nested_vmx_ept_caps &= vmx_capability.ept;
 		/*
-		 * Since invept is completely emulated we support both global
-		 * and context invalidation independent of what host cpu
-		 * supports
+		 * For nested guests, we don't do anything specific
+		 * for single context invalidation. Hence, only advertise
+		 * support for global context invalidation.
 		 */
-		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
-			VMX_EPT_EXTENT_CONTEXT_BIT;
+		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
 	} else
 		nested_vmx_ept_caps = 0;
 
@@ -6383,7 +6382,6 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 	struct {
 		u64 eptp, gpa;
 	} operand;
-	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
 
 	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
 	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
@@ -6423,16 +6421,13 @@ static int handle_invept(struct kvm_vcpu *vcpu)
 	}
 
 	switch (type) {
-	case VMX_EPT_EXTENT_CONTEXT:
-		if ((operand.eptp & eptp_mask) !=
-				(nested_ept_get_cr3(vcpu) & eptp_mask))
-			break;
 	case VMX_EPT_EXTENT_GLOBAL:
 		kvm_mmu_sync_roots(vcpu);
 		kvm_mmu_flush_tlb(vcpu);
 		nested_vmx_succeed(vcpu);
 		break;
 	default:
+		/* Trap single context invalidation invept calls */
 		BUG_ON(1);
 		break;
 	}
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  2014-03-31 21:00 [PATCH v2 0/3] nVMX: Fixes to run Xen as L1 Bandan Das
  2014-03-31 21:00 ` [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept Bandan Das
@ 2014-03-31 21:00 ` Bandan Das
  2014-04-11 18:33   ` Marcelo Tosatti
  2014-03-31 21:00 ` [PATCH v2 3/3] KVM: nVMX: Advertise support for interrupt acknowledgement Bandan Das
  2 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-03-31 21:00 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Gleb Natapov, Jan Kiszka

This feature emulates the "Acknowledge interrupt on exit" behavior.
We can safely emulate it for L1 to run L2 even if L0 itself has it
disabled (to run L1).

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/irq.c |  1 +
 arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 484bc87..bd0da43 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -113,6 +113,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
 
 	return kvm_get_apic_interrupt(v);	/* APIC */
 }
+EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
 
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 3e7f60c..bdc8f2d 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -4489,6 +4489,18 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
 		PIN_BASED_EXT_INTR_MASK;
 }
 
+/*
+ * In nested virtualization, check if L1 has enabled
+ * interrupt acknowledgement that writes the interrupt vector
+ * info on vmexit
+ * 
+ */
+static bool nested_exit_intr_ack_set(struct kvm_vcpu *vcpu)
+{
+	return get_vmcs12(vcpu)->vm_exit_controls &
+		VM_EXIT_ACK_INTR_ON_EXIT;
+}
+
 static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
 {
 	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
@@ -8442,6 +8454,13 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
 	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
 		       exit_qualification);
 
+	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
+	    && nested_exit_intr_ack_set(vcpu)) {
+		int irq = kvm_cpu_get_interrupt(vcpu);
+		vmcs12->vm_exit_intr_info = irq |
+			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
+	}
+
 	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
 				       vmcs12->exit_qualification,
 				       vmcs12->idt_vectoring_info_field,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH v2 3/3] KVM: nVMX: Advertise support for interrupt acknowledgement
  2014-03-31 21:00 [PATCH v2 0/3] nVMX: Fixes to run Xen as L1 Bandan Das
  2014-03-31 21:00 ` [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept Bandan Das
  2014-03-31 21:00 ` [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to Bandan Das
@ 2014-03-31 21:00 ` Bandan Das
  2 siblings, 0 replies; 21+ messages in thread
From: Bandan Das @ 2014-03-31 21:00 UTC (permalink / raw)
  To: kvm; +Cc: Paolo Bonzini, Gleb Natapov, Jan Kiszka

Some Type 1 hypervisors such as XEN won't enable VMX without it present

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index e864b7a..a2a03c5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2273,7 +2273,8 @@ static __init void nested_vmx_setup_ctls_msrs(void)
 		nested_vmx_pinbased_ctls_high &= ~PIN_BASED_VMX_PREEMPTION_TIMER;
 	}
 	nested_vmx_exit_ctls_high |= (VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR |
-		VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER);
+		VM_EXIT_LOAD_IA32_EFER | VM_EXIT_SAVE_IA32_EFER |
+				      VM_EXIT_ACK_INTR_ON_EXIT);
 
 	/* entry controls */
 	rdmsr(MSR_IA32_VMX_ENTRY_CTLS,
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-03-31 21:00 ` [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept Bandan Das
@ 2014-04-10 20:47   ` Marcelo Tosatti
  2014-04-11  0:27     ` Bandan Das
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-10 20:47 UTC (permalink / raw)
  To: Bandan Das; +Cc: kvm, Paolo Bonzini, Gleb Natapov, Jan Kiszka

On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> For single context invalidation, we fall through to global
> invalidation in handle_invept() except for one case - when
> the operand supplied by L1 is different from what we have in
> vmcs12. However, typically hypervisors will only call invept
> for the currently loaded eptp, so the condition will
> never be true.
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>

Bandan,

Why not fix INVEPT single-context rather than removing it entirely?

"Single-context. If the INVEPT type is 1, the logical processor
invalidates all guest-physical mappings and combined mappings associated
with the EP4TA specified in the INVEPT descriptor. Combined mappings for
that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
may invalidate mappings associated with other EP4TAs.)"

So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.

> ---
>  arch/x86/kvm/vmx.c | 15 +++++----------
>  1 file changed, 5 insertions(+), 10 deletions(-)
> 
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3927528..3e7f60c 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -2331,12 +2331,11 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>  			 VMX_EPT_INVEPT_BIT;
>  		nested_vmx_ept_caps &= vmx_capability.ept;
>  		/*
> -		 * Since invept is completely emulated we support both global
> -		 * and context invalidation independent of what host cpu
> -		 * supports
> +		 * For nested guests, we don't do anything specific
> +		 * for single context invalidation. Hence, only advertise
> +		 * support for global context invalidation.
>  		 */
> -		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
> -			VMX_EPT_EXTENT_CONTEXT_BIT;
> +		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
>  	} else
>  		nested_vmx_ept_caps = 0;
>  
> @@ -6383,7 +6382,6 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  	struct {
>  		u64 eptp, gpa;
>  	} operand;
> -	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
>  
>  	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
>  	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
> @@ -6423,16 +6421,13 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>  	}
>  
>  	switch (type) {
> -	case VMX_EPT_EXTENT_CONTEXT:
> -		if ((operand.eptp & eptp_mask) !=
> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
> -			break;
>  	case VMX_EPT_EXTENT_GLOBAL:
>  		kvm_mmu_sync_roots(vcpu);
>  		kvm_mmu_flush_tlb(vcpu);
>  		nested_vmx_succeed(vcpu);
>  		break;
>  	default:
> +		/* Trap single context invalidation invept calls */
>  		BUG_ON(1);
>  		break;
>  	}
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-10 20:47   ` Marcelo Tosatti
@ 2014-04-11  0:27     ` Bandan Das
  2014-04-11  6:22       ` Jan Kiszka
  0 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-04-11  0:27 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Gleb Natapov, Jan Kiszka

Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>> For single context invalidation, we fall through to global
>> invalidation in handle_invept() except for one case - when
>> the operand supplied by L1 is different from what we have in
>> vmcs12. However, typically hypervisors will only call invept
>> for the currently loaded eptp, so the condition will
>> never be true.
>> 
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>
> Bandan,
>
> Why not fix INVEPT single-context rather than removing it entirely?
>
> "Single-context. If the INVEPT type is 1, the logical processor
> invalidates all guest-physical mappings and combined mappings associated
> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> may invalidate mappings associated with other EP4TAs.)"
>
> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.

The single context invalidation in handle_invept() doesn't do 
anything different. It just falls down to the global case.
And the invept code in Xen and KVM both seemed to fall back
to global invalidation if support for single context wasn't found.
So, it was proposed not to advertise it at all.

But rethinking this again, I agree with you. If there's a hypervisor
with a  single context invept implmentation that does not fallback,
this will unfortunately not work. Jan, do you agree with this ?

Bandan

>> ---
>>  arch/x86/kvm/vmx.c | 15 +++++----------
>>  1 file changed, 5 insertions(+), 10 deletions(-)
>> 
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 3927528..3e7f60c 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -2331,12 +2331,11 @@ static __init void nested_vmx_setup_ctls_msrs(void)
>>  			 VMX_EPT_INVEPT_BIT;
>>  		nested_vmx_ept_caps &= vmx_capability.ept;
>>  		/*
>> -		 * Since invept is completely emulated we support both global
>> -		 * and context invalidation independent of what host cpu
>> -		 * supports
>> +		 * For nested guests, we don't do anything specific
>> +		 * for single context invalidation. Hence, only advertise
>> +		 * support for global context invalidation.
>>  		 */
>> -		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT |
>> -			VMX_EPT_EXTENT_CONTEXT_BIT;
>> +		nested_vmx_ept_caps |= VMX_EPT_EXTENT_GLOBAL_BIT;
>>  	} else
>>  		nested_vmx_ept_caps = 0;
>>  
>> @@ -6383,7 +6382,6 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>  	struct {
>>  		u64 eptp, gpa;
>>  	} operand;
>> -	u64 eptp_mask = ((1ull << 51) - 1) & PAGE_MASK;
>>  
>>  	if (!(nested_vmx_secondary_ctls_high & SECONDARY_EXEC_ENABLE_EPT) ||
>>  	    !(nested_vmx_ept_caps & VMX_EPT_INVEPT_BIT)) {
>> @@ -6423,16 +6421,13 @@ static int handle_invept(struct kvm_vcpu *vcpu)
>>  	}
>>  
>>  	switch (type) {
>> -	case VMX_EPT_EXTENT_CONTEXT:
>> -		if ((operand.eptp & eptp_mask) !=
>> -				(nested_ept_get_cr3(vcpu) & eptp_mask))
>> -			break;
>>  	case VMX_EPT_EXTENT_GLOBAL:
>>  		kvm_mmu_sync_roots(vcpu);
>>  		kvm_mmu_flush_tlb(vcpu);
>>  		nested_vmx_succeed(vcpu);
>>  		break;
>>  	default:
>> +		/* Trap single context invalidation invept calls */
>>  		BUG_ON(1);
>>  		break;
>>  	}
>> -- 
>> 1.8.3.1
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11  0:27     ` Bandan Das
@ 2014-04-11  6:22       ` Jan Kiszka
  2014-04-11 17:26         ` Bandan Das
                           ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Jan Kiszka @ 2014-04-11  6:22 UTC (permalink / raw)
  To: Bandan Das, Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Gleb Natapov

On 2014-04-11 02:27, Bandan Das wrote:
> Marcelo Tosatti <mtosatti@redhat.com> writes:
> 
>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>> For single context invalidation, we fall through to global
>>> invalidation in handle_invept() except for one case - when
>>> the operand supplied by L1 is different from what we have in
>>> vmcs12. However, typically hypervisors will only call invept
>>> for the currently loaded eptp, so the condition will
>>> never be true.
>>>
>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>
>> Bandan,
>>
>> Why not fix INVEPT single-context rather than removing it entirely?
>>
>> "Single-context. If the INVEPT type is 1, the logical processor
>> invalidates all guest-physical mappings and combined mappings associated
>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>> may invalidate mappings associated with other EP4TAs.)"
>>
>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> 
> The single context invalidation in handle_invept() doesn't do 
> anything different. It just falls down to the global case.
> And the invept code in Xen and KVM both seemed to fall back
> to global invalidation if support for single context wasn't found.
> So, it was proposed not to advertise it at all.
> 
> But rethinking this again, I agree with you. If there's a hypervisor
> with a  single context invept implmentation that does not fallback,
> this will unfortunately not work. Jan, do you agree with this ?

A hypervisor that doesn't properly check the HW caps is just broken. And
one that mandates single context invalidation support is silly.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11  6:22       ` Jan Kiszka
@ 2014-04-11 17:26         ` Bandan Das
  2014-04-11 18:01           ` Jan Kiszka
  2014-04-11 18:48         ` Marcelo Tosatti
  2014-04-11 19:02         ` Marcelo Tosatti
  2 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-04-11 17:26 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Gleb Natapov

Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 02:27, Bandan Das wrote:
>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>> 
>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>> For single context invalidation, we fall through to global
>>>> invalidation in handle_invept() except for one case - when
>>>> the operand supplied by L1 is different from what we have in
>>>> vmcs12. However, typically hypervisors will only call invept
>>>> for the currently loaded eptp, so the condition will
>>>> never be true.
>>>>
>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>
>>> Bandan,
>>>
>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>
>>> "Single-context. If the INVEPT type is 1, the logical processor
>>> invalidates all guest-physical mappings and combined mappings associated
>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>> may invalidate mappings associated with other EP4TAs.)"
>>>
>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>> 
>> The single context invalidation in handle_invept() doesn't do 
>> anything different. It just falls down to the global case.
>> And the invept code in Xen and KVM both seemed to fall back
>> to global invalidation if support for single context wasn't found.
>> So, it was proposed not to advertise it at all.
>> 
>> But rethinking this again, I agree with you. If there's a hypervisor
>> with a  single context invept implmentation that does not fallback,
>> this will unfortunately not work. Jan, do you agree with this ?
>
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.

Well, but we could make life a little bit easier for the unfortunate user
using the broken hypervisor :) And advertising single context inavalidation
doesn't really seem to have any downsides.

> Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 17:26         ` Bandan Das
@ 2014-04-11 18:01           ` Jan Kiszka
  2014-04-11 18:35             ` Bandan Das
  0 siblings, 1 reply; 21+ messages in thread
From: Jan Kiszka @ 2014-04-11 18:01 UTC (permalink / raw)
  To: Bandan Das; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Gleb Natapov

On 2014-04-11 19:26, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 2014-04-11 02:27, Bandan Das wrote:
>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>
>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>> For single context invalidation, we fall through to global
>>>>> invalidation in handle_invept() except for one case - when
>>>>> the operand supplied by L1 is different from what we have in
>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>> for the currently loaded eptp, so the condition will
>>>>> never be true.
>>>>>
>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>
>>>> Bandan,
>>>>
>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>
>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>> invalidates all guest-physical mappings and combined mappings associated
>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>
>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>
>>> The single context invalidation in handle_invept() doesn't do 
>>> anything different. It just falls down to the global case.
>>> And the invept code in Xen and KVM both seemed to fall back
>>> to global invalidation if support for single context wasn't found.
>>> So, it was proposed not to advertise it at all.
>>>
>>> But rethinking this again, I agree with you. If there's a hypervisor
>>> with a  single context invept implmentation that does not fallback,
>>> this will unfortunately not work. Jan, do you agree with this ?
>>
>> A hypervisor that doesn't properly check the HW caps is just broken. And
>> one that mandates single context invalidation support is silly.
> 
> Well, but we could make life a little bit easier for the unfortunate user
> using the broken hypervisor :) And advertising single context inavalidation
> doesn't really seem to have any downsides.

Ok, let's try it this way: single-context invalidation is inherently
tied to VPID support (that's how you address a context). However, KVM
does not expose VPID to its guest. So this discussion is mood: no
hypervisor will make use of this feature as it has no means to fill in
the required parameter.

Once we start supporting VPID, we can also think about how to address
single-context invalidation reasonably.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  2014-03-31 21:00 ` [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to Bandan Das
@ 2014-04-11 18:33   ` Marcelo Tosatti
  2014-04-11 19:17     ` Bandan Das
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-11 18:33 UTC (permalink / raw)
  To: Bandan Das; +Cc: kvm, Paolo Bonzini, Gleb Natapov, Jan Kiszka

On Mon, Mar 31, 2014 at 05:00:24PM -0400, Bandan Das wrote:
> This feature emulates the "Acknowledge interrupt on exit" behavior.
> We can safely emulate it for L1 to run L2 even if L0 itself has it
> disabled (to run L1).
> 
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/irq.c |  1 +
>  arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
>  2 files changed, 20 insertions(+)
> 
> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> index 484bc87..bd0da43 100644
> --- a/arch/x86/kvm/irq.c
> +++ b/arch/x86/kvm/irq.c
> @@ -113,6 +113,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>  
>  	return kvm_get_apic_interrupt(v);	/* APIC */
>  }
> +EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
>  
>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 3e7f60c..bdc8f2d 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -4489,6 +4489,18 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>  		PIN_BASED_EXT_INTR_MASK;
>  }
>  
> +/*
> + * In nested virtualization, check if L1 has enabled
> + * interrupt acknowledgement that writes the interrupt vector
> + * info on vmexit
> + * 
> + */
> +static bool nested_exit_intr_ack_set(struct kvm_vcpu *vcpu)
> +{
> +	return get_vmcs12(vcpu)->vm_exit_controls &
> +		VM_EXIT_ACK_INTR_ON_EXIT;
> +}
> +
>  static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
>  {
>  	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> @@ -8442,6 +8454,13 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>  	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
>  		       exit_qualification);
>  
> +	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> +	    && nested_exit_intr_ack_set(vcpu)) {
> +		int irq = kvm_cpu_get_interrupt(vcpu);

Can irq be -1 ?

> +		vmcs12->vm_exit_intr_info = irq |
> +			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
> +	}
> +
>  	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
>  				       vmcs12->exit_qualification,
>  				       vmcs12->idt_vectoring_info_field,
> -- 
> 1.8.3.1
> 
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 18:01           ` Jan Kiszka
@ 2014-04-11 18:35             ` Bandan Das
  2014-04-11 18:53               ` Jan Kiszka
  0 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-04-11 18:35 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Gleb Natapov

Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 19:26, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>
>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>> For single context invalidation, we fall through to global
>>>>>> invalidation in handle_invept() except for one case - when
>>>>>> the operand supplied by L1 is different from what we have in
>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>> for the currently loaded eptp, so the condition will
>>>>>> never be true.
>>>>>>
>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>
>>>>> Bandan,
>>>>>
>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>
>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>
>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>
>>>> The single context invalidation in handle_invept() doesn't do 
>>>> anything different. It just falls down to the global case.
>>>> And the invept code in Xen and KVM both seemed to fall back
>>>> to global invalidation if support for single context wasn't found.
>>>> So, it was proposed not to advertise it at all.
>>>>
>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>> with a  single context invept implmentation that does not fallback,
>>>> this will unfortunately not work. Jan, do you agree with this ?
>>>
>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>> one that mandates single context invalidation support is silly.
>> 
>> Well, but we could make life a little bit easier for the unfortunate user
>> using the broken hypervisor :) And advertising single context inavalidation
>> doesn't really seem to have any downsides.
>
> Ok, let's try it this way: single-context invalidation is inherently
> tied to VPID support (that's how you address a context). However, KVM
> does not expose VPID to its guest. So this discussion is mood: no
> hypervisor will make use of this feature as it has no means to fill in
> the required parameter.

I thought (from the spec) invept single context invalidation
takes the EP4TA as the second argument. invvpid single context
however takes the VPID as its descriptor.

The Xen L1 hypervisor was actually calling single context invept
multiple times. That's how I hit this bug.

> Once we start supporting VPID, we can also think about how to address
> single-context invalidation reasonably.
>
> Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11  6:22       ` Jan Kiszka
  2014-04-11 17:26         ` Bandan Das
@ 2014-04-11 18:48         ` Marcelo Tosatti
  2014-04-11 19:33           ` Bandan Das
  2014-04-11 19:02         ` Marcelo Tosatti
  2 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-11 18:48 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Bandan Das, kvm, Paolo Bonzini, Gleb Natapov

On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
> On 2014-04-11 02:27, Bandan Das wrote:
> > Marcelo Tosatti <mtosatti@redhat.com> writes:
> > 
> >> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> >>> For single context invalidation, we fall through to global
> >>> invalidation in handle_invept() except for one case - when
> >>> the operand supplied by L1 is different from what we have in
> >>> vmcs12. However, typically hypervisors will only call invept
> >>> for the currently loaded eptp, so the condition will
> >>> never be true.
> >>>
> >>> Signed-off-by: Bandan Das <bsd@redhat.com>
> >>
>> Bandan,
> >>
> >> Why not fix INVEPT single-context rather than removing it entirely?
> >>
> >> "Single-context. If the INVEPT type is 1, the logical processor
> >> invalidates all guest-physical mappings and combined mappings associated
> >> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> >> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> >> may invalidate mappings associated with other EP4TAs.)"
> >>
> >> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> > 
> > The single context invalidation in handle_invept() doesn't do 
> > anything different. It just falls down to the global case.
> > And the invept code in Xen and KVM both seemed to fall back
> > to global invalidation if support for single context wasn't found.
> > So, it was proposed not to advertise it at all.
> > 
> > But rethinking this again, I agree with you. If there's a hypervisor
> > with a  single context invept implmentation that does not fallback,

What do you mean "does not fallback" ? The hypervisor cannot detect 
fallback because:

"(The instruction may invalidate mappings associated with other EP4TAs.)"

So the spec says single context can behave as global context (similar
with TLB entries and INVLPG).

So it is valid to implement single context as global context.

> > this will unfortunately not work. Jan, do you agree with this ?
> 
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.
> 
> Jan

I imagined Xen broke because broken KVM's implementation of INVEPT
single context (so that should be fixed).

If with the proper implementation of INVEPT single context in KVM Xen
still fails for some reason, would have to understand why it is failing.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 18:35             ` Bandan Das
@ 2014-04-11 18:53               ` Jan Kiszka
  2014-04-11 19:35                 ` Marcelo Tosatti
  2014-04-11 19:38                 ` Bandan Das
  0 siblings, 2 replies; 21+ messages in thread
From: Jan Kiszka @ 2014-04-11 18:53 UTC (permalink / raw)
  To: Bandan Das; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Gleb Natapov

On 2014-04-11 20:35, Bandan Das wrote:
> Jan Kiszka <jan.kiszka@siemens.com> writes:
> 
>> On 2014-04-11 19:26, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>
>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>> For single context invalidation, we fall through to global
>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>> never be true.
>>>>>>>
>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>
>>>>>> Bandan,
>>>>>>
>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>
>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>
>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>
>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>> anything different. It just falls down to the global case.
>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>> to global invalidation if support for single context wasn't found.
>>>>> So, it was proposed not to advertise it at all.
>>>>>
>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>> with a  single context invept implmentation that does not fallback,
>>>>> this will unfortunately not work. Jan, do you agree with this ?
>>>>
>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>> one that mandates single context invalidation support is silly.
>>>
>>> Well, but we could make life a little bit easier for the unfortunate user
>>> using the broken hypervisor :) And advertising single context inavalidation
>>> doesn't really seem to have any downsides.
>>
>> Ok, let's try it this way: single-context invalidation is inherently
>> tied to VPID support (that's how you address a context). However, KVM
>> does not expose VPID to its guest. So this discussion is mood: no
>> hypervisor will make use of this feature as it has no means to fill in
>> the required parameter.
> 
> I thought (from the spec) invept single context invalidation
> takes the EP4TA as the second argument. invvpid single context
> however takes the VPID as its descriptor.

Oops, invept/invvpid mess-up while re-reading the spec - sorry.

> 
> The Xen L1 hypervisor was actually calling single context invept
> multiple times. That's how I hit this bug.

...and it's no longer doing it now, I suppose. The question remains,
which hypervisor we want to cater with a
"single-context-that-is-current-context" invalidation (that is my
understanding of Marcelo's proposal). On the other hand, if some
hypervisor actually uses invept to invalidate a non-current mapping, we
would regress compared to not exposing single context invept. Hope I got
this conclusion right. ;)

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11  6:22       ` Jan Kiszka
  2014-04-11 17:26         ` Bandan Das
  2014-04-11 18:48         ` Marcelo Tosatti
@ 2014-04-11 19:02         ` Marcelo Tosatti
  2 siblings, 0 replies; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-11 19:02 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Bandan Das, kvm, Paolo Bonzini, Gleb Natapov

On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
> > But rethinking this again, I agree with you. If there's a hypervisor
> > with a  single context invept implmentation that does not fallback,
> > this will unfortunately not work. Jan, do you agree with this ?
> 
> A hypervisor that doesn't properly check the HW caps is just broken. And
> one that mandates single context invalidation support is silly.

Is this a justification for removing INVEPT single-context until it 
is implemented as single-context?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  2014-04-11 18:33   ` Marcelo Tosatti
@ 2014-04-11 19:17     ` Bandan Das
  2014-04-11 19:20       ` Marcelo Tosatti
  0 siblings, 1 reply; 21+ messages in thread
From: Bandan Das @ 2014-04-11 19:17 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: kvm, Paolo Bonzini, Gleb Natapov, Jan Kiszka

Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Mon, Mar 31, 2014 at 05:00:24PM -0400, Bandan Das wrote:
>> This feature emulates the "Acknowledge interrupt on exit" behavior.
>> We can safely emulate it for L1 to run L2 even if L0 itself has it
>> disabled (to run L1).
>> 
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  arch/x86/kvm/irq.c |  1 +
>>  arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
>>  2 files changed, 20 insertions(+)
>> 
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index 484bc87..bd0da43 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -113,6 +113,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
>>  
>>  	return kvm_get_apic_interrupt(v);	/* APIC */
>>  }
>> +EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
>>  
>>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
>>  {
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 3e7f60c..bdc8f2d 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -4489,6 +4489,18 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
>>  		PIN_BASED_EXT_INTR_MASK;
>>  }
>>  
>> +/*
>> + * In nested virtualization, check if L1 has enabled
>> + * interrupt acknowledgement that writes the interrupt vector
>> + * info on vmexit
>> + * 
>> + */
>> +static bool nested_exit_intr_ack_set(struct kvm_vcpu *vcpu)
>> +{
>> +	return get_vmcs12(vcpu)->vm_exit_controls &
>> +		VM_EXIT_ACK_INTR_ON_EXIT;
>> +}
>> +
>>  static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
>>  {
>>  	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
>> @@ -8442,6 +8454,13 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
>>  	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
>>  		       exit_qualification);
>>  
>> +	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
>> +	    && nested_exit_intr_ack_set(vcpu)) {
>> +		int irq = kvm_cpu_get_interrupt(vcpu);
>
> Can irq be -1 ?

If it is, I think that's a bug because if we exited for this 
reason with INTR_ACK set, the hypervisor expects a valid vector 
number to be available. 

What about adding a BUG_ON ?

>> +		vmcs12->vm_exit_intr_info = irq |
>> +			INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR;
>> +	}
>> +
>>  	trace_kvm_nested_vmexit_inject(vmcs12->vm_exit_reason,
>>  				       vmcs12->exit_qualification,
>>  				       vmcs12->idt_vectoring_info_field,
>> -- 
>> 1.8.3.1
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe kvm" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  2014-04-11 19:17     ` Bandan Das
@ 2014-04-11 19:20       ` Marcelo Tosatti
  2014-04-12 16:57         ` Paolo Bonzini
  0 siblings, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-11 19:20 UTC (permalink / raw)
  To: Bandan Das; +Cc: kvm, Paolo Bonzini, Gleb Natapov, Jan Kiszka

On Fri, Apr 11, 2014 at 03:17:47PM -0400, Bandan Das wrote:
> Marcelo Tosatti <mtosatti@redhat.com> writes:
> 
> > On Mon, Mar 31, 2014 at 05:00:24PM -0400, Bandan Das wrote:
> >> This feature emulates the "Acknowledge interrupt on exit" behavior.
> >> We can safely emulate it for L1 to run L2 even if L0 itself has it
> >> disabled (to run L1).
> >> 
> >> Signed-off-by: Bandan Das <bsd@redhat.com>
> >> ---
> >>  arch/x86/kvm/irq.c |  1 +
> >>  arch/x86/kvm/vmx.c | 19 +++++++++++++++++++
> >>  2 files changed, 20 insertions(+)
> >> 
> >> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> >> index 484bc87..bd0da43 100644
> >> --- a/arch/x86/kvm/irq.c
> >> +++ b/arch/x86/kvm/irq.c
> >> @@ -113,6 +113,7 @@ int kvm_cpu_get_interrupt(struct kvm_vcpu *v)
> >>  
> >>  	return kvm_get_apic_interrupt(v);	/* APIC */
> >>  }
> >> +EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
> >>  
> >>  void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
> >>  {
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index 3e7f60c..bdc8f2d 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -4489,6 +4489,18 @@ static bool nested_exit_on_intr(struct kvm_vcpu *vcpu)
> >>  		PIN_BASED_EXT_INTR_MASK;
> >>  }
> >>  
> >> +/*
> >> + * In nested virtualization, check if L1 has enabled
> >> + * interrupt acknowledgement that writes the interrupt vector
> >> + * info on vmexit
> >> + * 
> >> + */
> >> +static bool nested_exit_intr_ack_set(struct kvm_vcpu *vcpu)
> >> +{
> >> +	return get_vmcs12(vcpu)->vm_exit_controls &
> >> +		VM_EXIT_ACK_INTR_ON_EXIT;
> >> +}
> >> +
> >>  static bool nested_exit_on_nmi(struct kvm_vcpu *vcpu)
> >>  {
> >>  	return get_vmcs12(vcpu)->pin_based_vm_exec_control &
> >> @@ -8442,6 +8454,13 @@ static void nested_vmx_vmexit(struct kvm_vcpu *vcpu, u32 exit_reason,
> >>  	prepare_vmcs12(vcpu, vmcs12, exit_reason, exit_intr_info,
> >>  		       exit_qualification);
> >>  
> >> +	if ((exit_reason == EXIT_REASON_EXTERNAL_INTERRUPT)
> >> +	    && nested_exit_intr_ack_set(vcpu)) {
> >> +		int irq = kvm_cpu_get_interrupt(vcpu);
> >
> > Can irq be -1 ?
> 
> If it is, I think that's a bug because if we exited for this 
> reason with INTR_ACK set, the hypervisor expects a valid vector 
> number to be available. 
> 
> What about adding a BUG_ON ?

Sounds good.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 18:48         ` Marcelo Tosatti
@ 2014-04-11 19:33           ` Bandan Das
  0 siblings, 0 replies; 21+ messages in thread
From: Bandan Das @ 2014-04-11 19:33 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Jan Kiszka, kvm, Paolo Bonzini, Gleb Natapov

Marcelo Tosatti <mtosatti@redhat.com> writes:

> On Fri, Apr 11, 2014 at 08:22:13AM +0200, Jan Kiszka wrote:
>> On 2014-04-11 02:27, Bandan Das wrote:
>> > Marcelo Tosatti <mtosatti@redhat.com> writes:
>> > 
>> >> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>> >>> For single context invalidation, we fall through to global
>> >>> invalidation in handle_invept() except for one case - when
>> >>> the operand supplied by L1 is different from what we have in
>> >>> vmcs12. However, typically hypervisors will only call invept
>> >>> for the currently loaded eptp, so the condition will
>> >>> never be true.
>> >>>
>> >>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> >>
>>> Bandan,
>> >>
>> >> Why not fix INVEPT single-context rather than removing it entirely?
>> >>
>> >> "Single-context. If the INVEPT type is 1, the logical processor
>> >> invalidates all guest-physical mappings and combined mappings associated
>> >> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>> >> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>> >> may invalidate mappings associated with other EP4TAs.)"
>> >>
>> >> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>> > 
>> > The single context invalidation in handle_invept() doesn't do 
>> > anything different. It just falls down to the global case.
>> > And the invept code in Xen and KVM both seemed to fall back
>> > to global invalidation if support for single context wasn't found.
>> > So, it was proposed not to advertise it at all.
>> > 
>> > But rethinking this again, I agree with you. If there's a hypervisor
>> > with a  single context invept implmentation that does not fallback,
>
> What do you mean "does not fallback" ? The hypervisor cannot detect 
> fallback because:
>
> "(The instruction may invalidate mappings associated with other EP4TAs.)"
>
> So the spec says single context can behave as global context (similar
> with TLB entries and INVLPG).
>
> So it is valid to implement single context as global context.

I meant if single context invalidation isn't supported,
the hypervisor falls back to global invalidation like in kvm -

static inline void ept_sync_context(u64 eptp)
{
...
		if (cpu_has_vmx_invept_context())
			__invept(VMX_EPT_EXTENT_CONTEXT, eptp, 0);
		else
			ept_sync_global();
...

>> > this will unfortunately not work. Jan, do you agree with this ?
>> 
>> A hypervisor that doesn't properly check the HW caps is just broken. And
>> one that mandates single context invalidation support is silly.
>> 
>> Jan
>
> I imagined Xen broke because broken KVM's implementation of INVEPT
> single context (so that should be fixed).

It's failing because of this check in handle_invept -
if ((operand.eptp & eptp_mask) !=
	(nested_ept_get_cr3(vcpu) & eptp_mask))
			break;

Problem is invept can get called even after a vmclear and Jan 
pointed out that there's probably no case where this if will
evaluate to true (atleast not for kvm/xen).

> If with the proper implementation of INVEPT single context in KVM Xen
> still fails for some reason, would have to understand why it is failing.

The argument was that since kvm doesn't do anything different
for single context invalidation, does it make sense to not advertise
it at all assuming that the above snippet of invept code is used
by all hypervisors ?


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 18:53               ` Jan Kiszka
@ 2014-04-11 19:35                 ` Marcelo Tosatti
  2014-04-14  5:46                   ` Jan Kiszka
  2014-04-11 19:38                 ` Bandan Das
  1 sibling, 1 reply; 21+ messages in thread
From: Marcelo Tosatti @ 2014-04-11 19:35 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Bandan Das, kvm, Paolo Bonzini, Gleb Natapov

On Fri, Apr 11, 2014 at 08:53:09PM +0200, Jan Kiszka wrote:
> On 2014-04-11 20:35, Bandan Das wrote:
> > Jan Kiszka <jan.kiszka@siemens.com> writes:
> > 
> >> On 2014-04-11 19:26, Bandan Das wrote:
> >>> Jan Kiszka <jan.kiszka@siemens.com> writes:
> >>>
> >>>> On 2014-04-11 02:27, Bandan Das wrote:
> >>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
> >>>>>
> >>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
> >>>>>>> For single context invalidation, we fall through to global
> >>>>>>> invalidation in handle_invept() except for one case - when
> >>>>>>> the operand supplied by L1 is different from what we have in
> >>>>>>> vmcs12. However, typically hypervisors will only call invept
> >>>>>>> for the currently loaded eptp, so the condition will
> >>>>>>> never be true.
> >>>>>>>
> >>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
> >>>>>>
> >>>>>> Bandan,
> >>>>>>
> >>>>>> Why not fix INVEPT single-context rather than removing it entirely?
> >>>>>>
> >>>>>> "Single-context. If the INVEPT type is 1, the logical processor
> >>>>>> invalidates all guest-physical mappings and combined mappings associated
> >>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
> >>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
> >>>>>> may invalidate mappings associated with other EP4TAs.)"
> >>>>>>
> >>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
> >>>>>
> >>>>> The single context invalidation in handle_invept() doesn't do 
> >>>>> anything different. It just falls down to the global case.
> >>>>> And the invept code in Xen and KVM both seemed to fall back
> >>>>> to global invalidation if support for single context wasn't found.
> >>>>> So, it was proposed not to advertise it at all.
> >>>>>
> >>>>> But rethinking this again, I agree with you. If there's a hypervisor
> >>>>> with a  single context invept implmentation that does not fallback,
> >>>>> this will unfortunately not work. Jan, do you agree with this ?
> >>>>
> >>>> A hypervisor that doesn't properly check the HW caps is just broken. And
> >>>> one that mandates single context invalidation support is silly.
> >>>
> >>> Well, but we could make life a little bit easier for the unfortunate user
> >>> using the broken hypervisor :) And advertising single context inavalidation
> >>> doesn't really seem to have any downsides.
> >>
> >> Ok, let's try it this way: single-context invalidation is inherently
> >> tied to VPID support (that's how you address a context). However, KVM
> >> does not expose VPID to its guest. So this discussion is mood: no
> >> hypervisor will make use of this feature as it has no means to fill in
> >> the required parameter.
> > 
> > I thought (from the spec) invept single context invalidation
> > takes the EP4TA as the second argument. invvpid single context
> > however takes the VPID as its descriptor.
> 
> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
> 
> > 
> > The Xen L1 hypervisor was actually calling single context invept
> > multiple times. That's how I hit this bug.
> 
> ...and it's no longer doing it now, I suppose. The question remains,
> which hypervisor we want to cater with a
> "single-context-that-is-current-context" invalidation (that is my
> understanding of Marcelo's proposal). 

My proposal is to implement what is in the spec.

> On the other hand, if some hypervisor actually uses invept to
> invalidate a non-current mapping, we would regress compared to not
> exposing single context invept. Hope I got this conclusion right. ;)

In that case INVEPT global would also be broken.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 18:53               ` Jan Kiszka
  2014-04-11 19:35                 ` Marcelo Tosatti
@ 2014-04-11 19:38                 ` Bandan Das
  1 sibling, 0 replies; 21+ messages in thread
From: Bandan Das @ 2014-04-11 19:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Marcelo Tosatti, kvm, Paolo Bonzini, Gleb Natapov

Jan Kiszka <jan.kiszka@siemens.com> writes:

> On 2014-04-11 20:35, Bandan Das wrote:
>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>> 
>>> On 2014-04-11 19:26, Bandan Das wrote:
>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>
>>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>>
>>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>>> For single context invalidation, we fall through to global
>>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>>> never be true.
>>>>>>>>
>>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>>
>>>>>>> Bandan,
>>>>>>>
>>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>>
>>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>>
>>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>>
>>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>>> anything different. It just falls down to the global case.
>>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>>> to global invalidation if support for single context wasn't found.
>>>>>> So, it was proposed not to advertise it at all.
>>>>>>
>>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>>> with a  single context invept implmentation that does not fallback,
>>>>>> this will unfortunately not work. Jan, do you agree with this ?
>>>>>
>>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>>> one that mandates single context invalidation support is silly.
>>>>
>>>> Well, but we could make life a little bit easier for the unfortunate user
>>>> using the broken hypervisor :) And advertising single context inavalidation
>>>> doesn't really seem to have any downsides.
>>>
>>> Ok, let's try it this way: single-context invalidation is inherently
>>> tied to VPID support (that's how you address a context). However, KVM
>>> does not expose VPID to its guest. So this discussion is mood: no
>>> hypervisor will make use of this feature as it has no means to fill in
>>> the required parameter.
>> 
>> I thought (from the spec) invept single context invalidation
>> takes the EP4TA as the second argument. invvpid single context
>> however takes the VPID as its descriptor.
>
> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
>
>> 
>> The Xen L1 hypervisor was actually calling single context invept
>> multiple times. That's how I hit this bug.
>
> ...and it's no longer doing it now, I suppose. The question remains,
Yes.

> which hypervisor we want to cater with a
> "single-context-that-is-current-context" invalidation (that is my
> understanding of Marcelo's proposal). On the other hand, if some
> hypervisor actually uses invept to invalidate a non-current mapping, we
> would regress compared to not exposing single context invept. Hope I got
> this conclusion right. ;)

Yep, not sure if this holds true for any hypervisor. I traced this change
down to http://www.spinics.net/lists/kvm/msg94802.html but the 
conversation doesn't mention the reasoning

> Jan

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to
  2014-04-11 19:20       ` Marcelo Tosatti
@ 2014-04-12 16:57         ` Paolo Bonzini
  0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2014-04-12 16:57 UTC (permalink / raw)
  To: Marcelo Tosatti, Bandan Das; +Cc: kvm, Gleb Natapov, Jan Kiszka

Il 11/04/2014 15:20, Marcelo Tosatti ha scritto:
>>> > >
>>> > > Can irq be -1 ?
>> >
>> > If it is, I think that's a bug because if we exited for this
>> > reason with INTR_ACK set, the hypervisor expects a valid vector
>> > number to be available.
>> >
>> > What about adding a BUG_ON ?
> Sounds good.

Each BUG_ON we add to KVM is a potential guest-kill-host vulnerability. 
  WARN_ON is better.

Paolo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept
  2014-04-11 19:35                 ` Marcelo Tosatti
@ 2014-04-14  5:46                   ` Jan Kiszka
  0 siblings, 0 replies; 21+ messages in thread
From: Jan Kiszka @ 2014-04-14  5:46 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Bandan Das, kvm, Paolo Bonzini, Gleb Natapov

On 2014-04-11 21:35, Marcelo Tosatti wrote:
> On Fri, Apr 11, 2014 at 08:53:09PM +0200, Jan Kiszka wrote:
>> On 2014-04-11 20:35, Bandan Das wrote:
>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>
>>>> On 2014-04-11 19:26, Bandan Das wrote:
>>>>> Jan Kiszka <jan.kiszka@siemens.com> writes:
>>>>>
>>>>>> On 2014-04-11 02:27, Bandan Das wrote:
>>>>>>> Marcelo Tosatti <mtosatti@redhat.com> writes:
>>>>>>>
>>>>>>>> On Mon, Mar 31, 2014 at 05:00:23PM -0400, Bandan Das wrote:
>>>>>>>>> For single context invalidation, we fall through to global
>>>>>>>>> invalidation in handle_invept() except for one case - when
>>>>>>>>> the operand supplied by L1 is different from what we have in
>>>>>>>>> vmcs12. However, typically hypervisors will only call invept
>>>>>>>>> for the currently loaded eptp, so the condition will
>>>>>>>>> never be true.
>>>>>>>>>
>>>>>>>>> Signed-off-by: Bandan Das <bsd@redhat.com>
>>>>>>>>
>>>>>>>> Bandan,
>>>>>>>>
>>>>>>>> Why not fix INVEPT single-context rather than removing it entirely?
>>>>>>>>
>>>>>>>> "Single-context. If the INVEPT type is 1, the logical processor
>>>>>>>> invalidates all guest-physical mappings and combined mappings associated
>>>>>>>> with the EP4TA specified in the INVEPT descriptor. Combined mappings for
>>>>>>>> that EP4TA are invalidated for all VPIDs and all PCIDs. (The instruction
>>>>>>>> may invalidate mappings associated with other EP4TAs.)"
>>>>>>>>
>>>>>>>> So just removing the "if (EPTP != CURRENT.EPTP) BREAK" should be enough.
>>>>>>>
>>>>>>> The single context invalidation in handle_invept() doesn't do 
>>>>>>> anything different. It just falls down to the global case.
>>>>>>> And the invept code in Xen and KVM both seemed to fall back
>>>>>>> to global invalidation if support for single context wasn't found.
>>>>>>> So, it was proposed not to advertise it at all.
>>>>>>>
>>>>>>> But rethinking this again, I agree with you. If there's a hypervisor
>>>>>>> with a  single context invept implmentation that does not fallback,
>>>>>>> this will unfortunately not work. Jan, do you agree with this ?
>>>>>>
>>>>>> A hypervisor that doesn't properly check the HW caps is just broken. And
>>>>>> one that mandates single context invalidation support is silly.
>>>>>
>>>>> Well, but we could make life a little bit easier for the unfortunate user
>>>>> using the broken hypervisor :) And advertising single context inavalidation
>>>>> doesn't really seem to have any downsides.
>>>>
>>>> Ok, let's try it this way: single-context invalidation is inherently
>>>> tied to VPID support (that's how you address a context). However, KVM
>>>> does not expose VPID to its guest. So this discussion is mood: no
>>>> hypervisor will make use of this feature as it has no means to fill in
>>>> the required parameter.
>>>
>>> I thought (from the spec) invept single context invalidation
>>> takes the EP4TA as the second argument. invvpid single context
>>> however takes the VPID as its descriptor.
>>
>> Oops, invept/invvpid mess-up while re-reading the spec - sorry.
>>
>>>
>>> The Xen L1 hypervisor was actually calling single context invept
>>> multiple times. That's how I hit this bug.
>>
>> ...and it's no longer doing it now, I suppose. The question remains,
>> which hypervisor we want to cater with a
>> "single-context-that-is-current-context" invalidation (that is my
>> understanding of Marcelo's proposal). 
> 
> My proposal is to implement what is in the spec.
> 
>> On the other hand, if some hypervisor actually uses invept to
>> invalidate a non-current mapping, we would regress compared to not
>> exposing single context invept. Hope I got this conclusion right. ;)
> 
> In that case INVEPT global would also be broken.

I'm all for having a proper invept single context support but that,
first of all, requires tracking the vEPTP->EPTP mappings.

Jan

-- 
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2014-04-14  5:46 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-03-31 21:00 [PATCH v2 0/3] nVMX: Fixes to run Xen as L1 Bandan Das
2014-03-31 21:00 ` [PATCH v2 1/3] KVM: nVMX: Don't advertise single context invalidation for invept Bandan Das
2014-04-10 20:47   ` Marcelo Tosatti
2014-04-11  0:27     ` Bandan Das
2014-04-11  6:22       ` Jan Kiszka
2014-04-11 17:26         ` Bandan Das
2014-04-11 18:01           ` Jan Kiszka
2014-04-11 18:35             ` Bandan Das
2014-04-11 18:53               ` Jan Kiszka
2014-04-11 19:35                 ` Marcelo Tosatti
2014-04-14  5:46                   ` Jan Kiszka
2014-04-11 19:38                 ` Bandan Das
2014-04-11 18:48         ` Marcelo Tosatti
2014-04-11 19:33           ` Bandan Das
2014-04-11 19:02         ` Marcelo Tosatti
2014-03-31 21:00 ` [PATCH v2 2/3] KVM: nVMX: Ack and write vector info to intr_info if L1 asks us to Bandan Das
2014-04-11 18:33   ` Marcelo Tosatti
2014-04-11 19:17     ` Bandan Das
2014-04-11 19:20       ` Marcelo Tosatti
2014-04-12 16:57         ` Paolo Bonzini
2014-03-31 21:00 ` [PATCH v2 3/3] KVM: nVMX: Advertise support for interrupt acknowledgement Bandan Das

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.