All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
@ 2016-06-21  3:12 Bandan Das
  2016-06-21  3:12 ` [RFC PATCH 1/4] mmu: add a boolean to indicate host ept execute only support Bandan Das
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-21  3:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, guangrong.xiao

I spoke to Paolo about this a while back and thought to post this as
RFC while I am thinking of adding some unit tests.

Background: ESX refuses to run as L1 if support for EPT execute only isn't
found. I am not really sure if it uses it for anything since just advertising
the bits seems to work but adding the necessary plumbing seemed like a good idea.

Xiao, I took the liberty of adding you based on "git blame" :)

Thanks in advance.

Bandan Das (4):
  mmu: add a boolean to indicate host ept execute only support
  mmu: Update ept specific valid bit values
  mmu: don't set the present bit unconditionally
  vmx: advertise support for ept execute only

 arch/x86/include/asm/kvm_host.h |  2 +-
 arch/x86/kvm/mmu.c              | 22 +++++++++++++++++-----
 arch/x86/kvm/paging_tmpl.h      |  9 +++++++--
 arch/x86/kvm/vmx.c              |  5 ++++-
 arch/x86/kvm/x86.c              |  2 +-
 5 files changed, 30 insertions(+), 10 deletions(-)

-- 
2.5.5


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [RFC PATCH 1/4] mmu: add a boolean to indicate host ept execute only support
  2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
@ 2016-06-21  3:12 ` Bandan Das
  2016-06-21  3:12 ` [RFC PATCH 2/4] mmu: Update ept specific valid bit values Bandan Das
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-21  3:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, guangrong.xiao

This is passed as the last argument with the call to
kvm_mmu_set_mask_ptes. This is an easier alternative to
FNAMing is_shadow_pte_present().

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/include/asm/kvm_host.h | 2 +-
 arch/x86/kvm/mmu.c              | 6 +++++-
 arch/x86/kvm/vmx.c              | 3 ++-
 arch/x86/kvm/x86.c              | 2 +-
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 69e62862..791f7e2 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1026,7 +1026,7 @@ void kvm_mmu_setup(struct kvm_vcpu *vcpu);
 void kvm_mmu_init_vm(struct kvm *kvm);
 void kvm_mmu_uninit_vm(struct kvm *kvm);
 void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
-		u64 dirty_mask, u64 nx_mask, u64 x_mask);
+		u64 dirty_mask, u64 nx_mask, u64 x_mask, bool xonly);
 
 void kvm_mmu_reset_context(struct kvm_vcpu *vcpu);
 void kvm_mmu_slot_remove_write_access(struct kvm *kvm,
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index def97b3..37b01b1 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -175,6 +175,8 @@ static u64 __read_mostly shadow_user_mask;
 static u64 __read_mostly shadow_accessed_mask;
 static u64 __read_mostly shadow_dirty_mask;
 static u64 __read_mostly shadow_mmio_mask;
+/* true if processor supports execute only ept mappings */
+static bool __read_mostly shadow_xonly_valid;
 
 static void mmu_spte_set(u64 *sptep, u64 spte);
 static void mmu_free_roots(struct kvm_vcpu *vcpu);
@@ -282,13 +284,15 @@ static bool check_mmio_spte(struct kvm_vcpu *vcpu, u64 spte)
 }
 
 void kvm_mmu_set_mask_ptes(u64 user_mask, u64 accessed_mask,
-		u64 dirty_mask, u64 nx_mask, u64 x_mask)
+		u64 dirty_mask, u64 nx_mask, u64 x_mask,
+		bool xonly_valid)
 {
 	shadow_user_mask = user_mask;
 	shadow_accessed_mask = accessed_mask;
 	shadow_dirty_mask = dirty_mask;
 	shadow_nx_mask = nx_mask;
 	shadow_x_mask = x_mask;
+	shadow_xonly_valid = xonly_valid;
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_set_mask_ptes);
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 003618e..dbfc000 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -6369,7 +6369,8 @@ static __init int hardware_setup(void)
 		kvm_mmu_set_mask_ptes(0ull,
 			(enable_ept_ad_bits) ? VMX_EPT_ACCESS_BIT : 0ull,
 			(enable_ept_ad_bits) ? VMX_EPT_DIRTY_BIT : 0ull,
-			0ull, VMX_EPT_EXECUTABLE_MASK);
+			0ull, VMX_EPT_EXECUTABLE_MASK,
+			cpu_has_vmx_ept_execute_only() ? true : false);
 		ept_set_mmio_spte_mask();
 		kvm_enable_tdp();
 	} else
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 902d9da..a098d11 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5873,7 +5873,7 @@ int kvm_arch_init(void *opaque)
 	kvm_x86_ops = ops;
 
 	kvm_mmu_set_mask_ptes(PT_USER_MASK, PT_ACCESSED_MASK,
-			PT_DIRTY_MASK, PT64_NX_MASK, 0);
+			PT_DIRTY_MASK, PT64_NX_MASK, 0, false);
 
 	kvm_timer_init();
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 2/4] mmu: Update ept specific valid bit values
  2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
  2016-06-21  3:12 ` [RFC PATCH 1/4] mmu: add a boolean to indicate host ept execute only support Bandan Das
@ 2016-06-21  3:12 ` Bandan Das
  2016-06-21  8:08   ` Paolo Bonzini
  2016-06-22  4:09   ` Xiao Guangrong
  2016-06-21  3:12 ` [RFC PATCH 3/4] mmu: don't set the present bit unconditionally Bandan Das
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-21  3:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, guangrong.xiao

Until now , is_present_gpte checks for all bits set (XWR) and
is_shadow_present_pte() checks for the present bit set. To support
execute only mappings we should teach these functions to distinguish 100
as valid based on host support.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/mmu.c         | 5 ++++-
 arch/x86/kvm/paging_tmpl.h | 7 ++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 37b01b1..57d8696 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -308,7 +308,10 @@ static int is_nx(struct kvm_vcpu *vcpu)
 
 static int is_shadow_present_pte(u64 pte)
 {
-	return pte & PT_PRESENT_MASK && !is_mmio_spte(pte);
+	int xbit = shadow_xonly_valid ? pte & shadow_x_mask : 0;
+
+	return (pte & PT_PRESENT_MASK) | xbit
+		&& !is_mmio_spte(pte);
 }
 
 static int is_large_pte(u64 pte)
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index bc019f7..9f5bd06 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -133,7 +133,12 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
 #if PTTYPE != PTTYPE_EPT
 	return is_present_gpte(pte);
 #else
-	return pte & 7;
+	/*
+	 * For EPT, bits [2:0] can be 001, 100 or 111
+	 * Further, for 100, logical processor should support
+	 * execute-only.
+	 */
+	return (pte & 1) || (shadow_xonly_valid && (pte & 4));
 #endif
 }
 
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
  2016-06-21  3:12 ` [RFC PATCH 1/4] mmu: add a boolean to indicate host ept execute only support Bandan Das
  2016-06-21  3:12 ` [RFC PATCH 2/4] mmu: Update ept specific valid bit values Bandan Das
@ 2016-06-21  3:12 ` Bandan Das
  2016-06-21  8:12   ` Paolo Bonzini
  2016-06-22  4:30   ` Xiao Guangrong
  2016-06-21  3:12 ` [RFC PATCH 4/4] vmx: advertise support for ept execute only Bandan Das
  2016-06-22  4:39 ` [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Xiao Guangrong
  4 siblings, 2 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-21  3:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, guangrong.xiao

To support execute only mappings on behalf of L1 hypervisors,
we teach set_spte to honor L1's valid XWR bits. This is only
if host supports EPT execute only. Use ACC_USER_MASK to signify
if the L1 hypervisor has the present bit set.

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/mmu.c         | 11 ++++++++---
 arch/x86/kvm/paging_tmpl.h |  2 +-
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 57d8696..3ca1a99 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -2528,7 +2528,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
 		return 0;
 
-	spte = PT_PRESENT_MASK;
+	if (!shadow_xonly_valid)
+		spte = PT_PRESENT_MASK;
 	if (!speculative)
 		spte |= shadow_accessed_mask;
 
@@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
 	else
 		spte |= shadow_nx_mask;
 
-	if (pte_access & ACC_USER_MASK)
-		spte |= shadow_user_mask;
+	if (pte_access & ACC_USER_MASK) {
+		if (shadow_xonly_valid)
+			spte |= PT_PRESENT_MASK;
+		else
+			spte |= shadow_user_mask;
+	}
 
 	if (level > PT_PAGE_TABLE_LEVEL)
 		spte |= PT_PAGE_SIZE_MASK;
diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
index 9f5bd06..5366a55 100644
--- a/arch/x86/kvm/paging_tmpl.h
+++ b/arch/x86/kvm/paging_tmpl.h
@@ -192,7 +192,7 @@ static inline unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, u64 gpte)
 #if PTTYPE == PTTYPE_EPT
 	access = ((gpte & VMX_EPT_WRITABLE_MASK) ? ACC_WRITE_MASK : 0) |
 		((gpte & VMX_EPT_EXECUTABLE_MASK) ? ACC_EXEC_MASK : 0) |
-		ACC_USER_MASK;
+		((gpte & VMX_EPT_READABLE_MASK) ? ACC_USER_MASK : 0);
 #else
 	BUILD_BUG_ON(ACC_EXEC_MASK != PT_PRESENT_MASK);
 	BUILD_BUG_ON(ACC_EXEC_MASK != 1);
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [RFC PATCH 4/4] vmx: advertise support for ept execute only
  2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
                   ` (2 preceding siblings ...)
  2016-06-21  3:12 ` [RFC PATCH 3/4] mmu: don't set the present bit unconditionally Bandan Das
@ 2016-06-21  3:12 ` Bandan Das
  2016-06-22  4:39 ` [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Xiao Guangrong
  4 siblings, 0 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-21  3:12 UTC (permalink / raw)
  To: kvm; +Cc: pbonzini, guangrong.xiao

With the internal plumbing in place, advertise support to the L1
hypervisor

Signed-off-by: Bandan Das <bsd@redhat.com>
---
 arch/x86/kvm/vmx.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dbfc000..5157ea5 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -2717,6 +2717,8 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 		vmx->nested.nested_vmx_ept_caps = VMX_EPT_PAGE_WALK_4_BIT |
 			 VMX_EPTP_WB_BIT | VMX_EPT_2MB_PAGE_BIT |
 			 VMX_EPT_INVEPT_BIT;
+		if (cpu_has_vmx_ept_execute_only())
+			vmx->nested.nested_vmx_ept_caps |= VMX_EPT_EXECUTE_ONLY_BIT;
 		vmx->nested.nested_vmx_ept_caps &= vmx_capability.ept;
 		/*
 		 * For nested guests, we don't do anything specific
-- 
2.5.5


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] mmu: Update ept specific valid bit values
  2016-06-21  3:12 ` [RFC PATCH 2/4] mmu: Update ept specific valid bit values Bandan Das
@ 2016-06-21  8:08   ` Paolo Bonzini
  2016-06-22  4:09   ` Xiao Guangrong
  1 sibling, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-21  8:08 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: guangrong.xiao



On 21/06/2016 05:12, Bandan Das wrote:
>  
>  static int is_shadow_present_pte(u64 pte)
>  {
> -	return pte & PT_PRESENT_MASK && !is_mmio_spte(pte);
> +	int xbit = shadow_xonly_valid ? pte & shadow_x_mask : 0;
> +
> +	return (pte & PT_PRESENT_MASK) | xbit
> +		&& !is_mmio_spte(pte);
>  }

You can change this to just:

	return
	    (pte & (PT_PRESENT_MASK | shadow_x_mask)) &&
	    !is_mmio_spte(pte);

because you know this is only called on a valid spte.  Reserved bits are
checked elsewhere (and shouldn't be there anyway!).

>  static int is_large_pte(u64 pte)
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index bc019f7..9f5bd06 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -133,7 +133,12 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
>  #if PTTYPE != PTTYPE_EPT
>  	return is_present_gpte(pte);
>  #else
> -	return pte & 7;
> +	/*
> +	 * For EPT, bits [2:0] can be 001, 100 or 111
> +	 * Further, for 100, logical processor should support
> +	 * execute-only.
> +	 */
> +	return (pte & 1) || (shadow_xonly_valid && (pte & 4));
>  #endif

Likewise, is_present_gpte is always followed by an is_rsvd_bits_set
check, which does check for execute-only.  Modify
reset_tdp_shadow_zero_bits_mask to pass the right execonly value to
__reset_rsvds_bits_mask_ept, and you can leave is_present_gpte as is.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-21  3:12 ` [RFC PATCH 3/4] mmu: don't set the present bit unconditionally Bandan Das
@ 2016-06-21  8:12   ` Paolo Bonzini
  2016-06-21  8:40     ` Paolo Bonzini
  2016-06-22 16:10     ` Bandan Das
  2016-06-22  4:30   ` Xiao Guangrong
  1 sibling, 2 replies; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-21  8:12 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: guangrong.xiao



On 21/06/2016 05:12, Bandan Das wrote:
> To support execute only mappings on behalf of L1 hypervisors,
> we teach set_spte to honor L1's valid XWR bits. This is only
> if host supports EPT execute only. Use ACC_USER_MASK to signify
> if the L1 hypervisor has the present bit set.

has the "R" bit set.

> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>  arch/x86/kvm/mmu.c         | 11 ++++++++---
>  arch/x86/kvm/paging_tmpl.h |  2 +-
>  2 files changed, 9 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 57d8696..3ca1a99 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2528,7 +2528,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
>  		return 0;
>  
> -	spte = PT_PRESENT_MASK;
> +	if (!shadow_xonly_valid)
> +		spte = PT_PRESENT_MASK;
>  	if (!speculative)
>  		spte |= shadow_accessed_mask;
>  
> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  	else
>  		spte |= shadow_nx_mask;
>  
> -	if (pte_access & ACC_USER_MASK)
> -		spte |= shadow_user_mask;
> +	if (pte_access & ACC_USER_MASK) {
> +		if (shadow_xonly_valid)
> +			spte |= PT_PRESENT_MASK;
> +		else
> +			spte |= shadow_user_mask;
> +	}

Can you instead pass VMX_READABLE_MASK to kvm_mmu_set_mask_ptes in vmx.c?
>  
>  	if (level > PT_PAGE_TABLE_LEVEL)
>  		spte |= PT_PAGE_SIZE_MASK;
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index 9f5bd06..5366a55 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -192,7 +192,7 @@ static inline unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, u64 gpte)
>  #if PTTYPE == PTTYPE_EPT
>  	access = ((gpte & VMX_EPT_WRITABLE_MASK) ? ACC_WRITE_MASK : 0) |
>  		((gpte & VMX_EPT_EXECUTABLE_MASK) ? ACC_EXEC_MASK : 0) |
> -		ACC_USER_MASK;
> +		((gpte & VMX_EPT_READABLE_MASK) ? ACC_USER_MASK : 0);

This part is fine.

Paolo

>  #else
>  	BUILD_BUG_ON(ACC_EXEC_MASK != PT_PRESENT_MASK);
>  	BUILD_BUG_ON(ACC_EXEC_MASK != 1);
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-21  8:12   ` Paolo Bonzini
@ 2016-06-21  8:40     ` Paolo Bonzini
  2016-06-22 16:10     ` Bandan Das
  1 sibling, 0 replies; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-21  8:40 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: guangrong.xiao



On 21/06/2016 10:12, Paolo Bonzini wrote:
>>  		return 0;
>>  
>> -	spte = PT_PRESENT_MASK;
>> +	if (!shadow_xonly_valid)
>> +		spte = PT_PRESENT_MASK;

I forgot to note that you need an "spte = 0;" here.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 2/4] mmu: Update ept specific valid bit values
  2016-06-21  3:12 ` [RFC PATCH 2/4] mmu: Update ept specific valid bit values Bandan Das
  2016-06-21  8:08   ` Paolo Bonzini
@ 2016-06-22  4:09   ` Xiao Guangrong
  1 sibling, 0 replies; 23+ messages in thread
From: Xiao Guangrong @ 2016-06-22  4:09 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: pbonzini



On 06/21/2016 11:12 AM, Bandan Das wrote:
> Until now , is_present_gpte checks for all bits set (XWR) and
> is_shadow_present_pte() checks for the present bit set. To support
> execute only mappings we should teach these functions to distinguish 100
> as valid based on host support.
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>   arch/x86/kvm/mmu.c         | 5 ++++-
>   arch/x86/kvm/paging_tmpl.h | 7 ++++++-
>   2 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 37b01b1..57d8696 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -308,7 +308,10 @@ static int is_nx(struct kvm_vcpu *vcpu)
>
>   static int is_shadow_present_pte(u64 pte)
>   {
> -	return pte & PT_PRESENT_MASK && !is_mmio_spte(pte);
> +	int xbit = shadow_xonly_valid ? pte & shadow_x_mask : 0;
> +
> +	return (pte & PT_PRESENT_MASK) | xbit
> +		&& !is_mmio_spte(pte);

It could be simply use pte & (PT_PRESENT_MASK | shadow_x_mask) instead,
as this is host controlled page table and we never clear single
PT_PRESENT_MASK bit (we only fully clear low 32 bit).

>   }
>
>   static int is_large_pte(u64 pte)
> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
> index bc019f7..9f5bd06 100644
> --- a/arch/x86/kvm/paging_tmpl.h
> +++ b/arch/x86/kvm/paging_tmpl.h
> @@ -133,7 +133,12 @@ static inline int FNAME(is_present_gpte)(unsigned long pte)
>   #if PTTYPE != PTTYPE_EPT
>   	return is_present_gpte(pte);
>   #else
> -	return pte & 7;
> +	/*
> +	 * For EPT, bits [2:0] can be 001, 100 or 111
> +	 * Further, for 100, logical processor should support
> +	 * execute-only.
> +	 */
> +	return (pte & 1) || (shadow_xonly_valid && (pte & 4));

No.

For !shadow_xonly_valid guest, 100b on gpte can not pass FNAME(is_present_gpte)
so that we can get a error code without RSVD, finally a EPT-violation is injected
rather than EPT-misconfig.

Actually, current code handles shadow_xonly_valid very well, please refer to
kvm_init_shadow_ept_mmu(), there already has handled the case of 'execonly'.



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-21  3:12 ` [RFC PATCH 3/4] mmu: don't set the present bit unconditionally Bandan Das
  2016-06-21  8:12   ` Paolo Bonzini
@ 2016-06-22  4:30   ` Xiao Guangrong
  2016-06-22 16:21     ` Bandan Das
  1 sibling, 1 reply; 23+ messages in thread
From: Xiao Guangrong @ 2016-06-22  4:30 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: pbonzini



On 06/21/2016 11:12 AM, Bandan Das wrote:
> To support execute only mappings on behalf of L1 hypervisors,
> we teach set_spte to honor L1's valid XWR bits. This is only
> if host supports EPT execute only. Use ACC_USER_MASK to signify
> if the L1 hypervisor has the present bit set.
>
> Signed-off-by: Bandan Das <bsd@redhat.com>
> ---
>   arch/x86/kvm/mmu.c         | 11 ++++++++---
>   arch/x86/kvm/paging_tmpl.h |  2 +-
>   2 files changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 57d8696..3ca1a99 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2528,7 +2528,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>   	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
>   		return 0;
>
> -	spte = PT_PRESENT_MASK;
> +	if (!shadow_xonly_valid)
> +		spte = PT_PRESENT_MASK;

The xonly info can be fetched from vcpu->mmu. shadow_xonly_valid looks like
can be dropped.

>   	if (!speculative)
>   		spte |= shadow_accessed_mask;
>
> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>   	else
>   		spte |= shadow_nx_mask;
>
> -	if (pte_access & ACC_USER_MASK)
> -		spte |= shadow_user_mask;
> +	if (pte_access & ACC_USER_MASK) {
> +		if (shadow_xonly_valid)
> +			spte |= PT_PRESENT_MASK;
> +		else
> +			spte |= shadow_user_mask;
> +	}

It can be simplified by setting shadow_user_mask to PT_PRESENT_MASK
if ept enabled.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
                   ` (3 preceding siblings ...)
  2016-06-21  3:12 ` [RFC PATCH 4/4] vmx: advertise support for ept execute only Bandan Das
@ 2016-06-22  4:39 ` Xiao Guangrong
  2016-06-22 16:34   ` Bandan Das
  4 siblings, 1 reply; 23+ messages in thread
From: Xiao Guangrong @ 2016-06-22  4:39 UTC (permalink / raw)
  To: Bandan Das, kvm; +Cc: pbonzini



On 06/21/2016 11:12 AM, Bandan Das wrote:
> I spoke to Paolo about this a while back and thought to post this as
> RFC while I am thinking of adding some unit tests.
>
> Background: ESX refuses to run as L1 if support for EPT execute only isn't
> found. I am not really sure if it uses it for anything since just advertising
> the bits seems to work but adding the necessary plumbing seemed like a good idea.
>
> Xiao, I took the liberty of adding you based on "git blame" :)

Thank you, Bandan! I really hope all mmu related patches can be CCed to me
in case if i am too busy to watch patches in kvm mail list. :)

Some points are missed in this patchset:
1) You also need to tech is_present_gpte() about this fact.
2) update_permission_bitmask() need to be fixed as it always expects
    that read-access is available (actually, read-access is equal with
    user-access if ept is shadowed).

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-21  8:12   ` Paolo Bonzini
  2016-06-21  8:40     ` Paolo Bonzini
@ 2016-06-22 16:10     ` Bandan Das
  2016-06-22 16:12       ` Paolo Bonzini
  1 sibling, 1 reply; 23+ messages in thread
From: Bandan Das @ 2016-06-22 16:10 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, guangrong.xiao

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 21/06/2016 05:12, Bandan Das wrote:
>> To support execute only mappings on behalf of L1 hypervisors,
>> we teach set_spte to honor L1's valid XWR bits. This is only
>> if host supports EPT execute only. Use ACC_USER_MASK to signify
>> if the L1 hypervisor has the present bit set.
>
> has the "R" bit set.

Yep, noted!

>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>  arch/x86/kvm/mmu.c         | 11 ++++++++---
>>  arch/x86/kvm/paging_tmpl.h |  2 +-
>>  2 files changed, 9 insertions(+), 4 deletions(-)
...
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -2528,7 +2528,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>  	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
>>  		return 0;
>>  
>> -	spte = PT_PRESENT_MASK;
>> +	if (!shadow_xonly_valid)
>> +		spte = PT_PRESENT_MASK;
>>  	if (!speculative)
>>  		spte |= shadow_accessed_mask;
>>  
>> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>  	else
>>  		spte |= shadow_nx_mask;
>>  
>> -	if (pte_access & ACC_USER_MASK)
>> -		spte |= shadow_user_mask;
>> +	if (pte_access & ACC_USER_MASK) {
>> +		if (shadow_xonly_valid)
>> +			spte |= PT_PRESENT_MASK;
>> +		else
>> +			spte |= shadow_user_mask;
>> +	}
>
> Can you instead pass VMX_READABLE_MASK to kvm_mmu_set_mask_ptes in vmx.c?

So, leave spte = PT_PRESENT_MASK as is and make VMX_READABLE_MASK 1ULL if
execute only is supported ?
And then :
if !(pte_access & ACC_USER_MASK) {
       spte &= ~VMX_READABLE_MASK 

       
>>  	if (level > PT_PAGE_TABLE_LEVEL)
>>  		spte |= PT_PAGE_SIZE_MASK;
>> diff --git a/arch/x86/kvm/paging_tmpl.h b/arch/x86/kvm/paging_tmpl.h
>> index 9f5bd06..5366a55 100644
>> --- a/arch/x86/kvm/paging_tmpl.h
>> +++ b/arch/x86/kvm/paging_tmpl.h
>> @@ -192,7 +192,7 @@ static inline unsigned FNAME(gpte_access)(struct kvm_vcpu *vcpu, u64 gpte)
>>  #if PTTYPE == PTTYPE_EPT
>>  	access = ((gpte & VMX_EPT_WRITABLE_MASK) ? ACC_WRITE_MASK : 0) |
>>  		((gpte & VMX_EPT_EXECUTABLE_MASK) ? ACC_EXEC_MASK : 0) |
>> -		ACC_USER_MASK;
>> +		((gpte & VMX_EPT_READABLE_MASK) ? ACC_USER_MASK : 0);
>
> This part is fine.
>
> Paolo
>
>>  #else
>>  	BUILD_BUG_ON(ACC_EXEC_MASK != PT_PRESENT_MASK);
>>  	BUILD_BUG_ON(ACC_EXEC_MASK != 1);
>> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-22 16:10     ` Bandan Das
@ 2016-06-22 16:12       ` Paolo Bonzini
  2016-06-22 16:23         ` Bandan Das
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-22 16:12 UTC (permalink / raw)
  To: Bandan Das; +Cc: kvm, guangrong.xiao



On 22/06/2016 18:10, Bandan Das wrote:
>>> >> +	if (!shadow_xonly_valid)
>>> >> +		spte = PT_PRESENT_MASK;
>>> >>  	if (!speculative)
>>> >>  		spte |= shadow_accessed_mask;
>>> >>  
>>> >> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>> >>  	else
>>> >>  		spte |= shadow_nx_mask;
>>> >>  
>>> >> -	if (pte_access & ACC_USER_MASK)
>>> >> -		spte |= shadow_user_mask;
>>> >> +	if (pte_access & ACC_USER_MASK) {
>>> >> +		if (shadow_xonly_valid)
>>> >> +			spte |= PT_PRESENT_MASK;
>>> >> +		else
>>> >> +			spte |= shadow_user_mask;
>>> >> +	}
>> >
>> > Can you instead pass VMX_READABLE_MASK to kvm_mmu_set_mask_ptes in vmx.c?
> So, leave spte = PT_PRESENT_MASK as is and make VMX_READABLE_MASK 1ULL if
> execute only is supported ?
> And then :
> if !(pte_access & ACC_USER_MASK) {
>        spte &= ~VMX_READABLE_MASK 
> 

No, I meant something like

	spte = 0;
	if (!shadow_xonly_valid)
		spte = PT_PRESENT_MASK;
	...
	if (pte_access & ACC_USER_MASK)
		spte |= shadow_user_mask;

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-22  4:30   ` Xiao Guangrong
@ 2016-06-22 16:21     ` Bandan Das
  0 siblings, 0 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-22 16:21 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: kvm, pbonzini

Xiao Guangrong <guangrong.xiao@linux.intel.com> writes:

> On 06/21/2016 11:12 AM, Bandan Das wrote:
>> To support execute only mappings on behalf of L1 hypervisors,
>> we teach set_spte to honor L1's valid XWR bits. This is only
>> if host supports EPT execute only. Use ACC_USER_MASK to signify
>> if the L1 hypervisor has the present bit set.
>>
>> Signed-off-by: Bandan Das <bsd@redhat.com>
>> ---
>>   arch/x86/kvm/mmu.c         | 11 ++++++++---
>>   arch/x86/kvm/paging_tmpl.h |  2 +-
>>   2 files changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 57d8696..3ca1a99 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -2528,7 +2528,8 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>   	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
>>   		return 0;
>>
>> -	spte = PT_PRESENT_MASK;
>> +	if (!shadow_xonly_valid)
>> +		spte = PT_PRESENT_MASK;
>
> The xonly info can be fetched from vcpu->mmu. shadow_xonly_valid looks like
> can be dropped.

I added shadow_xonly_valid mainly for is_shadow_present_pte and since it seems
it isn't needed there, I will drop it.

>>   	if (!speculative)
>>   		spte |= shadow_accessed_mask;
>>
>> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>   	else
>>   		spte |= shadow_nx_mask;
>>
>> -	if (pte_access & ACC_USER_MASK)
>> -		spte |= shadow_user_mask;
>> +	if (pte_access & ACC_USER_MASK) {
>> +		if (shadow_xonly_valid)
>> +			spte |= PT_PRESENT_MASK;
>> +		else
>> +			spte |= shadow_user_mask;
>> +	}
>
> It can be simplified by setting shadow_user_mask to PT_PRESENT_MASK
> if ept enabled.

Ok, sounds good. 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 3/4] mmu: don't set the present bit unconditionally
  2016-06-22 16:12       ` Paolo Bonzini
@ 2016-06-22 16:23         ` Bandan Das
  0 siblings, 0 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-22 16:23 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: kvm, guangrong.xiao

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 22/06/2016 18:10, Bandan Das wrote:
>>>> >> +	if (!shadow_xonly_valid)
>>>> >> +		spte = PT_PRESENT_MASK;
>>>> >>  	if (!speculative)
>>>> >>  		spte |= shadow_accessed_mask;
>>>> >>  
>>>> >> @@ -2537,8 +2538,12 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>>> >>  	else
>>>> >>  		spte |= shadow_nx_mask;
>>>> >>  
>>>> >> -	if (pte_access & ACC_USER_MASK)
>>>> >> -		spte |= shadow_user_mask;
>>>> >> +	if (pte_access & ACC_USER_MASK) {
>>>> >> +		if (shadow_xonly_valid)
>>>> >> +			spte |= PT_PRESENT_MASK;
>>>> >> +		else
>>>> >> +			spte |= shadow_user_mask;
>>>> >> +	}
>>> >
>>> > Can you instead pass VMX_READABLE_MASK to kvm_mmu_set_mask_ptes in vmx.c?
>> So, leave spte = PT_PRESENT_MASK as is and make VMX_READABLE_MASK 1ULL if
>> execute only is supported ?
>> And then :
>> if !(pte_access & ACC_USER_MASK) {
>>        spte &= ~VMX_READABLE_MASK 
>> 
>
> No, I meant something like
>
> 	spte = 0;
> 	if (!shadow_xonly_valid)
> 		spte = PT_PRESENT_MASK;
> 	...
> 	if (pte_access & ACC_USER_MASK)
> 		spte |= shadow_user_mask;

Ok, Xiao mentioned this too. I will fix it in the next version.
Thanks for the review.

> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-22  4:39 ` [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Xiao Guangrong
@ 2016-06-22 16:34   ` Bandan Das
  2016-06-23  5:03     ` Xiao Guangrong
  0 siblings, 1 reply; 23+ messages in thread
From: Bandan Das @ 2016-06-22 16:34 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: kvm, pbonzini

Xiao Guangrong <guangrong.xiao@linux.intel.com> writes:

> On 06/21/2016 11:12 AM, Bandan Das wrote:
>> I spoke to Paolo about this a while back and thought to post this as
>> RFC while I am thinking of adding some unit tests.
>>
>> Background: ESX refuses to run as L1 if support for EPT execute only isn't
>> found. I am not really sure if it uses it for anything since just advertising
>> the bits seems to work but adding the necessary plumbing seemed like a good idea.
>>
>> Xiao, I took the liberty of adding you based on "git blame" :)
>
> Thank you, Bandan! I really hope all mmu related patches can be CCed to me
> in case if i am too busy to watch patches in kvm mail list. :)

We all try but it's easy to miss interested people in the cc list :)

> Some points are missed in this patchset:
> 1) You also need to tech is_present_gpte() about this fact.

If I understood right, this isn't needed anymore since a check
for rsvd bits follows and is_present_gpte already does pte & 7.

> 2) update_permission_bitmask() need to be fixed as it always expects
>    that read-access is available (actually, read-access is equal with
>    user-access if ept is shadowed).
Do you mean this part of update_permission_bitmask() ?
...
} else
  	/* Not really needed: no U/S accesses on ept  */
	u = 1;
        

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-22 16:34   ` Bandan Das
@ 2016-06-23  5:03     ` Xiao Guangrong
  2016-06-24  2:56       ` Bandan Das
  0 siblings, 1 reply; 23+ messages in thread
From: Xiao Guangrong @ 2016-06-23  5:03 UTC (permalink / raw)
  To: Bandan Das; +Cc: kvm, pbonzini



On 06/23/2016 12:34 AM, Bandan Das wrote:
> Xiao Guangrong <guangrong.xiao@linux.intel.com> writes:
>
>> On 06/21/2016 11:12 AM, Bandan Das wrote:
>>> I spoke to Paolo about this a while back and thought to post this as
>>> RFC while I am thinking of adding some unit tests.
>>>
>>> Background: ESX refuses to run as L1 if support for EPT execute only isn't
>>> found. I am not really sure if it uses it for anything since just advertising
>>> the bits seems to work but adding the necessary plumbing seemed like a good idea.
>>>
>>> Xiao, I took the liberty of adding you based on "git blame" :)
>>
>> Thank you, Bandan! I really hope all mmu related patches can be CCed to me
>> in case if i am too busy to watch patches in kvm mail list. :)
>
> We all try but it's easy to miss interested people in the cc list :)
>
>> Some points are missed in this patchset:
>> 1) You also need to tech is_present_gpte() about this fact.
>
> If I understood right, this isn't needed anymore since a check
> for rsvd bits follows and is_present_gpte already does pte & 7.

is_present_gpte() and FNAME(is_present_gpte) are different paths.

Currently, it is safe as is_present_gpte() is only used for 32 bit
guests, however, making it consistent can avoid potential bugs
in the further.

>
>> 2) update_permission_bitmask() need to be fixed as it always expects
>>     that read-access is available (actually, read-access is equal with
>>     user-access if ept is shadowed).
> Do you mean this part of update_permission_bitmask() ?
> ...
> } else
>    	/* Not really needed: no U/S accesses on ept  */
> 	u = 1;

Yes.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-23  5:03     ` Xiao Guangrong
@ 2016-06-24  2:56       ` Bandan Das
  2016-06-24  4:22         ` Paolo Bonzini
  0 siblings, 1 reply; 23+ messages in thread
From: Bandan Das @ 2016-06-24  2:56 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: kvm, pbonzini

Xiao Guangrong <guangrong.xiao@linux.intel.com> writes:

> On 06/23/2016 12:34 AM, Bandan Das wrote:
>> Xiao Guangrong <guangrong.xiao@linux.intel.com> writes:
>>
>>> On 06/21/2016 11:12 AM, Bandan Das wrote:
>>>> I spoke to Paolo about this a while back and thought to post this as
>>>> RFC while I am thinking of adding some unit tests.
>>>>
>>>> Background: ESX refuses to run as L1 if support for EPT execute only isn't
>>>> found. I am not really sure if it uses it for anything since just advertising
>>>> the bits seems to work but adding the necessary plumbing seemed like a good idea.
>>>>
>>>> Xiao, I took the liberty of adding you based on "git blame" :)
>>>
>>> Thank you, Bandan! I really hope all mmu related patches can be CCed to me
>>> in case if i am too busy to watch patches in kvm mail list. :)
>>
>> We all try but it's easy to miss interested people in the cc list :)
>>
>>> Some points are missed in this patchset:
>>> 1) You also need to tech is_present_gpte() about this fact.
>>
>> If I understood right, this isn't needed anymore since a check
>> for rsvd bits follows and is_present_gpte already does pte & 7.
>
> is_present_gpte() and FNAME(is_present_gpte) are different paths.
>
> Currently, it is safe as is_present_gpte() is only used for 32 bit
> guests, however, making it consistent can avoid potential bugs
> in the further.

I am tempted to remove the FNAME version altogether and change is_present_gpte()
to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
will take care of all cases. Hope I am not missing something :)

>>
>>> 2) update_permission_bitmask() need to be fixed as it always expects
>>>     that read-access is available (actually, read-access is equal with
>>>     user-access if ept is shadowed).
>> Do you mean this part of update_permission_bitmask() ?
>> ...
>> } else
>>    	/* Not really needed: no U/S accesses on ept  */
>> 	u = 1;
>
> Yes.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-24  2:56       ` Bandan Das
@ 2016-06-24  4:22         ` Paolo Bonzini
  2016-06-24  4:50           ` Bandan Das
  0 siblings, 1 reply; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-24  4:22 UTC (permalink / raw)
  To: Bandan Das; +Cc: Xiao Guangrong, kvm


> I am tempted to remove the FNAME version altogether and change
> is_present_gpte()
> to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
> will take care of all cases. Hope I am not missing something :)

Please rename the non-FNAME version to pae_is_present_pdpte or just inline
it in the two callers.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-24  4:22         ` Paolo Bonzini
@ 2016-06-24  4:50           ` Bandan Das
  2016-06-24  6:46             ` Paolo Bonzini
  0 siblings, 1 reply; 23+ messages in thread
From: Bandan Das @ 2016-06-24  4:50 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Xiao Guangrong, kvm

Paolo Bonzini <pbonzini@redhat.com> writes:

>> I am tempted to remove the FNAME version altogether and change
>> is_present_gpte()
>> to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
>> will take care of all cases. Hope I am not missing something :)
>
> Please rename the non-FNAME version to pae_is_present_pdpte or just inline
> it in the two callers.

I am still not sure why the FNAME version is needed, specifically the PTTYPE_EPT
specific check. Why can't we  just check for execonly and the corresponding
bit in the non FNAME version ?

> Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-24  4:50           ` Bandan Das
@ 2016-06-24  6:46             ` Paolo Bonzini
  2016-06-24  7:02               ` Xiao Guangrong
  2016-06-24 16:19               ` Bandan Das
  0 siblings, 2 replies; 23+ messages in thread
From: Paolo Bonzini @ 2016-06-24  6:46 UTC (permalink / raw)
  To: Bandan Das; +Cc: Xiao Guangrong, kvm



On 24/06/2016 06:50, Bandan Das wrote:
>>> >> I am tempted to remove the FNAME version altogether and change
>>> >> is_present_gpte()
>>> >> to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
>>> >> will take care of all cases. Hope I am not missing something :)
>> >
>> > Please rename the non-FNAME version to pae_is_present_pdpte or just inline
>> > it in the two callers.
> I am still not sure why the FNAME version is needed, specifically the PTTYPE_EPT
> specific check. Why can't we  just check for execonly and the corresponding
> bit in the non FNAME version ?

The FNAME version encodes the right semantics:

1) for non-EPT page tables, bit 1 indicates present.

2) for EPT page tables, setting any bit 0-2 indicates present.

It's perfectly fine to find that a PTE is invalid *after* judging that
it is present.  There is no difference between setting bit 51 on a PTE,
and having the invalid -W- combination on an EPT page table entry.  In
both cases, the page is present but the PTE is invalid.

The SDM is very clear about it: "If bits 2:0 of an EPT paging-structure
entry are all 0, the entry is not present".  So if bits 2:0 are -W- the
entry is present but invalid, and causes an EPT misconfiguration vmexit.
 A non-present entry will cause an EPT violation instead.

Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-24  6:46             ` Paolo Bonzini
@ 2016-06-24  7:02               ` Xiao Guangrong
  2016-06-24 16:19               ` Bandan Das
  1 sibling, 0 replies; 23+ messages in thread
From: Xiao Guangrong @ 2016-06-24  7:02 UTC (permalink / raw)
  To: Paolo Bonzini, Bandan Das; +Cc: kvm



On 06/24/2016 02:46 PM, Paolo Bonzini wrote:
>
>
> On 24/06/2016 06:50, Bandan Das wrote:
>>>>>> I am tempted to remove the FNAME version altogether and change
>>>>>> is_present_gpte()
>>>>>> to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
>>>>>> will take care of all cases. Hope I am not missing something :)
>>>>
>>>> Please rename the non-FNAME version to pae_is_present_pdpte or just inline
>>>> it in the two callers.
>> I am still not sure why the FNAME version is needed, specifically the PTTYPE_EPT
>> specific check. Why can't we  just check for execonly and the corresponding
>> bit in the non FNAME version ?
>
> The FNAME version encodes the right semantics:
>
> 1) for non-EPT page tables, bit 1 indicates present.
>
> 2) for EPT page tables, setting any bit 0-2 indicates present.
>
> It's perfectly fine to find that a PTE is invalid *after* judging that
> it is present.  There is no difference between setting bit 51 on a PTE,
> and having the invalid -W- combination on an EPT page table entry.  In
> both cases, the page is present but the PTE is invalid.
>
> The SDM is very clear about it: "If bits 2:0 of an EPT paging-structure
> entry are all 0, the entry is not present".  So if bits 2:0 are -W- the
> entry is present but invalid, and causes an EPT misconfiguration vmexit.
>   A non-present entry will cause an EPT violation instead.

Yes, i agree. Actually i have pointed it out in my previous comment:

"
For !shadow_xonly_valid guest, 100b on gpte can not pass FNAME(is_present_gpte)
so that we can get a error code without RSVD, finally a EPT-violation is injected
rather than EPT-misconfig.
"

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors
  2016-06-24  6:46             ` Paolo Bonzini
  2016-06-24  7:02               ` Xiao Guangrong
@ 2016-06-24 16:19               ` Bandan Das
  1 sibling, 0 replies; 23+ messages in thread
From: Bandan Das @ 2016-06-24 16:19 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Xiao Guangrong, kvm

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 24/06/2016 06:50, Bandan Das wrote:
>>>> >> I am tempted to remove the FNAME version altogether and change
>>>> >> is_present_gpte()
>>>> >> to return (pte & PT_PRESENT_MASK) || (shadow_xonly_valid && (pte & 4)). This
>>>> >> will take care of all cases. Hope I am not missing something :)
>>> >
>>> > Please rename the non-FNAME version to pae_is_present_pdpte or just inline
>>> > it in the two callers.
>> I am still not sure why the FNAME version is needed, specifically the PTTYPE_EPT
>> specific check. Why can't we  just check for execonly and the corresponding
>> bit in the non FNAME version ?
>
> The FNAME version encodes the right semantics:
>
> 1) for non-EPT page tables, bit 1 indicates present.
>
> 2) for EPT page tables, setting any bit 0-2 indicates present.
>
> It's perfectly fine to find that a PTE is invalid *after* judging that
> it is present.  There is no difference between setting bit 51 on a PTE,
> and having the invalid -W- combination on an EPT page table entry.  In
> both cases, the page is present but the PTE is invalid.
>
> The SDM is very clear about it: "If bits 2:0 of an EPT paging-structure
> entry are all 0, the entry is not present".  So if bits 2:0 are -W- the
> entry is present but invalid, and causes an EPT misconfiguration vmexit.
>  A non-present entry will cause an EPT violation instead.

Thank you very much, this clears it. pte & 7 makes sense now.

Xiao, as Paolo suggested, I can just remove the non FNAME version and directly
do pte & PT_PRESENT_MASK at the two call sites. I hope that will future proof it.

Thanks,
Bandan

> Paolo

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2016-06-24 16:19 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-06-21  3:12 [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Bandan Das
2016-06-21  3:12 ` [RFC PATCH 1/4] mmu: add a boolean to indicate host ept execute only support Bandan Das
2016-06-21  3:12 ` [RFC PATCH 2/4] mmu: Update ept specific valid bit values Bandan Das
2016-06-21  8:08   ` Paolo Bonzini
2016-06-22  4:09   ` Xiao Guangrong
2016-06-21  3:12 ` [RFC PATCH 3/4] mmu: don't set the present bit unconditionally Bandan Das
2016-06-21  8:12   ` Paolo Bonzini
2016-06-21  8:40     ` Paolo Bonzini
2016-06-22 16:10     ` Bandan Das
2016-06-22 16:12       ` Paolo Bonzini
2016-06-22 16:23         ` Bandan Das
2016-06-22  4:30   ` Xiao Guangrong
2016-06-22 16:21     ` Bandan Das
2016-06-21  3:12 ` [RFC PATCH 4/4] vmx: advertise support for ept execute only Bandan Das
2016-06-22  4:39 ` [RFC PATCH 0/4] Add support for EPT execute only for nested hypervisors Xiao Guangrong
2016-06-22 16:34   ` Bandan Das
2016-06-23  5:03     ` Xiao Guangrong
2016-06-24  2:56       ` Bandan Das
2016-06-24  4:22         ` Paolo Bonzini
2016-06-24  4:50           ` Bandan Das
2016-06-24  6:46             ` Paolo Bonzini
2016-06-24  7:02               ` Xiao Guangrong
2016-06-24 16:19               ` Bandan Das

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.