From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751608AbcGMP1u (ORCPT ); Wed, 13 Jul 2016 11:27:50 -0400
Received: from mx1.redhat.com ([209.132.183.28]:36186 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750963AbcGMP1j (ORCPT ); Wed, 13 Jul 2016 11:27:39 -0400
Subject: Re: [PATCH v2 0/5] Add support for EPT execute only for nested hypervisors
To: Bandan Das
References: <1468361932-16580-1-git-send-email-bsd@redhat.com>
	<82db70ed-761e-0377-5417-acb64bed6cb6@redhat.com>
Cc: kvm@vger.kernel.org, guangrong.xiao@linux.intel.com,
	kernellwp@gmail.com, linux-kernel@vger.kernel.org
From: Paolo Bonzini
Message-ID: <921eef54-f23b-cd90-8e20-a428a00a3297@redhat.com>
Date: Wed, 13 Jul 2016 17:27:05 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
	Thunderbird/45.1.1
MIME-Version: 1.0
In-Reply-To:
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
	(mx1.redhat.com [10.5.110.38]); Wed, 13 Jul 2016 15:27:09 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 13/07/2016 17:06, Bandan Das wrote:
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index 190c0559c221..bd2535fdb9eb 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -2524,11 +2524,10 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>>  		return 0;
>>
>>  	/*
>> -	 * In the non-EPT case, execonly is not valid and so
>> -	 * the following line is equivalent to spte |= PT_PRESENT_MASK.
>>  	 * For the EPT case, shadow_present_mask is 0 if hardware
>> -	 * supports it and we honor whatever way the guest set it.
>> -	 * See: FNAME(gpte_access) in paging_tmpl.h
>> +	 * supports exec-only page table entries.  In that case,
>> +	 * ACC_USER_MASK and shadow_user_mask are used to represent
>> +	 * read access.  See FNAME(gpte_access) in paging_tmpl.h.
>>  	 */
>
> I would still prefer a note about the non-EPT case, makes it easy to
> understand.

I can add "shadow_present_mask is PT_PRESENT_MASK in the non-EPT case"
but it's a bit of a tautology.

>>  	spte |= shadow_present_mask;
>>  	if (!speculative)
>> @@ -3923,9 +3922,6 @@ static void update_permission_bitmask(struct kvm_vcpu *vcpu,
>>  				 * clearer.
>>  				 */
>>  				smap = cr4_smap && u && !uf && !ff;
>> -			} else {
>> -				if (shadow_present_mask)
>> -					u = 1;
>>  			}
>>
>>  			fault = (ff && !x) || (uf && !u) || (wf && !w) ||
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 576c47cda1a3..dfef081e76c0 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -6120,12 +6120,14 @@ static int handle_ept_violation(struct kvm_vcpu *vcpu)
>>  	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
>>  	trace_kvm_page_fault(gpa, exit_qualification);
>>
>> -	/* It is a write fault? */
>> +	/* it is a read fault? */
>> +	error_code = (exit_qualification << 2) & PFERR_USER_MASK;
>> +	/* it is a write fault? */
>>  	error_code = exit_qualification & PFERR_WRITE_MASK;
>>  	/* It is a fetch fault? */
>>  	error_code |= (exit_qualification << 2) & PFERR_FETCH_MASK;
>>  	/* ept page table is present? */
>> -	error_code |= (exit_qualification >> 3) & PFERR_PRESENT_MASK;
>> +	error_code |= (exit_qualification & 0x38) != 0;
>>
>
> Thank you for the thorough review here. I missed that we didn't set the read bit
> at all. I am still a little unclear how permission_fault works though...
>
>>  	vcpu->arch.exit_qualification = exit_qualification;
>>
>> @@ -6474,8 +6476,7 @@ static __init int hardware_setup(void)
>>  		(enable_ept_ad_bits) ? VMX_EPT_DIRTY_BIT : 0ull,
>>  		0ull, VMX_EPT_EXECUTABLE_MASK,
>>  		cpu_has_vmx_ept_execute_only() ?
>> -		0ull : PT_PRESENT_MASK);
>> -	BUILD_BUG_ON(PT_PRESENT_MASK != VMX_EPT_READABLE_MASK);
>> +		0ull : VMX_EPT_READABLE_MASK);
>
> I wanted to keep it the former way because "PT_PRESENT_MASK is equal to VMX_EPT_READABLE_MASK"
> is an assumption all throughout. I wanted to use this section to catch mismatches.

I think there's no such assumption anymore, actually.  Can you double
check?  If there are any, that's where the BUILD_BUG_ON should be.

Paolo
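
For reference, the exit-qualification decoding in the
handle_ept_violation() hunk can be tried outside the kernel.  The
fragment below is only a sketch under stated assumptions: the PFERR_*
values mirror the x86 page-fault error code layout, the EPT_QUAL_* names
and ept_qual_to_error_code() are made up for this example and are not
kernel identifiers, and the bit positions (0-2 for the access type, 3-5
for the permissions of the faulting entry) follow the Intel SDM
description of the EPT violation exit qualification.  Every step after
the first ORs into error_code, since a plain assignment would discard
the read bit computed on the previous line.  The static_asserts stand in
for what BUILD_BUG_ON does in the kernel.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stdint.h>

/* Page-fault error code bits (x86 layout). */
#define PFERR_PRESENT_MASK	(1u << 0)
#define PFERR_WRITE_MASK	(1u << 1)
#define PFERR_USER_MASK		(1u << 2)
#define PFERR_FETCH_MASK	(1u << 4)

/* EPT violation exit qualification bits; names are hypothetical. */
#define EPT_QUAL_READ		(1u << 0)	/* access was a data read */
#define EPT_QUAL_WRITE		(1u << 1)	/* access was a data write */
#define EPT_QUAL_FETCH		(1u << 2)	/* access was an instruction fetch */
#define EPT_QUAL_PERMS		0x38u		/* bits 3-5: entry was R/W/X */

/* The shifts used below only work if the bits line up like this. */
static_assert((EPT_QUAL_READ << 2) == PFERR_USER_MASK, "read bit -> user bit");
static_assert(EPT_QUAL_WRITE == PFERR_WRITE_MASK, "write bit -> write bit");
static_assert((EPT_QUAL_FETCH << 2) == PFERR_FETCH_MASK, "fetch bit -> fetch bit");

/* Hypothetical helper: build a page-fault style error code. */
static inline uint32_t ept_qual_to_error_code(uint64_t exit_qualification)
{
	uint32_t error_code;

	/* Read fault: reported through the "user" bit. */
	error_code = (uint32_t)(exit_qualification << 2) & PFERR_USER_MASK;
	/* Write fault. */
	error_code |= (uint32_t)exit_qualification & PFERR_WRITE_MASK;
	/* Fetch fault. */
	error_code |= (uint32_t)(exit_qualification << 2) & PFERR_FETCH_MASK;
	/* Present if the entry granted any of read, write or execute. */
	if (exit_qualification & EPT_QUAL_PERMS)
		error_code |= PFERR_PRESENT_MASK;

	return error_code;
}
```

With this mapping, a read that hits an exec-only entry (exit
qualification 0x21) comes out as PFERR_USER_MASK | PFERR_PRESENT_MASK,
which is the case the reworded set_spte() comment describes.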