From: David Matlack <dmatlack@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	Lai Jiangshan <jiangshan.ljs@antgroup.com>
Subject: Re: [PATCH V3 04/12] KVM: X86/MMU: Add local shadow pages
Date: Thu, 26 May 2022 22:01:14 +0000
Message-ID: <Yo/4qnF3359LO25D@google.com>
In-Reply-To: <20220521131700.3661-5-jiangshanlai@gmail.com>

On Sat, May 21, 2022 at 09:16:52PM +0800, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> 
> Local shadow pages are shadow pages to hold PDPTEs for 32bit guest or
> higher level shadow pages having children local shadow pages when
> shadowing nested NPT for 32bit L1 in 64 bit L0.
> 
> The current code uses mmu->pae_root, mmu->pml4_root, and mmu->pml5_root to
> set up the local root page.  The initialization code is complex and the root
> pages are not associated with struct kvm_mmu_page, which makes the code
> more complex.
> 
> Add kvm_mmu_alloc_local_shadow_page() and mmu_free_local_root_page() to
> allocate and free local shadow pages and prepare for using local
> shadow pages to replace current logic and share the most logic with
> non-local shadow pages.
> 
> The code is not activated since using_local_root_page() is false in
> the place where it is inserted.
> 
> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
>  arch/x86/kvm/mmu/mmu.c | 109 ++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 108 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 240ebe589caf..c941a5931bc3 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -1764,6 +1764,76 @@ static bool using_local_root_page(struct kvm_mmu *mmu)
>  		return mmu->cpu_role.base.level <= PT32E_ROOT_LEVEL;
>  }
>  
> +/*
> + * Local shadow pages are shadow pages to hold PDPTEs for 32bit guest or higher
> + * level shadow pages having children local shadow pages when shadowing nested
> + * NPT for 32bit L1 in 64 bit L0.
> + *
> + * Local shadow pages are often local shadow root pages (or local root pages for
> + * short) except when shadowing nested NPT for 32bit L1 in 64 bit L0 which has
> + * 2 or 3 levels of local shadow pages on top of non-local shadow pages.
> + *
> + * Local shadow pages are locally allocated.  If the local shadow page's level
> + * is PT32E_ROOT_LEVEL, it will use the preallocated mmu->pae_root for its
> + * sp->spt.  Because sp->spt may need to be put in the 32 bits CR3 (even in
> + * x86_64) or decrypted.  Using the preallocated one to handle these
> + * requirements makes the allocation simpler.
> + *
> + * Local shadow pages are only visible to local VCPU except through
> + * sp->parent_ptes rmap from their children, so they are not in the
> + * kvm->arch.active_mmu_pages nor in the hash.
> + *
> + * And they are neither accounted nor write-protected since they don't shadow a
> + * guest page table.
> + *
> + * Because of above, local shadow pages can not be freed nor zapped like
> + * non-local shadow pages.  They are freed directly when the local root page
> + * is freed, see mmu_free_local_root_page().
> + *
> + * Local root page can not be put on mmu->prev_roots because the comparison
> + * must use PDPTEs instead of CR3 and mmu->pae_root can not be shared for multi
> + * local root pages.
> + *
> + * Except above limitations, all the other abilities are the same as other
> + * shadow page, like link, parent rmap, sync, unsync etc.
> + *
> + * Local shadow pages can be obsoleted in a little different way other than
> + * the non-local shadow pages.  When the obsoleting process is done, all the
> + * obsoleted non-local shadow pages are unlinked from the local shadow pages
> + * by the help of the sp->parent_ptes rmap and the local shadow pages become
> + * theoretically valid again except sp->mmu_valid_gen may be still outdated.
> + * If there is no other event to cause a VCPU to free the local root page and
> + * the VCPU is being preempted by the host during two obsoleting processes,
> + * sp->mmu_valid_gen might become valid again and the VCPU can reuse it when
> + * the VCPU is back.  It is different from the non-local shadow pages which
> + * are always freed after obsoleted.
> + */
> +static struct kvm_mmu_page *
> +kvm_mmu_alloc_local_shadow_page(struct kvm_vcpu *vcpu, union kvm_mmu_page_role role)
> +{
> +	struct kvm_mmu_page *sp;
> +
> +	sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
> +	sp->gfn = 0;
> +	sp->role = role;
> +	/*
> +	 * Use the preallocated mmu->pae_root when the shadow page's
> +	 * level is PT32E_ROOT_LEVEL which may need to be put in the 32 bits
> +	 * CR3 (even in x86_64) or decrypted.  The preallocated one is prepared
> +	 * for the requirements.
> +	 */
> +	if (role.level == PT32E_ROOT_LEVEL &&
> +	    !WARN_ON_ONCE(!vcpu->arch.mmu->pae_root))
> +		sp->spt = vcpu->arch.mmu->pae_root;

FYI, this (and a couple of other parts of this series) conflicts with Nested
MMU Eager Page Splitting, since this patch uses struct kvm_vcpu in
kvm_mmu_get_page().

Hopefully Paolo can queue Nested MMU Eager Page Splitting for 5.20 so
you can apply this series on top. I think that'd be simpler than trying
to do it the other way around.

> +	else
> +		sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
> +	/* sp->gfns is not used for local shadow page */
> +	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
> +	sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen;
> +
> +	return sp;
> +}
> +
>  static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct)
>  {
>  	struct kvm_mmu_page *sp;
> @@ -2121,6 +2191,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
>  	if (level <= vcpu->arch.mmu->cpu_role.base.level)
>  		role.passthrough = 0;
>  
> +	if (unlikely(level >= PT32E_ROOT_LEVEL && using_local_root_page(vcpu->arch.mmu)))
> +		return kvm_mmu_alloc_local_shadow_page(vcpu, role);
> +
>  	sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)];
>  	for_each_valid_sp(vcpu->kvm, sp, sp_list) {
>  		if (sp->gfn != gfn) {
> @@ -3351,6 +3424,37 @@ static void mmu_free_root_page(struct kvm *kvm, hpa_t *root_hpa,
>  	*root_hpa = INVALID_PAGE;
>  }
>  
> +static void mmu_free_local_root_page(struct kvm *kvm, struct kvm_mmu *mmu)
> +{
> +	u64 spte = mmu->root.hpa;
> +	struct kvm_mmu_page *sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> +	int i;
> +
> +	/* Free level 5 or 4 roots for shadow NPT for 32 bit L1 */
> +	while (sp->role.level > PT32E_ROOT_LEVEL)
> +	{
> +		spte = sp->spt[0];
> +		mmu_page_zap_pte(kvm, sp, sp->spt + 0, NULL);
> +		free_page((unsigned long)sp->spt);
> +		kmem_cache_free(mmu_page_header_cache, sp);
> +		if (!is_shadow_present_pte(spte))
> +			return;
> +		sp = to_shadow_page(spte & PT64_BASE_ADDR_MASK);
> +	}
> +
> +	if (WARN_ON_ONCE(sp->role.level != PT32E_ROOT_LEVEL))
> +		return;
> +
> +	/* Disconnect PAE root from the 4 PAE page directories */
> +	for (i = 0; i < 4; i++)
> +		mmu_page_zap_pte(kvm, sp, sp->spt + i, NULL);
> +
> +	if (sp->spt != mmu->pae_root)
> +		free_page((unsigned long)sp->spt);
> +
> +	kmem_cache_free(mmu_page_header_cache, sp);
> +}
> +
>  /* roots_to_free must be some combination of the KVM_MMU_ROOT_* flags */
>  void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
>  			ulong roots_to_free)
> @@ -3384,7 +3488,10 @@ void kvm_mmu_free_roots(struct kvm *kvm, struct kvm_mmu *mmu,
>  
>  	if (free_active_root) {
>  		if (to_shadow_page(mmu->root.hpa)) {
> -			mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
> +			if (using_local_root_page(mmu))
> +				mmu_free_local_root_page(kvm, mmu);
> +			else
> +				mmu_free_root_page(kvm, &mmu->root.hpa, &invalid_list);
>  		} else if (mmu->pae_root) {
>  			for (i = 0; i < 4; ++i) {
>  				if (!IS_VALID_PAE_ROOT(mmu->pae_root[i]))
> -- 
> 2.19.1.6.gb485710b
> 
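
For anyone following the thread, here is a rough userspace model of the
allocation choice in kvm_mmu_alloc_local_shadow_page() and the teardown walk
in mmu_free_local_root_page() as quoted above. It is only a sketch under
stated assumptions: every model_* name is hypothetical, child pointers stand
in for SPTEs, calloc()/free() stand in for the KVM memory caches, and the
rmap work done by mmu_page_zap_pte() is reduced to clearing a pointer.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_ENTRIES   512
#define MODEL_PAE_LEVEL 3	/* stands in for PT32E_ROOT_LEVEL */

/* Hypothetical stand-in for struct kvm_mmu_page. */
struct model_sp {
	int level;
	struct model_sp **spt;	/* child pointers stand in for SPTEs */
};

/* Counters so a caller can observe what the teardown released. */
static int model_freed_headers;
static int model_freed_tables;

/*
 * Allocate a local shadow page.  A PAE-level page borrows the caller's
 * preallocated table (the model's pae_root); any other level gets a
 * freshly zeroed table of its own.
 */
static struct model_sp *model_alloc_local(int level, struct model_sp **pae_root)
{
	struct model_sp *sp = calloc(1, sizeof(*sp));

	if (!sp)
		return NULL;
	sp->level = level;
	if (level == MODEL_PAE_LEVEL && pae_root)
		sp->spt = pae_root;
	else
		sp->spt = calloc(MODEL_ENTRIES, sizeof(*sp->spt));
	if (!sp->spt) {
		free(sp);
		return NULL;
	}
	return sp;
}

/*
 * Free the chain of local pages starting at the root.  Levels above the
 * PAE level are followed through entry 0 only; the PAE-level page is
 * disconnected from its four children, which are not freed here.
 */
static void model_free_local_root(struct model_sp *sp, struct model_sp **pae_root)
{
	int i;

	assert(sp != NULL);
	while (sp->level > MODEL_PAE_LEVEL) {
		/* Read the child before the table that holds it goes away. */
		struct model_sp *child = sp->spt[0];

		free(sp->spt);
		model_freed_tables++;
		free(sp);
		model_freed_headers++;
		if (!child)
			return;
		sp = child;
	}

	if (sp->level != MODEL_PAE_LEVEL)
		return;

	/* Disconnect the PAE root from its four page directories. */
	for (i = 0; i < 4; i++)
		sp->spt[i] = NULL;

	/* The borrowed table stays with its owner; only a private one is freed. */
	if (sp->spt != pae_root) {
		free(sp->spt);
		model_freed_tables++;
	}
	free(sp);
	model_freed_headers++;
}
```

Building a 5 -> 4 -> 3 chain with this model and freeing it from the top
releases three headers but only two tables, because the level-3 table is the
borrowed preallocated one. That mirrors the sp->spt != mmu->pae_root check in
the quoted patch, nothing more.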
