From: Sean Christopherson <seanjc@google.com>
To: Lai Jiangshan <jiangshanlai@gmail.com>
Cc: linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Paolo Bonzini <pbonzini@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Maxim Levitsky <mlevitsk@redhat.com>,
	David Matlack <dmatlack@google.com>,
	Lai Jiangshan <jiangshan.ljs@antgroup.com>
Subject: Re: [PATCH V3 08/12] KVM: X86/MMU: Allocate mmu->pae_root for PAE paging on-demand
Date: Tue, 19 Jul 2022 23:08:22 +0000
Message-ID: <Ytc5Zmer7sjkGAqV@google.com>
In-Reply-To: <20220521131700.3661-9-jiangshanlai@gmail.com>

On Sat, May 21, 2022, Lai Jiangshan wrote:
> From: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> 
> mmu->pae_root for non-PAE paging is allocated on-demand, but
> mmu->pae_root for PAE paging is allocated early when struct kvm_mmu is
> being created.
> 
> Simplify the code to allocate mmu->pae_root for PAE paging and make
> it on-demand.

Hmm, I'm not convinced this simplifies things enough to justify the risk.  There's
a non-zero chance that the __GFP_DMA32 allocation was intentionally done during VM
creation in order to avoid OOM on low memory.

Maybe move this patch to the tail end of the series so that it has a higher chance
of reverting cleanly if on-demand allocation breaks someone's setup?

> Signed-off-by: Lai Jiangshan <jiangshan.ljs@antgroup.com>
> ---
>  arch/x86/include/asm/kvm_host.h |   2 +-
>  arch/x86/kvm/mmu/mmu.c          | 101 +++++++++++++-------------------
>  arch/x86/kvm/x86.c              |   4 +-
>  3 files changed, 44 insertions(+), 63 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 9cdc5bbd721f..fb9751dfc1a7 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1615,7 +1615,7 @@ int kvm_mmu_vendor_module_init(void);
>  void kvm_mmu_vendor_module_exit(void);
>  
>  void kvm_mmu_destroy(struct kvm_vcpu *vcpu);
> -int kvm_mmu_create(struct kvm_vcpu *vcpu);
> +void kvm_mmu_create(struct kvm_vcpu *vcpu);
>  int kvm_mmu_init_vm(struct kvm *kvm);
>  void kvm_mmu_uninit_vm(struct kvm *kvm);
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 90b715eefe6a..63c2b2c6122c 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -668,6 +668,41 @@ static void walk_shadow_page_lockless_end(struct kvm_vcpu *vcpu)
>  	}
>  }
>  
> +static int mmu_alloc_pae_root(struct kvm_vcpu *vcpu)

Now that pae_root isn't the "full" root, just the page table, I think we should
rename pae_root to something else, and then name this accordingly.

pae_root_backing_page and mmu_alloc_pae_root_backing_page()?  I definitely don't
love the names, so speak up if someone has a better idea.

> +{
> +	struct page *page;
> +
> +	if (vcpu->arch.mmu->root_role.level != PT32E_ROOT_LEVEL)
> +		return 0;

I think I'd prefer to move this check to the caller; it's confusing to see an
unconditional call to a PAE-specific helper.
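
For illustration only, here is a minimal user-space sketch of the shape being
suggested: the paging-mode check lives in the caller, and the PAE-specific
helper only handles the "already allocated" case.  Every sketch_* name is
hypothetical and not taken from the patch, malloc() merely stands in for
alloc_page(), and the DMA32 placement constraint is not modeled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Mirrors the value of KVM's PT32E_ROOT_LEVEL (3-level PAE paging). */
#define SKETCH_PT32E_ROOT_LEVEL	3
#define SKETCH_ROOT_PAGE_SIZE	4096

/* Hypothetical stand-in for the two struct kvm_mmu fields that matter here. */
struct sketch_mmu {
	int root_level;
	void *pae_root;
};

/* PAE-only helper: it no longer inspects the paging mode itself. */
static int sketch_alloc_pae_root(struct sketch_mmu *mmu)
{
	/* Allocated once, then reused on every later call. */
	if (mmu->pae_root)
		return 0;

	/* Assumption: malloc() replaces alloc_page(... | __GFP_DMA32). */
	mmu->pae_root = malloc(SKETCH_ROOT_PAGE_SIZE);
	return mmu->pae_root ? 0 : -ENOMEM;
}

/* Caller: the PAE check is visible at the call site. */
static int sketch_load_roots(struct sketch_mmu *mmu)
{
	assert(mmu != NULL);

	if (mmu->root_level == SKETCH_PT32E_ROOT_LEVEL)
		return sketch_alloc_pae_root(mmu);

	return 0;
}
```

With this shape, a reader of the caller can see at a glance that non-PAE roots
never reach the PAE helper.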

> +	if (vcpu->arch.mmu->pae_root)
> +		return 0;
> +
> +	/*
> +	 * Allocate a page to hold the four PDPTEs for PAE paging when emulating
> +	 * 32-bit mode.  CR3 is only 32 bits even on x86_64 in this case.
> +	 * Therefore we need to allocate the PDP table in the first 4GB of
> +	 * memory, which happens to fit the DMA32 zone.
> +	 */
> +	page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_DMA32);

Leave off __GFP_ZERO, it's unnecessary in both cases, and actively misleading
when TDP is disabled.  KVM _must_ write the page after making it decrypted.  And
since I can't find any code that actually does initialize "pae_root", I suspect
this series is buggy.

But if there is a bug, it was introduced earlier in this series, either by

  KVM: X86/MMU: Add local shadow pages

or by

  KVM: X86/MMU: Activate local shadow pages and remove old logic

depending on whether you want to blame the function that is buggy, or the patch
that uses the buggy function.

The right place to initialize the root is kvm_mmu_alloc_local_shadow_page().
KVM sets __GFP_ZERO for mmu_shadow_page_cache, i.e. relies on new sp->spt pages
to be zeroed prior to "allocating" from the cache.

The PAE root backing page on the other hand is allocated once and then reused
over and over.

	if (role.level == PT32E_ROOT_LEVEL &&
	    !WARN_ON_ONCE(!vcpu->arch.mmu->pae_root)) {
		sp->spt = vcpu->arch.mmu->pae_root;
		kvm_mmu_initialize_pae_root(sp->spt); <==== something like this
	} else {
		sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);
	}
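
As a rough user-space sketch of what such an initializer might do, assuming the
goal is only to put the reused backing page into the same all-zero state that a
page taken from mmu_shadow_page_cache would have: the name
sketch_initialize_pae_root() and the SKETCH_* constants are hypothetical, and
the real helper may need a different initial value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed x86 page size; a page holds 512 64-bit entries. */
#define SKETCH_PAGE_SIZE	4096u
#define SKETCH_ENTRIES		(SKETCH_PAGE_SIZE / sizeof(uint64_t))

/*
 * Hypothetical stand-in for kvm_mmu_initialize_pae_root().  The backing
 * page is reused across roots, so it is explicitly rewritten each time it
 * is handed out instead of relying on allocation-time zeroing.
 */
static void sketch_initialize_pae_root(uint64_t *spt)
{
	assert(spt != NULL);

	/* Clear every entry, including stale ones from a previous root. */
	memset(spt, 0, SKETCH_PAGE_SIZE);
}
```

Writing the whole page on every reuse is what makes an allocation-time
__GFP_ZERO redundant in this sketch.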


> -	for (i = 0; i < 4; ++i)
> -		mmu->pae_root[i] = INVALID_PAE_ROOT;

Please remove this code in a separate patch.  I don't care if it is removed before
or after (I'm pretty sure the existing behavior is paranoia), but I don't want
multiple potentially-functional changes in this patch.

