All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Ben Gardon <bgardon@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
	linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
	Peter Xu <peterx@redhat.com>, Peter Shier <pshier@google.com>,
	David Matlack <dmatlack@google.com>,
	Mingwei Zhang <mizhang@google.com>,
	Yulei Zhang <yulei.kernel@gmail.com>,
	Wanpeng Li <kernellwp@gmail.com>,
	Xiao Guangrong <xiaoguangrong.eric@gmail.com>,
	Kai Huang <kai.huang@intel.com>,
	Keqian Zhu <zhukeqian1@huawei.com>,
	David Hildenbrand <david@redhat.com>
Subject: Re: [RFC 11/19] KVM: x86/mmu: Factor shadow_zero_check out of make_spte
Date: Thu, 11 Nov 2021 01:18:01 +0000	[thread overview]
Message-ID: <YYxvSfUPTXbclpSa@google.com> (raw)
In-Reply-To: <CANgfPd8hzDU+v52t9Kr=b48utC1p_j3yJ8gHzo-uifAxHbh-eQ@mail.gmail.com>

On Wed, Nov 10, 2021, Ben Gardon wrote:
> On Wed, Nov 10, 2021 at 2:45 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> >
> > On 11/10/21 23:30, Ben Gardon wrote:
> > > -     WARN_ONCE(is_rsvd_spte(&vcpu->arch.mmu->shadow_zero_check, spte, level),
> > > +     WARN_ONCE(is_rsvd_spte(shadow_zero_check, spte, level),
> > >                 "spte = 0x%llx, level = %d, rsvd bits = 0x%llx", spte, level,
> > > -               get_rsvd_bits(&vcpu->arch.mmu->shadow_zero_check, spte, level));
> > > +               get_rsvd_bits(shadow_zero_check, spte, level));
> >
> > Hmm, there is a deeper issue here, in that when using EPT/NPT (on either
> > the legacy aka shadow or the TDP MMU) large parts of vcpu->arch.mmu are
> > really the same for all vCPUs.  The only thing that varies is those
> > parts that actually depend on the guest's paging mode---the extended
> > role, the reserved bits, etc.  Those are needed by the emulator, but
> > don't really belong in vcpu->arch.mmu when EPT/NPT is in use.
> >
> > I wonder if there's room for splitting kvm_mmu in two parts, such as
> > kvm_mmu and kvm_guest_paging_context, and possibly change the walk_mmu
> > pointer into a pointer to kvm_guest_paging_context.  This way the
> > EPT/NPT MMU (again either shadow or TDP) can be moved to kvm->arch.  It
> > should simplify this series and also David's work on eager page splitting.
> >
> > I'm not asking you to do this, of course, but perhaps I can trigger
> > Sean's itch to refactor stuff. :)
> >
> > Paolo
> >
> 
> I think that's a great idea. I'm frequently confused as to why the
> struct kvm_mmu is a per-vcpu construct as opposed to being VM-global.
> Moving part of the struct to be a member for struct kvm would also
> open the door to formalizing the MMU interface a little better and
> perhaps even reveal more MMU code that can be consolidated across
> architectures.

But what would you actually move?  Even shadow_zero_check barely squeaks by,
e.g. if NX is ever used to for NPT, then maybe it stops being a per-VM setting.

Going through the fields...

These are all related to guest context:

	unsigned long (*get_guest_pgd)(struct kvm_vcpu *vcpu);
	u64 (*get_pdptr)(struct kvm_vcpu *vcpu, int index);
	int (*page_fault)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault);
	void (*inject_page_fault)(struct kvm_vcpu *vcpu,
				  struct x86_exception *fault);
	gpa_t (*gva_to_gpa)(struct kvm_vcpu *vcpu, gpa_t gva_or_gpa,
			    u32 access, struct x86_exception *exception);
	gpa_t (*translate_gpa)(struct kvm_vcpu *vcpu, gpa_t gpa, u32 access,
			       struct x86_exception *exception);
	int (*sync_page)(struct kvm_vcpu *vcpu,
			 struct kvm_mmu_page *sp);
	void (*invlpg)(struct kvm_vcpu *vcpu, gva_t gva, hpa_t root_hpa);
	union kvm_mmu_role mmu_role;
	u8 root_level;
	u8 permissions[16];
	u32 pkru_mask;
	struct rsvd_bits_validate guest_rsvd_check;
	u64 pdptrs[4];
	gpa_t root_pgd;

One field, ept_ad, can be straight deleted as it's redundant with respect to
the above mmu_role.ad_disabled.

	u8 ept_ad;

Ditto for direct_map flag (mmu_role.direct) and shadow_root_level (mmu_role.level).
I haven't bothered to yank those because they have a lot of touchpoints.

	bool direct_map;
	u8 shadow_root_level;

The prev_roots could be dropped if TDP roots were tracked per-VM, but we'd still
want an equivalent for !TDP and nTDP MMUs.

	struct kvm_mmu_root_info prev_roots[KVM_MMU_NUM_PREV_ROOTS];

shadow_zero_check can be made per-VM if all vCPUs are required to have the same
cpuid.MAXPHYADDR or if we remove the (IMO) pointless 5-level vs. 4-level behavior,
which by-the-by, has my vote since we could make shadow_zero_check _global_, not
just per-VM, and everything I've heard is that the extra level has no measurable
performance overhead.

	struct rsvd_bits_validate shadow_zero_check;

And that leaves us with:
	hpa_t root_hpa;

	u64 *pae_root;
	u64 *pml4_root;
	u64 *pml5_root;

Of those, _none_ of them can be per-VM, because they are all nothing more than
shadow pages, and thus cannot be per-VM unless there is exactly one set of TDP
page tables for the guest.  Even if/when we strip the unnecessary role bits from
these for TDP (on my todo list), we still need up to three sets of page tables:

	1. Normal
	2. SMM
	3. Guest (if L1 doesn't use TDP)

So I suppose we could refactor KVM to explicitly track its three possible TDP
roots, but I don't think it buys us anything and would complicate supporting
!TDP as well as nTDP.

  reply	other threads:[~2021-11-11  1:18 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-11-10 22:29 [RFC 00/19] KVM: x86/mmu: Optimize disabling dirty logging Ben Gardon
2021-11-10 22:29 ` [RFC 01/19] KVM: x86/mmu: Fix TLB flush range when handling disconnected pt Ben Gardon
2021-11-11 17:44   ` David Matlack
2021-11-10 22:29 ` [RFC 02/19] KVM: x86/mmu: Batch TLB flushes for a single zap Ben Gardon
2021-11-11 18:06   ` David Matlack
2021-11-12 23:53   ` Sean Christopherson
2021-11-10 22:29 ` [RFC 03/19] KVM: x86/mmu: Factor flush and free up when zapping under MMU write lock Ben Gardon
2021-11-11 18:31   ` David Matlack
2021-11-10 22:29 ` [RFC 04/19] KVM: x86/mmu: Yield while processing disconnected_sps Ben Gardon
2021-11-11 18:50   ` David Matlack
2021-11-10 22:29 ` [RFC 05/19] KVM: x86/mmu: Remove redundant flushes when disabling dirty logging Ben Gardon
2021-11-11 18:55   ` David Matlack
2021-11-10 22:29 ` [RFC 06/19] KVM: x86/mmu: Introduce vcpu_make_spte Ben Gardon
2021-11-10 22:29 ` [RFC 07/19] KVM: x86/mmu: Factor wrprot for nested PML out of make_spte Ben Gardon
2021-11-18  2:12   ` Sean Christopherson
2021-11-18 17:43     ` Ben Gardon
2021-11-18 18:04       ` Paolo Bonzini
2021-11-10 22:29 ` [RFC 08/19] KVM: x86/mmu: Factor mt_mask " Ben Gardon
2021-11-10 22:30 ` [RFC 09/19] KVM: x86/mmu: Remove need for a vcpu from kvm_slot_page_track_is_active Ben Gardon
2021-11-10 22:30 ` [RFC 10/19] KVM: x86/mmu: Remove need for a vcpu from mmu_try_to_unsync_pages Ben Gardon
2021-11-10 22:30 ` [RFC 11/19] KVM: x86/mmu: Factor shadow_zero_check out of make_spte Ben Gardon
2021-11-10 22:44   ` Paolo Bonzini
2021-11-10 23:49     ` Ben Gardon
2021-11-11  1:18       ` Sean Christopherson [this message]
2021-11-11  1:44         ` Sean Christopherson
2021-11-11  7:06         ` Paolo Bonzini
2021-11-18  2:05   ` Sean Christopherson
2021-11-18  3:29     ` Sean Christopherson
2021-11-18 16:37       ` Sean Christopherson
2021-11-18 17:19         ` Paolo Bonzini
2021-11-18 18:02           ` Sean Christopherson
2021-11-18 18:07             ` Paolo Bonzini
2021-11-18 18:14               ` Sean Christopherson
2021-11-10 22:30 ` [RFC 12/19] KVM: x86/mmu: Replace vcpu argument with kvm pointer in make_spte Ben Gardon
2021-11-10 22:30 ` [RFC 13/19] KVM: x86/mmu: Factor out the meat of reset_tdp_shadow_zero_bits_mask Ben Gardon
2021-11-10 22:30 ` [RFC 14/19] KVM: x86/mmu: Propagate memslot const qualifier Ben Gardon
2021-11-10 22:30 ` [RFC 15/19] KVM: x86/MMU: Refactor vmx_get_mt_mask Ben Gardon
2021-11-10 22:30 ` [RFC 16/19] KVM: x86/mmu: Factor out part of vmx_get_mt_mask which does not depend on vcpu Ben Gardon
2021-11-10 22:30 ` [RFC 17/19] KVM: x86/mmu: Add try_get_mt_mask to x86_ops Ben Gardon
2021-11-10 22:30 ` [RFC 18/19] KVM: x86/mmu: Make kvm_is_mmio_pfn usable outside of spte.c Ben Gardon
2021-11-10 22:30 ` [RFC 19/19] KVM: x86/mmu: Promote pages in-place when disabling dirty logging Ben Gardon
2021-11-15 21:24 ` [RFC 00/19] KVM: x86/mmu: Optimize " Ben Gardon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YYxvSfUPTXbclpSa@google.com \
    --to=seanjc@google.com \
    --cc=bgardon@google.com \
    --cc=david@redhat.com \
    --cc=dmatlack@google.com \
    --cc=kai.huang@intel.com \
    --cc=kernellwp@gmail.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=pbonzini@redhat.com \
    --cc=peterx@redhat.com \
    --cc=pshier@google.com \
    --cc=xiaoguangrong.eric@gmail.com \
    --cc=yulei.kernel@gmail.com \
    --cc=zhukeqian1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.