From: Sean Christopherson <sean.j.christopherson@intel.com>
To: Ben Gardon <bgardon@google.com>
Cc: kvm@vger.kernel.org, Paolo Bonzini <pbonzini@redhat.com>,
	Peter Feiner <pfeiner@google.com>,
	Peter Shier <pshier@google.com>,
	Junaid Shahid <junaids@google.com>,
	Jim Mattson <jmattson@google.com>
Subject: Re: [RFC PATCH 02/28] kvm: mmu: Separate pte generation from set_spte
Date: Wed, 27 Nov 2019 10:25:53 -0800	[thread overview]
Message-ID: <20191127182553.GB22227@linux.intel.com> (raw)
In-Reply-To: <20190926231824.149014-3-bgardon@google.com>

On Thu, Sep 26, 2019 at 04:17:58PM -0700, Ben Gardon wrote:
> Separate the functions for generating leaf page table entries from the
> function that inserts them into the paging structure. This refactoring
> will allow changes to the MMU synchronization model to use atomic
> compare / exchanges (which are not guaranteed to succeed) instead of a
> monolithic MMU lock.
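
To make the "atomic compare / exchange" motivation above concrete: once the
value is computed separately, insertion can be retried on contention instead
of being serialized by mmu_lock.  Purely a sketch, the helper name is
invented here and is not part of this series:

	static bool try_set_spte_atomic(u64 *sptep, u64 old_spte, u64 new_spte)
	{
		/*
		 * Succeeds only if no other CPU modified the SPTE since
		 * old_spte was read; on failure the caller must re-read
		 * the SPTE and retry (or bail).
		 */
		return cmpxchg64(sptep, old_spte, new_spte) == old_spte;
	}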
> 
> Signed-off-by: Ben Gardon <bgardon@google.com>
> ---
>  arch/x86/kvm/mmu.c | 93 ++++++++++++++++++++++++++++------------------
>  1 file changed, 57 insertions(+), 36 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index 781c2ca7455e3..7e5ab9c6e2b09 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -2964,21 +2964,15 @@ static bool kvm_is_mmio_pfn(kvm_pfn_t pfn)
>  #define SET_SPTE_WRITE_PROTECTED_PT	BIT(0)
>  #define SET_SPTE_NEED_REMOTE_TLB_FLUSH	BIT(1)
>  
> -static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
> -		    unsigned pte_access, int level,
> -		    gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> -		    bool can_unsync, bool host_writable)
> +static int generate_pte(struct kvm_vcpu *vcpu, unsigned pte_access, int level,

Similar comment on "generate".  Note, I don't necessarily like the names
get_mmio_spte_value() or get_spte_value() either as they could be
misinterpreted as reading the value from memory.  Maybe
calc_{mmio_}spte_value()?
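
I.e. something along these lines, keeping the parameters from this patch
(name illustrative only, with the "spte" nomenclature retained):

	static int calc_spte_value(struct kvm_vcpu *vcpu, unsigned pte_access,
				   int level, gfn_t gfn, kvm_pfn_t pfn,
				   u64 old_spte, bool speculative,
				   bool can_unsync, bool host_writable,
				   bool ad_disabled, u64 *sptep);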

> +		    gfn_t gfn, kvm_pfn_t pfn, u64 old_pte, bool speculative,
> +		    bool can_unsync, bool host_writable, bool ad_disabled,
> +		    u64 *ptep)
>  {
> -	u64 spte = 0;
> +	u64 pte;

Renames and unrelated refactoring (leaving the variable uninitialized and
setting it directly to shadow_present_mask) belong in separate patches.
The renames especially make this patch much more difficult to review.  And
I disagree with the rename: I think it's important to keep the "spte"
nomenclature, even though it's a bit of a misnomer for TDP entries, so that
it is easy to differentiate data that is coming from the host PTEs versus
data that is for KVM's MMU.

>  	int ret = 0;
> -	struct kvm_mmu_page *sp;
> -
> -	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
> -		return 0;
>  
> -	sp = page_header(__pa(sptep));
> -	if (sp_ad_disabled(sp))
> -		spte |= shadow_acc_track_value;
> +	*ptep = 0;
>  
>  	/*
>  	 * For the EPT case, shadow_present_mask is 0 if hardware
> @@ -2986,36 +2980,39 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  	 * ACC_USER_MASK and shadow_user_mask are used to represent
>  	 * read access.  See FNAME(gpte_access) in paging_tmpl.h.
>  	 */
> -	spte |= shadow_present_mask;
> +	pte = shadow_present_mask;
> +
> +	if (ad_disabled)
> +		pte |= shadow_acc_track_value;
> +
>  	if (!speculative)
> -		spte |= spte_shadow_accessed_mask(spte);
> +		pte |= spte_shadow_accessed_mask(pte);
>  
>  	if (pte_access & ACC_EXEC_MASK)
> -		spte |= shadow_x_mask;
> +		pte |= shadow_x_mask;
>  	else
> -		spte |= shadow_nx_mask;
> +		pte |= shadow_nx_mask;
>  
>  	if (pte_access & ACC_USER_MASK)
> -		spte |= shadow_user_mask;
> +		pte |= shadow_user_mask;
>  
>  	if (level > PT_PAGE_TABLE_LEVEL)
> -		spte |= PT_PAGE_SIZE_MASK;
> +		pte |= PT_PAGE_SIZE_MASK;
>  	if (tdp_enabled)
> -		spte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
> +		pte |= kvm_x86_ops->get_mt_mask(vcpu, gfn,
>  			kvm_is_mmio_pfn(pfn));
>  
>  	if (host_writable)
> -		spte |= SPTE_HOST_WRITEABLE;
> +		pte |= SPTE_HOST_WRITEABLE;
>  	else
>  		pte_access &= ~ACC_WRITE_MASK;
>  
>  	if (!kvm_is_mmio_pfn(pfn))
> -		spte |= shadow_me_mask;
> +		pte |= shadow_me_mask;
>  
> -	spte |= (u64)pfn << PAGE_SHIFT;
> +	pte |= (u64)pfn << PAGE_SHIFT;
>  
>  	if (pte_access & ACC_WRITE_MASK) {
> -
>  		/*
>  		 * Other vcpu creates new sp in the window between
>  		 * mapping_level() and acquiring mmu-lock. We can
> @@ -3024,9 +3021,9 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		 */
>  		if (level > PT_PAGE_TABLE_LEVEL &&
>  		    mmu_gfn_lpage_is_disallowed(vcpu, gfn, level))
> -			goto done;
> +			return 0;
>  
> -		spte |= PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE;
> +		pte |= PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE;
>  
>  		/*
>  		 * Optimization: for pte sync, if spte was writable the hash
> @@ -3034,30 +3031,54 @@ static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep,
>  		 * is responsibility of mmu_get_page / kvm_sync_page.
>  		 * Same reasoning can be applied to dirty page accounting.
>  		 */
> -		if (!can_unsync && is_writable_pte(*sptep))
> -			goto set_pte;
> +		if (!can_unsync && is_writable_pte(old_pte)) {
> +			*ptep = pte;
> +			return 0;
> +		}
>  
>  		if (mmu_need_write_protect(vcpu, gfn, can_unsync)) {
>  			pgprintk("%s: found shadow page for %llx, marking ro\n",
>  				 __func__, gfn);
> -			ret |= SET_SPTE_WRITE_PROTECTED_PT;
> +			ret = SET_SPTE_WRITE_PROTECTED_PT;

More unnecessary refactoring.

>  			pte_access &= ~ACC_WRITE_MASK;
> -			spte &= ~(PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE);
> +			pte &= ~(PT_WRITABLE_MASK | SPTE_MMU_WRITEABLE);
>  		}
>  	}
>  
> -	if (pte_access & ACC_WRITE_MASK) {
> -		kvm_vcpu_mark_page_dirty(vcpu, gfn);
> -		spte |= spte_shadow_dirty_mask(spte);
> -	}
> +	if (pte_access & ACC_WRITE_MASK)
> +		pte |= spte_shadow_dirty_mask(pte);
>  
>  	if (speculative)
> -		spte = mark_spte_for_access_track(spte);
> +		pte = mark_spte_for_access_track(pte);
> +
> +	*ptep = pte;
> +	return ret;
> +}
> +
> +static int set_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned pte_access,
> +		    int level, gfn_t gfn, kvm_pfn_t pfn, bool speculative,
> +		    bool can_unsync, bool host_writable)
> +{
> +	u64 spte;
> +	int ret;
> +	struct kvm_mmu_page *sp;
> +
> +	if (set_mmio_spte(vcpu, sptep, gfn, pfn, pte_access))
> +		return 0;
> +
> +	sp = page_header(__pa(sptep));
> +
> +	ret = generate_pte(vcpu, pte_access, level, gfn, pfn, *sptep,
> +			   speculative, can_unsync, host_writable,
> +			   sp_ad_disabled(sp), &spte);

Yowsers, that's a big prototype.  This is something that came up in an
unrelated internal discussion.  I wonder if it would make sense to define
a struct to hold all of the data needed to insert an spte and pass that
on the stack instead of having a bajillion parameters.  Just spitballing,
no idea if it's feasible and/or reasonable.
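
Something like the below, purely illustrative (the struct and field layout
are invented here for the sake of argument):

	/*
	 * Hypothetical: bundle everything needed to compute an SPTE so
	 * the calculation helper doesn't need a dozen parameters.
	 */
	struct spte_params {
		unsigned int pte_access;
		int level;
		gfn_t gfn;
		kvm_pfn_t pfn;
		u64 old_spte;
		bool speculative;
		bool can_unsync;
		bool host_writable;
		bool ad_disabled;
	};

	static int calc_spte_value(struct kvm_vcpu *vcpu,
				   const struct spte_params *p, u64 *sptep);

The caller would fill a struct spte_params on its stack and pass a pointer,
so adding a field later wouldn't churn every prototype in the call chain.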

> +	if (!spte)
> +		return 0;
> +
> +	if (spte & PT_WRITABLE_MASK)
> +		kvm_vcpu_mark_page_dirty(vcpu, gfn);
>  
> -set_pte:
>  	if (mmu_spte_update(sptep, spte))
>  		ret |= SET_SPTE_NEED_REMOTE_TLB_FLUSH;
> -done:
>  	return ret;
>  }
>  
> -- 
> 2.23.0.444.g18eeb5a265-goog
> 

Thread overview: 57+ messages
2019-09-26 23:17 [RFC PATCH 00/28] kvm: mmu: Rework the x86 TDP direct mapped case Ben Gardon
2019-09-26 23:17 ` [RFC PATCH 01/28] kvm: mmu: Separate generating and setting mmio ptes Ben Gardon
2019-11-27 18:15   ` Sean Christopherson
2019-09-26 23:17 ` [RFC PATCH 02/28] kvm: mmu: Separate pte generation from set_spte Ben Gardon
2019-11-27 18:25   ` Sean Christopherson [this message]
2019-09-26 23:17 ` [RFC PATCH 03/28] kvm: mmu: Zero page cache memory at allocation time Ben Gardon
2019-11-27 18:32   ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 04/28] kvm: mmu: Update the lpages stat atomically Ben Gardon
2019-11-27 18:39   ` Sean Christopherson
2019-12-06 20:10     ` Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 05/28] sched: Add cond_resched_rwlock Ben Gardon
2019-11-27 18:42   ` Sean Christopherson
2019-12-06 20:12     ` Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 06/28] kvm: mmu: Replace mmu_lock with a read/write lock Ben Gardon
2019-11-27 18:47   ` Sean Christopherson
2019-12-02 22:45     ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 07/28] kvm: mmu: Add functions for handling changed PTEs Ben Gardon
2019-11-27 19:04   ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 08/28] kvm: mmu: Init / Uninit the direct MMU Ben Gardon
2019-12-02 23:40   ` Sean Christopherson
2019-12-06 20:25     ` Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 09/28] kvm: mmu: Free direct MMU page table memory in an RCU callback Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 10/28] kvm: mmu: Flush TLBs before freeing direct MMU page table memory Ben Gardon
2019-12-02 23:46   ` Sean Christopherson
2019-12-06 20:31     ` Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 11/28] kvm: mmu: Optimize for freeing direct MMU PTs on teardown Ben Gardon
2019-12-02 23:54   ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 12/28] kvm: mmu: Set tlbs_dirty atomically Ben Gardon
2019-12-03  0:13   ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 13/28] kvm: mmu: Add an iterator for concurrent paging structure walks Ben Gardon
2019-12-03  2:15   ` Sean Christopherson
2019-12-18 18:25     ` Ben Gardon
2019-12-18 19:14       ` Sean Christopherson
2019-09-26 23:18 ` [RFC PATCH 14/28] kvm: mmu: Batch updates to the direct mmu disconnected list Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 15/28] kvm: mmu: Support invalidate_zap_all_pages Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 16/28] kvm: mmu: Add direct MMU page fault handler Ben Gardon
2020-01-08 17:20   ` Peter Xu
2020-01-08 18:15     ` Ben Gardon
2020-01-08 19:00       ` Peter Xu
2019-09-26 23:18 ` [RFC PATCH 17/28] kvm: mmu: Add direct MMU fast page fault handler Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 18/28] kvm: mmu: Add an hva range iterator for memslot GFNs Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 19/28] kvm: mmu: Make address space ID a property of memslots Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 20/28] kvm: mmu: Implement the invalidation MMU notifiers for the direct MMU Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 21/28] kvm: mmu: Integrate the direct mmu with the changed pte notifier Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 22/28] kvm: mmu: Implement access tracking for the direct MMU Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 23/28] kvm: mmu: Make mark_page_dirty_in_slot usable from outside kvm_main Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 24/28] kvm: mmu: Support dirty logging in the direct MMU Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 25/28] kvm: mmu: Support kvm_zap_gfn_range in the direct MMU Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 26/28] kvm: mmu: Integrate direct MMU with nesting Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 27/28] kvm: mmu: Lazily allocate rmap when direct MMU is enabled Ben Gardon
2019-09-26 23:18 ` [RFC PATCH 28/28] kvm: mmu: Support MMIO in the direct MMU Ben Gardon
2019-10-17 18:50 ` [RFC PATCH 00/28] kvm: mmu: Rework the x86 TDP direct mapped case Sean Christopherson
2019-10-18 13:42   ` Paolo Bonzini
2019-11-27 19:09 ` Sean Christopherson
2019-12-06 19:55   ` Ben Gardon
2019-12-06 19:57     ` Sean Christopherson
2019-12-06 20:42       ` Ben Gardon
