From: Ricardo Koller <ricarkol@google.com>
To: Marc Zyngier <maz@kernel.org>
Cc: pbonzini@redhat.com, oupton@google.com, yuzenghui@huawei.com,
	dmatlack@google.com, kvm@vger.kernel.org, kvmarm@lists.linux.dev,
	qperret@google.com, catalin.marinas@arm.com,
	andrew.jones@linux.dev, seanjc@google.com,
	alexandru.elisei@arm.com, suzuki.poulose@arm.com,
	eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com,
	rananta@google.com, bgardon@google.com, ricarkol@gmail.com,
	Shaoqin Huang <shahuang@redhat.com>
Subject: Re: [PATCH v6 02/12] KVM: arm64: Add KVM_PGTABLE_WALK ctx->flags for skipping BBM and CMO
Date: Mon, 13 Mar 2023 11:49:48 -0700	[thread overview]
Message-ID: <ZA9wTG6fIx2n4YHi@google.com> (raw)
In-Reply-To: <87cz5e5jnr.wl-maz@kernel.org>

On Sun, Mar 12, 2023 at 10:49:28AM +0000, Marc Zyngier wrote:
> On Tue, 07 Mar 2023 03:45:45 +0000,
> Ricardo Koller <ricarkol@google.com> wrote:
> > 
> > Add two flags to kvm_pgtable_visit_ctx, KVM_PGTABLE_WALK_SKIP_BBM and
> > KVM_PGTABLE_WALK_SKIP_CMO, to indicate that the walk should not
> > perform break-before-make (BBM) nor cache maintenance operations
> > (CMO). This will by a future commit to create unlinked tables not
> 
> This will *be used*?
> 
> > accessible to the HW page-table walker.  This is safe as these
> > unlinked tables are not visible to the HW page-table walker.
> 
> I don't think this last sentence makes much sense. The PTW is always
> coherent with the CPU caches and doesn't require cache maintenance
> (CMOs are solely for the pages the PTs point to).
> 
> But this makes me question this patch further.
> 
> The key observation here is that if you are creating new PTs that
> shadow an existing structure and still point to the same data pages,
> the cache state is independent of the intermediate PT walk, and thus
> CMOs are pointless anyway. So skipping CMOs makes sense.
> 
> I agree with the assertion that there is little point in doing BBM
> when *creating* page tables, as all PTs start in an invalid state. But
> then, why do you need to skip it? The invalidation calls are already
> gated on the previous pointer being valid, which I presume won't be
> the case for what you describe here.
> 

I need to change the SKIP_BBM name; it's confusing, sorry for that. As
you noticed below, SKIP_BBM only skips the TLB invalidation step of the
BBM sequence, so the break itself (replacing the valid PTE with an
invalid one) still happens with SKIP_BBM=true.
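
To illustrate the intended use: the unlinked-table helper added later
in this series sets the flags roughly like this (a sketch only; the
exact walker shape and flag names may still change):

	struct kvm_pgtable_walker walker = {
		.cb	= stage2_map_walker,
		.flags	= KVM_PGTABLE_WALK_LEAF |
			  KVM_PGTABLE_WALK_SKIP_BBM |
			  KVM_PGTABLE_WALK_SKIP_CMO,
		.arg	= &map_data,
	};

Since the subtree being built is not yet reachable by the hardware
walker, there are no stale TLB entries to invalidate, and (as you
point out above) the cache state of the data pages is unaffected by
the intermediate walk, so CMOs buy nothing either.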

Thanks for the reviews, Marc.

> > 
> > Signed-off-by: Ricardo Koller <ricarkol@google.com>
> > Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
> > ---
> >  arch/arm64/include/asm/kvm_pgtable.h | 18 ++++++++++++++++++
> >  arch/arm64/kvm/hyp/pgtable.c         | 27 ++++++++++++++++-----------
> >  2 files changed, 34 insertions(+), 11 deletions(-)
> > 
> > diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> > index 26a4293726c1..c7a269cad053 100644
> > --- a/arch/arm64/include/asm/kvm_pgtable.h
> > +++ b/arch/arm64/include/asm/kvm_pgtable.h
> > @@ -195,6 +195,12 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end,
> >   *					with other software walkers.
> >   * @KVM_PGTABLE_WALK_HANDLE_FAULT:	Indicates the page-table walk was
> >   *					invoked from a fault handler.
> > + * @KVM_PGTABLE_WALK_SKIP_BBM:		Visit and update table entries
> > + *					without Break-before-make
> > + *					requirements.
> > + * @KVM_PGTABLE_WALK_SKIP_CMO:		Visit and update table entries
> > + *					without Cache maintenance
> > + *					operations required.
> 
> We have both I and D side CMOs. Is it reasonable to always treat them
> identically?
> 
> >   */
> >  enum kvm_pgtable_walk_flags {
> >  	KVM_PGTABLE_WALK_LEAF			= BIT(0),
> > @@ -202,6 +208,8 @@ enum kvm_pgtable_walk_flags {
> >  	KVM_PGTABLE_WALK_TABLE_POST		= BIT(2),
> >  	KVM_PGTABLE_WALK_SHARED			= BIT(3),
> >  	KVM_PGTABLE_WALK_HANDLE_FAULT		= BIT(4),
> > +	KVM_PGTABLE_WALK_SKIP_BBM		= BIT(5),
> > +	KVM_PGTABLE_WALK_SKIP_CMO		= BIT(6),
> >  };
> >  
> >  struct kvm_pgtable_visit_ctx {
> > @@ -223,6 +231,16 @@ static inline bool kvm_pgtable_walk_shared(const struct kvm_pgtable_visit_ctx *c
> >  	return ctx->flags & KVM_PGTABLE_WALK_SHARED;
> >  }
> >  
> > +static inline bool kvm_pgtable_walk_skip_bbm(const struct kvm_pgtable_visit_ctx *ctx)
> > +{
> > +	return ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM;
> 
> Probably worth wrapping this with an 'unlikely'.
> 
> > +}
> > +
> > +static inline bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
> > +{
> > +	return ctx->flags & KVM_PGTABLE_WALK_SKIP_CMO;
> 
> Same here.
> 
> Also, why are these in kvm_pgtable.h? Can't they be moved inside
> pgtable.c and thus have the "inline" attribute dropped?
> 
> > +}
> > +
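
For illustration, the shape you're suggesting would be something like
this (a sketch only, assuming the helpers move into pgtable.c and drop
the inline attribute):

	static bool kvm_pgtable_walk_skip_bbm(const struct kvm_pgtable_visit_ctx *ctx)
	{
		return unlikely(ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM);
	}

	static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
	{
		return unlikely(ctx->flags & KVM_PGTABLE_WALK_SKIP_CMO);
	}
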
> >  /**
> >   * struct kvm_pgtable_walker - Hook into a page-table walk.
> >   * @cb:		Callback function to invoke during the walk.
> > diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> > index a3246d6cddec..4f703cc4cb03 100644
> > --- a/arch/arm64/kvm/hyp/pgtable.c
> > +++ b/arch/arm64/kvm/hyp/pgtable.c
> > @@ -741,14 +741,17 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
> >  	if (!stage2_try_set_pte(ctx, KVM_INVALID_PTE_LOCKED))
> >  		return false;
> >  
> > -	/*
> > -	 * Perform the appropriate TLB invalidation based on the evicted pte
> > -	 * value (if any).
> > -	 */
> > -	if (kvm_pte_table(ctx->old, ctx->level))
> > -		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
> > -	else if (kvm_pte_valid(ctx->old))
> > -		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level);
> > +	if (!kvm_pgtable_walk_skip_bbm(ctx)) {
> > +		/*
> > +		 * Perform the appropriate TLB invalidation based on the
> > +		 * evicted pte value (if any).
> > +		 */
> > +		if (kvm_pte_table(ctx->old, ctx->level))
> 
> You're not skipping BBM here. You're skipping the TLB invalidation.
> Not quite the same thing.
> 
> > +			kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
> > +		else if (kvm_pte_valid(ctx->old))
> > +			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
> > +				     ctx->addr, ctx->level);
> > +	}
> >  
> >  	if (stage2_pte_is_counted(ctx->old))
> >  		mm_ops->put_page(ctx->ptep);
> > @@ -832,11 +835,13 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
> >  		return -EAGAIN;
> >  
> >  	/* Perform CMOs before installation of the guest stage-2 PTE */
> > -	if (mm_ops->dcache_clean_inval_poc && stage2_pte_cacheable(pgt, new))
> > +	if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->dcache_clean_inval_poc &&
> > +	    stage2_pte_cacheable(pgt, new))
> >  		mm_ops->dcache_clean_inval_poc(kvm_pte_follow(new, mm_ops),
> > -						granule);
> > +					       granule);
> >  
> > -	if (mm_ops->icache_inval_pou && stage2_pte_executable(new))
> > +	if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->icache_inval_pou &&
> > +	    stage2_pte_executable(new))
> >  		mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule);
> >  
> >  	stage2_make_pte(ctx, new);
> 
> Thanks,
> 
> 	M.
> 
> -- 
> Without deviation from the norm, progress is not possible.

Thread overview: 35+ messages
2023-03-07  3:45 [PATCH v6 00/12] Implement Eager Page Splitting for ARM Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 01/12] KVM: arm64: Rename free_removed to free_unlinked Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 02/12] KVM: arm64: Add KVM_PGTABLE_WALK ctx->flags for skipping BBM and CMO Ricardo Koller
2023-03-12 10:49   ` Marc Zyngier
2023-03-13 18:49     ` Ricardo Koller [this message]
2023-03-07  3:45 ` [PATCH v6 03/12] KVM: arm64: Add helper for creating unlinked stage2 subtrees Ricardo Koller
2023-03-12 11:06   ` Marc Zyngier
2023-03-13 22:23     ` Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 04/12] KVM: arm64: Add kvm_pgtable_stage2_split() Ricardo Koller
2023-03-12 11:35   ` Marc Zyngier
2023-03-13 23:58     ` Ricardo Koller
2023-03-15 18:09       ` Marc Zyngier
2023-03-15 18:51         ` Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 05/12] KVM: arm64: Refactor kvm_arch_commit_memory_region() Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 06/12] KVM: arm64: Add kvm_uninit_stage2_mmu() Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 07/12] KVM: arm64: Export kvm_are_all_memslots_empty() Ricardo Koller
2023-03-12 11:39   ` Marc Zyngier
2023-03-13 15:18     ` Sean Christopherson
2023-03-14 10:18       ` Marc Zyngier
2023-03-15 21:00         ` Sean Christopherson
2023-03-07  3:45 ` [PATCH v6 08/12] KVM: arm64: Add KVM_CAP_ARM_EAGER_SPLIT_CHUNK_SIZE Ricardo Koller
2023-03-12 11:56   ` Marc Zyngier
2023-03-24  7:41     ` Ricardo Koller
2023-03-29  4:50   ` Reiji Watanabe
2023-04-10 20:04     ` Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 09/12] KVM: arm64: Split huge pages when dirty logging is enabled Ricardo Koller
2023-03-12 12:54   ` Marc Zyngier
2023-04-10 18:32     ` Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 10/12] KVM: arm64: Open-code kvm_mmu_write_protect_pt_masked() Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 11/12] KVM: arm64: Split huge pages during KVM_CLEAR_DIRTY_LOG Ricardo Koller
2023-03-12 13:01   ` Marc Zyngier
2023-04-10 18:26     ` Ricardo Koller
2023-03-07  3:45 ` [PATCH v6 12/12] KVM: arm64: Use local TLBI on permission relaxation Ricardo Koller
2023-03-12 13:22   ` Marc Zyngier
2023-04-10 18:22     ` Ricardo Koller
