From: Marc Zyngier <maz@kernel.org>
To: Ricardo Koller <ricarkol@google.com>
Cc: pbonzini@redhat.com, oupton@google.com, yuzenghui@huawei.com,
	dmatlack@google.com, kvm@vger.kernel.org, kvmarm@lists.linux.dev,
	qperret@google.com, catalin.marinas@arm.com,
	andrew.jones@linux.dev, seanjc@google.com,
	alexandru.elisei@arm.com, suzuki.poulose@arm.com,
	eric.auger@redhat.com, gshan@redhat.com, reijiw@google.com,
	rananta@google.com, bgardon@google.com, ricarkol@gmail.com,
	Shaoqin Huang <shahuang@redhat.com>
Subject: Re: [PATCH v6 02/12] KVM: arm64: Add KVM_PGTABLE_WALK ctx->flags for skipping BBM and CMO
Date: Sun, 12 Mar 2023 10:49:28 +0000
Message-ID: <87cz5e5jnr.wl-maz@kernel.org>
In-Reply-To: <20230307034555.39733-3-ricarkol@google.com>

On Tue, 07 Mar 2023 03:45:45 +0000,
Ricardo Koller <ricarkol@google.com> wrote:
> 
> Add two flags to kvm_pgtable_visit_ctx, KVM_PGTABLE_WALK_SKIP_BBM and
> KVM_PGTABLE_WALK_SKIP_CMO, to indicate that the walk should not
> perform break-before-make (BBM) nor cache maintenance operations
> (CMO). This will by a future commit to create unlinked tables not

This will *be used*?

> accessible to the HW page-table walker.  This is safe as these
> unlinked tables are not visible to the HW page-table walker.

I don't think this last sentence makes much sense. The PTW is always
coherent with the CPU caches and doesn't require cache maintenance
(CMOs are solely for the pages the PTs point to).
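To be concrete, the CMOs in stage2_map_walker_try_leaf() (quoted
further down) already operate on kvm_pte_follow(new, mm_ops), i.e. on
the page the new PTE maps, never on the page-table memory itself:

	/* The CMO targets the mapped page, not the PT storage: */
	mm_ops->dcache_clean_inval_poc(kvm_pte_follow(new, mm_ops),
				       granule);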

But this makes me question this patch further.

The key observation here is that if you are creating new PTs that
shadow an existing structure and still point to the same data pages,
the cache state is independent of the intermediate PT walk, and thus
CMOs are pointless anyway. So skipping CMOs makes sense.

I agree with the assertion that there is little point in doing BBM
when *creating* page tables, as all PTs start in an invalid state. But
then, why do you need to skip it? The invalidation calls are already
gated on the previous pointer being valid, which I presume won't be
the case for what you describe here.
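Untested sketch to illustrate the point: for entries installed in a
freshly zeroed, unlinked table, ctx->old is an invalid PTE, so both
invalidation branches below are already dead without any new flag:

	if (kvm_pte_table(ctx->old, ctx->level))	/* false: old is invalid */
		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
	else if (kvm_pte_valid(ctx->old))		/* also false */
		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
			     ctx->addr, ctx->level);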

> 
> Signed-off-by: Ricardo Koller <ricarkol@google.com>
> Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 18 ++++++++++++++++++
>  arch/arm64/kvm/hyp/pgtable.c         | 27 ++++++++++++++++-----------
>  2 files changed, 34 insertions(+), 11 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 26a4293726c1..c7a269cad053 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -195,6 +195,12 @@ typedef bool (*kvm_pgtable_force_pte_cb_t)(u64 addr, u64 end,
>   *					with other software walkers.
>   * @KVM_PGTABLE_WALK_HANDLE_FAULT:	Indicates the page-table walk was
>   *					invoked from a fault handler.
> + * @KVM_PGTABLE_WALK_SKIP_BBM:		Visit and update table entries
> + *					without Break-before-make
> + *					requirements.
> + * @KVM_PGTABLE_WALK_SKIP_CMO:		Visit and update table entries
> + *					without Cache maintenance
> + *					operations required.

We have both I and D side CMOs. Is it reasonable to always treat them
identically?
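If not, a split along I/D lines could look like this (names and bit
positions invented here, purely as a sketch):

	KVM_PGTABLE_WALK_SKIP_CMO_D	= BIT(6),	/* D-side clean+inval to PoC */
	KVM_PGTABLE_WALK_SKIP_CMO_I	= BIT(7),	/* I-side inval to PoU */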

>   */
>  enum kvm_pgtable_walk_flags {
>  	KVM_PGTABLE_WALK_LEAF			= BIT(0),
> @@ -202,6 +208,8 @@ enum kvm_pgtable_walk_flags {
>  	KVM_PGTABLE_WALK_TABLE_POST		= BIT(2),
>  	KVM_PGTABLE_WALK_SHARED			= BIT(3),
>  	KVM_PGTABLE_WALK_HANDLE_FAULT		= BIT(4),
> +	KVM_PGTABLE_WALK_SKIP_BBM		= BIT(5),
> +	KVM_PGTABLE_WALK_SKIP_CMO		= BIT(6),
>  };
>  
>  struct kvm_pgtable_visit_ctx {
> @@ -223,6 +231,16 @@ static inline bool kvm_pgtable_walk_shared(const struct kvm_pgtable_visit_ctx *c
>  	return ctx->flags & KVM_PGTABLE_WALK_SHARED;
>  }
>  
> +static inline bool kvm_pgtable_walk_skip_bbm(const struct kvm_pgtable_visit_ctx *ctx)
> +{
> +	return ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM;

Probably worth wrapping this with an 'unlikely'.
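i.e. something like:

	return unlikely(ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM);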

> +}
> +
> +static inline bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
> +{
> +	return ctx->flags & KVM_PGTABLE_WALK_SKIP_CMO;

Same here.

Also, why are these in kvm_pgtable.h? Can't they be moved inside
pgtable.c and thus have the "inline" attribute dropped?
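Untested, but something along these lines (keeping the unlikely()
from above):

	/* In arch/arm64/kvm/hyp/pgtable.c instead: */
	static bool kvm_pgtable_walk_skip_bbm(const struct kvm_pgtable_visit_ctx *ctx)
	{
		return unlikely(ctx->flags & KVM_PGTABLE_WALK_SKIP_BBM);
	}

	static bool kvm_pgtable_walk_skip_cmo(const struct kvm_pgtable_visit_ctx *ctx)
	{
		return unlikely(ctx->flags & KVM_PGTABLE_WALK_SKIP_CMO);
	}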

> +}
> +
>  /**
>   * struct kvm_pgtable_walker - Hook into a page-table walk.
>   * @cb:		Callback function to invoke during the walk.
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index a3246d6cddec..4f703cc4cb03 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -741,14 +741,17 @@ static bool stage2_try_break_pte(const struct kvm_pgtable_visit_ctx *ctx,
>  	if (!stage2_try_set_pte(ctx, KVM_INVALID_PTE_LOCKED))
>  		return false;
>  
> -	/*
> -	 * Perform the appropriate TLB invalidation based on the evicted pte
> -	 * value (if any).
> -	 */
> -	if (kvm_pte_table(ctx->old, ctx->level))
> -		kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
> -	else if (kvm_pte_valid(ctx->old))
> -		kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu, ctx->addr, ctx->level);
> +	if (!kvm_pgtable_walk_skip_bbm(ctx)) {
> +		/*
> +		 * Perform the appropriate TLB invalidation based on the
> +		 * evicted pte value (if any).
> +		 */
> +		if (kvm_pte_table(ctx->old, ctx->level))

You're not skipping BBM here. You're skipping the TLB invalidation.
Not quite the same thing.
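If the flag stays, naming it after what it actually elides would be
clearer, e.g. (helper name purely illustrative):

	/* Hypothetical rename: the flag skips the TLBI, not BBM as a whole */
	if (!kvm_pgtable_walk_skip_bbm_tlbi(ctx)) {
		if (kvm_pte_table(ctx->old, ctx->level))
			kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
		else if (kvm_pte_valid(ctx->old))
			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
				     ctx->addr, ctx->level);
	}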

> +			kvm_call_hyp(__kvm_tlb_flush_vmid, mmu);
> +		else if (kvm_pte_valid(ctx->old))
> +			kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, mmu,
> +				     ctx->addr, ctx->level);
> +	}
>  
>  	if (stage2_pte_is_counted(ctx->old))
>  		mm_ops->put_page(ctx->ptep);
> @@ -832,11 +835,13 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
>  		return -EAGAIN;
>  
>  	/* Perform CMOs before installation of the guest stage-2 PTE */
> -	if (mm_ops->dcache_clean_inval_poc && stage2_pte_cacheable(pgt, new))
> +	if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->dcache_clean_inval_poc &&
> +	    stage2_pte_cacheable(pgt, new))
>  		mm_ops->dcache_clean_inval_poc(kvm_pte_follow(new, mm_ops),
> -						granule);
> +					       granule);
>  
> -	if (mm_ops->icache_inval_pou && stage2_pte_executable(new))
> +	if (!kvm_pgtable_walk_skip_cmo(ctx) && mm_ops->icache_inval_pou &&
> +	    stage2_pte_executable(new))
>  		mm_ops->icache_inval_pou(kvm_pte_follow(new, mm_ops), granule);
>  
>  	stage2_make_pte(ctx, new);

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
