From: Alexandru Elisei <alexandru.elisei@arm.com>
To: Will Deacon <will@kernel.org>, kvmarm@lists.cs.columbia.edu
Cc: Gavin Shan <gshan@redhat.com>,
Suzuki Poulose <suzuki.poulose@arm.com>,
Marc Zyngier <maz@kernel.org>,
Quentin Perret <qperret@google.com>,
James Morse <james.morse@arm.com>,
Catalin Marinas <catalin.marinas@arm.com>,
kernel-team@android.com, linux-arm-kernel@lists.infradead.org
Subject: Re: [PATCH v4 06/21] KVM: arm64: Add support for stage-2 map()/unmap() in generic page-table
Date: Thu, 10 Sep 2020 12:20:42 +0100
Message-ID: <f5939f12-56e8-794c-8d9b-9ae348bba3c0@arm.com>
In-Reply-To: <20200907152344.12978-7-will@kernel.org>
Hi Will,
On 9/7/20 4:23 PM, Will Deacon wrote:
> Add stage-2 map() and unmap() operations to the generic page-table code.
>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Quentin Perret <qperret@google.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Signed-off-by: Will Deacon <will@kernel.org>
> ---
> arch/arm64/include/asm/kvm_pgtable.h | 39 ++++
> arch/arm64/kvm/hyp/pgtable.c | 273 +++++++++++++++++++++++++++
> 2 files changed, 312 insertions(+)
>
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 85078bb632bb..7258966d3daa 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -136,6 +136,45 @@ int kvm_pgtable_stage2_init(struct kvm_pgtable *pgt, struct kvm *kvm);
> */
> void kvm_pgtable_stage2_destroy(struct kvm_pgtable *pgt);
>
> +/**
> + * kvm_pgtable_stage2_map() - Install a mapping in a guest stage-2 page-table.
> + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init().
> + * @addr: Intermediate physical address at which to place the mapping.
> + * @size: Size of the mapping.
> + * @phys: Physical address of the memory to map.
> + * @prot: Permissions and attributes for the mapping.
> + * @mc: Cache of pre-allocated GFP_PGTABLE_USER memory from which to
> + * allocate page-table pages.
> + *
> + * If device attributes are not explicitly requested in @prot, then the
> + * mapping will be normal, cacheable.
> + *
> + * Note that this function will both coalesce existing table entries and split
> + * existing block mappings, relying on page-faults to fault back areas outside
> + * of the new mapping lazily.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +int kvm_pgtable_stage2_map(struct kvm_pgtable *pgt, u64 addr, u64 size,
> + u64 phys, enum kvm_pgtable_prot prot,
> + struct kvm_mmu_memory_cache *mc);
> +
> +/**
> + * kvm_pgtable_stage2_unmap() - Remove a mapping from a guest stage-2 page-table.
> + * @pgt: Page-table structure initialised by kvm_pgtable_stage2_init().
> + * @addr: Intermediate physical address from which to remove the mapping.
> + * @size: Size of the mapping.
> + *
> + * TLB invalidation is performed for each page-table entry cleared during the
> + * unmapping operation and the reference count for the page-table page
> + * containing the cleared entry is decremented, with unreferenced pages being
> + * freed. Unmapping a cacheable page will ensure that it is clean to the PoC if
> + * FWB is not supported by the CPU.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +int kvm_pgtable_stage2_unmap(struct kvm_pgtable *pgt, u64 addr, u64 size);
> +
> /**
> * kvm_pgtable_walk() - Walk a page-table.
> * @pgt: Page-table structure initialised by kvm_pgtable_*_init().
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index 96e21017830b..4623380cf9de 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -32,10 +32,19 @@
> #define KVM_PTE_LEAF_ATTR_LO_S1_SH_IS 3
> #define KVM_PTE_LEAF_ATTR_LO_S1_AF BIT(10)
>
> +#define KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR GENMASK(5, 2)
> +#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R BIT(6)
> +#define KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W BIT(7)
> +#define KVM_PTE_LEAF_ATTR_LO_S2_SH GENMASK(9, 8)
> +#define KVM_PTE_LEAF_ATTR_LO_S2_SH_IS 3
> +#define KVM_PTE_LEAF_ATTR_LO_S2_AF BIT(10)
> +
> #define KVM_PTE_LEAF_ATTR_HI GENMASK(63, 51)
>
> #define KVM_PTE_LEAF_ATTR_HI_S1_XN BIT(54)
>
> +#define KVM_PTE_LEAF_ATTR_HI_S2_XN BIT(54)
> +
> struct kvm_pgtable_walk_data {
> struct kvm_pgtable *pgt;
> struct kvm_pgtable_walker *walker;
> @@ -417,6 +426,270 @@ void kvm_pgtable_hyp_destroy(struct kvm_pgtable *pgt)
> pgt->pgd = NULL;
> }
>
> +struct stage2_map_data {
> + u64 phys;
> + kvm_pte_t attr;
> +
> + kvm_pte_t *anchor;
> +
> + struct kvm_s2_mmu *mmu;
> + struct kvm_mmu_memory_cache *memcache;
> +};
> +
> +static int stage2_map_set_prot_attr(enum kvm_pgtable_prot prot,
> + struct stage2_map_data *data)
> +{
> + bool device = prot & KVM_PGTABLE_PROT_DEVICE;
> + kvm_pte_t attr = device ? PAGE_S2_MEMATTR(DEVICE_nGnRE) :
> + PAGE_S2_MEMATTR(NORMAL);
> + u32 sh = KVM_PTE_LEAF_ATTR_LO_S2_SH_IS;
> +
> + if (!(prot & KVM_PGTABLE_PROT_X))
> + attr |= KVM_PTE_LEAF_ATTR_HI_S2_XN;
> + else if (device)
> + return -EINVAL;
> +
> + if (prot & KVM_PGTABLE_PROT_R)
> + attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_R;
> +
> + if (prot & KVM_PGTABLE_PROT_W)
> + attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
> +
> + attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> + attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
> + data->attr = attr;
> + return 0;
> +}
> +
> +static bool stage2_map_walker_try_leaf(u64 addr, u64 end, u32 level,
> + kvm_pte_t *ptep,
> + struct stage2_map_data *data)
> +{
> + u64 granule = kvm_granule_size(level), phys = data->phys;
> +
> + if (!kvm_block_mapping_supported(addr, end, phys, level))
> + return false;
> +
> + if (kvm_set_valid_leaf_pte(ptep, phys, data->attr, level))
> + goto out;
> +
> + /* There's an existing valid leaf entry, so perform break-before-make */
> + kvm_set_invalid_pte(ptep);
> + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
> + kvm_set_valid_leaf_pte(ptep, phys, data->attr, level);
> +out:
> + data->phys += granule;
> + return true;
> +}
> +
> +static int stage2_map_walk_table_pre(u64 addr, u64 end, u32 level,
> + kvm_pte_t *ptep,
> + struct stage2_map_data *data)
> +{
> + if (data->anchor)
> + return 0;
> +
> + if (!kvm_block_mapping_supported(addr, end, data->phys, level))
> + return 0;
> +
> + kvm_set_invalid_pte(ptep);
> + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, 0);
> + data->anchor = ptep;
> + return 0;
> +}
> +
> +static int stage2_map_walk_leaf(u64 addr, u64 end, u32 level, kvm_pte_t *ptep,
> + struct stage2_map_data *data)
> +{
> + kvm_pte_t *childp, pte = *ptep;
> + struct page *page = virt_to_page(ptep);
> +
> + if (data->anchor) {
> + if (kvm_pte_valid(pte))
> + put_page(page);
> +
> + return 0;
> + }
> +
> + if (stage2_map_walker_try_leaf(addr, end, level, ptep, data))
> + goto out_get_page;
> +
> + if (WARN_ON(level == KVM_PGTABLE_MAX_LEVELS - 1))
> + return -EINVAL;
> +
> + if (!data->memcache)
> + return -ENOMEM;
> +
> + childp = kvm_mmu_memory_cache_alloc(data->memcache);
> + if (!childp)
> + return -ENOMEM;
> +
> + /*
> + * If we've run into an existing block mapping then replace it with
> + * a table. Accesses beyond 'end' that fall within the new table
> + * will be mapped lazily.
> + */
> + if (kvm_pte_valid(pte)) {
> + kvm_set_invalid_pte(ptep);
> + kvm_call_hyp(__kvm_tlb_flush_vmid_ipa, data->mmu, addr, level);
> + put_page(page);
> + }
> +
> + kvm_set_table_pte(ptep, childp);
> +
> +out_get_page:
> + get_page(page);
> + return 0;
> +}
> +
> +static int stage2_map_walk_table_post(u64 addr, u64 end, u32 level,
> + kvm_pte_t *ptep,
> + struct stage2_map_data *data)
> +{
> + int ret = 0;
> +
> + if (!data->anchor)
> + return 0;
> +
> + free_page((unsigned long)kvm_pte_follow(*ptep));
> + put_page(virt_to_page(ptep));
> +
> + if (data->anchor == ptep) {
> + data->anchor = NULL;
> + ret = stage2_map_walk_leaf(addr, end, level, ptep, data);
> + }
I had another look at this function. If we're back at the anchor entry, then we
know from the pre-order visitor that 1. a block mapping is supported at this
level and 2. the pte was invalidated. This means that kvm_set_valid_leaf_pte()
will succeed in changing the entry. How about, instead of calling
stage2_map_walk_leaf() -> stage2_map_walker_try_leaf() ->
kvm_set_valid_leaf_pte(), we call kvm_set_valid_leaf_pte() directly, followed by
get_page(virt_to_page(ptep))? It would make the code a lot easier to follow
(stage2_map_walk_leaf() is pretty complicated, imo, but that can't really be
avoided), and also slightly faster.
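To make the suggestion concrete, here is a rough user-space sketch of the
post-order visitor with the direct call. It is only a model, not a tested
kernel patch: the model_* names, MODEL_PTE_VALID and the page_refs counter are
hypothetical stand-ins for kvm_set_valid_leaf_pte(), the valid bit and
get_page()/put_page(), and freeing the child table is left out. I am also
assuming the direct call would still need to advance data->phys, since
stage2_map_walker_try_leaf() does that today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t kvm_pte_t;

/* Hypothetical valid bit for the model; not the real PTE encoding. */
#define MODEL_PTE_VALID 1ULL

struct model_map_data {
	uint64_t phys;
	kvm_pte_t attr;
	kvm_pte_t *anchor;
	int page_refs;	/* stands in for the refcount of the page holding ptep */
};

/*
 * Stand-in for kvm_set_valid_leaf_pte(): writes the entry and returns true
 * if the old entry was invalid; otherwise only reports whether the old
 * entry already matches.
 */
static bool model_set_valid_leaf_pte(kvm_pte_t *ptep, uint64_t pa,
				     kvm_pte_t attr)
{
	kvm_pte_t old = *ptep, pte = pa | attr | MODEL_PTE_VALID;

	if (old & MODEL_PTE_VALID)
		return old == pte;

	*ptep = pte;
	return true;
}

/* Post-order visitor with the direct call suggested above. */
static int model_map_walk_table_post(kvm_pte_t *ptep, uint64_t granule,
				     struct model_map_data *data)
{
	if (!data->anchor)
		return 0;

	/* put_page(virt_to_page(ptep)) in the patch; free_page() elided. */
	data->page_refs--;

	if (data->anchor == ptep) {
		/* The pre-order visitor invalidated this entry. */
		assert(!(*ptep & MODEL_PTE_VALID));

		data->anchor = NULL;
		if (!model_set_valid_leaf_pte(ptep, data->phys, data->attr))
			return -1;

		data->page_refs++;	/* get_page(virt_to_page(ptep)) */
		data->phys += granule;	/* assumption: keep try_leaf's update */
	}

	return 0;
}
```

The point of the sketch is just that, at the anchor, the only work left is
writing the block entry and taking the page reference.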
> +
> + return ret;
> +}
> +
> +/*
> + * This is a little fiddly, as we use all three of the walk flags. The idea
> + * is that the TABLE_PRE callback runs for table entries on the way down,
> + * looking for table entries which we could conceivably replace with a
> + * block entry for this mapping. If it finds one, then it sets the 'anchor'
> + * field in 'struct stage2_map_data' to point at the table entry, before
> + * clearing the entry to zero and descending into the now detached table.
> + *
> + * The behaviour of the LEAF callback then depends on whether or not the
> + * anchor has been set. If not, then we're not using a block mapping higher
> + * up the table and we perform the mapping at the existing leaves instead.
> + * If, on the other hand, the anchor _is_ set, then we drop references to
> + * all valid leaves so that the pages beneath the anchor can be freed.
> + *
> + * Finally, the TABLE_POST callback does nothing if the anchor has not
> + * been set, but otherwise frees the page-table pages while walking back up
> + * the page-table, installing the block entry when it revisits the anchor
> + * pointer and clearing the anchor to NULL.
> + */
The comment does a great job of explaining what is going on, thank you.
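For anyone following along, the flow the comment describes can be modelled
roughly as below. This is a toy with a single table entry and a handful of
leaves, not the real walker: every toy_* name is made up for illustration,
block_ok stands in for the kvm_block_mapping_supported() check, and TLB
invalidation and real page freeing are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_LEAVES 4

enum toy_kind { TOY_INVALID, TOY_TABLE, TOY_BLOCK };

struct toy_entry {
	enum toy_kind kind;
	bool leaf_valid[TOY_LEAVES];	/* child table, used for TOY_TABLE */
};

struct toy_walk {
	struct toy_entry *anchor;	/* set while collapsing a table */
	int refs_dropped;		/* models put_page() on valid leaves */
	bool child_freed;		/* models freeing the child table */
};

/* TABLE_PRE: detach a table that the new mapping could cover with a block. */
static void toy_table_pre(struct toy_entry *e, bool block_ok,
			  struct toy_walk *w)
{
	if (w->anchor || !block_ok)
		return;

	w->anchor = e;
}

/* LEAF: map at the leaf, or only drop references when under an anchor. */
static void toy_leaf(struct toy_entry *parent, int idx, struct toy_walk *w)
{
	if (!w->anchor) {
		parent->leaf_valid[idx] = true;
		return;
	}

	if (parent->leaf_valid[idx])
		w->refs_dropped++;
}

/* TABLE_POST: free the detached table, install the block at the anchor. */
static void toy_table_post(struct toy_entry *e, struct toy_walk *w)
{
	if (!w->anchor)
		return;

	w->child_freed = true;

	if (w->anchor == e) {
		w->anchor = NULL;
		e->kind = TOY_BLOCK;
	}
}

/* Visit one table entry in the order the comment describes. */
static void toy_walk_table(struct toy_entry *e, bool block_ok,
			   struct toy_walk *w)
{
	toy_table_pre(e, block_ok, w);
	for (int i = 0; i < TOY_LEAVES; i++)
		toy_leaf(e, i, w);
	toy_table_post(e, w);
}
```

With block_ok set, the table entry ends up as a block and only the valid
leaves have their references dropped; without it, the leaves are mapped in
place and the table entry is left alone.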
Thanks,
Alex