* [PATCH v1 00/44] KVM: arm64: Preamble for pKVM
@ 2024-03-27 17:34 Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 01/44] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
                   ` (43 more replies)
  0 siblings, 44 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

We are getting closer to upstreaming the remaining part of pKVM
[1]. To make the process easier for us and for our dear
reviewers, we are sending this patch series as a preamble to the
upcoming patches.

This series is based on Linux 6.9-rc1. Most of the patches in
this series are self-standing, without dependencies on other
patches within the same series, and can be applied directly to
Linux 6.9-rc1.

This series is a bit of a Bombay mix of patches we've been
carrying. There's no single overarching theme, but they do improve
the code by fixing existing bugs in pKVM, refactoring code to
make it more readable and easier to re-use for pKVM, or adding
functionality to the existing pKVM code upstream.

For a technical deep dive into pKVM, please refer to Quentin
Perret's KVM Forum presentation [2, 3]. The code for the pKVM core
series, which we plan to send for review next, is here [1].

Cheers,
Fuad, Quentin, Will, and Marc

[1] https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-core

[2] Protected KVM on arm64 (slides)
https://static.sched.com/hosted_files/kvmforum2022/88/KVM%20forum%202022%20-%20pKVM%20deep%20dive.pdf

[3] Protected KVM on arm64 (video)
https://www.youtube.com/watch?v=9npebeVFbFw

Fuad Tabba (23):
  KVM: arm64: Change kvm_handle_mmio_return() return polarity
  KVM: arm64: Use enum instead of helper for checking FP-state
  KVM: arm64: Move setting the page as dirty out of the critical section
  KVM: arm64: Split up nvhe/fixed_config.h
  KVM: arm64: Move pstate reset value definitions to kvm_arm.h
  KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  KVM: arm64: Refactor calculating SVE state size to use helpers
  KVM: arm64: Use active guest SVE vector length on guest restore
  KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
  KVM: arm64: Move some kvm_psci functions to a shared header
  KVM: arm64: Refactor reset_mpidr() to extract its computation
  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  KVM: arm64: Introduce gfn_to_memslot_prot()
  KVM: arm64: Do not use the hva in kvm_handle_guest_abort()
  KVM: arm64: Do not set the virtual timer offset for protected vCPUs
  KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()
  KVM: arm64: Do not re-initialize the KVM lock
  KVM: arm64: Check directly whether a vcpu is protected
  KVM: arm64: Trap debug break and watch from guest
  KVM: arm64: Restrict protected VM capabilities
  KVM: arm64: Do not support MTE for protected VMs
  KVM: arm64: Move pkvm_vcpu_init_traps() to hyp vcpu init
  KVM: arm64: Fix initializing traps in protected mode

Marc Zyngier (6):
  KVM: arm64: Check for PTE validity when checking for
    executable/cacheable
  KVM: arm64: Simplify vgic-v3 hypercalls
  KVM: arm64: Introduce predicates to check for protected state
  KVM: arm64: Add PC_UPDATE_REQ flags covering all PC updates
  KVM: arm64: Add vcpu flag copy primitive
  KVM: arm64: Force injection of a data abort on NISV MMIO exit

Quentin Perret (5):
  KVM: arm64: Avoid BUG-ing from the host abort path
  KVM: arm64: Add is_pkvm_initialized() helper
  KVM: arm64: Refactor enter_exception64()
  KVM: arm64: Prevent kmemleak from accessing .hyp.data
  KVM: arm64: Issue CMOs when tearing down guest s2 pages

Will Deacon (10):
  KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
  KVM: arm64: Support TLB invalidation in guest context
  KVM: arm64: Introduce hyp_rwlock_t
  KVM: arm64: Add atomics-based checking refcount implementation at EL2
  KVM: arm64: Use atomic refcount helpers for 'struct
    hyp_page::refcount'
  KVM: arm64: Remove locking from EL2 allocation fast-paths
  KVM: arm64: Reformat/beautify PTP hypercall documentation
  KVM: arm64: Rename firmware pseudo-register documentation file
  KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst
  KVM: arm64: Advertise GICv3 sysreg interface to protected guests

 Documentation/virt/kvm/api.rst                |   7 +
 .../virt/kvm/arm/fw-pseudo-registers.rst      | 138 +++++++++++
 Documentation/virt/kvm/arm/hypercalls.rst     | 180 ++++----------
 Documentation/virt/kvm/arm/index.rst          |   1 +
 Documentation/virt/kvm/arm/ptp_kvm.rst        |  38 +--
 arch/arm64/include/asm/kvm_arm.h              |  12 +
 arch/arm64/include/asm/kvm_asm.h              |   9 +-
 arch/arm64/include/asm/kvm_emulate.h          |  10 +
 arch/arm64/include/asm/kvm_host.h             |  42 +++-
 arch/arm64/include/asm/kvm_hyp.h              |   4 +-
 arch/arm64/include/asm/kvm_pkvm.h             | 234 ++++++++++++++++++
 arch/arm64/include/asm/virt.h                 |  12 +-
 arch/arm64/kvm/arch_timer.c                   |  20 +-
 arch/arm64/kvm/arm.c                          | 102 ++++++--
 arch/arm64/kvm/fpsimd.c                       |  44 ++--
 arch/arm64/kvm/hyp/exception.c                | 100 ++++----
 arch/arm64/kvm/hyp/include/hyp/switch.h       |  14 +-
 .../arm64/kvm/hyp/include/nvhe/fixed_config.h | 223 -----------------
 arch/arm64/kvm/hyp/include/nvhe/gfp.h         |   6 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  18 +-
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |  18 ++
 arch/arm64/kvm/hyp/include/nvhe/refcount.h    |  72 ++++++
 arch/arm64/kvm/hyp/include/nvhe/rwlock.h      | 129 ++++++++++
 .../arm64/kvm/hyp/include/nvhe/trap_handler.h |   2 -
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  32 +--
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         |  12 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          |  21 +-
 arch/arm64/kvm/hyp/nvhe/pkvm.c                |  54 ++--
 arch/arm64/kvm/hyp/nvhe/setup.c               |   1 -
 arch/arm64/kvm/hyp/nvhe/switch.c              |  10 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c            |  13 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c                 | 114 +++++++--
 arch/arm64/kvm/hyp/pgtable.c                  |  21 +-
 arch/arm64/kvm/hyp/vgic-v3-sr.c               |  27 +-
 arch/arm64/kvm/hyp/vhe/switch.c               |   2 +-
 arch/arm64/kvm/mmio.c                         |  13 +-
 arch/arm64/kvm/mmu.c                          |  25 +-
 arch/arm64/kvm/pkvm.c                         |   2 +-
 arch/arm64/kvm/psci.c                         |  28 ---
 arch/arm64/kvm/reset.c                        |  20 +-
 arch/arm64/kvm/sys_regs.c                     |  14 +-
 arch/arm64/kvm/sys_regs.h                     |  19 ++
 arch/arm64/kvm/vgic/vgic-v2.c                 |   9 +-
 arch/arm64/kvm/vgic/vgic-v3.c                 |  23 +-
 arch/arm64/kvm/vgic/vgic.c                    |  11 -
 arch/arm64/kvm/vgic/vgic.h                    |   2 -
 include/kvm/arm_psci.h                        |  29 +++
 include/kvm/arm_vgic.h                        |   1 -
 include/linux/kvm_host.h                      |   1 +
 virt/kvm/kvm_main.c                           |  22 ++
 50 files changed, 1225 insertions(+), 736 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/fw-pseudo-registers.rst
 delete mode 100644 arch/arm64/kvm/hyp/include/nvhe/fixed_config.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/refcount.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/rwlock.h

-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 01/44] KVM: arm64: Change kvm_handle_mmio_return() return polarity
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state Fuad Tabba
                   ` (42 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Most exit handlers return <= 0 to indicate that the host needs to
handle the exit. Make kvm_handle_mmio_return() consistent with
the exit handlers in handle_exit(). This makes the code easier to
reason about, and makes it easier to add other handlers in future
patches.

No functional change intended.
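
For illustration only, here is a standalone sketch (not part of the
patch; the names and simplified flow are made up) of the return
convention this aligns with: a positive return means "resume the
guest", while zero or negative means "exit to the host/userspace".

/* Minimal, self-contained sketch of the exit-handler convention. */
#include <stdio.h>

static int handle_mmio_return_sketch(int mmio_needed)
{
	if (!mmio_needed)
		return 1;	/* already handled: resume the guest */

	/* ...complete the MMIO access and advance the PC... */
	return 1;		/* handled here: resume the guest */
}

int main(void)
{
	int ret = handle_mmio_return_sketch(0);

	if (ret <= 0)
		printf("exit to host/userspace (ret=%d)\n", ret);
	else
		printf("resume guest\n");
	return 0;
}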

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/arm.c  | 2 +-
 arch/arm64/kvm/mmio.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 3dee5490eea9..a38943cda7cf 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -980,7 +980,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	if (run->exit_reason == KVM_EXIT_MMIO) {
 		ret = kvm_handle_mmio_return(vcpu);
-		if (ret)
+		if (ret <= 0)
 			return ret;
 	}
 
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 200c8019a82a..5e1ffb0d5363 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -86,7 +86,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 
 	/* Detect an already handled MMIO return */
 	if (unlikely(!vcpu->mmio_needed))
-		return 0;
+		return 1;
 
 	vcpu->mmio_needed = 0;
 
@@ -117,7 +117,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 	 */
 	kvm_incr_pc(vcpu);
 
-	return 0;
+	return 1;
 }
 
 int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 01/44] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-28 16:19   ` Mark Brown
  2024-04-08  7:39   ` Marc Zyngier
  2024-03-27 17:34 ` [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
                   ` (41 subsequent siblings)
  43 siblings, 2 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Before the conversion of the various FP-state booleans into an
enum representing the state, this helper may have clarified
things. Since the introduction of the enum, the helper obfuscates
rather than clarifies. Removing it also makes the code consistent
with other parts that check the FP-state.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ------
 arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
 arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
 3 files changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index e3fcf8c4d5b4..1a6dfd035531 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -39,12 +39,6 @@ struct kvm_exception_table_entry {
 extern struct kvm_exception_table_entry __start___kvm_ex_table;
 extern struct kvm_exception_table_entry __stop___kvm_ex_table;
 
-/* Check whether the FP regs are owned by the guest */
-static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu)
-{
-	return vcpu->arch.fp_state == FP_STATE_GUEST_OWNED;
-}
-
 /* Save the 32-bit only FPSIMD system register state */
 static inline void __fpsimd_save_fpexc32(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index c50f8459e4fc..2a0b0d6da7c7 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -53,7 +53,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 			val |= CPTR_EL2_TSM;
 	}
 
-	if (!guest_owns_fp_regs(vcpu)) {
+	if (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED) {
 		if (has_hvhe())
 			val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |
 				 CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 1581df6aec87..e9197f086137 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -75,7 +75,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 
 	val |= CPTR_EL2_TAM;
 
-	if (guest_owns_fp_regs(vcpu)) {
+	if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
 		if (vcpu_has_sve(vcpu))
 			val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;
 	} else {
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 01/44] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-04-08  7:41   ` Marc Zyngier
  2024-03-27 17:34 ` [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
                   ` (40 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move the unlock earlier in user_mem_abort() to shorten the
critical section. This also helps with future refactoring and
reuse of similar code.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 18680771cdb0..3afc42d8833e 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 	read_lock(&kvm->mmu_lock);
 	pgt = vcpu->arch.hw_mmu->pgt;
-	if (mmu_invalidate_retry(kvm, mmu_seq))
+	if (mmu_invalidate_retry(kvm, mmu_seq)) {
+		ret = -EAGAIN;
 		goto out_unlock;
+	}
 
 	/*
 	 * If we are not forced to use page mapping, check if we are
@@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 					     memcache,
 					     KVM_PGTABLE_WALK_HANDLE_FAULT |
 					     KVM_PGTABLE_WALK_SHARED);
+out_unlock:
+	read_unlock(&kvm->mmu_lock);
 
 	/* Mark the page dirty only if the fault is handled successfully */
 	if (writable && !ret) {
@@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		mark_page_dirty_in_slot(kvm, memslot, gfn);
 	}
 
-out_unlock:
-	read_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
 	return ret != -EAGAIN ? ret : 0;
 }
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (2 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-04-08  7:44   ` Marc Zyngier
  2024-03-27 17:34 ` [PATCH v1 05/44] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
                   ` (39 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Quentin Perret <qperret@google.com>

Under certain circumstances __get_fault_info() may resolve the faulting
address using the AT instruction. Given that this is being done outside
of the host lock critical section, it is racy and the resolution via AT
may fail. We currently BUG() in this situation, which is obviously less
than ideal. Moving the address resolution to the critical section may
have a performance impact, so let's keep it where it is, but bail out
and return to the host to try a second time.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 861c76021a25..d48990eae1ef 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -533,7 +533,15 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
 	int ret = 0;
 
 	esr = read_sysreg_el2(SYS_ESR);
-	BUG_ON(!__get_fault_info(esr, &fault));
+	if (!__get_fault_info(esr, &fault)) {
+		/* Setting the address to an invalid value for use in tracing. */
+		addr = (u64)-1;
+		/*
+		 * We've presumably raced with a page-table change which caused
+		 * AT to fail, try again.
+		 */
+		return;
+	}
 
 	addr = (fault.hpfar_el2 & HPFAR_MASK) << 8;
 	ret = host_stage2_idmap(addr);
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 05/44] KVM: arm64: Check for PTE validity when checking for executable/cacheable
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (3 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 06/44] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
                   ` (38 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

Don't just assume that the PTE is valid when checking whether it
describes an executable or cacheable mapping.

This makes sure that we don't issue CMOs for invalid mappings.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 3fae5830f8d2..da54bb312910 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -907,12 +907,12 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
 static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
 {
 	u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
-	return memattr == KVM_S2_MEMATTR(pgt, NORMAL);
+	return kvm_pte_valid(pte) && memattr == KVM_S2_MEMATTR(pgt, NORMAL);
 }
 
 static bool stage2_pte_executable(kvm_pte_t pte)
 {
-	return !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
+	return kvm_pte_valid(pte) && !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
 }
 
 static u64 stage2_map_walker_phys_addr(const struct kvm_pgtable_visit_ctx *ctx,
@@ -1363,7 +1363,7 @@ static int stage2_flush_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	struct kvm_pgtable *pgt = ctx->arg;
 	struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops;
 
-	if (!kvm_pte_valid(ctx->old) || !stage2_pte_cacheable(pgt, ctx->old))
+	if (!stage2_pte_cacheable(pgt, ctx->old))
 		return 0;
 
 	if (mm_ops->dcache_clean_inval_poc)
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 06/44] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (4 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 05/44] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
                   ` (37 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

Break-before-make (BBM) can be expensive, as transitioning via an
invalid mapping (i.e. the "break" step) requires the completion of TLB
invalidation and can also cause other agents to fault concurrently on
the invalid mapping.

Since BBM is not required when changing only the software bits of a PTE,
avoid the sequence in this case and just update the PTE directly.
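
As an aside for reviewers, a tiny standalone sketch (not part of the
patch; the mask value is an assumption standing in for the PTE's
software-defined attribute bits) of the check that decides whether
BBM can be skipped:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed stand-in for the Stage-2 PTE's software-defined bits. */
#define SW_BITS		(0xfULL << 55)

/* BBM can be skipped iff the only differing bits are software bits. */
static bool only_sw_bits_differ(uint64_t old, uint64_t new)
{
	return ((old ^ new) & ~SW_BITS) == 0;
}

int main(void)
{
	uint64_t old = 0x0040000000000747ULL;
	uint64_t new = old | (1ULL << 55);	/* flip one software bit */

	printf("skip BBM: %s\n", only_sw_bits_differ(old, new) ? "yes" : "no");
	return 0;
}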

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index da54bb312910..7f3efb442d36 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -972,6 +972,21 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
 	if (!stage2_pte_needs_update(ctx->old, new))
 		return -EAGAIN;
 
+	/* If we're only changing software bits, then store them and go! */
+	if (!kvm_pgtable_walk_shared(ctx) &&
+	    !((ctx->old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW)) {
+		bool old_is_counted = stage2_pte_is_counted(ctx->old);
+
+		if (old_is_counted != stage2_pte_is_counted(new)) {
+			if (old_is_counted)
+				mm_ops->put_page(ctx->ptep);
+			else
+				mm_ops->get_page(ctx->ptep);
+		}
+		WRITE_ONCE(*ctx->ptep, new);
+		return 0;
+	}
+
 	if (!stage2_try_break_pte(ctx, data->mmu))
 		return -EAGAIN;
 
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (5 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 06/44] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-04-15 11:36   ` Marc Zyngier
  2024-03-27 17:34 ` [PATCH v1 08/44] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
                   ` (36 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

Typically, TLB invalidation of guest stage-2 mappings using nVHE is
performed by a hypercall originating from the host. For the invalidation
instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
swizzle the active stage-2 context around the TLBI instruction.

With guest-to-host memory sharing and unsharing hypercalls
originating from the guest under pKVM, there is a need to support
both guest and host VMID invalidations issued from guest context.

Replace the __tlb_switch_to_{guest,host}() functions with a more general
{enter,exit}_vmid_context() implementation which supports being invoked
from guest context and acts as a no-op if the target context matches the
running context.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/tlb.c | 114 +++++++++++++++++++++++++++-------
 1 file changed, 90 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index a60fb13e2192..05a66b2ed76d 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -11,13 +11,23 @@
 #include <nvhe/mem_protect.h>
 
 struct tlb_inv_context {
-	u64		tcr;
+	struct kvm_s2_mmu	*mmu;
+	u64			tcr;
+	u64			sctlr;
 };
 
-static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
-				  struct tlb_inv_context *cxt,
-				  bool nsh)
+static void enter_vmid_context(struct kvm_s2_mmu *mmu,
+			       struct tlb_inv_context *cxt,
+			       bool nsh)
 {
+	struct kvm_s2_mmu *host_s2_mmu = &host_mmu.arch.mmu;
+	struct kvm_cpu_context *host_ctxt;
+	struct kvm_vcpu *vcpu;
+
+	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+	vcpu = host_ctxt->__hyp_running_vcpu;
+	cxt->mmu = NULL;
+
 	/*
 	 * We have two requirements:
 	 *
@@ -40,20 +50,52 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
 	else
 		dsb(ish);
 
+	/*
+	 * If we're already in the desired context, then there's nothing
+	 * to do.
+	 */
+	if (vcpu) {
+		/* We're in guest context */
+		if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
+			return;
+
+		cxt->mmu = vcpu->arch.hw_mmu;
+	} else {
+		/* We're in host context */
+		if (mmu == host_s2_mmu)
+			return;
+
+		cxt->mmu = host_s2_mmu;
+	}
+
 	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
 		u64 val;
 
 		/*
 		 * For CPUs that are affected by ARM 1319367, we need to
-		 * avoid a host Stage-1 walk while we have the guest's
-		 * VMID set in the VTTBR in order to invalidate TLBs.
-		 * We're guaranteed that the S1 MMU is enabled, so we can
-		 * simply set the EPD bits to avoid any further TLB fill.
+		 * avoid a Stage-1 walk with the old VMID while we have
+		 * the new VMID set in the VTTBR in order to invalidate TLBs.
+		 * We're guaranteed that the host S1 MMU is enabled, so
+		 * we can simply set the EPD bits to avoid any further
+		 * TLB fill. For guests, we ensure that the S1 MMU is
+		 * temporarily enabled in the next context.
 		 */
 		val = cxt->tcr = read_sysreg_el1(SYS_TCR);
 		val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
 		write_sysreg_el1(val, SYS_TCR);
 		isb();
+
+		if (vcpu) {
+			val = cxt->sctlr = read_sysreg_el1(SYS_SCTLR);
+			if (!(val & SCTLR_ELx_M)) {
+				val |= SCTLR_ELx_M;
+				write_sysreg_el1(val, SYS_SCTLR);
+				isb();
+			}
+		} else {
+			/* The host S1 MMU is always enabled. */
+			cxt->sctlr = SCTLR_ELx_M;
+		}
 	}
 
 	/*
@@ -62,20 +104,44 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
 	 * ensuring that we always have an ISB, but not two ISBs back
 	 * to back.
 	 */
-	__load_stage2(mmu, kern_hyp_va(mmu->arch));
+	if (vcpu)
+		__load_host_stage2();
+	else
+		__load_stage2(mmu, kern_hyp_va(mmu->arch));
+
 	asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
-static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
+static void exit_vmid_context(struct tlb_inv_context *cxt)
 {
-	__load_host_stage2();
+	struct kvm_s2_mmu *mmu = cxt->mmu;
+	struct kvm_cpu_context *host_ctxt;
+	struct kvm_vcpu *vcpu;
+
+	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+	vcpu = host_ctxt->__hyp_running_vcpu;
+
+	if (!mmu)
+		return;
+
+	if (vcpu)
+		__load_stage2(mmu, kern_hyp_va(mmu->arch));
+	else
+		__load_host_stage2();
 
 	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
-		/* Ensure write of the host VMID */
+		/* Ensure write of the old VMID */
 		isb();
-		/* Restore the host's TCR_EL1 */
+
+		if (!(cxt->sctlr & SCTLR_ELx_M)) {
+			write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
+			isb();
+		}
+
 		write_sysreg_el1(cxt->tcr, SYS_TCR);
 	}
+
+	cxt->mmu = NULL;
 }
 
 void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
@@ -84,7 +150,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	/*
 	 * We could do so much better if we had the VA as well.
@@ -105,7 +171,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
@@ -114,7 +180,7 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, true);
+	enter_vmid_context(mmu, &cxt, true);
 
 	/*
 	 * We could do so much better if we had the VA as well.
@@ -135,7 +201,7 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
 	dsb(nsh);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
@@ -152,7 +218,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	start = round_down(start, stride);
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride, 0);
 
@@ -161,7 +227,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
@@ -169,13 +235,13 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__tlbi(vmalls12e1is);
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
@@ -183,19 +249,19 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__tlbi(vmalle1);
 	asm volatile("ic iallu");
 	dsb(nsh);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_flush_vm_context(void)
 {
-	/* Same remark as in __tlb_switch_to_guest() */
+	/* Same remark as in enter_vmid_context() */
 	dsb(ish);
 	__tlbi(alle1is);
 	dsb(ish);
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 08/44] KVM: arm64: Simplify vgic-v3 hypercalls
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (6 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 09/44] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
                   ` (35 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore
hypercalls so that all of the EL2 GICv3 state is covered by a single pair
of hypercalls.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_asm.h   |  8 ++------
 arch/arm64/include/asm/kvm_hyp.h   |  4 ++--
 arch/arm64/kvm/arm.c               |  5 ++---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 24 ++++++------------------
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 27 +++++++++++++++++++++++----
 arch/arm64/kvm/vgic/vgic-v2.c      |  9 +--------
 arch/arm64/kvm/vgic/vgic-v3.c      | 23 ++---------------------
 arch/arm64/kvm/vgic/vgic.c         | 11 -----------
 arch/arm64/kvm/vgic/vgic.h         |  2 --
 include/kvm/arm_vgic.h             |  1 -
 10 files changed, 38 insertions(+), 76 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24b5e6b23417..a6330460d9e5 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -73,10 +73,8 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_range,
 	__KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context,
 	__KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_read_vmcr,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_write_vmcr,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs,
+	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_vmcr_aprs,
+	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_vmcr_aprs,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
@@ -241,8 +239,6 @@ extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 extern void __kvm_adjust_pc(struct kvm_vcpu *vcpu);
 
 extern u64 __vgic_v3_get_gic_config(void);
-extern u64 __vgic_v3_read_vmcr(void);
-extern void __vgic_v3_write_vmcr(u32 vmcr);
 extern void __vgic_v3_init_lrs(void);
 
 extern u64 __kvm_get_mdcr_el2(void);
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 3e2a1ac0c9bb..3e80464f8953 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -80,8 +80,8 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_activate_traps(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_deactivate_traps(struct vgic_v3_cpu_if *cpu_if);
-void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if);
-void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if);
+void __vgic_v3_save_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
+void __vgic_v3_restore_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
 int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_NVHE_HYPERVISOR__
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index a38943cda7cf..9cb39b6f070b 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -790,9 +790,8 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
 	 * doorbells to be signalled, should an interrupt become pending.
 	 */
 	preempt_disable();
-	kvm_vgic_vmcr_sync(vcpu);
 	vcpu_set_flag(vcpu, IN_WFI);
-	vgic_v4_put(vcpu);
+	kvm_vgic_put(vcpu);
 	preempt_enable();
 
 	kvm_vcpu_halt(vcpu);
@@ -800,7 +799,7 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
 
 	preempt_disable();
 	vcpu_clear_flag(vcpu, IN_WFI);
-	vgic_v4_load(vcpu);
+	kvm_vgic_load(vcpu);
 	preempt_enable();
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2385fd03ed87..b7d7ca966f2e 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -178,16 +178,6 @@ static void handle___vgic_v3_get_gic_config(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __vgic_v3_get_gic_config();
 }
 
-static void handle___vgic_v3_read_vmcr(struct kvm_cpu_context *host_ctxt)
-{
-	cpu_reg(host_ctxt, 1) = __vgic_v3_read_vmcr();
-}
-
-static void handle___vgic_v3_write_vmcr(struct kvm_cpu_context *host_ctxt)
-{
-	__vgic_v3_write_vmcr(cpu_reg(host_ctxt, 1));
-}
-
 static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
 {
 	__vgic_v3_init_lrs();
@@ -198,18 +188,18 @@ static void handle___kvm_get_mdcr_el2(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __kvm_get_mdcr_el2();
 }
 
-static void handle___vgic_v3_save_aprs(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_save_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
 
-	__vgic_v3_save_aprs(kern_hyp_va(cpu_if));
+	__vgic_v3_save_vmcr_aprs(kern_hyp_va(cpu_if));
 }
 
-static void handle___vgic_v3_restore_aprs(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_restore_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
 
-	__vgic_v3_restore_aprs(kern_hyp_va(cpu_if));
+	__vgic_v3_restore_vmcr_aprs(kern_hyp_va(cpu_if));
 }
 
 static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
@@ -340,10 +330,8 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__kvm_tlb_flush_vmid_range),
 	HANDLE_FUNC(__kvm_flush_cpu_context),
 	HANDLE_FUNC(__kvm_timer_set_cntvoff),
-	HANDLE_FUNC(__vgic_v3_read_vmcr),
-	HANDLE_FUNC(__vgic_v3_write_vmcr),
-	HANDLE_FUNC(__vgic_v3_save_aprs),
-	HANDLE_FUNC(__vgic_v3_restore_aprs),
+	HANDLE_FUNC(__vgic_v3_save_vmcr_aprs),
+	HANDLE_FUNC(__vgic_v3_restore_vmcr_aprs),
 	HANDLE_FUNC(__pkvm_vcpu_init_traps),
 	HANDLE_FUNC(__pkvm_init_vm),
 	HANDLE_FUNC(__pkvm_init_vcpu),
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 6cb638b184b1..7b397fad26f2 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -330,7 +330,7 @@ void __vgic_v3_deactivate_traps(struct vgic_v3_cpu_if *cpu_if)
 		write_gicreg(0, ICH_HCR_EL2);
 }
 
-void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
+static void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
 {
 	u64 val;
 	u32 nr_pre_bits;
@@ -363,7 +363,7 @@ void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
 	}
 }
 
-void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if)
+static void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if)
 {
 	u64 val;
 	u32 nr_pre_bits;
@@ -455,16 +455,35 @@ u64 __vgic_v3_get_gic_config(void)
 	return val;
 }
 
-u64 __vgic_v3_read_vmcr(void)
+static u64 __vgic_v3_read_vmcr(void)
 {
 	return read_gicreg(ICH_VMCR_EL2);
 }
 
-void __vgic_v3_write_vmcr(u32 vmcr)
+static void __vgic_v3_write_vmcr(u32 vmcr)
 {
 	write_gicreg(vmcr, ICH_VMCR_EL2);
 }
 
+void __vgic_v3_save_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if)
+{
+	__vgic_v3_save_aprs(cpu_if);
+	if (cpu_if->vgic_sre)
+		cpu_if->vgic_vmcr = __vgic_v3_read_vmcr();
+}
+
+void __vgic_v3_restore_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if)
+{
+	/*
+	 * If dealing with a GICv2 emulation on GICv3, VMCR_EL2.VFIQen
+	 * is dependent on ICC_SRE_EL1.SRE, and we have to perform the
+	 * VMCR_EL2 save/restore in the world switch.
+	 */
+	if (cpu_if->vgic_sre)
+		__vgic_v3_write_vmcr(cpu_if->vgic_vmcr);
+	__vgic_v3_restore_aprs(cpu_if);
+}
+
 static int __vgic_v3_bpr_min(void)
 {
 	/* See Pseudocode for VPriorityGroup */
diff --git a/arch/arm64/kvm/vgic/vgic-v2.c b/arch/arm64/kvm/vgic/vgic-v2.c
index 7e9cdb78f7ce..ae5a44d5702d 100644
--- a/arch/arm64/kvm/vgic/vgic-v2.c
+++ b/arch/arm64/kvm/vgic/vgic-v2.c
@@ -464,17 +464,10 @@ void vgic_v2_load(struct kvm_vcpu *vcpu)
 		       kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
 
-void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-
-	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
-}
-
 void vgic_v2_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 
-	vgic_v2_vmcr_sync(vcpu);
+	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 	cpu_if->vgic_apr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 4ea3340786b9..ed6e412cd74b 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -722,15 +722,7 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
-	/*
-	 * If dealing with a GICv2 emulation on GICv3, VMCR_EL2.VFIQen
-	 * is dependent on ICC_SRE_EL1.SRE, and we have to perform the
-	 * VMCR_EL2 save/restore in the world switch.
-	 */
-	if (likely(cpu_if->vgic_sre))
-		kvm_call_hyp(__vgic_v3_write_vmcr, cpu_if->vgic_vmcr);
-
-	kvm_call_hyp(__vgic_v3_restore_aprs, cpu_if);
+	kvm_call_hyp(__vgic_v3_restore_vmcr_aprs, cpu_if);
 
 	if (has_vhe())
 		__vgic_v3_activate_traps(cpu_if);
@@ -738,24 +730,13 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
 	WARN_ON(vgic_v4_load(vcpu));
 }
 
-void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
-
-	if (likely(cpu_if->vgic_sre))
-		cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
-}
-
 void vgic_v3_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
+	kvm_call_hyp(__vgic_v3_save_vmcr_aprs, cpu_if);
 	WARN_ON(vgic_v4_put(vcpu));
 
-	vgic_v3_vmcr_sync(vcpu);
-
-	kvm_call_hyp(__vgic_v3_save_aprs, cpu_if);
-
 	if (has_vhe())
 		__vgic_v3_deactivate_traps(cpu_if);
 }
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 4ec93587c8cd..fcc5747f51e9 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -939,17 +939,6 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu)
 		vgic_v3_put(vcpu);
 }
 
-void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
-		return;
-
-	if (kvm_vgic_global_state.type == VGIC_V2)
-		vgic_v2_vmcr_sync(vcpu);
-	else
-		vgic_v3_vmcr_sync(vcpu);
-}
-
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 0c2b82de8fa3..4b93528e6a89 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -214,7 +214,6 @@ int vgic_register_dist_iodev(struct kvm *kvm, gpa_t dist_base_address,
 void vgic_v2_init_lrs(void);
 void vgic_v2_load(struct kvm_vcpu *vcpu);
 void vgic_v2_put(struct kvm_vcpu *vcpu);
-void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu);
 
 void vgic_v2_save_state(struct kvm_vcpu *vcpu);
 void vgic_v2_restore_state(struct kvm_vcpu *vcpu);
@@ -253,7 +252,6 @@ bool vgic_v3_check_base(struct kvm *kvm);
 
 void vgic_v3_load(struct kvm_vcpu *vcpu);
 void vgic_v3_put(struct kvm_vcpu *vcpu);
-void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu);
 
 bool vgic_has_its(struct kvm *kvm);
 int kvm_vgic_register_its_device(void);
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 47035946648e..0c3cce31e0a2 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -388,7 +388,6 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 
 void kvm_vgic_load(struct kvm_vcpu *vcpu);
 void kvm_vgic_put(struct kvm_vcpu *vcpu);
-void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	((k)->arch.vgic.initialized)
-- 
2.44.0.478.gd926399ef9-goog

* [PATCH v1 09/44] KVM: arm64: Add is_pkvm_initialized() helper
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (7 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 08/44] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 10/44] KVM: arm64: Introduce predicates to check for protected state Fuad Tabba
                   ` (34 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Quentin Perret <qperret@google.com>

Add a helper for checking whether the pkvm static key is enabled, to
ease the introduction of pkvm hooks in other parts of the code.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/virt.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 261d6e9df2e1..ebf4a9f943ed 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -82,6 +82,12 @@ bool is_kvm_arm_initialised(void);
 
 DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
 
+static inline bool is_pkvm_initialized(void)
+{
+	return IS_ENABLED(CONFIG_KVM) &&
+	       static_branch_likely(&kvm_protected_mode_initialized);
+}
+
 /* Reports the availability of HYP mode */
 static inline bool is_hyp_mode_available(void)
 {
@@ -89,8 +95,7 @@ static inline bool is_hyp_mode_available(void)
 	 * If KVM protected mode is initialized, all CPUs must have been booted
 	 * in EL2. Avoid checking __boot_cpu_mode as CPUs now come up in EL1.
 	 */
-	if (IS_ENABLED(CONFIG_KVM) &&
-	    static_branch_likely(&kvm_protected_mode_initialized))
+	if (is_pkvm_initialized())
 		return true;
 
 	return (__boot_cpu_mode[0] == BOOT_CPU_MODE_EL2 &&
@@ -104,8 +109,7 @@ static inline bool is_hyp_mode_mismatched(void)
 	 * If KVM protected mode is initialized, all CPUs must have been booted
 	 * in EL2. Avoid checking __boot_cpu_mode as CPUs now come up in EL1.
 	 */
-	if (IS_ENABLED(CONFIG_KVM) &&
-	    static_branch_likely(&kvm_protected_mode_initialized))
+	if (is_pkvm_initialized())
 		return false;
 
 	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 10/44] KVM: arm64: Introduce predicates to check for protected state
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (8 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 09/44] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 11/44] KVM: arm64: Split up nvhe/fixed_config.h Fuad Tabba
                   ` (33 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

In order to determine whether or not a VM or (hyp) vCPU is protected,
introduce helper functions to query this state. For now, these will
always return 'false', as the underlying field is never configured.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h      |  6 ++----
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 13 +++++++++++++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 9e8a496fb284..7249e88ea13a 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -211,6 +211,7 @@ typedef unsigned int pkvm_handle_t;
 struct kvm_protected_vm {
 	pkvm_handle_t handle;
 	struct kvm_hyp_memcache teardown_mc;
+	bool enabled;
 };
 
 struct kvm_mpidr_data {
@@ -1247,10 +1248,7 @@ struct kvm *kvm_arch_alloc_vm(void);
 
 #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS_RANGE
 
-static inline bool kvm_vm_is_protected(struct kvm *kvm)
-{
-	return false;
-}
+#define kvm_vm_is_protected(kvm)	((kvm)->arch.pkvm.enabled)
 
 int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
 bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index 82b3d62538a6..c59e16d38819 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -53,6 +53,19 @@ pkvm_hyp_vcpu_to_hyp_vm(struct pkvm_hyp_vcpu *hyp_vcpu)
 	return container_of(hyp_vcpu->vcpu.kvm, struct pkvm_hyp_vm, kvm);
 }
 
+static inline bool vcpu_is_protected(struct kvm_vcpu *vcpu)
+{
+	if (!is_protected_kvm_enabled())
+		return false;
+
+	return vcpu->kvm->arch.pkvm.enabled;
+}
+
+static inline bool pkvm_hyp_vcpu_is_protected(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	return vcpu_is_protected(&hyp_vcpu->vcpu);
+}
+
 void pkvm_hyp_vm_table_init(void *tbl);
 
 int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva,
-- 
2.44.0.478.gd926399ef9-goog


* [PATCH v1 11/44] KVM: arm64: Split up nvhe/fixed_config.h
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (9 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 10/44] KVM: arm64: Introduce predicates to check for protected state Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:34 ` [PATCH v1 12/44] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
                   ` (32 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

In preparation for using some of the pKVM fixed configuration register
definitions to filter the available VM CAPs in the host, split the
nvhe/fixed_config.h header so that the definitions can be shared
with the host, while keeping the hypervisor function prototypes in
the nvhe/ namespace.
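
For illustration only, a standalone sketch (not kernel code; the
field positions and values below are hypothetical) of how an ALLOW
mask and a RESTRICT_UNSIGNED cap combine into the ID register view a
protected VM would get:

#include <stdint.h>
#include <stdio.h>

/* Cap one unsigned field at 'restrict_val', per RESTRICT_UNSIGNED. */
static uint64_t cap_field(uint64_t hw, uint64_t mask, uint64_t restrict_val)
{
	uint64_t hw_field = hw & mask;

	return hw_field < restrict_val ? hw_field : restrict_val;
}

int main(void)
{
	uint64_t hw = 0x0000000000000321ULL;	/* pretend sanitised ID reg */
	uint64_t allow = 0x00000000000000f0ULL;	/* one field fully allowed  */
	uint64_t restrict_mask = 0x0000000000000f00ULL;
	uint64_t restrict_val  = 0x0000000000000200ULL;	/* cap field at 2 */
	uint64_t pvm_view;

	/* Allowed fields pass through; restricted fields are capped;
	 * everything else reads as zero for the protected VM. */
	pvm_view = (hw & allow) | cap_field(hw, restrict_mask, restrict_val);
	printf("protected-VM view: 0x%llx\n", (unsigned long long)pvm_view);
	return 0;
}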

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_pkvm.h             | 205 ++++++++++++++++
 .../arm64/kvm/hyp/include/nvhe/fixed_config.h | 223 ------------------
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |   5 +
 arch/arm64/kvm/hyp/nvhe/pkvm.c                |   1 -
 arch/arm64/kvm/hyp/nvhe/setup.c               |   1 -
 arch/arm64/kvm/hyp/nvhe/switch.c              |   2 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c            |   2 +-
 7 files changed, 212 insertions(+), 227 deletions(-)
 delete mode 100644 arch/arm64/kvm/hyp/include/nvhe/fixed_config.h

diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index ad9cfb5c1ff4..5bf5644aa5db 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -2,6 +2,7 @@
 /*
  * Copyright (C) 2020 - Google LLC
  * Author: Quentin Perret <qperret@google.com>
+ * Author: Fuad Tabba <tabba@google.com>
  */
 #ifndef __ARM64_KVM_PKVM_H__
 #define __ARM64_KVM_PKVM_H__
@@ -10,6 +11,7 @@
 #include <linux/memblock.h>
 #include <linux/scatterlist.h>
 #include <asm/kvm_pgtable.h>
+#include <asm/sysreg.h>
 
 /* Maximum number of VMs that can co-exist under pKVM. */
 #define KVM_MAX_PVMS 255
@@ -20,6 +22,209 @@ int pkvm_init_host_vm(struct kvm *kvm);
 int pkvm_create_hyp_vm(struct kvm *kvm);
 void pkvm_destroy_hyp_vm(struct kvm *kvm);
 
+/*
+ * Definitions for features to be allowed or restricted for guest virtual
+ * machines, depending on the mode KVM is running in and on the type of guest
+ * that is running.
+ *
+ * The ALLOW masks represent a bitmask of feature fields that are allowed
+ * without any restrictions as long as they are supported by the system.
+ *
+ * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
+ * features that are restricted to support at most the specified feature.
+ *
+ * If a feature field is not present in either, than it is not supported.
+ *
+ * The approach taken for protected VMs is to allow features that are:
+ * - Needed by common Linux distributions (e.g., floating point)
+ * - Trivial to support, e.g., supporting the feature does not introduce or
+ * require tracking of additional state in KVM
+ * - Cannot be trapped or prevent the guest from using anyway
+ */
+
+/*
+ * Allow for protected VMs:
+ * - Floating-point and Advanced SIMD
+ * - Data Independent Timing
+ */
+#define PVM_ID_AA64PFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_FP) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AdvSIMD) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_DIT) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - AArch64 guests only (no support for AArch32 guests):
+ *	AArch32 adds complexity in trap handling, emulation, condition codes,
+ *	etc...
+ * - RAS (v1)
+ *	Supported by KVM
+ */
+#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL0), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL1), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL2), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL3), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_RAS), ID_AA64PFR0_EL1_RAS_IMP) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Branch Target Identification
+ * - Speculative Store Bypassing
+ */
+#define PVM_ID_AA64PFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_BT) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SSBS) \
+	)
+
+#define PVM_ID_AA64PFR2_ALLOW (0ULL)
+
+/*
+ * Allow for protected VMs:
+ * - Mixed-endian
+ * - Distinction between Secure and Non-secure Memory
+ * - Mixed-endian at EL0 only
+ * - Non-context synchronizing exception entry and exit
+ */
+#define PVM_ID_AA64MMFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_BIGEND) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_SNSMEM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_BIGENDEL0) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_EXS) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - 40-bit IPA
+ * - 16-bit ASID
+ */
+#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_PARANGE), ID_AA64MMFR0_EL1_PARANGE_40) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_ASIDBITS), ID_AA64MMFR0_EL1_ASIDBITS_16) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Hardware translation table updates to Access flag and Dirty state
+ * - Number of VMID bits from CPU
+ * - Hierarchical Permission Disables
+ * - Privileged Access Never
+ * - SError interrupt exceptions from speculative reads
+ * - Enhanced Translation Synchronization
+ * - Control for cache maintenance permission
+ */
+#define PVM_ID_AA64MMFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HAFDBS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_VMIDBits) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HPDS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_PAN) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_SpecSEI) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_ETS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_CMOW) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Common not Private translations
+ * - User Access Override
+ * - IESB bit in the SCTLR_ELx registers
+ * - Unaligned single-copy atomicity and atomic functions
+ * - ESR_ELx.EC value on an exception by read access to feature ID space
+ * - TTL field in address operations.
+ * - Break-before-make sequences when changing translation block size
+ * - E0PDx mechanism
+ */
+#define PVM_ID_AA64MMFR2_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_CnP) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_UAO) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_IESB) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_AT) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_IDS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_TTL) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_BBM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_E0PD) \
+	)
+
+#define PVM_ID_AA64MMFR3_ALLOW (0ULL)
+
+/*
+ * No support for Scalable Vectors for protected VMs:
+ *	Requires additional support from KVM, e.g., context-switching and
+ *	trapping at EL2
+ */
+#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
+
+/*
+ * No support for debug, including breakpoints, and watchpoints for protected
+ * VMs:
+ *	The Arm architecture mandates support for at least the Armv8 debug
+ *	architecture, which would include at least 2 hardware breakpoints and
+ *	watchpoints. Providing that support to protected guests adds
+ *	considerable state and complexity. Therefore, the reserved value of 0 is
+ *	used for debug-related fields.
+ */
+#define PVM_ID_AA64DFR0_ALLOW (0ULL)
+#define PVM_ID_AA64DFR1_ALLOW (0ULL)
+
+/*
+ * No support for implementation defined features.
+ */
+#define PVM_ID_AA64AFR0_ALLOW (0ULL)
+#define PVM_ID_AA64AFR1_ALLOW (0ULL)
+
+/*
+ * No restrictions on instructions implemented in AArch64.
+ */
+#define PVM_ID_AA64ISAR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_AES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA1) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA2) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_CRC32) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_ATOMIC) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_RDM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SM3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SM4) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_DP) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_FHM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_TS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_TLB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_RNDR) \
+	)
+
+/* Restrict pointer authentication to the basic version. */
+#define PVM_ID_AA64ISAR1_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA), ID_AA64ISAR1_EL1_APA_PAuth) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API), ID_AA64ISAR1_EL1_API_PAuth) \
+	)
+
+#define PVM_ID_AA64ISAR2_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3), ID_AA64ISAR2_EL1_APA3_PAuth) \
+	)
+
+#define PVM_ID_AA64ISAR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DPB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_JSCVT) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FCMA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_LRCPC) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FRINTTS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SPECRES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_BF16) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DGH) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_I8MM) \
+	)
+
+#define PVM_ID_AA64ISAR2_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_ATS1A)| \
+	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_MOPS) \
+	)
+
 extern struct memblock_region kvm_nvhe_sym(hyp_memory)[];
 extern unsigned int kvm_nvhe_sym(hyp_memblock_nr);
 
diff --git a/arch/arm64/kvm/hyp/include/nvhe/fixed_config.h b/arch/arm64/kvm/hyp/include/nvhe/fixed_config.h
deleted file mode 100644
index 51f043649146..000000000000
--- a/arch/arm64/kvm/hyp/include/nvhe/fixed_config.h
+++ /dev/null
@@ -1,223 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0-only */
-/*
- * Copyright (C) 2021 Google LLC
- * Author: Fuad Tabba <tabba@google.com>
- */
-
-#ifndef __ARM64_KVM_FIXED_CONFIG_H__
-#define __ARM64_KVM_FIXED_CONFIG_H__
-
-#include <asm/sysreg.h>
-
-/*
- * This file contains definitions for features to be allowed or restricted for
- * guest virtual machines, depending on the mode KVM is running in and on the
- * type of guest that is running.
- *
- * The ALLOW masks represent a bitmask of feature fields that are allowed
- * without any restrictions as long as they are supported by the system.
- *
- * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
- * features that are restricted to support at most the specified feature.
- *
- * If a feature field is not present in either, than it is not supported.
- *
- * The approach taken for protected VMs is to allow features that are:
- * - Needed by common Linux distributions (e.g., floating point)
- * - Trivial to support, e.g., supporting the feature does not introduce or
- * require tracking of additional state in KVM
- * - Cannot be trapped or prevent the guest from using anyway
- */
-
-/*
- * Allow for protected VMs:
- * - Floating-point and Advanced SIMD
- * - Data Independent Timing
- * - Spectre/Meltdown Mitigation
- */
-#define PVM_ID_AA64PFR0_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_FP) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AdvSIMD) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_DIT) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV2) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_CSV3) \
-	)
-
-/*
- * Restrict to the following *unsigned* features for protected VMs:
- * - AArch64 guests only (no support for AArch32 guests):
- *	AArch32 adds complexity in trap handling, emulation, condition codes,
- *	etc...
- * - RAS (v1)
- *	Supported by KVM
- */
-#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL0), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL1), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL2), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_EL3), ID_AA64PFR0_EL1_ELx_64BIT_ONLY) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_RAS), ID_AA64PFR0_EL1_RAS_IMP) \
-	)
-
-/*
- * Allow for protected VMs:
- * - Branch Target Identification
- * - Speculative Store Bypassing
- */
-#define PVM_ID_AA64PFR1_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_BT) | \
-	ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SSBS) \
-	)
-
-#define PVM_ID_AA64PFR2_ALLOW 0ULL
-
-/*
- * Allow for protected VMs:
- * - Mixed-endian
- * - Distinction between Secure and Non-secure Memory
- * - Mixed-endian at EL0 only
- * - Non-context synchronizing exception entry and exit
- */
-#define PVM_ID_AA64MMFR0_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_BIGEND) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_SNSMEM) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_BIGENDEL0) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_EXS) \
-	)
-
-/*
- * Restrict to the following *unsigned* features for protected VMs:
- * - 40-bit IPA
- * - 16-bit ASID
- */
-#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_PARANGE), ID_AA64MMFR0_EL1_PARANGE_40) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_EL1_ASIDBITS), ID_AA64MMFR0_EL1_ASIDBITS_16) \
-	)
-
-/*
- * Allow for protected VMs:
- * - Hardware translation table updates to Access flag and Dirty state
- * - Number of VMID bits from CPU
- * - Hierarchical Permission Disables
- * - Privileged Access Never
- * - SError interrupt exceptions from speculative reads
- * - Enhanced Translation Synchronization
- * - Control for cache maintenance permission
- */
-#define PVM_ID_AA64MMFR1_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HAFDBS) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_VMIDBits) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_HPDS) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_PAN) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_SpecSEI) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_ETS) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR1_EL1_CMOW) \
-	)
-
-/*
- * Allow for protected VMs:
- * - Common not Private translations
- * - User Access Override
- * - IESB bit in the SCTLR_ELx registers
- * - Unaligned single-copy atomicity and atomic functions
- * - ESR_ELx.EC value on an exception by read access to feature ID space
- * - TTL field in address operations.
- * - Break-before-make sequences when changing translation block size
- * - E0PDx mechanism
- */
-#define PVM_ID_AA64MMFR2_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_CnP) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_UAO) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_IESB) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_AT) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_IDS) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_TTL) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_BBM) | \
-	ARM64_FEATURE_MASK(ID_AA64MMFR2_EL1_E0PD) \
-	)
-
-#define PVM_ID_AA64MMFR3_ALLOW (0ULL)
-
-/*
- * No support for Scalable Vectors for protected VMs:
- *	Requires additional support from KVM, e.g., context-switching and
- *	trapping at EL2
- */
-#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
-
-/*
- * No support for debug, including breakpoints, and watchpoints for protected
- * VMs:
- *	The Arm architecture mandates support for at least the Armv8 debug
- *	architecture, which would include at least 2 hardware breakpoints and
- *	watchpoints. Providing that support to protected guests adds
- *	considerable state and complexity. Therefore, the reserved value of 0 is
- *	used for debug-related fields.
- */
-#define PVM_ID_AA64DFR0_ALLOW (0ULL)
-#define PVM_ID_AA64DFR1_ALLOW (0ULL)
-
-/*
- * No support for implementation defined features.
- */
-#define PVM_ID_AA64AFR0_ALLOW (0ULL)
-#define PVM_ID_AA64AFR1_ALLOW (0ULL)
-
-/*
- * No restrictions on instructions implemented in AArch64.
- */
-#define PVM_ID_AA64ISAR0_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_AES) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA1) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA2) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_CRC32) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_ATOMIC) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_RDM) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SHA3) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SM3) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_SM4) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_DP) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_FHM) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_TS) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_TLB) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR0_EL1_RNDR) \
-	)
-
-/* Restrict pointer authentication to the basic version. */
-#define PVM_ID_AA64ISAR1_RESTRICT_UNSIGNED (\
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA), ID_AA64ISAR1_EL1_APA_PAuth) | \
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API), ID_AA64ISAR1_EL1_API_PAuth) \
-	)
-
-#define PVM_ID_AA64ISAR2_RESTRICT_UNSIGNED (\
-	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_APA3), ID_AA64ISAR2_EL1_APA3_PAuth) \
-	)
-
-#define PVM_ID_AA64ISAR1_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DPB) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_JSCVT) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FCMA) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_LRCPC) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_FRINTTS) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SB) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_SPECRES) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_BF16) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_DGH) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_I8MM) \
-	)
-
-#define PVM_ID_AA64ISAR2_ALLOW (\
-	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_ATS1A)| \
-	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_GPA3) | \
-	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_MOPS) \
-	)
-
-u64 pvm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id);
-bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
-bool kvm_handle_pvm_restricted(struct kvm_vcpu *vcpu, u64 *exit_code);
-int kvm_check_pvm_sysreg_table(void);
-
-#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index c59e16d38819..0fc1cf5bae8c 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -78,4 +78,9 @@ struct pkvm_hyp_vcpu *pkvm_load_hyp_vcpu(pkvm_handle_t handle,
 					 unsigned int vcpu_idx);
 void pkvm_put_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu);
 
+u64 pvm_read_id_reg(const struct kvm_vcpu *vcpu, u32 id);
+bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
+bool kvm_handle_pvm_restricted(struct kvm_vcpu *vcpu, u64 *exit_code);
+int kvm_check_pvm_sysreg_table(void);
+
 #endif /* __ARM64_KVM_NVHE_PKVM_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 26dd9a20ad6e..7b5d245a371e 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -6,7 +6,6 @@
 
 #include <linux/kvm_host.h>
 #include <linux/mm.h>
-#include <nvhe/fixed_config.h>
 #include <nvhe/mem_protect.h>
 #include <nvhe/memory.h>
 #include <nvhe/pkvm.h>
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index bc58d1b515af..d41163d1fed3 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -12,7 +12,6 @@
 
 #include <nvhe/early_alloc.h>
 #include <nvhe/ffa.h>
-#include <nvhe/fixed_config.h>
 #include <nvhe/gfp.h>
 #include <nvhe/memory.h>
 #include <nvhe/mem_protect.h>
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 2a0b0d6da7c7..302b6cf8f92c 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -26,8 +26,8 @@
 #include <asm/debug-monitors.h>
 #include <asm/processor.h>
 
-#include <nvhe/fixed_config.h>
 #include <nvhe/mem_protect.h>
+#include <nvhe/pkvm.h>
 
 /* Non-VHE specific context */
 DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index edd969a1f36b..18c1ca0a66b9 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -11,7 +11,7 @@
 
 #include <hyp/adjust_pc.h>
 
-#include <nvhe/fixed_config.h>
+#include <nvhe/pkvm.h>
 
 #include "../../sys_regs.h"
 
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 12/44] KVM: arm64: Move pstate reset value definitions to kvm_arm.h
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (10 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 11/44] KVM: arm64: Split up nvhe/fixed_config.h Fuad Tabba
@ 2024-03-27 17:34 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
                   ` (31 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:34 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move the macro definitions of the pstate reset values to a shared
header so that they can be used by hyp in future patches.

No functional change intended.
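
As an aside, a minimal sketch of how a hyp-side reset path could use
the shared definition once it is visible to hyp; the function below is
purely illustrative and not part of this series:

  #include <asm/kvm_arm.h>
  #include <asm/kvm_emulate.h>

  /* Illustrative only: reset a vCPU's PSTATE from hyp. */
  static void pkvm_sketch_reset_pstate(struct kvm_vcpu *vcpu)
  {
  	/* EL1h with all of DAIF masked, as on a warm reset. */
  	*vcpu_cpsr(vcpu) = VCPU_RESET_PSTATE_EL1;
  }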

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_arm.h | 12 ++++++++++++
 arch/arm64/kvm/reset.c           | 12 ------------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index e01bb5ca13b7..12a4b226690a 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -432,4 +432,16 @@
 	{ PSR_AA32_MODE_UND,	"32-bit UND" },	\
 	{ PSR_AA32_MODE_SYS,	"32-bit SYS" }
 
+/*
+ * ARMv8 Reset Values
+ */
+#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
+				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
+
 #endif /* __ARM64_KVM_ARM_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 68d1d05672bd..29ae68f60bef 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -33,18 +33,6 @@
 /* Maximum phys_shift supported for any VM on this host */
 static u32 __ro_after_init kvm_ipa_limit;
 
-/*
- * ARMv8 Reset Values
- */
-#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
-				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
-
 unsigned int __ro_after_init kvm_sve_max_vl;
 
 int __init kvm_arm_init_sve(void)
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (11 preceding siblings ...)
  2024-03-27 17:34 ` [PATCH v1 12/44] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-28 18:53   ` Mark Brown
  2024-03-27 17:35 ` [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
                   ` (30 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Expand the comment clarifying why the host value representing the SVE
vector length, which is restored to ZCR_EL1 on guest exit, isn't the
same as it was on guest entry.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/fpsimd.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 826307e19e3a..f297e89e4810 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -200,7 +200,18 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 		if (vcpu_has_sve(vcpu)) {
 			__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 
-			/* Restore the VL that was saved when bound to the CPU */
+			/*
+			 * Restore the VL that was saved when bound to the CPU,
+			 * which is the maximum VL for the guest. Because
+			 * the layout of the data when saving the sve state
+			 * depends on the VL, we need to use a consistent VL.
+			 * Note that this means that at guest exit ZCR_EL1 is
+			 * not necessarily the same as on guest entry.
+			 *
+			 * Flushing the cpu state sets the TIF_FOREIGN_FPSTATE
+			 * bit for the context, which lets the kernel restore
+			 * the sve state, including ZCR_EL1 later.
+			 */
 			if (!has_vhe())
 				sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
 						       SYS_ZCR_EL1);
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (12 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-28 18:57   ` Mark Brown
  2024-03-27 17:35 ` [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore Fuad Tabba
                   ` (29 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

The main factor in determining the SVE state size is the vector
length, and future patches will need to calculate that size without
necessarily having a vcpu as a reference.

No functional change intended.
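
For illustration, a hyp-side caller could then size an SVE buffer from
a raw vector length rather than from a vcpu; this is a sketch under
that assumption, not code from this series:

  #include <asm/kvm_host.h>

  /* Illustrative only: size an SVE state buffer for a vector length in bytes. */
  static size_t sketch_sve_state_size(unsigned int sve_max_vl)
  {
  	/* Same computation as vcpu_sve_state_size(), minus the vcpu. */
  	return _vcpu_sve_state_size(sve_max_vl);
  }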

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 7249e88ea13a..3d12fc2aeb9e 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -843,22 +843,24 @@ struct kvm_vcpu_arch {
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
 			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
 
-#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
-
-#define vcpu_sve_state_size(vcpu) ({					\
+#define _vcpu_sve_state_size(sve_max_vl) ({				\
 	size_t __size_ret;						\
-	unsigned int __vcpu_vq;						\
+	unsigned int __vq;						\
 									\
-	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
+	if (WARN_ON(!sve_vl_valid(sve_max_vl))) {			\
 		__size_ret = 0;						\
 	} else {							\
-		__vcpu_vq = vcpu_sve_max_vq(vcpu);			\
-		__size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq);		\
+		__vq = sve_vq_from_vl(sve_max_vl);			\
+		__size_ret = SVE_SIG_REGS_SIZE(__vq);			\
 	}								\
 									\
 	__size_ret;							\
 })
 
+#define vcpu_sve_max_vq(vcpu) sve_vq_from_vl((vcpu)->arch.sve_max_vl)
+
+#define vcpu_sve_state_size(vcpu) _vcpu_sve_state_size((vcpu)->arch.sve_max_vl)
+
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
 				 KVM_GUESTDBG_USE_HW | \
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (13 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-28 19:17   ` Mark Brown
  2024-03-27 17:35 ` [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
                   ` (28 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

On restoring guest SVE state, use the guest's current active
vector length. This reduces the amount of state that needs to be
restored when the maximum vector length isn't in use. Moreover, it
fixes a bug where the ZCR_EL2 value wasn't being set when restoring
the guest state, potentially corrupting it.

Fixes: 52029198c1ce ("KVM: arm64: Rework SVE host-save/guest-restore")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 1a6dfd035531..570bbdbe55ca 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -315,10 +315,14 @@ static bool kvm_hyp_handle_mops(struct kvm_vcpu *vcpu, u64 *exit_code)
 
 static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 {
-	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
+	u64 zcr_el1 = __vcpu_sys_reg(vcpu, ZCR_EL1);
+	u64 zcr_el2 = min(zcr_el1, vcpu_sve_max_vq(vcpu) - 1ULL);
+
+	write_sysreg_el1(zcr_el1, SYS_ZCR);
+	sve_cond_update_zcr_vq(zcr_el2, SYS_ZCR_EL2);
 	__sve_restore_state(vcpu_sve_pffr(vcpu),
 			    &vcpu->arch.ctxt.fp_regs.fpsr);
-	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
+	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1ULL, SYS_ZCR_EL2);
 }
 
 /*
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (14 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-28 19:20   ` Mark Brown
  2024-03-27 17:35 ` [PATCH v1 17/44] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
                   ` (27 subsequent siblings)
  43 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

pKVM maintains its own state for tracking the host fpsimd state.
Therefore, there is no need to map and share the host's view with it.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h |  2 --
 arch/arm64/kvm/fpsimd.c           | 31 ++++---------------------------
 arch/arm64/kvm/reset.c            |  1 -
 3 files changed, 4 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3d12fc2aeb9e..cdbbfa3246c1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -642,7 +642,6 @@ struct kvm_vcpu_arch {
 	struct kvm_guest_debug_arch external_debug_state;
 
 	struct user_fpsimd_state *host_fpsimd_state;	/* hyp VA */
-	struct task_struct *parent_task;
 
 	struct {
 		/* {Break,watch}point registers */
@@ -1214,7 +1213,6 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
-void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
 
 static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 {
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index f297e89e4810..e3d9ec4ab9d0 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -14,19 +14,6 @@
 #include <asm/kvm_mmu.h>
 #include <asm/sysreg.h>
 
-void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
-{
-	struct task_struct *p = vcpu->arch.parent_task;
-	struct user_fpsimd_state *fpsimd;
-
-	if (!is_protected_kvm_enabled() || !p)
-		return;
-
-	fpsimd = &p->thread.uw.fpsimd_state;
-	kvm_unshare_hyp(fpsimd, fpsimd + 1);
-	put_task_struct(p);
-}
-
 /*
  * Called on entry to KVM_RUN unless this vcpu previously ran at least
  * once and the most recent prior KVM_RUN for this vcpu was called from
@@ -38,11 +25,12 @@ void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
  */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 {
-	int ret;
-
 	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
+	int ret;
 
-	kvm_vcpu_unshare_task_fp(vcpu);
+	/* pKVM has its own tracking of the host fpsimd state. */
+	if (is_protected_kvm_enabled())
+		return 0;
 
 	/* Make sure the host task fpsimd state is visible to hyp: */
 	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
@@ -51,17 +39,6 @@ int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.host_fpsimd_state = kern_hyp_va(fpsimd);
 
-	/*
-	 * We need to keep current's task_struct pinned until its data has been
-	 * unshared with the hypervisor to make sure it is not re-used by the
-	 * kernel and donated to someone else while already shared -- see
-	 * kvm_vcpu_unshare_task_fp() for the matching put_task_struct().
-	 */
-	if (is_protected_kvm_enabled()) {
-		get_task_struct(current);
-		vcpu->arch.parent_task = current;
-	}
-
 	return 0;
 }
 
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 29ae68f60bef..3d8064bf67c8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -139,7 +139,6 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	void *sve_state = vcpu->arch.sve_state;
 
-	kvm_vcpu_unshare_task_fp(vcpu);
 	kvm_unshare_hyp(vcpu, vcpu + 1);
 	if (sve_state)
 		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 17/44] KVM: arm64: Move some kvm_psci functions to a shared header
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (15 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 18/44] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
                   ` (26 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move some PSCI functions and macros to a shared header to be used
by hyp in protected mode.

No functional change intended.
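
As a worked example of the affinity helper, assuming the usual arm64
definitions (MPIDR_LEVEL_BITS == 8, MPIDR_HWID_BITMASK == 0xff00ffffff);
the check function is illustrative and not part of this series:

  /* Illustrative self-check of psci_affinity_mask(). */
  static void sketch_check_affinity_masks(void)
  {
  	/* Aff0 is dropped when targeting affinity level 1. */
  	BUG_ON(psci_affinity_mask(1) != 0xff00ffff00UL);
  	/* Aff0 and Aff1 are dropped when targeting affinity level 2. */
  	BUG_ON(psci_affinity_mask(2) != 0xff00ff0000UL);
  	/* Levels above 3 are invalid and yield an empty mask. */
  	BUG_ON(psci_affinity_mask(4) != 0);
  }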

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/psci.c  | 28 ----------------------------
 include/kvm/arm_psci.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 1f69b667332b..43458949d955 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -21,16 +21,6 @@
  * as described in ARM document number ARM DEN 0022A.
  */
 
-#define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
-
-static unsigned long psci_affinity_mask(unsigned long affinity_level)
-{
-	if (affinity_level <= 3)
-		return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
-
-	return 0;
-}
-
 static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -51,12 +41,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 	return PSCI_RET_SUCCESS;
 }
 
-static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
-					   unsigned long affinity)
-{
-	return !(affinity & ~MPIDR_HWID_BITMASK);
-}
-
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -214,18 +198,6 @@ static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
 	run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
-static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
-{
-	int i;
-
-	/*
-	 * Zero the input registers' upper 32 bits. They will be fully
-	 * zeroed on exit, so we're fine changing them in place.
-	 */
-	for (i = 1; i < 4; i++)
-		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
-}
-
 static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
 {
 	/*
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index e8fb624013d1..c86f228efae1 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -36,6 +36,35 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
 	return KVM_ARM_PSCI_0_1;
 }
 
+/* Narrow the PSCI register arguments (r1 to r3) to 32 bits. */
+static inline void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	/*
+	 * Zero the input registers' upper 32 bits. They will be fully
+	 * zeroed on exit, so we're fine changing them in place.
+	 */
+	for (i = 1; i < 4; i++)
+		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
+}
+
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+					   unsigned long affinity)
+{
+	return !(affinity & ~MPIDR_HWID_BITMASK);
+}
+
+
+#define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
+
+static inline unsigned long psci_affinity_mask(unsigned long affinity_level)
+{
+	if (affinity_level <= 3)
+		return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
+
+	return 0;
+}
 
 int kvm_psci_call(struct kvm_vcpu *vcpu);
 
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 18/44] KVM: arm64: Refactor reset_mpidr() to extract its computation
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (16 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 17/44] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 19/44] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
                   ` (25 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move the computation of the MPIDR value to its own function in a
shared header, as the computation will be used by hyp in protected
mode.

No functional change intended.
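
To illustrate the encoding that calculate_mpidr() implements, a worked
example for one vcpu_id (the values follow directly from the shifts in
the helper shown in the diff below):

  /*
   * Example: vcpu_id == 20 (0x14)
   *
   *   Aff0 = 20 & 0xf          = 4
   *   Aff1 = (20 >> 4) & 0xff  = 1
   *   Aff2 = (20 >> 12) & 0xff = 0
   *
   * With bit 31 always set, calculate_mpidr() returns 0x80000104.
   */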

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/sys_regs.c | 14 +-------------
 arch/arm64/kvm/sys_regs.h | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index c9f4f387155f..0ad788686bbf 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -816,21 +816,9 @@ static u64 reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 
 static u64 reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
-	u64 mpidr;
+	u64 mpidr = calculate_mpidr(vcpu);
 
-	/*
-	 * Map the vcpu_id into the first three affinity level fields of
-	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
-	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
-	 * of the GICv3 to be able to address each CPU directly when
-	 * sending IPIs.
-	 */
-	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
-	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
-	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
-	mpidr |= (1ULL << 31);
 	vcpu_write_sys_reg(vcpu, mpidr, MPIDR_EL1);
-
 	return mpidr;
 }
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 997eea21ba2a..1dfd2380a1ae 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -222,6 +222,25 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
 	return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
 }
 
+static inline u64 calculate_mpidr(const struct kvm_vcpu *vcpu)
+{
+	u64 mpidr;
+
+	/*
+	 * Map the vcpu_id into the first three affinity level fields of
+	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
+	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
+	 * of the GICv3 to be able to address each CPU directly when
+	 * sending IPIs.
+	 */
+	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
+	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+	mpidr |= (1ULL << 31);
+
+	return mpidr;
+}
+
 const struct sys_reg_desc *get_reg_by_id(u64 id,
 					 const struct sys_reg_desc table[],
 					 unsigned int num);
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 19/44] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (17 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 18/44] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 20/44] KVM: arm64: Refactor enter_exception64() Fuad Tabba
                   ` (24 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move kvm_vcpu_enable_ptrauth() to a shared header to be used by
hyp in protected mode.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_emulate.h | 5 +++++
 arch/arm64/kvm/reset.c               | 7 +------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 975af30af31f..dcb2aaf10d8c 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -611,4 +611,9 @@ static __always_inline void kvm_reset_cptr_el2(struct kvm_vcpu *vcpu)
 
 	kvm_write_cptr_el2(val);
 }
+
+static inline void kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
+{
+	vcpu_set_flag(vcpu, GUEST_HAS_PTRAUTH);
+}
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3d8064bf67c8..c955419582a8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -105,7 +105,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
 		kfree(buf);
 		return ret;
 	}
-	
+
 	vcpu->arch.sve_state = buf;
 	vcpu_set_flag(vcpu, VCPU_SVE_FINALIZED);
 	return 0;
@@ -152,11 +152,6 @@ static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
 		memset(vcpu->arch.sve_state, 0, vcpu_sve_state_size(vcpu));
 }
 
-static void kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
-{
-	vcpu_set_flag(vcpu, GUEST_HAS_PTRAUTH);
-}
-
 /**
  * kvm_reset_vcpu - sets core registers and sys_regs to reset value
  * @vcpu: The VCPU pointer
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 20/44] KVM: arm64: Refactor enter_exception64()
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (18 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 19/44] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 21/44] KVM: arm64: Add PC_UPDATE_REQ flags covering all PC updates Fuad Tabba
                   ` (23 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Quentin Perret <qperret@google.com>

In order to simplify the injection of exceptions into the host in pKVM
context, factor the code that calculates the exception offset from
VBAR_EL1 and the new cpsr out of enter_exception64().

No functional change intended.
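
For illustration, a pKVM host-injection path could combine the two new
helpers roughly as below. This is a hedged sketch: the context layout
and the ctxt_sys_reg() accessor are assumptions here, and the function
is not part of this series:

  /* Illustrative sketch: inject a synchronous exception into an EL1 context. */
  static void sketch_inject_el1_sync(struct kvm_cpu_context *ctxt)
  {
  	u64 psr = ctxt->regs.pstate;
  	u64 vbar = ctxt_sys_reg(ctxt, VBAR_EL1);
  	u64 sctlr = ctxt_sys_reg(ctxt, SCTLR_EL1);

  	/* Stash the return state... */
  	ctxt_sys_reg(ctxt, ELR_EL1) = ctxt->regs.pc;
  	ctxt_sys_reg(ctxt, SPSR_EL1) = psr;

  	/* ...then compute the new PC/PSTATE with the factored-out helpers. */
  	ctxt->regs.pc = vbar + get_except64_offset(psr, PSR_MODE_EL1h,
  						   except_type_sync);
  	ctxt->regs.pstate = get_except64_cpsr(psr, false /* no MTE */,
  					      sctlr, PSR_MODE_EL1h);
  }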

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_emulate.h |   5 ++
 arch/arm64/kvm/hyp/exception.c       | 100 ++++++++++++++++-----------
 2 files changed, 63 insertions(+), 42 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index dcb2aaf10d8c..4f0bc2df46f6 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -49,6 +49,11 @@ void kvm_inject_dabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
 void kvm_inject_size_fault(struct kvm_vcpu *vcpu);
 
+unsigned long get_except64_offset(unsigned long psr, unsigned long target_mode,
+				  enum exception_type type);
+unsigned long get_except64_cpsr(unsigned long old, bool has_mte,
+				unsigned long sctlr, unsigned long mode);
+
 void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
 
 void kvm_emulate_nested_eret(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/exception.c b/arch/arm64/kvm/hyp/exception.c
index 424a5107cddb..da69a5685c47 100644
--- a/arch/arm64/kvm/hyp/exception.c
+++ b/arch/arm64/kvm/hyp/exception.c
@@ -71,31 +71,12 @@ static void __vcpu_write_spsr_und(struct kvm_vcpu *vcpu, u64 val)
 		vcpu->arch.ctxt.spsr_und = val;
 }
 
-/*
- * This performs the exception entry at a given EL (@target_mode), stashing PC
- * and PSTATE into ELR and SPSR respectively, and compute the new PC/PSTATE.
- * The EL passed to this function *must* be a non-secure, privileged mode with
- * bit 0 being set (PSTATE.SP == 1).
- *
- * When an exception is taken, most PSTATE fields are left unchanged in the
- * handler. However, some are explicitly overridden (e.g. M[4:0]). Luckily all
- * of the inherited bits have the same position in the AArch64/AArch32 SPSR_ELx
- * layouts, so we don't need to shuffle these for exceptions from AArch32 EL0.
- *
- * For the SPSR_ELx layout for AArch64, see ARM DDI 0487E.a page C5-429.
- * For the SPSR_ELx layout for AArch32, see ARM DDI 0487E.a page C5-426.
- *
- * Here we manipulate the fields in order of the AArch64 SPSR_ELx layout, from
- * MSB to LSB.
- */
-static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
-			      enum exception_type type)
+unsigned long get_except64_offset(unsigned long psr, unsigned long target_mode,
+				  enum exception_type type)
 {
-	unsigned long sctlr, vbar, old, new, mode;
+	u64 mode = psr & (PSR_MODE_MASK | PSR_MODE32_BIT);
 	u64 exc_offset;
 
-	mode = *vcpu_cpsr(vcpu) & (PSR_MODE_MASK | PSR_MODE32_BIT);
-
 	if      (mode == target_mode)
 		exc_offset = CURRENT_EL_SP_ELx_VECTOR;
 	else if ((mode | PSR_MODE_THREAD_BIT) == target_mode)
@@ -105,33 +86,32 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
 	else
 		exc_offset = LOWER_EL_AArch32_VECTOR;
 
-	switch (target_mode) {
-	case PSR_MODE_EL1h:
-		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
-		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
-		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
-		break;
-	case PSR_MODE_EL2h:
-		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
-		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
-		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
-		break;
-	default:
-		/* Don't do that */
-		BUG();
-	}
-
-	*vcpu_pc(vcpu) = vbar + exc_offset + type;
+	return exc_offset + type;
+}
 
-	old = *vcpu_cpsr(vcpu);
-	new = 0;
+/*
+ * When an exception is taken, most PSTATE fields are left unchanged in the
+ * handler. However, some are explicitly overridden (e.g. M[4:0]). Luckily all
+ * of the inherited bits have the same position in the AArch64/AArch32 SPSR_ELx
+ * layouts, so we don't need to shuffle these for exceptions from AArch32 EL0.
+ *
+ * For the SPSR_ELx layout for AArch64, see ARM DDI 0487E.a page C5-429.
+ * For the SPSR_ELx layout for AArch32, see ARM DDI 0487E.a page C5-426.
+ *
+ * Here we manipulate the fields in order of the AArch64 SPSR_ELx layout, from
+ * MSB to LSB.
+ */
+unsigned long get_except64_cpsr(unsigned long old, bool has_mte,
+				unsigned long sctlr, unsigned long target_mode)
+{
+	u64 new = 0;
 
 	new |= (old & PSR_N_BIT);
 	new |= (old & PSR_Z_BIT);
 	new |= (old & PSR_C_BIT);
 	new |= (old & PSR_V_BIT);
 
-	if (kvm_has_mte(kern_hyp_va(vcpu->kvm)))
+	if (has_mte)
 		new |= PSR_TCO_BIT;
 
 	new |= (old & PSR_DIT_BIT);
@@ -167,6 +147,42 @@ static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
 
 	new |= target_mode;
 
+	return new;
+}
+
+/*
+ * This performs the exception entry at a given EL (@target_mode), stashing PC
+ * and PSTATE into ELR and SPSR respectively, and compute the new PC/PSTATE.
+ * The EL passed to this function *must* be a non-secure, privileged mode with
+ * bit 0 being set (PSTATE.SP == 1).
+ */
+static void enter_exception64(struct kvm_vcpu *vcpu, unsigned long target_mode,
+			      enum exception_type type)
+{
+	u64 offset = get_except64_offset(*vcpu_cpsr(vcpu), target_mode, type);
+	unsigned long sctlr, vbar, old, new;
+
+	switch (target_mode) {
+	case PSR_MODE_EL1h:
+		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL1);
+		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL1);
+		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL1);
+		break;
+	case PSR_MODE_EL2h:
+		vbar = __vcpu_read_sys_reg(vcpu, VBAR_EL2);
+		sctlr = __vcpu_read_sys_reg(vcpu, SCTLR_EL2);
+		__vcpu_write_sys_reg(vcpu, *vcpu_pc(vcpu), ELR_EL2);
+		break;
+	default:
+		/* Don't do that */
+		BUG();
+	}
+
+	*vcpu_pc(vcpu) = vbar + offset;
+
+	old = *vcpu_cpsr(vcpu);
+	new = get_except64_cpsr(old, kvm_has_mte(kern_hyp_va(vcpu->kvm)), sctlr,
+				target_mode);
 	*vcpu_cpsr(vcpu) = new;
 	__vcpu_write_spsr(vcpu, target_mode, old);
 }
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 21/44] KVM: arm64: Add PC_UPDATE_REQ flags covering all PC updates
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (19 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 20/44] KVM: arm64: Refactor enter_exception64() Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 22/44] KVM: arm64: Add vcpu flag copy primitive Fuad Tabba
                   ` (22 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

In order to deal with PC updates (such as INCREMENT_PC and the
collection of flags that come with PENDING_EXCEPTION), add a single
mask that covers them all.

This will be used to manipulate these flags as a single entity.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index cdbbfa3246c1..3c0fefb1dd73 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -786,6 +786,8 @@ struct kvm_vcpu_arch {
 #define INCREMENT_PC		__vcpu_single_flag(iflags, BIT(1))
 /* Target EL/MODE (not a single flag, but let's abuse the macro) */
 #define EXCEPT_MASK		__vcpu_single_flag(iflags, GENMASK(3, 1))
+/* Cover both PENDING_EXCEPTION and EXCEPT_MASK for global operations */
+#define PC_UPDATE_REQ		__vcpu_single_flag(iflags, GENMASK(3, 0))
 
 /* Helpers to encode exceptions with minimum fuss */
 #define __EXCEPT_MASK_VAL	unpack_vcpu_flag(EXCEPT_MASK)
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 22/44] KVM: arm64: Add vcpu flag copy primitive
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (20 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 21/44] KVM: arm64: Add PC_UPDATE_REQ flags covering all PC updates Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 23/44] KVM: arm64: Introduce gfn_to_memslot_prot() Fuad Tabba
                   ` (21 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

Contrary to vanilla KVM, pKVM not only deals with flags in a vcpu,
but also synchronises them across host and hypervisor views of the same
vcpu.

Most of the time, this is about copying flags from one vcpu structure
to another, so let's offer a primitive that does this.
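
As a usage sketch (actual callers arrive in later patches of the pKVM
series), combined with the PC_UPDATE_REQ mask from the previous patch
this lets the hypervisor's pending PC updates be reflected into the
host's view of a vcpu in one go; the wrapper below is illustrative:

  /* Illustrative only: copy pending PC-update flags from the hyp vcpu view. */
  static void sketch_sync_pc_update_flags(struct kvm_vcpu *host_vcpu,
  					 struct kvm_vcpu *hyp_vcpu)
  {
  	vcpu_copy_flag(host_vcpu, hyp_vcpu, PC_UPDATE_REQ);
  }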

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 3c0fefb1dd73..da5fc300d691 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -764,9 +764,25 @@ struct kvm_vcpu_arch {
 		__vcpu_flags_preempt_enable();			\
 	} while (0)
 
+#define __vcpu_copy_flag(vt, vs, flagset, f, m)			\
+	do {							\
+		typeof(vs->arch.flagset) tmp, val;		\
+								\
+		__build_check_flag(vs, flagset, f, m);		\
+								\
+		val = READ_ONCE(vs->arch.flagset);		\
+		val &= (m);					\
+		tmp = READ_ONCE(vt->arch.flagset);		\
+		tmp &= ~(m);					\
+		tmp |= val;					\
+		WRITE_ONCE(vt->arch.flagset, tmp);		\
+	} while (0)
+
+
 #define vcpu_get_flag(v, ...)	__vcpu_get_flag((v), __VA_ARGS__)
 #define vcpu_set_flag(v, ...)	__vcpu_set_flag((v), __VA_ARGS__)
 #define vcpu_clear_flag(v, ...)	__vcpu_clear_flag((v), __VA_ARGS__)
+#define vcpu_copy_flag(vt, vs, ...) __vcpu_copy_flag((vt), (vs), __VA_ARGS__)
 
 /* SVE exposed to guest */
 #define GUEST_HAS_SVE		__vcpu_single_flag(cflags, BIT(0))
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 23/44] KVM: arm64: Introduce gfn_to_memslot_prot()
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (21 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 22/44] KVM: arm64: Add vcpu flag copy primitive Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 24/44] KVM: arm64: Do not use the hva in kvm_handle_guest_abort() Fuad Tabba
                   ` (20 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Introduce gfn_to_memslot_prot(), which returns the memslot and whether
it's writable, without requiring a userspace address at the host.

The userspace address isn't needed to get this information. Future
patches, where the userspace address might not be known, will need
access to the memslot and whether it's writable.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 22 ++++++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 48f31dcd318a..94f0e8e00d0c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1292,6 +1292,7 @@ int kvm_gfn_to_hva_cache_init(struct kvm *kvm, struct gfn_to_hva_cache *ghc,
 
 int kvm_clear_guest(struct kvm *kvm, gpa_t gpa, unsigned long len);
 struct kvm_memory_slot *gfn_to_memslot(struct kvm *kvm, gfn_t gfn);
+struct kvm_memory_slot *gfn_to_memslot_prot(struct kvm *kvm, gfn_t gfn, bool *writable);
 bool kvm_is_visible_gfn(struct kvm *kvm, gfn_t gfn);
 bool kvm_vcpu_is_visible_gfn(struct kvm_vcpu *vcpu, gfn_t gfn);
 unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index fb49c2a60200..8757c8f7808b 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2709,6 +2709,28 @@ static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
 	return slot->flags & KVM_MEM_READONLY;
 }
 
+/*
+ * Return the memslot containing @gfn and its R/W attribute if the slot is
+ * valid, or NULL if it is not valid.
+ *
+ * @kvm: the VM whose memslots are searched
+ * @gfn: the gfn to be translated
+ * @writable: used to return the read/write attribute of the slot if the slot
+ * is valid and @writable is not NULL
+ */
+struct kvm_memory_slot *gfn_to_memslot_prot(struct kvm *kvm, gfn_t gfn, bool *writable)
+{
+	struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
+
+	if (!slot || slot->flags & KVM_MEMSLOT_INVALID)
+		return NULL;
+
+	if (writable)
+		*writable = !memslot_is_readonly(slot);
+
+	return slot;
+}
+
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 24/44] KVM: arm64: Do not use the hva in kvm_handle_guest_abort()
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (22 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 23/44] KVM: arm64: Introduce gfn_to_memslot_prot() Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 25/44] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
                   ` (19 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

The hva isn't needed by kvm_handle_guest_abort(), but is used as
a proxy for determining whether there's an error or a write
fault. Use the newly introduced gfn_to_memslot_prot() to
determine errors or write faults.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 3afc42d8833e..b35a20901794 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1375,7 +1375,7 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 }
 
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
-			  struct kvm_memory_slot *memslot, unsigned long hva,
+			  struct kvm_memory_slot *memslot,
 			  bool fault_is_perm)
 {
 	int ret = 0;
@@ -1387,12 +1387,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
 	struct vm_area_struct *vma;
 	short vma_shift;
-	gfn_t gfn;
+	gfn_t gfn = fault_ipa >> PAGE_SHIFT;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
 	long vma_pagesize, fault_granule;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
+	unsigned long hva = gfn_to_hva_memslot_prot(memslot, gfn, NULL);
 
 	if (fault_is_perm)
 		fault_granule = kvm_vcpu_trap_get_perm_fault_granule(vcpu);
@@ -1469,7 +1470,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	if (vma_pagesize == PMD_SIZE || vma_pagesize == PUD_SIZE)
 		fault_ipa &= ~(vma_pagesize - 1);
 
-	gfn = fault_ipa >> PAGE_SHIFT;
 	mte_allowed = kvm_vma_mte_allowed(vma);
 
 	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
@@ -1629,7 +1629,6 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	unsigned long esr;
 	phys_addr_t fault_ipa;
 	struct kvm_memory_slot *memslot;
-	unsigned long hva;
 	bool is_iabt, write_fault, writable;
 	gfn_t gfn;
 	int ret, idx;
@@ -1687,10 +1686,9 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	idx = srcu_read_lock(&vcpu->kvm->srcu);
 
 	gfn = fault_ipa >> PAGE_SHIFT;
-	memslot = gfn_to_memslot(vcpu->kvm, gfn);
-	hva = gfn_to_hva_memslot_prot(memslot, gfn, &writable);
+	memslot = gfn_to_memslot_prot(vcpu->kvm, gfn, &writable);
 	write_fault = kvm_is_write_fault(vcpu);
-	if (kvm_is_error_hva(hva) || (write_fault && !writable)) {
+	if (!memslot || (write_fault && !writable)) {
 		/*
 		 * The guest has put either its instructions or its page-tables
 		 * somewhere it shouldn't have. Userspace won't be able to do
@@ -1718,7 +1716,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		 * So let's assume that the guest is just being
 		 * cautious, and skip the instruction.
 		 */
-		if (kvm_is_error_hva(hva) && kvm_vcpu_dabt_is_cm(vcpu)) {
+		if (!memslot && kvm_vcpu_dabt_is_cm(vcpu)) {
 			kvm_incr_pc(vcpu);
 			ret = 1;
 			goto out_unlock;
@@ -1744,8 +1742,7 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
-	ret = user_mem_abort(vcpu, fault_ipa, memslot, hva,
-			     esr_fsc_is_permission_fault(esr));
+	ret = user_mem_abort(vcpu, fault_ipa, memslot, esr_fsc_is_permission_fault(esr));
 	if (ret == 0)
 		ret = 1;
 out:
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 25/44] KVM: arm64: Introduce hyp_rwlock_t
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (23 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 24/44] KVM: arm64: Do not use the hva in kvm_handle_guest_abort() Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 26/44] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
                   ` (18 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

Introduce a simple counter-based rwlock for EL2 which can reduce locking
contention on read-mostly data structures when compared to a spinlock.
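
A usage sketch, with a hypothetical read-mostly variable standing in
for the real EL2 data structures (readers may run concurrently,
writers are exclusive):

  static DEFINE_HYP_RWLOCK(sketch_lock);
  static unsigned long sketch_read_mostly_val;

  static unsigned long sketch_read(void)
  {
  	unsigned long v;

  	hyp_read_lock(&sketch_lock);
  	v = sketch_read_mostly_val;
  	hyp_read_unlock(&sketch_lock);

  	return v;
  }

  static void sketch_update(unsigned long v)
  {
  	hyp_write_lock(&sketch_lock);
  	sketch_read_mostly_val = v;
  	hyp_write_unlock(&sketch_lock);
  }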

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/rwlock.h | 129 +++++++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/rwlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/rwlock.h b/arch/arm64/kvm/hyp/include/nvhe/rwlock.h
new file mode 100644
index 000000000000..365084497e59
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/rwlock.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone rwlock implementation for use by the non-VHE KVM
+ * hypervisor code running at EL2. This is *not* a fair lock and is
+ * likely to scale very badly under contention.
+ *
+ * Copyright (C) 2022 Google LLC
+ * Author: Will Deacon <will@kernel.org>
+ *
+ * Heavily based on the implementation removed by 087133ac9076 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_RWLOCK_H__
+#define __ARM64_KVM_NVHE_RWLOCK_H__
+
+#include <linux/bits.h>
+
+typedef struct {
+	u32	__val;
+} hyp_rwlock_t;
+
+#define __HYP_RWLOCK_INITIALIZER \
+	{ .__val = 0 }
+
+#define __HYP_RWLOCK_UNLOCKED \
+	((hyp_rwlock_t) __HYP_RWLOCK_INITIALIZER)
+
+#define DEFINE_HYP_RWLOCK(x)	hyp_rwlock_t x = __HYP_RWLOCK_UNLOCKED
+
+#define hyp_rwlock_init(l)						\
+do {									\
+	*(l) = __HYP_RWLOCK_UNLOCKED;					\
+} while (0)
+
+#define __HYP_RWLOCK_WRITER_BIT	31
+
+static inline void hyp_write_lock(hyp_rwlock_t *lock)
+{
+	u32 tmp;
+
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"	sevl\n"
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, %1\n"
+	"	cbnz	%w0, 1b\n"
+	"	stxr	%w0, %w2, %1\n"
+	"	cbnz	%w0, 2b\n"
+	__nops(1),
+	/* LSE atomics */
+	"1:	mov	%w0, wzr\n"
+	"2:	casa	%w0, %w2, %1\n"
+	"	cbz	%w0, 3f\n"
+	"	ldxr	%w0, %1\n"
+	"	cbz	%w0, 2b\n"
+	"	wfe\n"
+	"	b	1b\n"
+	"3:")
+	: "=&r" (tmp), "+Q" (lock->__val)
+	: "r" (BIT(__HYP_RWLOCK_WRITER_BIT))
+	: "memory");
+}
+
+static inline void hyp_write_unlock(hyp_rwlock_t *lock)
+{
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	"	stlr	wzr, %0",
+	"	swpl	wzr, wzr, %0")
+	: "=Q" (lock->__val) :: "memory");
+}
+
+static inline void hyp_read_lock(hyp_rwlock_t *lock)
+{
+	u32 tmp, tmp2;
+
+	asm volatile(
+	"	sevl\n"
+	ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, %2\n"
+	"	add	%w0, %w0, #1\n"
+	"	tbnz	%w0, %3, 1b\n"
+	"	stxr	%w1, %w0, %2\n"
+	"	cbnz	%w1, 2b\n"
+	__nops(1),
+	/* LSE atomics */
+	"1:	wfe\n"
+	"2:	ldxr	%w0, %2\n"
+	"	adds	%w1, %w0, #1\n"
+	"	tbnz	%w1, %3, 1b\n"
+	"	casa	%w0, %w1, %2\n"
+	"	sbc	%w0, %w1, %w0\n"
+	"	cbnz	%w0, 2b")
+	: "=&r" (tmp), "=&r" (tmp2), "+Q" (lock->__val)
+	: "i" (__HYP_RWLOCK_WRITER_BIT)
+	: "cc", "memory");
+}
+
+static inline void hyp_read_unlock(hyp_rwlock_t *lock)
+{
+	u32 tmp, tmp2;
+
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"1:	ldxr	%w0, %2\n"
+	"	sub	%w0, %w0, #1\n"
+	"	stlxr	%w1, %w0, %2\n"
+	"	cbnz	%w1, 1b",
+	/* LSE atomics */
+	"	movn	%w0, #0\n"
+	"	staddl	%w0, %2\n"
+	__nops(2))
+	: "=&r" (tmp), "=&r" (tmp2), "+Q" (lock->__val)
+	:
+	: "memory");
+}
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static inline void hyp_assert_write_lock_held(hyp_rwlock_t *lock)
+{
+	BUG_ON(!(READ_ONCE(lock->__val) & BIT(__HYP_RWLOCK_WRITER_BIT)));
+}
+#else
+static inline void hyp_assert_write_lock_held(hyp_rwlock_t *lock) { }
+#endif
+
+#endif	/* __ARM64_KVM_NVHE_RWLOCK_H__ */
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 26/44] KVM: arm64: Add atomics-based checking refcount implementation at EL2
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (24 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 25/44] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 27/44] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
                   ` (17 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

The current nVHE refcount implementation at EL2 uses a simple spinlock
to serialise accesses. Although this works, it forces serialisation in
places where it is not necessary, so introduce a simple atomics-based
refcount implementation instead.
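
As a rough sketch of the intended usage (the 'obj' wrapper below is
hypothetical; only the hyp_refcount_*() helpers come from this patch):

  #include <nvhe/refcount.h>

  struct obj {
  	u16	refcount;	/* only 16-bit refcounts are handled for now */
  };

  static void obj_init(struct obj *o)
  {
  	hyp_refcount_set(o->refcount, 1);	/* WARNs if the count was already live */
  }

  static void obj_get(struct obj *o)
  {
  	hyp_refcount_inc(o->refcount);		/* BUG()s if the result goes negative */
  }

  static bool obj_put(struct obj *o)
  {
  	/* True when the last reference has been dropped. */
  	return hyp_refcount_dec(o->refcount) == 0;
  }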

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/refcount.h | 72 ++++++++++++++++++++++
 1 file changed, 72 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/refcount.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/refcount.h b/arch/arm64/kvm/hyp/include/nvhe/refcount.h
new file mode 100644
index 000000000000..e90e66f49651
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/refcount.h
@@ -0,0 +1,72 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Atomics-based checking refcount implementation.
+ * Copyright (C) 2023 Google LLC
+ * Author: Will Deacon <will@kernel.org>
+ */
+#ifndef __ARM64_KVM_NVHE_REFCOUNT_H__
+#define __ARM64_KVM_NVHE_REFCOUNT_H__
+
+#include <asm/lse.h>
+
+static inline s16 __ll_sc_refcount_fetch_add_16(u16 *refcount, s16 addend)
+{
+	u16 new;
+	u32 flag;
+
+	asm volatile(
+	"	prfm	pstl1strm, %[refcount]\n"
+	"1:	ldxrh	%w[new], %[refcount]\n"
+	"	add	%w[new], %w[new], %w[addend]\n"
+	"	stxrh	%w[flag], %w[new], %[refcount]\n"
+	"	cbnz	%w[flag], 1b"
+	: [refcount] "+Q" (*refcount),
+	  [new] "=&r" (new),
+	  [flag] "=&r" (flag)
+	: [addend] "Ir" (addend));
+
+	return new;
+}
+
+static inline s16 __lse_refcount_fetch_add_16(u16 *refcount, s16 addend)
+{
+	s16 old;
+
+	asm volatile(__LSE_PREAMBLE
+	"	ldaddh	%w[addend], %w[old], %[refcount]"
+	: [refcount] "+Q" (*refcount),
+	  [old] "=r" (old)
+	: [addend] "r" (addend));
+
+	return old + addend;
+}
+
+static inline u64 __hyp_refcount_fetch_add(void *refcount, const size_t size,
+					   const s64 addend)
+{
+	s64 new;
+
+	switch (size) {
+	case 2:
+		new = __lse_ll_sc_body(refcount_fetch_add_16, refcount, addend);
+		break;
+	default:
+		BUILD_BUG_ON_MSG(1, "Unsupported refcount size");
+		unreachable();
+	}
+
+	BUG_ON(new < 0);
+	return new;
+}
+
+
+#define hyp_refcount_inc(r)	__hyp_refcount_fetch_add(&(r), sizeof(r), 1)
+#define hyp_refcount_dec(r)	__hyp_refcount_fetch_add(&(r), sizeof(r), -1)
+#define hyp_refcount_get(r)	READ_ONCE(r)
+#define hyp_refcount_set(r, v)	do {			\
+	typeof(r) *__rp = &(r);				\
+	WARN_ON(hyp_refcount_get(*__rp));		\
+	WRITE_ONCE(*__rp, v);				\
+} while (0)
+
+#endif /* __ARM64_KVM_NVHE_REFCOUNT_H__ */
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 27/44] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (25 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 26/44] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 28/44] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
                   ` (16 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

Convert the 'struct hyp_page' refcount manipulation functions over to
the new atomic refcount helpers. For now, this will make absolutely no
difference because the 'struct hyp_pool' locking is still serialising
everything. One step at a time...

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 +++++++-----------
 arch/arm64/kvm/hyp/nvhe/mem_protect.c    |  2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c     |  5 ++++-
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index ab205c4d6774..74474c82667b 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,6 +6,7 @@
 #include <asm/page.h>
 
 #include <linux/types.h>
+#include <nvhe/refcount.h>
 
 struct hyp_page {
 	unsigned short refcount;
@@ -39,37 +40,32 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
 #define hyp_page_to_pool(page)	(((struct hyp_page *)page)->pool)
 
 /*
- * Refcounting for 'struct hyp_page'.
- * hyp_pool::lock must be held if atomic access to the refcount is required.
+ * Refcounting wrappers for 'struct hyp_page'.
  */
 static inline int hyp_page_count(void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	return p->refcount;
+	return hyp_refcount_get(p->refcount);
 }
 
 static inline void hyp_page_ref_inc(struct hyp_page *p)
 {
-	BUG_ON(p->refcount == USHRT_MAX);
-	p->refcount++;
+	hyp_refcount_inc(p->refcount);
 }
 
 static inline void hyp_page_ref_dec(struct hyp_page *p)
 {
-	BUG_ON(!p->refcount);
-	p->refcount--;
+	hyp_refcount_dec(p->refcount);
 }
 
 static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
 {
-	hyp_page_ref_dec(p);
-	return (p->refcount == 0);
+	return hyp_refcount_dec(p->refcount) == 0;
 }
 
 static inline void hyp_set_page_refcounted(struct hyp_page *p)
 {
-	BUG_ON(p->refcount);
-	p->refcount = 1;
+	hyp_refcount_set(p->refcount, 1);
 }
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index d48990eae1ef..4b20bf553312 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -202,7 +202,7 @@ static void *guest_s2_zalloc_page(void *mc)
 	memset(addr, 0, PAGE_SIZE);
 	p = hyp_virt_to_page(addr);
 	memset(p, 0, sizeof(*p));
-	p->refcount = 1;
+	hyp_set_page_refcounted(p);
 
 	return addr;
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index e691290d3765..169cdb43b4b8 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -55,7 +55,10 @@ static struct hyp_page *__find_buddy_avail(struct hyp_pool *pool,
 {
 	struct hyp_page *buddy = __find_buddy_nocheck(pool, p, order);
 
-	if (!buddy || buddy->order != order || buddy->refcount)
+	if (!buddy)
+		return NULL;
+
+	if (buddy->order != order || hyp_refcount_get(buddy->refcount))
 		return NULL;
 
 	return buddy;
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 28/44] KVM: arm64: Remove locking from EL2 allocation fast-paths
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (26 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 27/44] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 29/44] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
                   ` (15 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

hyp_{get,put}_page() are called extensively from the page-table code
to adjust reference counts on page-table pages. As a small step towards
removing reader serialisation on these paths, drop the 'hyp_pool' lock
in the case where the refcount remains positive, only taking the lock
if the page is to be freed back to the allocator.

Remove a misleading comment at the same time: it implies that a page
with a refcount of zero which is not attached to a freelist is the
unsafe state, whereas in practice it is the opposite situation that
can lead to problems.
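
With this change, the put path only needs to serialise when the page
is actually freed back to the buddy allocator; condensed, the fast/slow
path split in the diff below is:

  static void __hyp_put_page(struct hyp_pool *pool, struct hyp_page *p)
  {
  	/* Fast path: a pure atomic decrement, no pool lock taken. */
  	if (hyp_page_ref_dec_and_test(p)) {
  		/* Slow path: updating the buddy tree still needs the lock. */
  		hyp_spin_lock(&pool->lock);
  		__hyp_attach_page(pool, p);
  		hyp_spin_unlock(&pool->lock);
  	}
  }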

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h |  6 +-----
 arch/arm64/kvm/hyp/nvhe/page_alloc.c  | 16 ++++------------
 2 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
index 97c527ef53c2..24eb7840d98e 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/gfp.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -10,11 +10,7 @@
 #define HYP_NO_ORDER	USHRT_MAX
 
 struct hyp_pool {
-	/*
-	 * Spinlock protecting concurrent changes to the memory pool as well as
-	 * the struct hyp_page of the pool's pages until we have a proper atomic
-	 * API at EL2.
-	 */
+	/* lock protecting concurrent changes to the memory pool. */
 	hyp_spinlock_t lock;
 	struct list_head free_area[NR_PAGE_ORDERS];
 	phys_addr_t range_start;
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index 169cdb43b4b8..96b52f545af0 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -155,33 +155,25 @@ static struct hyp_page *__hyp_extract_page(struct hyp_pool *pool,
 
 static void __hyp_put_page(struct hyp_pool *pool, struct hyp_page *p)
 {
-	if (hyp_page_ref_dec_and_test(p))
+	if (hyp_page_ref_dec_and_test(p)) {
+		hyp_spin_lock(&pool->lock);
 		__hyp_attach_page(pool, p);
+		hyp_spin_unlock(&pool->lock);
+	}
 }
 
-/*
- * Changes to the buddy tree and page refcounts must be done with the hyp_pool
- * lock held. If a refcount change requires an update to the buddy tree (e.g.
- * hyp_put_page()), both operations must be done within the same critical
- * section to guarantee transient states (e.g. a page with null refcount but
- * not yet attached to a free list) can't be observed by well-behaved readers.
- */
 void hyp_put_page(struct hyp_pool *pool, void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	hyp_spin_lock(&pool->lock);
 	__hyp_put_page(pool, p);
-	hyp_spin_unlock(&pool->lock);
 }
 
 void hyp_get_page(struct hyp_pool *pool, void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	hyp_spin_lock(&pool->lock);
 	hyp_page_ref_inc(p);
-	hyp_spin_unlock(&pool->lock);
 }
 
 void hyp_split_page(struct hyp_page *p)
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 29/44] KVM: arm64: Reformat/beautify PTP hypercall documentation
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (27 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 28/44] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 30/44] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
                   ` (14 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

The PTP hypercall documentation doesn't produce the best-looking table
when formatted as HTML, as all of the return value definitions end up
on the same line.

Reformat the PTP hypercall documentation to follow the formatting used
by hypercalls.rst.
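
For context, a guest issues this call roughly as sketched below, using
the generic SMCCC helpers; the function name is made up and this is
not a copy of the ptp_kvm driver code:

  #include <linux/arm-smccc.h>

  static int kvm_ptp_read_virt(u64 *wall_clock, u64 *cycles)
  {
  	struct arm_smccc_res res;

  	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID,
  			     KVM_PTP_VIRT_COUNTER, &res);
  	if (res.a0 == SMCCC_RET_NOT_SUPPORTED)
  		return -EOPNOTSUPP;

  	*wall_clock = ((u64)res.a0 << 32) | (u32)res.a1;	/* R0:R1 */
  	*cycles     = ((u64)res.a2 << 32) | (u32)res.a3;	/* R2:R3 */

  	return 0;
  }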

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/arm/ptp_kvm.rst | 38 ++++++++++++++++----------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/Documentation/virt/kvm/arm/ptp_kvm.rst b/Documentation/virt/kvm/arm/ptp_kvm.rst
index aecdc80ddcd8..7c0960970a0e 100644
--- a/Documentation/virt/kvm/arm/ptp_kvm.rst
+++ b/Documentation/virt/kvm/arm/ptp_kvm.rst
@@ -7,19 +7,29 @@ PTP_KVM is used for high precision time sync between host and guests.
 It relies on transferring the wall clock and counter value from the
 host to the guest using a KVM-specific hypercall.
 
-* ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: 0x86000001
+``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
+----------------------------------------
 
-This hypercall uses the SMC32/HVC32 calling convention:
+Retrieve current time information for the specific counter. There are no
+endianness restrictions.
 
-ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID
-    ==============    ========    =====================================
-    Function ID:      (uint32)    0x86000001
-    Arguments:        (uint32)    KVM_PTP_VIRT_COUNTER(0)
-                                  KVM_PTP_PHYS_COUNTER(1)
-    Return Values:    (int32)     NOT_SUPPORTED(-1) on error, or
-                      (uint32)    Upper 32 bits of wall clock time (r0)
-                      (uint32)    Lower 32 bits of wall clock time (r1)
-                      (uint32)    Upper 32 bits of counter (r2)
-                      (uint32)    Lower 32 bits of counter (r3)
-    Endianness:                   No Restrictions.
-    ==============    ========    =====================================
++---------------------+-------------------------------------------------------+
+| Presence:           | Optional                                              |
++---------------------+-------------------------------------------------------+
+| Calling convention: | HVC32                                                 |
++---------------------+----------+--------------------------------------------+
+| Function ID:        | (uint32) | 0x86000001                                 |
++---------------------+----------+----+---------------------------------------+
+| Arguments:          | (uint32) | R1 | ``KVM_PTP_VIRT_COUNTER (0)``          |
+|                     |          |    +---------------------------------------+
+|                     |          |    | ``KVM_PTP_PHYS_COUNTER (1)``          |
++---------------------+----------+----+---------------------------------------+
+| Return Values:      | (int32)  | R0 | ``NOT_SUPPORTED (-1)`` on error, else |
+|                     |          |    | upper 32 bits of wall clock time      |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R1 | Lower 32 bits of wall clock time      |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R2 | Upper 32 bits of counter              |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R3 | Lower 32 bits of counter              |
++---------------------+----------+----+---------------------------------------+
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 30/44] KVM: arm64: Rename firmware pseudo-register documentation file
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (28 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 29/44] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 31/44] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
                   ` (13 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

In preparation for describing the guest view of KVM/arm64 hypercalls in
hypercalls.rst, move the existing contents of the file concerning the
firmware pseudo-registers elsewhere.

Cc: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 .../kvm/arm/{hypercalls.rst => fw-pseudo-registers.rst}     | 6 +++---
 Documentation/virt/kvm/arm/index.rst                        | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)
 rename Documentation/virt/kvm/arm/{hypercalls.rst => fw-pseudo-registers.rst} (97%)

diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/fw-pseudo-registers.rst
similarity index 97%
rename from Documentation/virt/kvm/arm/hypercalls.rst
rename to Documentation/virt/kvm/arm/fw-pseudo-registers.rst
index 3e23084644ba..b90fd0b0fa66 100644
--- a/Documentation/virt/kvm/arm/hypercalls.rst
+++ b/Documentation/virt/kvm/arm/fw-pseudo-registers.rst
@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-=======================
-ARM Hypercall Interface
-=======================
+=======================================
+ARM firmware pseudo-registers interface
+=======================================
 
 KVM handles the hypercall services as requested by the guests. New hypercall
 services are regularly made available by the ARM specification or by KVM (as
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
index 7f231c724e16..d28d65122290 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -7,8 +7,8 @@ ARM
 .. toctree::
    :maxdepth: 2
 
+   fw-pseudo-registers
    hyp-abi
-   hypercalls
    pvtime
    ptp_kvm
    vcpu-features
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 31/44] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (29 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 30/44] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 32/44] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
                   ` (12 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

KVM/arm64 makes use of the SMCCC "Vendor Specific Hypervisor Service
Call Range" to expose KVM-specific hypercalls to guests in a
discoverable and extensible fashion.

Document the existence of this interface and the discovery hypercall.
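
For context, guest-side discovery looks roughly like the sketch below
(along the lines of the existing SMCCC guest support, not a copy of
it; the function name is made up): query the service UID first, then
the feature bitmap.

  #include <linux/arm-smccc.h>
  #include <linux/bits.h>

  static bool kvm_ptp_hypercall_available(void)
  {
  	struct arm_smccc_res res;

  	/* "Call UID" for the Vendor Specific Hypervisor Service range. */
  	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
  	if (res.a0 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 ||
  	    res.a1 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1 ||
  	    res.a2 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2 ||
  	    res.a3 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3)
  		return false;

  	/* Each register holds a bitmap of 32 function numbers (R0: 0-31, ...). */
  	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID, &res);
  	return res.a0 & BIT(ARM_SMCCC_KVM_FUNC_PTP);
  }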

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/arm/hypercalls.rst | 46 +++++++++++++++++++++++
 Documentation/virt/kvm/arm/index.rst      |  1 +
 2 files changed, 47 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/hypercalls.rst

diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/hypercalls.rst
new file mode 100644
index 000000000000..17be111f493f
--- /dev/null
+++ b/Documentation/virt/kvm/arm/hypercalls.rst
@@ -0,0 +1,46 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================================
+KVM/arm64-specific hypercalls exposed to guests
+===============================================
+
+This file documents the KVM/arm64-specific hypercalls which may be
+exposed by KVM/arm64 to guest operating systems. These hypercalls are
+issued using the HVC instruction according to version 1.1 of the Arm SMC
+Calling Convention (DEN0028/C):
+
+https://developer.arm.com/docs/den0028/c
+
+All KVM/arm64-specific hypercalls are allocated within the "Vendor
+Specific Hypervisor Service Call" range with a UID of
+``28b46fb6-2ec5-11e9-a9ca-4b564d003a74``. This UID should be queried by the
+guest using the standard "Call UID" function for the service range in
+order to determine that the KVM/arm64-specific hypercalls are available.
+
+``ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID``
+---------------------------------------------
+
+Provides a discovery mechanism for other KVM/arm64 hypercalls.
+
++---------------------+-------------------------------------------------------------+
+| Presence:           | Mandatory for the KVM/arm64 UID                             |
++---------------------+-------------------------------------------------------------+
+| Calling convention: | HVC32                                                       |
++---------------------+----------+--------------------------------------------------+
+| Function ID:        | (uint32) | 0x86000000                                       |
++---------------------+----------+--------------------------------------------------+
+| Arguments:          | None                                                        |
++---------------------+----------+----+---------------------------------------------+
+| Return Values:      | (uint32) | R0 | Bitmap of available function numbers 0-31   |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R1 | Bitmap of available function numbers 32-63  |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R2 | Bitmap of available function numbers 64-95  |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R3 | Bitmap of available function numbers 96-127 |
++---------------------+----------+----+---------------------------------------------+
+
+``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
+----------------------------------------
+
+See ptp_kvm.rst
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
index d28d65122290..ec09881de4cf 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -9,6 +9,7 @@ ARM
 
    fw-pseudo-registers
    hyp-abi
+   hypercalls
    pvtime
    ptp_kvm
    vcpu-features
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 32/44] KVM: arm64: Prevent kmemleak from accessing .hyp.data
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (30 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 31/44] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 33/44] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
                   ` (11 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Quentin Perret <qperret@google.com>

We've added a .data section for the hypervisor, which kmemleak is
eager to parse. This clearly doesn't go well, so add the section
to kmemleak's block list.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/pkvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index b7be96a53597..a0af0926d9f8 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -259,6 +259,7 @@ static int __init finalize_pkvm(void)
 	 * at, which would end badly once inaccessible.
 	 */
 	kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
+	kmemleak_free_part(__hyp_rodata_start, __hyp_rodata_end - __hyp_rodata_start);
 	kmemleak_free_part_phys(hyp_mem_base, hyp_mem_size);
 
 	ret = pkvm_drop_host_privileges();
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 33/44] KVM: arm64: Issue CMOs when tearing down guest s2 pages
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (31 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 32/44] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 34/44] KVM: arm64: Do not set the virtual timer offset for protected vCPUs Fuad Tabba
                   ` (10 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Quentin Perret <qperret@google.com>

On the guest teardown path, pKVM will zero the pages used to back
the guest data structures before returning them to the host as
they may contain secrets (e.g. in the vCPU registers). However,
the zeroing is done using a cacheable alias, and CMOs are
missing, hence giving the host a potential opportunity to read
the original content of the guest structs from memory.

Fix this by issuing CMOs after zeroing the pages.
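
A condensed sketch of the resulting teardown ordering (the function
below is illustrative; in the actual code the zeroing is done by the
caller of __unmap_donated_memory()):

  static void teardown_donated_memory(void *va, size_t size)
  {
  	/* 1) Scrub potential secrets via the hypervisor's cacheable alias. */
  	memset(va, 0, size);

  	/* 2) Clean the zeroes to the PoC so the host cannot read stale data. */
  	kvm_flush_dcache_to_poc(va, size);

  	/* 3) Only then hand the pages back to the host. */
  	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(va),
  				       PAGE_ALIGN(size) >> PAGE_SHIFT));
  }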

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 7b5d245a371e..fb4801865db1 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -429,6 +429,7 @@ static void *map_donated_memory(unsigned long host_va, size_t size)
 
 static void __unmap_donated_memory(void *va, size_t size)
 {
+	kvm_flush_dcache_to_poc(va, size);
 	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(va),
 				       PAGE_ALIGN(size) >> PAGE_SHIFT));
 }
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 34/44] KVM: arm64: Do not set the virtual timer offset for protected vCPUs
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (32 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 33/44] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 35/44] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
                   ` (9 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

The host shouldn't be able to access the virtual timer state for
protected vCPUs. Moreover, protected vCPUs always run with a
virtual counter offset of 0.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/arch_timer.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/arch_timer.c b/arch/arm64/kvm/arch_timer.c
index 879982b1cc73..ab49c22694ca 100644
--- a/arch/arm64/kvm/arch_timer.c
+++ b/arch/arm64/kvm/arch_timer.c
@@ -162,8 +162,9 @@ static void timer_set_cval(struct arch_timer_context *ctxt, u64 cval)
 
 static void timer_set_offset(struct arch_timer_context *ctxt, u64 offset)
 {
-	if (!ctxt->offset.vm_offset) {
-		WARN(offset, "timer %ld\n", arch_timer_ctx_index(ctxt));
+	if (unlikely(!ctxt->offset.vm_offset)) {
+		WARN(offset && !kvm_vm_is_protected(ctxt->vcpu->kvm),
+			"timer %ld\n", arch_timer_ctx_index(ctxt));
 		return;
 	}
 
@@ -988,10 +989,14 @@ static void timer_context_init(struct kvm_vcpu *vcpu, int timerid)
 
 	ctxt->vcpu = vcpu;
 
-	if (timerid == TIMER_VTIMER)
-		ctxt->offset.vm_offset = &kvm->arch.timer_data.voffset;
-	else
-		ctxt->offset.vm_offset = &kvm->arch.timer_data.poffset;
+	if (!kvm_vm_is_protected(vcpu->kvm)) {
+		if (timerid == TIMER_VTIMER)
+			ctxt->offset.vm_offset = &kvm->arch.timer_data.voffset;
+		else
+			ctxt->offset.vm_offset = &kvm->arch.timer_data.poffset;
+	} else {
+		ctxt->offset.vm_offset = NULL;
+	}
 
 	hrtimer_init(&ctxt->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_HARD);
 	ctxt->hrtimer.function = kvm_hrtimer_expire;
@@ -1656,6 +1661,9 @@ int kvm_vm_ioctl_set_counter_offset(struct kvm *kvm,
 	if (offset->reserved)
 		return -EINVAL;
 
+	if (kvm_vm_is_protected(kvm))
+		return -EBUSY;
+
 	mutex_lock(&kvm->lock);
 
 	if (lock_all_vcpus(kvm)) {
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 35/44] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (33 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 34/44] KVM: arm64: Do not set the virtual timer offset for protected vCPUs Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 36/44] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
                   ` (8 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Fix the comment to clarify that __pkvm_vcpu_init_traps()
initializes traps for all VMs in protected mode, and not only
for protected VMs.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index fb4801865db1..1167c2296b65 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -199,7 +199,7 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 }
 
 /*
- * Initialize trap register values for protected VMs.
+ * Initialize trap register values in protected mode.
  */
 void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
 {
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 36/44] KVM: arm64: Do not re-initialize the KVM lock
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (34 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 35/44] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 37/44] KVM: arm64: Check directly whether a vcpu is protected Fuad Tabba
                   ` (7 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

The lock is already initialized in core KVM code at
kvm_create_vm().

Fixes: 9d0c063a4d1d ("KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/pkvm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index a0af0926d9f8..85117ea8f351 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -222,7 +222,6 @@ void pkvm_destroy_hyp_vm(struct kvm *host_kvm)
 
 int pkvm_init_host_vm(struct kvm *host_kvm)
 {
-	mutex_init(&host_kvm->lock);
 	return 0;
 }
 
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 37/44] KVM: arm64: Check directly whether a vcpu is protected
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (35 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 36/44] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 38/44] KVM: arm64: Trap debug break and watch from guest Fuad Tabba
                   ` (6 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Use the vcpu_is_protected() function instead of the more
long-winded check that goes through the vcpu's kvm structure.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/switch.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 302b6cf8f92c..28f3a6323940 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -209,7 +209,7 @@ static const exit_handler_fn pvm_exit_handlers[] = {
 
 static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)
 {
-	if (unlikely(kvm_vm_is_protected(kern_hyp_va(vcpu->kvm))))
+	if (unlikely(vcpu_is_protected(vcpu)))
 		return pvm_exit_handlers;
 
 	return hyp_exit_handlers;
@@ -228,9 +228,7 @@ static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)
  */
 static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
-
-	if (kvm_vm_is_protected(kvm) && vcpu_mode_is_32bit(vcpu)) {
+	if (unlikely(vcpu_is_protected(vcpu) && vcpu_mode_is_32bit(vcpu))) {
 		/*
 		 * As we have caught the guest red-handed, decide that it isn't
 		 * fit for purpose anymore by making the vcpu invalid. The VMM
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 38/44] KVM: arm64: Trap debug break and watch from guest
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (36 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 37/44] KVM: arm64: Check directly whether a vcpu is protected Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 39/44] KVM: arm64: Restrict protected VM capabilities Fuad Tabba
                   ` (5 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Debug and trace are not currently supported for protected guests, so
trap accesses to the related registers and emulate them as RAZ/WI.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c     |  2 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c | 11 +++++++++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 1167c2296b65..59c6b4317f29 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -111,7 +111,7 @@ static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)
 
 	/* Trap Debug */
 	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DebugVer), feature_ids))
-		mdcr_set |= MDCR_EL2_TDRA | MDCR_EL2_TDA | MDCR_EL2_TDE;
+		mdcr_set |= MDCR_EL2_TDRA | MDCR_EL2_TDA;
 
 	/* Trap OS Double Lock */
 	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_DoubleLock), feature_ids))
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index 18c1ca0a66b9..1604d170df53 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -345,6 +345,17 @@ static const struct sys_reg_desc pvm_sys_reg_descs[] = {
 	/* Cache maintenance by set/way operations are restricted. */
 
 	/* Debug and Trace Registers are restricted. */
+	RAZ_WI(SYS_DBGBVRn_EL1(0)),
+	RAZ_WI(SYS_DBGBCRn_EL1(0)),
+	RAZ_WI(SYS_DBGWVRn_EL1(0)),
+	RAZ_WI(SYS_DBGWCRn_EL1(0)),
+	RAZ_WI(SYS_MDSCR_EL1),
+	RAZ_WI(SYS_OSLAR_EL1),
+	RAZ_WI(SYS_OSLSR_EL1),
+	RAZ_WI(SYS_OSDLR_EL1),
+
+	/* Group 1 ID registers */
+	RAZ_WI(SYS_REVIDR_EL1),
 
 	/* AArch64 mappings of the AArch32 ID registers */
 	/* CRm=1 */
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 39/44] KVM: arm64: Restrict protected VM capabilities
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (37 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 38/44] KVM: arm64: Trap debug break and watch from guest Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 40/44] KVM: arm64: Do not support MTE for protected VMs Fuad Tabba
                   ` (4 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Restrict protected VM capabilities based on the fixed feature
configuration for protected VMs.
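
From the VMM's point of view, the clamped values are what
KVM_CHECK_EXTENSION reports on the VM file descriptor; a minimal
userspace sketch, assuming vm_fd refers to a protected VM:

  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int pvm_max_hw_bps(int vm_fd)
  {
  	/* min(host support, what PVM_ID_AA64DFR0_ALLOW permits), or 0 if none. */
  	return ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_DEBUG_HW_BPS);
  }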

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_pkvm.h | 27 ++++++++++++
 arch/arm64/kvm/arm.c              | 69 ++++++++++++++++++++++++++++++-
 2 files changed, 95 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 5bf5644aa5db..970b0cf72f7d 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -225,6 +225,33 @@ void pkvm_destroy_hyp_vm(struct kvm *kvm);
 	ARM64_FEATURE_MASK(ID_AA64ISAR2_EL1_MOPS) \
 	)
 
+/*
+ * Returns the maximum number of breakpoints supported for protected VMs.
+ */
+static inline int pkvm_get_max_brps(void)
+{
+	int num = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_BRPs),
+			    PVM_ID_AA64DFR0_ALLOW);
+
+	/*
+	 * If breakpoints are supported, the maximum number is 1 + the field.
+	 * Otherwise, return 0, which is not compliant with the architecture,
+	 * but is reserved and is used here to indicate no debug support.
+	 */
+	return num ? num + 1 : 0;
+}
+
+/*
+ * Returns the maximum number of watchpoints supported for protected VMs.
+ */
+static inline int pkvm_get_max_wrps(void)
+{
+	int num = FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_WRPs),
+			    PVM_ID_AA64DFR0_ALLOW);
+
+	return num ? num + 1 : 0;
+}
+
 extern struct memblock_region kvm_nvhe_sym(hyp_memory)[];
 extern unsigned int kvm_nvhe_sym(hyp_memblock_nr);
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 9cb39b6f070b..6e3c57c055d6 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -218,9 +218,10 @@ void kvm_arch_destroy_vm(struct kvm *kvm)
 	kvm_arm_teardown_hypercalls(kvm);
 }
 
-int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
+static int kvm_check_extension(struct kvm *kvm, long ext)
 {
 	int r;
+
 	switch (ext) {
 	case KVM_CAP_IRQCHIP:
 		r = vgic_present;
@@ -332,6 +333,72 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	return r;
 }
 
+/*
+ * Checks whether the extension specified in ext is supported in protected
+ * mode for the specified vm.
+ * The capabilities supported by kvm in general are passed in kvm_cap.
+ */
+static int pkvm_check_extension(struct kvm *kvm, long ext, int kvm_cap)
+{
+	int r;
+
+	switch (ext) {
+	case KVM_CAP_IRQCHIP:
+	case KVM_CAP_ARM_PSCI:
+	case KVM_CAP_ARM_PSCI_0_2:
+	case KVM_CAP_NR_VCPUS:
+	case KVM_CAP_MAX_VCPUS:
+	case KVM_CAP_MAX_VCPU_ID:
+	case KVM_CAP_MSI_DEVID:
+	case KVM_CAP_ARM_VM_IPA_SIZE:
+		r = kvm_cap;
+		break;
+	case KVM_CAP_GUEST_DEBUG_HW_BPS:
+		r = min(kvm_cap, pkvm_get_max_brps());
+		break;
+	case KVM_CAP_GUEST_DEBUG_HW_WPS:
+		r = min(kvm_cap, pkvm_get_max_wrps());
+		break;
+	case KVM_CAP_ARM_PMU_V3:
+		r = kvm_cap && FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_EL1_PMUVer),
+					 PVM_ID_AA64DFR0_ALLOW);
+		break;
+	case KVM_CAP_ARM_SVE:
+		r = kvm_cap && FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_SVE),
+					 PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
+		break;
+	case KVM_CAP_ARM_PTRAUTH_ADDRESS:
+		r = kvm_cap &&
+		    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_API),
+			      PVM_ID_AA64ISAR1_ALLOW) &&
+		    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_APA),
+			      PVM_ID_AA64ISAR1_ALLOW);
+		break;
+	case KVM_CAP_ARM_PTRAUTH_GENERIC:
+		r = kvm_cap &&
+		    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPI),
+			      PVM_ID_AA64ISAR1_ALLOW) &&
+		    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64ISAR1_EL1_GPA),
+			      PVM_ID_AA64ISAR1_ALLOW);
+		break;
+	default:
+		r = 0;
+		break;
+	}
+
+	return r;
+}
+
+int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
+{
+	int r = kvm_check_extension(kvm, ext);
+
+	if (kvm && kvm_vm_is_protected(kvm))
+		r = pkvm_check_extension(kvm, ext, r);
+
+	return r;
+}
+
 long kvm_arch_dev_ioctl(struct file *filp,
 			unsigned int ioctl, unsigned long arg)
 {
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 40/44] KVM: arm64: Do not support MTE for protected VMs
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (38 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 39/44] KVM: arm64: Restrict protected VM capabilities Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 41/44] KVM: arm64: Move pkvm_vcpu_init_traps() to hyp vcpu init Fuad Tabba
                   ` (3 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Return an error (-EINVAL) if trying to enable MTE on a protected
VM. MTE is not yet supported in protected mode; this check
ensures that pKVM isn't caught off-guard if such support is later
added to KVM.
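
From userspace, the failure surfaces via KVM_ENABLE_CAP on the VM file
descriptor; a minimal sketch, assuming vm_fd refers to a protected VM:

  #include <string.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static int try_enable_mte(int vm_fd)
  {
  	struct kvm_enable_cap cap;

  	memset(&cap, 0, sizeof(cap));
  	cap.cap = KVM_CAP_ARM_MTE;

  	/* Now fails with -1/EINVAL for a protected VM. */
  	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
  }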

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/arm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6e3c57c055d6..79783db5dac8 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -86,7 +86,9 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
-		if (!system_supports_mte() || kvm->created_vcpus) {
+		if (!system_supports_mte() ||
+		    kvm_vm_is_protected(kvm) ||
+		    kvm->created_vcpus) {
 			r = -EINVAL;
 		} else {
 			r = 0;
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 41/44] KVM: arm64: Move pkvm_vcpu_init_traps() to hyp vcpu init
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (39 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 40/44] KVM: arm64: Do not support MTE for protected VMs Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 42/44] KVM: arm64: Fix initializing traps in protected mode Fuad Tabba
                   ` (2 subsequent siblings)
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

Move pkvm_vcpu_init_traps() to the initialization of the hyp
vcpu, and remove the associated hypercall. Traps need to be
initialized at every vcpu init. This simplifies the code and
saves a hypercall per vcpu initialization.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_asm.h               | 1 -
 arch/arm64/kvm/arm.c                           | 8 --------
 arch/arm64/kvm/hyp/include/nvhe/trap_handler.h | 2 --
 arch/arm64/kvm/hyp/nvhe/hyp-main.c             | 8 --------
 arch/arm64/kvm/hyp/nvhe/pkvm.c                 | 4 +++-
 5 files changed, 3 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index a6330460d9e5..f8caa2520903 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -75,7 +75,6 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff,
 	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_vmcr_aprs,
 	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_vmcr_aprs,
-	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
 	__KVM_HOST_SMCCC_FUNC___pkvm_teardown_vm,
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 79783db5dac8..828bbd9e2d94 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -776,14 +776,6 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu)
 		static_branch_inc(&userspace_irqchip_in_use);
 	}
 
-	/*
-	 * Initialize traps for protected VMs.
-	 * NOTE: Move to run in EL2 directly, rather than via a hypercall, once
-	 * the code is in place for first run initialization at EL2.
-	 */
-	if (kvm_vm_is_protected(kvm))
-		kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
-
 	mutex_lock(&kvm->arch.config_lock);
 	set_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags);
 	mutex_unlock(&kvm->arch.config_lock);
diff --git a/arch/arm64/kvm/hyp/include/nvhe/trap_handler.h b/arch/arm64/kvm/hyp/include/nvhe/trap_handler.h
index 45a84f0ade04..1e6d995968a1 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/trap_handler.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/trap_handler.h
@@ -15,6 +15,4 @@
 #define DECLARE_REG(type, name, ctxt, reg)	\
 				type name = (type)cpu_reg(ctxt, (reg))
 
-void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu);
-
 #endif /* __ARM64_KVM_NVHE_TRAP_HANDLER_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index b7d7ca966f2e..aa50f2cb9d09 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -270,13 +270,6 @@ static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
 }
 
-static void handle___pkvm_vcpu_init_traps(struct kvm_cpu_context *host_ctxt)
-{
-	DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
-
-	__pkvm_vcpu_init_traps(kern_hyp_va(vcpu));
-}
-
 static void handle___pkvm_init_vm(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct kvm *, host_kvm, host_ctxt, 1);
@@ -332,7 +325,6 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__kvm_timer_set_cntvoff),
 	HANDLE_FUNC(__vgic_v3_save_vmcr_aprs),
 	HANDLE_FUNC(__vgic_v3_restore_vmcr_aprs),
-	HANDLE_FUNC(__pkvm_vcpu_init_traps),
 	HANDLE_FUNC(__pkvm_init_vm),
 	HANDLE_FUNC(__pkvm_init_vcpu),
 	HANDLE_FUNC(__pkvm_teardown_vm),
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 59c6b4317f29..bc520f9f6d07 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -201,7 +201,7 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 /*
  * Initialize trap register values in protected mode.
  */
-void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
+static void pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
 {
 	pvm_init_trap_regs(vcpu);
 	pvm_init_traps_aa64pfr0(vcpu);
@@ -332,6 +332,8 @@ static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
 
 	hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu;
 	hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags);
+
+	pkvm_vcpu_init_traps(&hyp_vcpu->vcpu);
 done:
 	if (ret)
 		unpin_host_vcpu(host_vcpu);
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 42/44] KVM: arm64: Fix initializing traps in protected mode
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (40 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 41/44] KVM: arm64: Move pkvm_vcpu_init_traps() to hyp vcpu init Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 43/44] KVM: arm64: Advertise GICv3 sysreg interface to protected guests Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 44/44] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

The values of the trapping registers for protected VMs should be
computed from the ground up, and not depend on potentially
preexisting values.

Moreover, non-protected VMs should not be restricted in protected
mode in the same manner as protected VMs.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 48 +++++++++++++++++++++++-----------
 1 file changed, 33 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index bc520f9f6d07..5693b431b310 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -179,19 +179,27 @@ static void pvm_init_traps_aa64mmfr1(struct kvm_vcpu *vcpu)
  */
 static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 {
-	const u64 hcr_trap_feat_regs = HCR_TID3;
-	const u64 hcr_trap_impdef = HCR_TACR | HCR_TIDCP | HCR_TID1;
-
 	/*
 	 * Always trap:
 	 * - Feature id registers: to control features exposed to guests
 	 * - Implementation-defined features
 	 */
-	vcpu->arch.hcr_el2 |= hcr_trap_feat_regs | hcr_trap_impdef;
+	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS |
+			     HCR_TID3 | HCR_TACR | HCR_TIDCP | HCR_TID1;
+
+	if (cpus_have_final_cap(ARM64_HAS_RAS_EXTN)) {
+		/* route synchronous external abort exceptions to EL2 */
+		vcpu->arch.hcr_el2 |= HCR_TEA;
+		/* trap error record accesses */
+		vcpu->arch.hcr_el2 |= HCR_TERR;
+	}
+
+	if (cpus_have_final_cap(ARM64_HAS_STAGE2_FWB))
+		vcpu->arch.hcr_el2 |= HCR_FWB;
+
+	if (cpus_have_final_cap(ARM64_MISMATCHED_CACHE_TYPE))
+		vcpu->arch.hcr_el2 |= HCR_TID2;
 
-	/* Clear res0 and set res1 bits to trap potential new features. */
-	vcpu->arch.hcr_el2 &= ~(HCR_RES0);
-	vcpu->arch.mdcr_el2 &= ~(MDCR_EL2_RES0);
 	if (!has_hvhe()) {
 		vcpu->arch.cptr_el2 |= CPTR_NVHE_EL2_RES1;
 		vcpu->arch.cptr_el2 &= ~(CPTR_NVHE_EL2_RES0);
@@ -201,14 +209,24 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 /*
  * Initialize trap register values in protected mode.
  */
-static void pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
+static void pkvm_vcpu_init_traps(struct pkvm_hyp_vcpu *hyp_vcpu)
 {
-	pvm_init_trap_regs(vcpu);
-	pvm_init_traps_aa64pfr0(vcpu);
-	pvm_init_traps_aa64pfr1(vcpu);
-	pvm_init_traps_aa64dfr0(vcpu);
-	pvm_init_traps_aa64mmfr0(vcpu);
-	pvm_init_traps_aa64mmfr1(vcpu);
+	hyp_vcpu->vcpu.arch.cptr_el2 = kvm_get_reset_cptr_el2(&hyp_vcpu->vcpu);
+	hyp_vcpu->vcpu.arch.mdcr_el2 = 0;
+
+	if (!pkvm_hyp_vcpu_is_protected(hyp_vcpu)) {
+		u64 hcr = READ_ONCE(hyp_vcpu->host_vcpu->arch.hcr_el2);
+
+		hyp_vcpu->vcpu.arch.hcr_el2 = HCR_GUEST_FLAGS | hcr;
+		return;
+	}
+
+	pvm_init_trap_regs(&hyp_vcpu->vcpu);
+	pvm_init_traps_aa64pfr0(&hyp_vcpu->vcpu);
+	pvm_init_traps_aa64pfr1(&hyp_vcpu->vcpu);
+	pvm_init_traps_aa64dfr0(&hyp_vcpu->vcpu);
+	pvm_init_traps_aa64mmfr0(&hyp_vcpu->vcpu);
+	pvm_init_traps_aa64mmfr1(&hyp_vcpu->vcpu);
 }
 
 /*
@@ -333,7 +351,7 @@ static int init_pkvm_hyp_vcpu(struct pkvm_hyp_vcpu *hyp_vcpu,
 	hyp_vcpu->vcpu.arch.hw_mmu = &hyp_vm->kvm.arch.mmu;
 	hyp_vcpu->vcpu.arch.cflags = READ_ONCE(host_vcpu->arch.cflags);
 
-	pkvm_vcpu_init_traps(&hyp_vcpu->vcpu);
+	pkvm_vcpu_init_traps(hyp_vcpu);
 done:
 	if (ret)
 		unpin_host_vcpu(host_vcpu);
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 43/44] KVM: arm64: Advertise GICv3 sysreg interface to protected guests
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (41 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 42/44] KVM: arm64: Fix initializing traps in protected mode Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  2024-03-27 17:35 ` [PATCH v1 44/44] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Will Deacon <will@kernel.org>

Advertise the system register GICv3 CPU interface to protected guests
as that is the only supported configuration under pKVM.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_pkvm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_pkvm.h b/arch/arm64/include/asm/kvm_pkvm.h
index 970b0cf72f7d..40bebcffc9d0 100644
--- a/arch/arm64/include/asm/kvm_pkvm.h
+++ b/arch/arm64/include/asm/kvm_pkvm.h
@@ -45,11 +45,13 @@ void pkvm_destroy_hyp_vm(struct kvm *kvm);
 /*
  * Allow for protected VMs:
  * - Floating-point and Advanced SIMD
+ * - GICv3(+) system register interface
  * - Data Independent Timing
  */
 #define PVM_ID_AA64PFR0_ALLOW (\
 	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_FP) | \
 	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_AdvSIMD) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_GIC) | \
 	ARM64_FEATURE_MASK(ID_AA64PFR0_EL1_DIT) \
 	)
 
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* [PATCH v1 44/44] KVM: arm64: Force injection of a data abort on NISV MMIO exit
  2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (42 preceding siblings ...)
  2024-03-27 17:35 ` [PATCH v1 43/44] KVM: arm64: Advertise GICv3 sysreg interface to protected guests Fuad Tabba
@ 2024-03-27 17:35 ` Fuad Tabba
  43 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-03-27 17:35 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta

From: Marc Zyngier <maz@kernel.org>

If a vcpu exits for a data abort with an invalid syndrome, the
expectations are that userspace has a chance to save the day if
it has requested to see such exits.

However, this is completely futile in the case of a protected VM,
as none of the state is available. In this particular case, inject
a data abort directly into the vcpu, consistent with what userspace
could do.

This also helps with pKVM, which discards all syndrome information when
forwarding data aborts that are not known to be MMIO.

Finally, hide the RETURN_NISV_IO_ABORT_TO_USER cap from userspace on
protected VMs, and document this tweak to the API.
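
For reference, a minimal sketch of what this looks like from a VMM's
point of view (illustrative only, error handling elided): the cap is
still reported on the system fd, but probing it on a protected VM's fd
now fails.

  struct kvm_enable_cap cap = { .cap = KVM_CAP_ARM_NISV_TO_USER };

  /*
   * KVM_CHECK_EXTENSION on the VM fd returns 0 for a protected VM after
   * this patch; KVM_ENABLE_CAP would fail with EINVAL there anyway.
   */
  if (ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_ARM_NISV_TO_USER) > 0)
          ioctl(vm_fd, KVM_ENABLE_CAP, &cap);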

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/api.rst |  7 +++++++
 arch/arm64/kvm/arm.c           | 14 ++++++++++----
 arch/arm64/kvm/mmio.c          |  9 +++++++++
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 0b5a33ee71ee..b11b70ae137e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6894,6 +6894,13 @@ Note that KVM does not skip the faulting instruction as it does for
 KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
 if it decides to decode and emulate the instruction.
 
+This feature isn't available to protected VMs, as userspace does not
+have access to the state that is required to perform the emulation.
+Instead, a data abort exception is directly injected in the guest.
+Note that although KVM_CAP_ARM_NISV_TO_USER will be reported if
+queried outside of a protected VM context, the feature will not be
+exposed if queried on a protected VM file descriptor.
+
 ::
 
 		/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 828bbd9e2d94..924010e8a8cc 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -80,9 +80,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 
 	switch (cap->cap) {
 	case KVM_CAP_ARM_NISV_TO_USER:
-		r = 0;
-		set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
-			&kvm->arch.flags);
+		if (kvm_vm_is_protected(kvm)) {
+			r = -EINVAL;
+		} else {
+			r = 0;
+			set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
+				&kvm->arch.flags);
+		}
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
@@ -240,7 +244,6 @@ static int kvm_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_IMMEDIATE_EXIT:
 	case KVM_CAP_VCPU_EVENTS:
 	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
-	case KVM_CAP_ARM_NISV_TO_USER:
 	case KVM_CAP_ARM_INJECT_EXT_DABT:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
@@ -250,6 +253,9 @@ static int kvm_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_COUNTER_OFFSET:
 		r = 1;
 		break;
+	case KVM_CAP_ARM_NISV_TO_USER:
+		r = !kvm || !kvm_vm_is_protected(kvm);
+		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
 		return KVM_GUESTDBG_VALID_MASK;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 5e1ffb0d5363..87fd8faf2b62 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -133,11 +133,20 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 	/*
 	 * No valid syndrome? Ask userspace for help if it has
 	 * volunteered to do so, and bail out otherwise.
+	 *
+	 * In the protected VM case, there isn't much userspace can do
+	 * though, so directly deliver an exception to the guest.
 	 */
 	if (!kvm_vcpu_dabt_isvalid(vcpu)) {
 		trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
 				    kvm_vcpu_get_hfar(vcpu), fault_ipa);
 
+		if (is_protected_kvm_enabled() &&
+		    kvm_vm_is_protected(vcpu->kvm)) {
+			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+			return 1;
+		}
+
 		if (test_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
 			     &vcpu->kvm->arch.flags)) {
 			run->exit_reason = KVM_EXIT_ARM_NISV;
-- 
2.44.0.478.gd926399ef9-goog


^ permalink raw reply related	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state
  2024-03-27 17:34 ` [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state Fuad Tabba
@ 2024-03-28 16:19   ` Mark Brown
  2024-04-08  7:39   ` Marc Zyngier
  1 sibling, 0 replies; 64+ messages in thread
From: Mark Brown @ 2024-03-28 16:19 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

[-- Attachment #1: Type: text/plain, Size: 406 bytes --]

On Wed, Mar 27, 2024 at 05:34:49PM +0000, Fuad Tabba wrote:
> Before the conversion of the various FP-state booleans into an
> enum representing the state, this helper may have clarified
> things. Since the introduction of the enum, the helper obfuscates
> rather than clarifies. This also makes the code consistent with
> other parts that check the FP-state.

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  2024-03-27 17:35 ` [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
@ 2024-03-28 18:53   ` Mark Brown
  2024-04-08 13:34     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Mark Brown @ 2024-03-28 18:53 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

[-- Attachment #1: Type: text/plain, Size: 1532 bytes --]

On Wed, Mar 27, 2024 at 05:35:00PM +0000, Fuad Tabba wrote:

> Expand comment clarifying why the host value representing SVE
> vector length being restored for ZCR_EL1 on guest exit isn't the
> same as it was on guest entry.

> -			/* Restore the VL that was saved when bound to the CPU */
> +			/*
> +			 * Restore the VL that was saved when bound to the CPU,
> +			 * which is the maximum VL for the guest. Because
> +			 * the layout of the data when saving the sve state
> +			 * depends on the VL, we need to use a consistent VL.
> +			 * Note that this means that at guest exit ZCR_EL1 is
> +			 * not necessarily the same as on guest entry.

I don't know if it's worth adding a reference to ZCR_EL1 fulfilling the
role of ZCR_EL2 when doing the save from EL2 in VHE mode; that's
potentially a bit more architecture to know, but it explains why we only
need this for nVHE.

> +			 * Flushing the cpu state sets the TIF_FOREIGN_FPSTATE
> +			 * bit for the context, which lets the kernel restore
> +			 * the sve state, including ZCR_EL1 later.
> +			 */

The bit about flushing probably wants to be a comment on the flush
itself, which is done unconditionally rather than only in the nVHE case.
Put something in there about how we need to save and invalidate the
state so that if the host tries to use floating point it's not using
stale data from the guest.
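
(For reference, the only architectural encoding relied on in the quoted
hunk below is ZCR_ELx.LEN, i.e. the VL in 128-bit granules minus one; a
quick sketch of the relationship, nothing more:)

  unsigned int vq  = sve_vq_from_vl(vl);  /* vl in bytes, vq = vl / 16 */
  unsigned int len = vq - 1;              /* value programmed into ZCR_ELx.LEN */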

>  			if (!has_vhe())
>  				sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
>  						       SYS_ZCR_EL1);
> -- 
> 2.44.0.478.gd926399ef9-goog
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers
  2024-03-27 17:35 ` [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
@ 2024-03-28 18:57   ` Mark Brown
  2024-04-08 13:35     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Mark Brown @ 2024-03-28 18:57 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

[-- Attachment #1: Type: text/plain, Size: 663 bytes --]

On Wed, Mar 27, 2024 at 05:35:01PM +0000, Fuad Tabba wrote:
> The main factor for determining the SVE state size is the vector
> length, and future patches will need to calculate it without
> necessarily having a vcpu as a reference.

> -#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
> -
> -#define vcpu_sve_state_size(vcpu) ({					\
> +#define _vcpu_sve_state_size(sve_max_vl) ({				\

If we're trying to make this a vCPU independent thing (which is fair
enough) shouldn't we also remove the vcpu from the name?
_sve_state_size() for example?  It feels like this might want sharing
with the host too but that could be done incrementally.
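
Roughly what I have in mind, as a sketch only (untested, and the name is
just my suggestion); it mirrors what the existing macro computes from a
maximum VL:

  #define _sve_state_size(sve_max_vl) ({			\
  	unsigned int __vq = sve_vq_from_vl(sve_max_vl);		\
  								\
  	(size_t)SVE_SIG_REGS_SIZE(__vq);			\
  })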

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore
  2024-03-27 17:35 ` [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore Fuad Tabba
@ 2024-03-28 19:17   ` Mark Brown
  2024-04-09  9:34     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Mark Brown @ 2024-03-28 19:17 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

[-- Attachment #1: Type: text/plain, Size: 1223 bytes --]

On Wed, Mar 27, 2024 at 05:35:02PM +0000, Fuad Tabba wrote:
> On restoring guest SVE state, use the guest's current active
> vector length. This reduces the amount of restoring for the cases

For this to work don't we also need to save the state with the
currently operational guest VL, since all the saves and loads are done
with VL-dependent operations?  It has crossed my mind to save and load the guest
state with the currently active guest VL since that's probably a little
quicker but we don't appear to be doing that.

> where the maximum size isn't used. Moreover, it fixes a bug where
> the ZCR_EL2 value wasn't being set when restoring the guest
> state, potentially corrupting it.

>  static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
>  {
> -	sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);

What was the bug with ZCR_EL2 not being set - I'm not clear how this
could be skipped?

> +	u64 zcr_el1 = __vcpu_sys_reg(vcpu, ZCR_EL1);
> +	u64 zcr_el2 = min(zcr_el1, vcpu_sve_max_vq(vcpu) - 1ULL);

This works currently since all the bits other than LEN are either RES0
or RAZ, but it will break if anything new is added; explicit extraction
of LEN is probably safer, though with a slight overhead.
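
Something along these lines is what I mean (sketch only, untested):

  u64 zcr_el1 = __vcpu_sys_reg(vcpu, ZCR_EL1);
  u64 len = min_t(u64, FIELD_GET(ZCR_ELx_LEN_MASK, zcr_el1),
  		vcpu_sve_max_vq(vcpu) - 1);
  u64 zcr_el2 = FIELD_PREP(ZCR_ELx_LEN_MASK, len);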

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
  2024-03-27 17:35 ` [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
@ 2024-03-28 19:20   ` Mark Brown
  0 siblings, 0 replies; 64+ messages in thread
From: Mark Brown @ 2024-03-28 19:20 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

[-- Attachment #1: Type: text/plain, Size: 282 bytes --]

On Wed, Mar 27, 2024 at 05:35:03PM +0000, Fuad Tabba wrote:
> pKVM maintains its own state for tracking the host fpsimd state.
> Therefore, no need to map and share the host's view with it.

Ah, that explains why that was confusing me!

Reviewed-by: Mark Brown <broonie@kernel.org>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state
  2024-03-27 17:34 ` [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state Fuad Tabba
  2024-03-28 16:19   ` Mark Brown
@ 2024-04-08  7:39   ` Marc Zyngier
  2024-04-08 13:39     ` Fuad Tabba
  1 sibling, 1 reply; 64+ messages in thread
From: Marc Zyngier @ 2024-04-08  7:39 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Wed, 27 Mar 2024 17:34:49 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> Before the conversion of the various FP-state booleans into an
> enum representing the state, this helper may have clarified
> things. Since the introduction of the enum, the helper obfuscates
> rather than clarifies. This also makes the code consistent with
> other parts that check the FP-state.
> 
> No functional change intended.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ------
>  arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
>  arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
>  3 files changed, 2 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index e3fcf8c4d5b4..1a6dfd035531 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -39,12 +39,6 @@ struct kvm_exception_table_entry {
>  extern struct kvm_exception_table_entry __start___kvm_ex_table;
>  extern struct kvm_exception_table_entry __stop___kvm_ex_table;
>  
> -/* Check whether the FP regs are owned by the guest */
> -static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu)
> -{
> -	return vcpu->arch.fp_state == FP_STATE_GUEST_OWNED;
> -}
> -
>  /* Save the 32-bit only FPSIMD system register state */
>  static inline void __fpsimd_save_fpexc32(struct kvm_vcpu *vcpu)
>  {
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index c50f8459e4fc..2a0b0d6da7c7 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -53,7 +53,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
>  			val |= CPTR_EL2_TSM;
>  	}
>  
> -	if (!guest_owns_fp_regs(vcpu)) {
> +	if (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED) {
>  		if (has_hvhe())
>  			val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |
>  				 CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
> diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
> index 1581df6aec87..e9197f086137 100644
> --- a/arch/arm64/kvm/hyp/vhe/switch.c
> +++ b/arch/arm64/kvm/hyp/vhe/switch.c
> @@ -75,7 +75,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
>  
>  	val |= CPTR_EL2_TAM;
>  
> -	if (guest_owns_fp_regs(vcpu)) {
> +	if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
>  		if (vcpu_has_sve(vcpu))
>  			val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;
>  	} else {

I'm not sure this buys us much. If anything, it makes moving the data
around more difficult (see [1]).

Overall, asking whether the vcpu owns the FP state seems a natural
question, and I would have expected this helper to be generalised
instead of being dropped.
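
Something like the below is what I would have expected (sketch only;
host_owns_fp_regs() is purely for illustration), hoisted somewhere both
the hyp and non-hyp code can see it:

  static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu)
  {
  	return vcpu->arch.fp_state == FP_STATE_GUEST_OWNED;
  }

  static inline bool host_owns_fp_regs(struct kvm_vcpu *vcpu)
  {
  	return vcpu->arch.fp_state == FP_STATE_HOST_OWNED;
  }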

Could you please elaborate on this?

Thanks,

	M.

[1] https://lore.kernel.org/all/20240322170945.3292593-6-maz@kernel.org/

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-03-27 17:34 ` [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
@ 2024-04-08  7:41   ` Marc Zyngier
  2024-04-08 15:41     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Marc Zyngier @ 2024-04-08  7:41 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Wed, 27 Mar 2024 17:34:50 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> Move the unlock earlier in user_mem_abort() to shorten the
> critical section. This also helps for future refactoring and
> reuse of similar code.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/mmu.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> index 18680771cdb0..3afc42d8833e 100644
> --- a/arch/arm64/kvm/mmu.c
> +++ b/arch/arm64/kvm/mmu.c
> @@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  
>  	read_lock(&kvm->mmu_lock);
>  	pgt = vcpu->arch.hw_mmu->pgt;
> -	if (mmu_invalidate_retry(kvm, mmu_seq))
> +	if (mmu_invalidate_retry(kvm, mmu_seq)) {
> +		ret = -EAGAIN;
>  		goto out_unlock;
> +	}
>  
>  	/*
>  	 * If we are not forced to use page mapping, check if we are
> @@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  					     memcache,
>  					     KVM_PGTABLE_WALK_HANDLE_FAULT |
>  					     KVM_PGTABLE_WALK_SHARED);
> +out_unlock:
> +	read_unlock(&kvm->mmu_lock);
>  
>  	/* Mark the page dirty only if the fault is handled successfully */
>  	if (writable && !ret) {
> @@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
>  		mark_page_dirty_in_slot(kvm, memslot, gfn);
>  	}
>  
> -out_unlock:
> -	read_unlock(&kvm->mmu_lock);
>  	kvm_release_pfn_clean(pfn);
>  	return ret != -EAGAIN ? ret : 0;
>  }

It now means that things such as marking a page dirty happen outside
of the lock, which may interact with the dirty log/bitmap stuff.

Can you elaborate on *why* this is correct?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path
  2024-03-27 17:34 ` [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
@ 2024-04-08  7:44   ` Marc Zyngier
  2024-04-08 13:48     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Marc Zyngier @ 2024-04-08  7:44 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Wed, 27 Mar 2024 17:34:51 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> From: Quentin Perret <qperret@google.com>
> 
> Under certain circumstances __get_fault_info() may resolve the faulting
> address using the AT instruction. Given that this is being done outside
> of the host lock critical section, it is racy and the resolution via AT
> may fail. We currently BUG() in this situation, which is obviously less
> than ideal. Moving the address resolution to the critical section may
> have a performance impact, so let's keep it where it is, but bail out
> and return to the host to try a second time.
> 
> Signed-off-by: Quentin Perret <qperret@google.com>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> index 861c76021a25..d48990eae1ef 100644
> --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> @@ -533,7 +533,15 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
>  	int ret = 0;
>  
>  	esr = read_sysreg_el2(SYS_ESR);
> -	BUG_ON(!__get_fault_info(esr, &fault));
> +	if (!__get_fault_info(esr, &fault)) {
> +		/* Setting the address to an invalid value for use in tracing. */
> +		addr = (u64)-1;

If this is relating to tracing, can this instead be added together with
the tracing itself?

> +		/*
> +		 * We've presumably raced with a page-table change which caused
> +		 * AT to fail, try again.
> +		 */
> +		return;
> +	}
>  
>  	addr = (fault.hpfar_el2 & HPFAR_MASK) << 8;
>  	ret = host_stage2_idmap(addr);

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  2024-03-28 18:53   ` Mark Brown
@ 2024-04-08 13:34     ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 13:34 UTC (permalink / raw)
  To: Mark Brown
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

Hi Mark,

Thanks for your reviews and sorry for the delay. I was traveling with
limited internet connectivity.

On Thu, Mar 28, 2024 at 6:53 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Mar 27, 2024 at 05:35:00PM +0000, Fuad Tabba wrote:
>
> > Expand comment clarifying why the host value representing SVE
> > vector length being restored for ZCR_EL1 on guest exit isn't the
> > same as it was on guest entry.
>
> > -                     /* Restore the VL that was saved when bound to the CPU */
> > +                     /*
> > +                      * Restore the VL that was saved when bound to the CPU,
> > +                      * which is the maximum VL for the guest. Because
> > +                      * the layout of the data when saving the sve state
> > +                      * depends on the VL, we need to use a consistent VL.
> > +                      * Note that this means that at guest exit ZCR_EL1 is
> > +                      * not necessarily the same as on guest entry.
>
> I don't know if it's worth adding a reference to ZCR_EL1 fulfilling the
> role of ZCR_EL2 when doing the save from EL2 in VHE mode; that's
> potentially a bit more architecture to know, but it explains why we only
> need this for nVHE.

Ack.

> > +                      * Flushing the cpu state sets the TIF_FOREIGN_FPSTATE
> > +                      * bit for the context, which lets the kernel restore
> > +                      * the sve state, including ZCR_EL1 later.
> > +                      */
>
> The bit about flushing probably wants to be a comment on the flush
> itself, which is done unconditionally rather than only in the nVHE case.
> Put something in there about how we need to save and invalidate the
> state so that if the host tries to use floating point it's not using
> stale data from the guest.

Will do.

Cheers,
/fuad

> >                       if (!has_vhe())
> >                               sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
> >                                                      SYS_ZCR_EL1);
> > --
> > 2.44.0.478.gd926399ef9-goog
> >

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers
  2024-03-28 18:57   ` Mark Brown
@ 2024-04-08 13:35     ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 13:35 UTC (permalink / raw)
  To: Mark Brown
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

Hi Mark,

On Thu, Mar 28, 2024 at 6:57 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Mar 27, 2024 at 05:35:01PM +0000, Fuad Tabba wrote:
> > The main factor for determining the SVE state size is the vector
> > length, and future patches will need to calculate it without
> > necessarily having a vcpu as a reference.
>
> > -#define vcpu_sve_max_vq(vcpu)        sve_vq_from_vl((vcpu)->arch.sve_max_vl)
> > -
> > -#define vcpu_sve_state_size(vcpu) ({                                 \
> > +#define _vcpu_sve_state_size(sve_max_vl) ({                          \
>
> If we're trying to make this a vCPU independent thing (which is fair
> enough) shouldn't we also remove the vcpu from the name?
> _sve_state_size() for example?  It feels like this might want sharing
> with the host too but that could be done incrementally.

Will do.

Cheers,
/fuad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state
  2024-04-08  7:39   ` Marc Zyngier
@ 2024-04-08 13:39     ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 13:39 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

Hi Marc,

On Mon, Apr 8, 2024 at 8:39 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 17:34:49 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Before the conversion of the various FP-state booleans into an
> > enum representing the state, this helper may have clarified
> > things. Since the introduction of the enum, the helper obfuscates
> > rather than clarifies. This also makes the code consistent with
> > other parts that check the FP-state.
> >
> > No functional change intended.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ------
> >  arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
> >  arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
> >  3 files changed, 2 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > index e3fcf8c4d5b4..1a6dfd035531 100644
> > --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> > +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> > @@ -39,12 +39,6 @@ struct kvm_exception_table_entry {
> >  extern struct kvm_exception_table_entry __start___kvm_ex_table;
> >  extern struct kvm_exception_table_entry __stop___kvm_ex_table;
> >
> > -/* Check whether the FP regs are owned by the guest */
> > -static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu)
> > -{
> > -     return vcpu->arch.fp_state == FP_STATE_GUEST_OWNED;
> > -}
> > -
> >  /* Save the 32-bit only FPSIMD system register state */
> >  static inline void __fpsimd_save_fpexc32(struct kvm_vcpu *vcpu)
> >  {
> > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> > index c50f8459e4fc..2a0b0d6da7c7 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> > @@ -53,7 +53,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> >                       val |= CPTR_EL2_TSM;
> >       }
> >
> > -     if (!guest_owns_fp_regs(vcpu)) {
> > +     if (vcpu->arch.fp_state != FP_STATE_GUEST_OWNED) {
> >               if (has_hvhe())
> >                       val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |
> >                                CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
> > diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
> > index 1581df6aec87..e9197f086137 100644
> > --- a/arch/arm64/kvm/hyp/vhe/switch.c
> > +++ b/arch/arm64/kvm/hyp/vhe/switch.c
> > @@ -75,7 +75,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
> >
> >       val |= CPTR_EL2_TAM;
> >
> > -     if (guest_owns_fp_regs(vcpu)) {
> > +     if (vcpu->arch.fp_state == FP_STATE_GUEST_OWNED) {
> >               if (vcpu_has_sve(vcpu))
> >                       val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;
> >       } else {
>
> I'm not sure this buys us much. If anything, it makes moving the data
> around more difficult (see [1]).
>
> Overall, asking whether the vcpu owns the FP state seems a natural
> question, and I would have expected this helper to be generalised
> instead of being dropped.
>
> Could you please elaborate on this?

It is more that this wasn't really being used much, and the reason
that this helper was created to begin with was that the state was
maintained between two booleans rather than an enum. With the enum,
doing a check wasn't that much harder.

That said, looking at [1], I see your point. Generalizing the helper
is likely to make other code more readable. I'll drop this patch on
the respin, and see if it's worth doing the generalization as you
suggest.

Thanks,
/fuad

> Thanks,
>
>         M.
>
> [1] https://lore.kernel.org/all/20240322170945.3292593-6-maz@kernel.org/
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path
  2024-04-08  7:44   ` Marc Zyngier
@ 2024-04-08 13:48     ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 13:48 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

Hi Marc,

On Mon, Apr 8, 2024 at 8:44 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 17:34:51 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > From: Quentin Perret <qperret@google.com>
> >
> > Under certain circumstances __get_fault_info() may resolve the faulting
> > address using the AT instruction. Given that this is being done outside
> > of the host lock critical section, it is racy and the resolution via AT
> > may fail. We currently BUG() in this situation, which is obviously less
> > than ideal. Moving the address resolution to the critical section may
> > have a performance impact, so let's keep it where it is, but bail out
> > and return to the host to try a second time.
> >
> > Signed-off-by: Quentin Perret <qperret@google.com>
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c | 10 +++++++++-
> >  1 file changed, 9 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > index 861c76021a25..d48990eae1ef 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
> > @@ -533,7 +533,15 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
> >       int ret = 0;
> >
> >       esr = read_sysreg_el2(SYS_ESR);
> > -     BUG_ON(!__get_fault_info(esr, &fault));
> > +     if (!__get_fault_info(esr, &fault)) {
> > +             /* Setting the address to an invalid value for use in tracing. */
> > +             addr = (u64)-1;
>
> If this is relating to tracing, can this instead be added together with
> the tracing itself?

Will do.

Thanks,
/fuad

> > +             /*
> > +              * We've presumably raced with a page-table change which caused
> > +              * AT to fail, try again.
> > +              */
> > +             return;
> > +     }
> >
> >       addr = (fault.hpfar_el2 & HPFAR_MASK) << 8;
> >       ret = host_stage2_idmap(addr);
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-04-08  7:41   ` Marc Zyngier
@ 2024-04-08 15:41     ` Fuad Tabba
  2024-04-08 15:53       ` Marc Zyngier
  0 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 15:41 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

Hi Marc,

On Mon, Apr 8, 2024 at 8:41 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 17:34:50 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Move the unlock earlier in user_mem_abort() to shorten the
> > critical section. This also helps for future refactoring and
> > reuse of similar code.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/mmu.c | 8 +++++---
> >  1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > index 18680771cdb0..3afc42d8833e 100644
> > --- a/arch/arm64/kvm/mmu.c
> > +++ b/arch/arm64/kvm/mmu.c
> > @@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >
> >       read_lock(&kvm->mmu_lock);
> >       pgt = vcpu->arch.hw_mmu->pgt;
> > -     if (mmu_invalidate_retry(kvm, mmu_seq))
> > +     if (mmu_invalidate_retry(kvm, mmu_seq)) {
> > +             ret = -EAGAIN;
> >               goto out_unlock;
> > +     }
> >
> >       /*
> >        * If we are not forced to use page mapping, check if we are
> > @@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >                                            memcache,
> >                                            KVM_PGTABLE_WALK_HANDLE_FAULT |
> >                                            KVM_PGTABLE_WALK_SHARED);
> > +out_unlock:
> > +     read_unlock(&kvm->mmu_lock);
> >
> >       /* Mark the page dirty only if the fault is handled successfully */
> >       if (writable && !ret) {
> > @@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> >               mark_page_dirty_in_slot(kvm, memslot, gfn);
> >       }
> >
> > -out_unlock:
> > -     read_unlock(&kvm->mmu_lock);
> >       kvm_release_pfn_clean(pfn);
> >       return ret != -EAGAIN ? ret : 0;
> >  }
>
> It now means that things such as marking a page dirty happen outside
> of the lock, which may interact with the dirty log/bitmap stuff.
>
> Can you elaborate on *why* this is correct?

The reason why I _think_ this is correct (something I am less certain
about now judging by your reply :) is that this is a lock that
protects the stage-2 page tables (struct kvm_vcpu_arch::hw_mmu), held
in this case for read. As far as I can tell, kvm_set_pfn_dirty() and
mark_page_dirty_in_slot() only access and modify (i.e., write to) the
page without accessing the stage-2 page table. I think that the dirty
log is protected by the slots_lock.

Am I missing something?

Thanks,
/fuad

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-04-08 15:41     ` Fuad Tabba
@ 2024-04-08 15:53       ` Marc Zyngier
  2024-04-08 15:57         ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Marc Zyngier @ 2024-04-08 15:53 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Mon, 08 Apr 2024 16:41:15 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi Marc,
> 
> On Mon, Apr 8, 2024 at 8:41 AM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Wed, 27 Mar 2024 17:34:50 +0000,
> > Fuad Tabba <tabba@google.com> wrote:
> > >
> > > Move the unlock earlier in user_mem_abort() to shorten the
> > > critical section. This also helps for future refactoring and
> > > reuse of similar code.
> > >
> > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > ---
> > >  arch/arm64/kvm/mmu.c | 8 +++++---
> > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > > index 18680771cdb0..3afc42d8833e 100644
> > > --- a/arch/arm64/kvm/mmu.c
> > > +++ b/arch/arm64/kvm/mmu.c
> > > @@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > >
> > >       read_lock(&kvm->mmu_lock);
> > >       pgt = vcpu->arch.hw_mmu->pgt;
> > > -     if (mmu_invalidate_retry(kvm, mmu_seq))
> > > +     if (mmu_invalidate_retry(kvm, mmu_seq)) {
> > > +             ret = -EAGAIN;
> > >               goto out_unlock;
> > > +     }
> > >
> > >       /*
> > >        * If we are not forced to use page mapping, check if we are
> > > @@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > >                                            memcache,
> > >                                            KVM_PGTABLE_WALK_HANDLE_FAULT |
> > >                                            KVM_PGTABLE_WALK_SHARED);
> > > +out_unlock:
> > > +     read_unlock(&kvm->mmu_lock);
> > >
> > >       /* Mark the page dirty only if the fault is handled successfully */
> > >       if (writable && !ret) {
> > > @@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > >               mark_page_dirty_in_slot(kvm, memslot, gfn);
> > >       }
> > >
> > > -out_unlock:
> > > -     read_unlock(&kvm->mmu_lock);
> > >       kvm_release_pfn_clean(pfn);
> > >       return ret != -EAGAIN ? ret : 0;
> > >  }
> >
> > It now means that things such as marking a page dirty happen outside
> > of the lock, which may interact with the dirty log/bitmap stuff.
> >
> > Can you elaborate on *why* this is correct?
> 
> The reason why I _think_ this is correct (something I am less certain
> about now judging by your reply :) is that this is a lock that
> protects the stage-2 page tables (struct kvm_vcpu_arch::hw_mmu), held
> in this case for read. As far as I can tell, kvm_set_pfn_dirty() and
> mark_page_dirty_in_slot() only access and modify (i.e., write to) the
> page without accessing the stage-2 page table. I think that the dirty
> log is protected by the slots_lock.
> 
> Am I missing something?

I'm just not sure, nothing more sinister than that. I was also
reviewing 20240402213656.3068504-1-dmatlack@google.com, which also
moves the access to the dirty bitmap outside of the critical section.

I haven't had a chance to page the whole thing in, unfortunately,
hence my question. In any case, it would be worth documenting the
rationale for this relaxation.

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-04-08 15:53       ` Marc Zyngier
@ 2024-04-08 15:57         ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-08 15:57 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Mon, Apr 8, 2024 at 4:53 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Mon, 08 Apr 2024 16:41:15 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Hi Marc,
> >
> > On Mon, Apr 8, 2024 at 8:41 AM Marc Zyngier <maz@kernel.org> wrote:
> > >
> > > On Wed, 27 Mar 2024 17:34:50 +0000,
> > > Fuad Tabba <tabba@google.com> wrote:
> > > >
> > > > Move the unlock earlier in user_mem_abort() to shorten the
> > > > critical section. This also helps for future refactoring and
> > > > reuse of similar code.
> > > >
> > > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > > ---
> > > >  arch/arm64/kvm/mmu.c | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
> > > > index 18680771cdb0..3afc42d8833e 100644
> > > > --- a/arch/arm64/kvm/mmu.c
> > > > +++ b/arch/arm64/kvm/mmu.c
> > > > @@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > > >
> > > >       read_lock(&kvm->mmu_lock);
> > > >       pgt = vcpu->arch.hw_mmu->pgt;
> > > > -     if (mmu_invalidate_retry(kvm, mmu_seq))
> > > > +     if (mmu_invalidate_retry(kvm, mmu_seq)) {
> > > > +             ret = -EAGAIN;
> > > >               goto out_unlock;
> > > > +     }
> > > >
> > > >       /*
> > > >        * If we are not forced to use page mapping, check if we are
> > > > @@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > > >                                            memcache,
> > > >                                            KVM_PGTABLE_WALK_HANDLE_FAULT |
> > > >                                            KVM_PGTABLE_WALK_SHARED);
> > > > +out_unlock:
> > > > +     read_unlock(&kvm->mmu_lock);
> > > >
> > > >       /* Mark the page dirty only if the fault is handled successfully */
> > > >       if (writable && !ret) {
> > > > @@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
> > > >               mark_page_dirty_in_slot(kvm, memslot, gfn);
> > > >       }
> > > >
> > > > -out_unlock:
> > > > -     read_unlock(&kvm->mmu_lock);
> > > >       kvm_release_pfn_clean(pfn);
> > > >       return ret != -EAGAIN ? ret : 0;
> > > >  }
> > >
> > > It now means that things such as marking a page dirty happen outside
> > > of the lock, which may interact with the dirty log/bitmap stuff.
> > >
> > > Can you elaborate on *why* this is correct?
> >
> > The reason why I _think_ this is correct (something I am less certain
> > about now judging by your reply :) is that this is a lock that
> > protects the stage-2 page tables (struct kvm_vcpu_arch::hw_mmu), held
> > in this case for read. As far as I can tell, kvm_set_pfn_dirty() and
> > mark_page_dirty_in_slot() only access and modify (i.e., write to) the
> > page without accessing the stage-2 page table. I think that the dirty
> > log is protected by the slots_lock.
> >
> > Am I missing something?
>
> I'm just not sure, nothing more sinister than that. I was also
> reviewing 20240402213656.3068504-1-dmatlack@google.com, which also
> moves the access to the dirty bitmap outside of the critical section.
>
> I haven't had a chance to page the whole thing in, unfortunately,
> hence my question. In any case, it would be worth documenting the
> rationale for this relaxation.

Of course. I'll do that in the respin (if I'm still convinced it's correct :)

/fuad

> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore
  2024-03-28 19:17   ` Mark Brown
@ 2024-04-09  9:34     ` Fuad Tabba
  0 siblings, 0 replies; 64+ messages in thread
From: Fuad Tabba @ 2024-04-09  9:34 UTC (permalink / raw)
  To: Mark Brown
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, joey.gouly, rananta

Hi Mark,


On Thu, Mar 28, 2024 at 7:17 PM Mark Brown <broonie@kernel.org> wrote:
>
> On Wed, Mar 27, 2024 at 05:35:02PM +0000, Fuad Tabba wrote:
> > On restoring guest SVE state, use the guest's current active
> > vector length. This reduces the amount of restoring for the cases
>
> For this to work don't we also need to save the state with the
> currently operational guest VL, since all the saves and loads are done
> with VL-dependent operations?  It has crossed my mind to save and load the guest
> state with the currently active guest VL since that's probably a little
> quicker but we don't appear to be doing that.
>
> > where the maximum size isn't used. Moreover, it fixes a bug where
> > the ZCR_EL2 value wasn't being set when restoring the guest
> > state, potentially corrupting it.

I apologize, but I realized that I'm missing other patches that should
have gone before this. I'll either include them on the respin, or drop
this altogether for now.

> >  static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
> >  {
> > -     sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1, SYS_ZCR_EL2);
>
> What was the bug with ZCR_EL2 not being set - I'm not clear how this
> could be skipped?
>
> > +     u64 zcr_el1 = __vcpu_sys_reg(vcpu, ZCR_EL1);
> > +     u64 zcr_el2 = min(zcr_el1, vcpu_sve_max_vq(vcpu) - 1ULL);
>
> This works currently since all the bits other than LEN are either RES0
> or RAZ, but it will break if anything new is added; explicit extraction
> of LEN is probably safer, though with a slight overhead.

Noted. I'll fix this on the respin (or future patches if this doesn't make it).

Thanks,
/fuad

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context
  2024-03-27 17:34 ` [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
@ 2024-04-15 11:36   ` Marc Zyngier
  2024-04-15 15:02     ` Fuad Tabba
  0 siblings, 1 reply; 64+ messages in thread
From: Marc Zyngier @ 2024-04-15 11:36 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Wed, 27 Mar 2024 17:34:54 +0000,
Fuad Tabba <tabba@google.com> wrote:
> 
> From: Will Deacon <will@kernel.org>
> 
> Typically, TLB invalidation of guest stage-2 mappings using nVHE is
> performed by a hypercall originating from the host. For the invalidation
> instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
> swizzle the active stage-2 context around the TLBI instruction.
> 
> With guest-to-host memory sharing and unsharing hypercalls
> originating from the guest under pKVM, there is need to support
> both guest and host VMID invalidations issued from guest context.
> 
> Replace the __tlb_switch_to_{guest,host}() functions with a more general
> {enter,exit}_vmid_context() implementation which supports being invoked
> from guest context and acts as a no-op if the target context matches the
> running context.
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/tlb.c | 114 +++++++++++++++++++++++++++-------
>  1 file changed, 90 insertions(+), 24 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
> index a60fb13e2192..05a66b2ed76d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/tlb.c
> +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> @@ -11,13 +11,23 @@
>  #include <nvhe/mem_protect.h>
>  
>  struct tlb_inv_context {
> -	u64		tcr;
> +	struct kvm_s2_mmu	*mmu;
> +	u64			tcr;
> +	u64			sctlr;
>  };
>  
> -static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> -				  struct tlb_inv_context *cxt,
> -				  bool nsh)
> +static void enter_vmid_context(struct kvm_s2_mmu *mmu,
> +			       struct tlb_inv_context *cxt,
> +			       bool nsh)
>  {
> +	struct kvm_s2_mmu *host_s2_mmu = &host_mmu.arch.mmu;
> +	struct kvm_cpu_context *host_ctxt;
> +	struct kvm_vcpu *vcpu;
> +
> +	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +	vcpu = host_ctxt->__hyp_running_vcpu;
> +	cxt->mmu = NULL;
> +
>  	/*
>  	 * We have two requirements:
>  	 *
> @@ -40,20 +50,52 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
>  	else
>  		dsb(ish);
>  
> +	/*
> +	 * If we're already in the desired context, then there's nothing
> +	 * to do.
> +	 */
> +	if (vcpu) {
> +		/* We're in guest context */
> +		if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
> +			return;

I'm a bit concerned about this one, not so much for what it does, but
because it outlines an inconsistency we have.

Under memory pressure, we can end up unmapping a page via the MMU
notifiers, and will provide a s2_mmu context for the TLBI. This can
happen while *another* context is loaded (a vcpu from a different VM)
and that vcpu faults.

You'd end up with a scenario very similar to the one I debugged here:

https://lore.kernel.org/kvmarm/86y1gfn67v.wl-maz@kernel.org

Now, this doesn't break here because __hyp_running_vcpu is set to NULL
on each exit from the HYP code. But that only happens on nVHE, and not
on VHE, which bizarrely only sets this on entry and leaves a dangling
pointer...

I think we need to clarify how and when this pointer is considered
valid.

> +
> +		cxt->mmu = vcpu->arch.hw_mmu;
> +	} else {
> +		/* We're in host context */
> +		if (mmu == host_s2_mmu)
> +			return;
> +
> +		cxt->mmu = host_s2_mmu;
> +	}
> +
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
>  		u64 val;
>  
>  		/*
>  		 * For CPUs that are affected by ARM 1319367, we need to
> -		 * avoid a host Stage-1 walk while we have the guest's
> -		 * VMID set in the VTTBR in order to invalidate TLBs.
> -		 * We're guaranteed that the S1 MMU is enabled, so we can
> -		 * simply set the EPD bits to avoid any further TLB fill.
> +		 * avoid a Stage-1 walk with the old VMID while we have
> +		 * the new VMID set in the VTTBR in order to invalidate TLBs.
> +		 * We're guaranteed that the host S1 MMU is enabled, so
> +		 * we can simply set the EPD bits to avoid any further
> +		 * TLB fill. For guests, we ensure that the S1 MMU is
> +		 * temporarily enabled in the next context.
>  		 */
>  		val = cxt->tcr = read_sysreg_el1(SYS_TCR);
>  		val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
>  		write_sysreg_el1(val, SYS_TCR);
>  		isb();
> +
> +		if (vcpu) {
> +			val = cxt->sctlr = read_sysreg_el1(SYS_SCTLR);
> +			if (!(val & SCTLR_ELx_M)) {
> +				val |= SCTLR_ELx_M;
> +				write_sysreg_el1(val, SYS_SCTLR);
> +				isb();
> +			}
> +		} else {
> +			/* The host S1 MMU is always enabled. */
> +			cxt->sctlr = SCTLR_ELx_M;
> +		}
>  	}
>  
>  	/*
> @@ -62,20 +104,44 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
>  	 * ensuring that we always have an ISB, but not two ISBs back
>  	 * to back.
>  	 */
> -	__load_stage2(mmu, kern_hyp_va(mmu->arch));
> +	if (vcpu)
> +		__load_host_stage2();
> +	else
> +		__load_stage2(mmu, kern_hyp_va(mmu->arch));
> +
>  	asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
>  }
>  
> -static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
> +static void exit_vmid_context(struct tlb_inv_context *cxt)
>  {
> -	__load_host_stage2();
> +	struct kvm_s2_mmu *mmu = cxt->mmu;
> +	struct kvm_cpu_context *host_ctxt;
> +	struct kvm_vcpu *vcpu;
> +
> +	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> +	vcpu = host_ctxt->__hyp_running_vcpu;
> +
> +	if (!mmu)
> +		return;
> +
> +	if (vcpu)
> +		__load_stage2(mmu, kern_hyp_va(mmu->arch));
> +	else
> +		__load_host_stage2();
>  
>  	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> -		/* Ensure write of the host VMID */
> +		/* Ensure write of the old VMID */
>  		isb();
> -		/* Restore the host's TCR_EL1 */
> +
> +		if (!(cxt->sctlr & SCTLR_ELx_M)) {
> +			write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
> +			isb();
> +		}
> +
>  		write_sysreg_el1(cxt->tcr, SYS_TCR);
>  	}
> +
> +	cxt->mmu = NULL;

nit: do we actually need this last line?

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 64+ messages in thread

* Re: [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context
  2024-04-15 11:36   ` Marc Zyngier
@ 2024-04-15 15:02     ` Fuad Tabba
  2024-04-15 15:59       ` Marc Zyngier
  0 siblings, 1 reply; 64+ messages in thread
From: Fuad Tabba @ 2024-04-15 15:02 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

Hi Marc,

On Mon, Apr 15, 2024 at 12:36 PM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 27 Mar 2024 17:34:54 +0000,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > From: Will Deacon <will@kernel.org>
> >
> > Typically, TLB invalidation of guest stage-2 mappings using nVHE is
> > performed by a hypercall originating from the host. For the invalidation
> > instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
> > swizzle the active stage-2 context around the TLBI instruction.
> >
> > With guest-to-host memory sharing and unsharing hypercalls
> > originating from the guest under pKVM, there is need to support
> > both guest and host VMID invalidations issued from guest context.
> >
> > Replace the __tlb_switch_to_{guest,host}() functions with a more general
> > {enter,exit}_vmid_context() implementation which supports being invoked
> > from guest context and acts as a no-op if the target context matches the
> > running context.
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/tlb.c | 114 +++++++++++++++++++++++++++-------
> >  1 file changed, 90 insertions(+), 24 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > index a60fb13e2192..05a66b2ed76d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/tlb.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > @@ -11,13 +11,23 @@
> >  #include <nvhe/mem_protect.h>
> >
> >  struct tlb_inv_context {
> > -     u64             tcr;
> > +     struct kvm_s2_mmu       *mmu;
> > +     u64                     tcr;
> > +     u64                     sctlr;
> >  };
> >
> > -static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> > -                               struct tlb_inv_context *cxt,
> > -                               bool nsh)
> > +static void enter_vmid_context(struct kvm_s2_mmu *mmu,
> > +                            struct tlb_inv_context *cxt,
> > +                            bool nsh)
> >  {
> > +     struct kvm_s2_mmu *host_s2_mmu = &host_mmu.arch.mmu;
> > +     struct kvm_cpu_context *host_ctxt;
> > +     struct kvm_vcpu *vcpu;
> > +
> > +     host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > +     vcpu = host_ctxt->__hyp_running_vcpu;
> > +     cxt->mmu = NULL;
> > +
> >       /*
> >        * We have two requirements:
> >        *
> > @@ -40,20 +50,52 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> >       else
> >               dsb(ish);
> >
> > +     /*
> > +      * If we're already in the desired context, then there's nothing
> > +      * to do.
> > +      */
> > +     if (vcpu) {
> > +             /* We're in guest context */
> > +             if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
> > +                     return;
>
> I'm a bit concerned about this one, not so much for what it does, but
> because it outlines an inconsistency we have.
>
> Under memory pressure, we can end up unmapping a page via the MMU
> notifiers, and will provide a s2_mmu context for the TLBI. This can
> happen while *another* context is loaded (a vcpu from a different VM)
> and that vcpu faults.
>
> You'd end up with a scenario very similar to the one I debugged here:
>
> https://lore.kernel.org/kvmarm/86y1gfn67v.wl-maz@kernel.org
>
> Now, this doesn't break here because __hyp_running_vcpu is set to NULL
> on each exit from the HYP code. But that only happens on nVHE, and not
> on VHE, which bizarrely only sets this on entry and leaves a dangling
> pointer...
>
> I think we need to clarify how and when this pointer is considered
> valid.

Right.  I'll add the patch to fix the dangling pointer in VHE. Should
I add a comment about the validity of the pointer where it's defined
as well?
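
For the former, I was thinking of something along these lines (sketch
only, untested), mirroring what the nVHE path already does on the way
out:

  /* __kvm_vcpu_run_vhe() */
  host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
  host_ctxt->__hyp_running_vcpu = vcpu;
  ...
  /* clear it on every exit so the pointer is never left dangling */
  host_ctxt->__hyp_running_vcpu = NULL;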

> > +
> > +             cxt->mmu = vcpu->arch.hw_mmu;
> > +     } else {
> > +             /* We're in host context */
> > +             if (mmu == host_s2_mmu)
> > +                     return;
> > +
> > +             cxt->mmu = host_s2_mmu;
> > +     }
> > +
> >       if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> >               u64 val;
> >
> >               /*
> >                * For CPUs that are affected by ARM 1319367, we need to
> > -              * avoid a host Stage-1 walk while we have the guest's
> > -              * VMID set in the VTTBR in order to invalidate TLBs.
> > -              * We're guaranteed that the S1 MMU is enabled, so we can
> > -              * simply set the EPD bits to avoid any further TLB fill.
> > +              * avoid a Stage-1 walk with the old VMID while we have
> > +              * the new VMID set in the VTTBR in order to invalidate TLBs.
> > +              * We're guaranteed that the host S1 MMU is enabled, so
> > +              * we can simply set the EPD bits to avoid any further
> > +              * TLB fill. For guests, we ensure that the S1 MMU is
> > +              * temporarily enabled in the next context.
> >                */
> >               val = cxt->tcr = read_sysreg_el1(SYS_TCR);
> >               val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
> >               write_sysreg_el1(val, SYS_TCR);
> >               isb();
> > +
> > +             if (vcpu) {
> > +                     val = cxt->sctlr = read_sysreg_el1(SYS_SCTLR);
> > +                     if (!(val & SCTLR_ELx_M)) {
> > +                             val |= SCTLR_ELx_M;
> > +                             write_sysreg_el1(val, SYS_SCTLR);
> > +                             isb();
> > +                     }
> > +             } else {
> > +                     /* The host S1 MMU is always enabled. */
> > +                     cxt->sctlr = SCTLR_ELx_M;
> > +             }
> >       }
> >
> >       /*
> > @@ -62,20 +104,44 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> >        * ensuring that we always have an ISB, but not two ISBs back
> >        * to back.
> >        */
> > -     __load_stage2(mmu, kern_hyp_va(mmu->arch));
> > +     if (vcpu)
> > +             __load_host_stage2();
> > +     else
> > +             __load_stage2(mmu, kern_hyp_va(mmu->arch));
> > +
> >       asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
> >  }
> >
> > -static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
> > +static void exit_vmid_context(struct tlb_inv_context *cxt)
> >  {
> > -     __load_host_stage2();
> > +     struct kvm_s2_mmu *mmu = cxt->mmu;
> > +     struct kvm_cpu_context *host_ctxt;
> > +     struct kvm_vcpu *vcpu;
> > +
> > +     host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > +     vcpu = host_ctxt->__hyp_running_vcpu;
> > +
> > +     if (!mmu)
> > +             return;
> > +
> > +     if (vcpu)
> > +             __load_stage2(mmu, kern_hyp_va(mmu->arch));
> > +     else
> > +             __load_host_stage2();
> >
> >       if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
> > -             /* Ensure write of the host VMID */
> > +             /* Ensure write of the old VMID */
> >               isb();
> > -             /* Restore the host's TCR_EL1 */
> > +
> > +             if (!(cxt->sctlr & SCTLR_ELx_M)) {
> > +                     write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
> > +                     isb();
> > +             }
> > +
> >               write_sysreg_el1(cxt->tcr, SYS_TCR);
> >       }
> > +
> > +     cxt->mmu = NULL;
>
> nit: do we actually need this last line?

Nope. Will remove it.
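
FWIW, the callers keep exactly the same shape as before; roughly (a
sketch only, with the barrier sequence abridged):

        static void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
                                             phys_addr_t ipa, int level)
        {
                struct tlb_inv_context cxt;

                /* Switch to the target VMID; a no-op if we're already in it. */
                enter_vmid_context(mmu, &cxt, false);

                /* Invalidate the IPA at Stage-2, then the whole of Stage-1. */
                ipa >>= 12;
                __tlbi_level(ipas2e1is, ipa, level);
                dsb(ish);
                __tlbi(vmalle1is);
                dsb(ish);
                isb();

                /* Restore whichever context we came from. */
                exit_vmid_context(&cxt);
        }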

Thanks,
/fuad

>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

* Re: [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context
  2024-04-15 15:02     ` Fuad Tabba
@ 2024-04-15 15:59       ` Marc Zyngier
  0 siblings, 0 replies; 64+ messages in thread
From: Marc Zyngier @ 2024-04-15 15:59 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, qperret, seanjc, alexandru.elisei, catalin.marinas,
	philmd, james.morse, suzuki.poulose, oliver.upton, mark.rutland,
	broonie, joey.gouly, rananta

On Mon, 15 Apr 2024 16:02:02 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Hi Marc,
> 
> On Mon, Apr 15, 2024 at 12:36 PM Marc Zyngier <maz@kernel.org> wrote:
> >
> > On Wed, 27 Mar 2024 17:34:54 +0000,
> > Fuad Tabba <tabba@google.com> wrote:
> > >
> > > From: Will Deacon <will@kernel.org>
> > >
> > > Typically, TLB invalidation of guest stage-2 mappings using nVHE is
> > > performed by a hypercall originating from the host. For the invalidation
> > > instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
> > > swizzle the active stage-2 context around the TLBI instruction.
> > >
> > > With guest-to-host memory sharing and unsharing hypercalls
> > > originating from the guest under pKVM, there is a need to support
> > > both guest and host VMID invalidations issued from guest context.
> > >
> > > Replace the __tlb_switch_to_{guest,host}() functions with a more general
> > > {enter,exit}_vmid_context() implementation which supports being invoked
> > > from guest context and acts as a no-op if the target context matches the
> > > running context.
> > >
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > ---
> > >  arch/arm64/kvm/hyp/nvhe/tlb.c | 114 +++++++++++++++++++++++++++-------
> > >  1 file changed, 90 insertions(+), 24 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > > index a60fb13e2192..05a66b2ed76d 100644
> > > --- a/arch/arm64/kvm/hyp/nvhe/tlb.c
> > > +++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
> > > @@ -11,13 +11,23 @@
> > >  #include <nvhe/mem_protect.h>
> > >
> > >  struct tlb_inv_context {
> > > -     u64             tcr;
> > > +     struct kvm_s2_mmu       *mmu;
> > > +     u64                     tcr;
> > > +     u64                     sctlr;
> > >  };
> > >
> > > -static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> > > -                               struct tlb_inv_context *cxt,
> > > -                               bool nsh)
> > > +static void enter_vmid_context(struct kvm_s2_mmu *mmu,
> > > +                            struct tlb_inv_context *cxt,
> > > +                            bool nsh)
> > >  {
> > > +     struct kvm_s2_mmu *host_s2_mmu = &host_mmu.arch.mmu;
> > > +     struct kvm_cpu_context *host_ctxt;
> > > +     struct kvm_vcpu *vcpu;
> > > +
> > > +     host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
> > > +     vcpu = host_ctxt->__hyp_running_vcpu;
> > > +     cxt->mmu = NULL;
> > > +
> > >       /*
> > >        * We have two requirements:
> > >        *
> > > @@ -40,20 +50,52 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
> > >       else
> > >               dsb(ish);
> > >
> > > +     /*
> > > +      * If we're already in the desired context, then there's nothing
> > > +      * to do.
> > > +      */
> > > +     if (vcpu) {
> > > +             /* We're in guest context */
> > > +             if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
> > > +                     return;
> >
> > I'm a bit concerned about this one, not so much for what it does, but
> > because it outlines an inconsistency we have.
> >
> > Under memory pressure, we can end up unmapping a page via the MMU
> > notifiers, and will provide a s2_mmu context for the TLBI. This can
> > happen while *another* context is loaded (a vcpu from a different VM)
> > and that vcpu faults.
> >
> > You'd end up with a scenario very similar to the one I debugged here:
> >
> > https://lore.kernel.org/kvmarm/86y1gfn67v.wl-maz@kernel.org
> >
> > Now, this doesn't break here because __hyp_running_vcpu is set to NULL
> > on each exit from the HYP code. But that only happens on nVHE, and not
> > on VHE, which bizarrely only sets this on entry and leaves a dangling
> > pointer...
> >
> > I think we need to clarify how and when this pointer is considered
> > valid.
> 
> Right. I'll add a patch to fix the dangling pointer in VHE. Should
> I add a comment about the validity of the pointer where it's defined
> as well?

Don't bother with the VHE bit, I'm already on it (moving it all the
way to load/put for consistency), and I'm currently testing the hack.

But adding a comment explaining that this plays the role of
kvm_get_running_vcpu() and that it is only valid from within
__kvm_vcpu_run() would be great.
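
Something like this, on the member in struct kvm_cpu_context (wording
entirely up to you):

        /*
         * The EL2 counterpart of kvm_get_running_vcpu(): only valid while
         * running inside __kvm_vcpu_run(), and must not be trusted
         * outside of that window.
         */
        struct kvm_vcpu *__hyp_running_vcpu;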

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

Thread overview: 64+ messages
2024-03-27 17:34 [PATCH v1 00/44] KVM: arm64: Preamble for pKVM Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 01/44] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 02/44] KVM: arm64: Use enum instead of helper for checking FP-state Fuad Tabba
2024-03-28 16:19   ` Mark Brown
2024-04-08  7:39   ` Marc Zyngier
2024-04-08 13:39     ` Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 03/44] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
2024-04-08  7:41   ` Marc Zyngier
2024-04-08 15:41     ` Fuad Tabba
2024-04-08 15:53       ` Marc Zyngier
2024-04-08 15:57         ` Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 04/44] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
2024-04-08  7:44   ` Marc Zyngier
2024-04-08 13:48     ` Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 05/44] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 06/44] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 07/44] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
2024-04-15 11:36   ` Marc Zyngier
2024-04-15 15:02     ` Fuad Tabba
2024-04-15 15:59       ` Marc Zyngier
2024-03-27 17:34 ` [PATCH v1 08/44] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 09/44] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 10/44] KVM: arm64: Introduce predicates to check for protected state Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 11/44] KVM: arm64: Split up nvhe/fixed_config.h Fuad Tabba
2024-03-27 17:34 ` [PATCH v1 12/44] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 13/44] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
2024-03-28 18:53   ` Mark Brown
2024-04-08 13:34     ` Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 14/44] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
2024-03-28 18:57   ` Mark Brown
2024-04-08 13:35     ` Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 15/44] KVM: arm64: Use active guest SVE vector length on guest restore Fuad Tabba
2024-03-28 19:17   ` Mark Brown
2024-04-09  9:34     ` Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 16/44] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
2024-03-28 19:20   ` Mark Brown
2024-03-27 17:35 ` [PATCH v1 17/44] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 18/44] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 19/44] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 20/44] KVM: arm64: Refactor enter_exception64() Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 21/44] KVM: arm64: Add PC_UPDATE_REQ flags covering all PC updates Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 22/44] KVM: arm64: Add vcpu flag copy primitive Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 23/44] KVM: arm64: Introduce gfn_to_memslot_prot() Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 24/44] KVM: arm64: Do not use the hva in kvm_handle_guest_abort() Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 25/44] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 26/44] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 27/44] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 28/44] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 29/44] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 30/44] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 31/44] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 32/44] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 33/44] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 34/44] KVM: arm64: Do not set the virtual timer offset for protected vCPUs Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 35/44] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 36/44] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 37/44] KVM: arm64: Check directly whether a vcpu is protected Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 38/44] KVM: arm64: Trap debug break and watch from guest Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 39/44] KVM: arm64: Restrict protected VM capabilities Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 40/44] KVM: arm64: Do not support MTE for protected VMs Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 41/44] KVM: arm64: Move pkvm_vcpu_init_traps() to hyp vcpu init Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 42/44] KVM: arm64: Fix initializing traps in protected mode Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 43/44] KVM: arm64: Advertise GICv3 sysreg interface to protected guests Fuad Tabba
2024-03-27 17:35 ` [PATCH v1 44/44] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
