kvmarm.lists.cs.columbia.edu archive mirror
* [PATCH v3 00/31] KVM: arm64: Preamble for pKVM
@ 2024-04-19  7:59 Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 01/31] KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM Fuad Tabba
                   ` (30 more replies)
  0 siblings, 31 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Changes from V2 [0]:
- Dropped patches that are better suited for later on with pKVM [Marc]
- Squashed some patches together to make it easier for maintainers [Marc]
- Moved all fixes to the beginning of the series [patches 1 - 13]
- Misc fixes [Mark, Marc, Oliver]

We are getting closer to upstreaming the remaining part of pKVM
[1]. To make the process easier for us and for our dear
reviewers, we are sending out this patch series as a preamble to
the upcoming patches.

This series is based on Linux 6.9-rc3 -- kvmarm/next
(9ac5bab4deee). Most of the patches in this series are standalone
and can be applied directly.

This series is a bit of a Bombay mix of patches we've been
carrying. There's no single overarching theme, but they all improve
the code, whether by fixing existing bugs in pKVM, refactoring code
to make it more readable and easier to reuse for pKVM, or adding
functionality to the existing upstream pKVM code.

None of the patches in this series intentionally affect the
functionality of non-protected modes. Patches 1 to 13 are fixes.

For a technical deep dive into pKVM, please refer to Quentin
Perret's KVM Forum presentation [2, 3]. The code for the pKVM core
series, which we plan to send for review next, is available at [1].

Cheers,
Fuad, Quentin, Will, and Marc

[0] https://lore.kernel.org/all/20240416095638.3620345-1-tabba@google.com/

[1] https://android-kvm.googlesource.com/linux/+/refs/heads/for-upstream/pkvm-core

[2] Protected KVM on arm64 (slides)
https://static.sched.com/hosted_files/kvmforum2022/88/KVM%20forum%202022%20-%20pKVM%20deep%20dive.pdf

[3] Protected KVM on arm64 (video)
https://www.youtube.com/watch?v=9npebeVFbFw

Fuad Tabba (15):
  KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in
    pKVM
  KVM: arm64: Move guest_owns_fp_regs() to increase its scope
  KVM: arm64: Refactor checks for FP state ownership
  KVM: arm64: Do not re-initialize the KVM lock
  KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
  KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()
  KVM: arm64: Change kvm_handle_mmio_return() return polarity
  KVM: arm64: Move setting the page as dirty out of the critical section
  KVM: arm64: Introduce and use predicates that check for protected VMs
  KVM: arm64: Move pstate reset value definitions to kvm_arm.h
  KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  KVM: arm64: Refactor calculating SVE state size to use helpers
  KVM: arm64: Move some kvm_psci functions to a shared header
  KVM: arm64: Refactor reset_mpidr() to extract its computation
  KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use

Marc Zyngier (3):
  KVM: arm64: Check for PTE validity when checking for
    executable/cacheable
  KVM: arm64: Simplify vgic-v3 hypercalls
  KVM: arm64: Force injection of a data abort on NISV MMIO exit

Quentin Perret (4):
  KVM: arm64: Issue CMOs when tearing down guest s2 pages
  KVM: arm64: Avoid BUG-ing from the host abort path
  KVM: arm64: Prevent kmemleak from accessing .hyp.data
  KVM: arm64: Add is_pkvm_initialized() helper

Will Deacon (9):
  KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
  KVM: arm64: Support TLB invalidation in guest context
  KVM: arm64: Remove locking from EL2 allocation fast-paths
  KVM: arm64: Introduce hyp_rwlock_t
  KVM: arm64: Add atomics-based checking refcount implementation at EL2
  KVM: arm64: Use atomic refcount helpers for 'struct
    hyp_page::refcount'
  KVM: arm64: Reformat/beautify PTP hypercall documentation
  KVM: arm64: Rename firmware pseudo-register documentation file
  KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst

 Documentation/virt/kvm/api.rst                |   7 +
 .../virt/kvm/arm/fw-pseudo-registers.rst      | 138 ++++++++++++++
 Documentation/virt/kvm/arm/hypercalls.rst     | 180 +++++-------------
 Documentation/virt/kvm/arm/index.rst          |   1 +
 Documentation/virt/kvm/arm/ptp_kvm.rst        |  38 ++--
 arch/arm64/include/asm/kvm_arm.h              |  12 ++
 arch/arm64/include/asm/kvm_asm.h              |   8 +-
 arch/arm64/include/asm/kvm_emulate.h          |  11 +-
 arch/arm64/include/asm/kvm_host.h             |  39 ++--
 arch/arm64/include/asm/kvm_hyp.h              |   4 +-
 arch/arm64/include/asm/virt.h                 |  12 +-
 arch/arm64/kvm/arm.c                          |  21 +-
 arch/arm64/kvm/fpsimd.c                       |  60 +++---
 arch/arm64/kvm/hyp/include/hyp/switch.h       |   8 +-
 arch/arm64/kvm/hyp/include/nvhe/gfp.h         |   6 +-
 arch/arm64/kvm/hyp/include/nvhe/memory.h      |  18 +-
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h        |   6 +
 arch/arm64/kvm/hyp/include/nvhe/refcount.h    |  76 ++++++++
 arch/arm64/kvm/hyp/include/nvhe/rwlock.h      | 129 +++++++++++++
 arch/arm64/kvm/hyp/nvhe/hyp-main.c            |  24 +--
 arch/arm64/kvm/hyp/nvhe/mem_protect.c         |  10 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c          |  21 +-
 arch/arm64/kvm/hyp/nvhe/pkvm.c                |  14 +-
 arch/arm64/kvm/hyp/nvhe/setup.c               |   1 +
 arch/arm64/kvm/hyp/nvhe/switch.c              |  10 +-
 arch/arm64/kvm/hyp/nvhe/tlb.c                 | 115 ++++++++---
 arch/arm64/kvm/hyp/pgtable.c                  |  21 +-
 arch/arm64/kvm/hyp/vgic-v3-sr.c               |  27 ++-
 arch/arm64/kvm/hyp/vhe/switch.c               |   4 +-
 arch/arm64/kvm/mmio.c                         |  12 +-
 arch/arm64/kvm/mmu.c                          |   8 +-
 arch/arm64/kvm/pkvm.c                         |   2 +-
 arch/arm64/kvm/psci.c                         |  28 ---
 arch/arm64/kvm/reset.c                        |  20 +-
 arch/arm64/kvm/sys_regs.c                     |  14 +-
 arch/arm64/kvm/sys_regs.h                     |  19 ++
 arch/arm64/kvm/vgic/vgic-v2.c                 |   9 +-
 arch/arm64/kvm/vgic/vgic-v3.c                 |  23 +--
 arch/arm64/kvm/vgic/vgic.c                    |  11 --
 arch/arm64/kvm/vgic/vgic.h                    |   2 -
 include/kvm/arm_psci.h                        |  29 +++
 include/kvm/arm_vgic.h                        |   1 -
 42 files changed, 775 insertions(+), 424 deletions(-)
 create mode 100644 Documentation/virt/kvm/arm/fw-pseudo-registers.rst
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/refcount.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/rwlock.h


base-commit: 9ac5bab4deeeeb99f36695250b99c2f9bfae2379
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v3 01/31] KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 02/31] KVM: arm64: Move guest_owns_fp_regs() to increase its scope Fuad Tabba
                   ` (29 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Since host_fpsimd_state has been removed from kvm_vcpu_arch, it no
longer points to the hyp's version of the host fp_regs in protected
mode.

Initialize the host_data fpsimd_state pointer to the host_data's
context fp_regs on pKVM initialization.

Fixes: 51e09b5572d6 ("KVM: arm64: Exclude host_fpsimd_state pointer from kvm_vcpu_arch")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  1 +
 arch/arm64/kvm/hyp/nvhe/pkvm.c         | 11 +++++++++++
 arch/arm64/kvm/hyp/nvhe/setup.c        |  1 +
 3 files changed, 13 insertions(+)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index 82b3d62538a6..20c3f6e13b99 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -54,6 +54,7 @@ pkvm_hyp_vcpu_to_hyp_vm(struct pkvm_hyp_vcpu *hyp_vcpu)
 }
 
 void pkvm_hyp_vm_table_init(void *tbl);
+void pkvm_host_fpsimd_state_init(void);
 
 int __pkvm_init_vm(struct kvm *host_kvm, unsigned long vm_hva,
 		   unsigned long pgd_hva);
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 26dd9a20ad6e..492b7fc2c0c7 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -247,6 +247,17 @@ void pkvm_hyp_vm_table_init(void *tbl)
 	vm_table = tbl;
 }
 
+void pkvm_host_fpsimd_state_init(void)
+{
+	unsigned long i;
+
+	for (i = 0; i < hyp_nr_cpus; i++) {
+		struct kvm_host_data *host_data = per_cpu_ptr(&kvm_host_data, i);
+
+		host_data->fpsimd_state = &host_data->host_ctxt.fp_regs;
+	}
+}
+
 /*
  * Return the hyp vm structure corresponding to the handle.
  */
diff --git a/arch/arm64/kvm/hyp/nvhe/setup.c b/arch/arm64/kvm/hyp/nvhe/setup.c
index ae00dfa80801..859f22f754d3 100644
--- a/arch/arm64/kvm/hyp/nvhe/setup.c
+++ b/arch/arm64/kvm/hyp/nvhe/setup.c
@@ -300,6 +300,7 @@ void __noreturn __pkvm_init_finalise(void)
 		goto out;
 
 	pkvm_hyp_vm_table_init(vm_table_base);
+	pkvm_host_fpsimd_state_init();
 out:
 	/*
 	 * We tail-called to here from handle___pkvm_init() and will not return,
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 02/31] KVM: arm64: Move guest_owns_fp_regs() to increase its scope
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 01/31] KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 03/31] KVM: arm64: Refactor checks for FP state ownership Fuad Tabba
                   ` (28 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

guest_owns_fp_regs() will be used to check fpsimd state ownership
across kvm/arm64. Therefore, move it to kvm_host.h to widen its
scope.

Moreover, since the host state is no longer per-vcpu, the vcpu
parameter is unused, so remove it as well.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h       | 6 ++++++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 6 ------
 arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
 arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2b63fdfad5b2..2889e1d8a8c1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1207,6 +1207,12 @@ DECLARE_KVM_HYP_PER_CPU(struct kvm_host_data, kvm_host_data);
 	 &this_cpu_ptr_hyp_sym(kvm_host_data)->f)
 #endif
 
+/* Check whether the FP regs are owned by the guest */
+static inline bool guest_owns_fp_regs(void)
+{
+	return *host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED;
+}
+
 static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt)
 {
 	/* The host's MPIDR is immutable, so let's set it up at boot time */
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 2629420d0659..38961b6b1a18 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -39,12 +39,6 @@ struct kvm_exception_table_entry {
 extern struct kvm_exception_table_entry __start___kvm_ex_table;
 extern struct kvm_exception_table_entry __stop___kvm_ex_table;
 
-/* Check whether the FP regs are owned by the guest */
-static inline bool guest_owns_fp_regs(struct kvm_vcpu *vcpu)
-{
-	return *host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED;
-}
-
 /* Save the 32-bit only FPSIMD system register state */
 static inline void __fpsimd_save_fpexc32(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 1f82d531a494..1b22654a3180 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -53,7 +53,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 			val |= CPTR_EL2_TSM;
 	}
 
-	if (!guest_owns_fp_regs(vcpu)) {
+	if (!guest_owns_fp_regs()) {
 		if (has_hvhe())
 			val &= ~(CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN |
 				 CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN);
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index b92f9fe2d50e..7286db75b8d6 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -75,7 +75,7 @@ static void __activate_traps(struct kvm_vcpu *vcpu)
 
 	val |= CPTR_EL2_TAM;
 
-	if (guest_owns_fp_regs(vcpu)) {
+	if (guest_owns_fp_regs()) {
 		if (vcpu_has_sve(vcpu))
 			val |= CPACR_EL1_ZEN_EL0EN | CPACR_EL1_ZEN_EL1EN;
 	} else {
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 03/31] KVM: arm64: Refactor checks for FP state ownership
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 01/31] KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 02/31] KVM: arm64: Move guest_owns_fp_regs() to increase its scope Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 04/31] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
                   ` (27 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

To avoid direct comparison against the fp_owner enum, add a new
function that performs the check, host_owns_fp_regs(), to
complement the existing guest_owns_fp_regs().

Use these helpers to check for fpsimd state ownership instead of
comparing against the enum values directly.

No functional change intended.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_emulate.h    | 6 ++----
 arch/arm64/include/asm/kvm_host.h       | 6 ++++++
 arch/arm64/kvm/fpsimd.c                 | 5 ++---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 2 +-
 arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
 arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
 6 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 3d65d9413608..deaa88b972ec 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -587,16 +587,14 @@ static __always_inline u64 kvm_get_reset_cptr_el2(struct kvm_vcpu *vcpu)
 	} else if (has_hvhe()) {
 		val = (CPACR_EL1_FPEN_EL0EN | CPACR_EL1_FPEN_EL1EN);
 
-		if (!vcpu_has_sve(vcpu) ||
-		    (*host_data_ptr(fp_owner) != FP_STATE_GUEST_OWNED))
+		if (!vcpu_has_sve(vcpu) || !guest_owns_fp_regs())
 			val |= CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN;
 		if (cpus_have_final_cap(ARM64_SME))
 			val |= CPACR_EL1_SMEN_EL1EN | CPACR_EL1_SMEN_EL0EN;
 	} else {
 		val = CPTR_NVHE_EL2_RES1;
 
-		if (vcpu_has_sve(vcpu) &&
-		    (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED))
+		if (vcpu_has_sve(vcpu) && guest_owns_fp_regs())
 			val |= CPTR_EL2_TZ;
 		if (cpus_have_final_cap(ARM64_SME))
 			val &= ~CPTR_EL2_TSM;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 2889e1d8a8c1..4609d1b9ddde 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1213,6 +1213,12 @@ static inline bool guest_owns_fp_regs(void)
 	return *host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED;
 }
 
+/* Check whether the FP regs are owned by the host */
+static inline bool host_owns_fp_regs(void)
+{
+	return *host_data_ptr(fp_owner) == FP_STATE_HOST_OWNED;
+}
+
 static inline void kvm_init_host_cpu_context(struct kvm_cpu_context *cpu_ctxt)
 {
 	/* The host's MPIDR is immutable, so let's set it up at boot time */
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 7507dcc4e553..d5837d65e4a1 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -141,8 +141,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
 
 	WARN_ON_ONCE(!irqs_disabled());
 
-	if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) {
-
+	if (guest_owns_fp_regs()) {
 		/*
 		 * Currently we do not support SME guests so SVCR is
 		 * always 0 and we just need a variable to point to.
@@ -195,7 +194,7 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 		isb();
 	}
 
-	if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED) {
+	if (guest_owns_fp_regs()) {
 		if (vcpu_has_sve(vcpu)) {
 			__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 38961b6b1a18..38b4e2623b02 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -370,7 +370,7 @@ static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 	isb();
 
 	/* Write out the host state if it's in the registers */
-	if (*host_data_ptr(fp_owner) == FP_STATE_HOST_OWNED)
+	if (host_owns_fp_regs())
 		__fpsimd_save_state(*host_data_ptr(fpsimd_state));
 
 	/* Restore the guest state */
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 1b22654a3180..5d2d4d6465e8 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -337,7 +337,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	__sysreg_restore_state_nvhe(host_ctxt);
 
-	if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED)
+	if (guest_owns_fp_regs())
 		__fpsimd_save_fpexc32(vcpu);
 
 	__debug_switch_to_host(vcpu);
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 7286db75b8d6..93f78df8b0f6 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -258,7 +258,7 @@ static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 
 	sysreg_restore_host_state_vhe(host_ctxt);
 
-	if (*host_data_ptr(fp_owner) == FP_STATE_GUEST_OWNED)
+	if (guest_owns_fp_regs())
 		__fpsimd_save_fpexc32(vcpu);
 
 	__debug_switch_to_host(vcpu);
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 04/31] KVM: arm64: Do not re-initialize the KVM lock
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (2 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 03/31] KVM: arm64: Refactor checks for FP state ownership Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 05/31] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
                   ` (26 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

The lock is already initialized by core KVM code in
kvm_create_vm().

Fixes: 9d0c063a4d1d ("KVM: arm64: Instantiate pKVM hypervisor VM and vCPU structures from EL1")
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/pkvm.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index b7be96a53597..e2c08443f284 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -222,7 +222,6 @@ void pkvm_destroy_hyp_vm(struct kvm *host_kvm)
 
 int pkvm_init_host_vm(struct kvm *host_kvm)
 {
-	mutex_init(&host_kvm->lock);
 	return 0;
 }
 
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 05/31] KVM: arm64: Issue CMOs when tearing down guest s2 pages
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (3 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 04/31] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 06/31] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
                   ` (25 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Quentin Perret <qperret@google.com>

On the guest teardown path, pKVM will zero the pages used to back
the guest data structures before returning them to the host as
they may contain secrets (e.g. in the vCPU registers). However,
the zeroing is done using a cacheable alias, and CMOs are
missing, hence giving the host a potential opportunity to read
the original content of the guest structs from memory.

Fix this by issuing CMOs after zeroing the pages.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 492b7fc2c0c7..315d4ebe1d6a 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -441,6 +441,7 @@ static void *map_donated_memory(unsigned long host_va, size_t size)
 
 static void __unmap_donated_memory(void *va, size_t size)
 {
+	kvm_flush_dcache_to_poc(va, size);
 	WARN_ON(__pkvm_hyp_donate_host(hyp_virt_to_pfn(va),
 				       PAGE_ALIGN(size) >> PAGE_SHIFT));
 }
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 06/31] KVM: arm64: Avoid BUG-ing from the host abort path
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (4 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 05/31] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 07/31] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
                   ` (24 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Quentin Perret <qperret@google.com>

Under certain circumstances __get_fault_info() may resolve the faulting
address using the AT instruction. Given that this is being done outside
of the host lock critical section, it is racy and the resolution via AT
may fail. We currently BUG() in this situation, which is obviously less
than ideal. Moving the address resolution to the critical section may
have a performance impact, so let's keep it where it is, but bail out
and return to the host to try a second time.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 861c76021a25..caba3e4bd09e 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -533,7 +533,13 @@ void handle_host_mem_abort(struct kvm_cpu_context *host_ctxt)
 	int ret = 0;
 
 	esr = read_sysreg_el2(SYS_ESR);
-	BUG_ON(!__get_fault_info(esr, &fault));
+	if (!__get_fault_info(esr, &fault)) {
+		/*
+		 * We've presumably raced with a page-table change which caused
+		 * AT to fail, try again.
+		 */
+		return;
+	}
 
 	addr = (fault.hpfar_el2 & HPFAR_MASK) << 8;
 	ret = host_stage2_idmap(addr);
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 07/31] KVM: arm64: Check for PTE validity when checking for executable/cacheable
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (5 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 06/31] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 08/31] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
                   ` (23 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Marc Zyngier <maz@kernel.org>

Don't just assume that the PTE is valid when checking whether it
describes an executable or cacheable mapping.

This makes sure that we don't issue CMOs for invalid mappings.

Suggested-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 5a59ef88b646..67647b853c9b 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -914,12 +914,12 @@ static void stage2_unmap_put_pte(const struct kvm_pgtable_visit_ctx *ctx,
 static bool stage2_pte_cacheable(struct kvm_pgtable *pgt, kvm_pte_t pte)
 {
 	u64 memattr = pte & KVM_PTE_LEAF_ATTR_LO_S2_MEMATTR;
-	return memattr == KVM_S2_MEMATTR(pgt, NORMAL);
+	return kvm_pte_valid(pte) && memattr == KVM_S2_MEMATTR(pgt, NORMAL);
 }
 
 static bool stage2_pte_executable(kvm_pte_t pte)
 {
-	return !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
+	return kvm_pte_valid(pte) && !(pte & KVM_PTE_LEAF_ATTR_HI_S2_XN);
 }
 
 static u64 stage2_map_walker_phys_addr(const struct kvm_pgtable_visit_ctx *ctx,
@@ -1370,7 +1370,7 @@ static int stage2_flush_walker(const struct kvm_pgtable_visit_ctx *ctx,
 	struct kvm_pgtable *pgt = ctx->arg;
 	struct kvm_pgtable_mm_ops *mm_ops = pgt->mm_ops;
 
-	if (!kvm_pte_valid(ctx->old) || !stage2_pte_cacheable(pgt, ctx->old))
+	if (!stage2_pte_cacheable(pgt, ctx->old))
 		return 0;
 
 	if (mm_ops->dcache_clean_inval_poc)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 08/31] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (6 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 07/31] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
                   ` (22 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

Break-before-make (BBM) can be expensive, as transitioning via an
invalid mapping (i.e. the "break" step) requires the completion of TLB
invalidation and can also cause other agents to fault concurrently on
the invalid mapping.

Since BBM is not required when changing only the software bits of a PTE,
avoid the sequence in this case and just update the PTE directly.
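
The core of the check is whether the old and new PTEs differ only in
the software-defined attribute bits. A rough sketch of the resulting
fast path (refcount bookkeeping elided, and only taken for
non-shared walks):

	/* KVM_PTE_LEAF_ATTR_HI_SW: PTE bits reserved for software use */
	if (!kvm_pgtable_walk_shared(ctx) &&
	    !((ctx->old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW)) {
		/* Only s/w bits changed: write the PTE in place, no BBM. */
		WARN_ON_ONCE(!stage2_try_set_pte(ctx, new));
		return 0;
	}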

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/pgtable.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index 67647b853c9b..9e2bbee77491 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -979,6 +979,21 @@ static int stage2_map_walker_try_leaf(const struct kvm_pgtable_visit_ctx *ctx,
 	if (!stage2_pte_needs_update(ctx->old, new))
 		return -EAGAIN;
 
+	/* If we're only changing software bits, then store them and go! */
+	if (!kvm_pgtable_walk_shared(ctx) &&
+	    !((ctx->old ^ new) & ~KVM_PTE_LEAF_ATTR_HI_SW)) {
+		bool old_is_counted = stage2_pte_is_counted(ctx->old);
+
+		if (old_is_counted != stage2_pte_is_counted(new)) {
+			if (old_is_counted)
+				mm_ops->put_page(ctx->ptep);
+			else
+				mm_ops->get_page(ctx->ptep);
+		}
+		WARN_ON_ONCE(!stage2_try_set_pte(ctx, new));
+		return 0;
+	}
+
 	if (!stage2_try_break_pte(ctx, data->mmu))
 		return -EAGAIN;
 
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (7 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 08/31] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19 20:54   ` Oliver Upton
  2024-04-19  7:59 ` [PATCH v3 10/31] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
                   ` (21 subsequent siblings)
  30 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

Typically, TLB invalidation of guest stage-2 mappings using nVHE is
performed by a hypercall originating from the host. For the invalidation
instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
swizzle the active stage-2 context around the TLBI instruction.

With guest-to-host memory sharing and unsharing hypercalls
originating from the guest under pKVM, there is a need to support
both guest and host VMID invalidations issued from guest context.

Replace the __tlb_switch_to_{guest,host}() functions with a more general
{enter,exit}_vmid_context() implementation which supports being invoked
from guest context and acts as a no-op if the target context matches the
running context.
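
As a rough usage sketch (mirroring __kvm_tlb_flush_vmid() below), the
callers keep the same shape, with the context switch becoming a no-op
when the requested stage-2 is already loaded:

	struct tlb_inv_context cxt;

	/* cxt.mmu stays NULL if we're already in the requested context. */
	enter_vmid_context(mmu, &cxt, false);
	__tlbi(vmalls12e1is);
	dsb(ish);
	isb();
	/* Restores the previous context only if a switch actually happened. */
	exit_vmid_context(&cxt);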

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/tlb.c | 115 +++++++++++++++++++++++++++-------
 1 file changed, 91 insertions(+), 24 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/tlb.c b/arch/arm64/kvm/hyp/nvhe/tlb.c
index 2fc68da4036d..ca3c09df8d7c 100644
--- a/arch/arm64/kvm/hyp/nvhe/tlb.c
+++ b/arch/arm64/kvm/hyp/nvhe/tlb.c
@@ -11,13 +11,23 @@
 #include <nvhe/mem_protect.h>
 
 struct tlb_inv_context {
-	u64		tcr;
+	struct kvm_s2_mmu	*mmu;
+	u64			tcr;
+	u64			sctlr;
 };
 
-static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
-				  struct tlb_inv_context *cxt,
-				  bool nsh)
+static void enter_vmid_context(struct kvm_s2_mmu *mmu,
+			       struct tlb_inv_context *cxt,
+			       bool nsh)
 {
+	struct kvm_s2_mmu *host_s2_mmu = &host_mmu.arch.mmu;
+	struct kvm_cpu_context *host_ctxt;
+	struct kvm_vcpu *vcpu;
+
+	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+	vcpu = host_ctxt->__hyp_running_vcpu;
+	cxt->mmu = NULL;
+
 	/*
 	 * We have two requirements:
 	 *
@@ -40,20 +50,55 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
 	else
 		dsb(ish);
 
+	/*
+	 * If we're already in the desired context, then there's nothing to do.
+	 */
+	if (vcpu) {
+		/*
+		 * We're in guest context. However, for this to work, this needs
+		 * to be called from within __kvm_vcpu_run(), which ensures that
+		 * __hyp_running_vcpu is set to the current guest vcpu.
+		 */
+		if (mmu == vcpu->arch.hw_mmu || WARN_ON(mmu != host_s2_mmu))
+			return;
+
+		cxt->mmu = vcpu->arch.hw_mmu;
+	} else {
+		/* We're in host context. */
+		if (mmu == host_s2_mmu)
+			return;
+
+		cxt->mmu = host_s2_mmu;
+	}
+
 	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
 		u64 val;
 
 		/*
 		 * For CPUs that are affected by ARM 1319367, we need to
-		 * avoid a host Stage-1 walk while we have the guest's
-		 * VMID set in the VTTBR in order to invalidate TLBs.
-		 * We're guaranteed that the S1 MMU is enabled, so we can
-		 * simply set the EPD bits to avoid any further TLB fill.
+		 * avoid a Stage-1 walk with the old VMID while we have
+		 * the new VMID set in the VTTBR in order to invalidate TLBs.
+		 * We're guaranteed that the host S1 MMU is enabled, so
+		 * we can simply set the EPD bits to avoid any further
+		 * TLB fill. For guests, we ensure that the S1 MMU is
+		 * temporarily enabled in the next context.
 		 */
 		val = cxt->tcr = read_sysreg_el1(SYS_TCR);
 		val |= TCR_EPD1_MASK | TCR_EPD0_MASK;
 		write_sysreg_el1(val, SYS_TCR);
 		isb();
+
+		if (vcpu) {
+			val = cxt->sctlr = read_sysreg_el1(SYS_SCTLR);
+			if (!(val & SCTLR_ELx_M)) {
+				val |= SCTLR_ELx_M;
+				write_sysreg_el1(val, SYS_SCTLR);
+				isb();
+			}
+		} else {
+			/* The host S1 MMU is always enabled. */
+			cxt->sctlr = SCTLR_ELx_M;
+		}
 	}
 
 	/*
@@ -62,18 +107,40 @@ static void __tlb_switch_to_guest(struct kvm_s2_mmu *mmu,
 	 * ensuring that we always have an ISB, but not two ISBs back
 	 * to back.
 	 */
-	__load_stage2(mmu, kern_hyp_va(mmu->arch));
+	if (vcpu)
+		__load_host_stage2();
+	else
+		__load_stage2(mmu, kern_hyp_va(mmu->arch));
+
 	asm(ALTERNATIVE("isb", "nop", ARM64_WORKAROUND_SPECULATIVE_AT));
 }
 
-static void __tlb_switch_to_host(struct tlb_inv_context *cxt)
+static void exit_vmid_context(struct tlb_inv_context *cxt)
 {
-	__load_host_stage2();
+	struct kvm_s2_mmu *mmu = cxt->mmu;
+	struct kvm_cpu_context *host_ctxt;
+	struct kvm_vcpu *vcpu;
+
+	host_ctxt = &this_cpu_ptr(&kvm_host_data)->host_ctxt;
+	vcpu = host_ctxt->__hyp_running_vcpu;
+
+	if (!mmu)
+		return;
+
+	if (vcpu)
+		__load_stage2(mmu, kern_hyp_va(mmu->arch));
+	else
+		__load_host_stage2();
 
 	if (cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT)) {
-		/* Ensure write of the host VMID */
+		/* Ensure write of the old VMID */
 		isb();
-		/* Restore the host's TCR_EL1 */
+
+		if (!(cxt->sctlr & SCTLR_ELx_M)) {
+			write_sysreg_el1(cxt->sctlr, SYS_SCTLR);
+			isb();
+		}
+
 		write_sysreg_el1(cxt->tcr, SYS_TCR);
 	}
 }
@@ -84,7 +151,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	/*
 	 * We could do so much better if we had the VA as well.
@@ -105,7 +172,7 @@ void __kvm_tlb_flush_vmid_ipa(struct kvm_s2_mmu *mmu,
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
@@ -114,7 +181,7 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, true);
+	enter_vmid_context(mmu, &cxt, true);
 
 	/*
 	 * We could do so much better if we had the VA as well.
@@ -135,7 +202,7 @@ void __kvm_tlb_flush_vmid_ipa_nsh(struct kvm_s2_mmu *mmu,
 	dsb(nsh);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
@@ -152,7 +219,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	start = round_down(start, stride);
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__flush_s2_tlb_range_op(ipas2e1is, start, pages, stride,
 				TLBI_TTL_UNKNOWN);
@@ -162,7 +229,7 @@ void __kvm_tlb_flush_vmid_range(struct kvm_s2_mmu *mmu,
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
@@ -170,13 +237,13 @@ void __kvm_tlb_flush_vmid(struct kvm_s2_mmu *mmu)
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__tlbi(vmalls12e1is);
 	dsb(ish);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
@@ -184,19 +251,19 @@ void __kvm_flush_cpu_context(struct kvm_s2_mmu *mmu)
 	struct tlb_inv_context cxt;
 
 	/* Switch to requested VMID */
-	__tlb_switch_to_guest(mmu, &cxt, false);
+	enter_vmid_context(mmu, &cxt, false);
 
 	__tlbi(vmalle1);
 	asm volatile("ic iallu");
 	dsb(nsh);
 	isb();
 
-	__tlb_switch_to_host(&cxt);
+	exit_vmid_context(&cxt);
 }
 
 void __kvm_flush_vm_context(void)
 {
-	/* Same remark as in __tlb_switch_to_guest() */
+	/* Same remark as in enter_vmid_context() */
 	dsb(ish);
 	__tlbi(alle1is);
 	dsb(ish);
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 10/31] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (8 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
                   ` (20 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

pKVM maintains its own state at EL2 for tracking the host fpsimd
state. Therefore, there is no need to map and share the host's view
with it.

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h |  3 ---
 arch/arm64/kvm/fpsimd.c           | 31 ++++---------------------------
 arch/arm64/kvm/reset.c            |  1 -
 3 files changed, 4 insertions(+), 31 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 4609d1b9ddde..74dc5a60f171 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -663,8 +663,6 @@ struct kvm_vcpu_arch {
 	struct kvm_guest_debug_arch vcpu_debug_state;
 	struct kvm_guest_debug_arch external_debug_state;
 
-	struct task_struct *parent_task;
-
 	/* VGIC state */
 	struct vgic_cpu vgic_cpu;
 	struct arch_timer_cpu timer_cpu;
@@ -1262,7 +1260,6 @@ void kvm_arch_vcpu_load_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu);
 void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu);
-void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu);
 
 static inline bool kvm_pmu_counter_deferred(struct perf_event_attr *attr)
 {
diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index d5837d65e4a1..63a6f82934a6 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -14,19 +14,6 @@
 #include <asm/kvm_mmu.h>
 #include <asm/sysreg.h>
 
-void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
-{
-	struct task_struct *p = vcpu->arch.parent_task;
-	struct user_fpsimd_state *fpsimd;
-
-	if (!is_protected_kvm_enabled() || !p)
-		return;
-
-	fpsimd = &p->thread.uw.fpsimd_state;
-	kvm_unshare_hyp(fpsimd, fpsimd + 1);
-	put_task_struct(p);
-}
-
 /*
  * Called on entry to KVM_RUN unless this vcpu previously ran at least
  * once and the most recent prior KVM_RUN for this vcpu was called from
@@ -38,28 +25,18 @@ void kvm_vcpu_unshare_task_fp(struct kvm_vcpu *vcpu)
  */
 int kvm_arch_vcpu_run_map_fp(struct kvm_vcpu *vcpu)
 {
-	int ret;
-
 	struct user_fpsimd_state *fpsimd = &current->thread.uw.fpsimd_state;
+	int ret;
 
-	kvm_vcpu_unshare_task_fp(vcpu);
+	/* pKVM has its own tracking of the host fpsimd state. */
+	if (is_protected_kvm_enabled())
+		return 0;
 
 	/* Make sure the host task fpsimd state is visible to hyp: */
 	ret = kvm_share_hyp(fpsimd, fpsimd + 1);
 	if (ret)
 		return ret;
 
-	/*
-	 * We need to keep current's task_struct pinned until its data has been
-	 * unshared with the hypervisor to make sure it is not re-used by the
-	 * kernel and donated to someone else while already shared -- see
-	 * kvm_vcpu_unshare_task_fp() for the matching put_task_struct().
-	 */
-	if (is_protected_kvm_enabled()) {
-		get_task_struct(current);
-		vcpu->arch.parent_task = current;
-	}
-
 	return 0;
 }
 
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 68d1d05672bd..1b7b58cb121f 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -151,7 +151,6 @@ void kvm_arm_vcpu_destroy(struct kvm_vcpu *vcpu)
 {
 	void *sve_state = vcpu->arch.sve_state;
 
-	kvm_vcpu_unshare_task_fp(vcpu);
 	kvm_unshare_hyp(vcpu, vcpu + 1);
 	if (sve_state)
 		kvm_unshare_hyp(sve_state, sve_state + vcpu_sve_state_size(vcpu));
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (9 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 10/31] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19 20:42   ` Oliver Upton
  2024-04-19  7:59 ` [PATCH v3 12/31] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
                   ` (19 subsequent siblings)
  30 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

hyp_{get,put}_page() are called extensively from the page-table code
to adjust reference counts on page-table pages. As a small step towards
removing reader serialisation on these paths, drop the 'hyp_pool' lock
in the case where the refcount remains positive, only taking the lock
if the page is to be freed back to the allocator.

At the same time, remove a misleading comment which implies that a page
with a refcount of zero that is not attached to a freelist is unsafe,
whereas in practice it is the other way around that can lead to
problems.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/gfp.h |  6 +-----
 arch/arm64/kvm/hyp/nvhe/page_alloc.c  | 16 ++++------------
 2 files changed, 5 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/gfp.h b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
index 97c527ef53c2..24eb7840d98e 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/gfp.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/gfp.h
@@ -10,11 +10,7 @@
 #define HYP_NO_ORDER	USHRT_MAX
 
 struct hyp_pool {
-	/*
-	 * Spinlock protecting concurrent changes to the memory pool as well as
-	 * the struct hyp_page of the pool's pages until we have a proper atomic
-	 * API at EL2.
-	 */
+	/* lock protecting concurrent changes to the memory pool. */
 	hyp_spinlock_t lock;
 	struct list_head free_area[NR_PAGE_ORDERS];
 	phys_addr_t range_start;
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index e691290d3765..73437f5c2616 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -152,33 +152,25 @@ static struct hyp_page *__hyp_extract_page(struct hyp_pool *pool,
 
 static void __hyp_put_page(struct hyp_pool *pool, struct hyp_page *p)
 {
-	if (hyp_page_ref_dec_and_test(p))
+	if (hyp_page_ref_dec_and_test(p)) {
+		hyp_spin_lock(&pool->lock);
 		__hyp_attach_page(pool, p);
+		hyp_spin_unlock(&pool->lock);
+	}
 }
 
-/*
- * Changes to the buddy tree and page refcounts must be done with the hyp_pool
- * lock held. If a refcount change requires an update to the buddy tree (e.g.
- * hyp_put_page()), both operations must be done within the same critical
- * section to guarantee transient states (e.g. a page with null refcount but
- * not yet attached to a free list) can't be observed by well-behaved readers.
- */
 void hyp_put_page(struct hyp_pool *pool, void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	hyp_spin_lock(&pool->lock);
 	__hyp_put_page(pool, p);
-	hyp_spin_unlock(&pool->lock);
 }
 
 void hyp_get_page(struct hyp_pool *pool, void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	hyp_spin_lock(&pool->lock);
 	hyp_page_ref_inc(p);
-	hyp_spin_unlock(&pool->lock);
 }
 
 void hyp_split_page(struct hyp_page *p)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 12/31] KVM: arm64: Prevent kmemleak from accessing .hyp.data
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (10 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 13/31] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
                   ` (18 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Quentin Perret <qperret@google.com>

We've added a .data section for the hypervisor, which kmemleak is
eager to parse. This clearly doesn't go well, so add the section
to kmemleak's block list.

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/pkvm.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/kvm/pkvm.c b/arch/arm64/kvm/pkvm.c
index e2c08443f284..85117ea8f351 100644
--- a/arch/arm64/kvm/pkvm.c
+++ b/arch/arm64/kvm/pkvm.c
@@ -258,6 +258,7 @@ static int __init finalize_pkvm(void)
 	 * at, which would end badly once inaccessible.
 	 */
 	kmemleak_free_part(__hyp_bss_start, __hyp_bss_end - __hyp_bss_start);
+	kmemleak_free_part(__hyp_rodata_start, __hyp_rodata_end - __hyp_rodata_start);
 	kmemleak_free_part_phys(hyp_mem_base, hyp_mem_size);
 
 	ret = pkvm_drop_host_privileges();
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 13/31] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps()
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (11 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 12/31] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 14/31] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
                   ` (17 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Fix the comment to clarify that __pkvm_vcpu_init_traps()
initializes traps for all VMs in protected mode, and not only
for protected VMs.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/pkvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
index 315d4ebe1d6a..16aa4875ddb8 100644
--- a/arch/arm64/kvm/hyp/nvhe/pkvm.c
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -200,7 +200,7 @@ static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
 }
 
 /*
- * Initialize trap register values for protected VMs.
+ * Initialize trap register values in protected mode.
  */
 void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
 {
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 14/31] KVM: arm64: Change kvm_handle_mmio_return() return polarity
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (12 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 13/31] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 15/31] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
                   ` (16 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Most exit handlers return <= 0 to indicate that the host needs to
handle the exit. Make kvm_handle_mmio_return() consistent with
the exit handlers in handle_exit(). This makes the code easier to
reason about, and makes it easier to add other handlers in future
patches.

No functional change intended.
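
For reference, the caller-side convention this aligns with, as in
kvm_arch_vcpu_ioctl_run() (a sketch, not part of the patch):

	ret = kvm_handle_mmio_return(vcpu);
	if (ret <= 0)	/* 0 or -errno: the host/userspace handles the exit */
		return ret;
	/* ret == 1: MMIO return handled, carry on running the guest */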

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/arm.c  | 2 +-
 arch/arm64/kvm/mmio.c | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index e57aa3502e13..0b5473fbbd08 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -974,7 +974,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 
 	if (run->exit_reason == KVM_EXIT_MMIO) {
 		ret = kvm_handle_mmio_return(vcpu);
-		if (ret)
+		if (ret <= 0)
 			return ret;
 	}
 
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 200c8019a82a..5e1ffb0d5363 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -86,7 +86,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 
 	/* Detect an already handled MMIO return */
 	if (unlikely(!vcpu->mmio_needed))
-		return 0;
+		return 1;
 
 	vcpu->mmio_needed = 0;
 
@@ -117,7 +117,7 @@ int kvm_handle_mmio_return(struct kvm_vcpu *vcpu)
 	 */
 	kvm_incr_pc(vcpu);
 
-	return 0;
+	return 1;
 }
 
 int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 15/31] KVM: arm64: Move setting the page as dirty out of the critical section
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (13 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 14/31] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 16/31] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
                   ` (15 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Move the unlock earlier in user_mem_abort() to shorten the
critical section. This also helps with future refactoring and reuse
of similar code.

This moves marking the page as dirty out of the critical section.
That code does not interact with the stage-2 page tables, which the
read lock in the critical section protects.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index dc04bc767865..03cf0a473458 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1522,8 +1522,10 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 
 	read_lock(&kvm->mmu_lock);
 	pgt = vcpu->arch.hw_mmu->pgt;
-	if (mmu_invalidate_retry(kvm, mmu_seq))
+	if (mmu_invalidate_retry(kvm, mmu_seq)) {
+		ret = -EAGAIN;
 		goto out_unlock;
+	}
 
 	/*
 	 * If we are not forced to use page mapping, check if we are
@@ -1581,6 +1583,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 					     memcache,
 					     KVM_PGTABLE_WALK_HANDLE_FAULT |
 					     KVM_PGTABLE_WALK_SHARED);
+out_unlock:
+	read_unlock(&kvm->mmu_lock);
 
 	/* Mark the page dirty only if the fault is handled successfully */
 	if (writable && !ret) {
@@ -1588,8 +1592,6 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		mark_page_dirty_in_slot(kvm, memslot, gfn);
 	}
 
-out_unlock:
-	read_unlock(&kvm->mmu_lock);
 	kvm_release_pfn_clean(pfn);
 	return ret != -EAGAIN ? ret : 0;
 }
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 16/31] KVM: arm64: Simplify vgic-v3 hypercalls
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (14 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 15/31] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 17/31] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
                   ` (14 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Marc Zyngier <maz@kernel.org>

Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore
hypercalls so that all of the EL2 GICv3 state is covered by a single pair
of hypercalls.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_asm.h   |  8 ++------
 arch/arm64/include/asm/kvm_hyp.h   |  4 ++--
 arch/arm64/kvm/arm.c               |  5 ++---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 24 ++++++------------------
 arch/arm64/kvm/hyp/vgic-v3-sr.c    | 27 +++++++++++++++++++++++----
 arch/arm64/kvm/vgic/vgic-v2.c      |  9 +--------
 arch/arm64/kvm/vgic/vgic-v3.c      | 23 ++---------------------
 arch/arm64/kvm/vgic/vgic.c         | 11 -----------
 arch/arm64/kvm/vgic/vgic.h         |  2 --
 include/kvm/arm_vgic.h             |  1 -
 10 files changed, 38 insertions(+), 76 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index 24b5e6b23417..a6330460d9e5 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -73,10 +73,8 @@ enum __kvm_host_smccc_func {
 	__KVM_HOST_SMCCC_FUNC___kvm_tlb_flush_vmid_range,
 	__KVM_HOST_SMCCC_FUNC___kvm_flush_cpu_context,
 	__KVM_HOST_SMCCC_FUNC___kvm_timer_set_cntvoff,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_read_vmcr,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_write_vmcr,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_aprs,
-	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_aprs,
+	__KVM_HOST_SMCCC_FUNC___vgic_v3_save_vmcr_aprs,
+	__KVM_HOST_SMCCC_FUNC___vgic_v3_restore_vmcr_aprs,
 	__KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vm,
 	__KVM_HOST_SMCCC_FUNC___pkvm_init_vcpu,
@@ -241,8 +239,6 @@ extern int __kvm_vcpu_run(struct kvm_vcpu *vcpu);
 extern void __kvm_adjust_pc(struct kvm_vcpu *vcpu);
 
 extern u64 __vgic_v3_get_gic_config(void);
-extern u64 __vgic_v3_read_vmcr(void);
-extern void __vgic_v3_write_vmcr(u32 vmcr);
 extern void __vgic_v3_init_lrs(void);
 
 extern u64 __kvm_get_mdcr_el2(void);
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 3e2a1ac0c9bb..3e80464f8953 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -80,8 +80,8 @@ void __vgic_v3_save_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_restore_state(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_activate_traps(struct vgic_v3_cpu_if *cpu_if);
 void __vgic_v3_deactivate_traps(struct vgic_v3_cpu_if *cpu_if);
-void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if);
-void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if);
+void __vgic_v3_save_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
+void __vgic_v3_restore_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if);
 int __vgic_v3_perform_cpuif_access(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_NVHE_HYPERVISOR__
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 0b5473fbbd08..66301131d5a9 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -784,9 +784,8 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
 	 * doorbells to be signalled, should an interrupt become pending.
 	 */
 	preempt_disable();
-	kvm_vgic_vmcr_sync(vcpu);
 	vcpu_set_flag(vcpu, IN_WFI);
-	vgic_v4_put(vcpu);
+	kvm_vgic_put(vcpu);
 	preempt_enable();
 
 	kvm_vcpu_halt(vcpu);
@@ -794,7 +793,7 @@ void kvm_vcpu_wfi(struct kvm_vcpu *vcpu)
 
 	preempt_disable();
 	vcpu_clear_flag(vcpu, IN_WFI);
-	vgic_v4_load(vcpu);
+	kvm_vgic_load(vcpu);
 	preempt_enable();
 }
 
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 26561c562f7a..d5c48dc98f67 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -175,16 +175,6 @@ static void handle___vgic_v3_get_gic_config(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __vgic_v3_get_gic_config();
 }
 
-static void handle___vgic_v3_read_vmcr(struct kvm_cpu_context *host_ctxt)
-{
-	cpu_reg(host_ctxt, 1) = __vgic_v3_read_vmcr();
-}
-
-static void handle___vgic_v3_write_vmcr(struct kvm_cpu_context *host_ctxt)
-{
-	__vgic_v3_write_vmcr(cpu_reg(host_ctxt, 1));
-}
-
 static void handle___vgic_v3_init_lrs(struct kvm_cpu_context *host_ctxt)
 {
 	__vgic_v3_init_lrs();
@@ -195,18 +185,18 @@ static void handle___kvm_get_mdcr_el2(struct kvm_cpu_context *host_ctxt)
 	cpu_reg(host_ctxt, 1) = __kvm_get_mdcr_el2();
 }
 
-static void handle___vgic_v3_save_aprs(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_save_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
 
-	__vgic_v3_save_aprs(kern_hyp_va(cpu_if));
+	__vgic_v3_save_vmcr_aprs(kern_hyp_va(cpu_if));
 }
 
-static void handle___vgic_v3_restore_aprs(struct kvm_cpu_context *host_ctxt)
+static void handle___vgic_v3_restore_vmcr_aprs(struct kvm_cpu_context *host_ctxt)
 {
 	DECLARE_REG(struct vgic_v3_cpu_if *, cpu_if, host_ctxt, 1);
 
-	__vgic_v3_restore_aprs(kern_hyp_va(cpu_if));
+	__vgic_v3_restore_vmcr_aprs(kern_hyp_va(cpu_if));
 }
 
 static void handle___pkvm_init(struct kvm_cpu_context *host_ctxt)
@@ -337,10 +327,8 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__kvm_tlb_flush_vmid_range),
 	HANDLE_FUNC(__kvm_flush_cpu_context),
 	HANDLE_FUNC(__kvm_timer_set_cntvoff),
-	HANDLE_FUNC(__vgic_v3_read_vmcr),
-	HANDLE_FUNC(__vgic_v3_write_vmcr),
-	HANDLE_FUNC(__vgic_v3_save_aprs),
-	HANDLE_FUNC(__vgic_v3_restore_aprs),
+	HANDLE_FUNC(__vgic_v3_save_vmcr_aprs),
+	HANDLE_FUNC(__vgic_v3_restore_vmcr_aprs),
 	HANDLE_FUNC(__pkvm_vcpu_init_traps),
 	HANDLE_FUNC(__pkvm_init_vm),
 	HANDLE_FUNC(__pkvm_init_vcpu),
diff --git a/arch/arm64/kvm/hyp/vgic-v3-sr.c b/arch/arm64/kvm/hyp/vgic-v3-sr.c
index 6cb638b184b1..7b397fad26f2 100644
--- a/arch/arm64/kvm/hyp/vgic-v3-sr.c
+++ b/arch/arm64/kvm/hyp/vgic-v3-sr.c
@@ -330,7 +330,7 @@ void __vgic_v3_deactivate_traps(struct vgic_v3_cpu_if *cpu_if)
 		write_gicreg(0, ICH_HCR_EL2);
 }
 
-void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
+static void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
 {
 	u64 val;
 	u32 nr_pre_bits;
@@ -363,7 +363,7 @@ void __vgic_v3_save_aprs(struct vgic_v3_cpu_if *cpu_if)
 	}
 }
 
-void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if)
+static void __vgic_v3_restore_aprs(struct vgic_v3_cpu_if *cpu_if)
 {
 	u64 val;
 	u32 nr_pre_bits;
@@ -455,16 +455,35 @@ u64 __vgic_v3_get_gic_config(void)
 	return val;
 }
 
-u64 __vgic_v3_read_vmcr(void)
+static u64 __vgic_v3_read_vmcr(void)
 {
 	return read_gicreg(ICH_VMCR_EL2);
 }
 
-void __vgic_v3_write_vmcr(u32 vmcr)
+static void __vgic_v3_write_vmcr(u32 vmcr)
 {
 	write_gicreg(vmcr, ICH_VMCR_EL2);
 }
 
+void __vgic_v3_save_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if)
+{
+	__vgic_v3_save_aprs(cpu_if);
+	if (cpu_if->vgic_sre)
+		cpu_if->vgic_vmcr = __vgic_v3_read_vmcr();
+}
+
+void __vgic_v3_restore_vmcr_aprs(struct vgic_v3_cpu_if *cpu_if)
+{
+	/*
+	 * If dealing with a GICv2 emulation on GICv3, VMCR_EL2.VFIQen
+	 * is dependent on ICC_SRE_EL1.SRE, and we have to perform the
+	 * VMCR_EL2 save/restore in the world switch.
+	 */
+	if (cpu_if->vgic_sre)
+		__vgic_v3_write_vmcr(cpu_if->vgic_vmcr);
+	__vgic_v3_restore_aprs(cpu_if);
+}
+
 static int __vgic_v3_bpr_min(void)
 {
 	/* See Pseudocode for VPriorityGroup */
diff --git a/arch/arm64/kvm/vgic/vgic-v2.c b/arch/arm64/kvm/vgic/vgic-v2.c
index 7e9cdb78f7ce..ae5a44d5702d 100644
--- a/arch/arm64/kvm/vgic/vgic-v2.c
+++ b/arch/arm64/kvm/vgic/vgic-v2.c
@@ -464,17 +464,10 @@ void vgic_v2_load(struct kvm_vcpu *vcpu)
 		       kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
 
-void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
-
-	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
-}
-
 void vgic_v2_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v2_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v2;
 
-	vgic_v2_vmcr_sync(vcpu);
+	cpu_if->vgic_vmcr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_VMCR);
 	cpu_if->vgic_apr = readl_relaxed(kvm_vgic_global_state.vctrl_base + GICH_APR);
 }
diff --git a/arch/arm64/kvm/vgic/vgic-v3.c b/arch/arm64/kvm/vgic/vgic-v3.c
index 4ea3340786b9..ed6e412cd74b 100644
--- a/arch/arm64/kvm/vgic/vgic-v3.c
+++ b/arch/arm64/kvm/vgic/vgic-v3.c
@@ -722,15 +722,7 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
-	/*
-	 * If dealing with a GICv2 emulation on GICv3, VMCR_EL2.VFIQen
-	 * is dependent on ICC_SRE_EL1.SRE, and we have to perform the
-	 * VMCR_EL2 save/restore in the world switch.
-	 */
-	if (likely(cpu_if->vgic_sre))
-		kvm_call_hyp(__vgic_v3_write_vmcr, cpu_if->vgic_vmcr);
-
-	kvm_call_hyp(__vgic_v3_restore_aprs, cpu_if);
+	kvm_call_hyp(__vgic_v3_restore_vmcr_aprs, cpu_if);
 
 	if (has_vhe())
 		__vgic_v3_activate_traps(cpu_if);
@@ -738,24 +730,13 @@ void vgic_v3_load(struct kvm_vcpu *vcpu)
 	WARN_ON(vgic_v4_load(vcpu));
 }
 
-void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
-
-	if (likely(cpu_if->vgic_sre))
-		cpu_if->vgic_vmcr = kvm_call_hyp_ret(__vgic_v3_read_vmcr);
-}
-
 void vgic_v3_put(struct kvm_vcpu *vcpu)
 {
 	struct vgic_v3_cpu_if *cpu_if = &vcpu->arch.vgic_cpu.vgic_v3;
 
+	kvm_call_hyp(__vgic_v3_save_vmcr_aprs, cpu_if);
 	WARN_ON(vgic_v4_put(vcpu));
 
-	vgic_v3_vmcr_sync(vcpu);
-
-	kvm_call_hyp(__vgic_v3_save_aprs, cpu_if);
-
 	if (has_vhe())
 		__vgic_v3_deactivate_traps(cpu_if);
 }
diff --git a/arch/arm64/kvm/vgic/vgic.c b/arch/arm64/kvm/vgic/vgic.c
index 4ec93587c8cd..fcc5747f51e9 100644
--- a/arch/arm64/kvm/vgic/vgic.c
+++ b/arch/arm64/kvm/vgic/vgic.c
@@ -939,17 +939,6 @@ void kvm_vgic_put(struct kvm_vcpu *vcpu)
 		vgic_v3_put(vcpu);
 }
 
-void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu)
-{
-	if (unlikely(!irqchip_in_kernel(vcpu->kvm)))
-		return;
-
-	if (kvm_vgic_global_state.type == VGIC_V2)
-		vgic_v2_vmcr_sync(vcpu);
-	else
-		vgic_v3_vmcr_sync(vcpu);
-}
-
 int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu)
 {
 	struct vgic_cpu *vgic_cpu = &vcpu->arch.vgic_cpu;
diff --git a/arch/arm64/kvm/vgic/vgic.h b/arch/arm64/kvm/vgic/vgic.h
index 0c2b82de8fa3..4b93528e6a89 100644
--- a/arch/arm64/kvm/vgic/vgic.h
+++ b/arch/arm64/kvm/vgic/vgic.h
@@ -214,7 +214,6 @@ int vgic_register_dist_iodev(struct kvm *kvm, gpa_t dist_base_address,
 void vgic_v2_init_lrs(void);
 void vgic_v2_load(struct kvm_vcpu *vcpu);
 void vgic_v2_put(struct kvm_vcpu *vcpu);
-void vgic_v2_vmcr_sync(struct kvm_vcpu *vcpu);
 
 void vgic_v2_save_state(struct kvm_vcpu *vcpu);
 void vgic_v2_restore_state(struct kvm_vcpu *vcpu);
@@ -253,7 +252,6 @@ bool vgic_v3_check_base(struct kvm *kvm);
 
 void vgic_v3_load(struct kvm_vcpu *vcpu);
 void vgic_v3_put(struct kvm_vcpu *vcpu);
-void vgic_v3_vmcr_sync(struct kvm_vcpu *vcpu);
 
 bool vgic_has_its(struct kvm *kvm);
 int kvm_vgic_register_its_device(void);
diff --git a/include/kvm/arm_vgic.h b/include/kvm/arm_vgic.h
index 47035946648e..0c3cce31e0a2 100644
--- a/include/kvm/arm_vgic.h
+++ b/include/kvm/arm_vgic.h
@@ -388,7 +388,6 @@ int kvm_vgic_vcpu_pending_irq(struct kvm_vcpu *vcpu);
 
 void kvm_vgic_load(struct kvm_vcpu *vcpu);
 void kvm_vgic_put(struct kvm_vcpu *vcpu);
-void kvm_vgic_vmcr_sync(struct kvm_vcpu *vcpu);
 
 #define irqchip_in_kernel(k)	(!!((k)->arch.vgic.in_kernel))
 #define vgic_initialized(k)	((k)->arch.vgic.initialized)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 17/31] KVM: arm64: Add is_pkvm_initialized() helper
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (15 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 16/31] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 18/31] KVM: arm64: Introduce and use predicates that check for protected VMs Fuad Tabba
                   ` (13 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Quentin Perret <qperret@google.com>

Add a helper to check whether the pkvm static key is enabled, to ease
the introduction of pkvm hooks in other parts of the code.
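
As a purely illustrative sketch (the caller and callees below are made
up for the example), code elsewhere in the arch could then gate a
pkvm-specific path on the new helper:

  #include <asm/virt.h>

  static void example_set_memory_state(void *addr, size_t size)
  {
          /*
           * Only take the pkvm-specific path once the hypervisor has
           * been fully initialized in protected mode.
           */
          if (is_pkvm_initialized()) {
                  example_pkvm_hook(addr, size);          /* hypothetical hook */
                  return;
          }

          example_default_path(addr, size);               /* hypothetical fallback */
  }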

Signed-off-by: Quentin Perret <qperret@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/virt.h | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 261d6e9df2e1..ebf4a9f943ed 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -82,6 +82,12 @@ bool is_kvm_arm_initialised(void);
 
 DECLARE_STATIC_KEY_FALSE(kvm_protected_mode_initialized);
 
+static inline bool is_pkvm_initialized(void)
+{
+	return IS_ENABLED(CONFIG_KVM) &&
+	       static_branch_likely(&kvm_protected_mode_initialized);
+}
+
 /* Reports the availability of HYP mode */
 static inline bool is_hyp_mode_available(void)
 {
@@ -89,8 +95,7 @@ static inline bool is_hyp_mode_available(void)
 	 * If KVM protected mode is initialized, all CPUs must have been booted
 	 * in EL2. Avoid checking __boot_cpu_mode as CPUs now come up in EL1.
 	 */
-	if (IS_ENABLED(CONFIG_KVM) &&
-	    static_branch_likely(&kvm_protected_mode_initialized))
+	if (is_pkvm_initialized())
 		return true;
 
 	return (__boot_cpu_mode[0] == BOOT_CPU_MODE_EL2 &&
@@ -104,8 +109,7 @@ static inline bool is_hyp_mode_mismatched(void)
 	 * If KVM protected mode is initialized, all CPUs must have been booted
 	 * in EL2. Avoid checking __boot_cpu_mode as CPUs now come up in EL1.
 	 */
-	if (IS_ENABLED(CONFIG_KVM) &&
-	    static_branch_likely(&kvm_protected_mode_initialized))
+	if (is_pkvm_initialized())
 		return false;
 
 	return __boot_cpu_mode[0] != __boot_cpu_mode[1];
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 18/31] KVM: arm64: Introduce and use predicates that check for protected VMs
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (16 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 17/31] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 19/31] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
                   ` (12 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

In order to determine whether or not a VM or a vcpu is protected,
introduce helpers to query this state. While at it, use the vcpu
helper to check a vcpu's protected state instead of the kvm one.

Co-authored-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h      | 8 ++++----
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h | 5 +++++
 arch/arm64/kvm/hyp/nvhe/switch.c       | 6 ++----
 3 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 74dc5a60f171..0e6c186a6d6c 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -211,6 +211,7 @@ typedef unsigned int pkvm_handle_t;
 struct kvm_protected_vm {
 	pkvm_handle_t handle;
 	struct kvm_hyp_memcache teardown_mc;
+	bool enabled;
 };
 
 struct kvm_mpidr_data {
@@ -1295,10 +1296,9 @@ struct kvm *kvm_arch_alloc_vm(void);
 
 #define __KVM_HAVE_ARCH_FLUSH_REMOTE_TLBS_RANGE
 
-static inline bool kvm_vm_is_protected(struct kvm *kvm)
-{
-	return false;
-}
+#define kvm_vm_is_protected(kvm)	(is_protected_kvm_enabled() && (kvm)->arch.pkvm.enabled)
+
+#define vcpu_is_protected(vcpu)		kvm_vm_is_protected((vcpu)->kvm)
 
 int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
 bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
index 20c3f6e13b99..22f374e9f532 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -53,6 +53,11 @@ pkvm_hyp_vcpu_to_hyp_vm(struct pkvm_hyp_vcpu *hyp_vcpu)
 	return container_of(hyp_vcpu->vcpu.kvm, struct pkvm_hyp_vm, kvm);
 }
 
+static inline bool pkvm_hyp_vcpu_is_protected(struct pkvm_hyp_vcpu *hyp_vcpu)
+{
+	return vcpu_is_protected(&hyp_vcpu->vcpu);
+}
+
 void pkvm_hyp_vm_table_init(void *tbl);
 void pkvm_host_fpsimd_state_init(void);
 
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 5d2d4d6465e8..41d1ba6de41a 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -209,7 +209,7 @@ static const exit_handler_fn pvm_exit_handlers[] = {
 
 static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)
 {
-	if (unlikely(kvm_vm_is_protected(kern_hyp_va(vcpu->kvm))))
+	if (unlikely(vcpu_is_protected(vcpu)))
 		return pvm_exit_handlers;
 
 	return hyp_exit_handlers;
@@ -228,9 +228,7 @@ static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm_vcpu *vcpu)
  */
 static void early_exit_filter(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
-	struct kvm *kvm = kern_hyp_va(vcpu->kvm);
-
-	if (kvm_vm_is_protected(kvm) && vcpu_mode_is_32bit(vcpu)) {
+	if (unlikely(vcpu_is_protected(vcpu) && vcpu_mode_is_32bit(vcpu))) {
 		/*
 		 * As we have caught the guest red-handed, decide that it isn't
 		 * fit for purpose anymore by making the vcpu invalid. The VMM
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 19/31] KVM: arm64: Move pstate reset value definitions to kvm_arm.h
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (17 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 18/31] KVM: arm64: Introduce and use predicates that check for protected VMs Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 20/31] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
                   ` (11 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Move the macro defines of the pstate reset values to a shared
header to be used by hyp in future patches.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_arm.h | 12 ++++++++++++
 arch/arm64/kvm/reset.c           | 12 ------------
 2 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index e01bb5ca13b7..12a4b226690a 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -432,4 +432,16 @@
 	{ PSR_AA32_MODE_UND,	"32-bit UND" },	\
 	{ PSR_AA32_MODE_SYS,	"32-bit SYS" }
 
+/*
+ * ARMv8 Reset Values
+ */
+#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
+				 PSR_F_BIT | PSR_D_BIT)
+
+#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
+				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
+
 #endif /* __ARM64_KVM_ARM_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 1b7b58cb121f..3d8064bf67c8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -33,18 +33,6 @@
 /* Maximum phys_shift supported for any VM on this host */
 static u32 __ro_after_init kvm_ipa_limit;
 
-/*
- * ARMv8 Reset Values
- */
-#define VCPU_RESET_PSTATE_EL1	(PSR_MODE_EL1h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_EL2	(PSR_MODE_EL2h | PSR_A_BIT | PSR_I_BIT | \
-				 PSR_F_BIT | PSR_D_BIT)
-
-#define VCPU_RESET_PSTATE_SVC	(PSR_AA32_MODE_SVC | PSR_AA32_A_BIT | \
-				 PSR_AA32_I_BIT | PSR_AA32_F_BIT)
-
 unsigned int __ro_after_init kvm_sve_max_vl;
 
 int __init kvm_arm_init_sve(void)
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 20/31] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (18 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 19/31] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 21/31] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
                   ` (10 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Expand the comment clarifying why the value restored to ZCR_EL1 on
guest exit, which represents the SVE vector length, isn't the same as
it was on guest entry.

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/kvm/fpsimd.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
index 63a6f82934a6..1807d3a79a8a 100644
--- a/arch/arm64/kvm/fpsimd.c
+++ b/arch/arm64/kvm/fpsimd.c
@@ -175,12 +175,34 @@ void kvm_arch_vcpu_put_fp(struct kvm_vcpu *vcpu)
 		if (vcpu_has_sve(vcpu)) {
 			__vcpu_sys_reg(vcpu, ZCR_EL1) = read_sysreg_el1(SYS_ZCR);
 
-			/* Restore the VL that was saved when bound to the CPU */
+			/*
+			 * Restore the VL that was saved when bound to the CPU,
+			 * which is the maximum VL for the guest. Because the
+			 * layout of the data when saving the sve state depends
+			 * on the VL, we need to use a consistent (i.e., the
+			 * maximum) VL.
+			 * Note that this means that at guest exit ZCR_EL1 is
+			 * not necessarily the same as on guest entry.
+			 *
+			 * Restoring the VL isn't needed in VHE mode since
+			 * ZCR_EL2 (accessed via ZCR_EL1) would fulfill the same
+			 * role when doing the save from EL2.
+			 */
 			if (!has_vhe())
 				sve_cond_update_zcr_vq(vcpu_sve_max_vq(vcpu) - 1,
 						       SYS_ZCR_EL1);
 		}
 
+		/*
+		 * Flush (save and invalidate) the fpsimd/sve state so that if
+		 * the host tries to use fpsimd/sve, it's not using stale data
+		 * from the guest.
+		 *
+		 * Flushing the state sets the TIF_FOREIGN_FPSTATE bit for the
+		 * context unconditionally, in both nVHE and VHE. This allows
+		 * the kernel to restore the fpsimd/sve state, including ZCR_EL1
+		 * when needed.
+		 */
 		fpsimd_save_and_flush_cpu_state();
 	} else if (has_vhe() && system_supports_sve()) {
 		/*
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 21/31] KVM: arm64: Refactor calculating SVE state size to use helpers
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (19 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 20/31] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 22/31] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
                   ` (9 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

The main factor for determining the SVE state size is the vector
length, and future patches will need to calculate it without
necessarily having a vcpu as a reference.

No functional change intended.
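
As an illustration only (the allocation site is hypothetical), the
helper lets code size an SVE buffer from a vector length alone, for
example the host's maximum VL, without going through a vcpu:

  /* Size and allocate a buffer for the host's maximum vector length */
  size_t sz = sve_state_size(kvm_sve_max_vl);
  void *buf = kzalloc(sz, GFP_KERNEL_ACCOUNT);

  if (!buf)
          return -ENOMEM;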

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Mark Brown <broonie@kernel.org>
---
 arch/arm64/include/asm/kvm_host.h | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 0e6c186a6d6c..bf4e8a8d9956 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -854,22 +854,24 @@ struct kvm_vcpu_arch {
 #define vcpu_sve_pffr(vcpu) (kern_hyp_va((vcpu)->arch.sve_state) +	\
 			     sve_ffr_offset((vcpu)->arch.sve_max_vl))
 
-#define vcpu_sve_max_vq(vcpu)	sve_vq_from_vl((vcpu)->arch.sve_max_vl)
-
-#define vcpu_sve_state_size(vcpu) ({					\
+#define sve_state_size(sve_max_vl) ({					\
 	size_t __size_ret;						\
-	unsigned int __vcpu_vq;						\
+	unsigned int __vq;						\
 									\
-	if (WARN_ON(!sve_vl_valid((vcpu)->arch.sve_max_vl))) {		\
+	if (WARN_ON(!sve_vl_valid(sve_max_vl))) {			\
 		__size_ret = 0;						\
 	} else {							\
-		__vcpu_vq = vcpu_sve_max_vq(vcpu);			\
-		__size_ret = SVE_SIG_REGS_SIZE(__vcpu_vq);		\
+		__vq = sve_vq_from_vl(sve_max_vl);			\
+		__size_ret = SVE_SIG_REGS_SIZE(__vq);			\
 	}								\
 									\
 	__size_ret;							\
 })
 
+#define vcpu_sve_max_vq(vcpu) sve_vq_from_vl((vcpu)->arch.sve_max_vl)
+
+#define vcpu_sve_state_size(vcpu) sve_state_size((vcpu)->arch.sve_max_vl)
+
 #define KVM_GUESTDBG_VALID_MASK (KVM_GUESTDBG_ENABLE | \
 				 KVM_GUESTDBG_USE_SW_BP | \
 				 KVM_GUESTDBG_USE_HW | \
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 22/31] KVM: arm64: Move some kvm_psci functions to a shared header
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (20 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 21/31] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 23/31] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
                   ` (8 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Move some PSCI functions and macros to a shared header to be used
by hyp in protected mode.

No functional change intended.
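
For illustration, the moved helpers would be used in an
AFFINITY_INFO-style path much like the existing one in psci.c (the
surrounding variables here are placeholders):

  /* Reject affinity values with bits outside MPIDR_HWID_BITMASK */
  if (!kvm_psci_valid_affinity(vcpu, target_affinity))
          return PSCI_RET_INVALID_PARAMS;

  /* Keep only the affinity fields at or above 'lowest_affinity_level' */
  mask = psci_affinity_mask(lowest_affinity_level);
  if ((kvm_vcpu_get_mpidr_aff(tmp) & mask) == target_affinity)
          /* 'tmp' falls within the requested affinity group */;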

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/psci.c  | 28 ----------------------------
 include/kvm/arm_psci.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+), 28 deletions(-)

diff --git a/arch/arm64/kvm/psci.c b/arch/arm64/kvm/psci.c
index 1f69b667332b..43458949d955 100644
--- a/arch/arm64/kvm/psci.c
+++ b/arch/arm64/kvm/psci.c
@@ -21,16 +21,6 @@
  * as described in ARM document number ARM DEN 0022A.
  */
 
-#define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
-
-static unsigned long psci_affinity_mask(unsigned long affinity_level)
-{
-	if (affinity_level <= 3)
-		return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
-
-	return 0;
-}
-
 static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 {
 	/*
@@ -51,12 +41,6 @@ static unsigned long kvm_psci_vcpu_suspend(struct kvm_vcpu *vcpu)
 	return PSCI_RET_SUCCESS;
 }
 
-static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
-					   unsigned long affinity)
-{
-	return !(affinity & ~MPIDR_HWID_BITMASK);
-}
-
 static unsigned long kvm_psci_vcpu_on(struct kvm_vcpu *source_vcpu)
 {
 	struct vcpu_reset_state *reset_state;
@@ -214,18 +198,6 @@ static void kvm_psci_system_suspend(struct kvm_vcpu *vcpu)
 	run->exit_reason = KVM_EXIT_SYSTEM_EVENT;
 }
 
-static void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
-{
-	int i;
-
-	/*
-	 * Zero the input registers' upper 32 bits. They will be fully
-	 * zeroed on exit, so we're fine changing them in place.
-	 */
-	for (i = 1; i < 4; i++)
-		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
-}
-
 static unsigned long kvm_psci_check_allowed_function(struct kvm_vcpu *vcpu, u32 fn)
 {
 	/*
diff --git a/include/kvm/arm_psci.h b/include/kvm/arm_psci.h
index e8fb624013d1..c86f228efae1 100644
--- a/include/kvm/arm_psci.h
+++ b/include/kvm/arm_psci.h
@@ -36,6 +36,35 @@ static inline int kvm_psci_version(struct kvm_vcpu *vcpu)
 	return KVM_ARM_PSCI_0_1;
 }
 
+/* Narrow the PSCI register arguments (r1 to r3) to 32 bits. */
+static inline void kvm_psci_narrow_to_32bit(struct kvm_vcpu *vcpu)
+{
+	int i;
+
+	/*
+	 * Zero the input registers' upper 32 bits. They will be fully
+	 * zeroed on exit, so we're fine changing them in place.
+	 */
+	for (i = 1; i < 4; i++)
+		vcpu_set_reg(vcpu, i, lower_32_bits(vcpu_get_reg(vcpu, i)));
+}
+
+static inline bool kvm_psci_valid_affinity(struct kvm_vcpu *vcpu,
+					   unsigned long affinity)
+{
+	return !(affinity & ~MPIDR_HWID_BITMASK);
+}
+
+
+#define AFFINITY_MASK(level)	~((0x1UL << ((level) * MPIDR_LEVEL_BITS)) - 1)
+
+static inline unsigned long psci_affinity_mask(unsigned long affinity_level)
+{
+	if (affinity_level <= 3)
+		return MPIDR_HWID_BITMASK & AFFINITY_MASK(affinity_level);
+
+	return 0;
+}
 
 int kvm_psci_call(struct kvm_vcpu *vcpu);
 
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 23/31] KVM: arm64: Refactor reset_mpidr() to extract its computation
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (21 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 22/31] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 24/31] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
                   ` (7 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Move the computation of the MPIDR value to its own function in a
shared header, as the computation will be used by hyp in protected
mode.

No functional change intended.
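
For instance, under the existing mapping (which this patch does not
change), a vcpu_id of 0x123 yields:

  Aff0 = 0x123 & 0x0f          = 0x03   (bits [7:0])
  Aff1 = (0x123 >> 4) & 0xff   = 0x12   (bits [15:8])
  Aff2 = (0x123 >> 12) & 0xff  = 0x00   (bits [23:16])

i.e. an MPIDR value of 0x80001203 once bit 31 is set.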

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/sys_regs.c | 14 +-------------
 arch/arm64/kvm/sys_regs.h | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 33efa441e21d..c6da7dd891bd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -816,21 +816,9 @@ static u64 reset_actlr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 
 static u64 reset_mpidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
 {
-	u64 mpidr;
+	u64 mpidr = calculate_mpidr(vcpu);
 
-	/*
-	 * Map the vcpu_id into the first three affinity level fields of
-	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
-	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
-	 * of the GICv3 to be able to address each CPU directly when
-	 * sending IPIs.
-	 */
-	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
-	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
-	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
-	mpidr |= (1ULL << 31);
 	vcpu_write_sys_reg(vcpu, mpidr, MPIDR_EL1);
-
 	return mpidr;
 }
 
diff --git a/arch/arm64/kvm/sys_regs.h b/arch/arm64/kvm/sys_regs.h
index 997eea21ba2a..1dfd2380a1ae 100644
--- a/arch/arm64/kvm/sys_regs.h
+++ b/arch/arm64/kvm/sys_regs.h
@@ -222,6 +222,25 @@ find_reg(const struct sys_reg_params *params, const struct sys_reg_desc table[],
 	return __inline_bsearch((void *)pval, table, num, sizeof(table[0]), match_sys_reg);
 }
 
+static inline u64 calculate_mpidr(const struct kvm_vcpu *vcpu)
+{
+	u64 mpidr;
+
+	/*
+	 * Map the vcpu_id into the first three affinity level fields of
+	 * the MPIDR. We limit the number of VCPUs in level 0 due to a
+	 * limitation to 16 CPUs in that level in the ICC_SGIxR registers
+	 * of the GICv3 to be able to address each CPU directly when
+	 * sending IPIs.
+	 */
+	mpidr = (vcpu->vcpu_id & 0x0f) << MPIDR_LEVEL_SHIFT(0);
+	mpidr |= ((vcpu->vcpu_id >> 4) & 0xff) << MPIDR_LEVEL_SHIFT(1);
+	mpidr |= ((vcpu->vcpu_id >> 12) & 0xff) << MPIDR_LEVEL_SHIFT(2);
+	mpidr |= (1ULL << 31);
+
+	return mpidr;
+}
+
 const struct sys_reg_desc *get_reg_by_id(u64 id,
 					 const struct sys_reg_desc table[],
 					 unsigned int num);
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 24/31] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (22 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 23/31] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 25/31] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
                   ` (6 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

Move kvm_vcpu_enable_ptrauth() to a shared header to be used by
hyp in protected mode.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_emulate.h | 5 +++++
 arch/arm64/kvm/reset.c               | 7 +------
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index deaa88b972ec..43b5d51981f9 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -609,4 +609,9 @@ static __always_inline void kvm_reset_cptr_el2(struct kvm_vcpu *vcpu)
 
 	kvm_write_cptr_el2(val);
 }
+
+static inline void kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
+{
+	vcpu_set_flag(vcpu, GUEST_HAS_PTRAUTH);
+}
 #endif /* __ARM64_KVM_EMULATE_H__ */
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 3d8064bf67c8..c955419582a8 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -105,7 +105,7 @@ static int kvm_vcpu_finalize_sve(struct kvm_vcpu *vcpu)
 		kfree(buf);
 		return ret;
 	}
-	
+
 	vcpu->arch.sve_state = buf;
 	vcpu_set_flag(vcpu, VCPU_SVE_FINALIZED);
 	return 0;
@@ -152,11 +152,6 @@ static void kvm_vcpu_reset_sve(struct kvm_vcpu *vcpu)
 		memset(vcpu->arch.sve_state, 0, vcpu_sve_state_size(vcpu));
 }
 
-static void kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu)
-{
-	vcpu_set_flag(vcpu, GUEST_HAS_PTRAUTH);
-}
-
 /**
  * kvm_reset_vcpu - sets core registers and sys_regs to reset value
  * @vcpu: The VCPU pointer
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 25/31] KVM: arm64: Introduce hyp_rwlock_t
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (23 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 24/31] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 26/31] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
                   ` (5 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

Introduce a simple counter-based rwlock for EL2 which can reduce locking
contention on read-mostly data structures when compared to a spinlock.
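
A minimal usage sketch of the new API (the protected data and the
functions around it are made up for the example):

  #include <nvhe/rwlock.h>

  static DEFINE_HYP_RWLOCK(example_lock);
  static unsigned long example_count;

  static unsigned long example_read(void)
  {
          unsigned long val;

          /* Multiple readers may hold the lock concurrently */
          hyp_read_lock(&example_lock);
          val = example_count;
          hyp_read_unlock(&example_lock);

          return val;
  }

  static void example_update(unsigned long val)
  {
          /* Writers exclude both readers and other writers */
          hyp_write_lock(&example_lock);
          example_count = val;
          hyp_write_unlock(&example_lock);
  }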

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/rwlock.h | 129 +++++++++++++++++++++++
 1 file changed, 129 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/rwlock.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/rwlock.h b/arch/arm64/kvm/hyp/include/nvhe/rwlock.h
new file mode 100644
index 000000000000..365084497e59
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/rwlock.h
@@ -0,0 +1,129 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * A stand-alone rwlock implementation for use by the non-VHE KVM
+ * hypervisor code running at EL2. This is *not* a fair lock and is
+ * likely to scale very badly under contention.
+ *
+ * Copyright (C) 2022 Google LLC
+ * Author: Will Deacon <will@kernel.org>
+ *
+ * Heavily based on the implementation removed by 087133ac9076 which was:
+ * Copyright (C) 2012 ARM Ltd.
+ */
+
+#ifndef __ARM64_KVM_NVHE_RWLOCK_H__
+#define __ARM64_KVM_NVHE_RWLOCK_H__
+
+#include <linux/bits.h>
+
+typedef struct {
+	u32	__val;
+} hyp_rwlock_t;
+
+#define __HYP_RWLOCK_INITIALIZER \
+	{ .__val = 0 }
+
+#define __HYP_RWLOCK_UNLOCKED \
+	((hyp_rwlock_t) __HYP_RWLOCK_INITIALIZER)
+
+#define DEFINE_HYP_RWLOCK(x)	hyp_rwlock_t x = __HYP_RWLOCK_UNLOCKED
+
+#define hyp_rwlock_init(l)						\
+do {									\
+	*(l) = __HYP_RWLOCK_UNLOCKED;					\
+} while (0)
+
+#define __HYP_RWLOCK_WRITER_BIT	31
+
+static inline void hyp_write_lock(hyp_rwlock_t *lock)
+{
+	u32 tmp;
+
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"	sevl\n"
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, %1\n"
+	"	cbnz	%w0, 1b\n"
+	"	stxr	%w0, %w2, %1\n"
+	"	cbnz	%w0, 2b\n"
+	__nops(1),
+	/* LSE atomics */
+	"1:	mov	%w0, wzr\n"
+	"2:	casa	%w0, %w2, %1\n"
+	"	cbz	%w0, 3f\n"
+	"	ldxr	%w0, %1\n"
+	"	cbz	%w0, 2b\n"
+	"	wfe\n"
+	"	b	1b\n"
+	"3:")
+	: "=&r" (tmp), "+Q" (lock->__val)
+	: "r" (BIT(__HYP_RWLOCK_WRITER_BIT))
+	: "memory");
+}
+
+static inline void hyp_write_unlock(hyp_rwlock_t *lock)
+{
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	"	stlr	wzr, %0",
+	"	swpl	wzr, wzr, %0")
+	: "=Q" (lock->__val) :: "memory");
+}
+
+static inline void hyp_read_lock(hyp_rwlock_t *lock)
+{
+	u32 tmp, tmp2;
+
+	asm volatile(
+	"	sevl\n"
+	ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, %2\n"
+	"	add	%w0, %w0, #1\n"
+	"	tbnz	%w0, %3, 1b\n"
+	"	stxr	%w1, %w0, %2\n"
+	"	cbnz	%w1, 2b\n"
+	__nops(1),
+	/* LSE atomics */
+	"1:	wfe\n"
+	"2:	ldxr	%w0, %2\n"
+	"	adds	%w1, %w0, #1\n"
+	"	tbnz	%w1, %3, 1b\n"
+	"	casa	%w0, %w1, %2\n"
+	"	sbc	%w0, %w1, %w0\n"
+	"	cbnz	%w0, 2b")
+	: "=&r" (tmp), "=&r" (tmp2), "+Q" (lock->__val)
+	: "i" (__HYP_RWLOCK_WRITER_BIT)
+	: "cc", "memory");
+}
+
+static inline void hyp_read_unlock(hyp_rwlock_t *lock)
+{
+	u32 tmp, tmp2;
+
+	asm volatile(ARM64_LSE_ATOMIC_INSN(
+	/* LL/SC */
+	"1:	ldxr	%w0, %2\n"
+	"	sub	%w0, %w0, #1\n"
+	"	stlxr	%w1, %w0, %2\n"
+	"	cbnz	%w1, 1b",
+	/* LSE atomics */
+	"	movn	%w0, #0\n"
+	"	staddl	%w0, %2\n"
+	__nops(2))
+	: "=&r" (tmp), "=&r" (tmp2), "+Q" (lock->__val)
+	:
+	: "memory");
+}
+
+#ifdef CONFIG_NVHE_EL2_DEBUG
+static inline void hyp_assert_write_lock_held(hyp_rwlock_t *lock)
+{
+	BUG_ON(!(READ_ONCE(lock->__val) & BIT(__HYP_RWLOCK_WRITER_BIT)));
+}
+#else
+static inline void hyp_assert_write_lock_held(hyp_rwlock_t *lock) { }
+#endif
+
+#endif	/* __ARM64_KVM_NVHE_RWLOCK_H__ */
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 26/31] KVM: arm64: Add atomics-based checking refcount implementation at EL2
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (24 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 25/31] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
                   ` (4 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

The current nVHE refcount implementation at EL2 uses a simple spinlock
to serialise accesses. Although this works, it forces serialisation in
places where it is not necessary, so introduce a simple atomics-based
refcount implementation instead.
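
A rough usage sketch (the object and its release path below are
hypothetical; the next patch converts the real 'struct hyp_page'
users):

  #include <nvhe/refcount.h>

  struct example_obj {
          unsigned short refcount;        /* 16-bit, as the helpers expect */
  };

  static void example_get(struct example_obj *obj)
  {
          hyp_refcount_inc(obj->refcount);
  }

  static void example_put(struct example_obj *obj)
  {
          /* The helpers BUG() on underflow */
          if (hyp_refcount_dec(obj->refcount) == 0)
                  example_free(obj);      /* hypothetical release path */
  }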

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/refcount.h | 76 ++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/refcount.h

diff --git a/arch/arm64/kvm/hyp/include/nvhe/refcount.h b/arch/arm64/kvm/hyp/include/nvhe/refcount.h
new file mode 100644
index 000000000000..bf6bd4ef89bf
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/refcount.h
@@ -0,0 +1,76 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Atomics-based checking refcount implementation.
+ * Copyright (C) 2023 Google LLC
+ * Author: Will Deacon <will@kernel.org>
+ */
+#ifndef __ARM64_KVM_NVHE_REFCOUNT_H__
+#define __ARM64_KVM_NVHE_REFCOUNT_H__
+
+#include <asm/lse.h>
+
+static inline s16 __ll_sc_refcount_fetch_add_16(u16 *refcount, s16 addend)
+{
+	u16 new;
+	u32 flag;
+
+	asm volatile(
+	"	prfm	pstl1strm, %[refcount]\n"
+	"1:	ldxrh	%w[new], %[refcount]\n"
+	"	add	%w[new], %w[new], %w[addend]\n"
+	"	stxrh	%w[flag], %w[new], %[refcount]\n"
+	"	cbnz	%w[flag], 1b"
+	: [refcount] "+Q" (*refcount),
+	  [new] "=&r" (new),
+	  [flag] "=&r" (flag)
+	: [addend] "Ir" (addend));
+
+	return new;
+}
+
+#ifdef CONFIG_ARM64_LSE_ATOMICS
+
+static inline s16 __lse_refcount_fetch_add_16(u16 *refcount, s16 addend)
+{
+	s16 old;
+
+	asm volatile(__LSE_PREAMBLE
+	"	ldaddh	%w[addend], %w[old], %[refcount]"
+	: [refcount] "+Q" (*refcount),
+	  [old] "=r" (old)
+	: [addend] "r" (addend));
+
+	return old + addend;
+}
+
+#endif /* CONFIG_ARM64_LSE_ATOMICS */
+
+static inline u64 __hyp_refcount_fetch_add(void *refcount, const size_t size,
+					   const s64 addend)
+{
+	s64 new;
+
+	switch (size) {
+	case 2:
+		new = __lse_ll_sc_body(refcount_fetch_add_16, refcount, addend);
+		break;
+	default:
+		BUILD_BUG_ON_MSG(1, "Unsupported refcount size");
+		unreachable();
+	}
+
+	BUG_ON(new < 0);
+	return new;
+}
+
+
+#define hyp_refcount_inc(r)	__hyp_refcount_fetch_add(&(r), sizeof(r), 1)
+#define hyp_refcount_dec(r)	__hyp_refcount_fetch_add(&(r), sizeof(r), -1)
+#define hyp_refcount_get(r)	READ_ONCE(r)
+#define hyp_refcount_set(r, v)	do {			\
+	typeof(r) *__rp = &(r);				\
+	WARN_ON(hyp_refcount_get(*__rp));		\
+	WRITE_ONCE(*__rp, v);				\
+} while (0)
+
+#endif /* __ARM64_KVM_NVHE_REFCOUNT_H__ */
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (25 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 26/31] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19 20:52   ` Oliver Upton
  2024-04-19  7:59 ` [PATCH v3 28/31] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
                   ` (3 subsequent siblings)
  30 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

Convert the 'struct hyp_page' refcount manipulation functions over to
the new atomic refcount helpers. For now, this will make absolutely no
difference because the 'struct hyp_pool' locking is still serialising
everything. One step at a time...

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 +++++++-----------
 arch/arm64/kvm/hyp/nvhe/mem_protect.c    |  2 +-
 arch/arm64/kvm/hyp/nvhe/page_alloc.c     |  5 ++++-
 3 files changed, 12 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
index ab205c4d6774..74474c82667b 100644
--- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
+++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
@@ -6,6 +6,7 @@
 #include <asm/page.h>
 
 #include <linux/types.h>
+#include <nvhe/refcount.h>
 
 struct hyp_page {
 	unsigned short refcount;
@@ -39,37 +40,32 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
 #define hyp_page_to_pool(page)	(((struct hyp_page *)page)->pool)
 
 /*
- * Refcounting for 'struct hyp_page'.
- * hyp_pool::lock must be held if atomic access to the refcount is required.
+ * Refcounting wrappers for 'struct hyp_page'.
  */
 static inline int hyp_page_count(void *addr)
 {
 	struct hyp_page *p = hyp_virt_to_page(addr);
 
-	return p->refcount;
+	return hyp_refcount_get(p->refcount);
 }
 
 static inline void hyp_page_ref_inc(struct hyp_page *p)
 {
-	BUG_ON(p->refcount == USHRT_MAX);
-	p->refcount++;
+	hyp_refcount_inc(p->refcount);
 }
 
 static inline void hyp_page_ref_dec(struct hyp_page *p)
 {
-	BUG_ON(!p->refcount);
-	p->refcount--;
+	hyp_refcount_dec(p->refcount);
 }
 
 static inline int hyp_page_ref_dec_and_test(struct hyp_page *p)
 {
-	hyp_page_ref_dec(p);
-	return (p->refcount == 0);
+	return hyp_refcount_dec(p->refcount) == 0;
 }
 
 static inline void hyp_set_page_refcounted(struct hyp_page *p)
 {
-	BUG_ON(p->refcount);
-	p->refcount = 1;
+	hyp_refcount_set(p->refcount, 1);
 }
 #endif /* __KVM_HYP_MEMORY_H */
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index caba3e4bd09e..ab5e73a82857 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -202,7 +202,7 @@ static void *guest_s2_zalloc_page(void *mc)
 	memset(addr, 0, PAGE_SIZE);
 	p = hyp_virt_to_page(addr);
 	memset(p, 0, sizeof(*p));
-	p->refcount = 1;
+	hyp_set_page_refcounted(p);
 
 	return addr;
 }
diff --git a/arch/arm64/kvm/hyp/nvhe/page_alloc.c b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
index 73437f5c2616..96b52f545af0 100644
--- a/arch/arm64/kvm/hyp/nvhe/page_alloc.c
+++ b/arch/arm64/kvm/hyp/nvhe/page_alloc.c
@@ -55,7 +55,10 @@ static struct hyp_page *__find_buddy_avail(struct hyp_pool *pool,
 {
 	struct hyp_page *buddy = __find_buddy_nocheck(pool, p, order);
 
-	if (!buddy || buddy->order != order || buddy->refcount)
+	if (!buddy)
+		return NULL;
+
+	if (buddy->order != order || hyp_refcount_get(buddy->refcount))
 		return NULL;
 
 	return buddy;
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 28/31] KVM: arm64: Reformat/beautify PTP hypercall documentation
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (26 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 29/31] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
                   ` (2 subsequent siblings)
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

The PTP hypercall documentation doesn't produce the best-looking table
when formatted as HTML, as all of the return value definitions end up
on the same line.

Reformat the PTP hypercall documentation to follow the formatting used
by hypercalls.rst.

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/arm/ptp_kvm.rst | 38 ++++++++++++++++----------
 1 file changed, 24 insertions(+), 14 deletions(-)

diff --git a/Documentation/virt/kvm/arm/ptp_kvm.rst b/Documentation/virt/kvm/arm/ptp_kvm.rst
index aecdc80ddcd8..7c0960970a0e 100644
--- a/Documentation/virt/kvm/arm/ptp_kvm.rst
+++ b/Documentation/virt/kvm/arm/ptp_kvm.rst
@@ -7,19 +7,29 @@ PTP_KVM is used for high precision time sync between host and guests.
 It relies on transferring the wall clock and counter value from the
 host to the guest using a KVM-specific hypercall.
 
-* ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID: 0x86000001
+``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
+----------------------------------------
 
-This hypercall uses the SMC32/HVC32 calling convention:
+Retrieve current time information for the specific counter. There are no
+endianness restrictions.
 
-ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID
-    ==============    ========    =====================================
-    Function ID:      (uint32)    0x86000001
-    Arguments:        (uint32)    KVM_PTP_VIRT_COUNTER(0)
-                                  KVM_PTP_PHYS_COUNTER(1)
-    Return Values:    (int32)     NOT_SUPPORTED(-1) on error, or
-                      (uint32)    Upper 32 bits of wall clock time (r0)
-                      (uint32)    Lower 32 bits of wall clock time (r1)
-                      (uint32)    Upper 32 bits of counter (r2)
-                      (uint32)    Lower 32 bits of counter (r3)
-    Endianness:                   No Restrictions.
-    ==============    ========    =====================================
++---------------------+-------------------------------------------------------+
+| Presence:           | Optional                                              |
++---------------------+-------------------------------------------------------+
+| Calling convention: | HVC32                                                 |
++---------------------+----------+--------------------------------------------+
+| Function ID:        | (uint32) | 0x86000001                                 |
++---------------------+----------+----+---------------------------------------+
+| Arguments:          | (uint32) | R1 | ``KVM_PTP_VIRT_COUNTER (0)``          |
+|                     |          |    +---------------------------------------+
+|                     |          |    | ``KVM_PTP_PHYS_COUNTER (1)``          |
++---------------------+----------+----+---------------------------------------+
+| Return Values:      | (int32)  | R0 | ``NOT_SUPPORTED (-1)`` on error, else |
+|                     |          |    | upper 32 bits of wall clock time      |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R1 | Lower 32 bits of wall clock time      |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R2 | Upper 32 bits of counter              |
+|                     +----------+----+---------------------------------------+
+|                     | (uint32) | R3 | Lower 32 bits of counter              |
++---------------------+----------+----+---------------------------------------+
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 29/31] KVM: arm64: Rename firmware pseudo-register documentation file
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (27 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 28/31] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 30/31] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

In preparation for describing the guest view of KVM/arm64 hypercalls in
hypercalls.rst, move the existing contents of the file concerning the
firmware pseudo-registers elsewhere.

Cc: Raghavendra Rao Ananta <rananta@google.com>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 .../kvm/arm/{hypercalls.rst => fw-pseudo-registers.rst}     | 6 +++---
 Documentation/virt/kvm/arm/index.rst                        | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)
 rename Documentation/virt/kvm/arm/{hypercalls.rst => fw-pseudo-registers.rst} (97%)

diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/fw-pseudo-registers.rst
similarity index 97%
rename from Documentation/virt/kvm/arm/hypercalls.rst
rename to Documentation/virt/kvm/arm/fw-pseudo-registers.rst
index 3e23084644ba..b90fd0b0fa66 100644
--- a/Documentation/virt/kvm/arm/hypercalls.rst
+++ b/Documentation/virt/kvm/arm/fw-pseudo-registers.rst
@@ -1,8 +1,8 @@
 .. SPDX-License-Identifier: GPL-2.0
 
-=======================
-ARM Hypercall Interface
-=======================
+=======================================
+ARM firmware pseudo-registers interface
+=======================================
 
 KVM handles the hypercall services as requested by the guests. New hypercall
 services are regularly made available by the ARM specification or by KVM (as
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
index 7f231c724e16..d28d65122290 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -7,8 +7,8 @@ ARM
 .. toctree::
    :maxdepth: 2
 
+   fw-pseudo-registers
    hyp-abi
-   hypercalls
    pvtime
    ptp_kvm
    vcpu-features
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 30/31] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (28 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 29/31] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19  7:59 ` [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
  30 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Will Deacon <will@kernel.org>

KVM/arm64 makes use of the SMCCC "Vendor Specific Hypervisor Service
Call Range" to expose KVM-specific hypercalls to guests in a
discoverable and extensible fashion.

Document the existence of this interface and the discovery hypercall.
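
For reference, a guest could probe the bitmaps roughly as the Linux
guest code does (sketch only; the preceding "Call UID" check and error
handling are elided):

  #include <linux/arm-smccc.h>

  struct arm_smccc_res res;

  /* Returns the bitmaps of available function numbers in R0-R3 */
  arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID, &res);

  if (res.a0 & BIT(ARM_SMCCC_KVM_FUNC_PTP))
          /* the PTP hypercall described in ptp_kvm.rst is available */;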

Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/arm/hypercalls.rst | 46 +++++++++++++++++++++++
 Documentation/virt/kvm/arm/index.rst      |  1 +
 2 files changed, 47 insertions(+)
 create mode 100644 Documentation/virt/kvm/arm/hypercalls.rst

diff --git a/Documentation/virt/kvm/arm/hypercalls.rst b/Documentation/virt/kvm/arm/hypercalls.rst
new file mode 100644
index 000000000000..17be111f493f
--- /dev/null
+++ b/Documentation/virt/kvm/arm/hypercalls.rst
@@ -0,0 +1,46 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===============================================
+KVM/arm64-specific hypercalls exposed to guests
+===============================================
+
+This file documents the KVM/arm64-specific hypercalls which may be
+exposed by KVM/arm64 to guest operating systems. These hypercalls are
+issued using the HVC instruction according to version 1.1 of the Arm SMC
+Calling Convention (DEN0028/C):
+
+https://developer.arm.com/docs/den0028/c
+
+All KVM/arm64-specific hypercalls are allocated within the "Vendor
+Specific Hypervisor Service Call" range with a UID of
+``28b46fb6-2ec5-11e9-a9ca-4b564d003a74``. This UID should be queried by the
+guest using the standard "Call UID" function for the service range in
+order to determine that the KVM/arm64-specific hypercalls are available.
+
+``ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID``
+---------------------------------------------
+
+Provides a discovery mechanism for other KVM/arm64 hypercalls.
+
++---------------------+-------------------------------------------------------------+
+| Presence:           | Mandatory for the KVM/arm64 UID                             |
++---------------------+-------------------------------------------------------------+
+| Calling convention: | HVC32                                                       |
++---------------------+----------+--------------------------------------------------+
+| Function ID:        | (uint32) | 0x86000000                                       |
++---------------------+----------+--------------------------------------------------+
+| Arguments:          | None                                                        |
++---------------------+----------+----+---------------------------------------------+
+| Return Values:      | (uint32) | R0 | Bitmap of available function numbers 0-31   |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R1 | Bitmap of available function numbers 32-63  |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R2 | Bitmap of available function numbers 64-95  |
+|                     +----------+----+---------------------------------------------+
+|                     | (uint32) | R3 | Bitmap of available function numbers 96-127 |
++---------------------+----------+----+---------------------------------------------+
+
+``ARM_SMCCC_VENDOR_HYP_KVM_PTP_FUNC_ID``
+----------------------------------------
+
+See ptp_kvm.rst
diff --git a/Documentation/virt/kvm/arm/index.rst b/Documentation/virt/kvm/arm/index.rst
index d28d65122290..ec09881de4cf 100644
--- a/Documentation/virt/kvm/arm/index.rst
+++ b/Documentation/virt/kvm/arm/index.rst
@@ -9,6 +9,7 @@ ARM
 
    fw-pseudo-registers
    hyp-abi
+   hypercalls
    pvtime
    ptp_kvm
    vcpu-features
-- 
2.44.0.769.g3c40516874-goog
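
As an aside, guest discovery of this interface typically looks something
like the sketch below (modelled loosely on the in-kernel guest-side
probing; the helper name is illustrative and not part of this patch):

#include <linux/arm-smccc.h>

/* Illustrative only: probe the KVM vendor UID, then the feature bitmap. */
static bool kvm_arm64_hyp_services_available(void)
{
	struct arm_smccc_res res;

	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_CALL_UID_FUNC_ID, &res);
	if (res.a0 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_0 ||
	    res.a1 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_1 ||
	    res.a2 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_2 ||
	    res.a3 != ARM_SMCCC_VENDOR_HYP_UID_KVM_REG_3)
		return false;

	/* R0-R3 carry the bitmap of available function numbers 0-127. */
	arm_smccc_1_1_invoke(ARM_SMCCC_VENDOR_HYP_KVM_FEATURES_FUNC_ID, &res);
	return res.a0 || res.a1 || res.a2 || res.a3;
}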


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit
  2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
                   ` (29 preceding siblings ...)
  2024-04-19  7:59 ` [PATCH v3 30/31] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
@ 2024-04-19  7:59 ` Fuad Tabba
  2024-04-19 20:28   ` Oliver Upton
  30 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-19  7:59 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, qperret, tabba, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	oliver.upton, mark.rutland, broonie, joey.gouly, rananta,
	smostafa

From: Marc Zyngier <maz@kernel.org>

If a vcpu exits for a data abort with an invalid syndrome, the
expectations are that userspace has a chance to save the day if
it has requested to see such exits.

However, this is completely futile in the case of a protected VM,
as none of the state is available. In this particular case, inject
a data abort directly into the vcpu, consistent with what userspace
could do.

This also helps with pKVM, which discards all syndrome information when
forwarding data aborts that are not known to be MMIO.

Finally, hide the RETURN_NISV_IO_ABORT_TO_USER cap from userspace on
protected VMs, and document this tweak to the API.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/api.rst |  7 +++++++
 arch/arm64/kvm/arm.c           | 14 ++++++++++----
 arch/arm64/kvm/mmio.c          |  8 ++++++++
 3 files changed, 25 insertions(+), 4 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 0b5a33ee71ee..b11b70ae137e 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6894,6 +6894,13 @@ Note that KVM does not skip the faulting instruction as it does for
 KVM_EXIT_MMIO, but userspace has to emulate any change to the processing state
 if it decides to decode and emulate the instruction.
 
+This feature isn't available to protected VMs, as userspace does not
+have access to the state that is required to perform the emulation.
+Instead, a data abort exception is directly injected into the guest.
+Note that although KVM_CAP_ARM_NISV_TO_USER will be reported if
+queried outside of a protected VM context, the feature will not be
+exposed if queried on a protected VM file descriptor.
+
 ::
 
 		/* KVM_EXIT_X86_RDMSR / KVM_EXIT_X86_WRMSR */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 66301131d5a9..750386a84968 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -80,9 +80,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 
 	switch (cap->cap) {
 	case KVM_CAP_ARM_NISV_TO_USER:
-		r = 0;
-		set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
-			&kvm->arch.flags);
+		if (kvm_vm_is_protected(kvm)) {
+			r = -EINVAL;
+		} else {
+			r = 0;
+			set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
+				&kvm->arch.flags);
+		}
 		break;
 	case KVM_CAP_ARM_MTE:
 		mutex_lock(&kvm->lock);
@@ -237,7 +241,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_IMMEDIATE_EXIT:
 	case KVM_CAP_VCPU_EVENTS:
 	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
-	case KVM_CAP_ARM_NISV_TO_USER:
 	case KVM_CAP_ARM_INJECT_EXT_DABT:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
@@ -247,6 +250,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_COUNTER_OFFSET:
 		r = 1;
 		break;
+	case KVM_CAP_ARM_NISV_TO_USER:
+		r = !kvm || !kvm_vm_is_protected(kvm);
+		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
 		return KVM_GUESTDBG_VALID_MASK;
 	case KVM_CAP_ARM_SET_DEVICE_ADDR:
diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
index 5e1ffb0d5363..75e1072948cd 100644
--- a/arch/arm64/kvm/mmio.c
+++ b/arch/arm64/kvm/mmio.c
@@ -133,11 +133,19 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
 	/*
 	 * No valid syndrome? Ask userspace for help if it has
 	 * volunteered to do so, and bail out otherwise.
+	 *
+	 * In the protected VM case, there isn't much userspace can do
+	 * though, so directly deliver an exception to the guest.
 	 */
 	if (!kvm_vcpu_dabt_isvalid(vcpu)) {
 		trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
 				    kvm_vcpu_get_hfar(vcpu), fault_ipa);
 
+		if (is_protected_kvm_enabled() && vcpu_is_protected(vcpu)) {
+			kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+			return 1;
+		}
+
 		if (test_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
 			     &vcpu->kvm->arch.flags)) {
 			run->exit_reason = KVM_EXIT_ARM_NISV;
-- 
2.44.0.769.g3c40516874-goog


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit
  2024-04-19  7:59 ` [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
@ 2024-04-19 20:28   ` Oliver Upton
  2024-04-22  8:07     ` Fuad Tabba
  0 siblings, 1 reply; 44+ messages in thread
From: Oliver Upton @ 2024-04-19 20:28 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Fri, Apr 19, 2024 at 08:59:41AM +0100, Fuad Tabba wrote:
> From: Marc Zyngier <maz@kernel.org>
> 
> If a vcpu exits for a data abort with an invalid syndrome, the
> expectations are that userspace has a chance to save the day if
> it has requested to see such exits.
> 
> However, this is completely futile in the case of a protected VM,
> as none of the state is available. In this particular case, inject
> a data abort directly into the vcpu, consistent with what userspace
> could do.
> 
> This also helps with pKVM, which discards all syndrome information when
> forwarding data aborts that are not known to be MMIO.
> 
> Finally, hide the RETURN_NISV_IO_ABORT_TO_USER cap from userspace on
> protected VMs, and document this tweak to the API.

Are there going to be more KVM caps that we hide from protected VMs in
the future? If so, it might be better to have a common helper that
enforces a denylist of capabilities not supported by pKVM.

That way we can avoid adding kvm_vm_is_protected() checks all over the
shop.
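
Something along the lines of the sketch below is what I have in mind
(name and placement purely illustrative):

/* Illustrative: one place to deny caps that make no sense for pKVM VMs. */
static bool kvm_vm_cap_denied_for_pkvm(struct kvm *kvm, long cap)
{
	if (!kvm || !kvm_vm_is_protected(kvm))
		return false;

	switch (cap) {
	case KVM_CAP_ARM_NISV_TO_USER:
		return true;
	default:
		return false;
	}
}

Both kvm_vm_ioctl_enable_cap() and kvm_vm_ioctl_check_extension() could
then consult it instead of open-coding the check per capability.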

> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 66301131d5a9..750386a84968 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -80,9 +80,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
>  
>  	switch (cap->cap) {
>  	case KVM_CAP_ARM_NISV_TO_USER:
> -		r = 0;
> -		set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
> -			&kvm->arch.flags);
> +		if (kvm_vm_is_protected(kvm)) {
> +			r = -EINVAL;
> +		} else {
> +			r = 0;
> +			set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
> +				&kvm->arch.flags);
> +		}

nitpick: initialize r = -EINVAL and get rid of all the error-path
initializations in kvm_vm_ioctl_enable_cap()
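
i.e. roughly (other cases elided):

	int r = -EINVAL;

	switch (cap->cap) {
	case KVM_CAP_ARM_NISV_TO_USER:
		if (!kvm_vm_is_protected(kvm)) {
			r = 0;
			set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
				&kvm->arch.flags);
		}
		break;
	default:
		/* remaining capabilities keep their existing handling */
		break;
	}

	return r;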

>  		break;
>  	case KVM_CAP_ARM_MTE:
>  		mutex_lock(&kvm->lock);
> @@ -237,7 +241,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_IMMEDIATE_EXIT:
>  	case KVM_CAP_VCPU_EVENTS:
>  	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
> -	case KVM_CAP_ARM_NISV_TO_USER:
>  	case KVM_CAP_ARM_INJECT_EXT_DABT:
>  	case KVM_CAP_SET_GUEST_DEBUG:
>  	case KVM_CAP_VCPU_ATTRIBUTES:
> @@ -247,6 +250,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
>  	case KVM_CAP_COUNTER_OFFSET:
>  		r = 1;
>  		break;
> +	case KVM_CAP_ARM_NISV_TO_USER:
> +		r = !kvm || !kvm_vm_is_protected(kvm);
> +		break;
>  	case KVM_CAP_SET_GUEST_DEBUG2:
>  		return KVM_GUESTDBG_VALID_MASK;
>  	case KVM_CAP_ARM_SET_DEVICE_ADDR:
> diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> index 5e1ffb0d5363..75e1072948cd 100644
> --- a/arch/arm64/kvm/mmio.c
> +++ b/arch/arm64/kvm/mmio.c
> @@ -133,11 +133,19 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
>  	/*
>  	 * No valid syndrome? Ask userspace for help if it has
>  	 * volunteered to do so, and bail out otherwise.
> +	 *
> +	 * In the protected VM case, there isn't much userspace can do
> +	 * though, so directly deliver an exception to the guest.
>  	 */
>  	if (!kvm_vcpu_dabt_isvalid(vcpu)) {
>  		trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
>  				    kvm_vcpu_get_hfar(vcpu), fault_ipa);
>  
> +		if (is_protected_kvm_enabled() && vcpu_is_protected(vcpu)) {

You can drop the is_protected_kvm_enabled() now that it got folded into
kvm_vm_is_protected().

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths
  2024-04-19  7:59 ` [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
@ 2024-04-19 20:42   ` Oliver Upton
  2024-04-22  8:09     ` Fuad Tabba
  0 siblings, 1 reply; 44+ messages in thread
From: Oliver Upton @ 2024-04-19 20:42 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Fri, Apr 19, 2024 at 08:59:21AM +0100, Fuad Tabba wrote:
> From: Will Deacon <will@kernel.org>
> 
> hyp_{get,put}_page() are called extensively from the page-table code
> to adjust reference counts on page-table pages. As a small step towards
> removing reader serialisation on these paths, drop the 'hyp_pool' lock
> in the case where the refcount remains positive, only taking the lock
> if the page is to be freed back to the allocator.

Surely the ordering of these patches is wrong... The refcount accessors
are non-atomic until patch 27.

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-19  7:59 ` [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
@ 2024-04-19 20:52   ` Oliver Upton
  2024-04-22  8:10     ` Fuad Tabba
  0 siblings, 1 reply; 44+ messages in thread
From: Oliver Upton @ 2024-04-19 20:52 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Fri, Apr 19, 2024 at 08:59:37AM +0100, Fuad Tabba wrote:
> From: Will Deacon <will@kernel.org>
> 
> Convert the 'struct hyp_page' refcount manipulation functions over to
> the new atomic refcount helpers. For now, this will make absolutely no
> difference because the 'struct hyp_pool' locking is still serialising
> everything. One step at a time...
> 
> Signed-off-by: Will Deacon <will@kernel.org>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 +++++++-----------
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c    |  2 +-
>  arch/arm64/kvm/hyp/nvhe/page_alloc.c     |  5 ++++-
>  3 files changed, 12 insertions(+), 13 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> index ab205c4d6774..74474c82667b 100644
> --- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
> +++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> @@ -6,6 +6,7 @@
>  #include <asm/page.h>
>  
>  #include <linux/types.h>
> +#include <nvhe/refcount.h>
>  
>  struct hyp_page {
>  	unsigned short refcount;
> @@ -39,37 +40,32 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
>  #define hyp_page_to_pool(page)	(((struct hyp_page *)page)->pool)
>  
>  /*
> - * Refcounting for 'struct hyp_page'.
> - * hyp_pool::lock must be held if atomic access to the refcount is required.
> + * Refcounting wrappers for 'struct hyp_page'.
>   */
>  static inline int hyp_page_count(void *addr)
>  {
>  	struct hyp_page *p = hyp_virt_to_page(addr);
>  
> -	return p->refcount;
> +	return hyp_refcount_get(p->refcount);
>  }
>  
>  static inline void hyp_page_ref_inc(struct hyp_page *p)
>  {
> -	BUG_ON(p->refcount == USHRT_MAX);
> -	p->refcount++;
> +	hyp_refcount_inc(p->refcount);

Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
p->refcount was 0) would be nice, especially since we're past the point of
serializing everything and you can theoretically have a zero count page
outside of the free list.

Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
allocation path.
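
e.g. (a sketch on top of the new refcount helpers from this series):

static inline void hyp_page_ref_inc(struct hyp_page *p)
{
	/* Incrementing from zero means nobody legitimately owns the page. */
	BUG_ON(!hyp_refcount_get(p->refcount));
	hyp_refcount_inc(p->refcount);
}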

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context
  2024-04-19  7:59 ` [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
@ 2024-04-19 20:54   ` Oliver Upton
  2024-04-22  8:11     ` Fuad Tabba
  0 siblings, 1 reply; 44+ messages in thread
From: Oliver Upton @ 2024-04-19 20:54 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Fri, Apr 19, 2024 at 08:59:19AM +0100, Fuad Tabba wrote:
> From: Will Deacon <will@kernel.org>
> 
> Typically, TLB invalidation of guest stage-2 mappings using nVHE is
> performed by a hypercall originating from the host. For the invalidation
> instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
> swizzle the active stage-2 context around the TLBI instruction.
> 
> With guest-to-host memory sharing and unsharing hypercalls
> originating from the guest under pKVM, there is need to support
> both guest and host VMID invalidations issued from guest context.
> 
> Replace the __tlb_switch_to_{guest,host}() functions with a more general
> {enter,exit}_vmid_context() implementation which supports being invoked
> from guest context and acts as a no-op if the target context matches the
> running context.

I'd rather not introduce unnecessary asymmetry between the nVHE and VHE
TLB invalidations, the code duplication is annoying enough as is. Can
you rename to {enter,exit}_vmid_context() on the VHE side as well?

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit
  2024-04-19 20:28   ` Oliver Upton
@ 2024-04-22  8:07     ` Fuad Tabba
  0 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-22  8:07 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

Hi Oliver,

On Fri, Apr 19, 2024 at 9:28 PM Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Fri, Apr 19, 2024 at 08:59:41AM +0100, Fuad Tabba wrote:
> > From: Marc Zyngier <maz@kernel.org>
> >
> > If a vcpu exits for a data abort with an invalid syndrome, the
> > expectations are that userspace has a chance to save the day if
> > it has requested to see such exits.
> >
> > However, this is completely futile in the case of a protected VM,
> > as none of the state is available. In this particular case, inject
> > a data abort directly into the vcpu, consistent with what userspace
> > could do.
> >
> > This also helps with pKVM, which discards all syndrome information when
> > forwarding data aborts that are not known to be MMIO.
> >
> > Finally, hide the RETURN_NISV_IO_ABORT_TO_USER cap from userspace on
> > protected VMs, and document this tweak to the API.
>
> Are there going to be more KVM caps that we hide from protected VMs in
> the future? If so, it might be better to have a common helper that
> enforces a denylist of capabilities not supported by pKVM.
>
> That way we can avoid adding kvm_vm_is_protected() checks all over the
> shop.

Yes, there will be. I'll add a patch to do that and rework this one.

> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 66301131d5a9..750386a84968 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -80,9 +80,13 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> >
> >       switch (cap->cap) {
> >       case KVM_CAP_ARM_NISV_TO_USER:
> > -             r = 0;
> > -             set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
> > -                     &kvm->arch.flags);
> > +             if (kvm_vm_is_protected(kvm)) {
> > +                     r = -EINVAL;
> > +             } else {
> > +                     r = 0;
> > +                     set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER,
> > +                             &kvm->arch.flags);
> > +             }
>
> nitpick: initialize r = -EINVAL and get rid of all the error-path
> initializations in kvm_vm_ioctl_enable_cap()

Ack.

> >               break;
> >       case KVM_CAP_ARM_MTE:
> >               mutex_lock(&kvm->lock);
> > @@ -237,7 +241,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >       case KVM_CAP_IMMEDIATE_EXIT:
> >       case KVM_CAP_VCPU_EVENTS:
> >       case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
> > -     case KVM_CAP_ARM_NISV_TO_USER:
> >       case KVM_CAP_ARM_INJECT_EXT_DABT:
> >       case KVM_CAP_SET_GUEST_DEBUG:
> >       case KVM_CAP_VCPU_ATTRIBUTES:
> > @@ -247,6 +250,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> >       case KVM_CAP_COUNTER_OFFSET:
> >               r = 1;
> >               break;
> > +     case KVM_CAP_ARM_NISV_TO_USER:
> > +             r = !kvm || !kvm_vm_is_protected(kvm);
> > +             break;
> >       case KVM_CAP_SET_GUEST_DEBUG2:
> >               return KVM_GUESTDBG_VALID_MASK;
> >       case KVM_CAP_ARM_SET_DEVICE_ADDR:
> > diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c
> > index 5e1ffb0d5363..75e1072948cd 100644
> > --- a/arch/arm64/kvm/mmio.c
> > +++ b/arch/arm64/kvm/mmio.c
> > @@ -133,11 +133,19 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa)
> >       /*
> >        * No valid syndrome? Ask userspace for help if it has
> >        * volunteered to do so, and bail out otherwise.
> > +      *
> > +      * In the protected VM case, there isn't much userspace can do
> > +      * though, so directly deliver an exception to the guest.
> >        */
> >       if (!kvm_vcpu_dabt_isvalid(vcpu)) {
> >               trace_kvm_mmio_nisv(*vcpu_pc(vcpu), kvm_vcpu_get_esr(vcpu),
> >                                   kvm_vcpu_get_hfar(vcpu), fault_ipa);
> >
> > +             if (is_protected_kvm_enabled() && vcpu_is_protected(vcpu)) {
>
> You can drop the is_protected_kvm_enabled() now that it got folded into
> kvm_vm_is_protected().

Ack.

Thank you!
/fuad

> --
> Thanks,
> Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths
  2024-04-19 20:42   ` Oliver Upton
@ 2024-04-22  8:09     ` Fuad Tabba
  0 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-22  8:09 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

Hi Oliver,

On Fri, Apr 19, 2024 at 9:42 PM Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Fri, Apr 19, 2024 at 08:59:21AM +0100, Fuad Tabba wrote:
> > From: Will Deacon <will@kernel.org>
> >
> > hyp_{get,put}_page() are called extensively from the page-table code
> > to adjust reference counts on page-table pages. As a small step towards
> > removing reader serialisation on these paths, drop the 'hyp_pool' lock
> > in the case where the refcount remains positive, only taking the lock
> > if the page is to be freed back to the allocator.
>
> Surely the ordering of these patches is wrong... The refcount accessors
> are non-atomic until patch 27.

Yup. This is the patch sender's fault (i.e., mine), not the author's.
I'll fix this on the respin.

Thanks!
/fuad

> --
> Thanks,
> Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-19 20:52   ` Oliver Upton
@ 2024-04-22  8:10     ` Fuad Tabba
  2024-04-22 13:08       ` Fuad Tabba
  0 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-22  8:10 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Fri, Apr 19, 2024 at 9:52 PM Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Fri, Apr 19, 2024 at 08:59:37AM +0100, Fuad Tabba wrote:
> > From: Will Deacon <will@kernel.org>
> >
> > Convert the 'struct hyp_page' refcount manipulation functions over to
> > the new atomic refcount helpers. For now, this will make absolutely no
> > difference because the 'struct hyp_pool' locking is still serialising
> > everything. One step at a time...
> >
> > Signed-off-by: Will Deacon <will@kernel.org>
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 +++++++-----------
> >  arch/arm64/kvm/hyp/nvhe/mem_protect.c    |  2 +-
> >  arch/arm64/kvm/hyp/nvhe/page_alloc.c     |  5 ++++-
> >  3 files changed, 12 insertions(+), 13 deletions(-)
> >
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > index ab205c4d6774..74474c82667b 100644
> > --- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > @@ -6,6 +6,7 @@
> >  #include <asm/page.h>
> >
> >  #include <linux/types.h>
> > +#include <nvhe/refcount.h>
> >
> >  struct hyp_page {
> >       unsigned short refcount;
> > @@ -39,37 +40,32 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
> >  #define hyp_page_to_pool(page)       (((struct hyp_page *)page)->pool)
> >
> >  /*
> > - * Refcounting for 'struct hyp_page'.
> > - * hyp_pool::lock must be held if atomic access to the refcount is required.
> > + * Refcounting wrappers for 'struct hyp_page'.
> >   */
> >  static inline int hyp_page_count(void *addr)
> >  {
> >       struct hyp_page *p = hyp_virt_to_page(addr);
> >
> > -     return p->refcount;
> > +     return hyp_refcount_get(p->refcount);
> >  }
> >
> >  static inline void hyp_page_ref_inc(struct hyp_page *p)
> >  {
> > -     BUG_ON(p->refcount == USHRT_MAX);
> > -     p->refcount++;
> > +     hyp_refcount_inc(p->refcount);
>
> Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
> p->refcount was 0) would be nice, especially since we're past the point of
> serializing everything and you can theoretically have a zero count page
> outside of the free list.
>
> Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
> allocation path.

Will do.

Thanks,
/fuad
>
> --
> Thanks,
> Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context
  2024-04-19 20:54   ` Oliver Upton
@ 2024-04-22  8:11     ` Fuad Tabba
  0 siblings, 0 replies; 44+ messages in thread
From: Fuad Tabba @ 2024-04-22  8:11 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

Hi Oliver,

On Fri, Apr 19, 2024 at 9:54 PM Oliver Upton <oliver.upton@linux.dev> wrote:
>
> On Fri, Apr 19, 2024 at 08:59:19AM +0100, Fuad Tabba wrote:
> > From: Will Deacon <will@kernel.org>
> >
> > Typically, TLB invalidation of guest stage-2 mappings using nVHE is
> > performed by a hypercall originating from the host. For the invalidation
> > instruction to be effective, therefore, __tlb_switch_to_{guest,host}()
> > swizzle the active stage-2 context around the TLBI instruction.
> >
> > With guest-to-host memory sharing and unsharing hypercalls
> > originating from the guest under pKVM, there is need to support
> > both guest and host VMID invalidations issued from guest context.
> >
> > Replace the __tlb_switch_to_{guest,host}() functions with a more general
> > {enter,exit}_vmid_context() implementation which supports being invoked
> > from guest context and acts as a no-op if the target context matches the
> > running context.
>
> I'd rather not introduce unnecessary asymmetry between the nVHE and VHE
> TLB invalidations, the code duplication is annoying enough as is. Can
> you rename to {enter,exit}_vmid_context() on the VHE side as well?

Will do.

Thanks for all the reviews!
/fuad

> --
> Thanks,
> Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-22  8:10     ` Fuad Tabba
@ 2024-04-22 13:08       ` Fuad Tabba
  2024-04-22 20:46         ` Oliver Upton
  0 siblings, 1 reply; 44+ messages in thread
From: Fuad Tabba @ 2024-04-22 13:08 UTC (permalink / raw)
  To: Oliver Upton
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

Hi Oliver,

On Mon, Apr 22, 2024 at 9:10 AM Fuad Tabba <tabba@google.com> wrote:
>
> On Fri, Apr 19, 2024 at 9:52 PM Oliver Upton <oliver.upton@linux.dev> wrote:
> >
> > On Fri, Apr 19, 2024 at 08:59:37AM +0100, Fuad Tabba wrote:
> > > From: Will Deacon <will@kernel.org>
> > >
> > > Convert the 'struct hyp_page' refcount manipulation functions over to
> > > the new atomic refcount helpers. For now, this will make absolutely no
> > > difference because the 'struct hyp_pool' locking is still serialising
> > > everything. One step at a time...
> > >
> > > Signed-off-by: Will Deacon <will@kernel.org>
> > > Signed-off-by: Fuad Tabba <tabba@google.com>
> > > ---
> > >  arch/arm64/kvm/hyp/include/nvhe/memory.h | 18 +++++++-----------
> > >  arch/arm64/kvm/hyp/nvhe/mem_protect.c    |  2 +-
> > >  arch/arm64/kvm/hyp/nvhe/page_alloc.c     |  5 ++++-
> > >  3 files changed, 12 insertions(+), 13 deletions(-)
> > >
> > > diff --git a/arch/arm64/kvm/hyp/include/nvhe/memory.h b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > > index ab205c4d6774..74474c82667b 100644
> > > --- a/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > > +++ b/arch/arm64/kvm/hyp/include/nvhe/memory.h
> > > @@ -6,6 +6,7 @@
> > >  #include <asm/page.h>
> > >
> > >  #include <linux/types.h>
> > > +#include <nvhe/refcount.h>
> > >
> > >  struct hyp_page {
> > >       unsigned short refcount;
> > > @@ -39,37 +40,32 @@ static inline phys_addr_t hyp_virt_to_phys(void *addr)
> > >  #define hyp_page_to_pool(page)       (((struct hyp_page *)page)->pool)
> > >
> > >  /*
> > > - * Refcounting for 'struct hyp_page'.
> > > - * hyp_pool::lock must be held if atomic access to the refcount is required.
> > > + * Refcounting wrappers for 'struct hyp_page'.
> > >   */
> > >  static inline int hyp_page_count(void *addr)
> > >  {
> > >       struct hyp_page *p = hyp_virt_to_page(addr);
> > >
> > > -     return p->refcount;
> > > +     return hyp_refcount_get(p->refcount);
> > >  }
> > >
> > >  static inline void hyp_page_ref_inc(struct hyp_page *p)
> > >  {
> > > -     BUG_ON(p->refcount == USHRT_MAX);
> > > -     p->refcount++;
> > > +     hyp_refcount_inc(p->refcount);
> >
> > Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
> > p->refcount was 0) would be nice, especially since we're past the point of
> > serializing everything and you can theoretically have a zero count page
> > outside of the free list.
> >
> > Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
> > allocation path.

Actually, the refcount can be 0 without it being an error. For
example, when hyp pins memory shared with it by the host
(mem_protect.c:hyp_pin_shared_mem()).
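
Roughly, the pinning path just bumps the count on pages that
legitimately start at zero:

	/* Sketch of hyp_pin_shared_mem(): pages shared by the host arrive
	 * with refcount == 0 and are pinned by incrementing it. */
	for (cur = start; cur < end; cur += PAGE_SIZE)
		hyp_page_ref_inc(hyp_virt_to_page(cur));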

Cheers,
/fuad

> Will do.
>
> Thanks,
> /fuad
> >
> > --
> > Thanks,
> > Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-22 13:08       ` Fuad Tabba
@ 2024-04-22 20:46         ` Oliver Upton
  2024-04-22 23:44           ` Will Deacon
  0 siblings, 1 reply; 44+ messages in thread
From: Oliver Upton @ 2024-04-22 20:46 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Mon, Apr 22, 2024 at 02:08:17PM +0100, Fuad Tabba wrote:

[...]

> > > Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
> > > p->refcount was 0) would be nice, especially since we're past the point of
> > > serializing everything and you can theoretically have a zero count page
> > > outside of the free list.
> > >
> > > Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
> > > allocation path.
> 
> Actually, the refcount can be 0 without it being an error. For
> example, when hyp pins memory shared with it by the host
> (mem_protect.c:hyp_pin_shared_mem()).

Are those not by their very definition non-refcounted pages? I can't
imagine we'd want pages in a shared state with the host to ever get
returned to the hyp allocator. Seems an erroneous hyp_put_page() would
get you there, though.

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-22 20:46         ` Oliver Upton
@ 2024-04-22 23:44           ` Will Deacon
  2024-04-23  1:15             ` Oliver Upton
  0 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2024-04-22 23:44 UTC (permalink / raw)
  To: Oliver Upton
  Cc: Fuad Tabba, kvmarm, maz, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

Hi Oliver,

On Mon, Apr 22, 2024 at 01:46:14PM -0700, Oliver Upton wrote:
> On Mon, Apr 22, 2024 at 02:08:17PM +0100, Fuad Tabba wrote:
> 
> [...]
> 
> > > > Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
> > > > p->refcount was 0) would be nice, especially since we're past the point of
> > > > serializing everything and you can theoretically have a zero count page
> > > > outside of the free list.
> > > >
> > > > Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
> > > > allocation path.
> > 
> > Actually, the refcount can be 0 without it being an error. For
> > example, when hyp pins memory shared with it by the host
> > (mem_protect.c:hyp_pin_shared_mem()).
> 
> Are those not by their very definition non-refcounted pages?

Right, we're using the refcount for two things here: (1) so that the
allocator knows when to return the page to the pool and (2) so that the
hypervisor can transiently prevent a page which has been shared by the
host from being unshared. That second part is needed to e.g. prevent a
page holding a host vCPU structure being donated to a guest as normal
memory and then having the hypervisor write to it as a result of a host
hypercall. We use the refcount for this because the same page can be
shared with the hypervisor multiple times and we need to know when the
last host sharer has dropped its pin.

> I can't imagine we'd want pages in a shared state with the host to ever
> get returned to the hyp allocator. Seems an erroneous hyp_put_page() would
> get you there, though.

Given the dual-use above, I don't think a BUG_ON() on the refcount is
the right fix. Instead, we'd probably want a (cheap) mechanism to
differentiate pages in states (1) and (2). This could be a new flag in
'struct hyp_page' or perhaps we could be creative and set the 'order' to
HYP_NO_ORDER for pinned pages and then have a BUG() to check 'p->order'
against 'pool->max_order' in hyp_put_page().
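
Roughly (illustrative only):

	/* At pin time (e.g. hyp_pin_shared_mem()), mark the page as never
	 * belonging to the allocator. */
	hyp_virt_to_page(cur)->order = HYP_NO_ORDER;

	/* In hyp_put_page(), an erroneous put on a pinned page then trips: */
	BUG_ON(hyp_virt_to_page(addr)->order >= pool->max_order);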

What do you think?

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount'
  2024-04-22 23:44           ` Will Deacon
@ 2024-04-23  1:15             ` Oliver Upton
  0 siblings, 0 replies; 44+ messages in thread
From: Oliver Upton @ 2024-04-23  1:15 UTC (permalink / raw)
  To: Will Deacon
  Cc: Fuad Tabba, kvmarm, maz, qperret, seanjc, alexandru.elisei,
	catalin.marinas, philmd, james.morse, suzuki.poulose,
	mark.rutland, broonie, joey.gouly, rananta, smostafa

On Tue, Apr 23, 2024 at 12:44:32AM +0100, Will Deacon wrote:
> Hi Oliver,
> 
> On Mon, Apr 22, 2024 at 01:46:14PM -0700, Oliver Upton wrote:
> > On Mon, Apr 22, 2024 at 02:08:17PM +0100, Fuad Tabba wrote:
> > 
> > [...]
> > 
> > > > > Adding a BUG_ON() for taking a reference on a non-refcounted page (i.e.
> > > > > p->refcount was 0) would be nice, especially since we're past the point of
> > > > > serializing everything and you can theoretically have a zero count page
> > > > > outside of the free list.
> > > > >
> > > > > Seems like otherwise we'd actually hit the BUG_ON() in an unrelated
> > > > > allocation path.
> > > 
> > > Actually, the refcount can be 0 without it being an error. For
> > > example, when hyp pins memory shared with it by the host
> > > (mem_protect.c:hyp_pin_shared_mem()).
> > 
> > Are those not by their very definition non-refcounted pages?
> 
> Right, we're using the refcount for two things here: (1) so that the
> allocator knows when to return the page to the pool and (2) so that the
> hypervisor can transiently prevent a page which has been shared by the
> host from being unshared. That second part is needed to e.g. prevent a
> page holding a host vCPU structure being donated to a guest as normal
> memory and then having the hypervisor write to it as a result of a host
> hypercall. We use the refcount for this because the same page can be
> shared with the hypervisor multiple times and we need to know when the
> last host sharer has dropped its pin.

Ah, right. I don't have the muscle memory for the pKVM bits upstream
(even though I should), so I missed the very obvious refcount test in
hyp_ack_unshare().
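
(From memory, that's the check that refuses the unshare while hyp still
holds a pin, roughly:)

	/* hyp_ack_unshare(): a pinned page cannot be unshared by the host. */
	if (tx->initiator.id == PKVM_ID_HOST && hyp_page_count((void *)addr))
		return -EBUSY;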

So it is a refcount on the page state, be it allocated or shared. And
the 0 -> 1 transition on a shared page happens through an increment
rather than through an initializer, as it does for pages from the hyp pool.

> > I can't imagine we'd want pages in a shared state with the host to ever
> > get returned to the hyp allocator. Seems an erroneous hyp_put_page() would
> > get you there, though.
> 
> Given the dual-use above, I don't think a BUG_ON() on the refcount is
> the right fix. Instead, we'd probably want a (cheap) mechanism to
> differentiate pages in states (1) and (2). This could be a new flag in
> 'struct hyp_page' or perhaps we could be creative and set the 'order' to
> HYP_NO_ORDER for pinned pages and then have a BUG() to check 'p->order'
> against 'pool->max_order' in hyp_put_page().
> 
> What do you think?

Excellent idea. Having a way to disambiguate page states would be great,
all the better if it can (ab)use an existing field.

-- 
Thanks,
Oliver

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread

Thread overview: 44+ messages
2024-04-19  7:59 [PATCH v3 00/31] KVM: arm64: Preamble for pKVM Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 01/31] KVM: arm64: Initialize the kvm host data's fpsimd_state pointer in pKVM Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 02/31] KVM: arm64: Move guest_owns_fp_regs() to increase its scope Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 03/31] KVM: arm64: Refactor checks for FP state ownership Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 04/31] KVM: arm64: Do not re-initialize the KVM lock Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 05/31] KVM: arm64: Issue CMOs when tearing down guest s2 pages Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 06/31] KVM: arm64: Avoid BUG-ing from the host abort path Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 07/31] KVM: arm64: Check for PTE validity when checking for executable/cacheable Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 08/31] KVM: arm64: Avoid BBM when changing only s/w bits in Stage-2 PTE Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 09/31] KVM: arm64: Support TLB invalidation in guest context Fuad Tabba
2024-04-19 20:54   ` Oliver Upton
2024-04-22  8:11     ` Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 10/31] KVM: arm64: Do not map the host fpsimd state to hyp in pKVM Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 11/31] KVM: arm64: Remove locking from EL2 allocation fast-paths Fuad Tabba
2024-04-19 20:42   ` Oliver Upton
2024-04-22  8:09     ` Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 12/31] KVM: arm64: Prevent kmemleak from accessing .hyp.data Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 13/31] KVM: arm64: Fix comment for __pkvm_vcpu_init_traps() Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 14/31] KVM: arm64: Change kvm_handle_mmio_return() return polarity Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 15/31] KVM: arm64: Move setting the page as dirty out of the critical section Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 16/31] KVM: arm64: Simplify vgic-v3 hypercalls Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 17/31] KVM: arm64: Add is_pkvm_initialized() helper Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 18/31] KVM: arm64: Introduce and use predicates that check for protected VMs Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 19/31] KVM: arm64: Move pstate reset value definitions to kvm_arm.h Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 20/31] KVM: arm64: Clarify rationale for ZCR_EL1 value restored on guest exit Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 21/31] KVM: arm64: Refactor calculating SVE state size to use helpers Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 22/31] KVM: arm64: Move some kvm_psci functions to a shared header Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 23/31] KVM: arm64: Refactor reset_mpidr() to extract its computation Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 24/31] KVM: arm64: Refactor kvm_vcpu_enable_ptrauth() for hyp use Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 25/31] KVM: arm64: Introduce hyp_rwlock_t Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 26/31] KVM: arm64: Add atomics-based checking refcount implementation at EL2 Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 27/31] KVM: arm64: Use atomic refcount helpers for 'struct hyp_page::refcount' Fuad Tabba
2024-04-19 20:52   ` Oliver Upton
2024-04-22  8:10     ` Fuad Tabba
2024-04-22 13:08       ` Fuad Tabba
2024-04-22 20:46         ` Oliver Upton
2024-04-22 23:44           ` Will Deacon
2024-04-23  1:15             ` Oliver Upton
2024-04-19  7:59 ` [PATCH v3 28/31] KVM: arm64: Reformat/beautify PTP hypercall documentation Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 29/31] KVM: arm64: Rename firmware pseudo-register documentation file Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 30/31] KVM: arm64: Document the KVM/arm64-specific calls in hypercalls.rst Fuad Tabba
2024-04-19  7:59 ` [PATCH v3 31/31] KVM: arm64: Force injection of a data abort on NISV MMIO exit Fuad Tabba
2024-04-19 20:28   ` Oliver Upton
2024-04-22  8:07     ` Fuad Tabba
