* [PATCH v6 00/12] KVM: arm64: Fixed features for protected VMs
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Hi,

Changes since v5 [1]:
- Rebase on 5.15-rc2
- Include Marc's early exception handlers in the series
- Refactoring and fixes (Drew, Marc)

This patch series adds support for restricting CPU features for protected VMs
in KVM (pKVM). For more background, please refer to the previous series [1].

This series is based on 5.15-rc2. You can find the applied series here [2].

Cheers,
/fuad

[1] https://lore.kernel.org/kvmarm/20210827101609.2808181-1-tabba@google.com/

[2] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/el2_fixed_feature_v6

Fuad Tabba (9):
  KVM: arm64: Add missing FORCE prerequisite in Makefile
  KVM: arm64: Pass struct kvm to per-EC handlers
  KVM: arm64: Add missing field descriptor for MDCR_EL2
  KVM: arm64: Simplify masking out MTE in feature id reg
  KVM: arm64: Add handlers for protected VM System Registers
  KVM: arm64: Initialize trap registers for protected VMs
  KVM: arm64: Move sanitized copies of CPU features
  KVM: arm64: Trap access to pVM restricted features
  KVM: arm64: Handle protected guests at 32 bits

Marc Zyngier (3):
  KVM: arm64: Move __get_fault_info() and co into their own include file
  KVM: arm64: Don't include switch.h into nvhe/kvm-main.c
  KVM: arm64: Move early handlers to per-EC handlers

 arch/arm64/include/asm/kvm_arm.h           |   1 +
 arch/arm64/include/asm/kvm_asm.h           |   1 +
 arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
 arch/arm64/include/asm/kvm_host.h          |   2 +
 arch/arm64/include/asm/kvm_hyp.h           |   5 +
 arch/arm64/kvm/arm.c                       |  13 +
 arch/arm64/kvm/hyp/include/hyp/fault.h     |  75 ++++
 arch/arm64/kvm/hyp/include/hyp/switch.h    | 221 ++++-----
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h     |  14 +
 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
 arch/arm64/kvm/hyp/nvhe/Makefile           |   4 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c         |  12 +-
 arch/arm64/kvm/hyp/nvhe/mem_protect.c      |   8 +-
 arch/arm64/kvm/hyp/nvhe/pkvm.c             | 186 ++++++++
 arch/arm64/kvm/hyp/nvhe/switch.c           | 117 +++++
 arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 494 +++++++++++++++++++++
 arch/arm64/kvm/hyp/vhe/switch.c            |  17 +
 arch/arm64/kvm/sys_regs.c                  |  10 +-
 18 files changed, 1257 insertions(+), 146 deletions(-)
 create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
 create mode 100644 arch/arm64/kvm/hyp/include/hyp/fault.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c
 create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c


base-commit: e4e737bb5c170df6135a127739a9e6148ee3da82
-- 
2.33.0.464.g1972c5931b-goog


* [PATCH v6 01/12] KVM: arm64: Move __get_fault_info() and co into their own include file
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

From: Marc Zyngier <maz@kernel.org>

In order to avoid including the whole of the switching helpers
in unrelated files, move __get_fault_info() and its related helpers
into their own include file.
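
For illustration only (not part of the patch): a hyp object that needs the
fault decoding logic but none of the world-switch machinery can now include
the new header on its own. The function name below is made up for the
example; __get_fault_info() and struct kvm_vcpu_fault_info are the existing
ones.

#include <hyp/fault.h>

/*
 * Decode a stage-2 abort described by @esr into @fault. Returns false if
 * the IPA could not be resolved and the access should simply be replayed
 * in the guest.
 */
static bool example_decode_abort(u64 esr, struct kvm_vcpu_fault_info *fault)
{
	return __get_fault_info(esr, fault);
}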

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/fault.h  | 75 +++++++++++++++++++++++++
 arch/arm64/kvm/hyp/include/hyp/switch.h | 61 +-------------------
 arch/arm64/kvm/hyp/nvhe/mem_protect.c   |  2 +-
 3 files changed, 77 insertions(+), 61 deletions(-)
 create mode 100644 arch/arm64/kvm/hyp/include/hyp/fault.h

diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
new file mode 100644
index 000000000000..1b8a2dcd712f
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
@@ -0,0 +1,75 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2015 - ARM Ltd
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ */
+
+#ifndef __ARM64_KVM_HYP_FAULT_H__
+#define __ARM64_KVM_HYP_FAULT_H__
+
+#include <asm/kvm_asm.h>
+#include <asm/kvm_emulate.h>
+#include <asm/kvm_hyp.h>
+#include <asm/kvm_mmu.h>
+
+static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
+{
+	u64 par, tmp;
+
+	/*
+	 * Resolve the IPA the hard way using the guest VA.
+	 *
+	 * Stage-1 translation already validated the memory access
+	 * rights. As such, we can use the EL1 translation regime, and
+	 * don't have to distinguish between EL0 and EL1 access.
+	 *
+	 * We do need to save/restore PAR_EL1 though, as we haven't
+	 * saved the guest context yet, and we may return early...
+	 */
+	par = read_sysreg_par();
+	if (!__kvm_at("s1e1r", far))
+		tmp = read_sysreg_par();
+	else
+		tmp = SYS_PAR_EL1_F; /* back to the guest */
+	write_sysreg(par, par_el1);
+
+	if (unlikely(tmp & SYS_PAR_EL1_F))
+		return false; /* Translation failed, back to guest */
+
+	/* Convert PAR to HPFAR format */
+	*hpfar = PAR_TO_HPFAR(tmp);
+	return true;
+}
+
+static inline bool __get_fault_info(u64 esr, struct kvm_vcpu_fault_info *fault)
+{
+	u64 hpfar, far;
+
+	far = read_sysreg_el2(SYS_FAR);
+
+	/*
+	 * The HPFAR can be invalid if the stage 2 fault did not
+	 * happen during a stage 1 page table walk (the ESR_EL2.S1PTW
+	 * bit is clear) and one of the two following cases are true:
+	 *   1. The fault was due to a permission fault
+	 *   2. The processor carries errata 834220
+	 *
+	 * Therefore, for all non S1PTW faults where we either have a
+	 * permission fault or the errata workaround is enabled, we
+	 * resolve the IPA using the AT instruction.
+	 */
+	if (!(esr & ESR_ELx_S1PTW) &&
+	    (cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
+	     (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
+		if (!__translate_far_to_hpfar(far, &hpfar))
+			return false;
+	} else {
+		hpfar = read_sysreg(hpfar_el2);
+	}
+
+	fault->far_el2 = far;
+	fault->hpfar_el2 = hpfar;
+	return true;
+}
+
+#endif
diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index a0e78a6027be..54abc8298ec3 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -8,6 +8,7 @@
 #define __ARM64_KVM_HYP_SWITCH_H__
 
 #include <hyp/adjust_pc.h>
+#include <hyp/fault.h>
 
 #include <linux/arm-smccc.h>
 #include <linux/kvm_host.h>
@@ -133,66 +134,6 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
 	}
 }
 
-static inline bool __translate_far_to_hpfar(u64 far, u64 *hpfar)
-{
-	u64 par, tmp;
-
-	/*
-	 * Resolve the IPA the hard way using the guest VA.
-	 *
-	 * Stage-1 translation already validated the memory access
-	 * rights. As such, we can use the EL1 translation regime, and
-	 * don't have to distinguish between EL0 and EL1 access.
-	 *
-	 * We do need to save/restore PAR_EL1 though, as we haven't
-	 * saved the guest context yet, and we may return early...
-	 */
-	par = read_sysreg_par();
-	if (!__kvm_at("s1e1r", far))
-		tmp = read_sysreg_par();
-	else
-		tmp = SYS_PAR_EL1_F; /* back to the guest */
-	write_sysreg(par, par_el1);
-
-	if (unlikely(tmp & SYS_PAR_EL1_F))
-		return false; /* Translation failed, back to guest */
-
-	/* Convert PAR to HPFAR format */
-	*hpfar = PAR_TO_HPFAR(tmp);
-	return true;
-}
-
-static inline bool __get_fault_info(u64 esr, struct kvm_vcpu_fault_info *fault)
-{
-	u64 hpfar, far;
-
-	far = read_sysreg_el2(SYS_FAR);
-
-	/*
-	 * The HPFAR can be invalid if the stage 2 fault did not
-	 * happen during a stage 1 page table walk (the ESR_EL2.S1PTW
-	 * bit is clear) and one of the two following cases are true:
-	 *   1. The fault was due to a permission fault
-	 *   2. The processor carries errata 834220
-	 *
-	 * Therefore, for all non S1PTW faults where we either have a
-	 * permission fault or the errata workaround is enabled, we
-	 * resolve the IPA using the AT instruction.
-	 */
-	if (!(esr & ESR_ELx_S1PTW) &&
-	    (cpus_have_final_cap(ARM64_WORKAROUND_834220) ||
-	     (esr & ESR_ELx_FSC_TYPE) == FSC_PERM)) {
-		if (!__translate_far_to_hpfar(far, &hpfar))
-			return false;
-	} else {
-		hpfar = read_sysreg(hpfar_el2);
-	}
-
-	fault->far_el2 = far;
-	fault->hpfar_el2 = hpfar;
-	return true;
-}
-
 static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
 {
 	u8 ec;
diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index bacd493a4eac..2a07d63b8498 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -11,7 +11,7 @@
 #include <asm/kvm_pgtable.h>
 #include <asm/stage2_pgtable.h>
 
-#include <hyp/switch.h>
+#include <hyp/fault.h>
 
 #include <nvhe/gfp.h>
 #include <nvhe/memory.h>
-- 
2.33.0.464.g1972c5931b-goog


* [PATCH v6 02/12] KVM: arm64: Don't include switch.h into nvhe/kvm-main.c
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

From: Marc Zyngier <maz@kernel.org>

hyp-main.c includes switch.h while it only requires adjust_pc.h.
Fix the include to remove this unnecessary dependency.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/hyp-main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 2da6aa8da868..8ca1104f4774 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -4,7 +4,7 @@
  * Author: Andrew Scull <ascull@google.com>
  */
 
-#include <hyp/switch.h>
+#include <hyp/adjust_pc.h>
 
 #include <asm/pgtable-types.h>
 #include <asm/kvm_asm.h>
-- 
2.33.0.464.g1972c5931b-goog


* [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

From: Marc Zyngier <maz@kernel.org>

Simplify the early exception handling by slicing the gigantic decoding
tree into a more manageable set of functions, similar to what we have
in handle_exit.c.

This will also make the structure reusable for pKVM's own early exit
handling.
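
As a rough sketch of that reuse (not part of this patch), a protected-VM
specific table could override only the exception classes pKVM cares about
and keep the generic entries for the rest. The pvm_* names below are
hypothetical; kvm_vm_is_protected() already exists, and the
kvm_hyp_handle_*() entries are the handlers this patch introduces.

static const exit_handler_fn pvm_exit_handlers[] = {
	[0 ... ESR_ELx_EC_MAX]		= NULL,
	[ESR_ELx_EC_SYS64]		= pvm_handle_sysreg,	/* hypothetical */
	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
};

/* The selection hook then picks a table based on the guest type. */
static const exit_handler_fn *pvm_get_exit_handler_array(struct kvm *kvm)
{
	if (kvm_vm_is_protected(kvm))
		return pvm_exit_handlers;

	return hyp_exit_handlers;
}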

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 160 ++++++++++++++----------
 arch/arm64/kvm/hyp/nvhe/switch.c        |  17 +++
 arch/arm64/kvm/hyp/vhe/switch.c         |  17 +++
 3 files changed, 126 insertions(+), 68 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 54abc8298ec3..0397606c0951 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -136,16 +136,7 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
 
 static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u8 ec;
-	u64 esr;
-
-	esr = vcpu->arch.fault.esr_el2;
-	ec = ESR_ELx_EC(esr);
-
-	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-		return true;
-
-	return __get_fault_info(esr, &vcpu->arch.fault);
+	return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);
 }
 
 static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
@@ -166,8 +157,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
 }
 
-/* Check for an FPSIMD/SVE trap and handle as appropriate */
-static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
+/*
+ * We trap the first access to the FP/SIMD to save the host context and
+ * restore the guest context lazily.
+ * If FP/SIMD is not implemented, handle the trap and inject an undefined
+ * instruction exception to the guest. Similarly for trapped SVE accesses.
+ */
+static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	bool sve_guest, sve_host;
 	u8 esr_ec;
@@ -185,9 +181,6 @@ static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
 	}
 
 	esr_ec = kvm_vcpu_trap_get_class(vcpu);
-	if (esr_ec != ESR_ELx_EC_FP_ASIMD &&
-	    esr_ec != ESR_ELx_EC_SVE)
-		return false;
 
 	/* Don't handle SVE traps for non-SVE vcpus here: */
 	if (!sve_guest && esr_ec != ESR_ELx_EC_FP_ASIMD)
@@ -325,7 +318,7 @@ static inline bool esr_is_ptrauth_trap(u32 esr)
 
 DECLARE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
 
-static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
+static bool kvm_hyp_handle_ptrauth(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	struct kvm_cpu_context *ctxt;
 	u64 val;
@@ -350,6 +343,87 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
+	    handle_tx2_tvm(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_iabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
+		bool valid;
+
+		valid = kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
+			kvm_vcpu_dabt_isvalid(vcpu) &&
+			!kvm_vcpu_abt_issea(vcpu) &&
+			!kvm_vcpu_abt_iss1tw(vcpu);
+
+		if (valid) {
+			int ret = __vgic_v2_perform_cpuif_access(vcpu);
+
+			if (ret == 1)
+				return true;
+
+			/* Promote an illegal access to an SError.*/
+			if (ret == -1)
+				*exit_code = ARM_EXCEPTION_EL1_SERROR;
+		}
+	}
+
+	return false;
+}
+
+typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void);
+
+/*
+ * Allow the hypervisor to handle the exit with an exit handler if it has one.
+ *
+ * Returns true if the hypervisor handled the exit, and control should go back
+ * to the guest, or false if it hasn't.
+ */
+static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	const exit_handler_fn *handlers = kvm_get_exit_handler_array();
+	exit_handler_fn fn;
+
+	fn = handlers[kvm_vcpu_trap_get_class(vcpu)];
+
+	if (fn)
+		return fn(vcpu, exit_code);
+
+	return false;
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -384,59 +458,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (*exit_code != ARM_EXCEPTION_TRAP)
 		goto exit;
 
-	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
-	    handle_tx2_tvm(vcpu))
+	/* Check if there's an exit handler and allow it to handle the exit. */
+	if (kvm_hyp_handle_exit(vcpu, exit_code))
 		goto guest;
-
-	/*
-	 * We trap the first access to the FP/SIMD to save the host context
-	 * and restore the guest context lazily.
-	 * If FP/SIMD is not implemented, handle the trap and inject an
-	 * undefined instruction exception to the guest.
-	 * Similarly for trapped SVE accesses.
-	 */
-	if (__hyp_handle_fpsimd(vcpu))
-		goto guest;
-
-	if (__hyp_handle_ptrauth(vcpu))
-		goto guest;
-
-	if (!__populate_fault_info(vcpu))
-		goto guest;
-
-	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
-		bool valid;
-
-		valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
-			kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
-			kvm_vcpu_dabt_isvalid(vcpu) &&
-			!kvm_vcpu_abt_issea(vcpu) &&
-			!kvm_vcpu_abt_iss1tw(vcpu);
-
-		if (valid) {
-			int ret = __vgic_v2_perform_cpuif_access(vcpu);
-
-			if (ret == 1)
-				goto guest;
-
-			/* Promote an illegal access to an SError.*/
-			if (ret == -1)
-				*exit_code = ARM_EXCEPTION_EL1_SERROR;
-
-			goto exit;
-		}
-	}
-
-	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
-	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
-	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
-		int ret = __vgic_v3_perform_cpuif_access(vcpu);
-
-		if (ret == 1)
-			goto guest;
-	}
-
 exit:
 	/* Return to the host kernel and handle the exit */
 	return false;
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index a34b01cc8ab9..c52d580708e0 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -158,6 +158,23 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
 		write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for legacy non-VHE systems */
 int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index ded2c66675f0..0e0d342358f7 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -96,6 +96,23 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu)
 	__deactivate_traps_common(vcpu);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for VHE systems running in EL2 */
 static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
@ 2021-09-22 12:46   ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm; +Cc: kernel-team, kvm, maz, pbonzini, will, linux-arm-kernel

From: Marc Zyngier <maz@kernel.org>

Simplify the early exception handling by slicing the gigantic decoding
tree into a more manageable set of functions, similar to what we have
in handle_exit.c.

This will also make the structure reusable for pKVM's own early exit
handling.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 160 ++++++++++++++----------
 arch/arm64/kvm/hyp/nvhe/switch.c        |  17 +++
 arch/arm64/kvm/hyp/vhe/switch.c         |  17 +++
 3 files changed, 126 insertions(+), 68 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 54abc8298ec3..0397606c0951 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -136,16 +136,7 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
 
 static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u8 ec;
-	u64 esr;
-
-	esr = vcpu->arch.fault.esr_el2;
-	ec = ESR_ELx_EC(esr);
-
-	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-		return true;
-
-	return __get_fault_info(esr, &vcpu->arch.fault);
+	return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);
 }
 
 static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
@@ -166,8 +157,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
 }
 
-/* Check for an FPSIMD/SVE trap and handle as appropriate */
-static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
+/*
+ * We trap the first access to the FP/SIMD to save the host context and
+ * restore the guest context lazily.
+ * If FP/SIMD is not implemented, handle the trap and inject an undefined
+ * instruction exception to the guest. Similarly for trapped SVE accesses.
+ */
+static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	bool sve_guest, sve_host;
 	u8 esr_ec;
@@ -185,9 +181,6 @@ static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
 	}
 
 	esr_ec = kvm_vcpu_trap_get_class(vcpu);
-	if (esr_ec != ESR_ELx_EC_FP_ASIMD &&
-	    esr_ec != ESR_ELx_EC_SVE)
-		return false;
 
 	/* Don't handle SVE traps for non-SVE vcpus here: */
 	if (!sve_guest && esr_ec != ESR_ELx_EC_FP_ASIMD)
@@ -325,7 +318,7 @@ static inline bool esr_is_ptrauth_trap(u32 esr)
 
 DECLARE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
 
-static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
+static bool kvm_hyp_handle_ptrauth(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	struct kvm_cpu_context *ctxt;
 	u64 val;
@@ -350,6 +343,87 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
+	    handle_tx2_tvm(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_iabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
+		bool valid;
+
+		valid = kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
+			kvm_vcpu_dabt_isvalid(vcpu) &&
+			!kvm_vcpu_abt_issea(vcpu) &&
+			!kvm_vcpu_abt_iss1tw(vcpu);
+
+		if (valid) {
+			int ret = __vgic_v2_perform_cpuif_access(vcpu);
+
+			if (ret == 1)
+				return true;
+
+			/* Promote an illegal access to an SError.*/
+			if (ret == -1)
+				*exit_code = ARM_EXCEPTION_EL1_SERROR;
+		}
+	}
+
+	return false;
+}
+
+typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void);
+
+/*
+ * Allow the hypervisor to handle the exit with an exit handler if it has one.
+ *
+ * Returns true if the hypervisor handled the exit, and control should go back
+ * to the guest, or false if it hasn't.
+ */
+static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	const exit_handler_fn *handlers = kvm_get_exit_handler_array();
+	exit_handler_fn fn;
+
+	fn = handlers[kvm_vcpu_trap_get_class(vcpu)];
+
+	if (fn)
+		return fn(vcpu, exit_code);
+
+	return false;
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -384,59 +458,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (*exit_code != ARM_EXCEPTION_TRAP)
 		goto exit;
 
-	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
-	    handle_tx2_tvm(vcpu))
+	/* Check if there's an exit handler and allow it to handle the exit. */
+	if (kvm_hyp_handle_exit(vcpu, exit_code))
 		goto guest;
-
-	/*
-	 * We trap the first access to the FP/SIMD to save the host context
-	 * and restore the guest context lazily.
-	 * If FP/SIMD is not implemented, handle the trap and inject an
-	 * undefined instruction exception to the guest.
-	 * Similarly for trapped SVE accesses.
-	 */
-	if (__hyp_handle_fpsimd(vcpu))
-		goto guest;
-
-	if (__hyp_handle_ptrauth(vcpu))
-		goto guest;
-
-	if (!__populate_fault_info(vcpu))
-		goto guest;
-
-	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
-		bool valid;
-
-		valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
-			kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
-			kvm_vcpu_dabt_isvalid(vcpu) &&
-			!kvm_vcpu_abt_issea(vcpu) &&
-			!kvm_vcpu_abt_iss1tw(vcpu);
-
-		if (valid) {
-			int ret = __vgic_v2_perform_cpuif_access(vcpu);
-
-			if (ret == 1)
-				goto guest;
-
-			/* Promote an illegal access to an SError.*/
-			if (ret == -1)
-				*exit_code = ARM_EXCEPTION_EL1_SERROR;
-
-			goto exit;
-		}
-	}
-
-	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
-	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
-	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
-		int ret = __vgic_v3_perform_cpuif_access(vcpu);
-
-		if (ret == 1)
-			goto guest;
-	}
-
 exit:
 	/* Return to the host kernel and handle the exit */
 	return false;
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index a34b01cc8ab9..c52d580708e0 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -158,6 +158,23 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
 		write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for legacy non-VHE systems */
 int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index ded2c66675f0..0e0d342358f7 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -96,6 +96,23 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu)
 	__deactivate_traps_common(vcpu);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for VHE systems running in EL2 */
 static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
-- 
2.33.0.464.g1972c5931b-goog

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
@ 2021-09-22 12:46   ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

From: Marc Zyngier <maz@kernel.org>

Simplify the early exception handling by slicing the gigantic decoding
tree into a more manageable set of functions, similar to what we have
in handle_exit.c.

This will also make the structure reusable for pKVM's own early exit
handling.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 160 ++++++++++++++----------
 arch/arm64/kvm/hyp/nvhe/switch.c        |  17 +++
 arch/arm64/kvm/hyp/vhe/switch.c         |  17 +++
 3 files changed, 126 insertions(+), 68 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 54abc8298ec3..0397606c0951 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -136,16 +136,7 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
 
 static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u8 ec;
-	u64 esr;
-
-	esr = vcpu->arch.fault.esr_el2;
-	ec = ESR_ELx_EC(esr);
-
-	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
-		return true;
-
-	return __get_fault_info(esr, &vcpu->arch.fault);
+	return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);
 }
 
 static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
@@ -166,8 +157,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
 	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
 }
 
-/* Check for an FPSIMD/SVE trap and handle as appropriate */
-static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
+/*
+ * We trap the first access to the FP/SIMD to save the host context and
+ * restore the guest context lazily.
+ * If FP/SIMD is not implemented, handle the trap and inject an undefined
+ * instruction exception to the guest. Similarly for trapped SVE accesses.
+ */
+static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	bool sve_guest, sve_host;
 	u8 esr_ec;
@@ -185,9 +181,6 @@ static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
 	}
 
 	esr_ec = kvm_vcpu_trap_get_class(vcpu);
-	if (esr_ec != ESR_ELx_EC_FP_ASIMD &&
-	    esr_ec != ESR_ELx_EC_SVE)
-		return false;
 
 	/* Don't handle SVE traps for non-SVE vcpus here: */
 	if (!sve_guest && esr_ec != ESR_ELx_EC_FP_ASIMD)
@@ -325,7 +318,7 @@ static inline bool esr_is_ptrauth_trap(u32 esr)
 
 DECLARE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
 
-static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
+static bool kvm_hyp_handle_ptrauth(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	struct kvm_cpu_context *ctxt;
 	u64 val;
@@ -350,6 +343,87 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
 	return true;
 }
 
+static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
+	    handle_tx2_tvm(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
+	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_iabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	return false;
+}
+
+static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (!__populate_fault_info(vcpu))
+		return true;
+
+	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
+		bool valid;
+
+		valid = kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
+			kvm_vcpu_dabt_isvalid(vcpu) &&
+			!kvm_vcpu_abt_issea(vcpu) &&
+			!kvm_vcpu_abt_iss1tw(vcpu);
+
+		if (valid) {
+			int ret = __vgic_v2_perform_cpuif_access(vcpu);
+
+			if (ret == 1)
+				return true;
+
+			/* Promote an illegal access to an SError.*/
+			if (ret == -1)
+				*exit_code = ARM_EXCEPTION_EL1_SERROR;
+		}
+	}
+
+	return false;
+}
+
+typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void);
+
+/*
+ * Allow the hypervisor to handle the exit with an exit handler if it has one.
+ *
+ * Returns true if the hypervisor handled the exit, and control should go back
+ * to the guest, or false if it hasn't.
+ */
+static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	const exit_handler_fn *handlers = kvm_get_exit_handler_array();
+	exit_handler_fn fn;
+
+	fn = handlers[kvm_vcpu_trap_get_class(vcpu)];
+
+	if (fn)
+		return fn(vcpu, exit_code);
+
+	return false;
+}
+
 /*
  * Return true when we were able to fixup the guest exit and should return to
  * the guest, false when we should restore the host state and return to the
@@ -384,59 +458,9 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	if (*exit_code != ARM_EXCEPTION_TRAP)
 		goto exit;
 
-	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
-	    kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 &&
-	    handle_tx2_tvm(vcpu))
+	/* Check if there's an exit handler and allow it to handle the exit. */
+	if (kvm_hyp_handle_exit(vcpu, exit_code))
 		goto guest;
-
-	/*
-	 * We trap the first access to the FP/SIMD to save the host context
-	 * and restore the guest context lazily.
-	 * If FP/SIMD is not implemented, handle the trap and inject an
-	 * undefined instruction exception to the guest.
-	 * Similarly for trapped SVE accesses.
-	 */
-	if (__hyp_handle_fpsimd(vcpu))
-		goto guest;
-
-	if (__hyp_handle_ptrauth(vcpu))
-		goto guest;
-
-	if (!__populate_fault_info(vcpu))
-		goto guest;
-
-	if (static_branch_unlikely(&vgic_v2_cpuif_trap)) {
-		bool valid;
-
-		valid = kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_DABT_LOW &&
-			kvm_vcpu_trap_get_fault_type(vcpu) == FSC_FAULT &&
-			kvm_vcpu_dabt_isvalid(vcpu) &&
-			!kvm_vcpu_abt_issea(vcpu) &&
-			!kvm_vcpu_abt_iss1tw(vcpu);
-
-		if (valid) {
-			int ret = __vgic_v2_perform_cpuif_access(vcpu);
-
-			if (ret == 1)
-				goto guest;
-
-			/* Promote an illegal access to an SError.*/
-			if (ret == -1)
-				*exit_code = ARM_EXCEPTION_EL1_SERROR;
-
-			goto exit;
-		}
-	}
-
-	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
-	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_SYS64 ||
-	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_CP15_32)) {
-		int ret = __vgic_v3_perform_cpuif_access(vcpu);
-
-		if (ret == 1)
-			goto guest;
-	}
-
 exit:
 	/* Return to the host kernel and handle the exit */
 	return false;
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index a34b01cc8ab9..c52d580708e0 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -158,6 +158,23 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
 		write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for legacy non-VHE systems */
 int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index ded2c66675f0..0e0d342358f7 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -96,6 +96,23 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu)
 	__deactivate_traps_common(vcpu);
 }
 
+static const exit_handler_fn hyp_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
+	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
+static const exit_handler_fn *kvm_get_exit_handler_array(void)
+{
+	return hyp_exit_handlers;
+}
+
 /* Switch to the guest for VHE systems running in EL2 */
 static int __kvm_vcpu_run_vhe(struct kvm_vcpu *vcpu)
 {
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 04/12] KVM: arm64: Add missing FORCE prerequisite in Makefile
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:46   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Add the missing FORCE prerequisite for the hyp relocation target. The
rule uses if_changed, which requires FORCE as a prerequisite so that the
recipe is re-evaluated on every build and re-run when its command line
changes.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 5df6193fc430..8d741f71377f 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -54,7 +54,7 @@ $(obj)/kvm_nvhe.tmp.o: $(obj)/hyp.lds $(addprefix $(obj)/,$(hyp-obj)) FORCE
 #    runtime. Because the hypervisor is part of the kernel binary, relocations
 #    produce a kernel VA. We enumerate relocations targeting hyp at build time
 #    and convert the kernel VAs at those positions to hyp VAs.
-$(obj)/hyp-reloc.S: $(obj)/kvm_nvhe.tmp.o $(obj)/gen-hyprel
+$(obj)/hyp-reloc.S: $(obj)/kvm_nvhe.tmp.o $(obj)/gen-hyprel FORCE
 	$(call if_changed,hyprel)
 
 # 5) Compile hyp-reloc.S and link it into the existing partially linked object.
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 05/12] KVM: arm64: Pass struct kvm to per-EC handlers
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:46   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

We need struct kvm in the per-EC handlers to check whether a VM is
protected, so that subsequent patches can pick the right handlers for it.
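
As a rough illustration of the intent (not part of this patch), a later
nVHE change could use the new argument along these lines, assuming a
pvm_exit_handlers table added later in the series and a
kvm_vm_is_protected() helper:

	static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
	{
		/* Illustrative: pick the pVM table for protected guests. */
		if (unlikely(kvm_vm_is_protected(kvm)))
			return pvm_exit_handlers;

		return hyp_exit_handlers;
	}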

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/include/hyp/switch.h | 4 ++--
 arch/arm64/kvm/hyp/nvhe/switch.c        | 2 +-
 arch/arm64/kvm/hyp/vhe/switch.c         | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 0397606c0951..733e39f5aaaf 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -403,7 +403,7 @@ static bool kvm_hyp_handle_dabt_low(struct kvm_vcpu *vcpu, u64 *exit_code)
 
 typedef bool (*exit_handler_fn)(struct kvm_vcpu *, u64 *);
 
-static const exit_handler_fn *kvm_get_exit_handler_array(void);
+static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm);
 
 /*
  * Allow the hypervisor to handle the exit with an exit handler if it has one.
@@ -413,7 +413,7 @@ static const exit_handler_fn *kvm_get_exit_handler_array(void);
  */
 static inline bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
-	const exit_handler_fn *handlers = kvm_get_exit_handler_array();
+	const exit_handler_fn *handlers = kvm_get_exit_handler_array(kern_hyp_va(vcpu->kvm));
 	exit_handler_fn fn;
 
 	fn = handlers[kvm_vcpu_trap_get_class(vcpu)];
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index c52d580708e0..49080c607838 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -170,7 +170,7 @@ static const exit_handler_fn hyp_exit_handlers[] = {
 	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
 };
 
-static const exit_handler_fn *kvm_get_exit_handler_array(void)
+static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
 {
 	return hyp_exit_handlers;
 }
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 0e0d342358f7..34a4bd9f67a7 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -108,7 +108,7 @@ static const exit_handler_fn hyp_exit_handlers[] = {
 	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
 };
 
-static const exit_handler_fn *kvm_get_exit_handler_array(void)
+static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
 {
 	return hyp_exit_handlers;
 }
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 06/12] KVM: arm64: Add missing field descriptor for MDCR_EL2
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:46   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

The MDCR_EL2.HLP field descriptor is not currently used; it is added
for completeness.

No functional change intended.

Suggested-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 arch/arm64/include/asm/kvm_arm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 327120c0089f..a39fcf318c77 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -295,6 +295,7 @@
 #define MDCR_EL2_HPMFZO		(UL(1) << 29)
 #define MDCR_EL2_MTPME		(UL(1) << 28)
 #define MDCR_EL2_TDCC		(UL(1) << 27)
+#define MDCR_EL2_HLP		(UL(1) << 26)
 #define MDCR_EL2_HCCD		(UL(1) << 23)
 #define MDCR_EL2_TTRF		(UL(1) << 19)
 #define MDCR_EL2_HPMD		(UL(1) << 17)
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 07/12] KVM: arm64: Simplify masking out MTE in feature id reg
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:46   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:46 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Simplify the code for hiding MTE support in the feature id register
when MTE is not enabled/supported by KVM: read_id_reg() already starts
from the sanitised system value, so it is enough to clear the MTE field
when MTE is not exposed to the guest.

No functional change intended.

Signed-off-by: Fuad Tabba <tabba@google.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
---
 arch/arm64/kvm/sys_regs.c | 10 ++--------
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1d46e185f31e..447acce9ca84 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1077,14 +1077,8 @@ static u64 read_id_reg(const struct kvm_vcpu *vcpu,
 		val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3), (u64)vcpu->kvm->arch.pfr0_csv3);
 		break;
 	case SYS_ID_AA64PFR1_EL1:
-		val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
-		if (kvm_has_mte(vcpu->kvm)) {
-			u64 pfr, mte;
-
-			pfr = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
-			mte = cpuid_feature_extract_unsigned_field(pfr, ID_AA64PFR1_MTE_SHIFT);
-			val |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), mte);
-		}
+		if (!kvm_has_mte(vcpu->kvm))
+			val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
 		break;
 	case SYS_ID_AA64ISAR1_EL1:
 		if (!vcpu_has_ptrauth(vcpu))
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:47   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Add system register handlers for protected VMs. These cover Sys64
registers (including the feature id registers) and debug.

No functional change intended, as these handlers are not yet hooked
into the guest exit handlers introduced earlier. So when trapping is
triggered, the exit handlers still let the host handle it, as before.
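
For context only, here is a sketch of how a later patch in this series
could hook the new handler into a pVM-specific exit handler table; the
pvm_exit_handlers name is illustrative and not added by this patch:

	static const exit_handler_fn pvm_exit_handlers[] = {
		[0 ... ESR_ELx_EC_MAX]		= NULL,
		/* Route Sys64 traps to the new pVM sysreg handler. */
		[ESR_ELx_EC_SYS64]		= kvm_handle_pvm_sysreg,
		[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
		[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
		[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
	};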

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
 arch/arm64/include/asm/kvm_hyp.h           |   5 +
 arch/arm64/kvm/arm.c                       |   5 +
 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
 arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
 6 files changed, 726 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c

diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
new file mode 100644
index 000000000000..0ed06923f7e9
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_fixed_config.h
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#ifndef __ARM64_KVM_FIXED_CONFIG_H__
+#define __ARM64_KVM_FIXED_CONFIG_H__
+
+#include <asm/sysreg.h>
+
+/*
+ * This file contains definitions for features to be allowed or restricted for
+ * guest virtual machines, depending on the mode KVM is running in and on the
+ * type of guest that is running.
+ *
+ * The ALLOW masks represent a bitmask of feature fields that are allowed
+ * without any restrictions as long as they are supported by the system.
+ *
+ * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
+ * features that are restricted to support at most the specified feature.
+ *
+ * If a feature field is not present in either, then it is not supported.
+ *
+ * The approach taken for protected VMs is to allow features that are:
+ * - Needed by common Linux distributions (e.g., floating point)
+ * - Trivial to support, e.g., supporting the feature does not introduce or
+ * require tracking of additional state in KVM
+ * - Cannot be trapped or prevented from being used by the guest anyway
+ */
+
+/*
+ * Allow for protected VMs:
+ * - Floating-point and Advanced SIMD
+ * - Data Independent Timing
+ */
+#define PVM_ID_AA64PFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - AArch64 guests only (no support for AArch32 guests):
+ *	AArch32 adds complexity in trap handling, emulation, condition codes,
+ *	etc...
+ * - RAS (v1)
+ *	Supported by KVM
+ */
+#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Branch Target Identification
+ * - Speculative Store Bypassing
+ */
+#define PVM_ID_AA64PFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Mixed-endian
+ * - Distinction between Secure and Non-secure Memory
+ * - Mixed-endian at EL0 only
+ * - Non-context synchronizing exception entry and exit
+ */
+#define PVM_ID_AA64MMFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - 40-bit IPA
+ * - 16-bit ASID
+ */
+#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Hardware translation table updates to Access flag and Dirty state
+ * - Number of VMID bits from CPU
+ * - Hierarchical Permission Disables
+ * - Privileged Access Never
+ * - SError interrupt exceptions from speculative reads
+ * - Enhanced Translation Synchronization
+ */
+#define PVM_ID_AA64MMFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Common not Private translations
+ * - User Access Override
+ * - IESB bit in the SCTLR_ELx registers
+ * - Unaligned single-copy atomicity and atomic functions
+ * - ESR_ELx.EC value on an exception by read access to feature ID space
+ * - TTL field in address operations.
+ * - Break-before-make sequences when changing translation block size
+ * - E0PDx mechanism
+ */
+#define PVM_ID_AA64MMFR2_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
+	)
+
+/*
+ * No support for Scalable Vectors for protected VMs:
+ *	Requires additional support from KVM, e.g., context-switching and
+ *	trapping at EL2
+ */
+#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
+
+/*
+ * No support for debug, including breakpoints, and watchpoints for protected
+ * VMs:
+ *	The Arm architecture mandates support for at least the Armv8 debug
+ *	architecture, which would include at least 2 hardware breakpoints and
+ *	watchpoints. Providing that support to protected guests adds
+ *	considerable state and complexity. Therefore, the reserved value of 0 is
+ *	used for debug-related fields.
+ */
+#define PVM_ID_AA64DFR0_ALLOW (0ULL)
+#define PVM_ID_AA64DFR1_ALLOW (0ULL)
+
+/*
+ * No support for implementation defined features.
+ */
+#define PVM_ID_AA64AFR0_ALLOW (0ULL)
+#define PVM_ID_AA64AFR1_ALLOW (0ULL)
+
+/*
+ * No restrictions on instructions implemented in AArch64.
+ */
+#define PVM_ID_AA64ISAR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
+	)
+
+#define PVM_ID_AA64ISAR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
+	)
+
+#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 657d0c94cf82..5afd14ab15b9 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
 void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
 #endif
 
+extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
 extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
 extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
 
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fe102cd2e518..6aa7b0c5bf21 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
 	void *addr = phys_to_virt(hyp_mem_base);
 	int ret;
 
+	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
+	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
+	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
+	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
 	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
 
 	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
 	if (ret)
diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
new file mode 100644
index 000000000000..0865163d363c
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
+#define __ARM64_KVM_NVHE_SYS_REGS_H__
+
+#include <asm/kvm_host.h>
+
+u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
+
+bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
+void __inject_undef64(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 8d741f71377f..0bbe37a18d5d 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-	 cache.o setup.o mm.o mem_protect.o
+	 cache.o setup.o mm.o mem_protect.o sys_regs.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
new file mode 100644
index 000000000000..ef8456c54b18
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -0,0 +1,492 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#include <asm/kvm_asm.h>
+#include <asm/kvm_fixed_config.h>
+#include <asm/kvm_mmu.h>
+
+#include <hyp/adjust_pc.h>
+
+#include "../../sys_regs.h"
+
+/*
+ * Copies of the host's CPU features registers holding sanitized values at hyp.
+ */
+u64 id_aa64pfr0_el1_sys_val;
+u64 id_aa64pfr1_el1_sys_val;
+u64 id_aa64isar0_el1_sys_val;
+u64 id_aa64isar1_el1_sys_val;
+u64 id_aa64mmfr2_el1_sys_val;
+
+static inline void inject_undef64(struct kvm_vcpu *vcpu)
+{
+	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
+
+	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
+			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
+			     KVM_ARM64_PENDING_EXCEPTION);
+
+	__kvm_adjust_pc(vcpu);
+
+	write_sysreg_el1(esr, SYS_ESR);
+	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
+}
+
+/*
+ * Inject an unknown/undefined exception to an AArch64 guest while most of its
+ * sysregs are live.
+ */
+void __inject_undef64(struct kvm_vcpu *vcpu)
+{
+	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
+	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
+
+	inject_undef64(vcpu);
+
+	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
+	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
+}
+
+/*
+ * Accessor for undefined accesses.
+ */
+static bool undef_access(struct kvm_vcpu *vcpu,
+			 struct sys_reg_params *p,
+			 const struct sys_reg_desc *r)
+{
+	__inject_undef64(vcpu);
+	return false;
+}
+
+/*
+ * Returns the restricted features values of the feature register based on the
+ * limitations in restrict_fields.
+ * A feature id field value of 0b0000 does not impose any restrictions.
+ * Note: Use only for unsigned feature field values.
+ */
+static u64 get_restricted_features_unsigned(u64 sys_reg_val,
+					    u64 restrict_fields)
+{
+	u64 value = 0UL;
+	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
+
+	/*
+	 * According to the Arm Architecture Reference Manual, feature fields
+	 * use increasing values to indicate increases in functionality.
+	 * Iterate over the restricted feature fields and calculate the minimum
+	 * unsigned value between the one supported by the system, and what the
+	 * value is being restricted to.
+	 */
+	while (sys_reg_val && restrict_fields) {
+		value |= min(sys_reg_val & mask, restrict_fields & mask);
+		sys_reg_val &= ~mask;
+		restrict_fields &= ~mask;
+		mask <<= ARM64_FEATURE_FIELD_BITS;
+	}
+
+	return value;
+}
+
+/*
+ * Functions that return the value of feature id registers for protected VMs
+ * based on allowed features, system features, and KVM support.
+ */
+
+u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
+{
+	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
+	u64 set_mask = 0;
+	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
+
+	if (!vcpu_has_sve(vcpu))
+		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
+
+	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
+		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
+
+	/* Spectre and Meltdown mitigation in KVM */
+	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
+			       (u64)kvm->arch.pfr0_csv2);
+	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
+			       (u64)kvm->arch.pfr0_csv3);
+
+	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
+}
+
+u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
+{
+	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
+	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
+
+	if (!kvm_has_mte(kvm))
+		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
+
+	return id_aa64pfr1_el1_sys_val & allow_mask;
+}
+
+u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for Scalable Vectors, therefore, hyp has no sanitized
+	 * copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for debug, including breakpoints, and watchpoints,
+	 * therefore, pKVM has no sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for debug, therefore, hyp has no sanitized copy of the
+	 * feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for implementation defined features, therefore, hyp has no
+	 * sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for implementation defined features, therefore, hyp has no
+	 * sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
+}
+
+u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
+{
+	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
+
+	if (!vcpu_has_ptrauth(vcpu))
+		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
+
+	return id_aa64isar1_el1_sys_val & allow_mask;
+}
+
+u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
+{
+	u64 set_mask;
+
+	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
+		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
+
+	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
+}
+
+u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
+}
+
+u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
+}
+
+/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
+static u64 read_id_reg(const struct kvm_vcpu *vcpu,
+		       struct sys_reg_desc const *r)
+{
+	u32 id = reg_to_encoding(r);
+
+	switch (id) {
+	case SYS_ID_AA64PFR0_EL1:
+		return get_pvm_id_aa64pfr0(vcpu);
+	case SYS_ID_AA64PFR1_EL1:
+		return get_pvm_id_aa64pfr1(vcpu);
+	case SYS_ID_AA64ZFR0_EL1:
+		return get_pvm_id_aa64zfr0(vcpu);
+	case SYS_ID_AA64DFR0_EL1:
+		return get_pvm_id_aa64dfr0(vcpu);
+	case SYS_ID_AA64DFR1_EL1:
+		return get_pvm_id_aa64dfr1(vcpu);
+	case SYS_ID_AA64AFR0_EL1:
+		return get_pvm_id_aa64afr0(vcpu);
+	case SYS_ID_AA64AFR1_EL1:
+		return get_pvm_id_aa64afr1(vcpu);
+	case SYS_ID_AA64ISAR0_EL1:
+		return get_pvm_id_aa64isar0(vcpu);
+	case SYS_ID_AA64ISAR1_EL1:
+		return get_pvm_id_aa64isar1(vcpu);
+	case SYS_ID_AA64MMFR0_EL1:
+		return get_pvm_id_aa64mmfr0(vcpu);
+	case SYS_ID_AA64MMFR1_EL1:
+		return get_pvm_id_aa64mmfr1(vcpu);
+	case SYS_ID_AA64MMFR2_EL1:
+		return get_pvm_id_aa64mmfr2(vcpu);
+	default:
+		/*
+		 * Should never happen because all cases are covered in
+		 * pvm_sys_reg_descs[] below.
+		 */
+		WARN_ON(1);
+		break;
+	}
+
+	return 0;
+}
+
+/*
+ * Accessor for AArch32 feature id registers.
+ *
+ * The value of these registers is "unknown" according to the spec if AArch32
+ * isn't supported.
+ */
+static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
+				  struct sys_reg_params *p,
+				  const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		return undef_access(vcpu, p, r);
+
+	/*
+	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
+	 * of AArch32 feature id registers.
+	 */
+	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
+		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
+
+	/* Use 0 for architecturally "unknown" values. */
+	p->regval = 0;
+	return true;
+}
+
+/*
+ * Accessor for AArch64 feature id registers.
+ *
+ * If access is allowed, set the regval to the protected VM's view of the
+ * register and return true.
+ * Otherwise, inject an undefined exception and return false.
+ */
+static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
+				  struct sys_reg_params *p,
+				  const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		return undef_access(vcpu, p, r);
+
+	p->regval = read_id_reg(vcpu, r);
+	return true;
+}
+
+/* Mark the specified system register as an AArch32 feature id register. */
+#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
+
+/* Mark the specified system register as an AArch64 feature id register. */
+#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
+
+/* Mark the specified system register as not being handled in hyp. */
+#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
+
+/*
+ * Architected system registers.
+ * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
+ *
+ * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
+ * it will lead to injecting an exception into the guest.
+ */
+static const struct sys_reg_desc pvm_sys_reg_descs[] = {
+	/* Cache maintenance by set/way operations are restricted. */
+
+	/* Debug and Trace Registers are restricted. */
+
+	/* AArch64 mappings of the AArch32 ID registers */
+	/* CRm=1 */
+	AARCH32(SYS_ID_PFR0_EL1),
+	AARCH32(SYS_ID_PFR1_EL1),
+	AARCH32(SYS_ID_DFR0_EL1),
+	AARCH32(SYS_ID_AFR0_EL1),
+	AARCH32(SYS_ID_MMFR0_EL1),
+	AARCH32(SYS_ID_MMFR1_EL1),
+	AARCH32(SYS_ID_MMFR2_EL1),
+	AARCH32(SYS_ID_MMFR3_EL1),
+
+	/* CRm=2 */
+	AARCH32(SYS_ID_ISAR0_EL1),
+	AARCH32(SYS_ID_ISAR1_EL1),
+	AARCH32(SYS_ID_ISAR2_EL1),
+	AARCH32(SYS_ID_ISAR3_EL1),
+	AARCH32(SYS_ID_ISAR4_EL1),
+	AARCH32(SYS_ID_ISAR5_EL1),
+	AARCH32(SYS_ID_MMFR4_EL1),
+	AARCH32(SYS_ID_ISAR6_EL1),
+
+	/* CRm=3 */
+	AARCH32(SYS_MVFR0_EL1),
+	AARCH32(SYS_MVFR1_EL1),
+	AARCH32(SYS_MVFR2_EL1),
+	AARCH32(SYS_ID_PFR2_EL1),
+	AARCH32(SYS_ID_DFR1_EL1),
+	AARCH32(SYS_ID_MMFR5_EL1),
+
+	/* AArch64 ID registers */
+	/* CRm=4 */
+	AARCH64(SYS_ID_AA64PFR0_EL1),
+	AARCH64(SYS_ID_AA64PFR1_EL1),
+	AARCH64(SYS_ID_AA64ZFR0_EL1),
+	AARCH64(SYS_ID_AA64DFR0_EL1),
+	AARCH64(SYS_ID_AA64DFR1_EL1),
+	AARCH64(SYS_ID_AA64AFR0_EL1),
+	AARCH64(SYS_ID_AA64AFR1_EL1),
+	AARCH64(SYS_ID_AA64ISAR0_EL1),
+	AARCH64(SYS_ID_AA64ISAR1_EL1),
+	AARCH64(SYS_ID_AA64MMFR0_EL1),
+	AARCH64(SYS_ID_AA64MMFR1_EL1),
+	AARCH64(SYS_ID_AA64MMFR2_EL1),
+
+	HOST_HANDLED(SYS_SCTLR_EL1),
+	HOST_HANDLED(SYS_ACTLR_EL1),
+	HOST_HANDLED(SYS_CPACR_EL1),
+
+	HOST_HANDLED(SYS_RGSR_EL1),
+	HOST_HANDLED(SYS_GCR_EL1),
+
+	/* Scalable Vector Registers are restricted. */
+
+	HOST_HANDLED(SYS_TTBR0_EL1),
+	HOST_HANDLED(SYS_TTBR1_EL1),
+	HOST_HANDLED(SYS_TCR_EL1),
+
+	HOST_HANDLED(SYS_APIAKEYLO_EL1),
+	HOST_HANDLED(SYS_APIAKEYHI_EL1),
+	HOST_HANDLED(SYS_APIBKEYLO_EL1),
+	HOST_HANDLED(SYS_APIBKEYHI_EL1),
+	HOST_HANDLED(SYS_APDAKEYLO_EL1),
+	HOST_HANDLED(SYS_APDAKEYHI_EL1),
+	HOST_HANDLED(SYS_APDBKEYLO_EL1),
+	HOST_HANDLED(SYS_APDBKEYHI_EL1),
+	HOST_HANDLED(SYS_APGAKEYLO_EL1),
+	HOST_HANDLED(SYS_APGAKEYHI_EL1),
+
+	HOST_HANDLED(SYS_AFSR0_EL1),
+	HOST_HANDLED(SYS_AFSR1_EL1),
+	HOST_HANDLED(SYS_ESR_EL1),
+
+	HOST_HANDLED(SYS_ERRIDR_EL1),
+	HOST_HANDLED(SYS_ERRSELR_EL1),
+	HOST_HANDLED(SYS_ERXFR_EL1),
+	HOST_HANDLED(SYS_ERXCTLR_EL1),
+	HOST_HANDLED(SYS_ERXSTATUS_EL1),
+	HOST_HANDLED(SYS_ERXADDR_EL1),
+	HOST_HANDLED(SYS_ERXMISC0_EL1),
+	HOST_HANDLED(SYS_ERXMISC1_EL1),
+
+	HOST_HANDLED(SYS_TFSR_EL1),
+	HOST_HANDLED(SYS_TFSRE0_EL1),
+
+	HOST_HANDLED(SYS_FAR_EL1),
+	HOST_HANDLED(SYS_PAR_EL1),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_MAIR_EL1),
+	HOST_HANDLED(SYS_AMAIR_EL1),
+
+	/* Limited Ordering Regions Registers are restricted. */
+
+	HOST_HANDLED(SYS_VBAR_EL1),
+	HOST_HANDLED(SYS_DISR_EL1),
+
+	/* GIC CPU Interface registers are restricted. */
+
+	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
+	HOST_HANDLED(SYS_TPIDR_EL1),
+
+	HOST_HANDLED(SYS_SCXTNUM_EL1),
+
+	HOST_HANDLED(SYS_CNTKCTL_EL1),
+
+	HOST_HANDLED(SYS_CCSIDR_EL1),
+	HOST_HANDLED(SYS_CLIDR_EL1),
+	HOST_HANDLED(SYS_CSSELR_EL1),
+	HOST_HANDLED(SYS_CTR_EL0),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_TPIDR_EL0),
+	HOST_HANDLED(SYS_TPIDRRO_EL0),
+
+	HOST_HANDLED(SYS_SCXTNUM_EL0),
+
+	/* Activity Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
+	HOST_HANDLED(SYS_CNTP_CTL_EL0),
+	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_DACR32_EL2),
+	HOST_HANDLED(SYS_IFSR32_EL2),
+	HOST_HANDLED(SYS_FPEXC32_EL2),
+};
+
+/*
+ * Handler for protected VM MSR, MRS or System instruction execution.
+ *
+ * Returns true if the hypervisor has handled the exit, and control should go
+ * back to the guest, or false if it hasn't, to be handled by the host.
+ */
+bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	const struct sys_reg_desc *r;
+	struct sys_reg_params params;
+	unsigned long esr = kvm_vcpu_get_esr(vcpu);
+	int Rt = kvm_vcpu_sys_get_rt(vcpu);
+
+	params = esr_sys64_to_params(esr);
+	params.regval = vcpu_get_reg(vcpu, Rt);
+
+	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
+
+	/* Undefined access (RESTRICTED). */
+	if (r == NULL) {
+		__inject_undef64(vcpu);
+		return true;
+	}
+
+	/* Handled by the host (HOST_HANDLED) */
+	if (r->access == NULL)
+		return false;
+
+	/* Handled by hyp: skip instruction if instructed to do so. */
+	if (r->access(vcpu, &params, r))
+		__kvm_skip_instr(vcpu);
+
+	if (!params.is_write)
+		vcpu_set_reg(vcpu, Rt, params.regval);
+
+	return true;
+}
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-09-22 12:47   ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm; +Cc: kernel-team, kvm, maz, pbonzini, will, linux-arm-kernel

Add system register handlers for protected VMs. These cover Sys64
registers (including the feature id registers) and debug.

No functional change intended, as these handlers are not yet hooked
into the guest exit handlers introduced earlier. So when trapping is
triggered, the exit handlers still let the host handle it, as before.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
 arch/arm64/include/asm/kvm_hyp.h           |   5 +
 arch/arm64/kvm/arm.c                       |   5 +
 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
 arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
 arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
 6 files changed, 726 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c

diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
new file mode 100644
index 000000000000..0ed06923f7e9
--- /dev/null
+++ b/arch/arm64/include/asm/kvm_fixed_config.h
@@ -0,0 +1,195 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#ifndef __ARM64_KVM_FIXED_CONFIG_H__
+#define __ARM64_KVM_FIXED_CONFIG_H__
+
+#include <asm/sysreg.h>
+
+/*
+ * This file contains definitions for features to be allowed or restricted for
+ * guest virtual machines, depending on the mode KVM is running in and on the
+ * type of guest that is running.
+ *
+ * The ALLOW masks represent a bitmask of feature fields that are allowed
+ * without any restrictions as long as they are supported by the system.
+ *
+ * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
+ * features that are restricted to support at most the specified feature.
+ *
+ * If a feature field is not present in either, then it is not supported.
+ *
+ * The approach taken for protected VMs is to allow features that are:
+ * - Needed by common Linux distributions (e.g., floating point)
+ * - Trivial to support, e.g., supporting the feature does not introduce or
+ * require tracking of additional state in KVM
+ * - Cannot be trapped or prevented from being used by the guest anyway
+ */
+
+/*
+ * Allow for protected VMs:
+ * - Floating-point and Advanced SIMD
+ * - Data Independent Timing
+ */
+#define PVM_ID_AA64PFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - AArch64 guests only (no support for AArch32 guests):
+ *	AArch32 adds complexity in trap handling, emulation, condition codes,
+ *	etc...
+ * - RAS (v1)
+ *	Supported by KVM
+ */
+#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Branch Target Identification
+ * - Speculative Store Bypassing
+ */
+#define PVM_ID_AA64PFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
+	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Mixed-endian
+ * - Distinction between Secure and Non-secure Memory
+ * - Mixed-endian at EL0 only
+ * - Non-context synchronizing exception entry and exit
+ */
+#define PVM_ID_AA64MMFR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
+	)
+
+/*
+ * Restrict to the following *unsigned* features for protected VMs:
+ * - 40-bit IPA
+ * - 16-bit ASID
+ */
+#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
+	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Hardware translation table updates to Access flag and Dirty state
+ * - Number of VMID bits from CPU
+ * - Hierarchical Permission Disables
+ * - Privileged Access Never
+ * - SError interrupt exceptions from speculative reads
+ * - Enhanced Translation Synchronization
+ */
+#define PVM_ID_AA64MMFR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
+	)
+
+/*
+ * Allow for protected VMs:
+ * - Common not Private translations
+ * - User Access Override
+ * - IESB bit in the SCTLR_ELx registers
+ * - Unaligned single-copy atomicity and atomic functions
+ * - ESR_ELx.EC value on an exception by read access to feature ID space
+ * - TTL field in address operations.
+ * - Break-before-make sequences when changing translation block size
+ * - E0PDx mechanism
+ */
+#define PVM_ID_AA64MMFR2_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
+	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
+	)
+
+/*
+ * No support for Scalable Vectors for protected VMs:
+ *	Requires additional support from KVM, e.g., context-switching and
+ *	trapping at EL2
+ */
+#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
+
+/*
+ * No support for debug, including breakpoints and watchpoints, for protected
+ * VMs:
+ *	The Arm architecture mandates support for at least the Armv8 debug
+ *	architecture, which would include at least 2 hardware breakpoints and
+ *	watchpoints. Providing that support to protected guests adds
+ *	considerable state and complexity. Therefore, the reserved value of 0 is
+ *	used for debug-related fields.
+ */
+#define PVM_ID_AA64DFR0_ALLOW (0ULL)
+#define PVM_ID_AA64DFR1_ALLOW (0ULL)
+
+/*
+ * No support for implementation defined features.
+ */
+#define PVM_ID_AA64AFR0_ALLOW (0ULL)
+#define PVM_ID_AA64AFR1_ALLOW (0ULL)
+
+/*
+ * No restrictions on instructions implemented in AArch64.
+ */
+#define PVM_ID_AA64ISAR0_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
+	)
+
+#define PVM_ID_AA64ISAR1_ALLOW (\
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
+	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
+	)
+
+#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
index 657d0c94cf82..5afd14ab15b9 100644
--- a/arch/arm64/include/asm/kvm_hyp.h
+++ b/arch/arm64/include/asm/kvm_hyp.h
@@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
 void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
 #endif
 
+extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
 extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
 extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
+extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
 
 #endif /* __ARM64_KVM_HYP_H__ */
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index fe102cd2e518..6aa7b0c5bf21 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
 	void *addr = phys_to_virt(hyp_mem_base);
 	int ret;
 
+	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
+	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
+	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
+	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
 	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
+	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
 
 	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
 	if (ret)
diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
new file mode 100644
index 000000000000..0865163d363c
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
+#define __ARM64_KVM_NVHE_SYS_REGS_H__
+
+#include <asm/kvm_host.h>
+
+u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
+u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
+
+bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
+void __inject_undef64(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 8d741f71377f..0bbe37a18d5d 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-	 cache.o setup.o mm.o mem_protect.o
+	 cache.o setup.o mm.o mem_protect.o sys_regs.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
new file mode 100644
index 000000000000..ef8456c54b18
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -0,0 +1,492 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#include <asm/kvm_asm.h>
+#include <asm/kvm_fixed_config.h>
+#include <asm/kvm_mmu.h>
+
+#include <hyp/adjust_pc.h>
+
+#include "../../sys_regs.h"
+
+/*
+ * Copies of the host's CPU feature registers holding sanitized values at hyp.
+ */
+u64 id_aa64pfr0_el1_sys_val;
+u64 id_aa64pfr1_el1_sys_val;
+u64 id_aa64isar0_el1_sys_val;
+u64 id_aa64isar1_el1_sys_val;
+u64 id_aa64mmfr2_el1_sys_val;
+
+static inline void inject_undef64(struct kvm_vcpu *vcpu)
+{
+	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
+
+	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
+			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
+			     KVM_ARM64_PENDING_EXCEPTION);
+
+	__kvm_adjust_pc(vcpu);
+
+	write_sysreg_el1(esr, SYS_ESR);
+	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
+}
+
+/*
+ * Inject an unknown/undefined exception to an AArch64 guest while most of its
+ * sysregs are live.
+ */
+void __inject_undef64(struct kvm_vcpu *vcpu)
+{
+	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
+	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
+
+	inject_undef64(vcpu);
+
+	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
+	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
+}
+
+/*
+ * Accessor for undefined accesses.
+ */
+static bool undef_access(struct kvm_vcpu *vcpu,
+			 struct sys_reg_params *p,
+			 const struct sys_reg_desc *r)
+{
+	__inject_undef64(vcpu);
+	return false;
+}
+
+/*
+ * Returns the restricted feature values of the feature register based on the
+ * limitations in restrict_fields.
+ * A feature id field value of 0b0000 does not impose any restrictions.
+ * Note: Use only for unsigned feature field values.
+ */
+static u64 get_restricted_features_unsigned(u64 sys_reg_val,
+					    u64 restrict_fields)
+{
+	u64 value = 0UL;
+	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
+
+	/*
+	 * According to the Arm Architecture Reference Manual, feature fields
+	 * use increasing values to indicate increases in functionality.
+	 * Iterate over the restricted feature fields and calculate the minimum
+	 * unsigned value between the one supported by the system, and what the
+	 * value is being restricted to.
+	 */
+	while (sys_reg_val && restrict_fields) {
+		value |= min(sys_reg_val & mask, restrict_fields & mask);
+		sys_reg_val &= ~mask;
+		restrict_fields &= ~mask;
+		mask <<= ARM64_FEATURE_FIELD_BITS;
+	}
+
+	return value;
+}
+
+/*
+ * Functions that return the value of feature id registers for protected VMs
+ * based on allowed features, system features, and KVM support.
+ */
+
+u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
+{
+	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
+	u64 set_mask = 0;
+	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
+
+	if (!vcpu_has_sve(vcpu))
+		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
+
+	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
+		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
+
+	/* Spectre and Meltdown mitigation in KVM */
+	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
+			       (u64)kvm->arch.pfr0_csv2);
+	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
+			       (u64)kvm->arch.pfr0_csv3);
+
+	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
+}
+
+u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
+{
+	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
+	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
+
+	if (!kvm_has_mte(kvm))
+		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
+
+	return id_aa64pfr1_el1_sys_val & allow_mask;
+}
+
+u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for Scalable Vectors, therefore, hyp has no sanitized
+	 * copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for debug, including breakpoints, and watchpoints,
+	 * therefore, pKVM has no sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for debug, therefore, hyp has no sanitized copy of the
+	 * feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for implementation defined features, therefore, hyp has no
+	 * sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
+{
+	/*
+	 * No support for implementation defined features, therefore, hyp has no
+	 * sanitized copy of the feature id register.
+	 */
+	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
+	return 0;
+}
+
+u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
+}
+
+u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
+{
+	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
+
+	if (!vcpu_has_ptrauth(vcpu))
+		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
+				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
+
+	return id_aa64isar1_el1_sys_val & allow_mask;
+}
+
+u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
+{
+	u64 set_mask;
+
+	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
+		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
+
+	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
+}
+
+u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
+}
+
+u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
+{
+	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
+}
+
+/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
+static u64 read_id_reg(const struct kvm_vcpu *vcpu,
+		       struct sys_reg_desc const *r)
+{
+	u32 id = reg_to_encoding(r);
+
+	switch (id) {
+	case SYS_ID_AA64PFR0_EL1:
+		return get_pvm_id_aa64pfr0(vcpu);
+	case SYS_ID_AA64PFR1_EL1:
+		return get_pvm_id_aa64pfr1(vcpu);
+	case SYS_ID_AA64ZFR0_EL1:
+		return get_pvm_id_aa64zfr0(vcpu);
+	case SYS_ID_AA64DFR0_EL1:
+		return get_pvm_id_aa64dfr0(vcpu);
+	case SYS_ID_AA64DFR1_EL1:
+		return get_pvm_id_aa64dfr1(vcpu);
+	case SYS_ID_AA64AFR0_EL1:
+		return get_pvm_id_aa64afr0(vcpu);
+	case SYS_ID_AA64AFR1_EL1:
+		return get_pvm_id_aa64afr1(vcpu);
+	case SYS_ID_AA64ISAR0_EL1:
+		return get_pvm_id_aa64isar0(vcpu);
+	case SYS_ID_AA64ISAR1_EL1:
+		return get_pvm_id_aa64isar1(vcpu);
+	case SYS_ID_AA64MMFR0_EL1:
+		return get_pvm_id_aa64mmfr0(vcpu);
+	case SYS_ID_AA64MMFR1_EL1:
+		return get_pvm_id_aa64mmfr1(vcpu);
+	case SYS_ID_AA64MMFR2_EL1:
+		return get_pvm_id_aa64mmfr2(vcpu);
+	default:
+		/*
+		 * Should never happen because all cases are covered in
+		 * pvm_sys_reg_descs[] below.
+		 */
+		WARN_ON(1);
+		break;
+	}
+
+	return 0;
+}
+
+/*
+ * Accessor for AArch32 feature id registers.
+ *
+ * The value of these registers is "unknown" according to the spec if AArch32
+ * isn't supported.
+ */
+static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
+				  struct sys_reg_params *p,
+				  const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		return undef_access(vcpu, p, r);
+
+	/*
+	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
+	 * of AArch32 feature id registers.
+	 */
+	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
+		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
+
+	/* Use 0 for architecturally "unknown" values. */
+	p->regval = 0;
+	return true;
+}
+
+/*
+ * Accessor for AArch64 feature id registers.
+ *
+ * If access is allowed, set the regval to the protected VM's view of the
+ * register and return true.
+ * Otherwise, inject an undefined exception and return false.
+ */
+static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
+				  struct sys_reg_params *p,
+				  const struct sys_reg_desc *r)
+{
+	if (p->is_write)
+		return undef_access(vcpu, p, r);
+
+	p->regval = read_id_reg(vcpu, r);
+	return true;
+}
+
+/* Mark the specified system register as an AArch32 feature id register. */
+#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
+
+/* Mark the specified system register as an AArch64 feature id register. */
+#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
+
+/* Mark the specified system register as not being handled in hyp. */
+#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
+
+/*
+ * Architected system registers.
+ * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
+ *
+ * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
+ * it will lead to injecting an exception into the guest.
+ */
+static const struct sys_reg_desc pvm_sys_reg_descs[] = {
+	/* Cache maintenance by set/way operations are restricted. */
+
+	/* Debug and Trace Registers are restricted. */
+
+	/* AArch64 mappings of the AArch32 ID registers */
+	/* CRm=1 */
+	AARCH32(SYS_ID_PFR0_EL1),
+	AARCH32(SYS_ID_PFR1_EL1),
+	AARCH32(SYS_ID_DFR0_EL1),
+	AARCH32(SYS_ID_AFR0_EL1),
+	AARCH32(SYS_ID_MMFR0_EL1),
+	AARCH32(SYS_ID_MMFR1_EL1),
+	AARCH32(SYS_ID_MMFR2_EL1),
+	AARCH32(SYS_ID_MMFR3_EL1),
+
+	/* CRm=2 */
+	AARCH32(SYS_ID_ISAR0_EL1),
+	AARCH32(SYS_ID_ISAR1_EL1),
+	AARCH32(SYS_ID_ISAR2_EL1),
+	AARCH32(SYS_ID_ISAR3_EL1),
+	AARCH32(SYS_ID_ISAR4_EL1),
+	AARCH32(SYS_ID_ISAR5_EL1),
+	AARCH32(SYS_ID_MMFR4_EL1),
+	AARCH32(SYS_ID_ISAR6_EL1),
+
+	/* CRm=3 */
+	AARCH32(SYS_MVFR0_EL1),
+	AARCH32(SYS_MVFR1_EL1),
+	AARCH32(SYS_MVFR2_EL1),
+	AARCH32(SYS_ID_PFR2_EL1),
+	AARCH32(SYS_ID_DFR1_EL1),
+	AARCH32(SYS_ID_MMFR5_EL1),
+
+	/* AArch64 ID registers */
+	/* CRm=4 */
+	AARCH64(SYS_ID_AA64PFR0_EL1),
+	AARCH64(SYS_ID_AA64PFR1_EL1),
+	AARCH64(SYS_ID_AA64ZFR0_EL1),
+	AARCH64(SYS_ID_AA64DFR0_EL1),
+	AARCH64(SYS_ID_AA64DFR1_EL1),
+	AARCH64(SYS_ID_AA64AFR0_EL1),
+	AARCH64(SYS_ID_AA64AFR1_EL1),
+	AARCH64(SYS_ID_AA64ISAR0_EL1),
+	AARCH64(SYS_ID_AA64ISAR1_EL1),
+	AARCH64(SYS_ID_AA64MMFR0_EL1),
+	AARCH64(SYS_ID_AA64MMFR1_EL1),
+	AARCH64(SYS_ID_AA64MMFR2_EL1),
+
+	HOST_HANDLED(SYS_SCTLR_EL1),
+	HOST_HANDLED(SYS_ACTLR_EL1),
+	HOST_HANDLED(SYS_CPACR_EL1),
+
+	HOST_HANDLED(SYS_RGSR_EL1),
+	HOST_HANDLED(SYS_GCR_EL1),
+
+	/* Scalable Vector Registers are restricted. */
+
+	HOST_HANDLED(SYS_TTBR0_EL1),
+	HOST_HANDLED(SYS_TTBR1_EL1),
+	HOST_HANDLED(SYS_TCR_EL1),
+
+	HOST_HANDLED(SYS_APIAKEYLO_EL1),
+	HOST_HANDLED(SYS_APIAKEYHI_EL1),
+	HOST_HANDLED(SYS_APIBKEYLO_EL1),
+	HOST_HANDLED(SYS_APIBKEYHI_EL1),
+	HOST_HANDLED(SYS_APDAKEYLO_EL1),
+	HOST_HANDLED(SYS_APDAKEYHI_EL1),
+	HOST_HANDLED(SYS_APDBKEYLO_EL1),
+	HOST_HANDLED(SYS_APDBKEYHI_EL1),
+	HOST_HANDLED(SYS_APGAKEYLO_EL1),
+	HOST_HANDLED(SYS_APGAKEYHI_EL1),
+
+	HOST_HANDLED(SYS_AFSR0_EL1),
+	HOST_HANDLED(SYS_AFSR1_EL1),
+	HOST_HANDLED(SYS_ESR_EL1),
+
+	HOST_HANDLED(SYS_ERRIDR_EL1),
+	HOST_HANDLED(SYS_ERRSELR_EL1),
+	HOST_HANDLED(SYS_ERXFR_EL1),
+	HOST_HANDLED(SYS_ERXCTLR_EL1),
+	HOST_HANDLED(SYS_ERXSTATUS_EL1),
+	HOST_HANDLED(SYS_ERXADDR_EL1),
+	HOST_HANDLED(SYS_ERXMISC0_EL1),
+	HOST_HANDLED(SYS_ERXMISC1_EL1),
+
+	HOST_HANDLED(SYS_TFSR_EL1),
+	HOST_HANDLED(SYS_TFSRE0_EL1),
+
+	HOST_HANDLED(SYS_FAR_EL1),
+	HOST_HANDLED(SYS_PAR_EL1),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_MAIR_EL1),
+	HOST_HANDLED(SYS_AMAIR_EL1),
+
+	/* Limited Ordering Regions Registers are restricted. */
+
+	HOST_HANDLED(SYS_VBAR_EL1),
+	HOST_HANDLED(SYS_DISR_EL1),
+
+	/* GIC CPU Interface registers are restricted. */
+
+	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
+	HOST_HANDLED(SYS_TPIDR_EL1),
+
+	HOST_HANDLED(SYS_SCXTNUM_EL1),
+
+	HOST_HANDLED(SYS_CNTKCTL_EL1),
+
+	HOST_HANDLED(SYS_CCSIDR_EL1),
+	HOST_HANDLED(SYS_CLIDR_EL1),
+	HOST_HANDLED(SYS_CSSELR_EL1),
+	HOST_HANDLED(SYS_CTR_EL0),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_TPIDR_EL0),
+	HOST_HANDLED(SYS_TPIDRRO_EL0),
+
+	HOST_HANDLED(SYS_SCXTNUM_EL0),
+
+	/* Activity Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
+	HOST_HANDLED(SYS_CNTP_CTL_EL0),
+	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
+
+	/* Performance Monitoring Registers are restricted. */
+
+	HOST_HANDLED(SYS_DACR32_EL2),
+	HOST_HANDLED(SYS_IFSR32_EL2),
+	HOST_HANDLED(SYS_FPEXC32_EL2),
+};
+
+/*
+ * Handler for protected VM MSR, MRS or System instruction execution.
+ *
+ * Returns true if the hypervisor has handled the exit, and control should go
+ * back to the guest, or false if it hasn't, to be handled by the host.
+ */
+bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	const struct sys_reg_desc *r;
+	struct sys_reg_params params;
+	unsigned long esr = kvm_vcpu_get_esr(vcpu);
+	int Rt = kvm_vcpu_sys_get_rt(vcpu);
+
+	params = esr_sys64_to_params(esr);
+	params.regval = vcpu_get_reg(vcpu, Rt);
+
+	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
+
+	/* Undefined access (RESTRICTED). */
+	if (r == NULL) {
+		__inject_undef64(vcpu);
+		return true;
+	}
+
+	/* Handled by the host (HOST_HANDLED) */
+	if (r->access == NULL)
+		return false;
+
+	/* Handled by hyp: skip instruction if instructed to do so. */
+	if (r->access(vcpu, &params, r))
+		__kvm_skip_instr(vcpu);
+
+	if (!params.is_write)
+		vcpu_set_reg(vcpu, Rt, params.regval);
+
+	return true;
+}
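
To make the masking logic above easier to follow, here is a minimal,
standalone sketch (not part of the patch) of the per-field unsigned minimum
that get_restricted_features_unsigned() computes, and of how the
get_pvm_id_*() helpers combine it with an ALLOW mask. The FIELD_BITS and
FIELD_MASK helpers and the register values in main() are simplified
placeholders, not the kernel's GENMASK_ULL/FIELD_PREP definitions or real ID
register contents.

#include <stdio.h>
#include <stdint.h>

#define FIELD_BITS      4
#define FIELD_MASK      0xfULL

/* Field-wise unsigned minimum, mirroring get_restricted_features_unsigned(). */
static uint64_t restrict_unsigned(uint64_t sys_val, uint64_t restrict_fields)
{
        uint64_t value = 0;
        uint64_t mask = FIELD_MASK;

        while (sys_val && restrict_fields) {
                uint64_t a = sys_val & mask;
                uint64_t b = restrict_fields & mask;

                /* Both operands are masked to the same field, so this is a
                 * per-field comparison. */
                value |= (a < b) ? a : b;
                sys_val &= ~mask;
                restrict_fields &= ~mask;
                mask <<= FIELD_BITS;
        }

        return value;
}

int main(void)
{
        /* Made-up host value and masks, not real ID register contents. */
        uint64_t host_val = 0x0000000000102236ULL;
        uint64_t allow    = 0x00000000000ff0f0ULL;    /* pass-through fields */
        uint64_t restrict_to = 0x0000000000000202ULL; /* capped fields */

        /* Same shape as get_pvm_id_aa64mmfr0(): allowed fields pass through,
         * restricted fields are clamped to the field-wise minimum. */
        uint64_t pvm_view = (host_val & allow) |
                            restrict_unsigned(host_val, restrict_to);

        printf("pVM view: 0x%016llx\n", (unsigned long long)pvm_view);
        return 0;
}
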
-- 
2.33.0.464.g1972c5931b-goog

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
  2021-09-22 12:46 ` Fuad Tabba
  (?)
@ 2021-09-22 12:47   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Protected VMs have more restricted features that need to be
trapped. Moreover, the host should not be trusted to set the
appropriate trapping registers and their values.

Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
cptr_el2 at EL2 for protected guests, based on the values of the
guest's feature id registers.

No functional change intended as trap handlers introduced in the
previous patch are still not hooked in to the guest exit
handlers.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
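A rough, standalone sketch of the pattern the new pkvm.c code below follows:
read the protected VM's (already restricted) view of a feature id register,
derive which trap bits to set or clear, and apply them. The FAKE_* field and
bit definitions here are placeholders for illustration only, not the
architectural ID_AA64DFR0_EL1/MDCR_EL2 encodings.

#include <stdio.h>
#include <stdint.h>

/* Placeholder field/bit definitions, not the architectural encodings. */
#define FAKE_DFR0_PMUVER_MASK   0x0000000000000f00ULL
#define FAKE_DFR0_DEBUGVER_MASK 0x000000000000000fULL
#define FAKE_MDCR_TPM           (1ULL << 0)
#define FAKE_MDCR_TPMCR         (1ULL << 1)
#define FAKE_MDCR_TDA           (1ULL << 2)
#define FAKE_MDCR_HPME          (1ULL << 3)

/* Derive an mdcr_el2-like value from the pVM's view of a debug feature reg. */
static uint64_t init_mdcr(uint64_t pvm_dfr0, uint64_t mdcr)
{
        uint64_t set = 0, clear = 0;

        /* No PMU advertised to the guest: trap PMU accesses. */
        if (!(pvm_dfr0 & FAKE_DFR0_PMUVER_MASK)) {
                set |= FAKE_MDCR_TPM | FAKE_MDCR_TPMCR;
                clear |= FAKE_MDCR_HPME;
        }

        /* No debug advertised to the guest: trap debug register accesses. */
        if (!(pvm_dfr0 & FAKE_DFR0_DEBUGVER_MASK))
                set |= FAKE_MDCR_TDA;

        return (mdcr | set) & ~clear;
}

int main(void)
{
        /* PVM_ID_AA64DFR0_ALLOW is 0, so a pVM's view of the register is 0. */
        uint64_t mdcr = init_mdcr(0, FAKE_MDCR_HPME);

        printf("mdcr_el2: 0x%016llx\n", (unsigned long long)mdcr);
        return 0;
}

In the real code this runs at EL2, via the __pkvm_vcpu_init_traps hypercall
added below, so the host cannot skip or alter the resulting configuration.
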
 arch/arm64/include/asm/kvm_asm.h       |   1 +
 arch/arm64/include/asm/kvm_host.h      |   2 +
 arch/arm64/kvm/arm.c                   |   8 ++
 arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
 arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
 arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
 arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
 7 files changed, 222 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
 create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c

diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index e86045ac43ba..a460e1243cef 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -64,6 +64,7 @@
 #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector		18
 #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
 #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			20
+#define __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps		21
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index f8be56d5342b..4a323aa27a6b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -780,6 +780,8 @@ static inline bool kvm_vm_is_protected(struct kvm *kvm)
 	return false;
 }
 
+void kvm_init_protected_traps(struct kvm_vcpu *vcpu);
+
 int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
 bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
 
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 6aa7b0c5bf21..3af6d59d1919 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -620,6 +620,14 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
 
 	ret = kvm_arm_pmu_v3_enable(vcpu);
 
+	/*
+	 * Initialize traps for protected VMs.
+	 * NOTE: Move to run in EL2 directly, rather than via a hypercall, once
+	 * the code is in place for first run initialization at EL2.
+	 */
+	if (kvm_vm_is_protected(kvm))
+		kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
+
 	return ret;
 }
 
diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
new file mode 100644
index 000000000000..e6c259db6719
--- /dev/null
+++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#ifndef __ARM64_KVM_NVHE_PKVM_H__
+#define __ARM64_KVM_NVHE_PKVM_H__
+
+#include <asm/kvm_host.h>
+
+void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu);
+
+#endif /* __ARM64_KVM_NVHE_PKVM_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
index 0bbe37a18d5d..c3c11974fa3b 100644
--- a/arch/arm64/kvm/hyp/nvhe/Makefile
+++ b/arch/arm64/kvm/hyp/nvhe/Makefile
@@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
 
 obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
 	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
-	 cache.o setup.o mm.o mem_protect.o sys_regs.o
+	 cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o
 obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
 	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
 obj-y += $(lib-objs)
diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
index 8ca1104f4774..f59e0870c343 100644
--- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
+++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
@@ -15,6 +15,7 @@
 
 #include <nvhe/mem_protect.h>
 #include <nvhe/mm.h>
+#include <nvhe/pkvm.h>
 #include <nvhe/trap_handler.h>
 
 DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
@@ -160,6 +161,14 @@ static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
 {
 	cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
 }
+
+static void handle___pkvm_vcpu_init_traps(struct kvm_cpu_context *host_ctxt)
+{
+	DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
+
+	__pkvm_vcpu_init_traps(kern_hyp_va(vcpu));
+}
+
 typedef void (*hcall_t)(struct kvm_cpu_context *);
 
 #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
@@ -185,6 +194,7 @@ static const hcall_t host_hcall[] = {
 	HANDLE_FUNC(__pkvm_host_share_hyp),
 	HANDLE_FUNC(__pkvm_create_private_mapping),
 	HANDLE_FUNC(__pkvm_prot_finalize),
+	HANDLE_FUNC(__pkvm_vcpu_init_traps),
 };
 
 static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
new file mode 100644
index 000000000000..cc6139631dc4
--- /dev/null
+++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2021 Google LLC
+ * Author: Fuad Tabba <tabba@google.com>
+ */
+
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <asm/kvm_fixed_config.h>
+#include <nvhe/sys_regs.h>
+
+/*
+ * Set trap register values based on features in ID_AA64PFR0.
+ */
+static void pvm_init_traps_aa64pfr0(struct kvm_vcpu *vcpu)
+{
+	const u64 feature_ids = get_pvm_id_aa64pfr0(vcpu);
+	u64 hcr_set = 0;
+	u64 hcr_clear = 0;
+	u64 cptr_set = 0;
+
+	/* Trap AArch32 guests */
+	if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), feature_ids) <
+		    ID_AA64PFR0_ELx_32BIT_64BIT ||
+	    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), feature_ids) <
+		    ID_AA64PFR0_ELx_32BIT_64BIT)
+		hcr_set |= HCR_RW | HCR_TID0;
+
+	/* Trap RAS unless all current versions are supported */
+	if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), feature_ids) <
+	    ID_AA64PFR0_RAS_V1P1) {
+		hcr_set |= HCR_TERR | HCR_TEA;
+		hcr_clear |= HCR_FIEN;
+	}
+
+	/* Trap AMU */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_AMU), feature_ids)) {
+		hcr_clear |= HCR_AMVOFFEN;
+		cptr_set |= CPTR_EL2_TAM;
+	}
+
+	/*
+	 * Linux guests assume support for floating-point and Advanced SIMD. Do
+	 * not change the trapping behavior for these from the KVM default.
+	 */
+	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
+				PVM_ID_AA64PFR0_ALLOW));
+	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
+				PVM_ID_AA64PFR0_ALLOW));
+
+	/* Trap SVE */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_SVE), feature_ids))
+		cptr_set |= CPTR_EL2_TZ;
+
+	vcpu->arch.hcr_el2 |= hcr_set;
+	vcpu->arch.hcr_el2 &= ~hcr_clear;
+	vcpu->arch.cptr_el2 |= cptr_set;
+}
+
+/*
+ * Set trap register values based on features in ID_AA64PFR1.
+ */
+static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
+{
+	const u64 feature_ids = get_pvm_id_aa64pfr1(vcpu);
+	u64 hcr_set = 0;
+	u64 hcr_clear = 0;
+
+	/* Memory Tagging: Trap and Treat as Untagged if not supported. */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
+		hcr_set |= HCR_TID5;
+		hcr_clear |= HCR_DCT | HCR_ATA;
+	}
+
+	vcpu->arch.hcr_el2 |= hcr_set;
+	vcpu->arch.hcr_el2 &= ~hcr_clear;
+}
+
+/*
+ * Set trap register values based on features in ID_AA64DFR0.
+ */
+static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)
+{
+	const u64 feature_ids = get_pvm_id_aa64dfr0(vcpu);
+	u64 mdcr_set = 0;
+	u64 mdcr_clear = 0;
+	u64 cptr_set = 0;
+
+	/* Trap/constrain PMU */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMUVER), feature_ids)) {
+		mdcr_set |= MDCR_EL2_TPM | MDCR_EL2_TPMCR;
+		mdcr_clear |= MDCR_EL2_HPME | MDCR_EL2_MTPME |
+			      MDCR_EL2_HPMN_MASK;
+	}
+
+	/* Trap Debug */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DEBUGVER), feature_ids))
+		mdcr_set |= MDCR_EL2_TDRA | MDCR_EL2_TDA | MDCR_EL2_TDE;
+
+	/* Trap OS Double Lock */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DOUBLELOCK), feature_ids))
+		mdcr_set |= MDCR_EL2_TDOSA;
+
+	/* Trap SPE */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMSVER), feature_ids)) {
+		mdcr_set |= MDCR_EL2_TPMS;
+		mdcr_clear |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
+	}
+
+	/* Trap Trace Filter */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACE_FILT), feature_ids))
+		mdcr_set |= MDCR_EL2_TTRF;
+
+	/* Trap Trace */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACEVER), feature_ids))
+		cptr_set |= CPTR_EL2_TTA;
+
+	vcpu->arch.mdcr_el2 |= mdcr_set;
+	vcpu->arch.mdcr_el2 &= ~mdcr_clear;
+	vcpu->arch.cptr_el2 |= cptr_set;
+}
+
+/*
+ * Set trap register values based on features in ID_AA64MMFR0.
+ */
+static void pvm_init_traps_aa64mmfr0(struct kvm_vcpu *vcpu)
+{
+	const u64 feature_ids = get_pvm_id_aa64mmfr0(vcpu);
+	u64 mdcr_set = 0;
+
+	/* Trap Debug Communications Channel registers */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_FGT), feature_ids))
+		mdcr_set |= MDCR_EL2_TDCC;
+
+	vcpu->arch.mdcr_el2 |= mdcr_set;
+}
+
+/*
+ * Set trap register values based on features in ID_AA64MMFR1.
+ */
+static void pvm_init_traps_aa64mmfr1(struct kvm_vcpu *vcpu)
+{
+	const u64 feature_ids = get_pvm_id_aa64mmfr1(vcpu);
+	u64 hcr_set = 0;
+
+	/* Trap LOR */
+	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_LOR), feature_ids))
+		hcr_set |= HCR_TLOR;
+
+	vcpu->arch.hcr_el2 |= hcr_set;
+}
+
+/*
+ * Set baseline trap register values.
+ */
+static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
+{
+	const u64 hcr_trap_feat_regs = HCR_TID3;
+	const u64 hcr_trap_impdef = HCR_TACR | HCR_TIDCP | HCR_TID1;
+
+	/*
+	 * Always trap:
+	 * - Feature id registers: to control features exposed to guests
+	 * - Implementation-defined features
+	 */
+	vcpu->arch.hcr_el2 |= hcr_trap_feat_regs | hcr_trap_impdef;
+
+	/* Clear res0 and set res1 bits to trap potential new features. */
+	vcpu->arch.hcr_el2 &= ~(HCR_RES0);
+	vcpu->arch.mdcr_el2 &= ~(MDCR_EL2_RES0);
+	vcpu->arch.cptr_el2 |= CPTR_NVHE_EL2_RES1;
+	vcpu->arch.cptr_el2 &= ~(CPTR_NVHE_EL2_RES0);
+}
+
+/*
+ * Initialize trap register values for protected VMs.
+ */
+void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
+{
+	pvm_init_trap_regs(vcpu);
+	pvm_init_traps_aa64pfr0(vcpu);
+	pvm_init_traps_aa64pfr1(vcpu);
+	pvm_init_traps_aa64dfr0(vcpu);
+	pvm_init_traps_aa64mmfr0(vcpu);
+	pvm_init_traps_aa64mmfr1(vcpu);
+}
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 10/12] KVM: arm64: Move sanitized copies of CPU features
  2021-09-22 12:46 ` Fuad Tabba
@ 2021-09-22 12:47   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Move the sanitized copies of the CPU feature registers to the
recently created sys_regs.c. This consolidates all copies in a
more relevant file.

No functional change intended.
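
As background on where these sanitized copies get their values: the host
fills them in from the kernel's sanitised view of the ID registers before
the hypervisor relies on them. The snippet below is only an editorial
sketch of that idea (it is not code from this series, and the actual call
site in the series lives on the host side); it uses the existing
read_sanitised_ftr_reg() and kvm_nvhe_sym() helpers, with the function
name made up for illustration.

#include <asm/cpufeature.h>
#include <asm/kvm_asm.h>
#include <asm/sysreg.h>

extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);

/* Illustration only: populate the hyp copies from the sanitised registers. */
static void example_init_hyp_id_reg_copies(void)
{
	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) =
		read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) =
		read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
}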

Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/mem_protect.c | 6 ------
 arch/arm64/kvm/hyp/nvhe/sys_regs.c    | 2 ++
 2 files changed, 2 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kvm/hyp/nvhe/mem_protect.c b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
index 2a07d63b8498..f6d96e60b323 100644
--- a/arch/arm64/kvm/hyp/nvhe/mem_protect.c
+++ b/arch/arm64/kvm/hyp/nvhe/mem_protect.c
@@ -25,12 +25,6 @@ struct host_kvm host_kvm;
 
 static struct hyp_pool host_s2_pool;
 
-/*
- * Copies of the host's CPU features registers holding sanitized values.
- */
-u64 id_aa64mmfr0_el1_sys_val;
-u64 id_aa64mmfr1_el1_sys_val;
-
 const u8 pkvm_hyp_id = 1;
 
 static void *host_s2_zalloc_pages_exact(size_t size)
diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
index ef8456c54b18..13163be83756 100644
--- a/arch/arm64/kvm/hyp/nvhe/sys_regs.c
+++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
@@ -19,6 +19,8 @@ u64 id_aa64pfr0_el1_sys_val;
 u64 id_aa64pfr1_el1_sys_val;
 u64 id_aa64isar0_el1_sys_val;
 u64 id_aa64isar1_el1_sys_val;
+u64 id_aa64mmfr0_el1_sys_val;
+u64 id_aa64mmfr1_el1_sys_val;
 u64 id_aa64mmfr2_el1_sys_val;
 
 static inline void inject_undef64(struct kvm_vcpu *vcpu)
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 11/12] KVM: arm64: Trap access to pVM restricted features
  2021-09-22 12:46 ` Fuad Tabba
@ 2021-09-22 12:47   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Trap accesses to restricted features for VMs running in protected
mode.

Accesses to feature registers are emulated, and only supported
features are exposed to protected VMs.
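
As a rough illustration of what the emulation amounts to (an editorial
sketch, not code from this series; the real helpers are the get_pvm_id_*()
functions added in the earlier sys_regs patch, and they also cap some
fields rather than simply clearing them): the guest-visible value is the
host's sanitised register masked by the allow-list from
kvm_fixed_config.h. The function name below is made up.

#include <asm/kvm_fixed_config.h>

extern u64 id_aa64pfr0_el1_sys_val;

/* Illustration only: hide feature fields not on the pVM allow-list. */
static u64 example_pvm_read_id_aa64pfr0(void)
{
	return id_aa64pfr0_el1_sys_val & PVM_ID_AA64PFR0_ALLOW;
}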

Accesses to restricted registers as well as restricted
instructions are trapped, and an undefined exception is injected
into the protected guests, i.e., with EC = 0x0 (unknown reason).
This EC is the one used, according to the Arm Architecture
Reference Manual, for unallocated or undefined system registers
or instructions.

This only affects the functionality of protected VMs. It should
not affect non-protected VMs when KVM is running in protected
mode.
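
For reference, this is roughly what "EC = 0x0" amounts to in ESR terms
(an editorial sketch, not part of the patch; the injection itself is done
by __inject_undef64() in the hyp sys_regs code introduced earlier in the
series, and the function name below is made up):

#include <asm/esr.h>

/*
 * Illustration only: an ESR value with EC == 0b000000 ("unknown reason")
 * and the IL bit set, as a guest would see it for an injected UNDEF.
 */
static u64 example_undef_esr(void)
{
	return ((u64)ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT) | ESR_ELx_IL;
}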

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/switch.c | 60 ++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 49080c607838..2bf5952f651b 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -20,6 +20,7 @@
 #include <asm/kprobes.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
+#include <asm/kvm_fixed_config.h>
 #include <asm/kvm_hyp.h>
 #include <asm/kvm_mmu.h>
 #include <asm/fpsimd.h>
@@ -28,6 +29,7 @@
 #include <asm/thread_info.h>
 
 #include <nvhe/mem_protect.h>
+#include <nvhe/sys_regs.h>
 
 /* Non-VHE specific context */
 DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
@@ -158,6 +160,49 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
 		write_sysreg(pmu->events_host, pmcntenset_el0);
 }
 
+/**
+ * Handler for protected VM restricted exceptions.
+ *
+ * Inject an undefined exception into the guest and return true to indicate that
+ * the hypervisor has handled the exit, and control should go back to the guest.
+ */
+static bool kvm_handle_pvm_restricted(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	__inject_undef64(vcpu);
+	return true;
+}
+
+/**
+ * Handler for protected VM MSR, MRS or System instruction execution in AArch64.
+ *
+ * Returns true if the hypervisor has handled the exit, and control should go
+ * back to the guest, or false if it hasn't.
+ */
+static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	if (kvm_handle_pvm_sysreg(vcpu, exit_code))
+		return true;
+	else
+		return kvm_hyp_handle_sysreg(vcpu, exit_code);
+}
+
+/**
+ * Handler for protected floating-point and Advanced SIMD accesses.
+ *
+ * Returns true if the hypervisor has handled the exit, and control should go
+ * back to the guest, or false if it hasn't.
+ */
+static bool kvm_handle_pvm_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	/* Linux guests assume support for floating-point and Advanced SIMD. */
+	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
+				PVM_ID_AA64PFR0_ALLOW));
+	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
+				PVM_ID_AA64PFR0_ALLOW));
+
+	return kvm_hyp_handle_fpsimd(vcpu, exit_code);
+}
+
 static const exit_handler_fn hyp_exit_handlers[] = {
 	[0 ... ESR_ELx_EC_MAX]		= NULL,
 	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
@@ -170,8 +215,23 @@ static const exit_handler_fn hyp_exit_handlers[] = {
 	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
 };
 
+static const exit_handler_fn pvm_exit_handlers[] = {
+	[0 ... ESR_ELx_EC_MAX]		= NULL,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_SYS64]		= kvm_handle_pvm_sys64,
+	[ESR_ELx_EC_SVE]		= kvm_handle_pvm_restricted,
+	[ESR_ELx_EC_FP_ASIMD]		= kvm_handle_pvm_fpsimd,
+	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
+	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
+	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
+};
+
 static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
 {
+	if (unlikely(kvm_vm_is_protected(kvm)))
+		return pvm_exit_handlers;
+
 	return hyp_exit_handlers;
 }
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* [PATCH v6 12/12] KVM: arm64: Handle protected guests at 32 bits
  2021-09-22 12:46 ` Fuad Tabba
@ 2021-09-22 12:47   ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-09-22 12:47 UTC (permalink / raw)
  To: kvmarm
  Cc: maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team, tabba

Protected KVM does not support protected AArch32 guests. However,
it is possible for the guest to force itself to run in AArch32,
potentially causing problems. Add an extra check so that if the
hypervisor catches the guest doing that, it can prevent the guest
from running again by resetting vcpu->arch.target and returning
ARM_EXCEPTION_IL.

If this were to happen, the VMM can try to fix it by
re-initializing the vcpu with KVM_ARM_VCPU_INIT; however, this is
likely not possible for protected VMs.
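
For context, the VMM-side recovery mentioned above would look roughly
like the sketch below (editorial, not part of the patch; it assumes the
usual vm_fd/vcpu_fd descriptors from KVM setup, and in practice the
features[] array would have to be restored to whatever the VMM
originally requested):

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Illustration only: re-initialize a vcpu the hypervisor marked invalid. */
static int example_reinit_vcpu(int vm_fd, int vcpu_fd)
{
	struct kvm_vcpu_init init;

	if (ioctl(vm_fd, KVM_ARM_PREFERRED_TARGET, &init) < 0)
		return -1;

	return ioctl(vcpu_fd, KVM_ARM_VCPU_INIT, &init);
}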

Adapted from commit 22f553842b14 ("KVM: arm64: Handle Asymmetric
AArch32 systems")

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/hyp/nvhe/switch.c | 40 ++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index 2bf5952f651b..d66226e49013 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -235,6 +235,43 @@ static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
 	return hyp_exit_handlers;
 }
 
+/*
+ * Some guests (e.g., protected VMs) might not be allowed to run in AArch32.
+ * The ARMv8 architecture does not give the hypervisor a mechanism to prevent a
+ * guest from dropping to AArch32 EL0 if implemented by the CPU. If the
+ * hypervisor spots a guest in such a state, ensure it is handled, and don't
+ * trust the host to spot or fix it.  The check below is based on the one in
+ * kvm_arch_vcpu_ioctl_run().
+ *
+ * Returns false if the guest ran in AArch32 when it shouldn't have, and
+ * thus should exit to the host, or true if the guest run loop can continue.
+ */
+static bool handle_aarch32_guest(struct kvm_vcpu *vcpu, u64 *exit_code)
+{
+	struct kvm *kvm = (struct kvm *) kern_hyp_va(vcpu->kvm);
+	bool is_aarch32_allowed =
+		FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0),
+			  get_pvm_id_aa64pfr0(vcpu)) >=
+				ID_AA64PFR0_ELx_32BIT_64BIT;
+
+	if (kvm_vm_is_protected(kvm) &&
+	    vcpu_mode_is_32bit(vcpu) &&
+	    !is_aarch32_allowed) {
+		/*
+		 * As we have caught the guest red-handed, decide that it isn't
+		 * fit for purpose anymore by making the vcpu invalid. The VMM
+		 * can try and fix it by re-initializing the vcpu with
+		 * KVM_ARM_VCPU_INIT, however, this is likely not possible for
+		 * protected VMs.
+		 */
+		vcpu->arch.target = -1;
+		*exit_code = ARM_EXCEPTION_IL;
+		return false;
+	}
+
+	return true;
+}
+
 /* Switch to the guest for legacy non-VHE systems */
 int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
@@ -297,6 +334,9 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 		/* Jump in the fire! */
 		exit_code = __guest_enter(vcpu);
 
+		if (unlikely(!handle_aarch32_guest(vcpu, &exit_code)))
+			break;
+
 		/* And we're baaack! */
 	} while (fixup_guest_exit(vcpu, &exit_code));
 
-- 
2.33.0.464.g1972c5931b-goog


^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 04/12] KVM: arm64: Add missing FORCE prerequisite in Makefile
  2021-09-22 12:46   ` Fuad Tabba
@ 2021-09-22 14:17     ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-09-22 14:17 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, 22 Sep 2021 13:46:56 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Add missing FORCE prerequisite for hyp relocation target.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 5df6193fc430..8d741f71377f 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -54,7 +54,7 @@ $(obj)/kvm_nvhe.tmp.o: $(obj)/hyp.lds $(addprefix $(obj)/,$(hyp-obj)) FORCE
>  #    runtime. Because the hypervisor is part of the kernel binary, relocations
>  #    produce a kernel VA. We enumerate relocations targeting hyp at build time
>  #    and convert the kernel VAs at those positions to hyp VAs.
> -$(obj)/hyp-reloc.S: $(obj)/kvm_nvhe.tmp.o $(obj)/gen-hyprel
> +$(obj)/hyp-reloc.S: $(obj)/kvm_nvhe.tmp.o $(obj)/gen-hyprel FORCE
>  	$(call if_changed,hyprel)
>  
>  # 5) Compile hyp-reloc.S and link it into the existing partially linked object.

There is already a fix[1] queued for this.

Thanks,

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm.git/commit/?h=fixes&id=a49b50a3c1c3226d26e1dd11e8b763f27e477623

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 01/12] KVM: arm64: Move __get_fault_info() and co into their own include file
  2021-09-22 12:46   ` Fuad Tabba
@ 2021-09-30 13:04     ` Will Deacon
  -1 siblings, 0 replies; 90+ messages in thread
From: Will Deacon @ 2021-09-30 13:04 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:46:53PM +0100, Fuad Tabba wrote:
> From: Marc Zyngier <maz@kernel.org>
> 
> In order to avoid including the whole of the switching helpers
> in unrelated files, move the __get_fault_info() and related helpers
> into their own include file.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/include/hyp/fault.h  | 75 +++++++++++++++++++++++++
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 61 +-------------------
>  arch/arm64/kvm/hyp/nvhe/mem_protect.c   |  2 +-
>  3 files changed, 77 insertions(+), 61 deletions(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/hyp/fault.h
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/fault.h b/arch/arm64/kvm/hyp/include/hyp/fault.h
> new file mode 100644
> index 000000000000..1b8a2dcd712f
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/hyp/fault.h
> @@ -0,0 +1,75 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2015 - ARM Ltd
> + * Author: Marc Zyngier <marc.zyngier@arm.com>

May as well fix the broken email address? ^^

> + */
> +
> +#ifndef __ARM64_KVM_HYP_FAULT_H__
> +#define __ARM64_KVM_HYP_FAULT_H__
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_emulate.h>
> +#include <asm/kvm_hyp.h>
> +#include <asm/kvm_mmu.h>

Strictly speaking, I think you're probably missing a bunch of includes here
(e.g. asm/sysreg.h, asm/kvm_arm.h, asm/cpufeature.h, ...)

Nits aside:

Acked-by: Will Deacon <will@kernel.org>

Will
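
For reference, a minimal sketch of how the top of fault.h could look
with the extra includes suggested above. The exact include list (and
the fixed author address) is an assumption for illustration, not
necessarily what was merged:

// SPDX-License-Identifier: GPL-2.0-only
/*
 * Copyright (C) 2015 - ARM Ltd
 * Author: Marc Zyngier <maz@kernel.org>
 */

#ifndef __ARM64_KVM_HYP_FAULT_H__
#define __ARM64_KVM_HYP_FAULT_H__

/* Pull in everything the fault helpers use directly. */
#include <asm/cpufeature.h>
#include <asm/kvm_arm.h>
#include <asm/kvm_asm.h>
#include <asm/kvm_emulate.h>
#include <asm/kvm_hyp.h>
#include <asm/kvm_mmu.h>
#include <asm/sysreg.h>

/* __get_fault_info() and its helpers follow, unchanged from the patch. */

#endif /* __ARM64_KVM_HYP_FAULT_H__ */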

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 02/12] KVM: arm64: Don't include switch.h into nvhe/kvm-main.c
  2021-09-22 12:46   ` Fuad Tabba
@ 2021-09-30 13:07     ` Will Deacon
  -1 siblings, 0 replies; 90+ messages in thread
From: Will Deacon @ 2021-09-30 13:07 UTC (permalink / raw)
  To: Fuad Tabba; +Cc: kernel-team, kvm, maz, pbonzini, kvmarm, linux-arm-kernel

On Wed, Sep 22, 2021 at 01:46:54PM +0100, Fuad Tabba wrote:
> From: Marc Zyngier <maz@kernel.org>
> 
> hyp-main.c includes switch.h while it only requires adjust-pc.h.
> Fix it to remove an unnecessary dependency.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 2da6aa8da868..8ca1104f4774 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -4,7 +4,7 @@
>   * Author: Andrew Scull <ascull@google.com>
>   */
>  
> -#include <hyp/switch.h>
> +#include <hyp/adjust_pc.h>
>  
>  #include <asm/pgtable-types.h>
>  #include <asm/kvm_asm.h>

Acked-by: Will Deacon <will@kernel.org>

Will

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
  2021-09-22 12:46   ` Fuad Tabba
@ 2021-09-30 13:35     ` Will Deacon
  -1 siblings, 0 replies; 90+ messages in thread
From: Will Deacon @ 2021-09-30 13:35 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:46:55PM +0100, Fuad Tabba wrote:
> From: Marc Zyngier <maz@kernel.org>
> 
> Simplify the early exception handling by slicing the gigantic decoding
> tree into a more manageable set of functions, similar to what we have
> in handle_exit.c.
> 
> This will also make the structure reusable for pKVM's own early exit
> handling.
> 
> Signed-off-by: Marc Zyngier <maz@kernel.org>
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/include/hyp/switch.h | 160 ++++++++++++++----------
>  arch/arm64/kvm/hyp/nvhe/switch.c        |  17 +++
>  arch/arm64/kvm/hyp/vhe/switch.c         |  17 +++
>  3 files changed, 126 insertions(+), 68 deletions(-)
> 
> diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
> index 54abc8298ec3..0397606c0951 100644
> --- a/arch/arm64/kvm/hyp/include/hyp/switch.h
> +++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
> @@ -136,16 +136,7 @@ static inline void ___deactivate_traps(struct kvm_vcpu *vcpu)
>  
>  static inline bool __populate_fault_info(struct kvm_vcpu *vcpu)
>  {
> -	u8 ec;
> -	u64 esr;
> -
> -	esr = vcpu->arch.fault.esr_el2;
> -	ec = ESR_ELx_EC(esr);
> -
> -	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
> -		return true;
> -
> -	return __get_fault_info(esr, &vcpu->arch.fault);
> +	return __get_fault_info(vcpu->arch.fault.esr_el2, &vcpu->arch.fault);
>  }
>  
>  static inline void __hyp_sve_save_host(struct kvm_vcpu *vcpu)
> @@ -166,8 +157,13 @@ static inline void __hyp_sve_restore_guest(struct kvm_vcpu *vcpu)
>  	write_sysreg_el1(__vcpu_sys_reg(vcpu, ZCR_EL1), SYS_ZCR);
>  }
>  
> -/* Check for an FPSIMD/SVE trap and handle as appropriate */
> -static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
> +/*
> + * We trap the first access to the FP/SIMD to save the host context and
> + * restore the guest context lazily.
> + * If FP/SIMD is not implemented, handle the trap and inject an undefined
> + * instruction exception to the guest. Similarly for trapped SVE accesses.
> + */
> +static bool kvm_hyp_handle_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
>  	bool sve_guest, sve_host;
>  	u8 esr_ec;
> @@ -185,9 +181,6 @@ static inline bool __hyp_handle_fpsimd(struct kvm_vcpu *vcpu)
>  	}
>  
>  	esr_ec = kvm_vcpu_trap_get_class(vcpu);
> -	if (esr_ec != ESR_ELx_EC_FP_ASIMD &&
> -	    esr_ec != ESR_ELx_EC_SVE)
> -		return false;
>  
>  	/* Don't handle SVE traps for non-SVE vcpus here: */
>  	if (!sve_guest && esr_ec != ESR_ELx_EC_FP_ASIMD)
> @@ -325,7 +318,7 @@ static inline bool esr_is_ptrauth_trap(u32 esr)
>  
>  DECLARE_PER_CPU(struct kvm_cpu_context, kvm_hyp_ctxt);
>  
> -static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
> +static bool kvm_hyp_handle_ptrauth(struct kvm_vcpu *vcpu, u64 *exit_code)
>  {
>  	struct kvm_cpu_context *ctxt;
>  	u64 val;
> @@ -350,6 +343,87 @@ static inline bool __hyp_handle_ptrauth(struct kvm_vcpu *vcpu)
>  	return true;
>  }
>  
> +static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	if (cpus_have_final_cap(ARM64_WORKAROUND_CAVIUM_TX2_219_TVM) &&
> +	    handle_tx2_tvm(vcpu))
> +		return true;
> +
> +	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
> +	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
> +		return true;
> +
> +	return false;
> +}
> +
> +static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
> +	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
> +		return true;

I think you're now calling this for the 64-bit CP15 access path, which I
don't think is correct. Maybe have separate handlers for 32-bit vs 64-bit
accesses?

Will
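
For context, the shape of the refactoring under discussion boils down
to the sketch below. The table, handler signature and helper names come
from the patch; the wrapper name and the direct use of vcpu->kvm
(ignoring the hyp VA translation the nVHE code needs) are illustrative
assumptions, not the exact hunk:

typedef bool (*exit_handler_fn)(struct kvm_vcpu *vcpu, u64 *exit_code);

/*
 * Look up the handler for the trapped exception class and run it. A
 * NULL entry, or a handler returning false, means the exit is not
 * fixed up at EL2 and is left for the usual exit handling in the host
 * run loop.
 */
static bool kvm_hyp_handle_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
{
	const exit_handler_fn *handlers = kvm_get_exit_handler_array(vcpu->kvm);
	exit_handler_fn fn = handlers[kvm_vcpu_trap_get_class(vcpu)];

	return fn ? fn(vcpu, exit_code) : false;
}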

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
  2021-09-30 13:35     ` Will Deacon
@ 2021-09-30 16:02       ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-09-30 16:02 UTC (permalink / raw)
  To: Will Deacon
  Cc: Fuad Tabba, kvmarm, james.morse, alexandru.elisei,
	suzuki.poulose, mark.rutland, christoffer.dall, pbonzini,
	drjones, oupton, qperret, kvm, linux-arm-kernel, kernel-team

On 2021-09-30 14:35, Will Deacon wrote:
> On Wed, Sep 22, 2021 at 01:46:55PM +0100, Fuad Tabba wrote:
>> From: Marc Zyngier <maz@kernel.org>

>> +static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 
>> *exit_code)
>> +{
>> +	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
>> +	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
>> +		return true;
> 
> I think you're now calling this for the 64-bit CP15 access path, which I
> don't think is correct. Maybe have separate handlers for 32-bit vs 64-bit
> accesses?

Good point. The saving grace is that there is no 32bit-capable CPU that
requires GICv3 trapping, nor any 64bit cp15 register in the GICv3
architecture apart from the SGI registers, which are always handled at
EL1. So this code is largely academic!

Not providing a handler is the way to go for CP15-64.

Thanks,

         M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers
  2021-09-30 13:35     ` Will Deacon
@ 2021-09-30 16:27       ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-09-30 16:27 UTC (permalink / raw)
  To: Will Deacon
  Cc: Fuad Tabba, kvmarm, james.morse, alexandru.elisei,
	suzuki.poulose, mark.rutland, christoffer.dall, pbonzini,
	drjones, oupton, qperret, kvm, linux-arm-kernel, kernel-team

On Thu, 30 Sep 2021 14:35:57 +0100,
Will Deacon <will@kernel.org> wrote:
> > +static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
> > +	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
> > +		return true;
> 
> I think you're now calling this for the 64-bit CP15 access path, which I
> don't think is correct. Maybe have separate handlers for 32-bit vs 64-bit
> accesses?

FWIW, here's what I'm queuing as a fix.

Thanks,

	M.

diff --git a/arch/arm64/kvm/hyp/include/hyp/switch.h b/arch/arm64/kvm/hyp/include/hyp/switch.h
index 0397606c0951..1e4177322be7 100644
--- a/arch/arm64/kvm/hyp/include/hyp/switch.h
+++ b/arch/arm64/kvm/hyp/include/hyp/switch.h
@@ -356,7 +356,7 @@ static bool kvm_hyp_handle_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
 	return false;
 }
 
-static bool kvm_hyp_handle_cp15(struct kvm_vcpu *vcpu, u64 *exit_code)
+static bool kvm_hyp_handle_cp15_32(struct kvm_vcpu *vcpu, u64 *exit_code)
 {
 	if (static_branch_unlikely(&vgic_v3_cpuif_trap) &&
 	    __vgic_v3_perform_cpuif_access(vcpu) == 1)
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index c52d580708e0..4f3992a1aabd 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -160,8 +160,7 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
 
 static const exit_handler_fn hyp_exit_handlers[] = {
 	[0 ... ESR_ELx_EC_MAX]		= NULL,
-	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
-	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15_32,
 	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
 	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
 	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,
diff --git a/arch/arm64/kvm/hyp/vhe/switch.c b/arch/arm64/kvm/hyp/vhe/switch.c
index 0e0d342358f7..9aedc8afc8b9 100644
--- a/arch/arm64/kvm/hyp/vhe/switch.c
+++ b/arch/arm64/kvm/hyp/vhe/switch.c
@@ -98,8 +98,7 @@ void deactivate_traps_vhe_put(struct kvm_vcpu *vcpu)
 
 static const exit_handler_fn hyp_exit_handlers[] = {
 	[0 ... ESR_ELx_EC_MAX]		= NULL,
-	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
-	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,
+	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15_32,
 	[ESR_ELx_EC_SYS64]		= kvm_hyp_handle_sysreg,
 	[ESR_ELx_EC_SVE]		= kvm_hyp_handle_fpsimd,
 	[ESR_ELx_EC_FP_ASIMD]		= kvm_hyp_handle_fpsimd,

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply related	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 11/12] KVM: arm64: Trap access to pVM restricted features
  2021-09-22 12:47   ` Fuad Tabba
@ 2021-10-04 17:27     ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-10-04 17:27 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Fuad,

On Wed, 22 Sep 2021 13:47:03 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Trap accesses to restricted features for VMs running in protected
> mode.
> 
> Accesses to feature registers are emulated, and only supported
> features are exposed to protected VMs.
> 
> Accesses to restricted registers as well as restricted
> instructions are trapped, and an undefined exception is injected
> into the protected guests, i.e., with EC = 0x0 (unknown reason).
> This EC is the one used, according to the Arm Architecture
> Reference Manual, for unallocated or undefined system registers
> or instructions.
> 
> Only affects the functionality of protected VMs. Otherwise,
> should not affect non-protected VMs when KVM is running in
> protected mode.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/switch.c | 60 ++++++++++++++++++++++++++++++++
>  1 file changed, 60 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 49080c607838..2bf5952f651b 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -20,6 +20,7 @@
>  #include <asm/kprobes.h>
>  #include <asm/kvm_asm.h>
>  #include <asm/kvm_emulate.h>
> +#include <asm/kvm_fixed_config.h>
>  #include <asm/kvm_hyp.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/fpsimd.h>
> @@ -28,6 +29,7 @@
>  #include <asm/thread_info.h>
>  
>  #include <nvhe/mem_protect.h>
> +#include <nvhe/sys_regs.h>
>  
>  /* Non-VHE specific context */
>  DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
> @@ -158,6 +160,49 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
>  		write_sysreg(pmu->events_host, pmcntenset_el0);
>  }
>  
> +/**
> + * Handler for protected VM restricted exceptions.
> + *
> + * Inject an undefined exception into the guest and return true to indicate that
> + * the hypervisor has handled the exit, and control should go back to the guest.
> + */
> +static bool kvm_handle_pvm_restricted(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	__inject_undef64(vcpu);
> +	return true;
> +}
> +
> +/**
> + * Handler for protected VM MSR, MRS or System instruction execution in AArch64.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't.
> + */
> +static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	if (kvm_handle_pvm_sysreg(vcpu, exit_code))
> +		return true;
> +	else
> +		return kvm_hyp_handle_sysreg(vcpu, exit_code);

nit: drop the else.

I wonder though: what if there is an overlap between the pVM
handling and the normal KVM stuff? Are we guaranteed that there is
none?

For example, ESR_ELx_EC_SYS64 is used when working around some bugs
(see the TX2 TVM handling). What happens if you return early and don't
let it happen? This has a huge potential for some bad breakage.

> +}
> +
> +/**
> + * Handler for protected floating-point and Advanced SIMD accesses.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't.
> + */
> +static bool kvm_handle_pvm_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	/* Linux guests assume support for floating-point and Advanced SIMD. */
> +	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
> +				PVM_ID_AA64PFR0_ALLOW));
> +	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
> +				PVM_ID_AA64PFR0_ALLOW));
> +
> +	return kvm_hyp_handle_fpsimd(vcpu, exit_code);
> +}
> +
>  static const exit_handler_fn hyp_exit_handlers[] = {
>  	[0 ... ESR_ELx_EC_MAX]		= NULL,
>  	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
> @@ -170,8 +215,23 @@ static const exit_handler_fn hyp_exit_handlers[] = {
>  	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
>  };
>  
> +static const exit_handler_fn pvm_exit_handlers[] = {
> +	[0 ... ESR_ELx_EC_MAX]		= NULL,
> +	[ESR_ELx_EC_CP15_32]		= kvm_hyp_handle_cp15,
> +	[ESR_ELx_EC_CP15_64]		= kvm_hyp_handle_cp15,

Heads up, this one was bogus, and I removed it in my patches[1].

But it really begs the question: given that you really don't want to
handle any AArch32 for protected VMs, why handle anything at all in the
first place? You really should let the exit happen and let the outer
run loop deal with it.
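
Concretely, that would mean a protected-VM table along these lines. This
is only a sketch based on the handlers quoted in this patch, with the
CP15 entries simply dropped so that AArch32 traps are not fixed up at
EL2 at all:

static const exit_handler_fn pvm_exit_handlers[] = {
	[0 ... ESR_ELx_EC_MAX]		= NULL,
	/* No CP15 entries: AArch32 exits go back out to the host run loop. */
	[ESR_ELx_EC_SYS64]		= kvm_handle_pvm_sys64,
	[ESR_ELx_EC_SVE]		= kvm_handle_pvm_restricted,
	[ESR_ELx_EC_FP_ASIMD]		= kvm_handle_pvm_fpsimd,
	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
};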

> +	[ESR_ELx_EC_SYS64]		= kvm_handle_pvm_sys64,
> +	[ESR_ELx_EC_SVE]		= kvm_handle_pvm_restricted,
> +	[ESR_ELx_EC_FP_ASIMD]		= kvm_handle_pvm_fpsimd,
> +	[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
> +	[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
> +	[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
> +};
> +
>  static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
>  {
> +	if (unlikely(kvm_vm_is_protected(kvm)))
> +		return pvm_exit_handlers;
> +
>  	return hyp_exit_handlers;
>  }
>  
> -- 
> 2.33.0.464.g1972c5931b-goog
> 
> 

Thanks,

	M.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm64/early-ec-handlers&id=f84ff369795ed47f2cd5e556170166ee8b3a988f

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 11/12] KVM: arm64: Trap access to pVM restricted features
  2021-10-04 17:27     ` Marc Zyngier
@ 2021-10-05  7:20       ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05  7:20 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Marc,

On Mon, Oct 4, 2021 at 6:27 PM Marc Zyngier <maz@kernel.org> wrote:
>
> Hi Fuad,
>
> On Wed, 22 Sep 2021 13:47:03 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Trap accesses to restricted features for VMs running in protected
> > mode.
> >
> > Accesses to feature registers are emulated, and only supported
> > features are exposed to protected VMs.
> >
> > Accesses to restricted registers as well as restricted
> > instructions are trapped, and an undefined exception is injected
> > into the protected guests, i.e., with EC = 0x0 (unknown reason).
> > This EC is the one used, according to the Arm Architecture
> > Reference Manual, for unallocated or undefined system registers
> > or instructions.
> >
> > Only affects the functionality of protected VMs. Otherwise,
> > should not affect non-protected VMs when KVM is running in
> > protected mode.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/switch.c | 60 ++++++++++++++++++++++++++++++++
> >  1 file changed, 60 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> > index 49080c607838..2bf5952f651b 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> > @@ -20,6 +20,7 @@
> >  #include <asm/kprobes.h>
> >  #include <asm/kvm_asm.h>
> >  #include <asm/kvm_emulate.h>
> > +#include <asm/kvm_fixed_config.h>
> >  #include <asm/kvm_hyp.h>
> >  #include <asm/kvm_mmu.h>
> >  #include <asm/fpsimd.h>
> > @@ -28,6 +29,7 @@
> >  #include <asm/thread_info.h>
> >
> >  #include <nvhe/mem_protect.h>
> > +#include <nvhe/sys_regs.h>
> >
> >  /* Non-VHE specific context */
> >  DEFINE_PER_CPU(struct kvm_host_data, kvm_host_data);
> > @@ -158,6 +160,49 @@ static void __pmu_switch_to_host(struct kvm_cpu_context *host_ctxt)
> >               write_sysreg(pmu->events_host, pmcntenset_el0);
> >  }
> >
> > +/**
> > + * Handler for protected VM restricted exceptions.
> > + *
> > + * Inject an undefined exception into the guest and return true to indicate that
> > + * the hypervisor has handled the exit, and control should go back to the guest.
> > + */
> > +static bool kvm_handle_pvm_restricted(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return true;
> > +}
> > +
> > +/**
> > + * Handler for protected VM MSR, MRS or System instruction execution in AArch64.
> > + *
> > + * Returns true if the hypervisor has handled the exit, and control should go
> > + * back to the guest, or false if it hasn't.
> > + */
> > +static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     if (kvm_handle_pvm_sysreg(vcpu, exit_code))
> > +             return true;
> > +     else
> > +             return kvm_hyp_handle_sysreg(vcpu, exit_code);
>
> nit: drop the else.

Will do.

> I wonder though: what if there is an overlap between the pVM
> handling and the normal KVM stuff? Are we guaranteed that there is
> none?
>
> For example, ESR_ELx_EC_SYS64 is used when working around some bugs
> (see the TX2 TVM handling). What happens if you return early and don't
> let it happen? This has a huge potential for some bad breakage.

This is a tough one. Especially because this is dealing with bug
workarounds, there is really no guarantee. I think that for the TVM
handling there is no overlap, and the TVM handling code in
kvm_hyp_handle_sysreg() will still be invoked. However, new workarounds
could always be added, and if that happens, we need to make sure that
they're present on both paths. One solution is to make sure that such
code is in a common function called by both paths. I'm not sure how we
could enforce that other than by documenting it.
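
For illustration only, here is a rough sketch of that "common function"
idea; the helper name handle_sysreg_workarounds() is made up and not an
existing KVM function, so treat this as just one possible shape:

	/*
	 * Hypothetical shared helper for errata/workaround handling
	 * (e.g. the TX2 TVM trapping), called on both exit paths.
	 */
	static bool handle_sysreg_workarounds(struct kvm_vcpu *vcpu, u64 *exit_code)
	{
		/* Workaround handling would live here. */
		return false;
	}

	static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
	{
		if (handle_sysreg_workarounds(vcpu, exit_code))
			return true;

		if (kvm_handle_pvm_sysreg(vcpu, exit_code))
			return true;

		return kvm_hyp_handle_sysreg(vcpu, exit_code);
	}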

What do you think?

> > +}
> > +
> > +/**
> > + * Handler for protected floating-point and Advanced SIMD accesses.
> > + *
> > + * Returns true if the hypervisor has handled the exit, and control should go
> > + * back to the guest, or false if it hasn't.
> > + */
> > +static bool kvm_handle_pvm_fpsimd(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     /* Linux guests assume support for floating-point and Advanced SIMD. */
> > +     BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
> > +                             PVM_ID_AA64PFR0_ALLOW));
> > +     BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
> > +                             PVM_ID_AA64PFR0_ALLOW));
> > +
> > +     return kvm_hyp_handle_fpsimd(vcpu, exit_code);
> > +}
> > +
> >  static const exit_handler_fn hyp_exit_handlers[] = {
> >       [0 ... ESR_ELx_EC_MAX]          = NULL,
> >       [ESR_ELx_EC_CP15_32]            = kvm_hyp_handle_cp15,
> > @@ -170,8 +215,23 @@ static const exit_handler_fn hyp_exit_handlers[] = {
> >       [ESR_ELx_EC_PAC]                = kvm_hyp_handle_ptrauth,
> >  };
> >
> > +static const exit_handler_fn pvm_exit_handlers[] = {
> > +     [0 ... ESR_ELx_EC_MAX]          = NULL,
> > +     [ESR_ELx_EC_CP15_32]            = kvm_hyp_handle_cp15,
> > +     [ESR_ELx_EC_CP15_64]            = kvm_hyp_handle_cp15,
>
> Heads up, this one was bogus, and I removed it in my patches[1].
>
> But it really begs the question: given that you really don't want to
> handle any AArch32 for protected VMs, why handle anything at all in the
> first place? You really should let the exit happen and let the outer
> run loop deal with it.

Good point. Will fix this.
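
A sketch of what the trimmed table might look like (assuming the AArch32
exits are simply left for the outer run loop to deal with):

	static const exit_handler_fn pvm_exit_handlers[] = {
		[0 ... ESR_ELx_EC_MAX]		= NULL,
		[ESR_ELx_EC_SYS64]		= kvm_handle_pvm_sys64,
		[ESR_ELx_EC_SVE]		= kvm_handle_pvm_restricted,
		[ESR_ELx_EC_FP_ASIMD]		= kvm_handle_pvm_fpsimd,
		[ESR_ELx_EC_IABT_LOW]		= kvm_hyp_handle_iabt_low,
		[ESR_ELx_EC_DABT_LOW]		= kvm_hyp_handle_dabt_low,
		[ESR_ELx_EC_PAC]		= kvm_hyp_handle_ptrauth,
	};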

Cheers,
/fuad

> > +     [ESR_ELx_EC_SYS64]              = kvm_handle_pvm_sys64,
> > +     [ESR_ELx_EC_SVE]                = kvm_handle_pvm_restricted,
> > +     [ESR_ELx_EC_FP_ASIMD]           = kvm_handle_pvm_fpsimd,
> > +     [ESR_ELx_EC_IABT_LOW]           = kvm_hyp_handle_iabt_low,
> > +     [ESR_ELx_EC_DABT_LOW]           = kvm_hyp_handle_dabt_low,
> > +     [ESR_ELx_EC_PAC]                = kvm_hyp_handle_ptrauth,
> > +};
> > +
> >  static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
> >  {
> > +     if (unlikely(kvm_vm_is_protected(kvm)))
> > +             return pvm_exit_handlers;
> > +
> >       return hyp_exit_handlers;
> >  }
> >
> > --
> > 2.33.0.464.g1972c5931b-goog
> >
> >
>
> Thanks,
>
>         M.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms.git/commit/?h=kvm-arm64/early-ec-handlers&id=f84ff369795ed47f2cd5e556170166ee8b3a988f
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 12/12] KVM: arm64: Handle protected guests at 32 bits
  2021-09-22 12:47   ` Fuad Tabba
@ 2021-10-05  8:48     ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-10-05  8:48 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, 22 Sep 2021 13:47:04 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Protected KVM does not support protected AArch32 guests. However,
> it is possible for the guest to force itself to run in AArch32,
> potentially causing problems. Add an extra check so that if the hypervisor
> catches the guest doing that, it can prevent the guest from
> running again by resetting vcpu->arch.target and returning
> ARM_EXCEPTION_IL.
> 
> If this were to happen, the VMM can try to fix it by re-
> initializing the vcpu with KVM_ARM_VCPU_INIT; however, this is
> likely not possible for protected VMs.
> 
> Adapted from commit 22f553842b14 ("KVM: arm64: Handle Asymmetric
> AArch32 systems")
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/kvm/hyp/nvhe/switch.c | 40 ++++++++++++++++++++++++++++++++
>  1 file changed, 40 insertions(+)
> 
> diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> index 2bf5952f651b..d66226e49013 100644
> --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> @@ -235,6 +235,43 @@ static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
>  	return hyp_exit_handlers;
>  }
>  
> +/*
> + * Some guests (e.g., protected VMs) might not be allowed to run in AArch32.
> + * The ARMv8 architecture does not give the hypervisor a mechanism to prevent a
> + * guest from dropping to AArch32 EL0 if implemented by the CPU. If the
> + * hypervisor spots a guest in such a state, ensure it is handled, and don't
> + * trust the host to spot or fix it.  The check below is based on the one in
> + * kvm_arch_vcpu_ioctl_run().
> + *
> + * Returns false if the guest ran in AArch32 when it shouldn't have, and
> + * thus should exit to the host, or true if the guest run loop can continue.
> + */
> +static bool handle_aarch32_guest(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	struct kvm *kvm = (struct kvm *) kern_hyp_va(vcpu->kvm);

There is no need for an extra cast. kern_hyp_va() already provides a
cast to the type of the parameter.
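
i.e., a minimal sketch of the call site without the cast (assuming
kern_hyp_va() keeps its current type-preserving behaviour):

	struct kvm *kvm = kern_hyp_va(vcpu->kvm);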

> +	bool is_aarch32_allowed =
> +		FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0),
> +			  get_pvm_id_aa64pfr0(vcpu)) >=
> +				ID_AA64PFR0_ELx_32BIT_64BIT;
> +
> +
> +	if (kvm_vm_is_protected(kvm) &&
> +	    vcpu_mode_is_32bit(vcpu) &&
> +	    !is_aarch32_allowed) {

Do we really need to go through this is_aarch32_allowed check?
Protected VMs don't have AArch32, and we don't have the infrastructure
to handle 32bit at all. For non-protected VMs, we already have what we
need at EL1. So the extra check only adds complexity.
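
A minimal sketch of the simplified check (assuming the rule really is
"a protected VM must never be caught running AArch32", with no feature
register consultation at all):

	static bool handle_aarch32_guest(struct kvm_vcpu *vcpu, u64 *exit_code)
	{
		if (unlikely(kvm_vm_is_protected(kern_hyp_va(vcpu->kvm)) &&
			     vcpu_mode_is_32bit(vcpu))) {
			/* The vcpu is no longer fit for purpose. */
			vcpu->arch.target = -1;
			*exit_code = ARM_EXCEPTION_IL;
			return false;
		}

		return true;
	}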

> +		/*
> +		 * As we have caught the guest red-handed, decide that it isn't
> +		 * fit for purpose anymore by making the vcpu invalid. The VMM
> +		 * can try and fix it by re-initializing the vcpu with
> +		 * KVM_ARM_VCPU_INIT, however, this is likely not possible for
> +		 * protected VMs.
> +		 */
> +		vcpu->arch.target = -1;
> +		*exit_code = ARM_EXCEPTION_IL;
> +		return false;
> +	}
> +
> +	return true;
> +}
> +
>  /* Switch to the guest for legacy non-VHE systems */
>  int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>  {
> @@ -297,6 +334,9 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
>  		/* Jump in the fire! */
>  		exit_code = __guest_enter(vcpu);
>  
> +		if (unlikely(!handle_aarch32_guest(vcpu, &exit_code)))
> +			break;
> +
>  		/* And we're baaack! */
>  	} while (fixup_guest_exit(vcpu, &exit_code));
>  

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
  2021-09-22 12:47   ` Fuad Tabba
@ 2021-10-05  8:52     ` Andrew Jones
  -1 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-05  8:52 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, oupton, qperret, kvm,
	linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:47:00PM +0100, Fuad Tabba wrote:
> Add system register handlers for protected VMs. These cover Sys64
> registers (including feature id registers), and debug.
> 
> No functional change intended as these are not hooked in yet to
> the guest exit handlers introduced earlier. So when trapping is
> triggered, the exit handlers let the host handle it, as before.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
>  arch/arm64/include/asm/kvm_hyp.h           |   5 +
>  arch/arm64/kvm/arm.c                       |   5 +
>  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
>  6 files changed, 726 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> 
> diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> new file mode 100644
> index 000000000000..0ed06923f7e9
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> +#define __ARM64_KVM_FIXED_CONFIG_H__
> +
> +#include <asm/sysreg.h>
> +
> +/*
> + * This file contains definitions for features to be allowed or restricted for
> + * guest virtual machines, depending on the mode KVM is running in and on the
> + * type of guest that is running.
> + *
> + * The ALLOW masks represent a bitmask of feature fields that are allowed
> + * without any restrictions as long as they are supported by the system.
> + *
> + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> + * features that are restricted to support at most the specified feature.
> + *
> + * If a feature field is not present in either, then it is not supported.
> + *
> + * The approach taken for protected VMs is to allow features that are:
> + * - Needed by common Linux distributions (e.g., floating point)
> + * - Trivial to support, e.g., supporting the feature does not introduce or
> + * require tracking of additional state in KVM
> + * - Cannot be trapped or prevent the guest from using anyway
> + */
> +
> +/*
> + * Allow for protected VMs:
> + * - Floating-point and Advanced SIMD
> + * - Data Independent Timing
> + */
> +#define PVM_ID_AA64PFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - AArch64 guests only (no support for AArch32 guests):
> + *	AArch32 adds complexity in trap handling, emulation, condition codes,
> + *	etc...
> + * - RAS (v1)
> + *	Supported by KVM
> + */
> +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Branch Target Identification
> + * - Speculative Store Bypassing
> + */
> +#define PVM_ID_AA64PFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Mixed-endian
> + * - Distinction between Secure and Non-secure Memory
> + * - Mixed-endian at EL0 only
> + * - Non-context synchronizing exception entry and exit
> + */
> +#define PVM_ID_AA64MMFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - 40-bit IPA
> + * - 16-bit ASID
> + */
> +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Hardware translation table updates to Access flag and Dirty state
> + * - Number of VMID bits from CPU
> + * - Hierarchical Permission Disables
> + * - Privileged Access Never
> + * - SError interrupt exceptions from speculative reads
> + * - Enhanced Translation Synchronization
> + */
> +#define PVM_ID_AA64MMFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Common not Private translations
> + * - User Access Override
> + * - IESB bit in the SCTLR_ELx registers
> + * - Unaligned single-copy atomicity and atomic functions
> + * - ESR_ELx.EC value on an exception by read access to feature ID space
> + * - TTL field in address operations.
> + * - Break-before-make sequences when changing translation block size
> + * - E0PDx mechanism
> + */
> +#define PVM_ID_AA64MMFR2_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> +	)
> +
> +/*
> + * No support for Scalable Vectors for protected VMs:
> + *	Requires additional support from KVM, e.g., context-switching and
> + *	trapping at EL2
> + */
> +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> +
> +/*
> + * No support for debug, including breakpoints, and watchpoints for protected
> + * VMs:
> + *	The Arm architecture mandates support for at least the Armv8 debug
> + *	architecture, which would include at least 2 hardware breakpoints and
> + *	watchpoints. Providing that support to protected guests adds
> + *	considerable state and complexity. Therefore, the reserved value of 0 is
> + *	used for debug-related fields.
> + */
> +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> +
> +/*
> + * No support for implementation defined features.
> + */
> +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> +
> +/*
> + * No restrictions on instructions implemented in AArch64.
> + */
> +#define PVM_ID_AA64ISAR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> +	)
> +
> +#define PVM_ID_AA64ISAR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> +	)
> +
> +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 657d0c94cf82..5afd14ab15b9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
>  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
>  #endif
>  
> +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
>  
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..6aa7b0c5bf21 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
>  	void *addr = phys_to_virt(hyp_mem_base);
>  	int ret;
>  
> +	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> +	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> +	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
>  
>  	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
>  	if (ret)
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> new file mode 100644
> index 000000000000..0865163d363c
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> +
> +#include <asm/kvm_host.h>
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> +
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> +void __inject_undef64(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 8d741f71377f..0bbe37a18d5d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> new file mode 100644
> index 000000000000..ef8456c54b18
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include <hyp/adjust_pc.h>
> +
> +#include "../../sys_regs.h"
> +
> +/*
> + * Copies of the host's CPU features registers holding sanitized values at hyp.
> + */
> +u64 id_aa64pfr0_el1_sys_val;
> +u64 id_aa64pfr1_el1_sys_val;
> +u64 id_aa64isar0_el1_sys_val;
> +u64 id_aa64isar1_el1_sys_val;
> +u64 id_aa64mmfr2_el1_sys_val;
> +
> +static inline void inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> +
> +	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> +			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> +			     KVM_ARM64_PENDING_EXCEPTION);
> +
> +	__kvm_adjust_pc(vcpu);
> +
> +	write_sysreg_el1(esr, SYS_ESR);
> +	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> +}
> +
> +/*
> + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> + * sysregs are live.
> + */
> +void __inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +
> +	inject_undef64(vcpu);
> +
> +	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> +	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +}
> +
> +/*
> + * Accessor for undefined accesses.
> + */
> +static bool undef_access(struct kvm_vcpu *vcpu,
> +			 struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	__inject_undef64(vcpu);
> +	return false;
> +}
> +
> +/*
> + * Returns the restricted features values of the feature register based on the
> + * limitations in restrict_fields.
> + * A feature id field value of 0b0000 does not impose any restrictions.
> + * Note: Use only for unsigned feature field values.
> + */
> +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> +					    u64 restrict_fields)
> +{
> +	u64 value = 0UL;
> +	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> +
> +	/*
> +	 * According to the Arm Architecture Reference Manual, feature fields
> +	 * use increasing values to indicate increases in functionality.
> +	 * Iterate over the restricted feature fields and calculate the minimum
> +	 * unsigned value between the one supported by the system, and what the
> +	 * value is being restricted to.
> +	 */
> +	while (sys_reg_val && restrict_fields) {
> +		value |= min(sys_reg_val & mask, restrict_fields & mask);
> +		sys_reg_val &= ~mask;
> +		restrict_fields &= ~mask;
> +		mask <<= ARM64_FEATURE_FIELD_BITS;
> +	}
> +
> +	return value;
> +}
> +
> +/*
> + * Functions that return the value of feature id registers for protected VMs
> + * based on allowed features, system features, and KVM support.
> + */
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 set_mask = 0;
> +	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> +
> +	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> +		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> +
> +	/* Spectre and Meltdown mitigation in KVM */
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> +			       (u64)kvm->arch.pfr0_csv2);
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> +			       (u64)kvm->arch.pfr0_csv3);
> +
> +	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> +
> +	if (!kvm_has_mte(kvm))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> +
> +	return id_aa64pfr1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for Scalable Vectors, therefore, hyp has no sanitized
> +	 * copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, including breakpoints, and watchpoints,
> +	 * therefore, pKVM has no sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, therefore, hyp has no sanitized copy of the
> +	 * feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> +	return 0;
> +}

Reading the same function five times makes me wonder if a generator macro
wouldn't be better for these.
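
Something along these lines, perhaps (the macro name is made up, so this
is purely a sketch of the idea):

	/* Generate an accessor that exposes no features for this register. */
	#define PVM_ID_REG_RAZ(name, allow)				\
	u64 get_pvm_id_##name(const struct kvm_vcpu *vcpu)		\
	{								\
		BUILD_BUG_ON((allow) != 0ULL);				\
		return 0;						\
	}

	PVM_ID_REG_RAZ(aa64zfr0, PVM_ID_AA64ZFR0_ALLOW)
	PVM_ID_REG_RAZ(aa64dfr0, PVM_ID_AA64DFR0_ALLOW)
	PVM_ID_REG_RAZ(aa64dfr1, PVM_ID_AA64DFR1_ALLOW)
	PVM_ID_REG_RAZ(aa64afr0, PVM_ID_AA64AFR0_ALLOW)
	PVM_ID_REG_RAZ(aa64afr1, PVM_ID_AA64AFR1_ALLOW)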

> +
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> +{
> +	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> +
> +	if (!vcpu_has_ptrauth(vcpu))
> +		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> +
> +	return id_aa64isar1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> +{
> +	u64 set_mask;
> +
> +	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> +		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> +
> +	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> +}
> +
> +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r)
> +{
> +	u32 id = reg_to_encoding(r);
> +
> +	switch (id) {
> +	case SYS_ID_AA64PFR0_EL1:
> +		return get_pvm_id_aa64pfr0(vcpu);
> +	case SYS_ID_AA64PFR1_EL1:
> +		return get_pvm_id_aa64pfr1(vcpu);
> +	case SYS_ID_AA64ZFR0_EL1:
> +		return get_pvm_id_aa64zfr0(vcpu);
> +	case SYS_ID_AA64DFR0_EL1:
> +		return get_pvm_id_aa64dfr0(vcpu);
> +	case SYS_ID_AA64DFR1_EL1:
> +		return get_pvm_id_aa64dfr1(vcpu);
> +	case SYS_ID_AA64AFR0_EL1:
> +		return get_pvm_id_aa64afr0(vcpu);
> +	case SYS_ID_AA64AFR1_EL1:
> +		return get_pvm_id_aa64afr1(vcpu);
> +	case SYS_ID_AA64ISAR0_EL1:
> +		return get_pvm_id_aa64isar0(vcpu);
> +	case SYS_ID_AA64ISAR1_EL1:
> +		return get_pvm_id_aa64isar1(vcpu);
> +	case SYS_ID_AA64MMFR0_EL1:
> +		return get_pvm_id_aa64mmfr0(vcpu);
> +	case SYS_ID_AA64MMFR1_EL1:
> +		return get_pvm_id_aa64mmfr1(vcpu);
> +	case SYS_ID_AA64MMFR2_EL1:
> +		return get_pvm_id_aa64mmfr2(vcpu);
> +	default:
> +		/*
> +		 * Should never happen because all cases are covered in
> +		 * pvm_sys_reg_descs[] below.

I'd drop the word 'below'. It's not especially helpful, and since code gets
moved around it can go out of date.

> +		 */
> +		WARN_ON(1);

The above cases could also be generated by a macro. And I wonder if we can
come up with something, using macros and build-time assertions, that makes
sure these separate lists stay consistent, in order to better avoid these
"should never happen" situations.

> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Accessor for AArch32 feature id registers.
> + *
> + * The value of these registers is "unknown" according to the spec if AArch32
> + * isn't supported.
> + */
> +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	/*
> +	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> +	 * of AArch32 feature id registers.
> +	 */
> +	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> +		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> +
> +	/* Use 0 for architecturally "unknown" values. */
> +	p->regval = 0;
> +	return true;
> +}
> +
> +/*
> + * Accessor for AArch64 feature id registers.
> + *
> + * If access is allowed, set the regval to the protected VM's view of the
> + * register and return true.
> + * Otherwise, inject an undefined exception and return false.
> + */
> +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	p->regval = read_id_reg(vcpu, r);
> +	return true;
> +}
> +
> +/* Mark the specified system register as an AArch32 feature id register. */
> +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> +
> +/* Mark the specified system register as an AArch64 feature id register. */
> +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> +
> +/* Mark the specified system register as not being handled in hyp. */
> +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> +
> +/*
> + * Architected system registers.
> + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> + *
> + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> + * it will lead to injecting an exception into the guest.
> + */
> +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> +	/* Cache maintenance by set/way operations are restricted. */
> +
> +	/* Debug and Trace Registers are restricted. */
> +
> +	/* AArch64 mappings of the AArch32 ID registers */
> +	/* CRm=1 */
> +	AARCH32(SYS_ID_PFR0_EL1),
> +	AARCH32(SYS_ID_PFR1_EL1),
> +	AARCH32(SYS_ID_DFR0_EL1),
> +	AARCH32(SYS_ID_AFR0_EL1),
> +	AARCH32(SYS_ID_MMFR0_EL1),
> +	AARCH32(SYS_ID_MMFR1_EL1),
> +	AARCH32(SYS_ID_MMFR2_EL1),
> +	AARCH32(SYS_ID_MMFR3_EL1),
> +
> +	/* CRm=2 */
> +	AARCH32(SYS_ID_ISAR0_EL1),
> +	AARCH32(SYS_ID_ISAR1_EL1),
> +	AARCH32(SYS_ID_ISAR2_EL1),
> +	AARCH32(SYS_ID_ISAR3_EL1),
> +	AARCH32(SYS_ID_ISAR4_EL1),
> +	AARCH32(SYS_ID_ISAR5_EL1),
> +	AARCH32(SYS_ID_MMFR4_EL1),
> +	AARCH32(SYS_ID_ISAR6_EL1),
> +
> +	/* CRm=3 */
> +	AARCH32(SYS_MVFR0_EL1),
> +	AARCH32(SYS_MVFR1_EL1),
> +	AARCH32(SYS_MVFR2_EL1),
> +	AARCH32(SYS_ID_PFR2_EL1),
> +	AARCH32(SYS_ID_DFR1_EL1),
> +	AARCH32(SYS_ID_MMFR5_EL1),
> +
> +	/* AArch64 ID registers */
> +	/* CRm=4 */
> +	AARCH64(SYS_ID_AA64PFR0_EL1),
> +	AARCH64(SYS_ID_AA64PFR1_EL1),
> +	AARCH64(SYS_ID_AA64ZFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR1_EL1),
> +	AARCH64(SYS_ID_AA64AFR0_EL1),
> +	AARCH64(SYS_ID_AA64AFR1_EL1),
> +	AARCH64(SYS_ID_AA64ISAR0_EL1),
> +	AARCH64(SYS_ID_AA64ISAR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR0_EL1),
> +	AARCH64(SYS_ID_AA64MMFR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR2_EL1),
> +
> +	HOST_HANDLED(SYS_SCTLR_EL1),
> +	HOST_HANDLED(SYS_ACTLR_EL1),
> +	HOST_HANDLED(SYS_CPACR_EL1),
> +
> +	HOST_HANDLED(SYS_RGSR_EL1),
> +	HOST_HANDLED(SYS_GCR_EL1),
> +
> +	/* Scalable Vector Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TTBR0_EL1),
> +	HOST_HANDLED(SYS_TTBR1_EL1),
> +	HOST_HANDLED(SYS_TCR_EL1),
> +
> +	HOST_HANDLED(SYS_APIAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APIBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APGAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APGAKEYHI_EL1),
> +
> +	HOST_HANDLED(SYS_AFSR0_EL1),
> +	HOST_HANDLED(SYS_AFSR1_EL1),
> +	HOST_HANDLED(SYS_ESR_EL1),
> +
> +	HOST_HANDLED(SYS_ERRIDR_EL1),
> +	HOST_HANDLED(SYS_ERRSELR_EL1),
> +	HOST_HANDLED(SYS_ERXFR_EL1),
> +	HOST_HANDLED(SYS_ERXCTLR_EL1),
> +	HOST_HANDLED(SYS_ERXSTATUS_EL1),
> +	HOST_HANDLED(SYS_ERXADDR_EL1),
> +	HOST_HANDLED(SYS_ERXMISC0_EL1),
> +	HOST_HANDLED(SYS_ERXMISC1_EL1),
> +
> +	HOST_HANDLED(SYS_TFSR_EL1),
> +	HOST_HANDLED(SYS_TFSRE0_EL1),
> +
> +	HOST_HANDLED(SYS_FAR_EL1),
> +	HOST_HANDLED(SYS_PAR_EL1),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_MAIR_EL1),
> +	HOST_HANDLED(SYS_AMAIR_EL1),
> +
> +	/* Limited Ordering Regions Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_VBAR_EL1),
> +	HOST_HANDLED(SYS_DISR_EL1),
> +
> +	/* GIC CPU Interface registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> +	HOST_HANDLED(SYS_TPIDR_EL1),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL1),
> +
> +	HOST_HANDLED(SYS_CNTKCTL_EL1),
> +
> +	HOST_HANDLED(SYS_CCSIDR_EL1),
> +	HOST_HANDLED(SYS_CLIDR_EL1),
> +	HOST_HANDLED(SYS_CSSELR_EL1),
> +	HOST_HANDLED(SYS_CTR_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TPIDR_EL0),
> +	HOST_HANDLED(SYS_TPIDRRO_EL0),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL0),
> +
> +	/* Activity Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CTL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_DACR32_EL2),
> +	HOST_HANDLED(SYS_IFSR32_EL2),
> +	HOST_HANDLED(SYS_FPEXC32_EL2),
> +};
> +
> +/*
> + * Handler for protected VM MSR, MRS or System instruction execution.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't, to be handled by the host.
> + */
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	unsigned long esr = kvm_vcpu_get_esr(vcpu);
> +	int Rt = kvm_vcpu_sys_get_rt(vcpu);
> +
> +	params = esr_sys64_to_params(esr);
> +	params.regval = vcpu_get_reg(vcpu, Rt);
> +
> +	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> +
> +	/* Undefined access (RESTRICTED). */
> +	if (r == NULL) {
> +		__inject_undef64(vcpu);
> +		return true;
> +	}
> +
> +	/* Handled by the host (HOST_HANDLED) */
> +	if (r->access == NULL)
> +		return false;
> +
> +	/* Handled by hyp: skip instruction if instructed to do so. */
> +	if (r->access(vcpu, &params, r))
> +		__kvm_skip_instr(vcpu);
> +
> +	if (!params.is_write)
> +		vcpu_set_reg(vcpu, Rt, params.regval);
> +
> +	return true;
> +}
> -- 
> 2.33.0.464.g1972c5931b-goog
>

Other than the nits and suggestion to try and build in some register list
consistency checks, this looks good to me. I don't know what pKVM
should / should not expose, but I like the approach this takes, so,
FWIW,

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05  8:52     ` Andrew Jones
  0 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-05  8:52 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kernel-team, kvm, maz, pbonzini, will, kvmarm, linux-arm-kernel

On Wed, Sep 22, 2021 at 01:47:00PM +0100, Fuad Tabba wrote:
> Add system register handlers for protected VMs. These cover Sys64
> registers (including feature id registers), and debug.
> 
> No functional change intended as these are not hooked in yet to
> the guest exit handlers introduced earlier. So when trapping is
> triggered, the exit handlers let the host handle it, as before.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
>  arch/arm64/include/asm/kvm_hyp.h           |   5 +
>  arch/arm64/kvm/arm.c                       |   5 +
>  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
>  6 files changed, 726 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> 
> diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> new file mode 100644
> index 000000000000..0ed06923f7e9
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> +#define __ARM64_KVM_FIXED_CONFIG_H__
> +
> +#include <asm/sysreg.h>
> +
> +/*
> + * This file contains definitions for features to be allowed or restricted for
> + * guest virtual machines, depending on the mode KVM is running in and on the
> + * type of guest that is running.
> + *
> + * The ALLOW masks represent a bitmask of feature fields that are allowed
> + * without any restrictions as long as they are supported by the system.
> + *
> + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> + * features that are restricted to support at most the specified feature.
> + *
> + * If a feature field is not present in either, then it is not supported.
> + *
> + * The approach taken for protected VMs is to allow features that are:
> + * - Needed by common Linux distributions (e.g., floating point)
> + * - Trivial to support, e.g., supporting the feature does not introduce or
> + * require tracking of additional state in KVM
> + * - Cannot be trapped or prevent the guest from using anyway
> + */
> +
> +/*
> + * Allow for protected VMs:
> + * - Floating-point and Advanced SIMD
> + * - Data Independent Timing
> + */
> +#define PVM_ID_AA64PFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - AArch64 guests only (no support for AArch32 guests):
> + *	AArch32 adds complexity in trap handling, emulation, condition codes,
> + *	etc...
> + * - RAS (v1)
> + *	Supported by KVM
> + */
> +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Branch Target Identification
> + * - Speculative Store Bypassing
> + */
> +#define PVM_ID_AA64PFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Mixed-endian
> + * - Distinction between Secure and Non-secure Memory
> + * - Mixed-endian at EL0 only
> + * - Non-context synchronizing exception entry and exit
> + */
> +#define PVM_ID_AA64MMFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - 40-bit IPA
> + * - 16-bit ASID
> + */
> +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Hardware translation table updates to Access flag and Dirty state
> + * - Number of VMID bits from CPU
> + * - Hierarchical Permission Disables
> + * - Privileged Access Never
> + * - SError interrupt exceptions from speculative reads
> + * - Enhanced Translation Synchronization
> + */
> +#define PVM_ID_AA64MMFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Common not Private translations
> + * - User Access Override
> + * - IESB bit in the SCTLR_ELx registers
> + * - Unaligned single-copy atomicity and atomic functions
> + * - ESR_ELx.EC value on an exception by read access to feature ID space
> + * - TTL field in address operations.
> + * - Break-before-make sequences when changing translation block size
> + * - E0PDx mechanism
> + */
> +#define PVM_ID_AA64MMFR2_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> +	)
> +
> +/*
> + * No support for Scalable Vectors for protected VMs:
> + *	Requires additional support from KVM, e.g., context-switching and
> + *	trapping at EL2
> + */
> +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> +
> +/*
> + * No support for debug, including breakpoints, and watchpoints for protected
> + * VMs:
> + *	The Arm architecture mandates support for at least the Armv8 debug
> + *	architecture, which would include at least 2 hardware breakpoints and
> + *	watchpoints. Providing that support to protected guests adds
> + *	considerable state and complexity. Therefore, the reserved value of 0 is
> + *	used for debug-related fields.
> + */
> +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> +
> +/*
> + * No support for implementation defined features.
> + */
> +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> +
> +/*
> + * No restrictions on instructions implemented in AArch64.
> + */
> +#define PVM_ID_AA64ISAR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> +	)
> +
> +#define PVM_ID_AA64ISAR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> +	)
> +
> +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 657d0c94cf82..5afd14ab15b9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
>  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
>  #endif
>  
> +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
>  
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..6aa7b0c5bf21 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
>  	void *addr = phys_to_virt(hyp_mem_base);
>  	int ret;
>  
> +	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> +	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> +	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
>  
>  	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
>  	if (ret)
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> new file mode 100644
> index 000000000000..0865163d363c
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> +
> +#include <asm/kvm_host.h>
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> +
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> +void __inject_undef64(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 8d741f71377f..0bbe37a18d5d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> new file mode 100644
> index 000000000000..ef8456c54b18
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include <hyp/adjust_pc.h>
> +
> +#include "../../sys_regs.h"
> +
> +/*
> + * Copies of the host's CPU features registers holding sanitized values at hyp.
> + */
> +u64 id_aa64pfr0_el1_sys_val;
> +u64 id_aa64pfr1_el1_sys_val;
> +u64 id_aa64isar0_el1_sys_val;
> +u64 id_aa64isar1_el1_sys_val;
> +u64 id_aa64mmfr2_el1_sys_val;
> +
> +static inline void inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> +
> +	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> +			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> +			     KVM_ARM64_PENDING_EXCEPTION);
> +
> +	__kvm_adjust_pc(vcpu);
> +
> +	write_sysreg_el1(esr, SYS_ESR);
> +	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> +}
> +
> +/*
> + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> + * sysregs are live.
> + */
> +void __inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +
> +	inject_undef64(vcpu);
> +
> +	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> +	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +}
> +
> +/*
> + * Accessor for undefined accesses.
> + */
> +static bool undef_access(struct kvm_vcpu *vcpu,
> +			 struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	__inject_undef64(vcpu);
> +	return false;
> +}
> +
> +/*
> + * Returns the restricted features values of the feature register based on the
> + * limitations in restrict_fields.
> + * A feature id field value of 0b0000 does not impose any restrictions.
> + * Note: Use only for unsigned feature field values.
> + */
> +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> +					    u64 restrict_fields)
> +{
> +	u64 value = 0UL;
> +	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> +
> +	/*
> +	 * According to the Arm Architecture Reference Manual, feature fields
> +	 * use increasing values to indicate increases in functionality.
> +	 * Iterate over the restricted feature fields and calculate the minimum
> +	 * unsigned value between the one supported by the system, and what the
> +	 * value is being restricted to.
> +	 */
> +	while (sys_reg_val && restrict_fields) {
> +		value |= min(sys_reg_val & mask, restrict_fields & mask);
> +		sys_reg_val &= ~mask;
> +		restrict_fields &= ~mask;
> +		mask <<= ARM64_FEATURE_FIELD_BITS;
> +	}
> +
> +	return value;
> +}
> +
> +/*
> + * Functions that return the value of feature id registers for protected VMs
> + * based on allowed features, system features, and KVM support.
> + */
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 set_mask = 0;
> +	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> +
> +	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> +		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> +
> +	/* Spectre and Meltdown mitigation in KVM */
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> +			       (u64)kvm->arch.pfr0_csv2);
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> +			       (u64)kvm->arch.pfr0_csv3);
> +
> +	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> +
> +	if (!kvm_has_mte(kvm))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> +
> +	return id_aa64pfr1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for Scalable Vectors, therefore, hyp has no sanitized
> +	 * copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, including breakpoints, and watchpoints,
> +	 * therefore, pKVM has no sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, therefore, hyp has no sanitized copy of the
> +	 * feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> +	return 0;
> +}

Reading the same function five times makes me wonder whether a generator
macro wouldn't be better for these.
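For instance, something along these lines (an untested sketch, and the
macro name below is made up) would collapse the five near-identical
accessors into a single definition:

/*
 * Accessor for a feature ID register that pVMs expose as zero because
 * hyp keeps no sanitized copy of it.
 */
#define DEFINE_PVM_ID_REG_UNSUPPORTED(name, reg)                \
u64 get_pvm_id_##name(const struct kvm_vcpu *vcpu)              \
{                                                               \
        BUILD_BUG_ON(PVM_ID_##reg##_ALLOW != 0ULL);             \
        return 0;                                               \
}

DEFINE_PVM_ID_REG_UNSUPPORTED(aa64zfr0, AA64ZFR0)
DEFINE_PVM_ID_REG_UNSUPPORTED(aa64dfr0, AA64DFR0)
DEFINE_PVM_ID_REG_UNSUPPORTED(aa64dfr1, AA64DFR1)
DEFINE_PVM_ID_REG_UNSUPPORTED(aa64afr0, AA64AFR0)
DEFINE_PVM_ID_REG_UNSUPPORTED(aa64afr1, AA64AFR1)

The per-register comments would be lost, but the BUILD_BUG_ON already
documents the intent well enough.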

> +
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> +{
> +	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> +
> +	if (!vcpu_has_ptrauth(vcpu))
> +		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> +
> +	return id_aa64isar1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> +{
> +	u64 set_mask;
> +
> +	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> +		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> +
> +	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> +}
> +
> +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r)
> +{
> +	u32 id = reg_to_encoding(r);
> +
> +	switch (id) {
> +	case SYS_ID_AA64PFR0_EL1:
> +		return get_pvm_id_aa64pfr0(vcpu);
> +	case SYS_ID_AA64PFR1_EL1:
> +		return get_pvm_id_aa64pfr1(vcpu);
> +	case SYS_ID_AA64ZFR0_EL1:
> +		return get_pvm_id_aa64zfr0(vcpu);
> +	case SYS_ID_AA64DFR0_EL1:
> +		return get_pvm_id_aa64dfr0(vcpu);
> +	case SYS_ID_AA64DFR1_EL1:
> +		return get_pvm_id_aa64dfr1(vcpu);
> +	case SYS_ID_AA64AFR0_EL1:
> +		return get_pvm_id_aa64afr0(vcpu);
> +	case SYS_ID_AA64AFR1_EL1:
> +		return get_pvm_id_aa64afr1(vcpu);
> +	case SYS_ID_AA64ISAR0_EL1:
> +		return get_pvm_id_aa64isar0(vcpu);
> +	case SYS_ID_AA64ISAR1_EL1:
> +		return get_pvm_id_aa64isar1(vcpu);
> +	case SYS_ID_AA64MMFR0_EL1:
> +		return get_pvm_id_aa64mmfr0(vcpu);
> +	case SYS_ID_AA64MMFR1_EL1:
> +		return get_pvm_id_aa64mmfr1(vcpu);
> +	case SYS_ID_AA64MMFR2_EL1:
> +		return get_pvm_id_aa64mmfr2(vcpu);
> +	default:
> +		/*
> +		 * Should never happen because all cases are covered in
> +		 * pvm_sys_reg_descs[] below.

I'd drop the 'below' word. It's not overly helpful, and since code gets
moved, it can go out of date.

> +		 */
> +		WARN_ON(1);

The above cases could also be generated by a macro. I also wonder whether
we can come up with something, using macros and build bugs, that keeps
these separate lists consistent, so we can better avoid these
"should never happen" situations.

> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Accessor for AArch32 feature id registers.
> + *
> + * The value of these registers is "unknown" according to the spec if AArch32
> + * isn't supported.
> + */
> +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	/*
> +	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> +	 * of AArch32 feature id registers.
> +	 */
> +	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> +		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> +
> +	/* Use 0 for architecturally "unknown" values. */
> +	p->regval = 0;
> +	return true;
> +}
> +
> +/*
> + * Accessor for AArch64 feature id registers.
> + *
> + * If access is allowed, set the regval to the protected VM's view of the
> + * register and return true.
> + * Otherwise, inject an undefined exception and return false.
> + */
> +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	p->regval = read_id_reg(vcpu, r);
> +	return true;
> +}
> +
> +/* Mark the specified system register as an AArch32 feature id register. */
> +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> +
> +/* Mark the specified system register as an AArch64 feature id register. */
> +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> +
> +/* Mark the specified system register as not being handled in hyp. */
> +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> +
> +/*
> + * Architected system registers.
> + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> + *
> + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> + * it will lead to injecting an exception into the guest.
> + */
> +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> +	/* Cache maintenance by set/way operations are restricted. */
> +
> +	/* Debug and Trace Registers are restricted. */
> +
> +	/* AArch64 mappings of the AArch32 ID registers */
> +	/* CRm=1 */
> +	AARCH32(SYS_ID_PFR0_EL1),
> +	AARCH32(SYS_ID_PFR1_EL1),
> +	AARCH32(SYS_ID_DFR0_EL1),
> +	AARCH32(SYS_ID_AFR0_EL1),
> +	AARCH32(SYS_ID_MMFR0_EL1),
> +	AARCH32(SYS_ID_MMFR1_EL1),
> +	AARCH32(SYS_ID_MMFR2_EL1),
> +	AARCH32(SYS_ID_MMFR3_EL1),
> +
> +	/* CRm=2 */
> +	AARCH32(SYS_ID_ISAR0_EL1),
> +	AARCH32(SYS_ID_ISAR1_EL1),
> +	AARCH32(SYS_ID_ISAR2_EL1),
> +	AARCH32(SYS_ID_ISAR3_EL1),
> +	AARCH32(SYS_ID_ISAR4_EL1),
> +	AARCH32(SYS_ID_ISAR5_EL1),
> +	AARCH32(SYS_ID_MMFR4_EL1),
> +	AARCH32(SYS_ID_ISAR6_EL1),
> +
> +	/* CRm=3 */
> +	AARCH32(SYS_MVFR0_EL1),
> +	AARCH32(SYS_MVFR1_EL1),
> +	AARCH32(SYS_MVFR2_EL1),
> +	AARCH32(SYS_ID_PFR2_EL1),
> +	AARCH32(SYS_ID_DFR1_EL1),
> +	AARCH32(SYS_ID_MMFR5_EL1),
> +
> +	/* AArch64 ID registers */
> +	/* CRm=4 */
> +	AARCH64(SYS_ID_AA64PFR0_EL1),
> +	AARCH64(SYS_ID_AA64PFR1_EL1),
> +	AARCH64(SYS_ID_AA64ZFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR1_EL1),
> +	AARCH64(SYS_ID_AA64AFR0_EL1),
> +	AARCH64(SYS_ID_AA64AFR1_EL1),
> +	AARCH64(SYS_ID_AA64ISAR0_EL1),
> +	AARCH64(SYS_ID_AA64ISAR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR0_EL1),
> +	AARCH64(SYS_ID_AA64MMFR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR2_EL1),
> +
> +	HOST_HANDLED(SYS_SCTLR_EL1),
> +	HOST_HANDLED(SYS_ACTLR_EL1),
> +	HOST_HANDLED(SYS_CPACR_EL1),
> +
> +	HOST_HANDLED(SYS_RGSR_EL1),
> +	HOST_HANDLED(SYS_GCR_EL1),
> +
> +	/* Scalable Vector Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TTBR0_EL1),
> +	HOST_HANDLED(SYS_TTBR1_EL1),
> +	HOST_HANDLED(SYS_TCR_EL1),
> +
> +	HOST_HANDLED(SYS_APIAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APIBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APGAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APGAKEYHI_EL1),
> +
> +	HOST_HANDLED(SYS_AFSR0_EL1),
> +	HOST_HANDLED(SYS_AFSR1_EL1),
> +	HOST_HANDLED(SYS_ESR_EL1),
> +
> +	HOST_HANDLED(SYS_ERRIDR_EL1),
> +	HOST_HANDLED(SYS_ERRSELR_EL1),
> +	HOST_HANDLED(SYS_ERXFR_EL1),
> +	HOST_HANDLED(SYS_ERXCTLR_EL1),
> +	HOST_HANDLED(SYS_ERXSTATUS_EL1),
> +	HOST_HANDLED(SYS_ERXADDR_EL1),
> +	HOST_HANDLED(SYS_ERXMISC0_EL1),
> +	HOST_HANDLED(SYS_ERXMISC1_EL1),
> +
> +	HOST_HANDLED(SYS_TFSR_EL1),
> +	HOST_HANDLED(SYS_TFSRE0_EL1),
> +
> +	HOST_HANDLED(SYS_FAR_EL1),
> +	HOST_HANDLED(SYS_PAR_EL1),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_MAIR_EL1),
> +	HOST_HANDLED(SYS_AMAIR_EL1),
> +
> +	/* Limited Ordering Regions Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_VBAR_EL1),
> +	HOST_HANDLED(SYS_DISR_EL1),
> +
> +	/* GIC CPU Interface registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> +	HOST_HANDLED(SYS_TPIDR_EL1),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL1),
> +
> +	HOST_HANDLED(SYS_CNTKCTL_EL1),
> +
> +	HOST_HANDLED(SYS_CCSIDR_EL1),
> +	HOST_HANDLED(SYS_CLIDR_EL1),
> +	HOST_HANDLED(SYS_CSSELR_EL1),
> +	HOST_HANDLED(SYS_CTR_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TPIDR_EL0),
> +	HOST_HANDLED(SYS_TPIDRRO_EL0),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL0),
> +
> +	/* Activity Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CTL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_DACR32_EL2),
> +	HOST_HANDLED(SYS_IFSR32_EL2),
> +	HOST_HANDLED(SYS_FPEXC32_EL2),
> +};
> +
> +/*
> + * Handler for protected VM MSR, MRS or System instruction execution.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't, to be handled by the host.
> + */
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	unsigned long esr = kvm_vcpu_get_esr(vcpu);
> +	int Rt = kvm_vcpu_sys_get_rt(vcpu);
> +
> +	params = esr_sys64_to_params(esr);
> +	params.regval = vcpu_get_reg(vcpu, Rt);
> +
> +	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> +
> +	/* Undefined access (RESTRICTED). */
> +	if (r == NULL) {
> +		__inject_undef64(vcpu);
> +		return true;
> +	}
> +
> +	/* Handled by the host (HOST_HANDLED) */
> +	if (r->access == NULL)
> +		return false;
> +
> +	/* Handled by hyp: skip instruction if instructed to do so. */
> +	if (r->access(vcpu, &params, r))
> +		__kvm_skip_instr(vcpu);
> +
> +	if (!params.is_write)
> +		vcpu_set_reg(vcpu, Rt, params.regval);
> +
> +	return true;
> +}
> -- 
> 2.33.0.464.g1972c5931b-goog
>

Other than the nits and suggestion to try and build in some register list
consistency checks, this looks good to me. I don't know what pKVM
should / should not expose, but I like the approach this takes, so,
FWIW,

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05  8:52     ` Andrew Jones
  0 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-05  8:52 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, oupton, qperret, kvm,
	linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:47:00PM +0100, Fuad Tabba wrote:
> Add system register handlers for protected VMs. These cover Sys64
> registers (including feature id registers), and debug.
> 
> No functional change intended as these are not hooked in yet to
> the guest exit handlers introduced earlier. So when trapping is
> triggered, the exit handlers let the host handle it, as before.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
>  arch/arm64/include/asm/kvm_hyp.h           |   5 +
>  arch/arm64/kvm/arm.c                       |   5 +
>  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
>  6 files changed, 726 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> 
> diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> new file mode 100644
> index 000000000000..0ed06923f7e9
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> +#define __ARM64_KVM_FIXED_CONFIG_H__
> +
> +#include <asm/sysreg.h>
> +
> +/*
> + * This file contains definitions for features to be allowed or restricted for
> + * guest virtual machines, depending on the mode KVM is running in and on the
> + * type of guest that is running.
> + *
> + * The ALLOW masks represent a bitmask of feature fields that are allowed
> + * without any restrictions as long as they are supported by the system.
> + *
> + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> + * features that are restricted to support at most the specified feature.
> + *
> + * If a feature field is not present in either, than it is not supported.
> + *
> + * The approach taken for protected VMs is to allow features that are:
> + * - Needed by common Linux distributions (e.g., floating point)
> + * - Trivial to support, e.g., supporting the feature does not introduce or
> + * require tracking of additional state in KVM
> + * - Cannot be trapped or prevent the guest from using anyway
> + */
> +
> +/*
> + * Allow for protected VMs:
> + * - Floating-point and Advanced SIMD
> + * - Data Independent Timing
> + */
> +#define PVM_ID_AA64PFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - AArch64 guests only (no support for AArch32 guests):
> + *	AArch32 adds complexity in trap handling, emulation, condition codes,
> + *	etc...
> + * - RAS (v1)
> + *	Supported by KVM
> + */
> +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Branch Target Identification
> + * - Speculative Store Bypassing
> + */
> +#define PVM_ID_AA64PFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Mixed-endian
> + * - Distinction between Secure and Non-secure Memory
> + * - Mixed-endian at EL0 only
> + * - Non-context synchronizing exception entry and exit
> + */
> +#define PVM_ID_AA64MMFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - 40-bit IPA
> + * - 16-bit ASID
> + */
> +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Hardware translation table updates to Access flag and Dirty state
> + * - Number of VMID bits from CPU
> + * - Hierarchical Permission Disables
> + * - Privileged Access Never
> + * - SError interrupt exceptions from speculative reads
> + * - Enhanced Translation Synchronization
> + */
> +#define PVM_ID_AA64MMFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Common not Private translations
> + * - User Access Override
> + * - IESB bit in the SCTLR_ELx registers
> + * - Unaligned single-copy atomicity and atomic functions
> + * - ESR_ELx.EC value on an exception by read access to feature ID space
> + * - TTL field in address operations.
> + * - Break-before-make sequences when changing translation block size
> + * - E0PDx mechanism
> + */
> +#define PVM_ID_AA64MMFR2_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> +	)
> +
> +/*
> + * No support for Scalable Vectors for protected VMs:
> + *	Requires additional support from KVM, e.g., context-switching and
> + *	trapping at EL2
> + */
> +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> +
> +/*
> + * No support for debug, including breakpoints, and watchpoints for protected
> + * VMs:
> + *	The Arm architecture mandates support for at least the Armv8 debug
> + *	architecture, which would include at least 2 hardware breakpoints and
> + *	watchpoints. Providing that support to protected guests adds
> + *	considerable state and complexity. Therefore, the reserved value of 0 is
> + *	used for debug-related fields.
> + */
> +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> +
> +/*
> + * No support for implementation defined features.
> + */
> +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> +
> +/*
> + * No restrictions on instructions implemented in AArch64.
> + */
> +#define PVM_ID_AA64ISAR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> +	)
> +
> +#define PVM_ID_AA64ISAR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> +	)
> +
> +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 657d0c94cf82..5afd14ab15b9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
>  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
>  #endif
>  
> +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
>  
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..6aa7b0c5bf21 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
>  	void *addr = phys_to_virt(hyp_mem_base);
>  	int ret;
>  
> +	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> +	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> +	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
>  
>  	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
>  	if (ret)
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> new file mode 100644
> index 000000000000..0865163d363c
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> +
> +#include <asm/kvm_host.h>
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> +
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> +void __inject_undef64(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 8d741f71377f..0bbe37a18d5d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> new file mode 100644
> index 000000000000..ef8456c54b18
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include <hyp/adjust_pc.h>
> +
> +#include "../../sys_regs.h"
> +
> +/*
> + * Copies of the host's CPU features registers holding sanitized values at hyp.
> + */
> +u64 id_aa64pfr0_el1_sys_val;
> +u64 id_aa64pfr1_el1_sys_val;
> +u64 id_aa64isar0_el1_sys_val;
> +u64 id_aa64isar1_el1_sys_val;
> +u64 id_aa64mmfr2_el1_sys_val;
> +
> +static inline void inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> +
> +	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> +			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> +			     KVM_ARM64_PENDING_EXCEPTION);
> +
> +	__kvm_adjust_pc(vcpu);
> +
> +	write_sysreg_el1(esr, SYS_ESR);
> +	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> +}
> +
> +/*
> + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> + * sysregs are live.
> + */
> +void __inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +
> +	inject_undef64(vcpu);
> +
> +	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> +	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +}
> +
> +/*
> + * Accessor for undefined accesses.
> + */
> +static bool undef_access(struct kvm_vcpu *vcpu,
> +			 struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	__inject_undef64(vcpu);
> +	return false;
> +}
> +
> +/*
> + * Returns the restricted features values of the feature register based on the
> + * limitations in restrict_fields.
> + * A feature id field value of 0b0000 does not impose any restrictions.
> + * Note: Use only for unsigned feature field values.
> + */
> +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> +					    u64 restrict_fields)
> +{
> +	u64 value = 0UL;
> +	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> +
> +	/*
> +	 * According to the Arm Architecture Reference Manual, feature fields
> +	 * use increasing values to indicate increases in functionality.
> +	 * Iterate over the restricted feature fields and calculate the minimum
> +	 * unsigned value between the one supported by the system, and what the
> +	 * value is being restricted to.
> +	 */
> +	while (sys_reg_val && restrict_fields) {
> +		value |= min(sys_reg_val & mask, restrict_fields & mask);
> +		sys_reg_val &= ~mask;
> +		restrict_fields &= ~mask;
> +		mask <<= ARM64_FEATURE_FIELD_BITS;
> +	}
> +
> +	return value;
> +}
> +
> +/*
> + * Functions that return the value of feature id registers for protected VMs
> + * based on allowed features, system features, and KVM support.
> + */
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 set_mask = 0;
> +	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> +
> +	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> +		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> +
> +	/* Spectre and Meltdown mitigation in KVM */
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> +			       (u64)kvm->arch.pfr0_csv2);
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> +			       (u64)kvm->arch.pfr0_csv3);
> +
> +	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> +
> +	if (!kvm_has_mte(kvm))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> +
> +	return id_aa64pfr1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for Scalable Vectors, therefore, hyp has no sanitized
> +	 * copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, including breakpoints, and watchpoints,
> +	 * therefore, pKVM has no sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, therefore, hyp has no sanitized copy of the
> +	 * feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> +	return 0;
> +}

Reading the same function five times make me wonder if a generator macro
wouldn't be better for these.

> +
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> +{
> +	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> +
> +	if (!vcpu_has_ptrauth(vcpu))
> +		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> +
> +	return id_aa64isar1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> +{
> +	u64 set_mask;
> +
> +	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> +		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> +
> +	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> +}
> +
> +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r)
> +{
> +	u32 id = reg_to_encoding(r);
> +
> +	switch (id) {
> +	case SYS_ID_AA64PFR0_EL1:
> +		return get_pvm_id_aa64pfr0(vcpu);
> +	case SYS_ID_AA64PFR1_EL1:
> +		return get_pvm_id_aa64pfr1(vcpu);
> +	case SYS_ID_AA64ZFR0_EL1:
> +		return get_pvm_id_aa64zfr0(vcpu);
> +	case SYS_ID_AA64DFR0_EL1:
> +		return get_pvm_id_aa64dfr0(vcpu);
> +	case SYS_ID_AA64DFR1_EL1:
> +		return get_pvm_id_aa64dfr1(vcpu);
> +	case SYS_ID_AA64AFR0_EL1:
> +		return get_pvm_id_aa64afr0(vcpu);
> +	case SYS_ID_AA64AFR1_EL1:
> +		return get_pvm_id_aa64afr1(vcpu);
> +	case SYS_ID_AA64ISAR0_EL1:
> +		return get_pvm_id_aa64isar0(vcpu);
> +	case SYS_ID_AA64ISAR1_EL1:
> +		return get_pvm_id_aa64isar1(vcpu);
> +	case SYS_ID_AA64MMFR0_EL1:
> +		return get_pvm_id_aa64mmfr0(vcpu);
> +	case SYS_ID_AA64MMFR1_EL1:
> +		return get_pvm_id_aa64mmfr1(vcpu);
> +	case SYS_ID_AA64MMFR2_EL1:
> +		return get_pvm_id_aa64mmfr2(vcpu);
> +	default:
> +		/*
> +		 * Should never happen because all cases are covered in
> +		 * pvm_sys_reg_descs[] below.

I'd drop the 'below' word. It's not overly helpful and since code gets
moved it can go out of date.

> +		 */
> +		WARN_ON(1);

The above cases could also be generated by a macro. And I wonder if we can
come up with something that makes sure these separate lists stay
consistent with macros and build bugs in order to better avoid these
"should never happen" situations.

> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Accessor for AArch32 feature id registers.
> + *
> + * The value of these registers is "unknown" according to the spec if AArch32
> + * isn't supported.
> + */
> +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	/*
> +	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> +	 * of AArch32 feature id registers.
> +	 */
> +	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> +		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> +
> +	/* Use 0 for architecturally "unknown" values. */
> +	p->regval = 0;
> +	return true;
> +}
> +
> +/*
> + * Accessor for AArch64 feature id registers.
> + *
> + * If access is allowed, set the regval to the protected VM's view of the
> + * register and return true.
> + * Otherwise, inject an undefined exception and return false.
> + */
> +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	p->regval = read_id_reg(vcpu, r);
> +	return true;
> +}
> +
> +/* Mark the specified system register as an AArch32 feature id register. */
> +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> +
> +/* Mark the specified system register as an AArch64 feature id register. */
> +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> +
> +/* Mark the specified system register as not being handled in hyp. */
> +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> +
> +/*
> + * Architected system registers.
> + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> + *
> + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> + * it will lead to injecting an exception into the guest.
> + */
> +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> +	/* Cache maintenance by set/way operations are restricted. */
> +
> +	/* Debug and Trace Registers are restricted. */
> +
> +	/* AArch64 mappings of the AArch32 ID registers */
> +	/* CRm=1 */
> +	AARCH32(SYS_ID_PFR0_EL1),
> +	AARCH32(SYS_ID_PFR1_EL1),
> +	AARCH32(SYS_ID_DFR0_EL1),
> +	AARCH32(SYS_ID_AFR0_EL1),
> +	AARCH32(SYS_ID_MMFR0_EL1),
> +	AARCH32(SYS_ID_MMFR1_EL1),
> +	AARCH32(SYS_ID_MMFR2_EL1),
> +	AARCH32(SYS_ID_MMFR3_EL1),
> +
> +	/* CRm=2 */
> +	AARCH32(SYS_ID_ISAR0_EL1),
> +	AARCH32(SYS_ID_ISAR1_EL1),
> +	AARCH32(SYS_ID_ISAR2_EL1),
> +	AARCH32(SYS_ID_ISAR3_EL1),
> +	AARCH32(SYS_ID_ISAR4_EL1),
> +	AARCH32(SYS_ID_ISAR5_EL1),
> +	AARCH32(SYS_ID_MMFR4_EL1),
> +	AARCH32(SYS_ID_ISAR6_EL1),
> +
> +	/* CRm=3 */
> +	AARCH32(SYS_MVFR0_EL1),
> +	AARCH32(SYS_MVFR1_EL1),
> +	AARCH32(SYS_MVFR2_EL1),
> +	AARCH32(SYS_ID_PFR2_EL1),
> +	AARCH32(SYS_ID_DFR1_EL1),
> +	AARCH32(SYS_ID_MMFR5_EL1),
> +
> +	/* AArch64 ID registers */
> +	/* CRm=4 */
> +	AARCH64(SYS_ID_AA64PFR0_EL1),
> +	AARCH64(SYS_ID_AA64PFR1_EL1),
> +	AARCH64(SYS_ID_AA64ZFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR1_EL1),
> +	AARCH64(SYS_ID_AA64AFR0_EL1),
> +	AARCH64(SYS_ID_AA64AFR1_EL1),
> +	AARCH64(SYS_ID_AA64ISAR0_EL1),
> +	AARCH64(SYS_ID_AA64ISAR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR0_EL1),
> +	AARCH64(SYS_ID_AA64MMFR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR2_EL1),
> +
> +	HOST_HANDLED(SYS_SCTLR_EL1),
> +	HOST_HANDLED(SYS_ACTLR_EL1),
> +	HOST_HANDLED(SYS_CPACR_EL1),
> +
> +	HOST_HANDLED(SYS_RGSR_EL1),
> +	HOST_HANDLED(SYS_GCR_EL1),
> +
> +	/* Scalable Vector Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TTBR0_EL1),
> +	HOST_HANDLED(SYS_TTBR1_EL1),
> +	HOST_HANDLED(SYS_TCR_EL1),
> +
> +	HOST_HANDLED(SYS_APIAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APIBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APGAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APGAKEYHI_EL1),
> +
> +	HOST_HANDLED(SYS_AFSR0_EL1),
> +	HOST_HANDLED(SYS_AFSR1_EL1),
> +	HOST_HANDLED(SYS_ESR_EL1),
> +
> +	HOST_HANDLED(SYS_ERRIDR_EL1),
> +	HOST_HANDLED(SYS_ERRSELR_EL1),
> +	HOST_HANDLED(SYS_ERXFR_EL1),
> +	HOST_HANDLED(SYS_ERXCTLR_EL1),
> +	HOST_HANDLED(SYS_ERXSTATUS_EL1),
> +	HOST_HANDLED(SYS_ERXADDR_EL1),
> +	HOST_HANDLED(SYS_ERXMISC0_EL1),
> +	HOST_HANDLED(SYS_ERXMISC1_EL1),
> +
> +	HOST_HANDLED(SYS_TFSR_EL1),
> +	HOST_HANDLED(SYS_TFSRE0_EL1),
> +
> +	HOST_HANDLED(SYS_FAR_EL1),
> +	HOST_HANDLED(SYS_PAR_EL1),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_MAIR_EL1),
> +	HOST_HANDLED(SYS_AMAIR_EL1),
> +
> +	/* Limited Ordering Regions Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_VBAR_EL1),
> +	HOST_HANDLED(SYS_DISR_EL1),
> +
> +	/* GIC CPU Interface registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> +	HOST_HANDLED(SYS_TPIDR_EL1),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL1),
> +
> +	HOST_HANDLED(SYS_CNTKCTL_EL1),
> +
> +	HOST_HANDLED(SYS_CCSIDR_EL1),
> +	HOST_HANDLED(SYS_CLIDR_EL1),
> +	HOST_HANDLED(SYS_CSSELR_EL1),
> +	HOST_HANDLED(SYS_CTR_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TPIDR_EL0),
> +	HOST_HANDLED(SYS_TPIDRRO_EL0),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL0),
> +
> +	/* Activity Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CTL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_DACR32_EL2),
> +	HOST_HANDLED(SYS_IFSR32_EL2),
> +	HOST_HANDLED(SYS_FPEXC32_EL2),
> +};
> +
> +/*
> + * Handler for protected VM MSR, MRS or System instruction execution.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't, to be handled by the host.
> + */
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	unsigned long esr = kvm_vcpu_get_esr(vcpu);
> +	int Rt = kvm_vcpu_sys_get_rt(vcpu);
> +
> +	params = esr_sys64_to_params(esr);
> +	params.regval = vcpu_get_reg(vcpu, Rt);
> +
> +	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> +
> +	/* Undefined access (RESTRICTED). */
> +	if (r == NULL) {
> +		__inject_undef64(vcpu);
> +		return true;
> +	}
> +
> +	/* Handled by the host (HOST_HANDLED) */
> +	if (r->access == NULL)
> +		return false;
> +
> +	/* Handled by hyp: skip instruction if instructed to do so. */
> +	if (r->access(vcpu, &params, r))
> +		__kvm_skip_instr(vcpu);
> +
> +	if (!params.is_write)
> +		vcpu_set_reg(vcpu, Rt, params.regval);
> +
> +	return true;
> +}
> -- 
> 2.33.0.464.g1972c5931b-goog
>

Other than the nits, and the suggestion to try to build in some register
list consistency checks (a rough sketch of what I have in mind follows my
tag below), this looks good to me. I don't know what pKVM should / should
not expose, but I like the approach this takes, so, FWIW,

Reviewed-by: Andrew Jones <drjones@redhat.com>
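
Given the "Must be sorted ascending" requirement on the table, the
consistency check could be as simple as a one-time walk over
pvm_sys_reg_descs[] when pKVM initializes. Rough, untested sketch (it
assumes the ordering comparator used by the main sys_regs code,
cmp_sys_reg(), is or can be made visible to the nVHE copy; the function
name is only illustrative):

	/* Returns 0 if pvm_sys_reg_descs[] is sorted and unique, 1 otherwise. */
	int kvm_check_pvm_sysreg_table(void)
	{
		unsigned int i;

		for (i = 1; i < ARRAY_SIZE(pvm_sys_reg_descs); i++) {
			if (cmp_sys_reg(&pvm_sys_reg_descs[i - 1],
					&pvm_sys_reg_descs[i]) >= 0)
				return 1;
		}

		return 0;
	}

Calling it once on the pKVM init/finalize path and failing loudly would
catch an out-of-order entry as soon as one gets added.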

Thanks,
drew



^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 12/12] KVM: arm64: Handle protected guests at 32 bits
  2021-10-05  8:48     ` Marc Zyngier
@ 2021-10-05  9:05       ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05  9:05 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Marc,

On Tue, Oct 5, 2021 at 9:48 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 22 Sep 2021 13:47:04 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Protected KVM does not support protected AArch32 guests. However,
> > it is possible for the guest to force run AArch32, potentially
> > causing problems. Add an extra check so that if the hypervisor
> > catches the guest doing that, it can prevent the guest from
> > running again by resetting vcpu->arch.target and returning
> > ARM_EXCEPTION_IL.
> >
> > If this were to happen, The VMM can try and fix it by re-
> > initializing the vcpu with KVM_ARM_VCPU_INIT, however, this is
> > likely not possible for protected VMs.
> >
> > Adapted from commit 22f553842b14 ("KVM: arm64: Handle Asymmetric
> > AArch32 systems")
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/kvm/hyp/nvhe/switch.c | 40 ++++++++++++++++++++++++++++++++
> >  1 file changed, 40 insertions(+)
> >
> > diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
> > index 2bf5952f651b..d66226e49013 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/switch.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/switch.c
> > @@ -235,6 +235,43 @@ static const exit_handler_fn *kvm_get_exit_handler_array(struct kvm *kvm)
> >       return hyp_exit_handlers;
> >  }
> >
> > +/*
> > + * Some guests (e.g., protected VMs) might not be allowed to run in AArch32.
> > + * The ARMv8 architecture does not give the hypervisor a mechanism to prevent a
> > + * guest from dropping to AArch32 EL0 if implemented by the CPU. If the
> > + * hypervisor spots a guest in such a state, ensure it is handled, and don't
> > + * trust the host to spot or fix it.  The check below is based on the one in
> > + * kvm_arch_vcpu_ioctl_run().
> > + *
> > + * Returns false if the guest ran in AArch32 when it shouldn't have, and
> > + * thus should exit to the host, or true if the guest run loop can continue.
> > + */
> > +static bool handle_aarch32_guest(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     struct kvm *kvm = (struct kvm *) kern_hyp_va(vcpu->kvm);
>
> There is no need for an extra cast. kern_hyp_va() already provides a
> cast to the type of the parameter.

Will drop it.
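
i.e., with the cast dropped the line just becomes:

-	struct kvm *kvm = (struct kvm *) kern_hyp_va(vcpu->kvm);
+	struct kvm *kvm = kern_hyp_va(vcpu->kvm);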

> > +     bool is_aarch32_allowed =
> > +             FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0),
> > +                       get_pvm_id_aa64pfr0(vcpu)) >=
> > +                             ID_AA64PFR0_ELx_32BIT_64BIT;
> > +
> > +
> > +     if (kvm_vm_is_protected(kvm) &&
> > +         vcpu_mode_is_32bit(vcpu) &&
> > +         !is_aarch32_allowed) {
>
> Do we really need to go through this is_aarch32_allowed check?
> Protected VMs don't have AArch32, and we don't have the infrastructure
> to handle 32bit at all. For non-protected VMs, we already have what we
> need at EL1. So the extra check only adds complexity.

No. I could drop the check and add a build-time assertion instead, just to
make sure that AArch32 is never allowed for protected VMs.
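
Something along these lines, perhaps (a rough, untested sketch; it assumes
the EL0 field of the protected VMs' view of ID_AA64PFR0 is constrained via
PVM_ID_AA64PFR0_ALLOW from kvm_fixed_config.h, which may not be exactly how
the masks end up being split):

	/* pVMs must never advertise AArch32 at EL0; same idea for EL1. */
	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0),
			       PVM_ID_AA64PFR0_ALLOW) >=
		     ID_AA64PFR0_ELx_32BIT_64BIT);

That would also let handle_aarch32_guest() drop the runtime
is_aarch32_allowed computation entirely.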

Thanks,
/fuad

> > +             /*
> > +              * As we have caught the guest red-handed, decide that it isn't
> > +              * fit for purpose anymore by making the vcpu invalid. The VMM
> > +              * can try and fix it by re-initializing the vcpu with
> > +              * KVM_ARM_VCPU_INIT, however, this is likely not possible for
> > +              * protected VMs.
> > +              */
> > +             vcpu->arch.target = -1;
> > +             *exit_code = ARM_EXCEPTION_IL;
> > +             return false;
> > +     }
> > +
> > +     return true;
> > +}
> > +
> >  /* Switch to the guest for legacy non-VHE systems */
> >  int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> >  {
> > @@ -297,6 +334,9 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
> >               /* Jump in the fire! */
> >               exit_code = __guest_enter(vcpu);
> >
> > +             if (unlikely(!handle_aarch32_guest(vcpu, &exit_code)))
> > +                     break;
> > +
> >               /* And we're baaack! */
> >       } while (fixup_guest_exit(vcpu, &exit_code));
> >
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
  2021-09-22 12:47   ` Fuad Tabba
@ 2021-10-05  9:23     ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-10-05  9:23 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, 22 Sep 2021 13:47:01 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Protected VMs have more restricted features that need to be
> trapped. Moreover, the host should not be trusted to set the
> appropriate trapping registers and their values.
> 
> Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
> cptr_el2 at EL2 for protected guests, based on the values of the
> guest's feature id registers.
> 
> No functional change intended as trap handlers introduced in the
> previous patch are still not hooked in to the guest exit
> handlers.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h       |   1 +
>  arch/arm64/include/asm/kvm_host.h      |   2 +
>  arch/arm64/kvm/arm.c                   |   8 ++
>  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
>  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
>  7 files changed, 222 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c
> 
> diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> index e86045ac43ba..a460e1243cef 100644
> --- a/arch/arm64/include/asm/kvm_asm.h
> +++ b/arch/arm64/include/asm/kvm_asm.h
> @@ -64,6 +64,7 @@
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector		18
>  #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize		19
>  #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc			20
> +#define __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps		21
>  
>  #ifndef __ASSEMBLY__
>  
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index f8be56d5342b..4a323aa27a6b 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -780,6 +780,8 @@ static inline bool kvm_vm_is_protected(struct kvm *kvm)
>  	return false;
>  }
>  
> +void kvm_init_protected_traps(struct kvm_vcpu *vcpu);
> +
>  int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
>  bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
>  
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index 6aa7b0c5bf21..3af6d59d1919 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -620,6 +620,14 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
>  
>  	ret = kvm_arm_pmu_v3_enable(vcpu);
>  
> +	/*
> +	 * Initialize traps for protected VMs.
> +	 * NOTE: Move to run in EL2 directly, rather than via a hypercall, once
> +	 * the code is in place for first run initialization at EL2.
> +	 */
> +	if (kvm_vm_is_protected(kvm))
> +		kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
> +
>  	return ret;
>  }
>  
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> new file mode 100644
> index 000000000000..e6c259db6719
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> @@ -0,0 +1,14 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_PKVM_H__
> +#define __ARM64_KVM_NVHE_PKVM_H__
> +
> +#include <asm/kvm_host.h>
> +
> +void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_PKVM_H__ */

We need to stop adding these small files with only two lines in
them. Please merge this with nvhe/trap_handler.h, for example, and
rename the whole thing to pkvm.h if you want.

> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 0bbe37a18d5d..c3c11974fa3b 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o sys_regs.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> index 8ca1104f4774..f59e0870c343 100644
> --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> @@ -15,6 +15,7 @@
>  
>  #include <nvhe/mem_protect.h>
>  #include <nvhe/mm.h>
> +#include <nvhe/pkvm.h>
>  #include <nvhe/trap_handler.h>
>  
>  DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> @@ -160,6 +161,14 @@ static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
>  {
>  	cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
>  }
> +
> +static void handle___pkvm_vcpu_init_traps(struct kvm_cpu_context *host_ctxt)
> +{
> +	DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
> +
> +	__pkvm_vcpu_init_traps(kern_hyp_va(vcpu));
> +}
> +
>  typedef void (*hcall_t)(struct kvm_cpu_context *);
>  
>  #define HANDLE_FUNC(x)	[__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
> @@ -185,6 +194,7 @@ static const hcall_t host_hcall[] = {
>  	HANDLE_FUNC(__pkvm_host_share_hyp),
>  	HANDLE_FUNC(__pkvm_create_private_mapping),
>  	HANDLE_FUNC(__pkvm_prot_finalize),
> +	HANDLE_FUNC(__pkvm_vcpu_init_traps),
>  };
>  
>  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> new file mode 100644
> index 000000000000..cc6139631dc4
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> @@ -0,0 +1,186 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <linux/kvm_host.h>
> +#include <linux/mm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <nvhe/sys_regs.h>
> +
> +/*
> + * Set trap register values based on features in ID_AA64PFR0.
> + */
> +static void pvm_init_traps_aa64pfr0(struct kvm_vcpu *vcpu)
> +{
> +	const u64 feature_ids = get_pvm_id_aa64pfr0(vcpu);
> +	u64 hcr_set = 0;
> +	u64 hcr_clear = 0;
> +	u64 cptr_set = 0;
> +
> +	/* Trap AArch32 guests */
> +	if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), feature_ids) <
> +		    ID_AA64PFR0_ELx_32BIT_64BIT ||
> +	    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), feature_ids) <
> +		    ID_AA64PFR0_ELx_32BIT_64BIT)
> +		hcr_set |= HCR_RW | HCR_TID0;

We have defined that pVMs don't have AArch32 at all, so RW should always
be set. And if RW is set, TID0 serves no purpose: it only traps AArch32
EL1 accesses, and EL1 is AArch64 here.

I like the fact that this is all driven from the feature set, but it
is also a bit unreadable. So I'd drop it in favour of:

	u64 hcr_set = HCR_RW;

at the top of the function.
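
In diff form, roughly (untested sketch of the same change):

-	u64 hcr_set = 0;
+	/* pVMs are AArch64-only, so RW is always set. */
+	u64 hcr_set = HCR_RW;
 	u64 hcr_clear = 0;
 	u64 cptr_set = 0;
 
-	/* Trap AArch32 guests */
-	if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), feature_ids) <
-		    ID_AA64PFR0_ELx_32BIT_64BIT ||
-	    FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), feature_ids) <
-		    ID_AA64PFR0_ELx_32BIT_64BIT)
-		hcr_set |= HCR_RW | HCR_TID0;

with the RAS, AMU and SVE handling below left as is.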

> +
> +	/* Trap RAS unless all current versions are supported */
> +	if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), feature_ids) <
> +	    ID_AA64PFR0_RAS_V1P1) {
> +		hcr_set |= HCR_TERR | HCR_TEA;
> +		hcr_clear |= HCR_FIEN;
> +	}
> +
> +	/* Trap AMU */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_AMU), feature_ids)) {
> +		hcr_clear |= HCR_AMVOFFEN;
> +		cptr_set |= CPTR_EL2_TAM;
> +	}
> +
> +	/*
> +	 * Linux guests assume support for floating-point and Advanced SIMD. Do
> +	 * not change the trapping behavior for these from the KVM default.
> +	 */
> +	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
> +				PVM_ID_AA64PFR0_ALLOW));
> +	BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
> +				PVM_ID_AA64PFR0_ALLOW));
> +
> +	/* Trap SVE */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_SVE), feature_ids))
> +		cptr_set |= CPTR_EL2_TZ;
> +
> +	vcpu->arch.hcr_el2 |= hcr_set;
> +	vcpu->arch.hcr_el2 &= ~hcr_clear;
> +	vcpu->arch.cptr_el2 |= cptr_set;
> +}
> +
> +/*
> + * Set trap register values based on features in ID_AA64PFR1.
> + */
> +static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
> +{
> +	const u64 feature_ids = get_pvm_id_aa64pfr1(vcpu);
> +	u64 hcr_set = 0;
> +	u64 hcr_clear = 0;
> +
> +	/* Memory Tagging: Trap and Treat as Untagged if not supported. */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
> +		hcr_set |= HCR_TID5;
> +		hcr_clear |= HCR_DCT | HCR_ATA;
> +	}
> +
> +	vcpu->arch.hcr_el2 |= hcr_set;
> +	vcpu->arch.hcr_el2 &= ~hcr_clear;
> +}
> +
> +/*
> + * Set trap register values based on features in ID_AA64DFR0.
> + */
> +static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)
> +{
> +	const u64 feature_ids = get_pvm_id_aa64dfr0(vcpu);
> +	u64 mdcr_set = 0;
> +	u64 mdcr_clear = 0;
> +	u64 cptr_set = 0;
> +
> +	/* Trap/constrain PMU */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMUVER), feature_ids)) {
> +		mdcr_set |= MDCR_EL2_TPM | MDCR_EL2_TPMCR;
> +		mdcr_clear |= MDCR_EL2_HPME | MDCR_EL2_MTPME |
> +			      MDCR_EL2_HPMN_MASK;
> +	}
> +
> +	/* Trap Debug */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DEBUGVER), feature_ids))
> +		mdcr_set |= MDCR_EL2_TDRA | MDCR_EL2_TDA | MDCR_EL2_TDE;
> +
> +	/* Trap OS Double Lock */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DOUBLELOCK), feature_ids))
> +		mdcr_set |= MDCR_EL2_TDOSA;
> +
> +	/* Trap SPE */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMSVER), feature_ids)) {
> +		mdcr_set |= MDCR_EL2_TPMS;
> +		mdcr_clear |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
> +	}
> +
> +	/* Trap Trace Filter */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACE_FILT), feature_ids))
> +		mdcr_set |= MDCR_EL2_TTRF;
> +
> +	/* Trap Trace */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACEVER), feature_ids))
> +		cptr_set |= CPTR_EL2_TTA;
> +
> +	vcpu->arch.mdcr_el2 |= mdcr_set;
> +	vcpu->arch.mdcr_el2 &= ~mdcr_clear;
> +	vcpu->arch.cptr_el2 |= cptr_set;
> +}
> +
> +/*
> + * Set trap register values based on features in ID_AA64MMFR0.
> + */
> +static void pvm_init_traps_aa64mmfr0(struct kvm_vcpu *vcpu)
> +{
> +	const u64 feature_ids = get_pvm_id_aa64mmfr0(vcpu);
> +	u64 mdcr_set = 0;
> +
> +	/* Trap Debug Communications Channel registers */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_FGT), feature_ids))
> +		mdcr_set |= MDCR_EL2_TDCC;
> +
> +	vcpu->arch.mdcr_el2 |= mdcr_set;
> +}
> +
> +/*
> + * Set trap register values based on features in ID_AA64MMFR1.
> + */
> +static void pvm_init_traps_aa64mmfr1(struct kvm_vcpu *vcpu)
> +{
> +	const u64 feature_ids = get_pvm_id_aa64mmfr1(vcpu);
> +	u64 hcr_set = 0;
> +
> +	/* Trap LOR */
> +	if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_LOR), feature_ids))
> +		hcr_set |= HCR_TLOR;
> +
> +	vcpu->arch.hcr_el2 |= hcr_set;
> +}
> +
> +/*
> + * Set baseline trap register values.
> + */
> +static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
> +{
> +	const u64 hcr_trap_feat_regs = HCR_TID3;
> +	const u64 hcr_trap_impdef = HCR_TACR | HCR_TIDCP | HCR_TID1;
> +
> +	/*
> +	 * Always trap:
> +	 * - Feature id registers: to control features exposed to guests
> +	 * - Implementation-defined features
> +	 */
> +	vcpu->arch.hcr_el2 |= hcr_trap_feat_regs | hcr_trap_impdef;
> +
> +	/* Clear res0 and set res1 bits to trap potential new features. */
> +	vcpu->arch.hcr_el2 &= ~(HCR_RES0);
> +	vcpu->arch.mdcr_el2 &= ~(MDCR_EL2_RES0);
> +	vcpu->arch.cptr_el2 |= CPTR_NVHE_EL2_RES1;
> +	vcpu->arch.cptr_el2 &= ~(CPTR_NVHE_EL2_RES0);
> +}
> +
> +/*
> + * Initialize trap register values for protected VMs.
> + */
> +void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
> +{
> +	pvm_init_trap_regs(vcpu);
> +	pvm_init_traps_aa64pfr0(vcpu);
> +	pvm_init_traps_aa64pfr1(vcpu);
> +	pvm_init_traps_aa64dfr0(vcpu);
> +	pvm_init_traps_aa64mmfr0(vcpu);
> +	pvm_init_traps_aa64mmfr1(vcpu);
> +}

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
  2021-10-05  9:23     ` Marc Zyngier
@ 2021-10-05  9:33       ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05  9:33 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Marc,

On Tue, Oct 5, 2021 at 10:23 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 22 Sep 2021 13:47:01 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Protected VMs have more restricted features that need to be
> > trapped. Moreover, the host should not be trusted to set the
> > appropriate trapping registers and their values.
> >
> > Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
> > cptr_el2 at EL2 for protected guests, based on the values of the
> > guest's feature id registers.
> >
> > No functional change intended as trap handlers introduced in the
> > previous patch are still not hooked in to the guest exit
> > handlers.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_asm.h       |   1 +
> >  arch/arm64/include/asm/kvm_host.h      |   2 +
> >  arch/arm64/kvm/arm.c                   |   8 ++
> >  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
> >  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
> >  7 files changed, 222 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
> > index e86045ac43ba..a460e1243cef 100644
> > --- a/arch/arm64/include/asm/kvm_asm.h
> > +++ b/arch/arm64/include/asm/kvm_asm.h
> > @@ -64,6 +64,7 @@
> >  #define __KVM_HOST_SMCCC_FUNC___pkvm_cpu_set_vector          18
> >  #define __KVM_HOST_SMCCC_FUNC___pkvm_prot_finalize           19
> >  #define __KVM_HOST_SMCCC_FUNC___kvm_adjust_pc                        20
> > +#define __KVM_HOST_SMCCC_FUNC___pkvm_vcpu_init_traps         21
> >
> >  #ifndef __ASSEMBLY__
> >
> > diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> > index f8be56d5342b..4a323aa27a6b 100644
> > --- a/arch/arm64/include/asm/kvm_host.h
> > +++ b/arch/arm64/include/asm/kvm_host.h
> > @@ -780,6 +780,8 @@ static inline bool kvm_vm_is_protected(struct kvm *kvm)
> >       return false;
> >  }
> >
> > +void kvm_init_protected_traps(struct kvm_vcpu *vcpu);
> > +
> >  int kvm_arm_vcpu_finalize(struct kvm_vcpu *vcpu, int feature);
> >  bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu);
> >
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index 6aa7b0c5bf21..3af6d59d1919 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -620,6 +620,14 @@ static int kvm_vcpu_first_run_init(struct kvm_vcpu *vcpu)
> >
> >       ret = kvm_arm_pmu_v3_enable(vcpu);
> >
> > +     /*
> > +      * Initialize traps for protected VMs.
> > +      * NOTE: Move to run in EL2 directly, rather than via a hypercall, once
> > +      * the code is in place for first run initialization at EL2.
> > +      */
> > +     if (kvm_vm_is_protected(kvm))
> > +             kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
> > +
> >       return ret;
> >  }
> >
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/pkvm.h b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> > new file mode 100644
> > index 000000000000..e6c259db6719
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/pkvm.h
> > @@ -0,0 +1,14 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_PKVM_H__
> > +#define __ARM64_KVM_NVHE_PKVM_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_PKVM_H__ */
>
> We need to stop adding these small files with only two lines in
> them. Please merge this with nvhe/trap_handler.h, for example, and
> rename the whole thing to pkvm.h if you want.

Will do.

> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 0bbe37a18d5d..c3c11974fa3b 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o sys_regs.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o pkvm.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/hyp-main.c b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > index 8ca1104f4774..f59e0870c343 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > +++ b/arch/arm64/kvm/hyp/nvhe/hyp-main.c
> > @@ -15,6 +15,7 @@
> >
> >  #include <nvhe/mem_protect.h>
> >  #include <nvhe/mm.h>
> > +#include <nvhe/pkvm.h>
> >  #include <nvhe/trap_handler.h>
> >
> >  DEFINE_PER_CPU(struct kvm_nvhe_init_params, kvm_init_params);
> > @@ -160,6 +161,14 @@ static void handle___pkvm_prot_finalize(struct kvm_cpu_context *host_ctxt)
> >  {
> >       cpu_reg(host_ctxt, 1) = __pkvm_prot_finalize();
> >  }
> > +
> > +static void handle___pkvm_vcpu_init_traps(struct kvm_cpu_context *host_ctxt)
> > +{
> > +     DECLARE_REG(struct kvm_vcpu *, vcpu, host_ctxt, 1);
> > +
> > +     __pkvm_vcpu_init_traps(kern_hyp_va(vcpu));
> > +}
> > +
> >  typedef void (*hcall_t)(struct kvm_cpu_context *);
> >
> >  #define HANDLE_FUNC(x)       [__KVM_HOST_SMCCC_FUNC_##x] = (hcall_t)handle_##x
> > @@ -185,6 +194,7 @@ static const hcall_t host_hcall[] = {
> >       HANDLE_FUNC(__pkvm_host_share_hyp),
> >       HANDLE_FUNC(__pkvm_create_private_mapping),
> >       HANDLE_FUNC(__pkvm_prot_finalize),
> > +     HANDLE_FUNC(__pkvm_vcpu_init_traps),
> >  };
> >
> >  static void handle_host_hcall(struct kvm_cpu_context *host_ctxt)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/pkvm.c b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > new file mode 100644
> > index 000000000000..cc6139631dc4
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/pkvm.c
> > @@ -0,0 +1,186 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <linux/kvm_host.h>
> > +#include <linux/mm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <nvhe/sys_regs.h>
> > +
> > +/*
> > + * Set trap register values based on features in ID_AA64PFR0.
> > + */
> > +static void pvm_init_traps_aa64pfr0(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 feature_ids = get_pvm_id_aa64pfr0(vcpu);
> > +     u64 hcr_set = 0;
> > +     u64 hcr_clear = 0;
> > +     u64 cptr_set = 0;
> > +
> > +     /* Trap AArch32 guests */
> > +     if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), feature_ids) <
> > +                 ID_AA64PFR0_ELx_32BIT_64BIT ||
> > +         FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), feature_ids) <
> > +                 ID_AA64PFR0_ELx_32BIT_64BIT)
> > +             hcr_set |= HCR_RW | HCR_TID0;
>
> We have defined that pVMs don't have AArch32 at all. So RW should
> always be set. And if RW is set, the TID0 serves no purpose as EL1 is
> AArch64, as it only traps AArch32 EL1 accesses.
>
> I like the fact that this is all driven from the feature set, but it
> is also a bit unreadable. So I'd drop it in favour of:
>
>         u64 hcr_set = HCR_RW;
>
> at the top of the function.

Sure. What I could do, which I mentioned in a reply to your comments
on patch 12/12, is to add a build-time assertion that checks that
AArch32 is not supported for pVMs.
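
Something along these lines, as a rough sketch (reusing the
PVM_ID_AA64PFR0_RESTRICT_UNSIGNED mask from this series; the exact
form and placement are still open):

	/* Assert at build time that pVMs are AArch64-only at EL0 and EL1. */
	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0),
			       PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) !=
		     ID_AA64PFR0_ELx_64BIT_ONLY);
	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
			       PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) !=
		     ID_AA64PFR0_ELx_64BIT_ONLY);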

Cheers,
/fuad


> > +
> > +     /* Trap RAS unless all current versions are supported */
> > +     if (FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), feature_ids) <
> > +         ID_AA64PFR0_RAS_V1P1) {
> > +             hcr_set |= HCR_TERR | HCR_TEA;
> > +             hcr_clear |= HCR_FIEN;
> > +     }
> > +
> > +     /* Trap AMU */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_AMU), feature_ids)) {
> > +             hcr_clear |= HCR_AMVOFFEN;
> > +             cptr_set |= CPTR_EL2_TAM;
> > +     }
> > +
> > +     /*
> > +      * Linux guests assume support for floating-point and Advanced SIMD. Do
> > +      * not change the trapping behavior for these from the KVM default.
> > +      */
> > +     BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_FP),
> > +                             PVM_ID_AA64PFR0_ALLOW));
> > +     BUILD_BUG_ON(!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD),
> > +                             PVM_ID_AA64PFR0_ALLOW));
> > +
> > +     /* Trap SVE */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_SVE), feature_ids))
> > +             cptr_set |= CPTR_EL2_TZ;
> > +
> > +     vcpu->arch.hcr_el2 |= hcr_set;
> > +     vcpu->arch.hcr_el2 &= ~hcr_clear;
> > +     vcpu->arch.cptr_el2 |= cptr_set;
> > +}
> > +
> > +/*
> > + * Set trap register values based on features in ID_AA64PFR1.
> > + */
> > +static void pvm_init_traps_aa64pfr1(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 feature_ids = get_pvm_id_aa64pfr1(vcpu);
> > +     u64 hcr_set = 0;
> > +     u64 hcr_clear = 0;
> > +
> > +     /* Memory Tagging: Trap and Treat as Untagged if not supported. */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR1_MTE), feature_ids)) {
> > +             hcr_set |= HCR_TID5;
> > +             hcr_clear |= HCR_DCT | HCR_ATA;
> > +     }
> > +
> > +     vcpu->arch.hcr_el2 |= hcr_set;
> > +     vcpu->arch.hcr_el2 &= ~hcr_clear;
> > +}
> > +
> > +/*
> > + * Set trap register values based on features in ID_AA64DFR0.
> > + */
> > +static void pvm_init_traps_aa64dfr0(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 feature_ids = get_pvm_id_aa64dfr0(vcpu);
> > +     u64 mdcr_set = 0;
> > +     u64 mdcr_clear = 0;
> > +     u64 cptr_set = 0;
> > +
> > +     /* Trap/constrain PMU */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMUVER), feature_ids)) {
> > +             mdcr_set |= MDCR_EL2_TPM | MDCR_EL2_TPMCR;
> > +             mdcr_clear |= MDCR_EL2_HPME | MDCR_EL2_MTPME |
> > +                           MDCR_EL2_HPMN_MASK;
> > +     }
> > +
> > +     /* Trap Debug */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DEBUGVER), feature_ids))
> > +             mdcr_set |= MDCR_EL2_TDRA | MDCR_EL2_TDA | MDCR_EL2_TDE;
> > +
> > +     /* Trap OS Double Lock */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_DOUBLELOCK), feature_ids))
> > +             mdcr_set |= MDCR_EL2_TDOSA;
> > +
> > +     /* Trap SPE */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_PMSVER), feature_ids)) {
> > +             mdcr_set |= MDCR_EL2_TPMS;
> > +             mdcr_clear |= MDCR_EL2_E2PB_MASK << MDCR_EL2_E2PB_SHIFT;
> > +     }
> > +
> > +     /* Trap Trace Filter */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACE_FILT), feature_ids))
> > +             mdcr_set |= MDCR_EL2_TTRF;
> > +
> > +     /* Trap Trace */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64DFR0_TRACEVER), feature_ids))
> > +             cptr_set |= CPTR_EL2_TTA;
> > +
> > +     vcpu->arch.mdcr_el2 |= mdcr_set;
> > +     vcpu->arch.mdcr_el2 &= ~mdcr_clear;
> > +     vcpu->arch.cptr_el2 |= cptr_set;
> > +}
> > +
> > +/*
> > + * Set trap register values based on features in ID_AA64MMFR0.
> > + */
> > +static void pvm_init_traps_aa64mmfr0(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 feature_ids = get_pvm_id_aa64mmfr0(vcpu);
> > +     u64 mdcr_set = 0;
> > +
> > +     /* Trap Debug Communications Channel registers */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR0_FGT), feature_ids))
> > +             mdcr_set |= MDCR_EL2_TDCC;
> > +
> > +     vcpu->arch.mdcr_el2 |= mdcr_set;
> > +}
> > +
> > +/*
> > + * Set trap register values based on features in ID_AA64MMFR1.
> > + */
> > +static void pvm_init_traps_aa64mmfr1(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 feature_ids = get_pvm_id_aa64mmfr1(vcpu);
> > +     u64 hcr_set = 0;
> > +
> > +     /* Trap LOR */
> > +     if (!FIELD_GET(ARM64_FEATURE_MASK(ID_AA64MMFR1_LOR), feature_ids))
> > +             hcr_set |= HCR_TLOR;
> > +
> > +     vcpu->arch.hcr_el2 |= hcr_set;
> > +}
> > +
> > +/*
> > + * Set baseline trap register values.
> > + */
> > +static void pvm_init_trap_regs(struct kvm_vcpu *vcpu)
> > +{
> > +     const u64 hcr_trap_feat_regs = HCR_TID3;
> > +     const u64 hcr_trap_impdef = HCR_TACR | HCR_TIDCP | HCR_TID1;
> > +
> > +     /*
> > +      * Always trap:
> > +      * - Feature id registers: to control features exposed to guests
> > +      * - Implementation-defined features
> > +      */
> > +     vcpu->arch.hcr_el2 |= hcr_trap_feat_regs | hcr_trap_impdef;
> > +
> > +     /* Clear res0 and set res1 bits to trap potential new features. */
> > +     vcpu->arch.hcr_el2 &= ~(HCR_RES0);
> > +     vcpu->arch.mdcr_el2 &= ~(MDCR_EL2_RES0);
> > +     vcpu->arch.cptr_el2 |= CPTR_NVHE_EL2_RES1;
> > +     vcpu->arch.cptr_el2 &= ~(CPTR_NVHE_EL2_RES0);
> > +}
> > +
> > +/*
> > + * Initialize trap register values for protected VMs.
> > + */
> > +void __pkvm_vcpu_init_traps(struct kvm_vcpu *vcpu)
> > +{
> > +     pvm_init_trap_regs(vcpu);
> > +     pvm_init_traps_aa64pfr0(vcpu);
> > +     pvm_init_traps_aa64pfr1(vcpu);
> > +     pvm_init_traps_aa64dfr0(vcpu);
> > +     pvm_init_traps_aa64mmfr0(vcpu);
> > +     pvm_init_traps_aa64mmfr1(vcpu);
> > +}
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
  2021-09-22 12:47   ` Fuad Tabba
  (?)
@ 2021-10-05  9:53     ` Marc Zyngier
  -1 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-10-05  9:53 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, 22 Sep 2021 13:47:00 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Add system register handlers for protected VMs. These cover Sys64
> registers (including feature id registers), and debug.
> 
> No functional change intended as these are not yet hooked into
> the guest exit handlers introduced earlier. So when trapping is
> triggered, the exit handlers let the host handle it, as before.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
>  arch/arm64/include/asm/kvm_hyp.h           |   5 +
>  arch/arm64/kvm/arm.c                       |   5 +
>  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
>  6 files changed, 726 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> 
> diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> new file mode 100644
> index 000000000000..0ed06923f7e9
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> +#define __ARM64_KVM_FIXED_CONFIG_H__
> +
> +#include <asm/sysreg.h>
> +
> +/*
> + * This file contains definitions for features to be allowed or restricted for
> + * guest virtual machines, depending on the mode KVM is running in and on the
> + * type of guest that is running.
> + *
> + * The ALLOW masks represent a bitmask of feature fields that are allowed
> + * without any restrictions as long as they are supported by the system.
> + *
> + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> + * features that are restricted to support at most the specified feature.
> + *
> + * If a feature field is not present in either, then it is not supported.
> + *
> + * The approach taken for protected VMs is to allow features that are:
> + * - Needed by common Linux distributions (e.g., floating point)
> + * - Trivial to support, e.g., supporting the feature does not introduce or
> + * require tracking of additional state in KVM
> + * - Cannot be trapped, or that the guest cannot be prevented from using anyway
> + */
> +
> +/*
> + * Allow for protected VMs:
> + * - Floating-point and Advanced SIMD
> + * - Data Independent Timing
> + */
> +#define PVM_ID_AA64PFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - AArch64 guests only (no support for AArch32 guests):
> + *	AArch32 adds complexity in trap handling, emulation, condition codes,
> + *	etc...
> + * - RAS (v1)
> + *	Supported by KVM
> + */
> +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Branch Target Identification
> + * - Speculative Store Bypassing
> + */
> +#define PVM_ID_AA64PFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Mixed-endian
> + * - Distinction between Secure and Non-secure Memory
> + * - Mixed-endian at EL0 only
> + * - Non-context synchronizing exception entry and exit
> + */
> +#define PVM_ID_AA64MMFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - 40-bit IPA
> + * - 16-bit ASID
> + */
> +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Hardware translation table updates to Access flag and Dirty state
> + * - Number of VMID bits from CPU
> + * - Hierarchical Permission Disables
> + * - Privileged Access Never
> + * - SError interrupt exceptions from speculative reads
> + * - Enhanced Translation Synchronization
> + */
> +#define PVM_ID_AA64MMFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Common not Private translations
> + * - User Access Override
> + * - IESB bit in the SCTLR_ELx registers
> + * - Unaligned single-copy atomicity and atomic functions
> + * - ESR_ELx.EC value on an exception by read access to feature ID space
> + * - TTL field in address operations.
> + * - Break-before-make sequences when changing translation block size
> + * - E0PDx mechanism
> + */
> +#define PVM_ID_AA64MMFR2_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> +	)
> +
> +/*
> + * No support for Scalable Vectors for protected VMs:
> + *	Requires additional support from KVM, e.g., context-switching and
> + *	trapping at EL2
> + */
> +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> +
> +/*
> + * No support for debug, including breakpoints, and watchpoints for protected
> + * VMs:
> + *	The Arm architecture mandates support for at least the Armv8 debug
> + *	architecture, which would include at least 2 hardware breakpoints and
> + *	watchpoints. Providing that support to protected guests adds
> + *	considerable state and complexity. Therefore, the reserved value of 0 is
> + *	used for debug-related fields.
> + */
> +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> +
> +/*
> + * No support for implementation defined features.
> + */
> +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> +
> +/*
> + * No restrictions on instructions implemented in AArch64.
> + */
> +#define PVM_ID_AA64ISAR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> +	)
> +
> +#define PVM_ID_AA64ISAR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> +	)
> +
> +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 657d0c94cf82..5afd14ab15b9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
>  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
>  #endif
>  
> +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
>  
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..6aa7b0c5bf21 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
>  	void *addr = phys_to_virt(hyp_mem_base);
>  	int ret;
>  
> +	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> +	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> +	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
>  
>  	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
>  	if (ret)
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> new file mode 100644
> index 000000000000..0865163d363c
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> +
> +#include <asm/kvm_host.h>
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> +
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> +void __inject_undef64(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 8d741f71377f..0bbe37a18d5d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> new file mode 100644
> index 000000000000..ef8456c54b18
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include <hyp/adjust_pc.h>
> +
> +#include "../../sys_regs.h"
> +
> +/*
> + * Copies of the host's CPU features registers holding sanitized values at hyp.
> + */
> +u64 id_aa64pfr0_el1_sys_val;
> +u64 id_aa64pfr1_el1_sys_val;
> +u64 id_aa64isar0_el1_sys_val;
> +u64 id_aa64isar1_el1_sys_val;
> +u64 id_aa64mmfr2_el1_sys_val;
> +
> +static inline void inject_undef64(struct kvm_vcpu *vcpu)

Please drop the inline. The compiler will sort it out.

> +{
> +	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> +
> +	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> +			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> +			     KVM_ARM64_PENDING_EXCEPTION);
> +
> +	__kvm_adjust_pc(vcpu);
> +
> +	write_sysreg_el1(esr, SYS_ESR);
> +	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> +}
> +
> +/*
> + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> + * sysregs are live.
> + */
> +void __inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +
> +	inject_undef64(vcpu);

The naming is odd. __blah() is usually a primitive for blah(), while
you have it the other way around.

> +
> +	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> +	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +}
> +
> +/*
> + * Accessor for undefined accesses.
> + */
> +static bool undef_access(struct kvm_vcpu *vcpu,
> +			 struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	__inject_undef64(vcpu);
> +	return false;

An access exception is the result of a memory access. undef_access
makes my head spin because you are conflating two unrelated terms.

I suggest you merge all three functions in a single inject_undef64().
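
Something like this, purely as a sketch of the shape (ignoring the
other callers of __inject_undef64(), which would need adjusting to
the new signature):

	static bool inject_undef64(struct kvm_vcpu *vcpu,
				   struct sys_reg_params *p,
				   const struct sys_reg_desc *r)
	{
		u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);

		/* Snapshot the guest's PC/PSTATE while its sysregs are live. */
		*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
		*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);

		vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
				     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
				     KVM_ARM64_PENDING_EXCEPTION);

		__kvm_adjust_pc(vcpu);

		write_sysreg_el1(esr, SYS_ESR);
		write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);

		/* Write the adjusted PC/PSTATE back to the hardware. */
		write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
		write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);

		return false;
	}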

> +}
> +
> +/*
> + * Returns the restricted features values of the feature register based on the
> + * limitations in restrict_fields.
> + * A feature id field value of 0b0000 does not impose any restrictions.
> + * Note: Use only for unsigned feature field values.
> + */
> +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> +					    u64 restrict_fields)
> +{
> +	u64 value = 0UL;
> +	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> +
> +	/*
> +	 * According to the Arm Architecture Reference Manual, feature fields
> +	 * use increasing values to indicate increases in functionality.
> +	 * Iterate over the restricted feature fields and calculate the minimum
> +	 * unsigned value between the one supported by the system, and what the
> +	 * value is being restricted to.
> +	 */
> +	while (sys_reg_val && restrict_fields) {
> +		value |= min(sys_reg_val & mask, restrict_fields & mask);
> +		sys_reg_val &= ~mask;
> +		restrict_fields &= ~mask;
> +		mask <<= ARM64_FEATURE_FIELD_BITS;
> +	}
> +
> +	return value;
> +}
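
As a worked example with made-up values: for sys_reg_val = 0x21
(field 0 is 1, field 1 is 2) and restrict_fields = 0x12 (field 0 is 2,
field 1 is 1), each 4-bit field is clamped to the smaller of the two,
giving 0x11.
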
> +
> +/*
> + * Functions that return the value of feature id registers for protected VMs
> + * based on allowed features, system features, and KVM support.
> + */
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 set_mask = 0;
> +	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> +
> +	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> +		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> +
> +	/* Spectre and Meltdown mitigation in KVM */
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> +			       (u64)kvm->arch.pfr0_csv2);
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> +			       (u64)kvm->arch.pfr0_csv3);
> +
> +	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> +
> +	if (!kvm_has_mte(kvm))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> +
> +	return id_aa64pfr1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for Scalable Vectors, therefore, hyp has no sanitized
> +	 * copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, including breakpoints, and watchpoints,
> +	 * therefore, pKVM has no sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, therefore, hyp has no sanitized copy of the
> +	 * feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> +{
> +	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> +
> +	if (!vcpu_has_ptrauth(vcpu))
> +		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> +
> +	return id_aa64isar1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> +{
> +	u64 set_mask;
> +
> +	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> +		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> +
> +	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> +}
> +
> +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r)
> +{
> +	u32 id = reg_to_encoding(r);
> +
> +	switch (id) {
> +	case SYS_ID_AA64PFR0_EL1:
> +		return get_pvm_id_aa64pfr0(vcpu);
> +	case SYS_ID_AA64PFR1_EL1:
> +		return get_pvm_id_aa64pfr1(vcpu);
> +	case SYS_ID_AA64ZFR0_EL1:
> +		return get_pvm_id_aa64zfr0(vcpu);
> +	case SYS_ID_AA64DFR0_EL1:
> +		return get_pvm_id_aa64dfr0(vcpu);
> +	case SYS_ID_AA64DFR1_EL1:
> +		return get_pvm_id_aa64dfr1(vcpu);
> +	case SYS_ID_AA64AFR0_EL1:
> +		return get_pvm_id_aa64afr0(vcpu);
> +	case SYS_ID_AA64AFR1_EL1:
> +		return get_pvm_id_aa64afr1(vcpu);
> +	case SYS_ID_AA64ISAR0_EL1:
> +		return get_pvm_id_aa64isar0(vcpu);
> +	case SYS_ID_AA64ISAR1_EL1:
> +		return get_pvm_id_aa64isar1(vcpu);
> +	case SYS_ID_AA64MMFR0_EL1:
> +		return get_pvm_id_aa64mmfr0(vcpu);
> +	case SYS_ID_AA64MMFR1_EL1:
> +		return get_pvm_id_aa64mmfr1(vcpu);
> +	case SYS_ID_AA64MMFR2_EL1:
> +		return get_pvm_id_aa64mmfr2(vcpu);
> +	default:
> +		/*
> +		 * Should never happen because all cases are covered in
> +		 * pvm_sys_reg_descs[] below.
> +		 */
> +		WARN_ON(1);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Accessor for AArch32 feature id registers.
> + *
> + * The value of these registers is "unknown" according to the spec if AArch32
> + * isn't supported.
> + */
> +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	/*
> +	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> +	 * of AArch32 feature id registers.
> +	 */
> +	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> +		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> +
> +	/* Use 0 for architecturally "unknown" values. */
> +	p->regval = 0;
> +	return true;
> +}
> +
> +/*
> + * Accessor for AArch64 feature id registers.
> + *
> + * If access is allowed, set the regval to the protected VM's view of the
> + * register and return true.
> + * Otherwise, inject an undefined exception and return false.
> + */
> +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	p->regval = read_id_reg(vcpu, r);
> +	return true;
> +}
> +
> +/* Mark the specified system register as an AArch32 feature id register. */
> +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> +
> +/* Mark the specified system register as an AArch64 feature id register. */
> +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> +
> +/* Mark the specified system register as not being handled in hyp. */
> +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> +
> +/*
> + * Architected system registers.
> + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> + *
> + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> + * it will lead to injecting an exception into the guest.
> + */
> +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> +	/* Cache maintenance by set/way operations are restricted. */
> +
> +	/* Debug and Trace Registers are restricted. */
> +
> +	/* AArch64 mappings of the AArch32 ID registers */
> +	/* CRm=1 */
> +	AARCH32(SYS_ID_PFR0_EL1),
> +	AARCH32(SYS_ID_PFR1_EL1),
> +	AARCH32(SYS_ID_DFR0_EL1),
> +	AARCH32(SYS_ID_AFR0_EL1),
> +	AARCH32(SYS_ID_MMFR0_EL1),
> +	AARCH32(SYS_ID_MMFR1_EL1),
> +	AARCH32(SYS_ID_MMFR2_EL1),
> +	AARCH32(SYS_ID_MMFR3_EL1),
> +
> +	/* CRm=2 */
> +	AARCH32(SYS_ID_ISAR0_EL1),
> +	AARCH32(SYS_ID_ISAR1_EL1),
> +	AARCH32(SYS_ID_ISAR2_EL1),
> +	AARCH32(SYS_ID_ISAR3_EL1),
> +	AARCH32(SYS_ID_ISAR4_EL1),
> +	AARCH32(SYS_ID_ISAR5_EL1),
> +	AARCH32(SYS_ID_MMFR4_EL1),
> +	AARCH32(SYS_ID_ISAR6_EL1),
> +
> +	/* CRm=3 */
> +	AARCH32(SYS_MVFR0_EL1),
> +	AARCH32(SYS_MVFR1_EL1),
> +	AARCH32(SYS_MVFR2_EL1),
> +	AARCH32(SYS_ID_PFR2_EL1),
> +	AARCH32(SYS_ID_DFR1_EL1),
> +	AARCH32(SYS_ID_MMFR5_EL1),
> +
> +	/* AArch64 ID registers */
> +	/* CRm=4 */
> +	AARCH64(SYS_ID_AA64PFR0_EL1),
> +	AARCH64(SYS_ID_AA64PFR1_EL1),
> +	AARCH64(SYS_ID_AA64ZFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR1_EL1),
> +	AARCH64(SYS_ID_AA64AFR0_EL1),
> +	AARCH64(SYS_ID_AA64AFR1_EL1),
> +	AARCH64(SYS_ID_AA64ISAR0_EL1),
> +	AARCH64(SYS_ID_AA64ISAR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR0_EL1),
> +	AARCH64(SYS_ID_AA64MMFR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR2_EL1),
> +
> +	HOST_HANDLED(SYS_SCTLR_EL1),
> +	HOST_HANDLED(SYS_ACTLR_EL1),
> +	HOST_HANDLED(SYS_CPACR_EL1),
> +
> +	HOST_HANDLED(SYS_RGSR_EL1),
> +	HOST_HANDLED(SYS_GCR_EL1),
> +
> +	/* Scalable Vector Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TTBR0_EL1),
> +	HOST_HANDLED(SYS_TTBR1_EL1),
> +	HOST_HANDLED(SYS_TCR_EL1),
> +
> +	HOST_HANDLED(SYS_APIAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APIBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APGAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APGAKEYHI_EL1),
> +
> +	HOST_HANDLED(SYS_AFSR0_EL1),
> +	HOST_HANDLED(SYS_AFSR1_EL1),
> +	HOST_HANDLED(SYS_ESR_EL1),
> +
> +	HOST_HANDLED(SYS_ERRIDR_EL1),
> +	HOST_HANDLED(SYS_ERRSELR_EL1),
> +	HOST_HANDLED(SYS_ERXFR_EL1),
> +	HOST_HANDLED(SYS_ERXCTLR_EL1),
> +	HOST_HANDLED(SYS_ERXSTATUS_EL1),
> +	HOST_HANDLED(SYS_ERXADDR_EL1),
> +	HOST_HANDLED(SYS_ERXMISC0_EL1),
> +	HOST_HANDLED(SYS_ERXMISC1_EL1),
> +
> +	HOST_HANDLED(SYS_TFSR_EL1),
> +	HOST_HANDLED(SYS_TFSRE0_EL1),
> +
> +	HOST_HANDLED(SYS_FAR_EL1),
> +	HOST_HANDLED(SYS_PAR_EL1),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_MAIR_EL1),
> +	HOST_HANDLED(SYS_AMAIR_EL1),
> +
> +	/* Limited Ordering Regions Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_VBAR_EL1),
> +	HOST_HANDLED(SYS_DISR_EL1),
> +
> +	/* GIC CPU Interface registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> +	HOST_HANDLED(SYS_TPIDR_EL1),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL1),
> +
> +	HOST_HANDLED(SYS_CNTKCTL_EL1),
> +
> +	HOST_HANDLED(SYS_CCSIDR_EL1),
> +	HOST_HANDLED(SYS_CLIDR_EL1),
> +	HOST_HANDLED(SYS_CSSELR_EL1),
> +	HOST_HANDLED(SYS_CTR_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TPIDR_EL0),
> +	HOST_HANDLED(SYS_TPIDRRO_EL0),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL0),
> +
> +	/* Activity Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CTL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_DACR32_EL2),
> +	HOST_HANDLED(SYS_IFSR32_EL2),
> +	HOST_HANDLED(SYS_FPEXC32_EL2),
> +};

It would be good if you had something that checks the ordering of this
array at boot time. It is incredibly easy to screw up the ordering,
and then everything goes subtly wrong.
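
Something as simple as the below would do (untested, and assuming
cmp_sys_reg() from sys_regs.h can be used at EL2), called once from the
pKVM init path and treated as fatal if it fails:

/* Ensure the table is sorted, as find_reg() relies on the ordering. */
static int check_pvm_sysreg_table(void)
{
	unsigned int i;

	for (i = 1; i < ARRAY_SIZE(pvm_sys_reg_descs); i++) {
		if (cmp_sys_reg(&pvm_sys_reg_descs[i - 1],
				&pvm_sys_reg_descs[i]) >= 0)
			return 1;
	}

	return 0;
}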

> +
> +/*
> + * Handler for protected VM MSR, MRS or System instruction execution.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't, to be handled by the host.
> + */
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	unsigned long esr = kvm_vcpu_get_esr(vcpu);
> +	int Rt = kvm_vcpu_sys_get_rt(vcpu);
> +
> +	params = esr_sys64_to_params(esr);
> +	params.regval = vcpu_get_reg(vcpu, Rt);
> +
> +	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> +
> +	/* Undefined access (RESTRICTED). */
> +	if (r == NULL) {
> +		__inject_undef64(vcpu);
> +		return true;
> +	}
> +
> +	/* Handled by the host (HOST_HANDLED) */
> +	if (r->access == NULL)
> +		return false;
> +
> +	/* Handled by hyp: skip instruction if instructed to do so. */
> +	if (r->access(vcpu, &params, r))
> +		__kvm_skip_instr(vcpu);
> +
> +	if (!params.is_write)
> +		vcpu_set_reg(vcpu, Rt, params.regval);
> +
> +	return true;
> +}
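
For what it's worth, I'd expect the later patch that wires this in to
reduce the SYS64 EC handler for protected guests to little more than the
below (the function name is only illustrative):

static bool kvm_handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
{
	/* true: resume the guest, false: punt the exit to the host */
	return kvm_handle_pvm_sysreg(vcpu, exit_code);
}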

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05  9:53     ` Marc Zyngier
  0 siblings, 0 replies; 90+ messages in thread
From: Marc Zyngier @ 2021-10-05  9:53 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

On Wed, 22 Sep 2021 13:47:00 +0100,
Fuad Tabba <tabba@google.com> wrote:
> 
> Add system register handlers for protected VMs. These cover Sys64
> registers (including feature id registers), and debug.
> 
> No functional change intended, as these are not yet hooked into
> the guest exit handlers introduced earlier. So when trapping is
> triggered, the exit handlers let the host handle it, as before.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
>  arch/arm64/include/asm/kvm_hyp.h           |   5 +
>  arch/arm64/kvm/arm.c                       |   5 +
>  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
>  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
>  6 files changed, 726 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> 
> diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> new file mode 100644
> index 000000000000..0ed06923f7e9
> --- /dev/null
> +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> @@ -0,0 +1,195 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> +#define __ARM64_KVM_FIXED_CONFIG_H__
> +
> +#include <asm/sysreg.h>
> +
> +/*
> + * This file contains definitions for features to be allowed or restricted for
> + * guest virtual machines, depending on the mode KVM is running in and on the
> + * type of guest that is running.
> + *
> + * The ALLOW masks represent a bitmask of feature fields that are allowed
> + * without any restrictions as long as they are supported by the system.
> + *
> + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> + * features that are restricted to support at most the specified feature
> + * level.
> + *
> + * If a feature field is not present in either, then it is not supported.
> + *
> + * The approach taken for protected VMs is to allow features that are:
> + * - Needed by common Linux distributions (e.g., floating point)
> + * - Trivial to support, e.g., supporting the feature does not introduce or
> + * require tracking of additional state in KVM
> + * - Impossible to trap, or impossible to prevent the guest from using anyway
> + */
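
To make sure I read this right, for a register that has both masks this
boils down to the following (matching get_pvm_id_aa64mmfr0() further down
in this patch):

u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
{
	/* Fields allowed as-is, provided the host supports them. */
	u64 val = id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW;

	/* Fields capped at the restricted level (per-field min). */
	val |= get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
				PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);

	/* Anything covered by neither mask reads as 0 (not supported). */
	return val;
}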
> +
> +/*
> + * Allow for protected VMs:
> + * - Floating-point and Advanced SIMD
> + * - Data Independent Timing
> + */
> +#define PVM_ID_AA64PFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - AArch64 guests only (no support for AArch32 guests):
> + *	AArch32 adds complexity in trap handling, emulation, condition codes,
> + *	etc...
> + * - RAS (v1)
> + *	Supported by KVM
> + */
> +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Branch Target Identification
> + * - Speculative Store Bypassing
> + */
> +#define PVM_ID_AA64PFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> +	ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Mixed-endian
> + * - Distinction between Secure and Non-secure Memory
> + * - Mixed-endian at EL0 only
> + * - Non-context synchronizing exception entry and exit
> + */
> +#define PVM_ID_AA64MMFR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> +	)
> +
> +/*
> + * Restrict to the following *unsigned* features for protected VMs:
> + * - 40-bit IPA
> + * - 16-bit ASID
> + */
> +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> +	FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Hardware translation table updates to Access flag and Dirty state
> + * - Number of VMID bits from CPU
> + * - Hierarchical Permission Disables
> + * - Privileged Access Never
> + * - SError interrupt exceptions from speculative reads
> + * - Enhanced Translation Synchronization
> + */
> +#define PVM_ID_AA64MMFR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> +	)
> +
> +/*
> + * Allow for protected VMs:
> + * - Common not Private translations
> + * - User Access Override
> + * - IESB bit in the SCTLR_ELx registers
> + * - Unaligned single-copy atomicity and atomic functions
> + * - ESR_ELx.EC value on an exception by read access to feature ID space
> + * - TTL field in address operations.
> + * - Break-before-make sequences when changing translation block size
> + * - E0PDx mechanism
> + */
> +#define PVM_ID_AA64MMFR2_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> +	ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> +	)
> +
> +/*
> + * No support for Scalable Vectors for protected VMs:
> + *	Requires additional support from KVM, e.g., context-switching and
> + *	trapping at EL2
> + */
> +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> +
> +/*
> + * No support for debug, including breakpoints and watchpoints, for protected
> + * VMs:
> + *	The Arm architecture mandates support for at least the Armv8 debug
> + *	architecture, which would include at least 2 hardware breakpoints and
> + *	watchpoints. Providing that support to protected guests adds
> + *	considerable state and complexity. Therefore, the reserved value of 0 is
> + *	used for debug-related fields.
> + */
> +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> +
> +/*
> + * No support for implementation defined features.
> + */
> +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> +
> +/*
> + * No restrictions on instructions implemented in AArch64.
> + */
> +#define PVM_ID_AA64ISAR0_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> +	)
> +
> +#define PVM_ID_AA64ISAR1_ALLOW (\
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> +	ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> +	)
> +
> +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> index 657d0c94cf82..5afd14ab15b9 100644
> --- a/arch/arm64/include/asm/kvm_hyp.h
> +++ b/arch/arm64/include/asm/kvm_hyp.h
> @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
>  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
>  #endif
>  
> +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
>  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
>  
>  #endif /* __ARM64_KVM_HYP_H__ */
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index fe102cd2e518..6aa7b0c5bf21 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
>  	void *addr = phys_to_virt(hyp_mem_base);
>  	int ret;
>  
> +	kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> +	kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> +	kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> +	kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> +	kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
>  
>  	ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
>  	if (ret)
> diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> new file mode 100644
> index 000000000000..0865163d363c
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> +
> +#include <asm/kvm_host.h>
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> +
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> +void __inject_undef64(struct kvm_vcpu *vcpu);
> +
> +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> index 8d741f71377f..0bbe37a18d5d 100644
> --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
>  
>  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
>  	 hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> -	 cache.o setup.o mm.o mem_protect.o
> +	 cache.o setup.o mm.o mem_protect.o sys_regs.o
>  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
>  	 ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
>  obj-y += $(lib-objs)
> diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> new file mode 100644
> index 000000000000..ef8456c54b18
> --- /dev/null
> +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> @@ -0,0 +1,492 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2021 Google LLC
> + * Author: Fuad Tabba <tabba@google.com>
> + */
> +
> +#include <asm/kvm_asm.h>
> +#include <asm/kvm_fixed_config.h>
> +#include <asm/kvm_mmu.h>
> +
> +#include <hyp/adjust_pc.h>
> +
> +#include "../../sys_regs.h"
> +
> +/*
> + * Copies of the host's CPU features registers holding sanitized values at hyp.
> + */
> +u64 id_aa64pfr0_el1_sys_val;
> +u64 id_aa64pfr1_el1_sys_val;
> +u64 id_aa64isar0_el1_sys_val;
> +u64 id_aa64isar1_el1_sys_val;
> +u64 id_aa64mmfr2_el1_sys_val;
> +
> +static inline void inject_undef64(struct kvm_vcpu *vcpu)

Please drop the inline. The compiler will sort it out.

> +{
> +	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> +
> +	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> +			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> +			     KVM_ARM64_PENDING_EXCEPTION);
> +
> +	__kvm_adjust_pc(vcpu);
> +
> +	write_sysreg_el1(esr, SYS_ESR);
> +	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> +}
> +
> +/*
> + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> + * sysregs are live.
> + */
> +void __inject_undef64(struct kvm_vcpu *vcpu)
> +{
> +	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> +	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> +
> +	inject_undef64(vcpu);

The naming is odd. __blah() is usually a primitive for blah(), while
you have it the other way around.

> +
> +	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> +	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> +}
> +
> +/*
> + * Accessor for undefined accesses.
> + */
> +static bool undef_access(struct kvm_vcpu *vcpu,
> +			 struct sys_reg_params *p,
> +			 const struct sys_reg_desc *r)
> +{
> +	__inject_undef64(vcpu);
> +	return false;

An access exception is the result of a memory access. undef_access
makes my head spin because you are conflating two unrelated terms.

I suggest you merge all three functions in a single inject_undef64().

> +}
> +
> +/*
> + * Returns the restricted features values of the feature register based on the
> + * limitations in restrict_fields.
> + * A feature id field value of 0b0000 does not impose any restrictions.
> + * Note: Use only for unsigned feature field values.
> + */
> +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> +					    u64 restrict_fields)
> +{
> +	u64 value = 0UL;
> +	u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> +
> +	/*
> +	 * According to the Arm Architecture Reference Manual, feature fields
> +	 * use increasing values to indicate increases in functionality.
> +	 * Iterate over the restricted feature fields and calculate the minimum
> +	 * unsigned value between the one supported by the system, and what the
> +	 * value is being restricted to.
> +	 */
> +	while (sys_reg_val && restrict_fields) {
> +		value |= min(sys_reg_val & mask, restrict_fields & mask);
> +		sys_reg_val &= ~mask;
> +		restrict_fields &= ~mask;
> +		mask <<= ARM64_FEATURE_FIELD_BITS;
> +	}
> +
> +	return value;
> +}
> +
> +/*
> + * Functions that return the value of feature id registers for protected VMs
> + * based on allowed features, system features, and KVM support.
> + */
> +
> +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 set_mask = 0;
> +	u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> +
> +	if (!vcpu_has_sve(vcpu))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> +
> +	set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> +		PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> +
> +	/* Spectre and Meltdown mitigation in KVM */
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> +			       (u64)kvm->arch.pfr0_csv2);
> +	set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> +			       (u64)kvm->arch.pfr0_csv3);
> +
> +	return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> +{
> +	const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> +	u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> +
> +	if (!kvm_has_mte(kvm))
> +		allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> +
> +	return id_aa64pfr1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for Scalable Vectors, therefore, hyp has no sanitized
> +	 * copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, including breakpoints, and watchpoints,
> +	 * therefore, pKVM has no sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for debug, therefore, hyp has no sanitized copy of the
> +	 * feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> +{
> +	/*
> +	 * No support for implementation defined features, therefore, hyp has no
> +	 * sanitized copy of the feature id register.
> +	 */
> +	BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> +	return 0;
> +}
> +
> +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> +{
> +	u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> +
> +	if (!vcpu_has_ptrauth(vcpu))
> +		allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> +				ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> +
> +	return id_aa64isar1_el1_sys_val & allow_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> +{
> +	u64 set_mask;
> +
> +	set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> +		PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> +
> +	return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> +}
> +
> +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> +}
> +
> +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> +{
> +	return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> +}
> +
> +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> +		       struct sys_reg_desc const *r)
> +{
> +	u32 id = reg_to_encoding(r);
> +
> +	switch (id) {
> +	case SYS_ID_AA64PFR0_EL1:
> +		return get_pvm_id_aa64pfr0(vcpu);
> +	case SYS_ID_AA64PFR1_EL1:
> +		return get_pvm_id_aa64pfr1(vcpu);
> +	case SYS_ID_AA64ZFR0_EL1:
> +		return get_pvm_id_aa64zfr0(vcpu);
> +	case SYS_ID_AA64DFR0_EL1:
> +		return get_pvm_id_aa64dfr0(vcpu);
> +	case SYS_ID_AA64DFR1_EL1:
> +		return get_pvm_id_aa64dfr1(vcpu);
> +	case SYS_ID_AA64AFR0_EL1:
> +		return get_pvm_id_aa64afr0(vcpu);
> +	case SYS_ID_AA64AFR1_EL1:
> +		return get_pvm_id_aa64afr1(vcpu);
> +	case SYS_ID_AA64ISAR0_EL1:
> +		return get_pvm_id_aa64isar0(vcpu);
> +	case SYS_ID_AA64ISAR1_EL1:
> +		return get_pvm_id_aa64isar1(vcpu);
> +	case SYS_ID_AA64MMFR0_EL1:
> +		return get_pvm_id_aa64mmfr0(vcpu);
> +	case SYS_ID_AA64MMFR1_EL1:
> +		return get_pvm_id_aa64mmfr1(vcpu);
> +	case SYS_ID_AA64MMFR2_EL1:
> +		return get_pvm_id_aa64mmfr2(vcpu);
> +	default:
> +		/*
> +		 * Should never happen because all cases are covered in
> +		 * pvm_sys_reg_descs[] below.
> +		 */
> +		WARN_ON(1);
> +		break;
> +	}
> +
> +	return 0;
> +}
> +
> +/*
> + * Accessor for AArch32 feature id registers.
> + *
> + * The value of these registers is "unknown" according to the spec if AArch32
> + * isn't supported.
> + */
> +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	/*
> +	 * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> +	 * of AArch32 feature id registers.
> +	 */
> +	BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> +		     PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> +
> +	/* Use 0 for architecturally "unknown" values. */
> +	p->regval = 0;
> +	return true;
> +}
> +
> +/*
> + * Accessor for AArch64 feature id registers.
> + *
> + * If access is allowed, set the regval to the protected VM's view of the
> + * register and return true.
> + * Otherwise, inject an undefined exception and return false.
> + */
> +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> +				  struct sys_reg_params *p,
> +				  const struct sys_reg_desc *r)
> +{
> +	if (p->is_write)
> +		return undef_access(vcpu, p, r);
> +
> +	p->regval = read_id_reg(vcpu, r);
> +	return true;
> +}
> +
> +/* Mark the specified system register as an AArch32 feature id register. */
> +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> +
> +/* Mark the specified system register as an AArch64 feature id register. */
> +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> +
> +/* Mark the specified system register as not being handled in hyp. */
> +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> +
> +/*
> + * Architected system registers.
> + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> + *
> + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> + * it will lead to injecting an exception into the guest.
> + */
> +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> +	/* Cache maintenance by set/way operations are restricted. */
> +
> +	/* Debug and Trace Registers are restricted. */
> +
> +	/* AArch64 mappings of the AArch32 ID registers */
> +	/* CRm=1 */
> +	AARCH32(SYS_ID_PFR0_EL1),
> +	AARCH32(SYS_ID_PFR1_EL1),
> +	AARCH32(SYS_ID_DFR0_EL1),
> +	AARCH32(SYS_ID_AFR0_EL1),
> +	AARCH32(SYS_ID_MMFR0_EL1),
> +	AARCH32(SYS_ID_MMFR1_EL1),
> +	AARCH32(SYS_ID_MMFR2_EL1),
> +	AARCH32(SYS_ID_MMFR3_EL1),
> +
> +	/* CRm=2 */
> +	AARCH32(SYS_ID_ISAR0_EL1),
> +	AARCH32(SYS_ID_ISAR1_EL1),
> +	AARCH32(SYS_ID_ISAR2_EL1),
> +	AARCH32(SYS_ID_ISAR3_EL1),
> +	AARCH32(SYS_ID_ISAR4_EL1),
> +	AARCH32(SYS_ID_ISAR5_EL1),
> +	AARCH32(SYS_ID_MMFR4_EL1),
> +	AARCH32(SYS_ID_ISAR6_EL1),
> +
> +	/* CRm=3 */
> +	AARCH32(SYS_MVFR0_EL1),
> +	AARCH32(SYS_MVFR1_EL1),
> +	AARCH32(SYS_MVFR2_EL1),
> +	AARCH32(SYS_ID_PFR2_EL1),
> +	AARCH32(SYS_ID_DFR1_EL1),
> +	AARCH32(SYS_ID_MMFR5_EL1),
> +
> +	/* AArch64 ID registers */
> +	/* CRm=4 */
> +	AARCH64(SYS_ID_AA64PFR0_EL1),
> +	AARCH64(SYS_ID_AA64PFR1_EL1),
> +	AARCH64(SYS_ID_AA64ZFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR0_EL1),
> +	AARCH64(SYS_ID_AA64DFR1_EL1),
> +	AARCH64(SYS_ID_AA64AFR0_EL1),
> +	AARCH64(SYS_ID_AA64AFR1_EL1),
> +	AARCH64(SYS_ID_AA64ISAR0_EL1),
> +	AARCH64(SYS_ID_AA64ISAR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR0_EL1),
> +	AARCH64(SYS_ID_AA64MMFR1_EL1),
> +	AARCH64(SYS_ID_AA64MMFR2_EL1),
> +
> +	HOST_HANDLED(SYS_SCTLR_EL1),
> +	HOST_HANDLED(SYS_ACTLR_EL1),
> +	HOST_HANDLED(SYS_CPACR_EL1),
> +
> +	HOST_HANDLED(SYS_RGSR_EL1),
> +	HOST_HANDLED(SYS_GCR_EL1),
> +
> +	/* Scalable Vector Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TTBR0_EL1),
> +	HOST_HANDLED(SYS_TTBR1_EL1),
> +	HOST_HANDLED(SYS_TCR_EL1),
> +
> +	HOST_HANDLED(SYS_APIAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APIBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APIBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDAKEYHI_EL1),
> +	HOST_HANDLED(SYS_APDBKEYLO_EL1),
> +	HOST_HANDLED(SYS_APDBKEYHI_EL1),
> +	HOST_HANDLED(SYS_APGAKEYLO_EL1),
> +	HOST_HANDLED(SYS_APGAKEYHI_EL1),
> +
> +	HOST_HANDLED(SYS_AFSR0_EL1),
> +	HOST_HANDLED(SYS_AFSR1_EL1),
> +	HOST_HANDLED(SYS_ESR_EL1),
> +
> +	HOST_HANDLED(SYS_ERRIDR_EL1),
> +	HOST_HANDLED(SYS_ERRSELR_EL1),
> +	HOST_HANDLED(SYS_ERXFR_EL1),
> +	HOST_HANDLED(SYS_ERXCTLR_EL1),
> +	HOST_HANDLED(SYS_ERXSTATUS_EL1),
> +	HOST_HANDLED(SYS_ERXADDR_EL1),
> +	HOST_HANDLED(SYS_ERXMISC0_EL1),
> +	HOST_HANDLED(SYS_ERXMISC1_EL1),
> +
> +	HOST_HANDLED(SYS_TFSR_EL1),
> +	HOST_HANDLED(SYS_TFSRE0_EL1),
> +
> +	HOST_HANDLED(SYS_FAR_EL1),
> +	HOST_HANDLED(SYS_PAR_EL1),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_MAIR_EL1),
> +	HOST_HANDLED(SYS_AMAIR_EL1),
> +
> +	/* Limited Ordering Regions Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_VBAR_EL1),
> +	HOST_HANDLED(SYS_DISR_EL1),
> +
> +	/* GIC CPU Interface registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> +	HOST_HANDLED(SYS_TPIDR_EL1),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL1),
> +
> +	HOST_HANDLED(SYS_CNTKCTL_EL1),
> +
> +	HOST_HANDLED(SYS_CCSIDR_EL1),
> +	HOST_HANDLED(SYS_CLIDR_EL1),
> +	HOST_HANDLED(SYS_CSSELR_EL1),
> +	HOST_HANDLED(SYS_CTR_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_TPIDR_EL0),
> +	HOST_HANDLED(SYS_TPIDRRO_EL0),
> +
> +	HOST_HANDLED(SYS_SCXTNUM_EL0),
> +
> +	/* Activity Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CTL_EL0),
> +	HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> +
> +	/* Performance Monitoring Registers are restricted. */
> +
> +	HOST_HANDLED(SYS_DACR32_EL2),
> +	HOST_HANDLED(SYS_IFSR32_EL2),
> +	HOST_HANDLED(SYS_FPEXC32_EL2),
> +};

It would be good if you had something that checks the ordering of this
array at boot time. It is incredibly easy to screw up the ordering,
and then everything goes subtly wrong.
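
Something along these lines would probably do (untested sketch: it assumes
a cmp_sys_reg()-style comparator is reachable from the hyp object, and the
function name is made up):

/* Walk the table once at pKVM init and fail if it isn't sorted. */
static int check_pvm_sysreg_table(void)
{
	unsigned int i;

	for (i = 1; i < ARRAY_SIZE(pvm_sys_reg_descs); i++) {
		/* Ordering is by Op0, Op1, CRn, CRm, Op2. */
		if (cmp_sys_reg(&pvm_sys_reg_descs[i - 1],
				&pvm_sys_reg_descs[i]) >= 0)
			return -EINVAL;
	}

	return 0;
}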

> +
> +/*
> + * Handler for protected VM MSR, MRS or System instruction execution.
> + *
> + * Returns true if the hypervisor has handled the exit, and control should go
> + * back to the guest, or false if it hasn't, to be handled by the host.
> + */
> +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> +{
> +	const struct sys_reg_desc *r;
> +	struct sys_reg_params params;
> +	unsigned long esr = kvm_vcpu_get_esr(vcpu);
> +	int Rt = kvm_vcpu_sys_get_rt(vcpu);
> +
> +	params = esr_sys64_to_params(esr);
> +	params.regval = vcpu_get_reg(vcpu, Rt);
> +
> +	r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> +
> +	/* Undefined access (RESTRICTED). */
> +	if (r == NULL) {
> +		__inject_undef64(vcpu);
> +		return true;
> +	}
> +
> +	/* Handled by the host (HOST_HANDLED) */
> +	if (r->access == NULL)
> +		return false;
> +
> +	/* Handled by hyp: skip instruction if instructed to do so. */
> +	if (r->access(vcpu, &params, r))
> +		__kvm_skip_instr(vcpu);
> +
> +	if (!params.is_write)
> +		vcpu_set_reg(vcpu, Rt, params.regval);
> +
> +	return true;
> +}

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
  2021-10-05  8:52     ` Andrew Jones
@ 2021-10-05 16:43       ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05 16:43 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kvmarm, maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, oupton, qperret, kvm,
	linux-arm-kernel, kernel-team

Hi Drew,

On Tue, Oct 5, 2021 at 9:53 AM Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Sep 22, 2021 at 01:47:00PM +0100, Fuad Tabba wrote:
> > Add system register handlers for protected VMs. These cover Sys64
> > registers (including feature id registers), and debug.
> >
> > No functional change intended as these are not hooked in yet to
> > the guest exit handlers introduced earlier. So when trapping is
> > triggered, the exit handlers let the host handle it, as before.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
> >  arch/arm64/include/asm/kvm_hyp.h           |   5 +
> >  arch/arm64/kvm/arm.c                       |   5 +
> >  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
> >  6 files changed, 726 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> > new file mode 100644
> > index 000000000000..0ed06923f7e9
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> > +#define __ARM64_KVM_FIXED_CONFIG_H__
> > +
> > +#include <asm/sysreg.h>
> > +
> > +/*
> > + * This file contains definitions for features to be allowed or restricted for
> > + * guest virtual machines, depending on the mode KVM is running in and on the
> > + * type of guest that is running.
> > + *
> > + * The ALLOW masks represent a bitmask of feature fields that are allowed
> > + * without any restrictions as long as they are supported by the system.
> > + *
> > + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> > + * features that are restricted to support at most the specified feature.
> > + *
> > + * If a feature field is not present in either, then it is not supported.
> > + *
> > + * The approach taken for protected VMs is to allow features that are:
> > + * - Needed by common Linux distributions (e.g., floating point)
> > + * - Trivial to support, e.g., supporting the feature does not introduce or
> > + * require tracking of additional state in KVM
> > + * - Cannot be trapped or prevent the guest from using anyway
> > + */
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Floating-point and Advanced SIMD
> > + * - Data Independent Timing
> > + */
> > +#define PVM_ID_AA64PFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - AArch64 guests only (no support for AArch32 guests):
> > + *   AArch32 adds complexity in trap handling, emulation, condition codes,
> > + *   etc...
> > + * - RAS (v1)
> > + *   Supported by KVM
> > + */
> > +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Branch Target Identification
> > + * - Speculative Store Bypassing
> > + */
> > +#define PVM_ID_AA64PFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Mixed-endian
> > + * - Distinction between Secure and Non-secure Memory
> > + * - Mixed-endian at EL0 only
> > + * - Non-context synchronizing exception entry and exit
> > + */
> > +#define PVM_ID_AA64MMFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - 40-bit IPA
> > + * - 16-bit ASID
> > + */
> > +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Hardware translation table updates to Access flag and Dirty state
> > + * - Number of VMID bits from CPU
> > + * - Hierarchical Permission Disables
> > + * - Privileged Access Never
> > + * - SError interrupt exceptions from speculative reads
> > + * - Enhanced Translation Synchronization
> > + */
> > +#define PVM_ID_AA64MMFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Common not Private translations
> > + * - User Access Override
> > + * - IESB bit in the SCTLR_ELx registers
> > + * - Unaligned single-copy atomicity and atomic functions
> > + * - ESR_ELx.EC value on an exception by read access to feature ID space
> > + * - TTL field in address operations.
> > + * - Break-before-make sequences when changing translation block size
> > + * - E0PDx mechanism
> > + */
> > +#define PVM_ID_AA64MMFR2_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> > +     )
> > +
> > +/*
> > + * No support for Scalable Vectors for protected VMs:
> > + *   Requires additional support from KVM, e.g., context-switching and
> > + *   trapping at EL2
> > + */
> > +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for debug, including breakpoints, and watchpoints for protected
> > + * VMs:
> > + *   The Arm architecture mandates support for at least the Armv8 debug
> > + *   architecture, which would include at least 2 hardware breakpoints and
> > + *   watchpoints. Providing that support to protected guests adds
> > + *   considerable state and complexity. Therefore, the reserved value of 0 is
> > + *   used for debug-related fields.
> > + */
> > +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for implementation defined features.
> > + */
> > +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No restrictions on instructions implemented in AArch64.
> > + */
> > +#define PVM_ID_AA64ISAR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> > +     )
> > +
> > +#define PVM_ID_AA64ISAR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> > +     )
> > +
> > +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 657d0c94cf82..5afd14ab15b9 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
> >  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
> >  #endif
> >
> > +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
> >
> >  #endif /* __ARM64_KVM_HYP_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fe102cd2e518..6aa7b0c5bf21 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
> >       void *addr = phys_to_virt(hyp_mem_base);
> >       int ret;
> >
> > +     kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +     kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> > +     kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
> >
> >       ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
> >       if (ret)
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > new file mode 100644
> > index 000000000000..0865163d363c
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> > +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> > +
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> > +void __inject_undef64(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 8d741f71377f..0bbe37a18d5d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > new file mode 100644
> > index 000000000000..ef8456c54b18
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -0,0 +1,492 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +#include <hyp/adjust_pc.h>
> > +
> > +#include "../../sys_regs.h"
> > +
> > +/*
> > + * Copies of the host's CPU features registers holding sanitized values at hyp.
> > + */
> > +u64 id_aa64pfr0_el1_sys_val;
> > +u64 id_aa64pfr1_el1_sys_val;
> > +u64 id_aa64isar0_el1_sys_val;
> > +u64 id_aa64isar1_el1_sys_val;
> > +u64 id_aa64mmfr2_el1_sys_val;
> > +
> > +static inline void inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> > +
> > +     vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> > +                          KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> > +                          KVM_ARM64_PENDING_EXCEPTION);
> > +
> > +     __kvm_adjust_pc(vcpu);
> > +
> > +     write_sysreg_el1(esr, SYS_ESR);
> > +     write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > +}
> > +
> > +/*
> > + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> > + * sysregs are live.
> > + */
> > +void __inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> > +     *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +
> > +     inject_undef64(vcpu);
> > +
> > +     write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> > +     write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +}
> > +
> > +/*
> > + * Accessor for undefined accesses.
> > + */
> > +static bool undef_access(struct kvm_vcpu *vcpu,
> > +                      struct sys_reg_params *p,
> > +                      const struct sys_reg_desc *r)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return false;
> > +}
> > +
> > +/*
> > + * Returns the restricted feature values of the feature register based on the
> > + * limitations in restrict_fields.
> > + * A feature id field value of 0b0000 does not impose any restrictions.
> > + * Note: Use only for unsigned feature field values.
> > + */
> > +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> > +                                         u64 restrict_fields)
> > +{
> > +     u64 value = 0UL;
> > +     u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> > +
> > +     /*
> > +      * According to the Arm Architecture Reference Manual, feature fields
> > +      * use increasing values to indicate increases in functionality.
> > +      * Iterate over the restricted feature fields and calculate the minimum
> > +      * unsigned value between the one supported by the system, and what the
> > +      * value is being restricted to.
> > +      */
> > +     while (sys_reg_val && restrict_fields) {
> > +             value |= min(sys_reg_val & mask, restrict_fields & mask);
> > +             sys_reg_val &= ~mask;
> > +             restrict_fields &= ~mask;
> > +             mask <<= ARM64_FEATURE_FIELD_BITS;
> > +     }
> > +
> > +     return value;
> > +}
> > +
> > +/*
> > + * Functions that return the value of feature id registers for protected VMs
> > + * based on allowed features, system features, and KVM support.
> > + */
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 set_mask = 0;
> > +     u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> > +
> > +     if (!vcpu_has_sve(vcpu))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> > +
> > +     set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> > +             PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > +
> > +     /* Spectre and Meltdown mitigation in KVM */
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> > +                            (u64)kvm->arch.pfr0_csv2);
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> > +                            (u64)kvm->arch.pfr0_csv3);
> > +
> > +     return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> > +
> > +     if (!kvm_has_mte(kvm))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> > +
> > +     return id_aa64pfr1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for Scalable Vectors, therefore, hyp has no sanitized
> > +      * copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, including breakpoints, and watchpoints,
> > +      * therefore, pKVM has no sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, therefore, hyp has no sanitized copy of the
> > +      * feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
>
> Reading the same function five times makes me wonder if a generator macro
> wouldn't be better for these.

I think so too. I'll do that.
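
Maybe something like this (untested sketch, macro name invented; the
downside is that the per-register comments explaining why each value is
fixed to zero would have to move next to the PVM_ID_*_ALLOW definitions):

/* Generate the accessors whose ALLOW mask pins the register to 0. */
#define pvm_id_reg_zero(name, reg)					\
u64 get_pvm_id_##name(const struct kvm_vcpu *vcpu)			\
{									\
	BUILD_BUG_ON(PVM_ID_##reg##_ALLOW != 0ULL);			\
	return 0;							\
}

pvm_id_reg_zero(aa64zfr0, AA64ZFR0)
pvm_id_reg_zero(aa64dfr0, AA64DFR0)
pvm_id_reg_zero(aa64dfr1, AA64DFR1)
pvm_id_reg_zero(aa64afr0, AA64AFR0)
pvm_id_reg_zero(aa64afr1, AA64AFR1)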

> > +
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> > +
> > +     if (!vcpu_has_ptrauth(vcpu))
> > +             allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> > +
> > +     return id_aa64isar1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 set_mask;
> > +
> > +     set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> > +             PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> > +
> > +     return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> > +}
> > +
> > +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> > +                    struct sys_reg_desc const *r)
> > +{
> > +     u32 id = reg_to_encoding(r);
> > +
> > +     switch (id) {
> > +     case SYS_ID_AA64PFR0_EL1:
> > +             return get_pvm_id_aa64pfr0(vcpu);
> > +     case SYS_ID_AA64PFR1_EL1:
> > +             return get_pvm_id_aa64pfr1(vcpu);
> > +     case SYS_ID_AA64ZFR0_EL1:
> > +             return get_pvm_id_aa64zfr0(vcpu);
> > +     case SYS_ID_AA64DFR0_EL1:
> > +             return get_pvm_id_aa64dfr0(vcpu);
> > +     case SYS_ID_AA64DFR1_EL1:
> > +             return get_pvm_id_aa64dfr1(vcpu);
> > +     case SYS_ID_AA64AFR0_EL1:
> > +             return get_pvm_id_aa64afr0(vcpu);
> > +     case SYS_ID_AA64AFR1_EL1:
> > +             return get_pvm_id_aa64afr1(vcpu);
> > +     case SYS_ID_AA64ISAR0_EL1:
> > +             return get_pvm_id_aa64isar0(vcpu);
> > +     case SYS_ID_AA64ISAR1_EL1:
> > +             return get_pvm_id_aa64isar1(vcpu);
> > +     case SYS_ID_AA64MMFR0_EL1:
> > +             return get_pvm_id_aa64mmfr0(vcpu);
> > +     case SYS_ID_AA64MMFR1_EL1:
> > +             return get_pvm_id_aa64mmfr1(vcpu);
> > +     case SYS_ID_AA64MMFR2_EL1:
> > +             return get_pvm_id_aa64mmfr2(vcpu);
> > +     default:
> > +             /*
> > +              * Should never happen because all cases are covered in
> > +              * pvm_sys_reg_descs[] below.
>
> I'd drop the 'below' word. It's not overly helpful and since code gets
> moved it can go out of date.

Will fix.

> > +              */
> > +             WARN_ON(1);
>
> The above cases could also be generated by a macro. And I wonder if we can
> come up with something that makes sure these separate lists stay
> consistent with macros and build bugs in order to better avoid these
> "should never happen" situations.

Which ties in to Marc's comment for this patch. I'll handle this.
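
One possible shape for that (sketch only, all names invented) is to drive
both the switch in read_id_reg() and the AArch64 ID block of
pvm_sys_reg_descs[] from a single list, so a register can't be added to
one without the other:

/* Single source of truth, kept in encoding order (Op0, Op1, CRn, CRm, Op2). */
#define PVM_ID_AA64_REGS(f)					\
	f(aa64pfr0,  AA64PFR0)	f(aa64pfr1,  AA64PFR1)		\
	f(aa64zfr0,  AA64ZFR0)	f(aa64dfr0,  AA64DFR0)		\
	f(aa64dfr1,  AA64DFR1)	f(aa64afr0,  AA64AFR0)		\
	f(aa64afr1,  AA64AFR1)	f(aa64isar0, AA64ISAR0)		\
	f(aa64isar1, AA64ISAR1)	f(aa64mmfr0, AA64MMFR0)		\
	f(aa64mmfr1, AA64MMFR1)	f(aa64mmfr2, AA64MMFR2)

/* In read_id_reg(): one case per entry. */
#define PVM_ID_REG_CASE(lo, up)					\
	case SYS_ID_##up##_EL1: return get_pvm_id_##lo(vcpu);

/* In pvm_sys_reg_descs[]: one descriptor per entry. */
#define PVM_ID_REG_DESC(lo, up)	AARCH64(SYS_ID_##up##_EL1),

with PVM_ID_AA64_REGS(PVM_ID_REG_CASE) expanded inside the switch and
PVM_ID_AA64_REGS(PVM_ID_REG_DESC) expanded in the table. That still leaves
the ordering of the rest of the table to Marc's suggested boot-time check.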

> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Accessor for AArch32 feature id registers.
> > + *
> > + * The value of these registers is "unknown" according to the spec if AArch32
> > + * isn't supported.
> > + */
> > +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     /*
> > +      * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> > +      * of AArch32 feature id registers.
> > +      */
> > +     BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> > +                  PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> > +
> > +     /* Use 0 for architecturally "unknown" values. */
> > +     p->regval = 0;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Accessor for AArch64 feature id registers.
> > + *
> > + * If access is allowed, set the regval to the protected VM's view of the
> > + * register and return true.
> > + * Otherwise, inject an undefined exception and return false.
> > + */
> > +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     p->regval = read_id_reg(vcpu, r);
> > +     return true;
> > +}
> > +
> > +/* Mark the specified system register as an AArch32 feature id register. */
> > +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> > +
> > +/* Mark the specified system register as an AArch64 feature id register. */
> > +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> > +
> > +/* Mark the specified system register as not being handled in hyp. */
> > +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> > +
> > +/*
> > + * Architected system registers.
> > + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> > + *
> > + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> > + * it will lead to injecting an exception into the guest.
> > + */
> > +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> > +     /* Cache maintenance by set/way operations are restricted. */
> > +
> > +     /* Debug and Trace Registers are restricted. */
> > +
> > +     /* AArch64 mappings of the AArch32 ID registers */
> > +     /* CRm=1 */
> > +     AARCH32(SYS_ID_PFR0_EL1),
> > +     AARCH32(SYS_ID_PFR1_EL1),
> > +     AARCH32(SYS_ID_DFR0_EL1),
> > +     AARCH32(SYS_ID_AFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR2_EL1),
> > +     AARCH32(SYS_ID_MMFR3_EL1),
> > +
> > +     /* CRm=2 */
> > +     AARCH32(SYS_ID_ISAR0_EL1),
> > +     AARCH32(SYS_ID_ISAR1_EL1),
> > +     AARCH32(SYS_ID_ISAR2_EL1),
> > +     AARCH32(SYS_ID_ISAR3_EL1),
> > +     AARCH32(SYS_ID_ISAR4_EL1),
> > +     AARCH32(SYS_ID_ISAR5_EL1),
> > +     AARCH32(SYS_ID_MMFR4_EL1),
> > +     AARCH32(SYS_ID_ISAR6_EL1),
> > +
> > +     /* CRm=3 */
> > +     AARCH32(SYS_MVFR0_EL1),
> > +     AARCH32(SYS_MVFR1_EL1),
> > +     AARCH32(SYS_MVFR2_EL1),
> > +     AARCH32(SYS_ID_PFR2_EL1),
> > +     AARCH32(SYS_ID_DFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR5_EL1),
> > +
> > +     /* AArch64 ID registers */
> > +     /* CRm=4 */
> > +     AARCH64(SYS_ID_AA64PFR0_EL1),
> > +     AARCH64(SYS_ID_AA64PFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ZFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR1_EL1),
> > +     AARCH64(SYS_ID_AA64AFR0_EL1),
> > +     AARCH64(SYS_ID_AA64AFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR0_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR0_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR2_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCTLR_EL1),
> > +     HOST_HANDLED(SYS_ACTLR_EL1),
> > +     HOST_HANDLED(SYS_CPACR_EL1),
> > +
> > +     HOST_HANDLED(SYS_RGSR_EL1),
> > +     HOST_HANDLED(SYS_GCR_EL1),
> > +
> > +     /* Scalable Vector Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TTBR0_EL1),
> > +     HOST_HANDLED(SYS_TTBR1_EL1),
> > +     HOST_HANDLED(SYS_TCR_EL1),
> > +
> > +     HOST_HANDLED(SYS_APIAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYHI_EL1),
> > +
> > +     HOST_HANDLED(SYS_AFSR0_EL1),
> > +     HOST_HANDLED(SYS_AFSR1_EL1),
> > +     HOST_HANDLED(SYS_ESR_EL1),
> > +
> > +     HOST_HANDLED(SYS_ERRIDR_EL1),
> > +     HOST_HANDLED(SYS_ERRSELR_EL1),
> > +     HOST_HANDLED(SYS_ERXFR_EL1),
> > +     HOST_HANDLED(SYS_ERXCTLR_EL1),
> > +     HOST_HANDLED(SYS_ERXSTATUS_EL1),
> > +     HOST_HANDLED(SYS_ERXADDR_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC0_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC1_EL1),
> > +
> > +     HOST_HANDLED(SYS_TFSR_EL1),
> > +     HOST_HANDLED(SYS_TFSRE0_EL1),
> > +
> > +     HOST_HANDLED(SYS_FAR_EL1),
> > +     HOST_HANDLED(SYS_PAR_EL1),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_MAIR_EL1),
> > +     HOST_HANDLED(SYS_AMAIR_EL1),
> > +
> > +     /* Limited Ordering Regions Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_VBAR_EL1),
> > +     HOST_HANDLED(SYS_DISR_EL1),
> > +
> > +     /* GIC CPU Interface registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> > +     HOST_HANDLED(SYS_TPIDR_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL1),
> > +
> > +     HOST_HANDLED(SYS_CNTKCTL_EL1),
> > +
> > +     HOST_HANDLED(SYS_CCSIDR_EL1),
> > +     HOST_HANDLED(SYS_CLIDR_EL1),
> > +     HOST_HANDLED(SYS_CSSELR_EL1),
> > +     HOST_HANDLED(SYS_CTR_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TPIDR_EL0),
> > +     HOST_HANDLED(SYS_TPIDRRO_EL0),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL0),
> > +
> > +     /* Activity Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CTL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_DACR32_EL2),
> > +     HOST_HANDLED(SYS_IFSR32_EL2),
> > +     HOST_HANDLED(SYS_FPEXC32_EL2),
> > +};
> > +
> > +/*
> > + * Handler for protected VM MSR, MRS or System instruction execution.
> > + *
> > + * Returns true if the hypervisor has handled the exit, and control should go
> > + * back to the guest, or false if it hasn't, to be handled by the host.
> > + */
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     unsigned long esr = kvm_vcpu_get_esr(vcpu);
> > +     int Rt = kvm_vcpu_sys_get_rt(vcpu);
> > +
> > +     params = esr_sys64_to_params(esr);
> > +     params.regval = vcpu_get_reg(vcpu, Rt);
> > +
> > +     r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> > +
> > +     /* Undefined access (RESTRICTED). */
> > +     if (r == NULL) {
> > +             __inject_undef64(vcpu);
> > +             return true;
> > +     }
> > +
> > +     /* Handled by the host (HOST_HANDLED) */
> > +     if (r->access == NULL)
> > +             return false;
> > +
> > +     /* Handled by hyp: skip instruction if instructed to do so. */
> > +     if (r->access(vcpu, &params, r))
> > +             __kvm_skip_instr(vcpu);
> > +
> > +     if (!params.is_write)
> > +             vcpu_set_reg(vcpu, Rt, params.regval);
> > +
> > +     return true;
> > +}
> > --
> > 2.33.0.464.g1972c5931b-goog
> >
>
> Other than the nits and suggestion to try and build in some register list
> consistency checks, this looks good to me. I don't know what pKVM
> should / should not expose, but I like the approach this takes, so,
> FWIW,
>
> Reviewed-by: Andrew Jones <drjones@redhat.com>

Thank you,
/fuad

> Thanks,
> drew
>

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05 16:43       ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05 16:43 UTC (permalink / raw)
  To: Andrew Jones
  Cc: kernel-team, kvm, maz, pbonzini, will, kvmarm, linux-arm-kernel

Hi Drew,

On Tue, Oct 5, 2021 at 9:53 AM Andrew Jones <drjones@redhat.com> wrote:
>
> On Wed, Sep 22, 2021 at 01:47:00PM +0100, Fuad Tabba wrote:
> > Add system register handlers for protected VMs. These cover Sys64
> > registers (including feature id registers), and debug.
> >
> > No functional change intended as these are not hooked in yet to
> > the guest exit handlers introduced earlier. So when trapping is
> > triggered, the exit handlers let the host handle it, as before.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
> >  arch/arm64/include/asm/kvm_hyp.h           |   5 +
> >  arch/arm64/kvm/arm.c                       |   5 +
> >  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
> >  6 files changed, 726 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> > new file mode 100644
> > index 000000000000..0ed06923f7e9
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> > +#define __ARM64_KVM_FIXED_CONFIG_H__
> > +
> > +#include <asm/sysreg.h>
> > +
> > +/*
> > + * This file contains definitions for features to be allowed or restricted for
> > + * guest virtual machines, depending on the mode KVM is running in and on the
> > + * type of guest that is running.
> > + *
> > + * The ALLOW masks represent a bitmask of feature fields that are allowed
> > + * without any restrictions as long as they are supported by the system.
> > + *
> > + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> > + * features that are restricted to support at most the specified feature.
> > + *
> > + * If a feature field is not present in either, then it is not supported.
> > + *
> > + * The approach taken for protected VMs is to allow features that are:
> > + * - Needed by common Linux distributions (e.g., floating point)
> > + * - Trivial to support, e.g., supporting the feature does not introduce or
> > + * require tracking of additional state in KVM
> > + * - Cannot be trapped or prevent the guest from using anyway
> > + */
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Floating-point and Advanced SIMD
> > + * - Data Independent Timing
> > + */
> > +#define PVM_ID_AA64PFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - AArch64 guests only (no support for AArch32 guests):
> > + *   AArch32 adds complexity in trap handling, emulation, condition codes,
> > + *   etc...
> > + * - RAS (v1)
> > + *   Supported by KVM
> > + */
> > +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Branch Target Identification
> > + * - Speculative Store Bypassing
> > + */
> > +#define PVM_ID_AA64PFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Mixed-endian
> > + * - Distinction between Secure and Non-secure Memory
> > + * - Mixed-endian at EL0 only
> > + * - Non-context synchronizing exception entry and exit
> > + */
> > +#define PVM_ID_AA64MMFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - 40-bit IPA
> > + * - 16-bit ASID
> > + */
> > +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Hardware translation table updates to Access flag and Dirty state
> > + * - Number of VMID bits from CPU
> > + * - Hierarchical Permission Disables
> > + * - Privileged Access Never
> > + * - SError interrupt exceptions from speculative reads
> > + * - Enhanced Translation Synchronization
> > + */
> > +#define PVM_ID_AA64MMFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Common not Private translations
> > + * - User Access Override
> > + * - IESB bit in the SCTLR_ELx registers
> > + * - Unaligned single-copy atomicity and atomic functions
> > + * - ESR_ELx.EC value on an exception by read access to feature ID space
> > + * - TTL field in address operations.
> > + * - Break-before-make sequences when changing translation block size
> > + * - E0PDx mechanism
> > + */
> > +#define PVM_ID_AA64MMFR2_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> > +     )
> > +
> > +/*
> > + * No support for Scalable Vectors for protected VMs:
> > + *   Requires additional support from KVM, e.g., context-switching and
> > + *   trapping at EL2
> > + */
> > +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for debug, including breakpoints, and watchpoints for protected
> > + * VMs:
> > + *   The Arm architecture mandates support for at least the Armv8 debug
> > + *   architecture, which would include at least 2 hardware breakpoints and
> > + *   watchpoints. Providing that support to protected guests adds
> > + *   considerable state and complexity. Therefore, the reserved value of 0 is
> > + *   used for debug-related fields.
> > + */
> > +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for implementation defined features.
> > + */
> > +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No restrictions on instructions implemented in AArch64.
> > + */
> > +#define PVM_ID_AA64ISAR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> > +     )
> > +
> > +#define PVM_ID_AA64ISAR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> > +     )
> > +
> > +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 657d0c94cf82..5afd14ab15b9 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
> >  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
> >  #endif
> >
> > +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
> >
> >  #endif /* __ARM64_KVM_HYP_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fe102cd2e518..6aa7b0c5bf21 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
> >       void *addr = phys_to_virt(hyp_mem_base);
> >       int ret;
> >
> > +     kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +     kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> > +     kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
> >
> >       ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
> >       if (ret)
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > new file mode 100644
> > index 000000000000..0865163d363c
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> > +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> > +
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> > +void __inject_undef64(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 8d741f71377f..0bbe37a18d5d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > new file mode 100644
> > index 000000000000..ef8456c54b18
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -0,0 +1,492 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +#include <hyp/adjust_pc.h>
> > +
> > +#include "../../sys_regs.h"
> > +
> > +/*
> > + * Copies of the host's CPU features registers holding sanitized values at hyp.
> > + */
> > +u64 id_aa64pfr0_el1_sys_val;
> > +u64 id_aa64pfr1_el1_sys_val;
> > +u64 id_aa64isar0_el1_sys_val;
> > +u64 id_aa64isar1_el1_sys_val;
> > +u64 id_aa64mmfr2_el1_sys_val;
> > +
> > +static inline void inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> > +
> > +     vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> > +                          KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> > +                          KVM_ARM64_PENDING_EXCEPTION);
> > +
> > +     __kvm_adjust_pc(vcpu);
> > +
> > +     write_sysreg_el1(esr, SYS_ESR);
> > +     write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > +}
> > +
> > +/*
> > + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> > + * sysregs are live.
> > + */
> > +void __inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> > +     *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +
> > +     inject_undef64(vcpu);
> > +
> > +     write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> > +     write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +}
> > +
> > +/*
> > + * Accessor for undefined accesses.
> > + */
> > +static bool undef_access(struct kvm_vcpu *vcpu,
> > +                      struct sys_reg_params *p,
> > +                      const struct sys_reg_desc *r)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return false;
> > +}
> > +
> > +/*
> > + * Returns the restricted features values of the feature register based on the
> > + * limitations in restrict_fields.
> > + * A feature id field value of 0b0000 does not impose any restrictions.
> > + * Note: Use only for unsigned feature field values.
> > + */
> > +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> > +                                         u64 restrict_fields)
> > +{
> > +     u64 value = 0UL;
> > +     u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> > +
> > +     /*
> > +      * According to the Arm Architecture Reference Manual, feature fields
> > +      * use increasing values to indicate increases in functionality.
> > +      * Iterate over the restricted feature fields and calculate the minimum
> > +      * unsigned value between the one supported by the system, and what the
> > +      * value is being restricted to.
> > +      */
> > +     while (sys_reg_val && restrict_fields) {
> > +             value |= min(sys_reg_val & mask, restrict_fields & mask);
> > +             sys_reg_val &= ~mask;
> > +             restrict_fields &= ~mask;
> > +             mask <<= ARM64_FEATURE_FIELD_BITS;
> > +     }
> > +
> > +     return value;
> > +}
> > +
> > +/*
> > + * Functions that return the value of feature id registers for protected VMs
> > + * based on allowed features, system features, and KVM support.
> > + */
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 set_mask = 0;
> > +     u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> > +
> > +     if (!vcpu_has_sve(vcpu))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> > +
> > +     set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> > +             PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > +
> > +     /* Spectre and Meltdown mitigation in KVM */
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> > +                            (u64)kvm->arch.pfr0_csv2);
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> > +                            (u64)kvm->arch.pfr0_csv3);
> > +
> > +     return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> > +
> > +     if (!kvm_has_mte(kvm))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> > +
> > +     return id_aa64pfr1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for Scalable Vectors, therefore, hyp has no sanitized
> > +      * copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, including breakpoints, and watchpoints,
> > +      * therefore, pKVM has no sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, therefore, hyp has no sanitized copy of the
> > +      * feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
>
> Reading the same function five times makes me wonder if a generator macro
> wouldn't be better for these.

I think so too. I'll do that.
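
Something like the below might do it (a rough, untested sketch; the macro
name is just a placeholder, and the BUILD_BUG_ON is kept from the existing
functions), replacing the five copies:

/*
 * Untested sketch: generate the accessors for ID registers that pKVM
 * exposes as all-zero (no hyp-sanitized copy of the register exists).
 */
#define PVM_ID_REG_UNALLOCATED(name, reg)				\
u64 get_pvm_id_##name(const struct kvm_vcpu *vcpu)			\
{									\
	BUILD_BUG_ON(PVM_ID_##reg##_ALLOW != 0ULL);			\
	return 0;							\
}

PVM_ID_REG_UNALLOCATED(aa64zfr0, AA64ZFR0)
PVM_ID_REG_UNALLOCATED(aa64dfr0, AA64DFR0)
PVM_ID_REG_UNALLOCATED(aa64dfr1, AA64DFR1)
PVM_ID_REG_UNALLOCATED(aa64afr0, AA64AFR0)
PVM_ID_REG_UNALLOCATED(aa64afr1, AA64AFR1)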

> > +
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> > +
> > +     if (!vcpu_has_ptrauth(vcpu))
> > +             allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> > +
> > +     return id_aa64isar1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 set_mask;
> > +
> > +     set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> > +             PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> > +
> > +     return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> > +}
> > +
> > +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> > +                    struct sys_reg_desc const *r)
> > +{
> > +     u32 id = reg_to_encoding(r);
> > +
> > +     switch (id) {
> > +     case SYS_ID_AA64PFR0_EL1:
> > +             return get_pvm_id_aa64pfr0(vcpu);
> > +     case SYS_ID_AA64PFR1_EL1:
> > +             return get_pvm_id_aa64pfr1(vcpu);
> > +     case SYS_ID_AA64ZFR0_EL1:
> > +             return get_pvm_id_aa64zfr0(vcpu);
> > +     case SYS_ID_AA64DFR0_EL1:
> > +             return get_pvm_id_aa64dfr0(vcpu);
> > +     case SYS_ID_AA64DFR1_EL1:
> > +             return get_pvm_id_aa64dfr1(vcpu);
> > +     case SYS_ID_AA64AFR0_EL1:
> > +             return get_pvm_id_aa64afr0(vcpu);
> > +     case SYS_ID_AA64AFR1_EL1:
> > +             return get_pvm_id_aa64afr1(vcpu);
> > +     case SYS_ID_AA64ISAR0_EL1:
> > +             return get_pvm_id_aa64isar0(vcpu);
> > +     case SYS_ID_AA64ISAR1_EL1:
> > +             return get_pvm_id_aa64isar1(vcpu);
> > +     case SYS_ID_AA64MMFR0_EL1:
> > +             return get_pvm_id_aa64mmfr0(vcpu);
> > +     case SYS_ID_AA64MMFR1_EL1:
> > +             return get_pvm_id_aa64mmfr1(vcpu);
> > +     case SYS_ID_AA64MMFR2_EL1:
> > +             return get_pvm_id_aa64mmfr2(vcpu);
> > +     default:
> > +             /*
> > +              * Should never happen because all cases are covered in
> > +              * pvm_sys_reg_descs[] below.
>
> I'd drop the 'below' word. It's not overly helpful and since code gets
> moved it can go out of date.

Will fix.

> > +              */
> > +             WARN_ON(1);
>
> The above cases could also be generated by a macro. And I wonder if we can
> come up with something, using macros and build bugs, that makes sure these
> separate lists stay consistent, to better avoid these
> "should never happen" situations.

Which ties in to Marc's comment for this patch. I'll handle this.
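
For instance, something x-macro-ish might work (a rough, untested sketch;
all the new names are placeholders): keep one list of the AArch64 ID
registers handled at hyp, and expand it once for the switch in
read_id_reg() and once for the AARCH64() entries in pvm_sys_reg_descs[],
so the two users can't drift apart:

#define PVM_ID_AA64_REGS(X)			\
	X(AA64PFR0,  aa64pfr0)			\
	X(AA64PFR1,  aa64pfr1)			\
	X(AA64ZFR0,  aa64zfr0)			\
	X(AA64DFR0,  aa64dfr0)			\
	X(AA64DFR1,  aa64dfr1)			\
	X(AA64AFR0,  aa64afr0)			\
	X(AA64AFR1,  aa64afr1)			\
	X(AA64ISAR0, aa64isar0)			\
	X(AA64ISAR1, aa64isar1)			\
	X(AA64MMFR0, aa64mmfr0)			\
	X(AA64MMFR1, aa64mmfr1)			\
	X(AA64MMFR2, aa64mmfr2)

/* Expanded inside the switch in read_id_reg(): */
#define PVM_ID_CASE(UPPER, lower)		\
	case SYS_ID_##UPPER##_EL1:		\
		return get_pvm_id_##lower(vcpu);

/* Expanded in the pvm_sys_reg_descs[] initializer: */
#define PVM_ID_DESC(UPPER, lower)		\
	AARCH64(SYS_ID_##UPPER##_EL1),

read_id_reg() would then be PVM_ID_AA64_REGS(PVM_ID_CASE) plus the
default case, and the table would use PVM_ID_AA64_REGS(PVM_ID_DESC) in
place of the twelve AARCH64() lines.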

> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Accessor for AArch32 feature id registers.
> > + *
> > + * The value of these registers is "unknown" according to the spec if AArch32
> > + * isn't supported.
> > + */
> > +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     /*
> > +      * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> > +      * of AArch32 feature id registers.
> > +      */
> > +     BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> > +                  PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> > +
> > +     /* Use 0 for architecturally "unknown" values. */
> > +     p->regval = 0;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Accessor for AArch64 feature id registers.
> > + *
> > + * If access is allowed, set the regval to the protected VM's view of the
> > + * register and return true.
> > + * Otherwise, inject an undefined exception and return false.
> > + */
> > +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     p->regval = read_id_reg(vcpu, r);
> > +     return true;
> > +}
> > +
> > +/* Mark the specified system register as an AArch32 feature id register. */
> > +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> > +
> > +/* Mark the specified system register as an AArch64 feature id register. */
> > +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> > +
> > +/* Mark the specified system register as not being handled in hyp. */
> > +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> > +
> > +/*
> > + * Architected system registers.
> > + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> > + *
> > + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> > + * it will lead to injecting an exception into the guest.
> > + */
> > +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> > +     /* Cache maintenance by set/way operations are restricted. */
> > +
> > +     /* Debug and Trace Registers are restricted. */
> > +
> > +     /* AArch64 mappings of the AArch32 ID registers */
> > +     /* CRm=1 */
> > +     AARCH32(SYS_ID_PFR0_EL1),
> > +     AARCH32(SYS_ID_PFR1_EL1),
> > +     AARCH32(SYS_ID_DFR0_EL1),
> > +     AARCH32(SYS_ID_AFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR2_EL1),
> > +     AARCH32(SYS_ID_MMFR3_EL1),
> > +
> > +     /* CRm=2 */
> > +     AARCH32(SYS_ID_ISAR0_EL1),
> > +     AARCH32(SYS_ID_ISAR1_EL1),
> > +     AARCH32(SYS_ID_ISAR2_EL1),
> > +     AARCH32(SYS_ID_ISAR3_EL1),
> > +     AARCH32(SYS_ID_ISAR4_EL1),
> > +     AARCH32(SYS_ID_ISAR5_EL1),
> > +     AARCH32(SYS_ID_MMFR4_EL1),
> > +     AARCH32(SYS_ID_ISAR6_EL1),
> > +
> > +     /* CRm=3 */
> > +     AARCH32(SYS_MVFR0_EL1),
> > +     AARCH32(SYS_MVFR1_EL1),
> > +     AARCH32(SYS_MVFR2_EL1),
> > +     AARCH32(SYS_ID_PFR2_EL1),
> > +     AARCH32(SYS_ID_DFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR5_EL1),
> > +
> > +     /* AArch64 ID registers */
> > +     /* CRm=4 */
> > +     AARCH64(SYS_ID_AA64PFR0_EL1),
> > +     AARCH64(SYS_ID_AA64PFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ZFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR1_EL1),
> > +     AARCH64(SYS_ID_AA64AFR0_EL1),
> > +     AARCH64(SYS_ID_AA64AFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR0_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR0_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR2_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCTLR_EL1),
> > +     HOST_HANDLED(SYS_ACTLR_EL1),
> > +     HOST_HANDLED(SYS_CPACR_EL1),
> > +
> > +     HOST_HANDLED(SYS_RGSR_EL1),
> > +     HOST_HANDLED(SYS_GCR_EL1),
> > +
> > +     /* Scalable Vector Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TTBR0_EL1),
> > +     HOST_HANDLED(SYS_TTBR1_EL1),
> > +     HOST_HANDLED(SYS_TCR_EL1),
> > +
> > +     HOST_HANDLED(SYS_APIAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYHI_EL1),
> > +
> > +     HOST_HANDLED(SYS_AFSR0_EL1),
> > +     HOST_HANDLED(SYS_AFSR1_EL1),
> > +     HOST_HANDLED(SYS_ESR_EL1),
> > +
> > +     HOST_HANDLED(SYS_ERRIDR_EL1),
> > +     HOST_HANDLED(SYS_ERRSELR_EL1),
> > +     HOST_HANDLED(SYS_ERXFR_EL1),
> > +     HOST_HANDLED(SYS_ERXCTLR_EL1),
> > +     HOST_HANDLED(SYS_ERXSTATUS_EL1),
> > +     HOST_HANDLED(SYS_ERXADDR_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC0_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC1_EL1),
> > +
> > +     HOST_HANDLED(SYS_TFSR_EL1),
> > +     HOST_HANDLED(SYS_TFSRE0_EL1),
> > +
> > +     HOST_HANDLED(SYS_FAR_EL1),
> > +     HOST_HANDLED(SYS_PAR_EL1),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_MAIR_EL1),
> > +     HOST_HANDLED(SYS_AMAIR_EL1),
> > +
> > +     /* Limited Ordering Regions Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_VBAR_EL1),
> > +     HOST_HANDLED(SYS_DISR_EL1),
> > +
> > +     /* GIC CPU Interface registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> > +     HOST_HANDLED(SYS_TPIDR_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL1),
> > +
> > +     HOST_HANDLED(SYS_CNTKCTL_EL1),
> > +
> > +     HOST_HANDLED(SYS_CCSIDR_EL1),
> > +     HOST_HANDLED(SYS_CLIDR_EL1),
> > +     HOST_HANDLED(SYS_CSSELR_EL1),
> > +     HOST_HANDLED(SYS_CTR_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TPIDR_EL0),
> > +     HOST_HANDLED(SYS_TPIDRRO_EL0),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL0),
> > +
> > +     /* Activity Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CTL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_DACR32_EL2),
> > +     HOST_HANDLED(SYS_IFSR32_EL2),
> > +     HOST_HANDLED(SYS_FPEXC32_EL2),
> > +};
> > +
> > +/*
> > + * Handler for protected VM MSR, MRS or System instruction execution.
> > + *
> > + * Returns true if the hypervisor has handled the exit, and control should go
> > + * back to the guest, or false if it hasn't, to be handled by the host.
> > + */
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     unsigned long esr = kvm_vcpu_get_esr(vcpu);
> > +     int Rt = kvm_vcpu_sys_get_rt(vcpu);
> > +
> > +     params = esr_sys64_to_params(esr);
> > +     params.regval = vcpu_get_reg(vcpu, Rt);
> > +
> > +     r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> > +
> > +     /* Undefined access (RESTRICTED). */
> > +     if (r == NULL) {
> > +             __inject_undef64(vcpu);
> > +             return true;
> > +     }
> > +
> > +     /* Handled by the host (HOST_HANDLED) */
> > +     if (r->access == NULL)
> > +             return false;
> > +
> > +     /* Handled by hyp: skip instruction if instructed to do so. */
> > +     if (r->access(vcpu, &params, r))
> > +             __kvm_skip_instr(vcpu);
> > +
> > +     if (!params.is_write)
> > +             vcpu_set_reg(vcpu, Rt, params.regval);
> > +
> > +     return true;
> > +}
> > --
> > 2.33.0.464.g1972c5931b-goog
> >
>
> Other than the nits and suggestion to try and build in some register list
> consistency checks, this looks good to me. I don't know what pKVM
> should / should not expose, but I like the approach this takes, so,
> FWIW,
>
> Reviewed-by: Andrew Jones <drjones@redhat.com>

Thank you,
/fuad

> Thanks,
> drew
>
> --
> To unsubscribe from this group and stop receiving emails from it, send an email to kernel-team+unsubscribe@android.com.
>
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05 16:49       ` Fuad Tabba
  -1 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05 16:49 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Marc,

On Tue, Oct 5, 2021 at 10:54 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 22 Sep 2021 13:47:00 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Add system register handlers for protected VMs. These cover Sys64
> > registers (including feature id registers), and debug.
> >
> > No functional change intended as these are not hooked in yet to
> > the guest exit handlers introduced earlier. So when trapping is
> > triggered, the exit handlers let the host handle it, as before.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
> >  arch/arm64/include/asm/kvm_hyp.h           |   5 +
> >  arch/arm64/kvm/arm.c                       |   5 +
> >  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
> >  6 files changed, 726 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> > new file mode 100644
> > index 000000000000..0ed06923f7e9
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> > +#define __ARM64_KVM_FIXED_CONFIG_H__
> > +
> > +#include <asm/sysreg.h>
> > +
> > +/*
> > + * This file contains definitions for features to be allowed or restricted for
> > + * guest virtual machines, depending on the mode KVM is running in and on the
> > + * type of guest that is running.
> > + *
> > + * The ALLOW masks represent a bitmask of feature fields that are allowed
> > + * without any restrictions as long as they are supported by the system.
> > + *
> > + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> > + * features that are restricted to support at most the specified feature.
> > + *
> > + * If a feature field is not present in either, then it is not supported.
> > + *
> > + * The approach taken for protected VMs is to allow features that are:
> > + * - Needed by common Linux distributions (e.g., floating point)
> > + * - Trivial to support, e.g., supporting the feature does not introduce or
> > + * require tracking of additional state in KVM
> > + * - Impossible to trap, or impossible to prevent the guest from using anyway
> > + */
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Floating-point and Advanced SIMD
> > + * - Data Independent Timing
> > + */
> > +#define PVM_ID_AA64PFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - AArch64 guests only (no support for AArch32 guests):
> > + *   AArch32 adds complexity in trap handling, emulation, condition codes,
> > + *   etc...
> > + * - RAS (v1)
> > + *   Supported by KVM
> > + */
> > +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Branch Target Identification
> > + * - Speculative Store Bypassing
> > + */
> > +#define PVM_ID_AA64PFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Mixed-endian
> > + * - Distinction between Secure and Non-secure Memory
> > + * - Mixed-endian at EL0 only
> > + * - Non-context synchronizing exception entry and exit
> > + */
> > +#define PVM_ID_AA64MMFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - 40-bit IPA
> > + * - 16-bit ASID
> > + */
> > +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Hardware translation table updates to Access flag and Dirty state
> > + * - Number of VMID bits from CPU
> > + * - Hierarchical Permission Disables
> > + * - Privileged Access Never
> > + * - SError interrupt exceptions from speculative reads
> > + * - Enhanced Translation Synchronization
> > + */
> > +#define PVM_ID_AA64MMFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Common not Private translations
> > + * - User Access Override
> > + * - IESB bit in the SCTLR_ELx registers
> > + * - Unaligned single-copy atomicity and atomic functions
> > + * - ESR_ELx.EC value on an exception by read access to feature ID space
> > + * - TTL field in address operations.
> > + * - Break-before-make sequences when changing translation block size
> > + * - E0PDx mechanism
> > + */
> > +#define PVM_ID_AA64MMFR2_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> > +     )
> > +
> > +/*
> > + * No support for Scalable Vectors for protected VMs:
> > + *   Requires additional support from KVM, e.g., context-switching and
> > + *   trapping at EL2
> > + */
> > +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for debug, including breakpoints and watchpoints, for
> > + * protected VMs:
> > + *   The Arm architecture mandates support for at least the Armv8 debug
> > + *   architecture, which would include at least 2 hardware breakpoints and
> > + *   watchpoints. Providing that support to protected guests adds
> > + *   considerable state and complexity. Therefore, the reserved value of 0 is
> > + *   used for debug-related fields.
> > + */
> > +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for implementation defined features.
> > + */
> > +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No restrictions on instructions implemented in AArch64.
> > + */
> > +#define PVM_ID_AA64ISAR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> > +     )
> > +
> > +#define PVM_ID_AA64ISAR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> > +     )
> > +
> > +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 657d0c94cf82..5afd14ab15b9 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
> >  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
> >  #endif
> >
> > +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
> >
> >  #endif /* __ARM64_KVM_HYP_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fe102cd2e518..6aa7b0c5bf21 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
> >       void *addr = phys_to_virt(hyp_mem_base);
> >       int ret;
> >
> > +     kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +     kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> > +     kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
> >
> >       ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
> >       if (ret)
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > new file mode 100644
> > index 000000000000..0865163d363c
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> > +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> > +
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> > +void __inject_undef64(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 8d741f71377f..0bbe37a18d5d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > new file mode 100644
> > index 000000000000..ef8456c54b18
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -0,0 +1,492 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +#include <hyp/adjust_pc.h>
> > +
> > +#include "../../sys_regs.h"
> > +
> > +/*
> > + * Copies of the host's CPU feature registers holding sanitized values at hyp.
> > + */
> > +u64 id_aa64pfr0_el1_sys_val;
> > +u64 id_aa64pfr1_el1_sys_val;
> > +u64 id_aa64isar0_el1_sys_val;
> > +u64 id_aa64isar1_el1_sys_val;
> > +u64 id_aa64mmfr2_el1_sys_val;
> > +
> > +static inline void inject_undef64(struct kvm_vcpu *vcpu)
>
> Please drop the inline. The compiler will sort it out.

Sure.

> > +{
> > +     u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> > +
> > +     vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> > +                          KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> > +                          KVM_ARM64_PENDING_EXCEPTION);
> > +
> > +     __kvm_adjust_pc(vcpu);
> > +
> > +     write_sysreg_el1(esr, SYS_ESR);
> > +     write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > +}
> > +
> > +/*
> > + * Inject an unknown/undefined exception into an AArch64 guest while most of
> > + * its sysregs are live.
> > + */
> > +void __inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> > +     *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +
> > +     inject_undef64(vcpu);
>
> The naming is odd. __blah() is usually a primitive for blah(), while
> you have it the other way around.

I agree, and I thought so too, but I was following the same pattern as
__kvm_skip_instr, which invokes kvm_skip_instr in a similar manner.

> > +
> > +     write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> > +     write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +}
> > +
> > +/*
> > + * Accessor for undefined accesses.
> > + */
> > +static bool undef_access(struct kvm_vcpu *vcpu,
> > +                      struct sys_reg_params *p,
> > +                      const struct sys_reg_desc *r)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return false;
>
> An access exception is the result of a memory access. undef_access
> makes my head spin because you are conflating two unrelated terms.
>
> I suggest you merge all three functions in a single inject_undef64().

Sure.

> > +}
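
FWIW, the merged version I have in mind is something like this (rough,
untested sketch; it just collapses the three functions and keeps the
existing logic as-is):

static void inject_undef64(struct kvm_vcpu *vcpu)
{
	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);

	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);

	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
			     KVM_ARM64_PENDING_EXCEPTION);

	__kvm_adjust_pc(vcpu);

	write_sysreg_el1(esr, SYS_ESR);
	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);

	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
}

with the callers (the id register accessors and the unhandled-register
path) calling it directly and keeping their current return values.
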
> > +
> > +/*
> > + * Returns the restricted feature values of the feature register based on the
> > + * limitations in restrict_fields.
> > + * A feature id field value of 0b0000 does not impose any restrictions.
> > + * Note: Use only for unsigned feature field values.
> > + */
> > +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> > +                                         u64 restrict_fields)
> > +{
> > +     u64 value = 0UL;
> > +     u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> > +
> > +     /*
> > +      * According to the Arm Architecture Reference Manual, feature fields
> > +      * use increasing values to indicate increases in functionality.
> > +      * Iterate over the restricted feature fields and calculate the minimum
> > +      * unsigned value between the one supported by the system, and what the
> > +      * value is being restricted to.
> > +      */
> > +     while (sys_reg_val && restrict_fields) {
> > +             value |= min(sys_reg_val & mask, restrict_fields & mask);
> > +             sys_reg_val &= ~mask;
> > +             restrict_fields &= ~mask;
> > +             mask <<= ARM64_FEATURE_FIELD_BITS;
> > +     }
> > +
> > +     return value;
> > +}
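
To make the per-field behaviour concrete, a couple of made-up examples
(illustrative 4-bit field values only, not taken from any real register):

	/*
	 * get_restricted_features_unsigned(0x21, 0x11) == 0x11
	 *   field0: min(1, 1) = 1, field1: min(2, 1) = 1 (capped)
	 * get_restricted_features_unsigned(0x01, 0x11) == 0x01
	 *   field1: min(0, 1) = 0, i.e. the restriction only ever lowers a
	 *   field, it never grants something the system doesn't have
	 */
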
> > +
> > +/*
> > + * Functions that return the value of feature id registers for protected VMs
> > + * based on allowed features, system features, and KVM support.
> > + */
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 set_mask = 0;
> > +     u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> > +
> > +     if (!vcpu_has_sve(vcpu))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> > +
> > +     set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> > +             PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > +
> > +     /* Spectre and Meltdown mitigation in KVM */
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> > +                            (u64)kvm->arch.pfr0_csv2);
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> > +                            (u64)kvm->arch.pfr0_csv3);
> > +
> > +     return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> > +
> > +     if (!kvm_has_mte(kvm))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> > +
> > +     return id_aa64pfr1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for Scalable Vectors; therefore, hyp has no sanitized
> > +      * copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, including breakpoints and watchpoints;
> > +      * therefore, pKVM has no sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug; therefore, hyp has no sanitized copy of the
> > +      * feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features; therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features; therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> > +
> > +     if (!vcpu_has_ptrauth(vcpu))
> > +             allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> > +
> > +     return id_aa64isar1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 set_mask;
> > +
> > +     set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> > +             PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> > +
> > +     return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> > +}
> > +
> > +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> > +                    struct sys_reg_desc const *r)
> > +{
> > +     u32 id = reg_to_encoding(r);
> > +
> > +     switch (id) {
> > +     case SYS_ID_AA64PFR0_EL1:
> > +             return get_pvm_id_aa64pfr0(vcpu);
> > +     case SYS_ID_AA64PFR1_EL1:
> > +             return get_pvm_id_aa64pfr1(vcpu);
> > +     case SYS_ID_AA64ZFR0_EL1:
> > +             return get_pvm_id_aa64zfr0(vcpu);
> > +     case SYS_ID_AA64DFR0_EL1:
> > +             return get_pvm_id_aa64dfr0(vcpu);
> > +     case SYS_ID_AA64DFR1_EL1:
> > +             return get_pvm_id_aa64dfr1(vcpu);
> > +     case SYS_ID_AA64AFR0_EL1:
> > +             return get_pvm_id_aa64afr0(vcpu);
> > +     case SYS_ID_AA64AFR1_EL1:
> > +             return get_pvm_id_aa64afr1(vcpu);
> > +     case SYS_ID_AA64ISAR0_EL1:
> > +             return get_pvm_id_aa64isar0(vcpu);
> > +     case SYS_ID_AA64ISAR1_EL1:
> > +             return get_pvm_id_aa64isar1(vcpu);
> > +     case SYS_ID_AA64MMFR0_EL1:
> > +             return get_pvm_id_aa64mmfr0(vcpu);
> > +     case SYS_ID_AA64MMFR1_EL1:
> > +             return get_pvm_id_aa64mmfr1(vcpu);
> > +     case SYS_ID_AA64MMFR2_EL1:
> > +             return get_pvm_id_aa64mmfr2(vcpu);
> > +     default:
> > +             /*
> > +              * Should never happen because all cases are covered in
> > +              * pvm_sys_reg_descs[] below.
> > +              */
> > +             WARN_ON(1);
> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Accessor for AArch32 feature id registers.
> > + *
> > + * The value of these registers is "unknown" according to the spec if AArch32
> > + * isn't supported.
> > + */
> > +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     /*
> > +      * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> > +      * of AArch32 feature id registers.
> > +      */
> > +     BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> > +                  PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> > +
> > +     /* Use 0 for architecturally "unknown" values. */
> > +     p->regval = 0;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Accessor for AArch64 feature id registers.
> > + *
> > + * If access is allowed, set the regval to the protected VM's view of the
> > + * register and return true.
> > + * Otherwise, inject an undefined exception and return false.
> > + */
> > +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     p->regval = read_id_reg(vcpu, r);
> > +     return true;
> > +}
> > +
> > +/* Mark the specified system register as an AArch32 feature id register. */
> > +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> > +
> > +/* Mark the specified system register as an AArch64 feature id register. */
> > +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> > +
> > +/* Mark the specified system register as not being handled in hyp. */
> > +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> > +
> > +/*
> > + * Architected system registers.
> > + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> > + *
> > + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> > + * it will lead to injecting an exception into the guest.
> > + */
> > +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> > +     /* Cache maintenance by set/way operations are restricted. */
> > +
> > +     /* Debug and Trace Registers are restricted. */
> > +
> > +     /* AArch64 mappings of the AArch32 ID registers */
> > +     /* CRm=1 */
> > +     AARCH32(SYS_ID_PFR0_EL1),
> > +     AARCH32(SYS_ID_PFR1_EL1),
> > +     AARCH32(SYS_ID_DFR0_EL1),
> > +     AARCH32(SYS_ID_AFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR2_EL1),
> > +     AARCH32(SYS_ID_MMFR3_EL1),
> > +
> > +     /* CRm=2 */
> > +     AARCH32(SYS_ID_ISAR0_EL1),
> > +     AARCH32(SYS_ID_ISAR1_EL1),
> > +     AARCH32(SYS_ID_ISAR2_EL1),
> > +     AARCH32(SYS_ID_ISAR3_EL1),
> > +     AARCH32(SYS_ID_ISAR4_EL1),
> > +     AARCH32(SYS_ID_ISAR5_EL1),
> > +     AARCH32(SYS_ID_MMFR4_EL1),
> > +     AARCH32(SYS_ID_ISAR6_EL1),
> > +
> > +     /* CRm=3 */
> > +     AARCH32(SYS_MVFR0_EL1),
> > +     AARCH32(SYS_MVFR1_EL1),
> > +     AARCH32(SYS_MVFR2_EL1),
> > +     AARCH32(SYS_ID_PFR2_EL1),
> > +     AARCH32(SYS_ID_DFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR5_EL1),
> > +
> > +     /* AArch64 ID registers */
> > +     /* CRm=4 */
> > +     AARCH64(SYS_ID_AA64PFR0_EL1),
> > +     AARCH64(SYS_ID_AA64PFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ZFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR1_EL1),
> > +     AARCH64(SYS_ID_AA64AFR0_EL1),
> > +     AARCH64(SYS_ID_AA64AFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR0_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR0_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR2_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCTLR_EL1),
> > +     HOST_HANDLED(SYS_ACTLR_EL1),
> > +     HOST_HANDLED(SYS_CPACR_EL1),
> > +
> > +     HOST_HANDLED(SYS_RGSR_EL1),
> > +     HOST_HANDLED(SYS_GCR_EL1),
> > +
> > +     /* Scalable Vector Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TTBR0_EL1),
> > +     HOST_HANDLED(SYS_TTBR1_EL1),
> > +     HOST_HANDLED(SYS_TCR_EL1),
> > +
> > +     HOST_HANDLED(SYS_APIAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYHI_EL1),
> > +
> > +     HOST_HANDLED(SYS_AFSR0_EL1),
> > +     HOST_HANDLED(SYS_AFSR1_EL1),
> > +     HOST_HANDLED(SYS_ESR_EL1),
> > +
> > +     HOST_HANDLED(SYS_ERRIDR_EL1),
> > +     HOST_HANDLED(SYS_ERRSELR_EL1),
> > +     HOST_HANDLED(SYS_ERXFR_EL1),
> > +     HOST_HANDLED(SYS_ERXCTLR_EL1),
> > +     HOST_HANDLED(SYS_ERXSTATUS_EL1),
> > +     HOST_HANDLED(SYS_ERXADDR_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC0_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC1_EL1),
> > +
> > +     HOST_HANDLED(SYS_TFSR_EL1),
> > +     HOST_HANDLED(SYS_TFSRE0_EL1),
> > +
> > +     HOST_HANDLED(SYS_FAR_EL1),
> > +     HOST_HANDLED(SYS_PAR_EL1),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_MAIR_EL1),
> > +     HOST_HANDLED(SYS_AMAIR_EL1),
> > +
> > +     /* Limited Ordering Regions Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_VBAR_EL1),
> > +     HOST_HANDLED(SYS_DISR_EL1),
> > +
> > +     /* GIC CPU Interface registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> > +     HOST_HANDLED(SYS_TPIDR_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL1),
> > +
> > +     HOST_HANDLED(SYS_CNTKCTL_EL1),
> > +
> > +     HOST_HANDLED(SYS_CCSIDR_EL1),
> > +     HOST_HANDLED(SYS_CLIDR_EL1),
> > +     HOST_HANDLED(SYS_CSSELR_EL1),
> > +     HOST_HANDLED(SYS_CTR_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TPIDR_EL0),
> > +     HOST_HANDLED(SYS_TPIDRRO_EL0),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL0),
> > +
> > +     /* Activity Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CTL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_DACR32_EL2),
> > +     HOST_HANDLED(SYS_IFSR32_EL2),
> > +     HOST_HANDLED(SYS_FPEXC32_EL2),
> > +};
>
> It would be good if you had something that checks the ordering of this
> array at boot time. It is incredibly easy to screw up the ordering,
> and then everything goes subtly wrong.

Yes. I'll do something like check_sysreg_table().
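
Probably something along these lines, run once at hyp init (untested
sketch; it only relies on reg_to_encoding(), which should give a numeric
order matching the Op0, Op1, CRn, CRm, Op2 sort key):

static bool pvm_table_is_sorted(void)
{
	unsigned int i;

	for (i = 1; i < ARRAY_SIZE(pvm_sys_reg_descs); i++) {
		if (reg_to_encoding(&pvm_sys_reg_descs[i - 1]) >=
		    reg_to_encoding(&pvm_sys_reg_descs[i]))
			return false;
	}

	return true;
}

with the caller refusing to finalize pKVM (or at least warning loudly) if
it ever returns false.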

Thanks,
/fuad

> > +
> > +/*
> > + * Handler for protected VM MSR, MRS or System instruction execution.
> > + *
> > + * Returns true if the hypervisor has handled the exit and control should
> > + * return to the guest, or false if the exit should be handled by the host.
> > + */
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     unsigned long esr = kvm_vcpu_get_esr(vcpu);
> > +     int Rt = kvm_vcpu_sys_get_rt(vcpu);
> > +
> > +     params = esr_sys64_to_params(esr);
> > +     params.regval = vcpu_get_reg(vcpu, Rt);
> > +
> > +     r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> > +
> > +     /* Undefined access (RESTRICTED). */
> > +     if (r == NULL) {
> > +             __inject_undef64(vcpu);
> > +             return true;
> > +     }
> > +
> > +     /* Handled by the host (HOST_HANDLED) */
> > +     if (r->access == NULL)
> > +             return false;
> > +
> > +     /* Handled by hyp: skip instruction if instructed to do so. */
> > +     if (r->access(vcpu, &params, r))
> > +             __kvm_skip_instr(vcpu);
> > +
> > +     if (!params.is_write)
> > +             vcpu_set_reg(vcpu, Rt, params.regval);
> > +
> > +     return true;
> > +}
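
(For context, since the hookup only lands in a later patch of the series:
the intent is for the nVHE sysreg trap exit handler to call this and
forward anything hyp doesn't handle to the host, roughly along the lines
of the purely illustrative sketch below; the name and exact shape of the
hook are not what the later patch uses, this just shows the true/false
contract.)

	static bool handle_pvm_sys64(struct kvm_vcpu *vcpu, u64 *exit_code)
	{
		/* true: handled at hyp, go back to the guest. */
		/* false: not handled here, let the host deal with it. */
		return kvm_handle_pvm_sysreg(vcpu, exit_code);
	}
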
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05 16:49       ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05 16:49 UTC (permalink / raw)
  To: Marc Zyngier; +Cc: kernel-team, kvm, pbonzini, will, kvmarm, linux-arm-kernel

Hi Marc,

On Tue, Oct 5, 2021 at 10:54 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 22 Sep 2021 13:47:00 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Add system register handlers for protected VMs. These cover Sys64
> > registers (including feature id registers), and debug.
> >
> > No functional change intended as these are not hooked in yet to
> > the guest exit handlers introduced earlier. So when trapping is
> > triggered, the exit handlers let the host handle it, as before.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
> >  arch/arm64/include/asm/kvm_hyp.h           |   5 +
> >  arch/arm64/kvm/arm.c                       |   5 +
> >  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
> >  6 files changed, 726 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> > new file mode 100644
> > index 000000000000..0ed06923f7e9
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> > +#define __ARM64_KVM_FIXED_CONFIG_H__
> > +
> > +#include <asm/sysreg.h>
> > +
> > +/*
> > + * This file contains definitions for features to be allowed or restricted for
> > + * guest virtual machines, depending on the mode KVM is running in and on the
> > + * type of guest that is running.
> > + *
> > + * The ALLOW masks represent a bitmask of feature fields that are allowed
> > + * without any restrictions as long as they are supported by the system.
> > + *
> > + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> > + * features that are restricted to support at most the specified feature.
> > + *
> > + * If a feature field is not present in either, then it is not supported.
> > + *
> > + * The approach taken for protected VMs is to allow features that are:
> > + * - Needed by common Linux distributions (e.g., floating point)
> > + * - Trivial to support, e.g., supporting the feature does not introduce or
> > + * require tracking of additional state in KVM
> > + * - Cannot be trapped, or cannot be prevented from being used by the
> > + *   guest anyway
> > + */
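
(As an aside, to make the two kinds of masks concrete: the pVM view of a
feature register is built by keeping the ALLOW fields of the host's
sanitised value and clamping the RESTRICT_UNSIGNED fields to at most the
stated value, roughly:

	/*
	 * Illustration only; REG_ALLOW/REG_RESTRICT_UNSIGNED stand for the
	 * masks of whichever register is being read, and the helper is
	 * get_restricted_features_unsigned(), added further down in
	 * sys_regs.c.
	 */
	pvm_view = (sys_val & REG_ALLOW) |
		   get_restricted_features_unsigned(sys_val,
						    REG_RESTRICT_UNSIGNED);

as the get_pvm_id_*() accessors in sys_regs.c do.)
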
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Floating-point and Advanced SIMD
> > + * - Data Independent Timing
> > + */
> > +#define PVM_ID_AA64PFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - AArch64 guests only (no support for AArch32 guests):
> > + *   AArch32 adds complexity in trap handling, emulation, condition codes,
> > + *   etc...
> > + * - RAS (v1)
> > + *   Supported by KVM
> > + */
> > +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Branch Target Identification
> > + * - Speculative Store Bypassing
> > + */
> > +#define PVM_ID_AA64PFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Mixed-endian
> > + * - Distinction between Secure and Non-secure Memory
> > + * - Mixed-endian at EL0 only
> > + * - Non-context synchronizing exception entry and exit
> > + */
> > +#define PVM_ID_AA64MMFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - 40-bit IPA
> > + * - 16-bit ASID
> > + */
> > +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> > +     )
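
(A made-up example of the clamping on the PARANGE field, using the helper
added later in sys_regs.c: a host reporting a 48-bit PA range (0b0101) is
presented to a protected guest as 40-bit (0b0010), while a host that only
supports 36 bits (0b0001) keeps its own, smaller, value:

	get_restricted_features_unsigned(0x5, PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED)

evaluates to a value whose PARANGE field is 0b0010.)
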
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Hardware translation table updates to Access flag and Dirty state
> > + * - Number of VMID bits from CPU
> > + * - Hierarchical Permission Disables
> > + * - Privileged Access Never
> > + * - SError interrupt exceptions from speculative reads
> > + * - Enhanced Translation Synchronization
> > + */
> > +#define PVM_ID_AA64MMFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Common not Private translations
> > + * - User Access Override
> > + * - IESB bit in the SCTLR_ELx registers
> > + * - Unaligned single-copy atomicity and atomic functions
> > + * - ESR_ELx.EC value on an exception by read access to feature ID space
> > + * - TTL field in address operations.
> > + * - Break-before-make sequences when changing translation block size
> > + * - E0PDx mechanism
> > + */
> > +#define PVM_ID_AA64MMFR2_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> > +     )
> > +
> > +/*
> > + * No support for Scalable Vectors for protected VMs:
> > + *   Requires additional support from KVM, e.g., context-switching and
> > + *   trapping at EL2
> > + */
> > +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for debug, including breakpoints and watchpoints, for
> > + * protected VMs:
> > + *   The Arm architecture mandates support for at least the Armv8 debug
> > + *   architecture, which would include at least 2 hardware breakpoints and
> > + *   watchpoints. Providing that support to protected guests adds
> > + *   considerable state and complexity. Therefore, the reserved value of 0 is
> > + *   used for debug-related fields.
> > + */
> > +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for implementation defined features.
> > + */
> > +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No restrictions on instructions implemented in AArch64.
> > + */
> > +#define PVM_ID_AA64ISAR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> > +     )
> > +
> > +#define PVM_ID_AA64ISAR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> > +     )
> > +
> > +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 657d0c94cf82..5afd14ab15b9 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
> >  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
> >  #endif
> >
> > +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
> >
> >  #endif /* __ARM64_KVM_HYP_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fe102cd2e518..6aa7b0c5bf21 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
> >       void *addr = phys_to_virt(hyp_mem_base);
> >       int ret;
> >
> > +     kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +     kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> > +     kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
> >
> >       ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
> >       if (ret)
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > new file mode 100644
> > index 000000000000..0865163d363c
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> > +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> > +
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> > +void __inject_undef64(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 8d741f71377f..0bbe37a18d5d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > new file mode 100644
> > index 000000000000..ef8456c54b18
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -0,0 +1,492 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +#include <hyp/adjust_pc.h>
> > +
> > +#include "../../sys_regs.h"
> > +
> > +/*
> > + * Copies of the host's CPU feature registers holding sanitized values at hyp.
> > + */
> > +u64 id_aa64pfr0_el1_sys_val;
> > +u64 id_aa64pfr1_el1_sys_val;
> > +u64 id_aa64isar0_el1_sys_val;
> > +u64 id_aa64isar1_el1_sys_val;
> > +u64 id_aa64mmfr2_el1_sys_val;
> > +
> > +static inline void inject_undef64(struct kvm_vcpu *vcpu)
>
> Please drop the inline. The compiler will sort it out.

Sure.

> > +{
> > +     u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> > +
> > +     vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> > +                          KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> > +                          KVM_ARM64_PENDING_EXCEPTION);
> > +
> > +     __kvm_adjust_pc(vcpu);
> > +
> > +     write_sysreg_el1(esr, SYS_ESR);
> > +     write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > +}
> > +
> > +/*
> > + * Inject an unknown/undefined exception into an AArch64 guest while most of
> > + * its sysregs are live.
> > + */
> > +void __inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> > +     *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +
> > +     inject_undef64(vcpu);
>
> The naming is odd. __blah() is usually a primitive for blah(), while
> you have it the other way around.

I agree, and I thought so too, but I was following the same pattern as
__kvm_skip_instr, which invokes kvm_skip_instr in a similar manner.

> > +
> > +     write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> > +     write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +}
> > +
> > +/*
> > + * Accessor for undefined accesses.
> > + */
> > +static bool undef_access(struct kvm_vcpu *vcpu,
> > +                      struct sys_reg_params *p,
> > +                      const struct sys_reg_desc *r)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return false;
>
> An access exception is the result of a memory access. undef_access
> makes my head spin because you are conflating two unrelated terms.
>
> I suggest you merge all three functions in a single inject_undef64().

Sure.

> > +}
> > +
> > +/*
> > + * Returns the restricted feature values of the feature register based on the
> > + * limitations in restrict_fields.
> > + * A feature id field value of 0b0000 does not impose any restrictions.
> > + * Note: Use only for unsigned feature field values.
> > + */
> > +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> > +                                         u64 restrict_fields)
> > +{
> > +     u64 value = 0UL;
> > +     u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> > +
> > +     /*
> > +      * According to the Arm Architecture Reference Manual, feature fields
> > +      * use increasing values to indicate increases in functionality.
> > +      * Iterate over the restricted feature fields and calculate the minimum
> > +      * unsigned value between the one supported by the system, and what the
> > +      * value is being restricted to.
> > +      */
> > +     while (sys_reg_val && restrict_fields) {
> > +             value |= min(sys_reg_val & mask, restrict_fields & mask);
> > +             sys_reg_val &= ~mask;
> > +             restrict_fields &= ~mask;
> > +             mask <<= ARM64_FEATURE_FIELD_BITS;
> > +     }
> > +
> > +     return value;
> > +}
> > +
> > +/*
> > + * Functions that return the value of feature id registers for protected VMs
> > + * based on allowed features, system features, and KVM support.
> > + */
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 set_mask = 0;
> > +     u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> > +
> > +     if (!vcpu_has_sve(vcpu))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> > +
> > +     set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> > +             PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > +
> > +     /* Spectre and Meltdown mitigation in KVM */
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> > +                            (u64)kvm->arch.pfr0_csv2);
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> > +                            (u64)kvm->arch.pfr0_csv3);
> > +
> > +     return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> > +
> > +     if (!kvm_has_mte(kvm))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> > +
> > +     return id_aa64pfr1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for Scalable Vectors; therefore, hyp has no sanitized
> > +      * copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, including breakpoints and watchpoints;
> > +      * therefore, pKVM has no sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug; therefore, hyp has no sanitized copy of the
> > +      * feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features; therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features; therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> > +
> > +     if (!vcpu_has_ptrauth(vcpu))
> > +             allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> > +
> > +     return id_aa64isar1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 set_mask;
> > +
> > +     set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> > +             PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> > +
> > +     return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> > +}
> > +
> > +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> > +                    struct sys_reg_desc const *r)
> > +{
> > +     u32 id = reg_to_encoding(r);
> > +
> > +     switch (id) {
> > +     case SYS_ID_AA64PFR0_EL1:
> > +             return get_pvm_id_aa64pfr0(vcpu);
> > +     case SYS_ID_AA64PFR1_EL1:
> > +             return get_pvm_id_aa64pfr1(vcpu);
> > +     case SYS_ID_AA64ZFR0_EL1:
> > +             return get_pvm_id_aa64zfr0(vcpu);
> > +     case SYS_ID_AA64DFR0_EL1:
> > +             return get_pvm_id_aa64dfr0(vcpu);
> > +     case SYS_ID_AA64DFR1_EL1:
> > +             return get_pvm_id_aa64dfr1(vcpu);
> > +     case SYS_ID_AA64AFR0_EL1:
> > +             return get_pvm_id_aa64afr0(vcpu);
> > +     case SYS_ID_AA64AFR1_EL1:
> > +             return get_pvm_id_aa64afr1(vcpu);
> > +     case SYS_ID_AA64ISAR0_EL1:
> > +             return get_pvm_id_aa64isar0(vcpu);
> > +     case SYS_ID_AA64ISAR1_EL1:
> > +             return get_pvm_id_aa64isar1(vcpu);
> > +     case SYS_ID_AA64MMFR0_EL1:
> > +             return get_pvm_id_aa64mmfr0(vcpu);
> > +     case SYS_ID_AA64MMFR1_EL1:
> > +             return get_pvm_id_aa64mmfr1(vcpu);
> > +     case SYS_ID_AA64MMFR2_EL1:
> > +             return get_pvm_id_aa64mmfr2(vcpu);
> > +     default:
> > +             /*
> > +              * Should never happen because all cases are covered in
> > +              * pvm_sys_reg_descs[] below.
> > +              */
> > +             WARN_ON(1);
> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Accessor for AArch32 feature id registers.
> > + *
> > + * The value of these registers is "unknown" according to the spec if AArch32
> > + * isn't supported.
> > + */
> > +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     /*
> > +      * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> > +      * of AArch32 feature id registers.
> > +      */
> > +     BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> > +                  PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> > +
> > +     /* Use 0 for architecturally "unknown" values. */
> > +     p->regval = 0;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Accessor for AArch64 feature id registers.
> > + *
> > + * If access is allowed, set the regval to the protected VM's view of the
> > + * register and return true.
> > + * Otherwise, inject an undefined exception and return false.
> > + */
> > +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     p->regval = read_id_reg(vcpu, r);
> > +     return true;
> > +}
> > +
> > +/* Mark the specified system register as an AArch32 feature id register. */
> > +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> > +
> > +/* Mark the specified system register as an AArch64 feature id register. */
> > +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> > +
> > +/* Mark the specified system register as not being handled in hyp. */
> > +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> > +
> > +/*
> > + * Architected system registers.
> > + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> > + *
> > + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> > + * it will lead to injecting an exception into the guest.
> > + */
> > +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> > +     /* Cache maintenance by set/way operations are restricted. */
> > +
> > +     /* Debug and Trace Registers are restricted. */
> > +
> > +     /* AArch64 mappings of the AArch32 ID registers */
> > +     /* CRm=1 */
> > +     AARCH32(SYS_ID_PFR0_EL1),
> > +     AARCH32(SYS_ID_PFR1_EL1),
> > +     AARCH32(SYS_ID_DFR0_EL1),
> > +     AARCH32(SYS_ID_AFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR2_EL1),
> > +     AARCH32(SYS_ID_MMFR3_EL1),
> > +
> > +     /* CRm=2 */
> > +     AARCH32(SYS_ID_ISAR0_EL1),
> > +     AARCH32(SYS_ID_ISAR1_EL1),
> > +     AARCH32(SYS_ID_ISAR2_EL1),
> > +     AARCH32(SYS_ID_ISAR3_EL1),
> > +     AARCH32(SYS_ID_ISAR4_EL1),
> > +     AARCH32(SYS_ID_ISAR5_EL1),
> > +     AARCH32(SYS_ID_MMFR4_EL1),
> > +     AARCH32(SYS_ID_ISAR6_EL1),
> > +
> > +     /* CRm=3 */
> > +     AARCH32(SYS_MVFR0_EL1),
> > +     AARCH32(SYS_MVFR1_EL1),
> > +     AARCH32(SYS_MVFR2_EL1),
> > +     AARCH32(SYS_ID_PFR2_EL1),
> > +     AARCH32(SYS_ID_DFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR5_EL1),
> > +
> > +     /* AArch64 ID registers */
> > +     /* CRm=4 */
> > +     AARCH64(SYS_ID_AA64PFR0_EL1),
> > +     AARCH64(SYS_ID_AA64PFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ZFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR1_EL1),
> > +     AARCH64(SYS_ID_AA64AFR0_EL1),
> > +     AARCH64(SYS_ID_AA64AFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR0_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR0_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR2_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCTLR_EL1),
> > +     HOST_HANDLED(SYS_ACTLR_EL1),
> > +     HOST_HANDLED(SYS_CPACR_EL1),
> > +
> > +     HOST_HANDLED(SYS_RGSR_EL1),
> > +     HOST_HANDLED(SYS_GCR_EL1),
> > +
> > +     /* Scalable Vector Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TTBR0_EL1),
> > +     HOST_HANDLED(SYS_TTBR1_EL1),
> > +     HOST_HANDLED(SYS_TCR_EL1),
> > +
> > +     HOST_HANDLED(SYS_APIAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYHI_EL1),
> > +
> > +     HOST_HANDLED(SYS_AFSR0_EL1),
> > +     HOST_HANDLED(SYS_AFSR1_EL1),
> > +     HOST_HANDLED(SYS_ESR_EL1),
> > +
> > +     HOST_HANDLED(SYS_ERRIDR_EL1),
> > +     HOST_HANDLED(SYS_ERRSELR_EL1),
> > +     HOST_HANDLED(SYS_ERXFR_EL1),
> > +     HOST_HANDLED(SYS_ERXCTLR_EL1),
> > +     HOST_HANDLED(SYS_ERXSTATUS_EL1),
> > +     HOST_HANDLED(SYS_ERXADDR_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC0_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC1_EL1),
> > +
> > +     HOST_HANDLED(SYS_TFSR_EL1),
> > +     HOST_HANDLED(SYS_TFSRE0_EL1),
> > +
> > +     HOST_HANDLED(SYS_FAR_EL1),
> > +     HOST_HANDLED(SYS_PAR_EL1),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_MAIR_EL1),
> > +     HOST_HANDLED(SYS_AMAIR_EL1),
> > +
> > +     /* Limited Ordering Regions Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_VBAR_EL1),
> > +     HOST_HANDLED(SYS_DISR_EL1),
> > +
> > +     /* GIC CPU Interface registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> > +     HOST_HANDLED(SYS_TPIDR_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL1),
> > +
> > +     HOST_HANDLED(SYS_CNTKCTL_EL1),
> > +
> > +     HOST_HANDLED(SYS_CCSIDR_EL1),
> > +     HOST_HANDLED(SYS_CLIDR_EL1),
> > +     HOST_HANDLED(SYS_CSSELR_EL1),
> > +     HOST_HANDLED(SYS_CTR_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TPIDR_EL0),
> > +     HOST_HANDLED(SYS_TPIDRRO_EL0),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL0),
> > +
> > +     /* Activity Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CTL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_DACR32_EL2),
> > +     HOST_HANDLED(SYS_IFSR32_EL2),
> > +     HOST_HANDLED(SYS_FPEXC32_EL2),
> > +};
>
> It would be good if you had something that checks the ordering of this
> array at boot time. It is incredibly easy to screw up the ordering,
> and then everything goes subtly wrong.

Yes. I'll do something like check_sysreg_table().

Thanks,
/fuad

> > +
> > +/*
> > + * Handler for protected VM MSR, MRS or System instruction execution.
> > + *
> > + * Returns true if the hypervisor has handled the exit and control should
> > + * return to the guest, or false if the exit should be handled by the host.
> > + */
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     unsigned long esr = kvm_vcpu_get_esr(vcpu);
> > +     int Rt = kvm_vcpu_sys_get_rt(vcpu);
> > +
> > +     params = esr_sys64_to_params(esr);
> > +     params.regval = vcpu_get_reg(vcpu, Rt);
> > +
> > +     r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> > +
> > +     /* Undefined access (RESTRICTED). */
> > +     if (r == NULL) {
> > +             __inject_undef64(vcpu);
> > +             return true;
> > +     }
> > +
> > +     /* Handled by the host (HOST_HANDLED) */
> > +     if (r->access == NULL)
> > +             return false;
> > +
> > +     /* Handled by hyp: skip instruction if instructed to do so. */
> > +     if (r->access(vcpu, &params, r))
> > +             __kvm_skip_instr(vcpu);
> > +
> > +     if (!params.is_write)
> > +             vcpu_set_reg(vcpu, Rt, params.regval);
> > +
> > +     return true;
> > +}
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers
@ 2021-10-05 16:49       ` Fuad Tabba
  0 siblings, 0 replies; 90+ messages in thread
From: Fuad Tabba @ 2021-10-05 16:49 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: kvmarm, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, drjones, oupton,
	qperret, kvm, linux-arm-kernel, kernel-team

Hi Marc,

On Tue, Oct 5, 2021 at 10:54 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Wed, 22 Sep 2021 13:47:00 +0100,
> Fuad Tabba <tabba@google.com> wrote:
> >
> > Add system register handlers for protected VMs. These cover Sys64
> > registers (including feature id registers), and debug.
> >
> > No functional change intended as these are not hooked in yet to
> > the guest exit handlers introduced earlier. So when trapping is
> > triggered, the exit handlers let the host handle it, as before.
> >
> > Signed-off-by: Fuad Tabba <tabba@google.com>
> > ---
> >  arch/arm64/include/asm/kvm_fixed_config.h  | 195 ++++++++
> >  arch/arm64/include/asm/kvm_hyp.h           |   5 +
> >  arch/arm64/kvm/arm.c                       |   5 +
> >  arch/arm64/kvm/hyp/include/nvhe/sys_regs.h |  28 ++
> >  arch/arm64/kvm/hyp/nvhe/Makefile           |   2 +-
> >  arch/arm64/kvm/hyp/nvhe/sys_regs.c         | 492 +++++++++++++++++++++
> >  6 files changed, 726 insertions(+), 1 deletion(-)
> >  create mode 100644 arch/arm64/include/asm/kvm_fixed_config.h
> >  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> >  create mode 100644 arch/arm64/kvm/hyp/nvhe/sys_regs.c
> >
> > diff --git a/arch/arm64/include/asm/kvm_fixed_config.h b/arch/arm64/include/asm/kvm_fixed_config.h
> > new file mode 100644
> > index 000000000000..0ed06923f7e9
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/kvm_fixed_config.h
> > @@ -0,0 +1,195 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_FIXED_CONFIG_H__
> > +#define __ARM64_KVM_FIXED_CONFIG_H__
> > +
> > +#include <asm/sysreg.h>
> > +
> > +/*
> > + * This file contains definitions for features to be allowed or restricted for
> > + * guest virtual machines, depending on the mode KVM is running in and on the
> > + * type of guest that is running.
> > + *
> > + * The ALLOW masks represent a bitmask of feature fields that are allowed
> > + * without any restrictions as long as they are supported by the system.
> > + *
> > + * The RESTRICT_UNSIGNED masks, if present, represent unsigned fields for
> > + * features that are restricted to support at most the specified feature.
> > + *
> > + * If a feature field is not present in either, then it is not supported.
> > + *
> > + * The approach taken for protected VMs is to allow features that are:
> > + * - Needed by common Linux distributions (e.g., floating point)
> > + * - Trivial to support, e.g., supporting the feature does not introduce or
> > + * require tracking of additional state in KVM
> > + * - Cannot be trapped, so the guest cannot be prevented from using them anyway
> > + */
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Floating-point and Advanced SIMD
> > + * - Data Independent Timing
> > + */
> > +#define PVM_ID_AA64PFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_FP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_ASIMD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR0_DIT) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - AArch64 guests only (no support for AArch32 guests):
> > + *   AArch32 adds complexity in trap handling, emulation, condition codes,
> > + *   etc...
> > + * - RAS (v1)
> > + *   Supported by KVM
> > + */
> > +#define PVM_ID_AA64PFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL0), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL2), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_EL3), ID_AA64PFR0_ELx_64BIT_ONLY) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_RAS), ID_AA64PFR0_RAS_V1) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Branch Target Identification
> > + * - Speculative Store Bypassing
> > + */
> > +#define PVM_ID_AA64PFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_BT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64PFR1_SSBS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Mixed-endian
> > + * - Distinction between Secure and Non-secure Memory
> > + * - Mixed-endian at EL0 only
> > + * - Non-context synchronizing exception entry and exit
> > + */
> > +#define PVM_ID_AA64MMFR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_SNSMEM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_BIGENDEL0) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR0_EXS) \
> > +     )
> > +
> > +/*
> > + * Restrict to the following *unsigned* features for protected VMs:
> > + * - 40-bit IPA
> > + * - 16-bit ASID
> > + */
> > +#define PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED (\
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_PARANGE), ID_AA64MMFR0_PARANGE_40) | \
> > +     FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64MMFR0_ASID), ID_AA64MMFR0_ASID_16) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Hardware translation table updates to Access flag and Dirty state
> > + * - Number of VMID bits from CPU
> > + * - Hierarchical Permission Disables
> > + * - Privileged Access Never
> > + * - SError interrupt exceptions from speculative reads
> > + * - Enhanced Translation Synchronization
> > + */
> > +#define PVM_ID_AA64MMFR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HADBS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_VMIDBITS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_HPD) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_PAN) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_SPECSEI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR1_ETS) \
> > +     )
> > +
> > +/*
> > + * Allow for protected VMs:
> > + * - Common not Private translations
> > + * - User Access Override
> > + * - IESB bit in the SCTLR_ELx registers
> > + * - Unaligned single-copy atomicity and atomic functions
> > + * - ESR_ELx.EC value on an exception by read access to feature ID space
> > + * - TTL field in address operations.
> > + * - Break-before-make sequences when changing translation block size
> > + * - E0PDx mechanism
> > + */
> > +#define PVM_ID_AA64MMFR2_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_CNP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_UAO) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IESB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_AT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_IDS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_TTL) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_BBM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64MMFR2_E0PD) \
> > +     )
> > +
> > +/*
> > + * No support for Scalable Vectors for protected VMs:
> > + *   Requires additional support from KVM, e.g., context-switching and
> > + *   trapping at EL2
> > + */
> > +#define PVM_ID_AA64ZFR0_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for debug, including breakpoints and watchpoints, for
> > + * protected VMs:
> > + *   The Arm architecture mandates support for at least the Armv8 debug
> > + *   architecture, which would include at least 2 hardware breakpoints and
> > + *   watchpoints. Providing that support to protected guests adds
> > + *   considerable state and complexity. Therefore, the reserved value of 0 is
> > + *   used for debug-related fields.
> > + */
> > +#define PVM_ID_AA64DFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64DFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No support for implementation defined features.
> > + */
> > +#define PVM_ID_AA64AFR0_ALLOW (0ULL)
> > +#define PVM_ID_AA64AFR1_ALLOW (0ULL)
> > +
> > +/*
> > + * No restrictions on instructions implemented in AArch64.
> > + */
> > +#define PVM_ID_AA64ISAR0_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_AES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA1) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA2) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_CRC32) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_ATOMICS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RDM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SHA3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM3) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_SM4) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_DP) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_FHM) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_TLB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR0_RNDR) \
> > +     )
> > +
> > +#define PVM_ID_AA64ISAR1_ALLOW (\
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DPB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_API) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_JSCVT) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FCMA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_LRCPC) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_FRINTTS) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SB) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_SPECRES) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_BF16) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_DGH) | \
> > +     ARM64_FEATURE_MASK(ID_AA64ISAR1_I8MM) \
> > +     )
> > +
> > +#endif /* __ARM64_KVM_FIXED_CONFIG_H__ */
> > diff --git a/arch/arm64/include/asm/kvm_hyp.h b/arch/arm64/include/asm/kvm_hyp.h
> > index 657d0c94cf82..5afd14ab15b9 100644
> > --- a/arch/arm64/include/asm/kvm_hyp.h
> > +++ b/arch/arm64/include/asm/kvm_hyp.h
> > @@ -115,7 +115,12 @@ int __pkvm_init(phys_addr_t phys, unsigned long size, unsigned long nr_cpus,
> >  void __noreturn __host_enter(struct kvm_cpu_context *host_ctxt);
> >  #endif
> >
> > +extern u64 kvm_nvhe_sym(id_aa64pfr0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64pfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar0_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64isar1_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val);
> >  extern u64 kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val);
> > +extern u64 kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val);
> >
> >  #endif /* __ARM64_KVM_HYP_H__ */
> > diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> > index fe102cd2e518..6aa7b0c5bf21 100644
> > --- a/arch/arm64/kvm/arm.c
> > +++ b/arch/arm64/kvm/arm.c
> > @@ -1802,8 +1802,13 @@ static int kvm_hyp_init_protection(u32 hyp_va_bits)
> >       void *addr = phys_to_virt(hyp_mem_base);
> >       int ret;
> >
> > +     kvm_nvhe_sym(id_aa64pfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
> > +     kvm_nvhe_sym(id_aa64pfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64PFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64isar0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR0_EL1);
> > +     kvm_nvhe_sym(id_aa64isar1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64ISAR1_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr0_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
> >       kvm_nvhe_sym(id_aa64mmfr1_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR1_EL1);
> > +     kvm_nvhe_sym(id_aa64mmfr2_el1_sys_val) = read_sanitised_ftr_reg(SYS_ID_AA64MMFR2_EL1);
> >
> >       ret = create_hyp_mappings(addr, addr + hyp_mem_size, PAGE_HYP);
> >       if (ret)
> > diff --git a/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > new file mode 100644
> > index 000000000000..0865163d363c
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/include/nvhe/sys_regs.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#ifndef __ARM64_KVM_NVHE_SYS_REGS_H__
> > +#define __ARM64_KVM_NVHE_SYS_REGS_H__
> > +
> > +#include <asm/kvm_host.h>
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu);
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu);
> > +
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code);
> > +void __inject_undef64(struct kvm_vcpu *vcpu);
> > +
> > +#endif /* __ARM64_KVM_NVHE_SYS_REGS_H__ */
> > diff --git a/arch/arm64/kvm/hyp/nvhe/Makefile b/arch/arm64/kvm/hyp/nvhe/Makefile
> > index 8d741f71377f..0bbe37a18d5d 100644
> > --- a/arch/arm64/kvm/hyp/nvhe/Makefile
> > +++ b/arch/arm64/kvm/hyp/nvhe/Makefile
> > @@ -14,7 +14,7 @@ lib-objs := $(addprefix ../../../lib/, $(lib-objs))
> >
> >  obj-y := timer-sr.o sysreg-sr.o debug-sr.o switch.o tlb.o hyp-init.o host.o \
> >        hyp-main.o hyp-smp.o psci-relay.o early_alloc.o stub.o page_alloc.o \
> > -      cache.o setup.o mm.o mem_protect.o
> > +      cache.o setup.o mm.o mem_protect.o sys_regs.o
> >  obj-y += ../vgic-v3-sr.o ../aarch32.o ../vgic-v2-cpuif-proxy.o ../entry.o \
> >        ../fpsimd.o ../hyp-entry.o ../exception.o ../pgtable.o
> >  obj-y += $(lib-objs)
> > diff --git a/arch/arm64/kvm/hyp/nvhe/sys_regs.c b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > new file mode 100644
> > index 000000000000..ef8456c54b18
> > --- /dev/null
> > +++ b/arch/arm64/kvm/hyp/nvhe/sys_regs.c
> > @@ -0,0 +1,492 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Copyright (C) 2021 Google LLC
> > + * Author: Fuad Tabba <tabba@google.com>
> > + */
> > +
> > +#include <asm/kvm_asm.h>
> > +#include <asm/kvm_fixed_config.h>
> > +#include <asm/kvm_mmu.h>
> > +
> > +#include <hyp/adjust_pc.h>
> > +
> > +#include "../../sys_regs.h"
> > +
> > +/*
> > + * Copies of the host's CPU feature registers holding sanitized values at hyp.
> > + */
> > +u64 id_aa64pfr0_el1_sys_val;
> > +u64 id_aa64pfr1_el1_sys_val;
> > +u64 id_aa64isar0_el1_sys_val;
> > +u64 id_aa64isar1_el1_sys_val;
> > +u64 id_aa64mmfr2_el1_sys_val;
> > +
> > +static inline void inject_undef64(struct kvm_vcpu *vcpu)
>
> Please drop the inline. The compiler will sort it out.

Sure.

> > +{
> > +     u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);
> > +
> > +     vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
> > +                          KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
> > +                          KVM_ARM64_PENDING_EXCEPTION);
> > +
> > +     __kvm_adjust_pc(vcpu);
> > +
> > +     write_sysreg_el1(esr, SYS_ESR);
> > +     write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);
> > +}
> > +
> > +/*
> > + * Inject an unknown/undefined exception to an AArch64 guest while most of its
> > + * sysregs are live.
> > + */
> > +void __inject_undef64(struct kvm_vcpu *vcpu)
> > +{
> > +     *vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
> > +     *vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);
> > +
> > +     inject_undef64(vcpu);
>
> The naming is odd. __blah() is usually a primitive for blah(), while
> you have it the other way around.

I agree, and I thought so too but I was following the same pattern as
__kvm_skip_instr, which invokes kvm_skip_instr in a similar manner.

> > +
> > +     write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
> > +     write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
> > +}
> > +
> > +/*
> > + * Accessor for undefined accesses.
> > + */
> > +static bool undef_access(struct kvm_vcpu *vcpu,
> > +                      struct sys_reg_params *p,
> > +                      const struct sys_reg_desc *r)
> > +{
> > +     __inject_undef64(vcpu);
> > +     return false;
>
> An access exception is the result of a memory access. undef_access
> makes my head spin because you are conflating two unrelated terms.
>
> I suggest you merge all three functions in a single inject_undef64().

Sure.
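
As a rough, untested sketch of that merge (just folding the three helpers
into one; naming and placement to be confirmed when I respin):

static void inject_undef64(struct kvm_vcpu *vcpu)
{
	u32 esr = (ESR_ELx_EC_UNKNOWN << ESR_ELx_EC_SHIFT);

	/* Snapshot the guest's PC and PSTATE while its sysregs are live. */
	*vcpu_pc(vcpu) = read_sysreg_el2(SYS_ELR);
	*vcpu_cpsr(vcpu) = read_sysreg_el2(SYS_SPSR);

	/* Pend a synchronous exception targeting AArch64 EL1. */
	vcpu->arch.flags |= (KVM_ARM64_EXCEPT_AA64_EL1 |
			     KVM_ARM64_EXCEPT_AA64_ELx_SYNC |
			     KVM_ARM64_PENDING_EXCEPTION);

	__kvm_adjust_pc(vcpu);

	/* ESR_EL1 and ELR_EL1 for the guest's exception handler, as before. */
	write_sysreg_el1(esr, SYS_ESR);
	write_sysreg_el1(read_sysreg_el2(SYS_ELR), SYS_ELR);

	/* Write the adjusted PC and PSTATE back for the return to the guest. */
	write_sysreg_el2(*vcpu_pc(vcpu), SYS_ELR);
	write_sysreg_el2(*vcpu_cpsr(vcpu), SYS_SPSR);
}

The current undef_access() callers would then just call inject_undef64()
and return false, and kvm_handle_pvm_sysreg() would call it directly for
the RESTRICTED case.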

> > +}
> > +
> > +/*
> > + * Returns the restricted feature values of the feature register based on the
> > + * limitations in restrict_fields.
> > + * A feature id field value of 0b0000 does not impose any restrictions.
> > + * Note: Use only for unsigned feature field values.
> > + */
> > +static u64 get_restricted_features_unsigned(u64 sys_reg_val,
> > +                                         u64 restrict_fields)
> > +{
> > +     u64 value = 0UL;
> > +     u64 mask = GENMASK_ULL(ARM64_FEATURE_FIELD_BITS - 1, 0);
> > +
> > +     /*
> > +      * According to the Arm Architecture Reference Manual, feature fields
> > +      * use increasing values to indicate increases in functionality.
> > +      * Iterate over the restricted feature fields and calculate the minimum
> > +      * unsigned value between the one supported by the system, and what the
> > +      * value is being restricted to.
> > +      */
> > +     while (sys_reg_val && restrict_fields) {
> > +             value |= min(sys_reg_val & mask, restrict_fields & mask);
> > +             sys_reg_val &= ~mask;
> > +             restrict_fields &= ~mask;
> > +             mask <<= ARM64_FEATURE_FIELD_BITS;
> > +     }
> > +
> > +     return value;
> > +}
> > +
> > +/*
> > + * Functions that return the value of feature id registers for protected VMs
> > + * based on allowed features, system features, and KVM support.
> > + */
> > +
> > +u64 get_pvm_id_aa64pfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 set_mask = 0;
> > +     u64 allow_mask = PVM_ID_AA64PFR0_ALLOW;
> > +
> > +     if (!vcpu_has_sve(vcpu))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR0_SVE);
> > +
> > +     set_mask |= get_restricted_features_unsigned(id_aa64pfr0_el1_sys_val,
> > +             PVM_ID_AA64PFR0_RESTRICT_UNSIGNED);
> > +
> > +     /* Spectre and Meltdown mitigation in KVM */
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV2),
> > +                            (u64)kvm->arch.pfr0_csv2);
> > +     set_mask |= FIELD_PREP(ARM64_FEATURE_MASK(ID_AA64PFR0_CSV3),
> > +                            (u64)kvm->arch.pfr0_csv3);
> > +
> > +     return (id_aa64pfr0_el1_sys_val & allow_mask) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64pfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     const struct kvm *kvm = (const struct kvm *)kern_hyp_va(vcpu->kvm);
> > +     u64 allow_mask = PVM_ID_AA64PFR1_ALLOW;
> > +
> > +     if (!kvm_has_mte(kvm))
> > +             allow_mask &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_MTE);
> > +
> > +     return id_aa64pfr1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64zfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for Scalable Vectors, therefore, hyp has no sanitized
> > +      * copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64ZFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, including breakpoints and watchpoints;
> > +      * therefore, pKVM has no sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64dfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for debug, therefore, hyp has no sanitized copy of the
> > +      * feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64DFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR0_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64afr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     /*
> > +      * No support for implementation defined features, therefore, hyp has no
> > +      * sanitized copy of the feature id register.
> > +      */
> > +     BUILD_BUG_ON(PVM_ID_AA64AFR1_ALLOW != 0ULL);
> > +     return 0;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar0(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64isar0_el1_sys_val & PVM_ID_AA64ISAR0_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64isar1(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 allow_mask = PVM_ID_AA64ISAR1_ALLOW;
> > +
> > +     if (!vcpu_has_ptrauth(vcpu))
> > +             allow_mask &= ~(ARM64_FEATURE_MASK(ID_AA64ISAR1_APA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_API) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPA) |
> > +                             ARM64_FEATURE_MASK(ID_AA64ISAR1_GPI));
> > +
> > +     return id_aa64isar1_el1_sys_val & allow_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr0(const struct kvm_vcpu *vcpu)
> > +{
> > +     u64 set_mask;
> > +
> > +     set_mask = get_restricted_features_unsigned(id_aa64mmfr0_el1_sys_val,
> > +             PVM_ID_AA64MMFR0_RESTRICT_UNSIGNED);
> > +
> > +     return (id_aa64mmfr0_el1_sys_val & PVM_ID_AA64MMFR0_ALLOW) | set_mask;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr1(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr1_el1_sys_val & PVM_ID_AA64MMFR1_ALLOW;
> > +}
> > +
> > +u64 get_pvm_id_aa64mmfr2(const struct kvm_vcpu *vcpu)
> > +{
> > +     return id_aa64mmfr2_el1_sys_val & PVM_ID_AA64MMFR2_ALLOW;
> > +}
> > +
> > +/* Read a sanitized cpufeature ID register by its sys_reg_desc. */
> > +static u64 read_id_reg(const struct kvm_vcpu *vcpu,
> > +                    struct sys_reg_desc const *r)
> > +{
> > +     u32 id = reg_to_encoding(r);
> > +
> > +     switch (id) {
> > +     case SYS_ID_AA64PFR0_EL1:
> > +             return get_pvm_id_aa64pfr0(vcpu);
> > +     case SYS_ID_AA64PFR1_EL1:
> > +             return get_pvm_id_aa64pfr1(vcpu);
> > +     case SYS_ID_AA64ZFR0_EL1:
> > +             return get_pvm_id_aa64zfr0(vcpu);
> > +     case SYS_ID_AA64DFR0_EL1:
> > +             return get_pvm_id_aa64dfr0(vcpu);
> > +     case SYS_ID_AA64DFR1_EL1:
> > +             return get_pvm_id_aa64dfr1(vcpu);
> > +     case SYS_ID_AA64AFR0_EL1:
> > +             return get_pvm_id_aa64afr0(vcpu);
> > +     case SYS_ID_AA64AFR1_EL1:
> > +             return get_pvm_id_aa64afr1(vcpu);
> > +     case SYS_ID_AA64ISAR0_EL1:
> > +             return get_pvm_id_aa64isar0(vcpu);
> > +     case SYS_ID_AA64ISAR1_EL1:
> > +             return get_pvm_id_aa64isar1(vcpu);
> > +     case SYS_ID_AA64MMFR0_EL1:
> > +             return get_pvm_id_aa64mmfr0(vcpu);
> > +     case SYS_ID_AA64MMFR1_EL1:
> > +             return get_pvm_id_aa64mmfr1(vcpu);
> > +     case SYS_ID_AA64MMFR2_EL1:
> > +             return get_pvm_id_aa64mmfr2(vcpu);
> > +     default:
> > +             /*
> > +              * Should never happen because all cases are covered in
> > +              * pvm_sys_reg_descs[] below.
> > +              */
> > +             WARN_ON(1);
> > +             break;
> > +     }
> > +
> > +     return 0;
> > +}
> > +
> > +/*
> > + * Accessor for AArch32 feature id registers.
> > + *
> > + * The value of these registers is "unknown" according to the spec if AArch32
> > + * isn't supported.
> > + */
> > +static bool pvm_access_id_aarch32(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     /*
> > +      * No support for AArch32 guests, therefore, pKVM has no sanitized copy
> > +      * of AArch32 feature id registers.
> > +      */
> > +     BUILD_BUG_ON(FIELD_GET(ARM64_FEATURE_MASK(ID_AA64PFR0_EL1),
> > +                  PVM_ID_AA64PFR0_RESTRICT_UNSIGNED) > ID_AA64PFR0_ELx_64BIT_ONLY);
> > +
> > +     /* Use 0 for architecturally "unknown" values. */
> > +     p->regval = 0;
> > +     return true;
> > +}
> > +
> > +/*
> > + * Accessor for AArch64 feature id registers.
> > + *
> > + * If access is allowed, set the regval to the protected VM's view of the
> > + * register and return true.
> > + * Otherwise, inject an undefined exception and return false.
> > + */
> > +static bool pvm_access_id_aarch64(struct kvm_vcpu *vcpu,
> > +                               struct sys_reg_params *p,
> > +                               const struct sys_reg_desc *r)
> > +{
> > +     if (p->is_write)
> > +             return undef_access(vcpu, p, r);
> > +
> > +     p->regval = read_id_reg(vcpu, r);
> > +     return true;
> > +}
> > +
> > +/* Mark the specified system register as an AArch32 feature id register. */
> > +#define AARCH32(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch32 }
> > +
> > +/* Mark the specified system register as an AArch64 feature id register. */
> > +#define AARCH64(REG) { SYS_DESC(REG), .access = pvm_access_id_aarch64 }
> > +
> > +/* Mark the specified system register as not being handled in hyp. */
> > +#define HOST_HANDLED(REG) { SYS_DESC(REG), .access = NULL }
> > +
> > +/*
> > + * Architected system registers.
> > + * Important: Must be sorted ascending by Op0, Op1, CRn, CRm, Op2
> > + *
> > + * NOTE: Anything not explicitly listed here is *restricted by default*, i.e.,
> > + * it will lead to injecting an exception into the guest.
> > + */
> > +static const struct sys_reg_desc pvm_sys_reg_descs[] = {
> > +     /* Cache maintenance by set/way operations are restricted. */
> > +
> > +     /* Debug and Trace Registers are restricted. */
> > +
> > +     /* AArch64 mappings of the AArch32 ID registers */
> > +     /* CRm=1 */
> > +     AARCH32(SYS_ID_PFR0_EL1),
> > +     AARCH32(SYS_ID_PFR1_EL1),
> > +     AARCH32(SYS_ID_DFR0_EL1),
> > +     AARCH32(SYS_ID_AFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR0_EL1),
> > +     AARCH32(SYS_ID_MMFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR2_EL1),
> > +     AARCH32(SYS_ID_MMFR3_EL1),
> > +
> > +     /* CRm=2 */
> > +     AARCH32(SYS_ID_ISAR0_EL1),
> > +     AARCH32(SYS_ID_ISAR1_EL1),
> > +     AARCH32(SYS_ID_ISAR2_EL1),
> > +     AARCH32(SYS_ID_ISAR3_EL1),
> > +     AARCH32(SYS_ID_ISAR4_EL1),
> > +     AARCH32(SYS_ID_ISAR5_EL1),
> > +     AARCH32(SYS_ID_MMFR4_EL1),
> > +     AARCH32(SYS_ID_ISAR6_EL1),
> > +
> > +     /* CRm=3 */
> > +     AARCH32(SYS_MVFR0_EL1),
> > +     AARCH32(SYS_MVFR1_EL1),
> > +     AARCH32(SYS_MVFR2_EL1),
> > +     AARCH32(SYS_ID_PFR2_EL1),
> > +     AARCH32(SYS_ID_DFR1_EL1),
> > +     AARCH32(SYS_ID_MMFR5_EL1),
> > +
> > +     /* AArch64 ID registers */
> > +     /* CRm=4 */
> > +     AARCH64(SYS_ID_AA64PFR0_EL1),
> > +     AARCH64(SYS_ID_AA64PFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ZFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR0_EL1),
> > +     AARCH64(SYS_ID_AA64DFR1_EL1),
> > +     AARCH64(SYS_ID_AA64AFR0_EL1),
> > +     AARCH64(SYS_ID_AA64AFR1_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR0_EL1),
> > +     AARCH64(SYS_ID_AA64ISAR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR0_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR1_EL1),
> > +     AARCH64(SYS_ID_AA64MMFR2_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCTLR_EL1),
> > +     HOST_HANDLED(SYS_ACTLR_EL1),
> > +     HOST_HANDLED(SYS_CPACR_EL1),
> > +
> > +     HOST_HANDLED(SYS_RGSR_EL1),
> > +     HOST_HANDLED(SYS_GCR_EL1),
> > +
> > +     /* Scalable Vector Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TTBR0_EL1),
> > +     HOST_HANDLED(SYS_TTBR1_EL1),
> > +     HOST_HANDLED(SYS_TCR_EL1),
> > +
> > +     HOST_HANDLED(SYS_APIAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APIBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDAKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APDBKEYHI_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYLO_EL1),
> > +     HOST_HANDLED(SYS_APGAKEYHI_EL1),
> > +
> > +     HOST_HANDLED(SYS_AFSR0_EL1),
> > +     HOST_HANDLED(SYS_AFSR1_EL1),
> > +     HOST_HANDLED(SYS_ESR_EL1),
> > +
> > +     HOST_HANDLED(SYS_ERRIDR_EL1),
> > +     HOST_HANDLED(SYS_ERRSELR_EL1),
> > +     HOST_HANDLED(SYS_ERXFR_EL1),
> > +     HOST_HANDLED(SYS_ERXCTLR_EL1),
> > +     HOST_HANDLED(SYS_ERXSTATUS_EL1),
> > +     HOST_HANDLED(SYS_ERXADDR_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC0_EL1),
> > +     HOST_HANDLED(SYS_ERXMISC1_EL1),
> > +
> > +     HOST_HANDLED(SYS_TFSR_EL1),
> > +     HOST_HANDLED(SYS_TFSRE0_EL1),
> > +
> > +     HOST_HANDLED(SYS_FAR_EL1),
> > +     HOST_HANDLED(SYS_PAR_EL1),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_MAIR_EL1),
> > +     HOST_HANDLED(SYS_AMAIR_EL1),
> > +
> > +     /* Limited Ordering Regions Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_VBAR_EL1),
> > +     HOST_HANDLED(SYS_DISR_EL1),
> > +
> > +     /* GIC CPU Interface registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CONTEXTIDR_EL1),
> > +     HOST_HANDLED(SYS_TPIDR_EL1),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL1),
> > +
> > +     HOST_HANDLED(SYS_CNTKCTL_EL1),
> > +
> > +     HOST_HANDLED(SYS_CCSIDR_EL1),
> > +     HOST_HANDLED(SYS_CLIDR_EL1),
> > +     HOST_HANDLED(SYS_CSSELR_EL1),
> > +     HOST_HANDLED(SYS_CTR_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_TPIDR_EL0),
> > +     HOST_HANDLED(SYS_TPIDRRO_EL0),
> > +
> > +     HOST_HANDLED(SYS_SCXTNUM_EL0),
> > +
> > +     /* Activity Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_CNTP_TVAL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CTL_EL0),
> > +     HOST_HANDLED(SYS_CNTP_CVAL_EL0),
> > +
> > +     /* Performance Monitoring Registers are restricted. */
> > +
> > +     HOST_HANDLED(SYS_DACR32_EL2),
> > +     HOST_HANDLED(SYS_IFSR32_EL2),
> > +     HOST_HANDLED(SYS_FPEXC32_EL2),
> > +};
>
> It would be good if you had something that checks the ordering of this
> array at boot time. It is incredibly easy to screw up the ordering,
> and then everything goes subtly wrong.

Yes. I'll do something like check_sysreg_table().

Thanks,
/fuad

> > +
> > +/*
> > + * Handler for protected VM MSR, MRS or System instruction execution.
> > + *
> > + * Returns true if the hypervisor has handled the exit, and control should go
> > + * back to the guest, or false if it hasn't, to be handled by the host.
> > + */
> > +bool kvm_handle_pvm_sysreg(struct kvm_vcpu *vcpu, u64 *exit_code)
> > +{
> > +     const struct sys_reg_desc *r;
> > +     struct sys_reg_params params;
> > +     unsigned long esr = kvm_vcpu_get_esr(vcpu);
> > +     int Rt = kvm_vcpu_sys_get_rt(vcpu);
> > +
> > +     params = esr_sys64_to_params(esr);
> > +     params.regval = vcpu_get_reg(vcpu, Rt);
> > +
> > +     r = find_reg(&params, pvm_sys_reg_descs, ARRAY_SIZE(pvm_sys_reg_descs));
> > +
> > +     /* Undefined access (RESTRICTED). */
> > +     if (r == NULL) {
> > +             __inject_undef64(vcpu);
> > +             return true;
> > +     }
> > +
> > +     /* Handled by the host (HOST_HANDLED) */
> > +     if (r->access == NULL)
> > +             return false;
> > +
> > +     /* Handled by hyp: skip instruction if instructed to do so. */
> > +     if (r->access(vcpu, &params, r))
> > +             __kvm_skip_instr(vcpu);
> > +
> > +     if (!params.is_write)
> > +             vcpu_set_reg(vcpu, Rt, params.regval);
> > +
> > +     return true;
> > +}
>
> Thanks,
>
>         M.
>
> --
> Without deviation from the norm, progress is not possible.

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
  2021-09-22 12:47   ` Fuad Tabba
  (?)
@ 2021-10-06  6:56     ` Andrew Jones
  -1 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-06  6:56 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, oupton, qperret, kvm,
	linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:47:01PM +0100, Fuad Tabba wrote:
> Protected VMs have more restricted features that need to be
> trapped. Moreover, the host should not be trusted to set the
> appropriate trapping registers and their values.
> 
> Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
> cptr_el2 at EL2 for protected guests, based on the values of the
> guest's feature id registers.
> 
> No functional change intended as trap handlers introduced in the
> previous patch are still not hooked in to the guest exit
> handlers.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h       |   1 +
>  arch/arm64/include/asm/kvm_host.h      |   2 +
>  arch/arm64/kvm/arm.c                   |   8 ++
>  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
>  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
>  7 files changed, 222 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c

Regarding the approach, LGTM

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
@ 2021-10-06  6:56     ` Andrew Jones
  0 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-06  6:56 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kernel-team, kvm, maz, pbonzini, will, kvmarm, linux-arm-kernel

On Wed, Sep 22, 2021 at 01:47:01PM +0100, Fuad Tabba wrote:
> Protected VMs have more restricted features that need to be
> trapped. Moreover, the host should not be trusted to set the
> appropriate trapping registers and their values.
> 
> Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
> cptr_el2 at EL2 for protected guests, based on the values of the
> guest's feature id registers.
> 
> No functional change intended as trap handlers introduced in the
> previous patch are still not hooked in to the guest exit
> handlers.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h       |   1 +
>  arch/arm64/include/asm/kvm_host.h      |   2 +
>  arch/arm64/kvm/arm.c                   |   8 ++
>  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
>  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
>  7 files changed, 222 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c

Regarding the approach, LGTM

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew

_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm

^ permalink raw reply	[flat|nested] 90+ messages in thread

* Re: [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs
@ 2021-10-06  6:56     ` Andrew Jones
  0 siblings, 0 replies; 90+ messages in thread
From: Andrew Jones @ 2021-10-06  6:56 UTC (permalink / raw)
  To: Fuad Tabba
  Cc: kvmarm, maz, will, james.morse, alexandru.elisei, suzuki.poulose,
	mark.rutland, christoffer.dall, pbonzini, oupton, qperret, kvm,
	linux-arm-kernel, kernel-team

On Wed, Sep 22, 2021 at 01:47:01PM +0100, Fuad Tabba wrote:
> Protected VMs have more restricted features that need to be
> trapped. Moreover, the host should not be trusted to set the
> appropriate trapping registers and their values.
> 
> Initialize the trapping registers, i.e., hcr_el2, mdcr_el2, and
> cptr_el2 at EL2 for protected guests, based on the values of the
> guest's feature id registers.
> 
> No functional change intended as trap handlers introduced in the
> previous patch are still not hooked in to the guest exit
> handlers.
> 
> Signed-off-by: Fuad Tabba <tabba@google.com>
> ---
>  arch/arm64/include/asm/kvm_asm.h       |   1 +
>  arch/arm64/include/asm/kvm_host.h      |   2 +
>  arch/arm64/kvm/arm.c                   |   8 ++
>  arch/arm64/kvm/hyp/include/nvhe/pkvm.h |  14 ++
>  arch/arm64/kvm/hyp/nvhe/Makefile       |   2 +-
>  arch/arm64/kvm/hyp/nvhe/hyp-main.c     |  10 ++
>  arch/arm64/kvm/hyp/nvhe/pkvm.c         | 186 +++++++++++++++++++++++++
>  7 files changed, 222 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm64/kvm/hyp/include/nvhe/pkvm.h
>  create mode 100644 arch/arm64/kvm/hyp/nvhe/pkvm.c

Regarding the approach, LGTM

Reviewed-by: Andrew Jones <drjones@redhat.com>

Thanks,
drew


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 90+ messages in thread

end of thread, other threads:[~2021-10-06  6:58 UTC | newest]

Thread overview: 90+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-22 12:46 [PATCH v6 00/12] KVM: arm64: Fixed features for protected VMs Fuad Tabba
2021-09-22 12:46 ` Fuad Tabba
2021-09-22 12:46 ` Fuad Tabba
2021-09-22 12:46 ` [PATCH v6 01/12] KVM: arm64: Move __get_fault_info() and co into their own include file Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-30 13:04   ` Will Deacon
2021-09-30 13:04     ` Will Deacon
2021-09-30 13:04     ` Will Deacon
2021-09-22 12:46 ` [PATCH v6 02/12] KVM: arm64: Don't include switch.h into nvhe/kvm-main.c Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-30 13:07   ` Will Deacon
2021-09-30 13:07     ` Will Deacon
2021-09-30 13:07     ` Will Deacon
2021-09-22 12:46 ` [PATCH v6 03/12] KVM: arm64: Move early handlers to per-EC handlers Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-30 13:35   ` Will Deacon
2021-09-30 13:35     ` Will Deacon
2021-09-30 13:35     ` Will Deacon
2021-09-30 16:02     ` Marc Zyngier
2021-09-30 16:02       ` Marc Zyngier
2021-09-30 16:02       ` Marc Zyngier
2021-09-30 16:27     ` Marc Zyngier
2021-09-30 16:27       ` Marc Zyngier
2021-09-30 16:27       ` Marc Zyngier
2021-09-22 12:46 ` [PATCH v6 04/12] KVM: arm64: Add missing FORCE prerequisite in Makefile Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 14:17   ` Marc Zyngier
2021-09-22 14:17     ` Marc Zyngier
2021-09-22 14:17     ` Marc Zyngier
2021-09-22 12:46 ` [PATCH v6 05/12] KVM: arm64: Pass struct kvm to per-EC handlers Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46 ` [PATCH v6 06/12] KVM: arm64: Add missing field descriptor for MDCR_EL2 Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46 ` [PATCH v6 07/12] KVM: arm64: Simplify masking out MTE in feature id reg Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:46   ` Fuad Tabba
2021-09-22 12:47 ` [PATCH v6 08/12] KVM: arm64: Add handlers for protected VM System Registers Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-10-05  8:52   ` Andrew Jones
2021-10-05  8:52     ` Andrew Jones
2021-10-05  8:52     ` Andrew Jones
2021-10-05 16:43     ` Fuad Tabba
2021-10-05 16:43       ` Fuad Tabba
2021-10-05 16:43       ` Fuad Tabba
2021-10-05  9:53   ` Marc Zyngier
2021-10-05  9:53     ` Marc Zyngier
2021-10-05  9:53     ` Marc Zyngier
2021-10-05 16:49     ` Fuad Tabba
2021-10-05 16:49       ` Fuad Tabba
2021-10-05 16:49       ` Fuad Tabba
2021-09-22 12:47 ` [PATCH v6 09/12] KVM: arm64: Initialize trap registers for protected VMs Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-10-05  9:23   ` Marc Zyngier
2021-10-05  9:23     ` Marc Zyngier
2021-10-05  9:23     ` Marc Zyngier
2021-10-05  9:33     ` Fuad Tabba
2021-10-05  9:33       ` Fuad Tabba
2021-10-05  9:33       ` Fuad Tabba
2021-10-06  6:56   ` Andrew Jones
2021-10-06  6:56     ` Andrew Jones
2021-10-06  6:56     ` Andrew Jones
2021-09-22 12:47 ` [PATCH v6 10/12] KVM: arm64: Move sanitized copies of CPU features Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47 ` [PATCH v6 11/12] KVM: arm64: Trap access to pVM restricted features Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-10-04 17:27   ` Marc Zyngier
2021-10-04 17:27     ` Marc Zyngier
2021-10-04 17:27     ` Marc Zyngier
2021-10-05  7:20     ` Fuad Tabba
2021-10-05  7:20       ` Fuad Tabba
2021-10-05  7:20       ` Fuad Tabba
2021-09-22 12:47 ` [PATCH v6 12/12] KVM: arm64: Handle protected guests at 32 bits Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-09-22 12:47   ` Fuad Tabba
2021-10-05  8:48   ` Marc Zyngier
2021-10-05  8:48     ` Marc Zyngier
2021-10-05  8:48     ` Marc Zyngier
2021-10-05  9:05     ` Fuad Tabba
2021-10-05  9:05       ` Fuad Tabba
2021-10-05  9:05       ` Fuad Tabba
